I’m a fifth-year Computer Science PhD student at The Ohio State University. I feel very fortunate to be advised by Prof. Yu Su and to be part of the wonderful OSU NLP lab.

Research Interests

My research interests are broadly centered around two mutually inspiring directions. On one hand, my translational research focuses on building NLP for knowledge-rich domains, especially biomedicine. On the other hand, my familiarity with neuroscience and substantial biomedical NLP experience have inspired me to use neurobiological insights to address fundamental problems with current LLMs, especially around their long-term memory.

My translational research has focused on understanding how to best leverage language models for a wide variety of biomedical NLP tasks such as document classification, information extraction and knowledge base construction. I am passionate about guiding NLP practitioners in these domains through the complex LLM landscape and helping them choose the best techniques for their specific real-world problems.

My more fundamental research focuses on exploring the interdependence between memorization and reasoning in language models. Humans can successfully use their powerful reasoning capabilities in domains they know nothing about. However, it is still unclear whether LLMs’ reasoning capacity is similarly disentangled from their more obvious semantic representation and memorization abilities.

In our most recent work, HippoRAG, we introduce a retrieval-augmented generation (RAG) framework inspired by human long-term memory that is remarkably simple yet strong and flexible. The somewhat surprising success of our first foray into this direction suggests that mimicking our brain’s architecture could be a useful framework for building future LLMs, especially those that can more effectively separate reasoning and memorization.

News

5/2024: We released “HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models”, a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents.

10/2023: Our paper examining the importance of problem formulation for real-world problems through the task of UMLS Vocabulary Insertion was accepted to EMNLP 2023 (Findings).

6/2023: Our paper exploring whether the sub-optimal segmentation performance of standard biomedical tokenizers has an effect on downstream LM performance was accepted to the BioNLP Workshop @ ACL 2023.

5/2023: Our paper unlocking the hidden relation extraction abilities of LLMs by aligning RE with question answering was accepted to ACL 2023 (Findings).

10/2022: Our paper discussing potential limitations of GPT-3 in-context learning for biomedical IE was accepted to EMNLP 2022 (Findings).

5/2022: Joined Dr. Bodenreider’s lab at the National Library of Medicine for a summer internship.

3/2021: Won an Accelerator Grant from the Translational Data Analytics Institute for Social Media Pharmacovigilance.

10/2020: Our paper introducing a text classification dataset regarding COVID-19 documents was accepted to EMNLP 2020 (Findings).

4/2020: Our paper conducting a comprehensive study on the Clinical Reading Comprehension task based on the emrQA dataset was accepted to ACL 2020.

Short Bio

I grew up in San José, Costa Rica and moved to the US in 2011 to obtain my BA in Applied Mathematics from UC Berkeley. I first encountered the wonders of backprop while doing theoretical neuroscience research at the Redwood Center for Theoretical Neuroscience. After graduation, I decided to pursue my passion for practical problems and joined Dr. Karim Galil and Dr. Wael Salloum at Mendel AI, a biomedical NLP startup working to make clinical trial recruitment for precision medicine more efficient. At Mendel, I not only developed my technical and AI research skills but also saw firsthand what it takes to build a successful company from the ground up.

I left Mendel three years later and started my PhD life at the OSU NLP lab. Most recently, in the summer of 2022, I completed a research internship in Dr. Olivier Bodenreider’s lab at the National Library of Medicine working on improving the construction process of the Unified Medical Language System (UMLS).

For more information, please check out my CV (December 2023).

Hobbies & Fun Times

In my free time, I like to stay active by going to the gym, dancing, playing soccer, climbing and hiking (preferably to and from tall mountains). I also love meeting interesting new people and learning about their lives and research (conferences are my happy place). When I am not moving around or socializing, I like fun escapist content such as fantasy, sci-fi and some magical realism every once in a while.

Please send me an email or message me on Twitter if you wanna chat. Always down to meet new people!