Projects
Open Source
- Google BIG-Bench (Collaboration with Vinay Prabhu and Nick Roberts)
  - Ruin A Name With One Edit, a task to assess whether Large Language Models (LLMs) can identify humor
  - Hyperbaton Identification, a task using the inversion of normal word order (hyperbaton) and reference resolution to test LLM robustness
  - Kannada Riddles, a task using riddles in the Kannada language to measure LLMs’ abilities to understand riddle clues in a ‘low-resource’ language
- EleutherAI
  - Equivariance in Machine Learning
  - Contrastive Learning for Code Review
  - Interpreting Large Language Models Across Time
- HuggingFace BigScience, working on developing a multilingual LLM in coordination with an international group of scientists. I’m a member of the following working groups:
  - Modeling
  - Evaluation
    - Subgroup on Few-Shot Generalization, focusing on evaluation tasks for analogical reasoning, symbolic rules-based generalization, linguistic rules-based generalization, and compositionality
  - Interpretability
    - Organizing a subgroup to analyze training dynamics and emergent structures in LLMs
- NL-Augmenter (Collaboration with Vinay Prabhu, Sang Han, and Nick Roberts)
Non-Open Source
- SNAPPR
  - Condition-based maintenance for complex machinery using Monte Carlo Tree Search combined with probabilistic models in the Figaro probabilistic programming language
Research Proposals
- PRESCRIPTION
  - I contributed the probabilistic modeling section to an SBIR proposal on modeling pharmaceutical supply chains, which was awarded by the Defense Logistics Agency in mid-November 2020.
  - In 2021, using the Pyro probabilistic programming language, I constructed models for essential drugs, and I am currently working on automatic inference of missing nodes in the supply chain.
Service
I co-organized the ML Collective social at ICLR, focusing on open science research in machine learning