Projects

Open Source

  • Google BIG-Bench (Collaboration with Vinay Prabhu and Nick Roberts)
    • Ruin A Name With One Edit, a task to assess whether Large Language Models (LLMs) can identify humor
    • Hyperbaton Identification, a task using the inversion of normal word order (hyperbaton) and reference resolution to test LLM robustness
    • Kannada Riddles, a task using riddles in the Kannada language to measure LLMs’ ability to understand riddle clues in a ‘low-resource’ language
  • EleutherAI
  • HuggingFace BigScience, an effort to develop a multilingual LLM in coordination with an international group of scientists. I’m a member of the following working groups:
    • Modeling
    • Evaluation
      • Subgroup on Few-Shot Generalization, focusing on evaluation tasks for analogical reasoning, symbolic rules-based generalization, linguistic rules-based generalization, and compositionality
    • Interpretability
      • Organizing a subgroup to analyze training dynamics and emergent structures in LLMs
  • NL-Augmenter (Collaboration with Vinay Prabhu, Sang Han, and Nick Roberts)

Non-Open Source

  • SNAPPR
    • Condition-based maintenance for complex machinery, combining Monte Carlo Tree Search with probabilistic models written in the Figaro probabilistic programming language

Research Proposals

  • PRESCRIPTION
    • I contributed the probabilistic modeling section to an SBIR proposal on modeling pharmaceutical supply chains, which the Defense Logistics Agency awarded in mid-November 2020.
    • In 2021, using the Pyro probabilistic programming language, I constructed models for essential drugs. I am currently working on automatic inference of missing nodes in the supply chain.

Service

I co-organized the ML Collective social at ICLR, which focused on open science research in machine learning.