research
I am interested in simple, intuitive, and scalable algorithms for training large neural networks. I design new training algorithms and build both empirical and theoretical foundations for them. Currently, at the Allen Institute for AI, I am studying how to teach large language models to be safer and more knowledgeable.
During undergrad, I collaborated with the Allen Institute for AI on developing scalable ways to build data for training/evaluating vision-language models, e.g., employing internet-scale videos to teach vision-grounded dialogue (CHAMPAGNE) and using large language models to create challenging vision-language benchmarks (NormLens, SMILE).
I also worked on research and engineering at a startup (Hyperconnect; acquired by Match Group for $1.7B), building social chatbot products aimed at alleviating loneliness, with work on language models (DRESS, PDP, CORGE, G2R) and text-to-speech (Attentron). Additionally, I worked on product-related problems such as long-tail classification (PC Softmax & LADE).
Please see my Google Scholar or Semantic Scholar profile for an up-to-date list.
selected publications
* denotes equal contribution.
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding. In Interspeech 2020.