Hi, I'm Mukund 👋
I train frontier models
Currently an RL Researcher & Engineer at Anthropic. Previously at Character AI and co-founder & CTO of Quilt Labs. I did a PhD in Machine Learning at NYU, interned at Goldman Sachs, and graduated from Cornell with a degree in Computer Science. This website is a collection of random projects, mostly in blog form.
Recent work
I've worked on a variety of things, from simple tools to large research projects to companies. Here are a few of my favorites.

Anthropic
RL Researcher & Engineer. Working on reinforcement learning for large language models, making Claude more helpful, harmless, and honest.

Character AI
Member of Technical Staff. Built an RLHF pipeline (data curation, reward models, policy optimization) and post-trained open-source LMs to match internal models; online A/B tests showed significant wins. Designed single- and multi-turn reward models and an LM-judge evaluation suite.
Reach out!
I'm always down to chat. Feel free to DM me on X.