Hi, I'm Mukund 👋

I train frontier models

Currently an RL Researcher & Engineer at Anthropic. Previously at Character AI and co-founder & CTO of Quilt Labs. I did a PhD in Machine Learning at NYU, interned at Goldman Sachs, and graduated from Cornell with a degree in Computer Science. This website is a collection of random projects, mostly in blog form.

Recent work

I've worked on a variety of things, from simple tools to large research projects to companies. Here are a few of my favorites.

Anthropic

RL Researcher & Engineer. Working on reinforcement learning for large language models, making Claude more helpful, harmless, and honest.

Character AI

Member of Technical Staff. Built an RLHF pipeline (data curation, reward models, policy optimization) and post-trained open-source LMs to match internal models; online A/B tests showed significant wins. Designed single- and multi-turn reward models and an LM-judge evaluation suite.

Quilt Labs

Co-founder & CTO. Built an agent orchestration platform for professional investors; raised $4.6M from Altman Capital, Salesforce Ventures, South Park Commons, and others.

Reach out!

I'm always down to chat. Feel free to DM me on X.