what i'm doing right now

Last updated 2025-05-20. See: what is a "now page"?

On summer break from university. My last semester will be this fall.
Working as the frontend engineer for Baby Steps, a startup that helps prospective patients decide which IVF clinic to choose using a birth rate prediction model.
Considering the following project ideas. If you're interested in collaborating with me on any of them, please reach out or book a call with me!
- A benchmark that uses a simulated attack–defense-style CTF to test how good LLMs are at red teaming and blue teaming. This could be used to track the offense-defense balance of LLM-assisted cybersecurity over time. This style of benchmark (like GameBench) is also less susceptible to saturation because it's comparative: it grades each model using something like an Elo score, so it can retain precision well into superhuman performance as long as there is another model that can compete against it. I'm hesitant about this project because capability researchers might use it as a feedback signal to improve LLMs' hacking abilities.
- A startup selling services to autonomous AI agents themselves. I think agents will become an increasingly large part of the economy and eventually eclipse human economic activity. This is a totally new customer base. What services will they want to buy that current companies – which only focus on their human audience – can't provide? I think this is a promising idea, but executing it well would probably accelerate AI progress, decreasing the chance that society makes the post-superintelligence transition wisely. I think someone else will attempt this anyway, so maybe I could counterfactually donate more of the profits to AI safety research than my competitors would, or use the company's position as a lever to influence the development and deployment of AI agents for good?
- Software that uses AI to dramatically accelerate the speed at which people can learn. Math Academy is software that uses algorithms based on the science of learning to help students learn and retain math as efficiently as possible. What if we could bring the same efficiency in learning to any subject? My vision is something akin to an LLM-based tutor chatbot with a model of all of the things you know and how they relate to each other, which challenges you to recall and explain concepts you've learned in previous study sessions based on a spaced repetition scheduler. I haven't figured out yet how this would fit into my life goal of making AI go well. Something something differentially accelerating human cognition.
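
For the CTF benchmark idea, the "something like an Elo score" grading could look roughly like the sketch below. It assumes the standard Elo update with a 400-point scale and K = 32. The function names and the example ratings are hypothetical, and none of this is taken from GameBench.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one match.

    score_a is 1.0 if A won the round, 0.0 if A lost, 0.5 for a draw.
    K = 32 is an assumed step size, not a tuned value.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's change mirrors A's, so the total rating in the pool is conserved.
    new_b = rating_b - k * (score_a - exp_a)
    return new_a, new_b


# Hypothetical example: two models start equal, and the attacker wins one round.
attacker, defender = update_elo(1500.0, 1500.0, score_a=1.0)
```

Because each rating only has meaning relative to the other models in the pool, there's no fixed ceiling for a model to hit. That's the property that makes this kind of benchmark resistant to saturation.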