
Scaling Past Informal AI - Carina Hong, Axiom Math

Latent Space Podcast
Science & Technology | 6 min read | 94 min video
Jun 3, 2026|3,937 views|93|7

TL;DR

Formal verification is not just a bug-fixing tool but a way to scale AI brilliance, and Axiom Math presents it as the critical path to superintelligence, pointing to its perfect score on the Putnam exam.

Key Insights

1. Axiom Math, founded seven months prior, secured a $200 million Series A funding round at a $1.6 billion valuation.

2. Axiom Math achieved a perfect score of 120/120 on the 2024 Putnam exam, outperforming both human participants and other AI systems.

3. Formal verification, as advocated by Axiom Math, is positioned not as a defense against AI hallucination or errors, but as a mechanism for compounding and scaling AI brilliance.

4. The Lean theorem prover is a key technology used by Axiom Math, functioning as a formal language that enables verified generation and rigorous mathematical proofs.

5. The annual U.S. math budget for research is approximately $250 million, highlighting the significant investment Axiom Math has attracted.

6. Axiom Math's approach to formal verification demonstrates performance gains, higher sample efficiency, and the ability to match or exceed human performance on complex tasks, exemplified by the Putnam exam results.

Formal verification as the path to compounded intelligence

Carina Hong, CEO of Axiom Math, argues that the future of AI, particularly superintelligence, hinges on formal verification, not merely as a tool for identifying and fixing errors, but as a fundamental method for scaling and compounding AI capabilities. This contrasts with the common perception of verification as a compliance or bug-fixing exercise. Axiom Math, a seven-month-old company with a team of 30, has demonstrated this potential by achieving a perfect score of 120/120 on the 2024 Putnam exam, outperforming all human and AI competitors. This feat, coupled with a recent $200 million Series A funding round at a $1.6 billion valuation, underscores how strongly investors believe in the approach. Hong posits that structured, formal data, exemplified by mathematical proofs, transfers across domains better than conventional training data, leading to more robust and broadly applicable AI reasoning.

The power of Lean and verified generation

Central to Axiom Math's strategy is the use of the Lean theorem prover. Lean is described as a formal language, akin to other proof assistants like Coq or Isabelle, that allows for the rigorous, step-by-step verification of mathematical proofs. Unlike informal mathematical reasoning or natural language proofs, a Lean proof that compiles has been checked against the stated theorem and is correct with respect to that statement. This process can be likened to a type checker for mathematical logic. The Curry-Howard correspondence links proofs directly to programs: a proposition corresponds to a type, and a verified proof in Lean can be understood as a correct program of that type. This formal system allows mathematicians to leverage Lean not just for its logic capabilities but also for its functional programming aspects, enabling complex computations and even the development of tools like autograd within Lean itself.
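To make the proofs-as-programs idea concrete, here is a minimal Lean 4 sketch. It is an illustration only, not code from the episode or from Axiom Math; the names `swapConj` and `swapPair` are hypothetical. The theorem and the ordinary function have the same shape, and the last line shows the checker accepting an equation by computation.

```lean
-- Under Curry-Howard, a proposition is a type and a proof is a term of that type.
-- Proving "A and B implies B and A" amounts to writing a function that swaps a pair.
theorem swapConj (A B : Prop) : A ∧ B → B ∧ A :=
  fun ⟨ha, hb⟩ => ⟨hb, ha⟩

-- The same shape as an ordinary program on data (hypothetical helper).
def swapPair {α β : Type} : α × β → β × α :=
  fun (a, b) => (b, a)

-- Accepted because both sides evaluate to the same natural number.
example : 2 + 2 = 4 := rfl
```

If any of these terms had the wrong type, Lean would reject the file, which is the sense in which proof checking resembles type checking.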

Beyond bug fixing: Scaling brilliance, not just eliminating flaws

Hong emphasizes that formal verification, in Axiom Math's view, is not about correcting 'lousiness' or hallucinations but about 'scaling brilliance and compounding brilliance.' Drawing an analogy to Srinivasa Ramanujan, she explains how formal proofs transform intuitive insights into theorems, which then become building blocks for future mathematical advancements. This process of formalization and verification acts as a multiplier for existing intelligence. Traditional human-driven peer review, which can take years, is contrasted with the potential of AI-assisted formal verification. While mathematicians might initially rely on intuition, formal systems aid in handling low-level deductions, freeing them to navigate high-level conceptual spaces more effectively. This is where tools like Lean's 'grind' tactic can handle significant proofs, shocking some observers with their capabilities.

Axiom's performance advantage and future applications

The success on the Putnam exam, where Axiom Math scored 120 points against DeepSeek's 103 and the best human's 110, illustrates the performance benefits of their approach. Hong notes that while frontier labs possess vast resources, startups like Axiom can achieve comparable or superior performance on superhuman tasks through greater sample efficiency derived from formal methods. This approach is not limited to mathematics; Axiom sees formal verification as a foundational element for 'verified AI' applicable across various domains. The company's ambition extends to broadening its scope beyond math, with potential applications in hardware and software verification. For hardware, where partial verification yields no benefit (e.g., a GPU), perfect verification is crucial, making Axiom's technology a significant disruptive force.

The market for formal verification and team composition

The substantial $200 million Series A suggests the market sees large potential in formal verification: the round alone is close to the annual U.S. math research budget of approximately $250 million. Axiom Math's strategy leverages structured and formal data, akin to how early AI models demonstrated strong transfer learning from coding to reasoning. Their approach involves a system of models, post-trained using RL or SFT on 'Lean data', meaning data whose correctness is known because it has been checked. This allows them to compete effectively despite potentially smaller compute and data budgets compared to frontier labs. The team at Axiom is highlighted as a key differentiator, comprising expert mathematicians who are also users of the systems they develop, combined with applied ML and codegen experts.
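The idea of training on data whose correctness is known can be sketched with a toy example. This is an assumption-laden illustration, not Axiom Math's pipeline: the "problem" (find an integer whose square is a given number), the `verifier` function standing in for a proof checker, and the sample list are all hypothetical. It shows how a checker yields both a binary reward for RL and a filter for building an SFT set.

```python
from typing import List, Tuple

# Toy setting (hypothetical): a problem is an integer n, and a candidate
# answer x is correct exactly when x * x == n.
Sample = Tuple[int, int]  # (problem, candidate)


def verifier(problem: int, candidate: int) -> bool:
    """Stand-in for a proof checker: accepts only exactly correct answers."""
    return candidate * candidate == problem


def reward(problem: int, candidate: int) -> float:
    """Binary reward of the kind an RL loop could use: 1.0 if verified, else 0.0."""
    return 1.0 if verifier(problem, candidate) else 0.0


def build_sft_set(samples: List[Sample]) -> List[Sample]:
    """Keep only verified samples, so every training example is known to be correct."""
    return [(p, c) for p, c in samples if verifier(p, c)]


# Hypothetical model outputs, some right and some wrong.
samples: List[Sample] = [(49, 7), (49, 6), (16, -4), (16, 5)]
verified = build_sft_set(samples)
rewards = [reward(p, c) for p, c in samples]
```

With a real proof assistant the verifier would be the proof checker itself, so no human labeling is needed to decide which samples are kept.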

Navigating theoretical limitations and future potential

While theoretical results like Gödel's incompleteness theorems and Rice's theorem establish that not every program property can be verified automatically, Axiom Math focuses on verifying a majority of useful programs. The company's vision is to make verification so performant and accessible that it becomes a standard choice for complex coding tasks, from web development to distributed systems. They are developing 'verified generation' capabilities, where generated code is accompanied by a formal proof of correctness. This is distinct from simply verifying existing code. The benchmark 'Code_Marina' shows advanced performance for formal systems like Axiom's, significantly outperforming general LLMs in generating code with proofs. The challenge remains in specification: humans are not always adept at precisely defining all requirements, but Axiom believes formalization tools and interactive processes can bridge this gap.
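A small Lean 4 sketch shows what code accompanied by a proof of correctness can look like. It is a hypothetical illustration, not output from Axiom's system: `double` and `double_spec` are made-up names, and the proof assumes a recent Lean 4 toolchain in which the `omega` tactic is available.

```lean
-- Implementation: the code that would be generated.
def double (n : Nat) : Nat := n + n

-- Specification with proof: `double n` equals `2 * n` for every natural number.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

The file compiles only if the proof goes through, so the implementation and its specification are accepted or rejected together. The specification itself still has to be written by someone, which is the gap the paragraph above describes.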

Mathematical discovery and the role of intuition

Axiom Math is also investing in 'mathematical discovery' tools, recognizing that proof is not the only critical step in mathematics; conjecture and intuition are equally important. These tools aim to help mathematicians explore problems by suggesting constructions or identifying patterns, essentially aiding in the creative process before formal proving begins. The company plans to open-source codebases related to mathematical discovery, which have been used to solve long-standing conjectures. This reflects a belief that while formal verification can handle rigorous deduction, the generation of novel ideas and conjectures requires different AI approaches, often informed by human intuition and the exploration of examples. The goal is to make these discovery tools accessible to the broader mathematical and scientific community.

The vision for verified AI and broadened impact

Axiom Math's overarching vision is that 'anything that can be defined can be executed, and anything that can be specified can be proven.' They see verification not as a niche requirement for closed industries but as a path to openness and enhanced collaboration, whether human-AI or AI-AI. This verified AI is expected to lead to significant performance gains, higher sample efficiency, and ultimately, a democratized ability to achieve superhuman performance. The company believes that the path to superintelligence must be verified, and they are committed to building this future. Their approach aims to unlock capabilities not only in mathematics and computer science but also in related fields like science and law, leveraging the foundational advancements in reasoning and verification.

Common Questions

Axiom Math is a startup founded by Carina Hong, focused on leveraging formal verification to build superhuman AI mathematicians. Their core mission is to scale and compound brilliance through verifiable AI, aiming to solve complex mathematical problems and improve code generation.

Topics

Mentioned in this video

People
Kevin Buzzard

Mathematician who Kenny (from Axiom Math) worked with to build out mathlib, the Lean mathematics library.

Donald Knuth

A renowned computer scientist and mathematician, whose results were formalized using Claude and AXL tools, showcasing the practical application of Axiom's tools.

Terence Tao

A prominent mathematician whose video about using Lean for collaboration is mentioned. Also cited for his database of Erdős problems.

K. Kuno

A core contributor to Frontier Math and a benchmark setter, described as a key talent on the Axiom Math team who brings strong capabilities in proving and discovery.

Alex Kontorovich

A mathematician involved in blueprint writing for complex formalization projects, mentioned alongside Terence Tao for his role in organizing collaborative math efforts.

Laurence Tribe

A Harvard Law professor and appellate litigator, cited as an example of someone with math training who excels in a legal field whose arguments have a strongly logical structure.

Srinivasa Ramanujan

A brilliant self-taught mathematician whose intuitions were solidified into theorems after he learned formal proofs at Cambridge, serving as an example of how verification scales brilliance.

John Edensor Littlewood

Mathematician who collaborated with Ramanujan and Hardy at Cambridge, mentioned in the context of Ramanujan's development of formal proof writing.

Carina Hong

CEO and founder of Axiom Math, with a background in neuroscience (UCL Gatsby) and a brief stint in law school before founding the company. Driven by an obsession to build AI that can do math.

Charles Parton

An 'OG' in mathematical discovery and a member of technical staff at Axiom Math. Previously disproved a 30-year-old conjecture and found a solution to the 130-year-old global Lyapunov function problem.

Gabriel Pereira

Professor whose work in the AI for Math community is highlighted for its interesting approach to conjecturing and theory building, suggesting avenues for self-improvement in AI systems.
