Article Overview
Two AI companies. One city. A rivalry that is reshaping the entire technology industry.
OpenAI built ChatGPT and turned AI into something 200 million people use every week. Anthropic was founded by nine people who walked out of OpenAI — including its own VP of Research — because they believed the world's most powerful AI was being built too fast without enough care for what could go wrong.
That disagreement is not ancient history. It is the engine driving every product launch, every benchmark result, every government partnership, and every safety debate happening in AI right now.
In this article you will find out exactly why Anthropic left OpenAI and what that split means for everything both companies build. You will see how their flagship models — ChatGPT versus Claude, GPT-4o versus Fable 5, o3 versus Mythos 5 — compare across coding, reasoning, science, vision, and cybersecurity. You will understand what Constitutional AI and the Responsible Scaling Policy actually are, and why they matter beyond the marketing language. You will learn why OpenAI's board fired Sam Altman in November 2023, what really happened in those five chaotic days, and what it revealed about the fault line running through the company.
You will also get the honest answer to the question everyone asks but few answer properly: who is actually winning? The answer is different depending on whether you measure users, revenue, safety research, enterprise trust, government partnerships, or raw model capability — and this article covers all of them.
By the end, you will understand not just two competing products but two competing answers to the most consequential question in technology: how fast should you move when what you are building could change everything?
Introduction
Two companies, both headquartered in San Francisco, are building the most powerful AI systems in human history. One of them launched the product that made AI a household conversation. The other was founded by people who walked out of the first one over a disagreement that has defined the entire industry ever since.
OpenAI gave the world ChatGPT. Anthropic gave the world Claude. Both are racing toward the same destination — AI capable of transforming how humans work, create, and solve problems — but they are traveling on fundamentally different roads, with genuinely different beliefs about how fast they should go and what guardrails belong along the way.
Understanding the difference between OpenAI and Anthropic is not just a matter of comparing benchmark scores or subscription prices. It requires understanding a philosophical argument that began inside one organization, split it in two, and has been playing out publicly through product launches, safety research, funding rounds, and government partnerships ever since.
This is that story — and an honest accounting of where both companies stand today.
How They Started — and Why It Matters
OpenAI: The Nonprofit That Became a Powerhouse
OpenAI was founded in December 2015 as a nonprofit with an unusual mission for a technology organization: to ensure that artificial general intelligence benefits all of humanity rather than concentrating its benefits in the hands of a few. The founding team included Sam Altman, Greg Brockman, Ilya Sutskever, Wojciech Zaremba, John Schulman, and Elon Musk, who departed from the board in 2018.
The nonprofit structure reflected genuine idealism about how transformative AI should be governed. If AGI was coming regardless, the argument went, better to have safety-focused researchers building it in the open than to leave it to commercial actors with no accountability to the public.
That idealism held for a few years. By 2019, however, it had become clear that training frontier AI models required compute budgets that no nonprofit could sustain. OpenAI restructured into a capped-profit company, a hybrid in which investors could earn returns up to a defined multiple of their investment, and brought in Microsoft as its primary financial partner. GPT-3 followed in 2020. Microsoft has since invested more than $13 billion across multiple rounds.
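The capped-profit idea reduces to simple arithmetic. The sketch below is illustrative only: the function name, the 10x cap, and the dollar amounts are hypothetical, and the rule that anything above the cap flows back to the nonprofit is an assumption of this example rather than a detail stated in this article.

```python
from typing import Tuple


def capped_payout(investment: float, gross_return: float,
                  cap_multiple: float) -> Tuple[float, float]:
    """Split a gross return between an investor and the nonprofit.

    The investor keeps at most investment * cap_multiple.
    Anything above that cap is treated as excess (assumed here to
    go to the nonprofit parent).
    """
    if investment < 0 or gross_return < 0 or cap_multiple <= 0:
        raise ValueError("amounts must be non-negative and the cap positive")
    cap = investment * cap_multiple
    investor_share = min(gross_return, cap)
    excess = gross_return - investor_share
    return investor_share, excess


# Hypothetical numbers: a 1.0 unit investment with a 10x cap.
below_cap = capped_payout(1.0, 4.0, 10.0)    # the cap is not reached
above_cap = capped_payout(1.0, 25.0, 10.0)   # the cap binds
```

With a 10x cap, a return of 4 units goes entirely to the investor, while a return of 25 units is split into 10 for the investor and 15 of excess.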
The commercial pivot worked. By late 2024, OpenAI was valued at $157 billion — roughly the GDP of a mid-sized country — with annualized revenue exceeding $3.7 billion. ChatGPT, launched in November 2022, became the fastest-growing consumer application in history and now serves more than 200 million weekly active users.
Anthropic: The Safety Disagreement That Became a Company
The story of Anthropic begins inside OpenAI. By 2021, Dario Amodei — who was VP of Research at OpenAI — and his sister Daniela Amodei, along with seven other colleagues, had grown increasingly uncomfortable with the direction the organization was heading. The concerns were not trivial disagreements about office policy. They centered on something fundamental: whether safety research was receiving the priority it deserved as the models grew more powerful, and whether the pace of commercialization was outrunning the team's ability to understand what they were building.
In April 2021, the group departed and founded Anthropic. The founding premise was explicit from day one: this would be an AI safety company first, not a capabilities company that added safety as an afterthought. The structure they chose — a Public Benefit Corporation rather than a standard for-profit — reflected a legal commitment to that mission, not just a marketing position.
What makes this origin story significant for understanding the present is that the people who founded Anthropic knew OpenAI from the inside. They had worked on GPT-2 and GPT-3. They understood exactly what frontier language models could do and what risks they carried. Their departure was not a rejection of AI development — it was a disagreement about the terms on which it should happen.
That disagreement is still the organizing principle of everything both companies do today.
The Numbers: Scale, Funding, and Commercial Position
Before comparing philosophies and models, the raw scale difference needs to be on the table.
| Metric | OpenAI | Anthropic |
|---|---|---|
| Founded | December 2015 | April 2021 |
| CEO | Sam Altman | Dario Amodei |
| Structure | Capped-profit | Public Benefit Corporation |
| Total funding | $40 billion+ | $9 billion+ |
| Key investor | Microsoft ($13B+) | Amazon ($4B+) |
| Valuation (2024) | $157 billion | $18.4 billion |
| Annualized revenue | $3.7 billion+ | ~$1 billion |
| Employees | 3,000+ | 1,000+ |
| Weekly active users | 200 million+ (ChatGPT) | Smaller, growing |
| Primary cloud partner | Microsoft Azure | Amazon AWS |
The gap in scale is real and large. On the figures above, OpenAI has roughly eight and a half times the valuation ($157 billion against $18.4 billion), nearly four times the revenue, and a consumer user base that dwarfs Anthropic's. These are not close numbers.
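The size of the gap can be checked directly from the table. The short sketch below divides the OpenAI figure by the Anthropic figure for each numeric row. Several of the table entries are floors ("+") or approximations ("~"), so the ratios are rough guides rather than precise measurements, and the dictionary keys are names chosen for this example.

```python
from typing import Dict, Tuple

# (OpenAI, Anthropic) pairs copied from the table above.
# Entries marked "+" or "~" in the table are treated as exact here.
FIGURES: Dict[str, Tuple[float, float]] = {
    "valuation_usd_bn": (157.0, 18.4),
    "revenue_usd_bn": (3.7, 1.0),
    "funding_usd_bn": (40.0, 9.0),
    "employees": (3000.0, 1000.0),
}


def scale_ratios(figures: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Return the OpenAI-to-Anthropic ratio for each metric, to one decimal."""
    return {name: round(openai / anthropic, 1)
            for name, (openai, anthropic) in figures.items()}


ratios = scale_ratios(FIGURES)
# valuation: 8.5, revenue: 3.7, funding: 4.4, employees: 3.0
```

The valuation gap is more than twice as wide as the revenue gap, which is worth keeping in mind when the two are quoted together.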
But the comparison of valuations and user counts does not capture the full picture. Anthropic chose not to compete with ChatGPT for casual consumer users the way OpenAI did — at least not initially. Its strategy has been more focused on enterprise and API customers who need the most capable, most reliable, and safest AI available, and who are willing to pay accordingly. The $18.4 billion valuation of a four-year-old company with roughly $1 billion in revenue is not a failure by any normal business measure. It is the profile of a company that deliberately chose depth over breadth early.
Two Different Products, Two Different Visions
OpenAI's Product Bet: Consumer First, Then Everything Else
OpenAI's product strategy is visible in its flagship: ChatGPT. Launched in November 2022, it was designed for everyone: a chat box, immediate conversational responses, and free access. Within two months, it had 100 million users. The message was clear: AI is a consumer product now, and OpenAI is going to be the brand people think of when they think of AI.
That consumer foundation gave OpenAI a distribution advantage that is genuinely difficult to overcome. When someone's first experience with conversational AI is ChatGPT, ChatGPT becomes their mental model for what AI is supposed to feel like. Every competitor gets measured against it.
The product line that grew from that foundation is broad: GPT-4o for multimodal real-time tasks, GPT-4.1 with a one-million-token context window for enterprise document work, the o3 reasoning model for complex analytical problems, DALL-E 3 for image generation, Sora for video, Codex for autonomous coding, and the Daybreak platform for cybersecurity defense. GitHub Copilot, built on OpenAI's technology through the Microsoft partnership, has 1.8 million paid developer subscribers — the most widely used AI coding tool in the world by a significant margin.
Anthropic's Product Bet: Enterprise Trust Over Consumer Scale
Anthropic took the opposite approach. Rather than launching a consumer chatbot and scaling it to hundreds of millions of users, Anthropic focused on building the most capable and most trustworthy AI for organizations that could not afford to take risks with less reliable systems.
The Claude product line reflects this orientation clearly. Claude Enterprise, Claude Code, Claude Cowork, Claude Security, the Constitutional AI safety architecture, the Responsible Scaling Policy — these are products and commitments that mean something specific to a regulated financial institution, a healthcare organization, or a government agency. These customers are not choosing between AI assistants based on which interface they find more pleasant. They are choosing based on capability, reliability, data privacy guarantees, and safety architecture.
The DXC Technology alliance, where DXC is training tens of thousands of engineers to deploy Claude inside the systems running the world's largest banks, airlines, and insurers, is a specific kind of win that ChatGPT's user count does not reflect. The Amazon partnership, where Claude's models are deeply integrated into AWS infrastructure, gives Anthropic reach into the enterprise cloud deployments that power a significant fraction of the internet. Claude Tag, launched in June 2026, represents Anthropic's most direct move into team productivity — a Slack-native model where Claude functions as a persistent team member that writes 65% of Anthropic's own product code.
The Models: What Each Company Has Actually Built
OpenAI's Model Lineup
OpenAI runs two parallel model families. The GPT family — GPT-4o, GPT-4.1, GPT-4.5, GPT-5.5 — handles generalist tasks: writing, analysis, coding, document processing, and multimodal understanding. The o-series — o1, o3, o3 Pro — applies chain-of-thought reasoning to the hardest problems: mathematics, formal logic, scientific analysis, and complex engineering.
The landmark result for OpenAI's reasoning capability came in April 2025, when o3 scored 87.5% on the ARC-AGI benchmark — a test specifically designed to measure adaptability rather than memorized pattern-matching. The human baseline on the same test sits at approximately 85%. It was the first time any AI model crossed that threshold, and the significance was widely acknowledged across the research community.
GPT-4.1's one-million-token context window changed the category of tasks OpenAI's models could handle without external tooling. GPT-5.5, the current frontier general-purpose model, and GPT-Rosalind, which leads the LifeSciBench scientific evaluation at 36.1% pass rate, represent where OpenAI's capabilities currently sit.
Anthropic's Model Lineup
Anthropic's current public frontier is Claude Fable 5, launched June 9, 2026. It leads on nearly every benchmark Anthropic tested — highest FrontierCode score at medium effort for coding quality, highest Hebbia Finance Benchmark score for senior-level analytical reasoning. Stripe reported Fable 5 completed a codebase-wide migration of a 50-million-line Ruby codebase in a single day — work estimated at over two months for a full engineering team.
Above Fable 5 sits Claude Mythos 5 — restricted to vetted cybersecurity partners and life sciences researchers only. Mythos 5 accelerated drug design workflows by approximately ten times compared to standard processes, produced scientific hypotheses preferred 80% of the time over Opus-class alternatives in blinded evaluations, and conducted autonomous genomics research that outperformed a recently published journal result while using a model 100 times smaller.
Benchmark Comparison: Task by Task
| Task Category | Leader | Key Result |
|---|---|---|
| General reasoning | Close, task-dependent | o3: 87.5% ARC-AGI; Fable 5: leads most other benchmarks |
| Coding quality | Claude Fable 5 | Highest FrontierCode score at medium effort |
| Long context | GPT-4.1 | 1,000,000-token window vs Fable 5 |
| Scientific research | GPT-Rosalind | 36.1% LifeSciBench pass rate (top model) |
| Cybersecurity | GPT-5.5-Cyber | 85.6% CyberGym vs Mythos 5 at 83.8% |
| Financial analysis | Claude Fable 5 | Highest Hebbia Finance Benchmark score |
| Vision tasks | Claude Fable 5 | New state of the art; vision-only game completion |
| Conversational quality | GPT-4.5 | Emotional intelligence, nuanced dialogue |
| Drug design (restricted) | Claude Mythos 5 | 10x acceleration, 9 of 14 targets viable |
Neither company has a clean permanent lead. The gap shifts depending on the task, and it shifts again with every new model release.
The Safety Question: The Difference That Actually Defines Everything
The safety debate between OpenAI and Anthropic is not about whether safety matters. Both companies say it does, and both have invested substantially in safety work. The disagreement is about what safety actually requires in practice — and whether the right approach is to move carefully before shipping or to ship and learn.
OpenAI's Approach: Deploy and Discover
OpenAI's current framing — that broad deployment and commercial success fund the safety research needed to address risks — is a coherent argument. Hundreds of millions of users generate feedback about failure modes that no laboratory evaluation could anticipate. Revenue from ChatGPT subscriptions funds the researchers working on alignment. The argument has genuine merit.
But the November 2023 board crisis exposed the tension in that framing in a way that could not be ignored. On November 17, the board fired Sam Altman, citing a lack of candor with board members. Within days, more than 700 employees had signed a letter threatening to resign if Altman was not reinstated. Microsoft offered to hire the entire team. Five days after the firing, Altman was back and the board was reconstituted with members more aligned with the company's commercial direction.
The full details of what triggered the firing have never been made entirely public. What the episode made visible was a genuine conflict between those who believed safety governance should constrain commercial decisions and those who believed commercial momentum should set the pace. The outcome resolved that conflict in one direction. Several of OpenAI's most prominent safety researchers have departed since. Ilya Sutskever, the co-founder who voted to remove Altman, left to found his own safety organization. The Superalignment team announced in 2023 was restructured.
OpenAI continues to publish system cards and safety evaluations. The question critics raise is whether the governance structures that would make those evaluations binding on commercial decisions are still in place.
Anthropic's Approach: Build the Architecture First
Anthropic built safety into the structure of the company and the structure of the models at the same time.
Constitutional AI is the most distinctive technical contribution. Rather than relying purely on human feedback to align models with human values, Constitutional AI trains the model to evaluate its own outputs against a set of principles — a published "constitution" of ethical guidelines — and revise them accordingly. This shifts some alignment work from human labelers reviewing millions of outputs to the model itself developing judgment about what it produces.
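The self-critique step described above can be pictured as a small loop. What follows is a minimal sketch, not Anthropic's implementation: the `Model` type, the prompt wording, and the `toy_model` stub are all hypothetical stand-ins so the example runs on its own. In a real training pipeline the revised outputs would be collected as training data; that stage is not shown here.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a model is any function from prompt text to text.
Model = Callable[[str], str]


def critique_and_revise(model: Model, prompt: str,
                        principles: List[str]) -> Tuple[str, List[Tuple[str, str]]]:
    """Draft a response, then critique and rewrite it once per principle.

    Returns the final response and a trace of (principle, revision) pairs.
    """
    response = model(prompt)
    trace: List[Tuple[str, str]] = []
    for principle in principles:
        critique = model(
            "Critique the response against this principle.\n"
            f"Principle: {principle}\nResponse: {response}"
        )
        response = model(
            "Rewrite the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}"
        )
        trace.append((principle, response))
    return response, trace


def toy_model(prompt: str) -> str:
    """Deterministic stub standing in for a real language model."""
    if prompt.startswith("Critique"):
        return "too blunt"
    if prompt.startswith("Rewrite"):
        # Echo the previous response with a marker appended.
        return prompt.rsplit("Response: ", 1)[1] + " (revised)"
    return "draft answer"


final, steps = critique_and_revise(
    toy_model, "Explain the risk.", ["Be honest.", "Be considerate."]
)
```

The point of the design is that the same model plays three roles (drafter, critic, reviser), with the written principles as the only fixed input.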
The Responsible Scaling Policy is the governance equivalent. It defines AI Safety Levels (ASL-1 through ASL-4 and beyond) and commits Anthropic to specific safety measures at each level before deploying models that reach that capability threshold. Anthropic was the first major AI lab to publish this kind of binding commitment.
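The logic of a level-gated policy can be expressed as a simple check: deployment is allowed only when every safeguard required at the model's assessed level is in place. The sketch below illustrates that idea under stated assumptions. The safeguard names and their assignment to levels are invented for the example and are not taken from the published policy.

```python
from enum import IntEnum
from typing import Dict, FrozenSet, Iterable, Set


class ASL(IntEnum):
    """AI Safety Levels named in the text; higher means more capable."""
    ASL_1 = 1
    ASL_2 = 2
    ASL_3 = 3
    ASL_4 = 4


# Hypothetical safeguard requirements per level, for illustration only.
REQUIRED_SAFEGUARDS: Dict[ASL, FrozenSet[str]] = {
    ASL.ASL_1: frozenset(),
    ASL.ASL_2: frozenset({"misuse_evals"}),
    ASL.ASL_3: frozenset({"misuse_evals", "deployment_classifiers",
                          "hardened_security"}),
    ASL.ASL_4: frozenset({"misuse_evals", "deployment_classifiers",
                          "hardened_security", "external_review"}),
}


def missing_safeguards(level: ASL, implemented: Iterable[str]) -> Set[str]:
    """Return the safeguards required at this level that are not yet in place."""
    return set(REQUIRED_SAFEGUARDS[level]) - set(implemented)


def may_deploy(level: ASL, implemented: Iterable[str]) -> bool:
    """Deployment is permitted only when nothing required is missing."""
    return not missing_safeguards(level, implemented)


ready = may_deploy(ASL.ASL_2, {"misuse_evals"})
blocked = may_deploy(ASL.ASL_3, {"misuse_evals"})
```

The commitment is binding in the sense that the check runs before deployment, so a capability jump without the matching safeguards returns a "no" rather than a judgment call.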
When Mythos-class models crossed a threshold that Anthropic determined carried meaningful risks for cybersecurity and biological research, they built a two-tier response: Fable 5 for the public — with safety classifiers that automatically handle concerning requests and fall back to Opus 4.8, triggering in fewer than 5% of sessions — and Mythos 5 for vetted partners only. The classifiers blocked dangerous cybersecurity queries with zero false positives across 30 different known jailbreak techniques. A bug bounty program ran for 1,000+ hours without producing a universal jailbreak.
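The two-tier response amounts to a routing decision made before a request reaches the frontier model. Here is a minimal sketch of that pattern, assuming a classifier that returns a yes/no flag. The keyword classifier, the flagged terms, and the model stubs are hypothetical placeholders, and the rate is computed per prompt, whereas the figure quoted above is per session.

```python
from typing import Callable, List, Tuple

Model = Callable[[str], str]
Classifier = Callable[[str], bool]


def route(prompt: str, is_concerning: Classifier,
          primary: Model, fallback: Model) -> Tuple[str, str]:
    """Send flagged prompts to the fallback model, everything else to the primary."""
    if is_concerning(prompt):
        return "fallback", fallback(prompt)
    return "primary", primary(prompt)


def fallback_rate(prompts: List[str], is_concerning: Classifier) -> float:
    """Fraction of prompts that would be routed to the fallback model."""
    if not prompts:
        return 0.0
    flagged = sum(1 for p in prompts if is_concerning(p))
    return flagged / len(prompts)


# Toy stand-ins so the sketch runs without any external service.
FLAGGED_TERMS = ("exploit", "pathogen")


def keyword_classifier(prompt: str) -> bool:
    lowered = prompt.lower()
    return any(term in lowered for term in FLAGGED_TERMS)


def primary_model(prompt: str) -> str:
    return f"[primary] {prompt}"


def fallback_model(prompt: str) -> str:
    return f"[fallback] {prompt}"


tier, reply = route("Write an exploit for this CVE",
                    keyword_classifier, primary_model, fallback_model)
rate = fallback_rate(["a", "exploit", "b", "c"], keyword_classifier)
```

A real classifier would be a trained model rather than a keyword list, but the routing structure stays the same: the user still gets an answer, just from the less capable tier.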
The difference between the two companies is not that one is reckless and the other is cautious. It is that they have made different structural commitments about what happens when capability and safety pull in opposite directions. Anthropic has codified those commitments in published policy and enforced them with technical architecture. OpenAI's commitments have mostly taken the form of public statements and published evaluations, which carry less structural force.
Research Beyond Products: The Deeper Bets
Anthropic: Understanding What Is Inside
Anthropic has become the world's leading institution in mechanistic interpretability — the attempt to understand what is actually happening inside neural networks, not just what they output. Chris Olah and his team at Anthropic have published influential work on circuits, features, superposition, and the internal representations that correspond to specific concepts inside a model's learned world.
The goal is eventually to audit a model's reasoning the way you can audit code — to look inside and verify that what the model is doing matches what it claims to be doing. This research does not yet scale to frontier models. But Anthropic is more committed to this line of work than any other major lab, and the foundational knowledge it is building may prove critical when interpretability becomes tractable at scale.
OpenAI: Capability Evaluation and Deployment Learning
OpenAI's research priorities lean toward capability evaluation, reinforcement learning from human feedback, and the kind of benchmark-focused assessment that drives product iteration. The research output remains significant — the ARC-AGI result with o3 was a genuine contribution to the field — but its focus has shifted toward work that feeds directly into product capabilities rather than the more foundational questions Anthropic is pursuing.
The Superalignment initiative, announced in 2023 as a four-year commitment to solving superintelligence alignment, is no longer organized in its original form.
Government and Policy: Both at the Table, Different Seats
Both companies are deeply engaged with governments, but through different programs reflecting their different priorities.
OpenAI's government work centers on active cyber defense. The Daybreak platform has established Trusted Access partnerships with Australia, Canada, France, Germany, Japan, South Korea, and EU institutions including ENISA. GPT-5.5-Cyber, scoring 85.6% on CyberGym, has already helped defenders find vulnerabilities in Firefox, Safari, V8, and the Linux kernel. The Patch the Planet initiative with Trail of Bits is working to secure the open-source software that critical infrastructure runs on. Pre-deployment testing of GPT-5.5 and GPT-5.5-Cyber runs through collaboration with CAISI, ONCD, and OSTP.
Anthropic's government work is more anchored in safety frameworks. The NNSA and DOE partnership produced an AI classifier that detects dangerous nuclear weapons queries with 96.2% accuracy and zero false positives — now deployed on live Claude traffic. Project Glasswing, in collaboration with the US government, gave vetted cyber defenders access to Mythos-class models before they were publicly released. The NNSA collaboration also produced a methodology — using synthetic data to bridge classified government knowledge and private AI capability — that Anthropic has published as a template for the broader industry.
Both companies are building the government relationships that will shape AI policy for years. OpenAI's work leans toward active capability deployment for defense. Anthropic's leans toward safety architecture and governance frameworks that the government can verify and rely on.
The Timeline: A Decade of Competition in Five Years
| Date | Event |
|---|---|
| December 2015 | OpenAI founded as nonprofit |
| April 2021 | Anthropic founded by 9 ex-OpenAI employees |
| November 2022 | ChatGPT launches; AI enters mainstream |
| March 2023 | GPT-4 released; Claude 1 released |
| September 2023 | Amazon invests first $1.25B in Anthropic |
| October 2023 | Amazon total commitment reaches $4B |
| November 2023 | OpenAI board crisis: Altman fired and reinstated in 5 days |
| March 2024 | Claude 3 Opus matches GPT-4 on benchmarks |
| May 2024 | GPT-4o released: multimodal consumer default |
| June 2024 | Claude 3.5 Sonnet: most widely used Claude model |
| October 2024 | OpenAI raises $6.6B at $157B valuation |
| February 2025 | Claude 3.7 Sonnet: extended thinking arrives |
| April 2025 | o3 scores 87.5% ARC-AGI: first AI above human baseline |
| June 2026 | Claude Fable 5 and Mythos 5: Anthropic's frontier leap |
Who Is Winning — Honestly
| Dimension | OpenAI | Anthropic |
|---|---|---|
| Consumer user base | Clear lead: 200M+ WAU | Smaller but growing |
| Revenue | Clear lead: $3.7B+ vs ~$1B | Significantly behind |
| Valuation | Clear lead: $157B vs $18.4B | Significantly behind |
| Developer API adoption | Market leader | Growing enterprise share |
| Safety research depth | Less fundamental | Leading interpretability work |
| Formal safety commitments | System cards, evaluations | RSP, Constitutional AI, PBC structure |
| Speed of shipping | Faster | More measured |
| Highest-risk enterprise deployments | Strong via Azure | Stronger for regulated industries |
| Government national security work | Daybreak, CAISI, Patch the Planet | Glasswing, NNSA, nuclear classifier |
| Model capability | Task-dependent | Task-dependent |
| Long-term alignment research | Less foundational | More foundational |
The answer changes depending on who is asking and what they need.
A casual user who wants the most accessible AI platform will find OpenAI's ecosystem larger, more familiar, and easier to enter.
An enterprise in a regulated industry that needs maximum capability with maximum safety guarantees will find Anthropic's position compelling in ways that revenue comparisons miss.
A researcher trying to understand what is happening inside AI systems will find more material at Anthropic than anywhere else.
A developer building applications needs to evaluate both companies' models for their specific tasks — because the best answer changes every few months.
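For developers, "evaluate both" can be as simple as running the same task-specific test cases through each candidate and comparing pass rates. The sketch below shows one way to structure that. The model stubs and test cases are toy placeholders, no vendor API is called, and in practice each stub would be replaced by a real client call that you write and maintain.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]
# A test case pairs a prompt with a function that judges the output.
Case = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Model, cases: List[Case]) -> float:
    """Fraction of cases whose output passes its check."""
    if not cases:
        raise ValueError("at least one test case is required")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def compare(models: Dict[str, Model], cases: List[Case]) -> Dict[str, float]:
    """Score every candidate model on the same set of cases."""
    return {name: pass_rate(model, cases) for name, model in models.items()}


# Toy stand-ins: one "model" upper-cases its input, the other echoes it.
def model_a(prompt: str) -> str:
    return prompt.upper()


def model_b(prompt: str) -> str:
    return prompt


cases: List[Case] = [
    ("abc", lambda out: out == "ABC"),
    ("x", lambda out: out in ("x", "X")),
]

scores = compare({"model_a": model_a, "model_b": model_b}, cases)
best = max(scores, key=scores.get)
```

Keeping the cases in version control and rerunning them on each new model release is what makes the "best answer changes every few months" observation actionable.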
The Question Neither Can Answer Yet
Both companies are building toward systems far more capable than anything either has publicly released. Both believe AGI is approaching on a timeline measured in years rather than decades.
The argument between them is not about whether to build powerful AI. Both are clearly committed to doing exactly that. The argument is about what the right relationship between capability and safety looks like as those systems grow more powerful.
OpenAI's answer has been that you learn what safety requires by deploying capable systems and observing what happens. The commercial success that follows funds more research. The feedback from 200 million users reveals failure modes no laboratory test could predict.
Anthropic's answer has been that some failure modes are too serious to discover through deployment. If a model capable of providing meaningful assistance with biological weapons or nuclear design is released before the classifiers preventing misuse are in place, the harm from cases that slip through cannot be undone.
Both arguments have merit. Neither has been proven right by the evidence available yet. The models currently deployed are not yet capable enough for the highest-stakes failure modes to have materialized. As capabilities increase, the gap between the two philosophies will become more consequential — and the question of which approach was correct will have real answers rather than theoretical ones.
Final Takeaway
OpenAI and Anthropic are not simply two competing products from two competing companies. They are two different answers to the most consequential question in technology: how do you build transformative AI responsibly when you are not entirely sure what responsible means at the capabilities you are approaching?
OpenAI has the larger platform, the greater reach, the more recognizable brand, and the commercial momentum that comes with having launched the product that made AI unavoidable for ordinary people. Anthropic has the safety research depth, the governance architecture, the enterprise trust in the highest-stakes deployment environments, and the institutional commitment to moving carefully at exactly the moments when moving carefully is hardest.
The competition between them is not zero-sum. The safety innovations one company publishes are available for the other to learn from. The capability benchmarks one company sets force the other to respond. The government relationships one company builds set expectations that apply to both.
What is certain is that this rivalry will shape how AI develops over the next decade more than almost any other force. Both companies know it. The decisions they make in the next few years — about what to deploy, when, with what safeguards, under what governance — will not simply determine who wins the AI race. They will determine what winning means for everyone.
