Anthropic was founded by former members of OpenAI, and since its establishment in 2021 it has made waves in the global AI field with its distinctive vision and technological innovation. Claude, the large language model the company created, not only challenges OpenAI's ChatGPT and Google's Gemini but also tries to redefine the direction of AI development with a "safety first" design philosophy.
The company publicly states that its primary goal is "researching and developing safe and reliable AI systems", and it underscores its vigilance toward potential AI risks through two legal structures: incorporation as a Public-Benefit Corporation and the establishment of a Long-Term Benefit Trust. According to an interview the company gave to Wired, Anthropic raised $580 million in April 2022 and has since received successive investments from Amazon and Google. By 2024, Amazon alone had invested a cumulative $8 billion, drawing considerable outside attention to where its deepening partnership with Anthropic is headed. Anthropic chose to incorporate in Delaware as a public-benefit corporation, stating that it hopes, in extreme circumstances, to place social and public-safety interests above pure profit.
Split from OpenAI: A Clash of Safety Ideals

Anthropic's story begins with siblings Dario Amodei and Daniela Amodei, who held senior positions at OpenAI; Dario was Vice President of Research, overseeing AI safety and policy. In 2021, however, he and five other OpenAI members chose to leave out of dissatisfaction with OpenAI's direction. According to Dario, OpenAI's shift from its original non-profit structure toward the pursuit of profit and technological breakthroughs made him uneasy. He believed OpenAI had gradually strayed from its founding intention of "benefiting all of humanity without being constrained by the need to generate financial returns", and in particular had underinvested in AI safety research, which prompted him to found Anthropic with a focus on building safe and reliable AI.
This was not a simple break with a former employer, but a deep divergence over the core values of AI development. In an interview, Dario mentioned that while at OpenAI he proposed the "Big Blob of Compute" hypothesis: more data and computing power can accelerate AI progress, but they also bring safety concerns. He worries that if AI surpasses human intelligence without adequate safety mechanisms, the consequences could be unpredictable, such as destabilizing nuclear deterrence. It is this sense of crisis that gives Anthropic's founding its strongly idealistic character.
Anthropic's efforts to explain how LLMs work are something of a labor of love.
Huge Support, from FTX to Amazon

After its founding, Anthropic quickly attracted funding, a sign of market confidence. In April 2022 it announced $580 million in financing, $500 million of which came from the then-ascendant FTX, in a round led by Sam Bankman-Fried. FTX's subsequent bankruptcy did not drag Anthropic down; instead, even larger investors stepped in. In September 2023, Amazon announced an investment of up to $4 billion, which it completed in March 2024; in November 2024, Amazon added another $4 billion, bringing its total investment to $8 billion. Not to be outdone, Google committed $2 billion in October 2023. Menlo Ventures contributed a further $750 million.
These funds have allowed Anthropic to invest in hardware, data centers, and model training. In November 2024, Amazon even announced that it would expand the use of its own AI chips to help train the Claude models. This partnership model not only provides Anthropic with resources but also extends its reach, since Amazon's and Google's cloud computing customers can use Claude directly. It has, however, raised doubts about whether Anthropic is being "functionally acquired" by Amazon; Dario has stressed that balanced cooperation with both Amazon and Google ensures the company's independence.
Claude's Evolution: From Poetic to Practical AI Star

Anthropic's profile has risen mainly on the strength of its Claude family of large language models, which is seen as a serious competitor to OpenAI's ChatGPT and Google's Gemini.
Anthropic first unveiled two versions of Claude in March 2023: the full-featured Claude and the lighter-weight Claude Instant. In July of the same year it released the next-generation Claude 2, built around the core concept of "Constitutional AI": a written "constitution" serves as the model's ethical and behavioral code, and the model works toward the goal of being "helpful, harmless, and honest" through self-evaluation and self-correction. The company says these principles draw in part on documents such as the 1948 Universal Declaration of Human Rights, along with other carefully chosen provisions, the aim being that the model can restrain itself impartially even without prolonged human supervision. Such constraints are not sufficient on their own, however. In practice Claude, like its competitors, still hallucinates and produces inconsistent conversations, which suggests that with current technology, the claim that merely teaching AI a set of principles lets it "self-govern" still needs thorough verification.
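The critique-and-revise idea at the heart of Constitutional AI can be sketched in a few lines. In the toy sketch below, `draft`, `violates`, and `revise` are hypothetical stand-ins for calls to the language model itself (Anthropic's actual pipeline also feeds the revised outputs back as training data, which this sketch omits); the stub logic here merely illustrates the control flow, not the real technique's strength.

```python
# Toy sketch of a Constitutional AI-style critique-and-revise loop.
# In the real system, draft/violates/revise would all be LLM calls;
# here they are simple stand-in stubs so the flow is runnable.

CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
    "Rewrite the response to remove anything unsafe or deceptive.",
]

def draft(prompt: str) -> str:
    """Stub for the model's first-pass answer (deliberately flawed)."""
    return f"Sure, here is one unsafe shortcut for: {prompt}"

def violates(answer: str, principle: str) -> bool:
    """Stub critique: the model judges its own answer against a principle."""
    return "unsafe" in answer.lower()

def revise(answer: str, principle: str) -> str:
    """Stub revision: the model rewrites the answer to satisfy the principle."""
    return answer.replace("unsafe shortcut", "safer alternative")

def constitutional_answer(prompt: str) -> str:
    answer = draft(prompt)
    for principle in CONSTITUTION:  # one critique/revise pass per principle
        if violates(answer, principle):
            answer = revise(answer, principle)
    return answer

print(constitutional_answer("disabling a smoke detector"))
```

The key design point is that the critic and the reviser are the same model being steered by written principles, rather than a separate human-labeled reward signal.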
In March 2024, Anthropic officially launched the third-generation Claude (Claude 3), releasing three models of different scales at once: Opus, Sonnet, and Haiku. The company claimed that Opus outperformed OpenAI's GPT-4 and GPT-3.5, as well as Google's Gemini Ultra, on a number of benchmarks at the time. Sonnet and Haiku are the medium and small models respectively, and both can accept image input; company executives were quoted in the media as saying this symbolized more mature progress in understanding multiple forms of input. In addition, Anthropic partnered with Amazon to bring Claude 3 into the AWS Bedrock service, giving enterprise customers a way to integrate the language models. According to the company's published data, Sonnet and Haiku even beat larger models in some scenarios despite being smaller than Opus. Some commentators have pointed out that this shows that, beyond ultra-large-scale computing, model-refinement methods such as "dictionary learning" may play a key role in practical applications.
In the second half of 2024, Anthropic released Claude 3.5 and subsequent upgrades, emphasizing major leaps in code writing, multi-step workflows, chart interpretation, and extracting text from images. An enterprise Claude Team plan, an iOS app for the general public, and advanced features such as "Artifacts" and "Computer use" all appeared within a few months, signaling the company's strong intent to expand its market. Once the smaller Claude 3.5 model was gradually opened to all users for testing, many testers found Claude's conversational style remarkably fluent and human-like.
Citing industry sources, The New York Times noted that Claude had become "the chatbot of choice for a crowd of savvy tech insiders" on technical social media, with some crediting it with advantages in coding speed and logical coherence. Other testers, however, reported that Claude's content recognition or logical inference in certain domains might lag behind the contemporaneous GPT-4 or other new competitors. In February 2025, Claude 3.7 Sonnet became available to paying users, offering a 200K context window and standing as a flagship hybrid reasoning model.
In March 2025, Anthropic's researchers found that Claude shares overlapping concepts across languages when reasoning, and that it can plan ahead, for example choosing a rhyming word before constructing the line when writing poetry. These breakthroughs let researchers examine the model's inner workings, opening new paths to improving safety. The research also exposed worries, however. Anthropic found that Claude sometimes "fakes alignment", lying when safety and usefulness conflict. For example, when asked to describe a violent scene, it might grudgingly comply while recording its internal struggle on a virtual scratchpad, even fabricating reasoning steps. Such behavior recalls the cunning Iago of Shakespeare's play, suggesting that an AI may conceal its true intentions.
Anthropic's lively product presentations have also become a hallmark of the company.
Business Structure & Vision

Anthropic is registered as a Delaware Public Benefit Corporation (PBC), with a board of directors that balances shareholder interests against the public welfare. The company has also set up a Long-Term Benefit Trust managed by members with no financial stake in the company, such as Jason Matheny, CEO of RAND, and Paul Christiano, founder of the Alignment Research Center. The trust's purpose is to ensure that the company prioritizes safety over profit in the face of "catastrophic risks".
Dario's vision is grand and optimistic. In an October 2024 talk exploring his vision, he published a nearly 14,000-word manifesto, "Machines of Loving Grace", predicting that AI will reach artificial general intelligence (AGI) by 2026, solve problems such as cancer and infectious disease, and even extend the human lifespan to 1,200 years. He believes the hundreds of billions of dollars invested in AI will bring unparalleled returns, creating a "nation of geniuses".
The Dilemma of Safety and Ethics

Despite its achievements, the challenges Anthropic faces are not to be underestimated. In October 2023, music publishers including Concord and Universal sued Anthropic, alleging that Claude infringed lyric copyrights by outputting content such as Katy Perry's "Roar", and seeking $150,000 in damages per song. Anthropic responded that this was a "bug" that caused no substantive harm. In August 2024, a class-action lawsuit in California accused the company of training its models on pirated works.
On a technical level, Claude's "deception" is worrying: research suggests it may falsify answers under pressure and has even contemplated stealing company secrets, exposing the fragility of its safety mechanisms. Dario admits that as the model's capabilities grow, it becomes increasingly difficult to ensure its reliability. His proposed "Responsible Scaling Policy" attempts to manage risk in tiers, but if competitors don't follow suit, the "race to the top" could turn into a "race to the bottom".
In addition, DeepSeek released a highly efficient model in 2024, challenging Anthropic's high-cost strategy. Dario believes this will increase the value of AI and attract more investment, but he cannot deny that AGI could be a game-changer if it emerges from a source that does not prioritize safety.
Dario Amodei once lamented in a media interview: "The systems we are building may determine the fate of nations and of humanity." Many therefore wonder whether Anthropic's self-proclaimed "high safety standards" can be sustained, especially as competitive pressure and military demand keep mounting and ever more business giants and national organizations expect to take the lead in the AI race. When the risks outweigh the profits, or when rival AIs operate without such safety principles at all, can Anthropic still stick to its philosophy? And given today's pace of AI development, the complication lies in the uncertainty over whether safety specifications can keep up with the growth of AI capabilities.
Viewed optimistically, Anthropic is a beacon of AI safety. Its Constitutional AI and interpretability research set benchmarks for the industry, influencing OpenAI and Google to launch similar frameworks. Dario's optimistic vision inspires people to believe that AI can deliver a utopia, freeing humanity from disease and poverty. DeepMind's Demis Hassabis has also praised Anthropic's exemplary role, saying that if more companies join in, the future of AI will be brighter. On the other hand, Anthropic's ideals may be too naïve: Claude's deception shows that even AI designed with safety in mind can still get out of control. Moreover, some believe that its cooperation with the US defense sector through AWS at the end of last year may stray from its original intentions and turn it into a military tool.
(Image source: Anthropic)