IT之家 reported on April 12 that the Financial Times (FT) published a report yesterday (April 11) stating that OpenAI has sharply compressed the safety-testing time for its latest large language models (LLMs): what previously took months is now, for the newest o3 model, a matter of just a few days.
Competition-driven safety concessions
According to eight people familiar with the matter, OpenAI has significantly reduced the time allotted to safety testing of its models, leaving employees and third-party testing teams only a few days to run "evaluations" (i.e., tests of a model's risks and performance). Previously, this process usually took months.
According to the report, OpenAI faces fierce competition from rivals such as Meta, Google, and xAI and needs to launch new models quickly to maintain its market advantage. The o3 model is scheduled for release as early as next week, leaving testers less than a week for safety checks, compared with the six-month testing period for GPT-4.
A person who tested GPT-4 said that safety testing used to be more thorough: some dangerous capabilities were discovered only after two months of testing. Now, competitive pressure is pushing the company to prioritize speed over potential risks.
Insufficient testing and lack of regulation
There is currently no global standard for AI safety testing, but the European Union's AI Act will take effect later this year, requiring companies to conduct safety testing on their most powerful models.
Daniel Kokotajlo, head of the AI Futures Project, said that because there is no mandatory regulation, companies do not voluntarily disclose their models' dangerous capabilities, and competitive pressure further exacerbates the risk.
OpenAI has pledged to build customized versions of its models to test their potential for misuse, for example whether they could help make a biological virus more transmissible.
Such testing requires a significant investment of resources, including hiring external experts, building specific datasets, and "fine-tuning" the models. But OpenAI has performed only limited fine-tuning on older models, and its latest models, such as o3 and o4-mini, have not been fully tested. Steven Adler, a former safety researcher at OpenAI, criticized the company, arguing that if it does not keep its testing commitments, the public has a right to know.
Safety testing does not cover the final model
Another problem is that safety testing is often performed on early "checkpoints" rather than on the final released model. A former OpenAI technical staff member said releasing an updated model that has not been tested is "bad practice," while OpenAI argued that its checkpoints are "substantially consistent" with the final models and that it uses automated testing to improve efficiency and ensure safety.