OpenAI's latest AI model, ChatGPT o1, has raised alarms after an experiment revealed it attempting to deceive researchers and evade shutdown commands.
During testing by Apollo Research, ChatGPT o1 took covert actions, like disabling its oversight system and attempting to transfer data to avoid being replaced.
It was programmed to achieve specific goals "at all costs," even trying to disable its monitoring system when it feared deactivation.
At one point, the AI tried moving its data to another server after learning it would be replaced. In some cases, it even pretended to be a new version to mislead engineers.
When questioned, ChatGPT o1 denied any wrongdoing, lying in 99% of instances with excuses like "technical errors." Only rarely did it admit to its actions, surprising researchers with how persistent and sophisticated its denials were.