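Today, in my column for The Wall Street Journal, I discussed why it matters that AI 2.0 technologies, exemplified by generative AI, become more open.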

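AI 2.0 will accelerate the creation of knowledge and dramatically raise productivity. Its disruptive impact will far exceed that of the printing press and will set off the greatest technological revolution in history.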

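The power of AI 2.0 should not be monopolized by a privileged elite. It should benefit more people and reach every industry, which is exactly why I founded 01.AI (零一万物). 01.AI’s Yi-34B is an open-source model with world-leading performance: shortly after its release it took first place on both the English-language Hugging Face leaderboard and the Chinese-language C-Eval leaderboard. Open source puts the power of technological change in everyone’s hands. We welcome developers to use Yi and to build a new “AI-first” innovation ecosystem together, so that AI 2.0 can truly leave its siloed black boxes and benefit more of humanity.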

The full text of the column follows:

Chairman of Sinovation Ventures and CEO of 01.AI

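In 15th-century Europe, Gutenberg’s printing press transformed human life as soon as it appeared. Ideas could spread around the world at unprecedented speed, creating enormous value. Behind the scenes of the press’s early days, however, Gutenberg had intended to keep the new technology to himself. His investor, Fust, fell out with him over this, quickly poached his key technician, Schoeffer, and went on to bring a brand-new printing press to market.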

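What Fust did benefited every generation that followed. Imagine if printing technology had been held exclusively by one company, or confined to a single country: the centuries of social advancement and spread of civilization that followed might never have happened. The analogy may be imperfect, but it helps us examine some of the debates now facing the field of artificial intelligence.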

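Back in the 21st century, generative AI, a form of artificial intelligence that can produce text, images and other media, is igniting a worldwide disruption far greater than that of the printing press. Generative AI can absorb most of human knowledge. By training large language models with hundreds of billions of parameters, AI is gaining the ability to write, draw, reason and solve problems, becoming a powerful productivity tool.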

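The printing press accelerated the spread of knowledge centuries ago; generative AI will accelerate the production of knowledge in this century. It can understand and explain problems, generate ideas and produce content at speeds beyond human ability, greatly raising knowledge productivity. AI will help businesses grow, create unprecedented economic value and, in turn, change people’s lives. I have been involved in artificial intelligence for more than 40 years, since my student days: after leaving academic research I worked in industry at Apple, Microsoft and Google, and then moved into AI investing. That accumulated experience underlies my firm prediction: generative AI will set off the greatest technological revolution in history.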

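I believe this powerful technology should not be monopolized by a privileged elite. No industry and no individual should be washed out by this major new wave of AI; everyone has the right to access, understand and use AI. That is why I founded 01.AI. As a startup, 01.AI is dedicated to developing large language models for generative AI. Not long ago we released our first high-quality model, Yi-34B, with 34 billion parameters, on open-source community platforms. Anyone can use, enhance and customize our model and its source code through platforms such as Hugging Face and GitHub.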

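Yi-34B is a “golden ratio” size, making it an ideal open-source choice for researchers, entrepreneurs and small and medium-sized businesses. OpenAI and Google have larger, better-performing models, but they have kept them closed-source. I am not suggesting that every model should be open-sourced. Rather, I hope that tech giants, beyond attending to their own business and profits, will also embrace and contribute to the open-source community.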

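For example, a large tech company could open-source its smaller models while keeping its larger ones closed, which is also part of 01.AI’s plan. Open source helps researchers, educators, students, entrepreneurs, hobbyists and even nonprofit organizations put large models to practical use. In reality, most groups simply cannot afford to develop large models, given the extremely high technical and cost barriers. Embracing open source helps make AI widely available, and that inclusiveness is essential.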

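Large companies tend to lock their core technology in a “black box,” but persistent secrecy is bound to limit the broad development of generative AI, and some groups will inevitably be marginalized. For example, the world’s top language models today are trained mainly on American and English-language data. Although they can handle many languages, they often perform only moderately in less widely used ones, and the experience for users in smaller or poorer countries is very poor. These countries lack the resources to build very large datasets and do not have the technology to develop high-quality models for their own languages, so they are bound to miss the high-speed train of the generative AI transformation.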

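In addition, the mainstream large models internationally carry an “American bias.” Because of how they were trained, they reflect the culture and values of the United States, which may not suit other regions. One person’s honey is another’s poison. What one country regards as the norm may be rude or even illegal in another country or region. Even the United States and Europe differ greatly, to say nothing of the West and the rest of the world. In my view, no universal model can meet the needs of every country and region. Each deserves a high-quality model suited to its own culture, values, religion and language.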

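Some foreign media have called 01.AI China’s OpenAI, but our thinking is more open than OpenAI’s. We believe the key contest is not China versus the United States but open versus closed. Despite limited resources, we are determined to develop high-quality models for more languages so that AI serves a broader population worldwide. We do not want anyone to be left behind in the AI era.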

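As a technology optimist, I firmly believe artificial intelligence will advance human society and amplify what is essentially human rather than replace people. Let us embrace openness as we set out on the road to an AI-empowered future.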

This article was translated from an English-language column in The Wall Street Journal. The original text follows:

Artificial Intelligence Needs Open-Source Models to Reach Its Potential

BY KAI-FU LEE

Johannes Gutenberg’s printing press revolutionized life in the 15th century, making it possible for ideas to travel around the world at previously unimaginable speeds and creating huge gains for mankind. Gutenberg tried to keep his technique a secret, but a disgruntled former investor, Johann Fust, soon replicated his device. Fust launched his own press and poached Gutenberg’s top techie, Peter Schoeffer.

It was lucky for the rest of us that he did. Imagine if the printing press had been kept secret and controlled by one company, or confined within one nation. Centuries of human progress might never have happened. It’s an imperfect analogy, but it offers a useful way to frame the current debate around artificial intelligence.

Generative AI, which is artificial intelligence capable of producing text, images and other media, will revolutionize life in the 21st century. It will be much more disruptive than the printing press. Generative AI’s ability to digest nearly the entire breadth of human knowledge by training large-language-model algorithms with hundreds of billions of parameters allows it to write, draw, reason and solve problems. These potent new tools will amplify the power of knowledge workers.

The printing press accelerated the spread of knowledge. Generative AI will accelerate the creation of knowledge. It can understand, explain and create ideas and content at speeds unfathomable by humans. Generative AI will improve productivity and generate untold economic value. It will help entrepreneurs make fortunes and, more important, transform lives. I can say from my four decades of involvement in AI (first as an academic, then working with Microsoft and Google, and later as a venture capitalist) that generative AI will unleash the greatest technology revolution ever.

But we can’t keep this power locked away for only the privileged elite. Given the massive technological paradigm shifts that are coming, it will be necessary for people from all backgrounds to understand and have access to the technology. It is crucial that no one be left out. This is why I decided to launch 01.AI, a startup building foundational large language models, which are the building blocks of generative AI. We launched our first language model, Yi-34B, with 34 billion parameters. Anyone can engage with, enhance and tailor our model and its source code, which is available on GitHub.

While the Yi-34B model’s manageable size makes it ideal for researchers, entrepreneurs and smaller companies, OpenAI and Google have kept their larger and better models proprietary. I am not suggesting that every model should be open. But I hope every technology company will embrace and contribute to the open-source community, even as they maintain their business and profit goals.

A technology giant could open-source a smaller model while keeping larger models proprietary. This is 01.AI’s intention. This openness will make models accessible to researchers, educators, students, entrepreneurs, hobbyists and nonprofits. This inclusiveness is critical, because many communities can’t afford to use the more expensive proprietary models. Embracing openness democratizes generative AI.

Thwarting the spread of generative AI by keeping it proprietary leaves some groups marginalized as successful companies put their tools in black boxes. The best generative AI models were mostly trained on American and English data. While they are functionally multilingual, they perform poorly when using languages that are less prominent on the internet. Users from smaller or poorer nations are provided a much inferior experience. They don’t have the resources to generate and collect giant repositories of data in their languages or the technological know-how to develop high-quality native language models. The generative AI revolution is leaving them behind.

There’s also an American bias in the dominant proprietary models. Because of how they were trained, these models reflect the culture and values of the U.S., which may not suit other places. What one country sees as the norm may be offensive or even illegal in another. There are huge differences between the U.S. and Europe, never mind between the West and the rest of the world. A universal model can’t possibly fit every country’s needs. Each country should have a high-quality model that is tailored to its culture, values, religion and language.

Some in the media have described 01.AI as China’s answer to OpenAI, the developer behind ChatGPT. We see ourselves as the more “open” answer to OpenAI. In our view, the key competition isn’t China vs. the U.S. Rather, it’s open vs. closed systems. Even with only modest resources, we are determined to develop high-quality models for more languages to make this technology accessible to more people globally. We don’t want AI to leave anyone behind.

As a technology optimist, I firmly believe that artificial intelligence will advance the human race, amplifying our humanity rather than replacing it. But that can be accomplished only if we remain committed to the virtue of openness.

Mr. Lee is CEO of 01.AI and chairman of Sinovation Ventures.


Original article: https://www.wsj.com/articles/artificial-intelligence-needs-open-source-models-to-reach-its-potential-e1f47d3f?