
Introduction

With the release of o1, eliciting reasoning in large language models via chain of thought has entered a new stage. How can chain of thought be used to achieve o1-style reasoning?

The second session will officially begin at 19:30 on the evening of December 14, 2024. Zhang Jie from the Institute of Information Engineering, Chinese Academy of Sciences, and Chu Zheng from Harbin Institute of Technology will jointly present "Chain of Thought in Large Language Models 2.0 and Its Internalization Mechanisms".


Talk Overview

This session will begin with a viewing of Jason Wei's talk "Emergent Abilities Unlocked by Scaling up Language Models" from the Swarma Club (集智俱乐部) annual meeting. Taking that as a starting point, we will examine the mechanisms behind chain of thought, review chain-of-thought variants and related research, and conclude with reflections on chain of thought in o1-like models, with the aim of internalizing reasoning ability into large language models.

Talk Outline

  • Chain-of-Thought (CoT) mechanisms

    • Jason Wei: Emergent abilities unlocked by scaling up language models

    • "Interpretable" chain of thought

  • X-of-Thought

    • Tree-of-Thought (ToT)

    • Graph-of-Thought (GoT)

    • And more: Buffer-of-Thoughts (BoT), Layer-of-Thought (LoT), Program-of-Thought (PoT), Equation-of-Thought (EoT), Diagram-of-Thought (DoT)

  • Chain-of-X

    • Chain-of-Thought

    • Chain-of-Augmentation

    • Chain-of-Feedback

    • Chain-of-Model

  • Chain of thought in o1-like models

    • Chain of thought and Self-X (see the sketch after this list)

    • Chain of thought and System 2

    • Chain of thought and the Socratic method
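To make the Self-X item above concrete, here is a minimal sketch of chain-of-thought prompting combined with self-consistency (Wang et al., 2022): sample several reasoning chains at non-zero temperature and majority-vote over their final answers. The prompt, the `generate_cot` stand-in, and the answer format are illustrative assumptions for this sketch, not part of the speakers' material or any specific model API.

```python
import random
import re
from collections import Counter

# Illustrative CoT-style prompt (hypothetical example, not from the talk).
COT_PROMPT = (
    "Q: A farmer has 15 apples and gives away 6. How many remain?\n"
    "A: Let's think step by step."
)

def generate_cot(prompt: str) -> str:
    """Stand-in for sampling one reasoning chain from an LLM.

    A real implementation would call a language model at temperature > 0;
    here we draw from canned chains so the voting logic runs on its own.
    """
    chains = [
        "15 - 6 = 9. The answer is 9.",
        "The farmer keeps 15 - 6 = 9 apples. The answer is 9.",
        "Giving away 6 of 15 leaves 9. The answer is 9.",
        "15 - 6 = 8. The answer is 8.",  # a deliberately faulty chain
    ]
    return random.choice(chains)

def extract_answer(chain: str) -> str:
    """Pull the final answer out of a reasoning chain."""
    match = re.search(r"The answer is (\S+?)\.?$", chain.strip())
    return match.group(1) if match else chain.strip()

def self_consistency(prompt: str, num_samples: int = 9) -> str:
    """Sample several chains and return the majority-vote answer."""
    answers = [extract_answer(generate_cot(prompt)) for _ in range(num_samples)]
    return Counter(answers).most_common(1)[0][0]

if __name__ == "__main__":
    print(self_consistency(COT_PROMPT))  # usually prints "9"
```

In practice `generate_cot` would be replaced by real model sampling; the voting step is unchanged, which is why self-consistency is often described as a Self-X wrapper around plain chain-of-thought prompting.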

About the Speakers

Zhang Jie is a fourth-year PhD student at the Institute of Information Engineering, Chinese Academy of Sciences, a fellow at Concordia AI (安远AI), and an intern at the Shanghai AI Laboratory. With an interdisciplinary background in artificial intelligence and cybersecurity, he focuses on the safety and alignment of large language models.

Research interests: trustworthy AI, interpretability

Chu Zheng is a third-year PhD student at the Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, advised by Prof. Liu Ming and Prof. Qin Bing. He is the first author of the survey "A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future".

Research interests: question answering, large language models, chain-of-thought reasoning, and LLM-based agents

Key References

  • Wei, Jason, et al. "Chain-of-thought prompting elicits reasoning in large language models." Advances in Neural Information Processing Systems 35 (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/9d5609613524ecf4f15af0f7b31abca4-Paper-Conference.pdf

  • Lyu, Qing, et al. "Faithful chain-of-thought reasoning." arXiv preprint arXiv:2301.13379 (2023). https://arxiv.org/pdf/2301.13379

  • Li, Jiachun, et al. "Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners." arXiv preprint arXiv:2405.18915 (2024). https://arxiv.org/pdf/2405.18915

  • Feng, Guhao, et al. "Towards revealing the mystery behind chain of thought: a theoretical perspective." Advances in Neural Information Processing Systems 36 (2023). https://proceedings.neurips.cc/paper_files/paper/2023/file/dfc310e81992d2e4cedc09ac47eff13e-Paper-Conference.pdf

  • Merrill, William, and Ashish Sabharwal. "The expressive power of transformers with chain of thought." arXiv preprint arXiv:2310.07923 (2023). https://arxiv.org/pdf/2310.07923

  • Li, Zhiyuan, et al. "Chain of thought empowers transformers to solve inherently serial problems." arXiv preprint arXiv:2402.12875 (2024). https://arxiv.org/pdf/2402.12875

  • Hao, Shibo, et al. "Training Large Language Models to Reason in a Continuous Latent Space." arXiv preprint arXiv:2412.06769 (2024). https://openreview.net/pdf?id=tG4SgayTtk

  • Deng, Yuntian, Yejin Choi, and Stuart Shieber. "From explicit cot to implicit cot: Learning to internalize cot step by step." arXiv preprint arXiv:2405.14838 (2024). https://arxiv.org/pdf/2405.14838

  • Chu, Zheng, et al. "Navigate through enigmatic labyrinth a survey of chain of thought reasoning: Advances, frontiers and future." Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. https://aclanthology.org/2024.acl-long.65.pdf

  • Xia, Yu, et al. "Beyond chain-of-thought: A survey of chain-of-x paradigms for llms." arXiv preprint arXiv:2404.15676 (2024). https://arxiv.org/pdf/2404.15676

  • Wang, Xuezhi, et al. "Self-consistency improves chain of thought reasoning in language models." arXiv preprint arXiv:2203.11171 (2022). https://arxiv.org/pdf/2203.11171

  • Zhang, Di, et al. "Llama-berry: Pairwise optimization for o1-like olympiad-level mathematical reasoning." arXiv preprint arXiv:2410.02884 (2024). https://arxiv.org/pdf/2410.02884

  • Qin, Yiwei, et al. "O1 Replication Journey: A Strategic Progress Report--Part 1." arXiv preprint arXiv:2410.18982 (2024). https://arxiv.org/pdf/2410.18982

  • Su, DiJia, et al. "Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces." arXiv preprint arXiv:2410.09918 (2024). https://arxiv.org/abs/2410.09918

  • Zheng, Xin, et al. "Critic-cot: Boosting the reasoning abilities of large language model via chain-of-thoughts critic." arXiv preprint arXiv:2408.16326 (2024). https://arxiv.org/pdf/2408.16326

  • Liu, Jiayu, et al. "SocraticLM: Exploring Socratic Personalized Teaching with Large Language Models." The Thirty-eighth Annual Conference on Neural Information Processing Systems. 2024. https://openreview.net/pdf?id=qkoZgJhxsA

Livestream Information

Time:

December 14, 2024 (this Saturday), 19:30-21:30

Scan the QR code to register and join the group chat, where you can access replays of the reading club series, become a seed user of the AI community, and exchange ideas with front-line researchers and industry practitioners to advance the community together.

Sign Up to Be a Speaker

Any reading club member may apply to become a speaker during the reading club. Speakers, as members, follow the content co-creation and sharing mechanism: they receive a refund of the registration fee and share in all content resources produced by the reading club. For details, see:

Launch of the Large Language Model 2.0 Reading Club

The o1 model represents a new paradigm in which large language models integrate learning and reasoning. Jointly initiated by the Swarma Club (集智俱乐部) together with Zhang Jiang, professor at the School of Systems Science, Beijing Normal University; Feng Xidong, research scientist at Google DeepMind; Wang Weixun, reinforcement learning researcher at Alibaba; and Zhang Jie of the Institute of Information Engineering, Chinese Academy of Sciences, this reading club will focus on the evolution of reasoning paradigms in large models, reasoning optimization based on search and Monte Carlo tree search, large model optimization based on reinforcement learning, chain-of-thought methods and their internalization mechanisms, and self-improvement and reasoning verification. Through the reading club we hope to explore the technical paths toward concrete implementations of o1 and to better understand the nature of machine reasoning and artificial intelligence.

Starting from December 7, 2024, sessions will be held once a week on Saturdays and are expected to run for about 6-8 weeks. Anyone interested is welcome to sign up and spark more ideas together!

For details, see: