Introduction
With the release of o1, eliciting reasoning from large language models via chain-of-thought has entered a new stage. How can chain-of-thought be used to achieve o1-style reasoning?
The second session will begin at 19:30 on the evening of December 14, 2024. Zhang Jie (Institute of Information Engineering, Chinese Academy of Sciences) and Chu Zheng (Harbin Institute of Technology) will jointly present "Chain-of-Thought in Large Language Models 2.0 and Its Internalization Mechanisms".
Session Overview
This session will begin with a viewing of Jason Wei's talk "Emergent Abilities Unlocked by Scaling up Language Models" from the Swarma Club (集智俱乐部) annual conference. Taking it as a starting point, we will examine the mechanisms behind chain-of-thought, review chain-of-thought variants and related research, and close with reflections on chain-of-thought in o1-style models, with the goal of internalizing reasoning ability into large language models.
Session Outline
Mechanisms of chain-of-thought (CoT) reasoning (a minimal prompting sketch follows this outline)
Jason Wei: Emergent Abilities Unlocked by Scaling up Language Models
"Interpretable" chain-of-thought
X-of-Thought
Tree-of-Thought (ToT)
Graph-of-Thought (GoT)
And More: Buffer-of-Thoughts (BoT), Layer-of-Thought (LoT), Program-of-Thought (PoT), Equation-of-Thought (EoT), Diagram-of-Thought (DoT)
Chain-of-X
Chain-of-Thought
Chain-of-Augmentation
Chain-of-Feedback
Chain-of-Model
Chain-of-thought in o1-style models
Chain-of-thought and Self-X (see the self-consistency voting sketch after this outline)
Chain-of-thought and System 2 thinking
Chain-of-thought and the Socratic method
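As a companion to the first outline item, here is a minimal sketch of few-shot chain-of-thought prompting in the style of Wei et al. (2022, listed in the references below). The only difference between the two exemplars is that the CoT one spells out intermediate steps before the final answer; `query_llm` and `build_prompt` are hypothetical names, and the stub stands in for any LLM completion API.

```python
# Minimal sketch of few-shot chain-of-thought prompting (style of Wei et al., 2022).
# `query_llm` is a hypothetical placeholder, not a real API.

def query_llm(prompt: str) -> str:
    """Stand-in for any LLM completion call; replace with your provider's API."""
    raise NotImplementedError

# Standard exemplar: the question is followed directly by the answer.
STANDARD_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: The answer is 11.\n"
)

# CoT exemplar: the same question, but the answer spells out the
# intermediate reasoning steps before the final answer.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_prompt(question: str, with_cot: bool = True) -> str:
    """Prepend one exemplar to the target question. With with_cot=True the
    model tends to imitate the step-by-step reasoning in the exemplar."""
    exemplar = COT_EXEMPLAR if with_cot else STANDARD_EXEMPLAR
    return f"{exemplar}\nQ: {question}\nA:"
```

Passing `build_prompt(question, with_cot=True)` to a real completion API reproduces the basic CoT prompting setup discussed in the talk.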
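For the "Chain-of-thought and Self-X" item, a minimal sketch of the aggregation step of self-consistency (Wang et al., 2022, also in the references below): sample several CoT completions at nonzero temperature, extract each final answer, and return the majority answer. The `sample_cot_answer` callable is an assumption of this sketch, not a real API.

```python
from collections import Counter
from typing import Callable

def self_consistency(sample_cot_answer: Callable[[str], str],
                     question: str,
                     n: int = 10) -> str:
    """Majority vote over the final answers of n independently sampled CoT
    completions: the aggregation step of self-consistency (Wang et al., 2022).

    `sample_cot_answer` is an assumed callable that runs one
    temperature-sampled CoT completion and returns its extracted answer.
    """
    answers = [sample_cot_answer(question) for _ in range(n)]
    most_common_answer, _ = Counter(answers).most_common(1)[0]
    return most_common_answer
```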
About the Speakers
Zhang Jie is a fourth-year PhD student at the Institute of Information Engineering, Chinese Academy of Sciences, a fellow at Concordia AI (安远AI), and an intern at Shanghai AI Laboratory. With an interdisciplinary background in artificial intelligence and cybersecurity, he focuses on the safety and alignment of large models.
Research interests: trustworthy AI, interpretability
Chu Zheng is a third-year PhD student at the Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, advised by Prof. Liu Ming and Prof. Qin Bing. He is the first author of the survey "A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future".
Research interests: question answering, large language models, chain-of-thought reasoning, LLM-based agents, and related topics
Key References
Wei, Jason, et al. "Chain-of-thought prompting elicits reasoning in large language models." Advances in Neural Information Processing Systems 35 (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/9d5609613524ecf4f15af0f7b31abca4-Paper-Conference.pdf
Lyu, Qing, et al. "Faithful chain-of-thought reasoning." arXiv preprint arXiv:2301.13379 (2023). https://arxiv.org/pdf/2301.13379
Li, Jiachun, et al. "Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners." arXiv preprint arXiv:2405.18915 (2024). https://arxiv.org/pdf/2405.18915
Feng, Guhao, et al. "Towards revealing the mystery behind chain of thought: a theoretical perspective." Advances in Neural Information Processing Systems 36 (2023). https://proceedings.neurips.cc/paper_files/paper/2023/file/dfc310e81992d2e4cedc09ac47eff13e-Paper-Conference.pdf
Merrill, William, and Ashish Sabharwal. "The expressive power of transformers with chain of thought." arXiv preprint arXiv:2310.07923 (2023). https://arxiv.org/pdf/2310.07923
Li, Zhiyuan, et al. "Chain of thought empowers transformers to solve inherently serial problems." arXiv preprint arXiv:2402.12875 (2024). https://arxiv.org/pdf/2402.12875
Hao, Shibo, et al. "Training Large Language Models to Reason in a Continuous Latent Space." arXiv preprint arXiv:2412.06769 (2024). https://openreview.net/pdf?id=tG4SgayTtk
Deng, Yuntian, Yejin Choi, and Stuart Shieber. "From explicit cot to implicit cot: Learning to internalize cot step by step." arXiv preprint arXiv:2405.14838 (2024). https://arxiv.org/pdf/2405.14838
Chu, Zheng, et al. "Navigate through enigmatic labyrinth a survey of chain of thought reasoning: Advances, frontiers and future." Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. https://aclanthology.org/2024.acl-long.65.pdf
Xia, Yu, et al. "Beyond chain-of-thought: A survey of chain-of-x paradigms for llms." arXiv preprint arXiv:2404.15676 (2024). https://arxiv.org/pdf/2404.15676
Wang, Xuezhi, et al. "Self-consistency improves chain of thought reasoning in language models." arXiv preprint arXiv:2203.11171 (2022). https://arxiv.org/pdf/2203.11171
Zhang, Di, et al. "Llama-berry: Pairwise optimization for o1-like olympiad-level mathematical reasoning." arXiv preprint arXiv:2410.02884 (2024). https://arxiv.org/pdf/2410.02884
Qin, Yiwei, et al. "O1 Replication Journey: A Strategic Progress Report--Part 1." arXiv preprint arXiv:2410.18982 (2024). https://arxiv.org/pdf/2410.18982
Su, DiJia, et al. "Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces." arXiv preprint arXiv:2410.09918 (2024). https://arxiv.org/abs/2410.09918
Zheng, Xin, et al. "Critic-cot: Boosting the reasoning abilities of large language model via chain-of-thoughts critic." arXiv preprint arXiv:2408.16326 (2024). https://arxiv.org/pdf/2408.16326
Liu, Jiayu, et al. "SocraticLM: Exploring Socratic Personalized Teaching with Large Language Models." The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024). https://openreview.net/pdf?id=qkoZgJhxsA
Livestream Details
Time:
19:30-21:30, Saturday evening, December 14, 2024
Scan the QR code to register and join the group chat, gain access to replays of the reading group series, become a seed user of the AI community, and exchange ideas with front-line researchers and industry practitioners to jointly advance the community.
Apply to Be a Speaker
Any reading group member may apply to serve as a speaker during the program. Speakers, as members of the reading group, follow its content co-creation and sharing mechanism: they receive a refund of the registration fee and share in all content resources produced by the reading group. For details, see:
Launch of the Large Model 2.0 Reading Group
The o1 model represents a new paradigm in which large language models integrate learning and reasoning. Co-initiated by the Swarma Club together with Zhang Jiang (Professor, School of Systems Science, Beijing Normal University), Feng Xidong (Research Scientist, Google DeepMind), Wang Weixun (Reinforcement Learning Researcher, Alibaba), and Zhang Jie (Institute of Information Engineering, Chinese Academy of Sciences), this reading group will focus on the evolution of reasoning paradigms for large models, reasoning optimization based on search and Monte Carlo tree search, large-model optimization via reinforcement learning, chain-of-thought methods and their internalization mechanisms, and self-improvement and reasoning verification. Through the reading group, we hope to explore concrete technical paths toward implementing o1 and to better understand the nature of machine reasoning and artificial intelligence.
Starting December 7, 2024, sessions will be held roughly once every Saturday, for an expected duration of 6-8 weeks. Anyone interested is welcome to sign up and spark more ideas together!
For details, see: