Introduction
With the release of o1, eliciting reasoning from large language models via chain-of-thought has entered a new stage. How can chain-of-thought be used to achieve o1-style reasoning?
The second session will begin at 19:30 on the evening of December 14, 2024. Zhang Jie (Institute of Information Engineering, Chinese Academy of Sciences) and Chu Zheng (Harbin Institute of Technology) will jointly present "Chain-of-Thought in Large Language Models 2.0 and Its Internalization Mechanisms".
Session Overview
This session will begin with a viewing of Jason Wei's talk "Emergent Abilities Unlocked by Scaling up Language Models" from the Swarma Club (集智俱乐部) annual conference. Taking it as a starting point, we will examine the mechanisms behind chain-of-thought, review chain-of-thought variants and related research, and close with reflections on chain-of-thought in o1-style models, with the goal of internalizing reasoning ability into large language models.
Session Outline
Mechanisms of chain-of-thought (CoT) reasoning (a minimal prompting sketch follows this outline)
Jason Wei: Emergent Abilities Unlocked by Scaling up Language Models
"Interpretable" chain-of-thought
X-of-Thought
Tree-of-Thought (ToT)
Graph-of-Thought (GoT)
And More: Buffer-of-Thoughts (BoT), Layer-of-Thought (LoT), Program-of-Thought (PoT), Equation-of-Thought (EoT), Diagram-of-Thought (DoT)
Chain-of-X
Chain-of-Thought
Chain-of-Augmentation
Chain-of-Feedback
Chain-of-Model
Chain-of-thought in o1-style models
Chain-of-thought and Self-X (see the self-consistency voting sketch after this outline)
Chain-of-thought and System 2 thinking
Chain-of-thought and the Socratic method
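As a companion to the first outline item, here is a minimal sketch of few-shot chain-of-thought prompting in the style of Wei et al. (2022, listed in the references below). The only difference between the two exemplars is that the CoT one spells out intermediate steps before the final answer; `query_llm` and `build_prompt` are hypothetical names, and the stub stands in for any LLM completion API.

```python
# Minimal sketch of few-shot chain-of-thought prompting (style of Wei et al., 2022).
# `query_llm` is a hypothetical placeholder, not a real API.

def query_llm(prompt: str) -> str:
    """Stand-in for any LLM completion call; replace with your provider's API."""
    raise NotImplementedError

# Standard exemplar: the question is followed directly by the answer.
STANDARD_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: The answer is 11.\n"
)

# CoT exemplar: the same question, but the answer spells out the
# intermediate reasoning steps before the final answer.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_prompt(question: str, with_cot: bool = True) -> str:
    """Prepend one exemplar to the target question. With with_cot=True the
    model tends to imitate the step-by-step reasoning in the exemplar."""
    exemplar = COT_EXEMPLAR if with_cot else STANDARD_EXEMPLAR
    return f"{exemplar}\nQ: {question}\nA:"
```

Passing `build_prompt(question, with_cot=True)` to a real completion API reproduces the basic CoT prompting setup discussed in the talk.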
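For the "Chain-of-thought and Self-X" item, a minimal sketch of the aggregation step of self-consistency (Wang et al., 2022, also in the references below): sample several CoT completions at nonzero temperature, extract each final answer, and return the majority answer. The `sample_cot_answer` callable is an assumption of this sketch, not a real API.

```python
from collections import Counter
from typing import Callable

def self_consistency(sample_cot_answer: Callable[[str], str],
                     question: str,
                     n: int = 10) -> str:
    """Majority vote over the final answers of n independently sampled CoT
    completions: the aggregation step of self-consistency (Wang et al., 2022).

    `sample_cot_answer` is an assumed callable that runs one
    temperature-sampled CoT completion and returns its extracted answer.
    """
    answers = [sample_cot_answer(question) for _ in range(n)]
    most_common_answer, _ = Counter(answers).most_common(1)[0]
    return most_common_answer
```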
About the Speakers
Zhang Jie is a fourth-year PhD student at the Institute of Information Engineering, Chinese Academy of Sciences, a fellow at Concordia AI (安远AI), and an intern at Shanghai AI Laboratory. With an interdisciplinary background in artificial intelligence and cybersecurity, he focuses on the safety and alignment of large models.
Research interests: trustworthy AI, interpretability
Chu Zheng is a third-year PhD student at the Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, advised by Prof. Liu Ming and Prof. Qin Bing. He is the first author of the survey "A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future".
Research interests: question answering, large language models, chain-of-thought reasoning, LLM-based agents, and related topics
Key References
Wei, Jason, et al. "Chain-of-thought prompting elicits reasoning in large language models." Advances in Neural Information Processing Systems 35 (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/9d5609613524ecf4f15af0f7b31abca4-Paper-Conference.pdf
Lyu, Qing, et al. "Faithful chain-of-thought reasoning." arXiv preprint arXiv:2301.13379 (2023). https://arxiv.org/pdf/2301.13379
Li, Jiachun, et al. "Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners." arXiv preprint arXiv:2405.18915 (2024). https://arxiv.org/pdf/2405.18915
Feng, Guhao, et al. "Towards revealing the mystery behind chain of thought: a theoretical perspective." Advances in Neural Information Processing Systems 36 (2023). https://proceedings.neurips.cc/paper_files/paper/2023/file/dfc310e81992d2e4cedc09ac47eff13e-Paper-Conference.pdf
Merrill, William, and Ashish Sabharwal. "The expressive power of transformers with chain of thought." arXiv preprint arXiv:2310.07923 (2023). https://arxiv.org/pdf/2310.07923
Li, Zhiyuan, et al. "Chain of thought empowers transformers to solve inherently serial problems." arXiv preprint arXiv:2402.12875 (2024). https://arxiv.org/pdf/2402.12875
Hao, Shibo, et al. "Training Large Language Models to Reason in a Continuous Latent Space." arXiv preprint arXiv:2412.06769 (2024). https://openreview.net/pdf?id=tG4SgayTtk
Deng, Yuntian, Yejin Choi, and Stuart Shieber. "From explicit cot to implicit cot: Learning to internalize cot step by step." arXiv preprint arXiv:2405.14838 (2024). https://arxiv.org/pdf/2405.14838
Chu, Zheng, et al. "Navigate through enigmatic labyrinth a survey of chain of thought reasoning: Advances, frontiers and future." Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. https://aclanthology.org/2024.acl-long.65.pdf
Xia, Yu, et al. "Beyond chain-of-thought: A survey of chain-of-x paradigms for llms." arXiv preprint arXiv:2404.15676 (2024). https://arxiv.org/pdf/2404.15676
Wang, Xuezhi, et al. "Self-consistency improves chain of thought reasoning in language models." arXiv preprint arXiv:2203.11171 (2022). https://arxiv.org/pdf/2203.11171
Zhang, Di, et al. "Llama-berry: Pairwise optimization for o1-like olympiad-level mathematical reasoning." arXiv preprint arXiv:2410.02884 (2024). https://arxiv.org/pdf/2410.02884
Qin, Yiwei, et al. "O1 Replication Journey: A Strategic Progress Report--Part 1." arXiv preprint arXiv:2410.18982 (2024). https://arxiv.org/pdf/2410.18982
Su, DiJia, et al. "Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces." arXiv preprint arXiv:2410.09918 (2024). https://arxiv.org/abs/2410.09918
Zheng, Xin, et al. "Critic-cot: Boosting the reasoning abilities of large language model via chain-of-thoughts critic." arXiv preprint arXiv:2408.16326 (2024). https://arxiv.org/pdf/2408.16326
Liu, Jiayu, et al. "SocraticLM: Exploring Socratic Personalized Teaching with Large Language Models." The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024). https://openreview.net/pdf?id=qkoZgJhxsA
Livestream Details
Time:
19:30-21:30, Saturday evening, December 14, 2024
Scan the QR code to register and join the group chat, gain access to replays of the reading group series, become a seed user of the AI community, and exchange ideas with front-line researchers and industry practitioners to jointly advance the community.
Apply to Be a Speaker
Any reading group member may apply to serve as a speaker during the program. Speakers, as members of the reading group, follow its content co-creation and sharing mechanism: they receive a refund of the registration fee and share in all content resources produced by the reading group. For details, see:
Launch of the Large Model 2.0 Reading Group
The o1 model represents a new paradigm in which large language models integrate learning and reasoning. Co-initiated by the Swarma Club together with Zhang Jiang (Professor, School of Systems Science, Beijing Normal University), Feng Xidong (Research Scientist, Google DeepMind), Wang Weixun (Reinforcement Learning Researcher, Alibaba), and Zhang Jie (Institute of Information Engineering, Chinese Academy of Sciences), this reading group will focus on the evolution of reasoning paradigms for large models, reasoning optimization based on search and Monte Carlo tree search, large-model optimization via reinforcement learning, chain-of-thought methods and their internalization mechanisms, and self-improvement and reasoning verification. Through the reading group, we hope to explore concrete technical paths toward implementing o1 and to better understand the nature of machine reasoning and artificial intelligence.
Starting December 7, 2024, sessions will be held roughly once every Saturday, for an expected duration of 6-8 weeks. Anyone interested is welcome to sign up and spark more ideas together!
For details, see: