AI-Agents // use cases, hype, fun

sbagency
2 min readNov 27, 2023

https://twitter.com/DivGarg9/status/1728854189873549809
https://www.multion.ai/

The idea of putting an LLM in an RL loop is sound, and in some cases it may work, but the core problem with RL remains: it requires far too many iterations to converge.
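The sample-inefficiency point can be seen even in the simplest RL setting. The sketch below is a toy epsilon-greedy bandit (not anything from the linked projects): after a handful of steps the value estimates are noisy, and only after thousands of iterations do they reliably single out the best arm. An LLM-as-policy loop inherits the same issue, with each iteration far more expensive.

```python
import random

def run_bandit(true_means, steps, epsilon=0.1, seed=0):
    """Epsilon-greedy multi-armed bandit: a minimal RL loop that
    illustrates how many iterations value estimates need to converge."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n
    values = [0.0] * n
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(n)                          # explore
        else:
            a = max(range(n), key=lambda i: values[i])    # exploit
        reward = true_means[a] + rng.gauss(0, 1)          # noisy reward signal
        counts[a] += 1
        values[a] += (reward - values[a]) / counts[a]     # incremental mean
    return values

# Few iterations: unreliable estimates. Many iterations: convergence.
early = run_bandit([0.1, 0.5, 0.9], steps=30)
late = run_bandit([0.1, 0.5, 0.9], steps=5000)
```

With 30 steps the ranking of arms can easily be wrong; with 5000 it is essentially always right. That gap is the cost the post is pointing at.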

https://www.linkedin.com/posts/natolambert_the-q-hypothesis-tree-of-thoughts-reasoning-activity-7133823380625006592-gcWw
https://www.interconnects.ai/p/q-star

Searching over language/reasoning steps via tree-of-thoughts reasoning… the Deep RL techniques that enabled successes like AlphaGo: self-play and look-ahead planning.

Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion of self-play will look like AI Feedback rather than competitive processes.
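A minimal sketch of what "self-play as AI feedback" could look like for LLMs, under assumed stand-ins: `toy_policy` plays the role of the model sampling candidate answers, `judge` plays the role of an AI feedback model ranking them, and each round the top-ranked outputs seed the next round. The names and scoring rule are illustrative, not from the linked post.

```python
import random

def ai_feedback_round(samples, judge, top_k=2):
    """One round of AI-feedback self-improvement: rank the model's own
    outputs with an AI judge and keep the best as the next seed pool."""
    return sorted(samples, key=judge, reverse=True)[:top_k]

def toy_policy(seed_pool, rng, n_samples=6):
    """Stand-in for an LLM: extend each seed answer with a random token."""
    tokens = ["useful", "detail", "filler"]
    per_seed = n_samples // len(seed_pool)
    return [s + " " + rng.choice(tokens) for s in seed_pool for _ in range(per_seed)]

def judge(answer):
    """Stand-in for an AI judge: reward informative tokens."""
    words = answer.split()
    return words.count("useful") + words.count("detail")

rng = random.Random(0)
pool = ["answer:", "answer:"]
for _ in range(3):  # three rounds of generate -> judge -> select
    pool = ai_feedback_round(toy_policy(pool, rng), judge)
```

The competitive pressure here comes not from an opponent but from the judge preferring some of the model's own outputs over others, which matches the post's claim that LLM self-play will mostly look like AI feedback.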

Look-ahead planning is the idea of using a model of the world to reason into the future and produce better actions or outputs. The two variants are based on Model Predictive Control (MPC), which is often used on continuous states, and Monte-Carlo Tree Search (MCTS), which works with discrete actions and states.
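The MCTS variant can be sketched compactly. Below is a minimal single-level MCTS over discrete actions on a toy problem (choose bits to maximize the number of ones in a length-5 string); the problem and all function names are illustrative assumptions, not the method of any linked project. Each simulation selects a root action by UCB1, runs a random rollout as the "model of the world", and backs up the reward.

```python
import math
import random

def mcts_choose(state, legal, step, rollout_reward, n_sims=200, c=1.4, seed=0):
    """Minimal MCTS over discrete actions: UCB1 selection at the root,
    random rollouts for evaluation, most-visited action returned."""
    rng = random.Random(seed)
    stats = {a: [0, 0.0] for a in legal(state)}  # action -> [visits, total reward]
    for i in range(1, n_sims + 1):
        def ucb(a):
            n, w = stats[a]
            if n == 0:
                return float("inf")              # try unvisited actions first
            return w / n + c * math.sqrt(math.log(i) / n)
        a = max(stats, key=ucb)                  # selection
        r = rollout_reward(step(state, a), rng)  # simulation (random playout)
        stats[a][0] += 1                         # backup: visits
        stats[a][1] += r                         # backup: reward
    return max(stats, key=lambda a: stats[a][0])

# Toy world model: pick 5 bits, reward = number of ones.
def legal(state): return [0, 1]
def step(state, a): return state + [a]
def rollout_reward(state, rng):
    full = state + [rng.randint(0, 1) for _ in range(5 - len(state))]
    return sum(full)

best = mcts_choose([], legal, step, rollout_reward)
```

Choosing a `1` first is better in expectation, and the search's visit counts concentrate on it. MPC would differ mainly in rolling a continuous dynamics model forward over a horizon and optimizing the action sequence directly rather than building a search tree.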

https://www.linkedin.com/posts/jasonkuperberg_we-are-excited-to-open-source-the-self-operating-activity-7134943081635672064-rOOz
https://github.com/OthersideAI/self-operating-computer

Written by sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.