AI-Agents // use cases, hype, fun

sbagency
2 min readNov 27, 2023

https://twitter.com/DivGarg9/status/1728854189873549809
https://www.multion.ai/

The idea of putting an LLM in an RL loop is sound, and in some cases it may work, but the core problem with RL remains: it requires far too many iterations to converge.
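The sample-inefficiency point can be seen even in the simplest RL setting. The sketch below is a toy epsilon-greedy bandit (not anything from the linked projects): after a handful of steps the value estimates are noisy, and only after thousands of iterations do they reliably single out the best arm. An LLM-as-policy loop inherits the same issue, with each iteration far more expensive.

```python
import random

def run_bandit(true_means, steps, epsilon=0.1, seed=0):
    """Epsilon-greedy multi-armed bandit: a minimal RL loop that
    illustrates how many iterations value estimates need to converge."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n
    values = [0.0] * n
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(n)                          # explore
        else:
            a = max(range(n), key=lambda i: values[i])    # exploit
        reward = true_means[a] + rng.gauss(0, 1)          # noisy reward signal
        counts[a] += 1
        values[a] += (reward - values[a]) / counts[a]     # incremental mean
    return values

# Few iterations: unreliable estimates. Many iterations: convergence.
early = run_bandit([0.1, 0.5, 0.9], steps=30)
late = run_bandit([0.1, 0.5, 0.9], steps=5000)
```

With 30 steps the ranking of arms can easily be wrong; with 5000 it is essentially always right. That gap is the cost the post is pointing at.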

https://www.linkedin.com/posts/natolambert_the-q-hypothesis-tree-of-thoughts-reasoning-activity-7133823380625006592-gcWw
https://www.interconnects.ai/p/q-star

Searching over language/reasoning steps via tree-of-thoughts reasoning… the Deep RL techniques that enabled successes like AlphaGo: self-play and look-ahead planning.

Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion of self-play will look like AI Feedback rather than competitive processes.
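A minimal sketch of what "self-play as AI feedback" could look like for LLMs, under assumed stand-ins: `toy_policy` plays the role of the model sampling candidate answers, `judge` plays the role of an AI feedback model ranking them, and each round the top-ranked outputs seed the next round. The names and scoring rule are illustrative, not from the linked post.

```python
import random

def ai_feedback_round(samples, judge, top_k=2):
    """One round of AI-feedback self-improvement: rank the model's own
    outputs with an AI judge and keep the best as the next seed pool."""
    return sorted(samples, key=judge, reverse=True)[:top_k]

def toy_policy(seed_pool, rng, n_samples=6):
    """Stand-in for an LLM: extend each seed answer with a random token."""
    tokens = ["useful", "detail", "filler"]
    per_seed = n_samples // len(seed_pool)
    return [s + " " + rng.choice(tokens) for s in seed_pool for _ in range(per_seed)]

def judge(answer):
    """Stand-in for an AI judge: reward informative tokens."""
    words = answer.split()
    return words.count("useful") + words.count("detail")

rng = random.Random(0)
pool = ["answer:", "answer:"]
for _ in range(3):  # three rounds of generate -> judge -> select
    pool = ai_feedback_round(toy_policy(pool, rng), judge)
```

The competitive pressure here comes not from an opponent but from the judge preferring some of the model's own outputs over others, which matches the post's claim that LLM self-play will mostly look like AI feedback.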

Look-ahead planning is the idea of using a model of the world to reason into the future and produce better actions or outputs. The two variants are based on Model Predictive Control (MPC), which is often used on continuous states, and Monte-Carlo Tree Search (MCTS), which works with discrete actions and states.
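The MCTS variant can be sketched compactly. Below is a minimal single-level MCTS over discrete actions on a toy problem (choose bits to maximize the number of ones in a length-5 string); the problem and all function names are illustrative assumptions, not the method of any linked project. Each simulation selects a root action by UCB1, runs a random rollout as the "model of the world", and backs up the reward.

```python
import math
import random

def mcts_choose(state, legal, step, rollout_reward, n_sims=200, c=1.4, seed=0):
    """Minimal MCTS over discrete actions: UCB1 selection at the root,
    random rollouts for evaluation, most-visited action returned."""
    rng = random.Random(seed)
    stats = {a: [0, 0.0] for a in legal(state)}  # action -> [visits, total reward]
    for i in range(1, n_sims + 1):
        def ucb(a):
            n, w = stats[a]
            if n == 0:
                return float("inf")              # try unvisited actions first
            return w / n + c * math.sqrt(math.log(i) / n)
        a = max(stats, key=ucb)                  # selection
        r = rollout_reward(step(state, a), rng)  # simulation (random playout)
        stats[a][0] += 1                         # backup: visits
        stats[a][1] += r                         # backup: reward
    return max(stats, key=lambda a: stats[a][0])

# Toy world model: pick 5 bits, reward = number of ones.
def legal(state): return [0, 1]
def step(state, a): return state + [a]
def rollout_reward(state, rng):
    full = state + [rng.randint(0, 1) for _ in range(5 - len(state))]
    return sum(full)

best = mcts_choose([], legal, step, rollout_reward)
```

Choosing a `1` first is better in expectation, and the search's visit counts concentrate on it. MPC would differ mainly in rolling a continuous dynamics model forward over a horizon and optimizing the action sequence directly rather than building a search tree.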

https://www.linkedin.com/posts/jasonkuperberg_we-are-excited-to-open-source-the-self-operating-activity-7134943081635672064-rOOz
https://github.com/OthersideAI/self-operating-computer

Written by sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.