Meta sees “an opportunity to introduce AI agents to billions of people in ways that will be useful and meaningful,” CEO Mark Zuckerberg told investors today. [link]
{autonomous, custom, local} agents // own infra, AI models can work locally
Funny guys play football // deepmind
Deploy an open-source LLM on custom infra
Dolly v2 was trained by Databricks; it has 12B parameters and an MIT license.
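A minimal sketch of loading Dolly v2 from the Hugging Face Hub with transformers (the 12B checkpoint needs a sizeable GPU; the smaller databricks/dolly-v2-3b is a reasonable stand-in):

```python
# Sketch: run Dolly v2 locally via transformers. Assumes a GPU with enough
# memory for the 12B checkpoint; otherwise swap in databricks/dolly-v2-3b.
import torch
from transformers import pipeline

generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # Dolly ships a custom instruction-following pipeline
    device_map="auto",
)

res = generate_text("Explain why a company might self-host an LLM.")
print(res[0]["generated_text"])
```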
Reasons to deploy your own LLMs:
- Control, customization and ownership // not bound by the limits of big corporations' hosted LLMs
- Data // privacy
- Ownership // you can sell/buy custom models/data
- Cost // optimize spend
Where to find LLMs? Hugging Face, Replicate, GitHub, Twitter
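For the Hugging Face part, a quick way to browse what's available programmatically (a sketch using the huggingface_hub client; sorting by downloads is just one possible filter):

```python
# Sketch: list a few popular text-generation models on the Hugging Face Hub.
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(filter="text-generation", sort="downloads", direction=-1, limit=5):
    print(model.modelId)
```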
Self-Instruct: Aligning Language Models with Self-Generated Instructions
StableVicuna
How to build *TruthGPT*?
Self-hosted LLMs
Large language models (LLMs) generally require significant GPU infrastructure to operate. We’re now starting to see ports, like llama.cpp, that make it possible to run LLMs on different hardware — including Raspberry Pis, laptops and commodity servers. As such, self-hosted LLMs are now a reality, with open-source examples including GPT-J, GPT-JT and LLaMA. This approach has several benefits, offering better control in fine-tuning for a specific use case, improved security and privacy as well as offline access. However, you should carefully assess the capability within the organization and the cost of running such LLMs before making the decision to self-host. [link]
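As a concrete illustration of that point, here is a minimal sketch of running a quantized model on a laptop through the llama-cpp-python bindings for llama.cpp (the model path is a placeholder for weights you quantize or download yourself):

```python
# Sketch: local CPU inference with llama.cpp via its Python bindings.
# The model file below is a placeholder; supply your own quantized weights.
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-7b-q4_0.gguf", n_ctx=2048)

out = llm(
    "Q: What are the trade-offs of self-hosting an LLM? A:",
    max_tokens=128,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
```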
Some Models for Commercial Usage
Large Language Model (LLM) Primers
With the advent of ChatGPT, LLMs have been the talk of the town! We’ve recently seen a bunch of extraordinary advancements with GPT-4, LLaMA, Toolformer, RLHF, Visual ChatGPT, etc.
📝 Here are some primers on recent LLMs and related concepts to get you up to speed:
🔹 ChatGPT (http://chatgpt.aman.ai)
- Training Process
- Detecting ChatGPT generated text
- Related: InstructGPT

🔹 Reinforcement Learning from Human Feedback a.k.a. RLHF (http://rlhf.aman.ai)
- Refresher: Basics of RL
- Training Process (Pretraining Language Models, Training a Reward Model, Fine-tuning the LM with RL) // a reward-model loss sketch follows below
- Bias Concerns and Mitigation Strategies

🔹 LLaMA (http://llama.aman.ai)
- Training Process (Pre-normalization, SwiGLU Activation Function, Rotary Positional Embeddings, Flash Attention)
- Visual Summary

🔹 Toolformer (http://toolformer.aman.ai)
- Approach
- Sampling and Executing API Calls
- Experimental Results

🔹 Visual ChatGPT (http://vchatgpt.aman.ai)
- System Architecture
- Managing Multiple Visual Foundation Models
- Handling Queries
- Limitations

🔹 GPT-4 (http://gpt-4.aman.ai)
- Capabilities of GPT-4
- GPT-4 vs. GPT-3

Notes written in collaboration with Vinija Jain. [link]
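On the RLHF training process listed above: as a minimal, illustrative sketch (not code from the primer), the reward-model stage is typically trained with a pairwise ranking loss on human preference pairs, e.g.:

```python
# Sketch: pairwise ranking loss for an RLHF reward model. The scores are
# made-up numbers standing in for r(prompt, completion) from the model.
import torch
import torch.nn.functional as F

reward_chosen = torch.tensor([1.3, 0.2])    # human-preferred completions
reward_rejected = torch.tensor([0.4, 0.9])  # rejected completions

loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss)  # smaller when chosen completions consistently outscore rejected ones
```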
Libraries for agents: LangChain, MiniChain // examples
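A minimal agent sketch with the classic (pre-1.0) LangChain API; the OpenAI backend and the llm-math tool are just one possible combination and need an OPENAI_API_KEY:

```python
# Sketch: a simple ReAct-style agent using LangChain's early agent API.
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent

llm = OpenAI(temperature=0)                # requires OPENAI_API_KEY
tools = load_tools(["llm-math"], llm=llm)  # calculator tool backed by the LLM

agent = initialize_agent(
    tools,
    llm,
    agent="zero-shot-react-description",
    verbose=True,
)
agent.run("If fine-tuning takes 3 epochs at 2 hours each, how many minutes is that?")
```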
Your model is only as good as the data that goes into it.