Open source LLMs // are they truly open-source?

There are a lot of experiments and startup projects based on open-source LLMs, but what about real business?

3 min read · Jan 29, 2024

https://en.wikipedia.org/wiki/Trojan_Horse

Open-source LLMs are a great initiative, but they are not truly open source. The weights are published, yet they are as opaque as compiled binary code, so the model can only be used as is (a black box). All the risks, bugs, hallucinations, cyber-security issues, etc. are baked in and can’t be fixed at the source.

https://twitter.com/amasad/status/1747666962749284468

Here is a parade of “open-source” LLM usage examples:

https://venturebeat.com/ai/how-enterprises-are-using-open-source-llms-16-examples/

Open-source developers have created thousands of derivatives of models like Llama, including increasingly, mixing models — and they are steadily achieving parity with, or even superiority over closed models on certain metrics (see examples like FinGPT, BioBert, Defog SQLCoder, and Phind).

“A lot of customers are asking themselves: Wait a second, why am I paying for a super large model that knows very little about my business? Couldn’t I just use one of these open-source models, and by the way, maybe use a much smaller, open-source model for that (information retrieval) workflow?”

Companies often choose the open source route, he said, when they’re concerned about controlling access to their data, but also when they want more control over the fine-tuning of a model for specialized purposes.