Custom LLMs parade // fine-tuned, RAG-specialized, hacked

sbagency
3 min readNov 17, 2023

--

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
https://www.marktechpost.com/2023/11/17/llmware-launches-rag-specialized-7b-parameter-llms-production-grade-fine-tuned-models-for-enterprise-workflows-involving-complex-business-documents/

As more enterprises look to deploy scalable RAG systems using their own private information, there is a growing recognition of several needs:

Unified framework that integrates LLM models with a set of surrounding workflow capabilities (e.g., document parsing, embedding, prompt management, source verification, audit tracking);

High-quality, smaller, specialized LLMs that have been optimized for fact-based question-answering and enterprise workflows and

Open Source, Cost-effective, Private deployment with flexibility and options for customization.

https://huggingface.co/llmware

The DRAGON model family joins two other LLMWare RAG model collections: BLING and Industry-BERT. The BLING models are no-GPU required RAG-specialized smaller LLM models (1B — 3B) that can run on a developer’s laptop. Since the training methodology is very similar, the intent is that a developer can start with a local BLING model, running on their laptop, and then seamlessly drop-in a DRAGON model for higher performance in production. DRAGON models have all been designed for private deployment on a single enterprise-grade GPU server, so that enterprises can deploy an end-to-end RAG system, securely and privately in their own security zone.

https://blog.perplexity.ai/blog/turbocharging-llama-2-70b-with-nvidia-h100
https://www.linkedin.com/posts/aravind-srinivas-16051987_turbocharging-llama-2-70b-with-nvidia-h100-activity-7131343111749828608-xKDH

Llama on a micro-controller // hack

https://www.linkedin.com/posts/maxbbraun_hack-of-the-day-llama-on-a-microcontroller-activity-7130639355815055360-mr6d
https://github.com/maxbbraun/llama4micro
https://deepseekcoder.github.io/
https://youtu.be/o35EY8I9PXU?t=14882
https://www.lighton.ai/
https://www.lighton.ai/blog/lighton-s-blog-4/alfred-40b-1023-44
https://www.dfrobot.com/blog-13412.html

--

--

sbagency
sbagency

Written by sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.

No responses yet