Generalist models vs specialized ones // it depends (as usual)

sbagency
Jan 9, 2024

https://www.linkedin.com/posts/rajistics_remember-bloomberggpt-which-was-a-specially-activity-7150518301327024129-MaK_

Opinion 1: “The smartest generalist models don’t outperform specialized models.”

https://www.linkedin.com/feed/update/urn:li:activity:7150359287024795648/

Opinion 2: “It is part of a pattern — the smartest generalist frontier models beat specialized models in specialized topics.”

https://arxiv.org/pdf/2305.05862.pdf

The most recent large language models (LLMs), such as ChatGPT and GPT-4, have shown exceptional capabilities as generalist models, achieving state-of-the-art performance on a wide range of NLP tasks with little or no adaptation. How effective are such models in the financial domain? Understanding this basic question would have a significant impact on many downstream financial analytical tasks. In this paper, we conduct an empirical study and provide experimental evidence of their performance on a wide variety of financial text analytical problems, using eight benchmark datasets from five categories of tasks. We report both the strengths and limitations of the current models by comparing them to the state-of-the-art fine-tuned approaches and the recently released domain-specific pretrained models. We hope our study can help understand the capability of the existing models in the financial domain and facilitate further improvements.
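The comparison the abstract describes is essentially a zero-shot evaluation loop: prompt a generalist LLM on benchmark examples and score it against the same metric used for fine-tuned and domain-specific baselines. Here is a minimal Python sketch of that setup; the prompt, the two examples, and the `call_llm` helper are illustrative placeholders, not the paper's actual code or datasets.

```python
# Minimal sketch (assumed setup, not the paper's code) of zero-shot
# evaluation of a generalist LLM on a financial sentiment task.

PROMPT = (
    "Classify the sentiment of the following financial sentence as "
    "positive, negative, or neutral.\nSentence: {text}\nSentiment:"
)

EXAMPLES = [  # stand-ins for items from a financial sentiment benchmark
    {"text": "Operating profit rose to EUR 13.1 mn from EUR 8.7 mn.",
     "label": "positive"},
    {"text": "The company cut its full-year revenue guidance.",
     "label": "negative"},
]

def call_llm(prompt: str) -> str:
    # Placeholder for an API call to a generalist model (e.g. GPT-4);
    # returns a fixed label here so the sketch runs end to end.
    return "neutral"

def accuracy(examples) -> float:
    correct = 0
    for ex in examples:
        pred = call_llm(PROMPT.format(text=ex["text"])).strip().lower()
        correct += int(pred == ex["label"])
    return correct / len(examples)

# The resulting score is what gets compared against fine-tuned and
# domain-specific pretrained baselines on the same test set.
print(f"zero-shot accuracy: {accuracy(EXAMPLES):.2f}")
```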

https://www.linkedin.com/feed/update/urn:li:ugcPost:7150359286244638720?commentUrn=urn%3Ali%3Acomment%3A%28ugcPost%3A7150359286244638720%2C7150463653706649600%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287150463653706649600%2Curn%3Ali%3AugcPost%3A7150359286244638720%29
https://www.oneusefulthing.org/p/an-ai-haunted-world
https://lmstudio.ai/
https://www.linkedin.com/pulse/using-llms-locally-ipad-iphone-maciek-j%C4%99drzejczyk-cd0zf/
https://arxiv.org/pdf/2401.02994.pdf

In conversational AI research, there's a noticeable trend towards developing models with a larger number of parameters, exemplified by models like ChatGPT. While these expansive models tend to generate increasingly better chat responses, they demand significant computational resources and memory. This study explores a pertinent question: Can a combination of smaller models collaboratively achieve comparable or enhanced performance relative to a singular large model? We introduce an approach termed Blending, a straightforward yet effective method of integrating multiple chat AIs. Our empirical evidence suggests that when specific smaller models are synergistically blended, they can potentially outperform or match the capabilities of much larger counterparts. For instance, integrating just three models of moderate size (6B/13B parameters) can rival or even surpass the performance metrics of a substantially larger model like ChatGPT (175B+ parameters). This hypothesis is rigorously tested using A/B testing methodologies with a large user base on the Chai research platform over a span of thirty days. The findings underscore the potential of the Blended strategy as a viable approach for enhancing chat AI efficacy without a corresponding surge in computational demands.
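The core of Blending is simple: for each response in a conversation, sample one of the component chat models at random and let it answer, conditioned on the full history so far, including turns produced by the other models. A minimal Python sketch of that selection loop follows; the model names and the `generate` helper are hypothetical placeholders for whatever serving setup you actually use.

```python
import random

# Hypothetical stand-ins for three moderate-size chat models (6B/13B class).
MODELS = ["chat-6b", "chat-13b-a", "chat-13b-b"]

def generate(model_name: str, history: list[dict]) -> str:
    # Placeholder: in practice this would call the chosen model's
    # serving endpoint with the full conversation so far.
    return f"[{model_name}] reply to: {history[-1]['content']}"

def blended_reply(history: list[dict]) -> str:
    # Blending's key step: pick ONE component model uniformly at random
    # for this turn. Because it conditions on history that includes the
    # other models' replies, the ensemble mixes their styles over the
    # conversation without ever running more than one model per turn.
    model = random.choice(MODELS)
    return generate(model, history)

# Usage sketch: a short multi-turn chat.
history = []
for user_msg in ["Hi!", "Tell me a story."]:
    history.append({"role": "user", "content": user_msg})
    reply = blended_reply(history)
    history.append({"role": "assistant", "content": reply})
    print(reply)
```

Note the efficiency argument: inference cost per turn stays that of a single small model, which is why the paper can claim ChatGPT-level chat quality without a corresponding surge in compute.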

Written by sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.
