Intelligence [r]evolution // from hope to reality

Spatial intelligence is about teaching computers to see, learn, and do…


But there are fundamental limits to intelligence…

https://www.youtube.com/watch?v=y8NtMZ7VGmU

The speaker opens with a historical perspective on the emergence of sight and intelligence in the natural world, focusing on trilobites and the Cambrian explosion, then turns to advances in artificial intelligence (AI), particularly computer vision and spatial intelligence. They discuss the convergence of neural networks, GPUs, and big data in modern AI, highlighting milestones such as the ImageNet challenge.

The speaker showcases advances in generative AI and spatial intelligence, including algorithms that lift flat images into 3D space and generate entire rooms from textual descriptions. They also cover progress in robotic learning and language intelligence, showing how AI can carry out tasks from verbal instructions. Potential applications in healthcare are explored, including smart sensors and robotic assistance. The talk concludes with a vision of AI as a trusted partner in enhancing human productivity and prosperity, emphasizing human-centric development.
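As a concrete taste of lifting a flat image into 3D, here is a minimal sketch that estimates per-pixel depth from a single photo using the publicly available MiDaS model via torch.hub. It is an illustration, not the speaker's own system; "room.jpg" is a placeholder for any input image.

```python
import cv2
import torch

# Load a small pretrained monocular depth model from the public MiDaS hub.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()

# Matching preprocessing transforms shipped alongside the model.
midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = midas_transforms.small_transform

# "room.jpg" is a placeholder path for any RGB photo.
img = cv2.cvtColor(cv2.imread("room.jpg"), cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(img))
    # Upsample the predicted inverse-depth map back to the input resolution.
    depth = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze().numpy()

print(depth.shape)  # one relative-depth value per input pixel
```

A normalized depth map like this gives rough 3D structure per pixel; the full scene and room generation described in the talk layers much more machinery on top.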

But

Stable Diffusion revolutionised image creation from descriptive text. GPT-2, GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of language tasks. ChatGPT introduced such language models to the general public. It is now clear that large language models (LLMs) are here to stay, and will bring about drastic change in the whole ecosystem of online text and images. In this paper we consider what the future might hold. What will happen to GPT-{n} once LLMs contribute much of the language found online? We find that use of model-generated content in training causes irreversible defects in the resulting models, where tails of the original content distribution disappear. We refer to this effect as model collapse and show that it can occur in Variational Autoencoders, Gaussian Mixture Models and LLMs. We build theoretical intuition behind the phenomenon and portray its ubiquity amongst all learned generative models. We demonstrate that it has to be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web. Indeed, data collected about genuine human interactions with systems will become increasingly valuable in the presence of LLM-generated content in data crawled from the Internet.
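The Gaussian case mentioned in the abstract can be reproduced in a few lines: fit a Gaussian to data, sample a new dataset from the fit, refit, and repeat. Each finite sample slightly underestimates the tails, the errors compound, and the estimated spread collapses toward zero. A minimal sketch (not the paper's code; the sample size and generation count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "real" human data, a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=50)

for gen in range(201):
    mu, sigma = data.mean(), data.std()  # maximum-likelihood Gaussian fit
    if gen % 50 == 0:
        print(f"generation {gen:3d}: mu={mu:+.3f}  sigma={sigma:.3f}")
    # The next generation trains only on samples from the previous fit:
    # model-generated content fully replaces the original data.
    data = rng.normal(loc=mu, scale=sigma, size=50)
```

Once sigma has collapsed, rare tail events are never generated again, which is the "irreversible defect" the abstract describes.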

https://www.mdpi.com/1099-4300/26/3/194

The ideas of self-observation and self-representation, and the concomitant idea of self-control, pervade both the cognitive and life sciences, arising in domains as diverse as immunology and robotics. Here, we ask in a very general way whether, and to what extent, these ideas make sense. Using a generic model of physical interactions, we prove a theorem and several corollaries that severely restrict applicable notions of self-observation, self-representation, and self-control. We show, in particular, that adding observational, representational, or control capabilities to a meta-level component of a system cannot, even in principle, lead to a complete meta-level representation of the system as a whole. We conclude that self-representation can at best be heuristic, and that self-models cannot, in general, be empirically tested by the systems that implement them.
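The flavour of the result can be conveyed with a toy regress, a loose analogy rather than the paper's formal proof: a complete description stored inside the system it describes changes the very state it must capture, so it is always one step stale.

```python
# Illustrative analogy (not the paper's proof): a system that tries to hold
# a complete description of its own state, where the description is itself
# part of that state, never reaches a fixed point.

def try_complete_self_model(max_rounds: int = 10) -> bool:
    system = {"sensor": 42, "actuator": "idle", "self_model": None}
    for _ in range(max_rounds):
        snapshot = repr(system)          # meta-level "observation" of the whole system
        if system["self_model"] == snapshot:
            return True                  # fixed point: model matches the modeled state
        system["self_model"] = snapshot  # storing the model changes the state it describes
    return False                         # the self-model is always one update behind

print(try_complete_self_model())  # False
```

The paper's theorem is far more general, constraining any physically implemented meta-level observer, but the sketch shows the basic obstruction: the self-model is part of the very state it is supposed to model.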



Written by sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.
