Can AI agents be useful in scientific discovery? // Scientific AI
Welcome the AI scientist: an agent that can work on a scientific problem the way humans do (by imitating human patterns of work).
For scientific research, reasoning (including hypothesis generation, chains of thought, analysis, etc.) is a core function of an agent, not just generation based on predefined patterns (data). So is the ability to plan and execute sequences of actions, rather than follow hard-coded algorithms.
The good old brute-force approach (using supercomputers) still works very well, and now it can be wrapped and optimized by agents.
Code generation and execution are another valuable approach, particularly when the problem can be translated into code; agents can drive this generate-and-run loop end to end.
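As a concrete illustration, here is a minimal sketch of such a generate-execute-repair loop, assuming a hypothetical `llm` callable that returns Python source; nothing here is a real agent framework API.

```python
# A minimal generate-execute-repair loop. The `llm` callable is a
# hypothetical placeholder for any model that returns Python source.
import subprocess
import sys
import tempfile

def solve_with_code(llm, task: str, max_attempts: int = 3) -> str:
    """Ask the model for code, run it in a subprocess, retry on failure."""
    prompt = f"Write a Python script that solves: {task}\nPrint the answer."
    for _ in range(max_attempts):
        source = llm(prompt)
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(source)
            path = f.name
        # Run in a separate process; a hard timeout limits runaway scripts.
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True, timeout=30)
        if result.returncode == 0:
            return result.stdout.strip()  # success: return the output
        # Feed the error back so the model can repair its own code.
        prompt = f"Your script failed with:\n{result.stderr}\nFix it:\n{source}"
    raise RuntimeError("no working program after retries")
```

Real deployments would sandbox the subprocess far more aggressively (containers, resource limits) before letting an agent run its own code.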
Knowledge graphs are a powerful form of knowledge representation, which makes them valuable tools in scientific research.
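As a toy example of why this representation helps, here is a hedged sketch using networkx to store (subject, relation, object) triples and run a two-hop query; the facts and relation names are illustrative only.

```python
# A knowledge graph as (subject, relation, object) triples in networkx.
import networkx as nx

kg = nx.MultiDiGraph()
kg.add_edge("aspirin", "COX-1", relation="inhibits")
kg.add_edge("COX-1", "prostaglandin synthesis", relation="catalyzes")
kg.add_edge("prostaglandin synthesis", "inflammation", relation="promotes")

# Multi-hop query: what downstream processes might aspirin affect?
for _, target, data in kg.out_edges("aspirin", data=True):
    for _, downstream, d2 in kg.out_edges(target, data=True):
        print(f"aspirin -{data['relation']}-> {target} "
              f"-{d2['relation']}-> {downstream}")
```

Multi-hop traversal like this is exactly what is hard to do reliably with free-text retrieval alone.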
The neuro-symbolic approach, also known as the sandwich architecture, serves as the foundation for processing scientific knowledge with greater precision and trustworthiness than language models alone.
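A minimal sketch of the sandwich idea, with hypothetical `llm_parse` and `llm_explain` placeholders as the neural slices and sympy as the exact symbolic core:

```python
# Sandwich architecture: natural language in, symbolic math in the middle,
# natural language out. Only the sympy layer is an exact computation; the
# llm_* functions are hypothetical placeholders.
import sympy as sp

def neuro_symbolic_solve(question: str, llm_parse, llm_explain) -> str:
    # 1. Neural layer: hypothetical LLM turns the question into an equation.
    equation_text = llm_parse(question)  # e.g. "x**2 - 5*x + 6"
    # 2. Symbolic layer: exact, verifiable computation (no hallucination).
    x = sp.symbols("x")
    roots = sp.solve(sp.sympify(equation_text), x)
    # 3. Neural layer: hypothetical LLM turns the result back into prose.
    return llm_explain(f"The solutions are {roots}.")
```

The design point is that the middle layer can be checked and trusted independently of the language model wrapped around it.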
The flexibility of natural language (via language models) combined with high-level abstractions such as AI agents makes it possible to imitate scientific research at any level, just as humans conduct it.
AI in scientific discovery is a complex concept that agents can wrap. It isn't just about using LLMs, neural networks, or anything else in isolation; there should be the right tool for the right task.
DeepMind is the leader in using AI in scientific research.
Accelerating scientific discovery
Here is a summary of the key points:
- The speaker, Chris Bishop, believes the most important use case of AI will be for scientific discovery, as understanding the natural world through science has transformed the human species.
- Large language models are useful for language understanding and reasoning, but alone are insufficient for scientific discovery due to limitations like poor numerical calculation abilities and lack of integration with experiments.
- Scientific discovery benefits from leveraging centuries of prior knowledge from physics in the form of equations, symmetries, conservation laws, etc. Embedding this inductive bias compensates for the scarcity of training data (a toy sketch follows this summary).
- Examples are shown of using AI to accelerate materials screening for battery electrolytes by 3 orders of magnitude, and generate targeted molecular candidates for tuberculosis drugs with over 100x improved binding efficacy.
- Techniques like transformers, attention, variational autoencoders, and diffusion models enable incorporating the geometric/physics priors along with data to make major advances in scientific AI applications spanning materials, drug discovery, and beyond.
In essence, the talk advocates for a unique scientific AI paradigm that tightly integrates machine learning, physics knowledge, and experimental data to supercharge discovery in the natural sciences.
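One toy way to see the inductive-bias point from the bullets above: if a target property is known to be invariant under rotation, scarce training data can be augmented with rotated copies so the model inherits the symmetry. This numpy sketch is illustrative only; equivariant architectures like those mentioned in the talk are the more principled route.

```python
# Embedding a symmetry prior by data augmentation: if the label is
# invariant under rotation, rotated copies of each sample are free data.
import numpy as np

def rotate_2d(points: np.ndarray, theta: float) -> np.ndarray:
    """Rotate an (N, 2) array of coordinates by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return points @ np.array([[c, -s], [s, c]]).T

def augment_with_rotations(X, y, n_copies: int = 8):
    """X: (M, N, 2) array of M structures with N points each; y: (M,) labels.
    Each sample keeps its label under rotation (the invariance prior)."""
    Xs, ys = [X], [y]
    for k in range(1, n_copies):
        theta = 2 * np.pi * k / n_copies
        Xs.append(np.stack([rotate_2d(x, theta) for x in X]))
        ys.append(y)  # labels unchanged: this is the symmetry assumption
    return np.concatenate(Xs), np.concatenate(ys)
```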
We envision “AI scientists” as systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents that integrate machine learning tools with experimental platforms. Rather than taking humans out of the discovery process, biomedical AI agents combine human creativity and expertise with AI’s ability to analyze large datasets, navigate hypothesis spaces, and execute repetitive tasks. AI agents are proficient in a variety of tasks, including self-assessment and planning of discovery workflows. These agents use large language models and generative models to feature structured memory for continual learning and use machine learning tools to incorporate scientific knowledge, biological principles, and theories. AI agents can impact areas ranging from hybrid cell simulation, programmable control of phenotypes, and the design of cellular circuits to the development of new therapies.
Biomedical research is undergoing a transformative era driven by advances in computational intelligence. Presently, AI's role is constrained to assistive tools in low-stakes, narrow tasks where scientists can review the results. We outline agent-based AI to pave the way for systems capable of skeptical learning and reasoning, consisting of LLM-based systems and other ML tools, experimental platforms, humans, or combinations of them. The continual nature of human-AI interaction creates a path to achieve this vision, provided it focuses on preventing and learning from mistakes. Building trustworthy sandboxes [207], where AI agents can fail and learn from their mistakes, is one way to achieve this. This involves developing AI agents that perform tasks while considering the boundaries of their generalization ability, fostering the integration of natural and artificial intelligence.
This perspective paper envisions "AI scientists": systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents integrating machine learning tools with experimental platforms. Key points:
- AI agents are proposed as conversable systems that coordinate large language models, machine learning tools, experimental platforms, or combinations of them to accelerate discovery workflows and eventually make major biomedical discoveries.
- The paper outlines different types of AI agents: single large language model (LLM) agents with diverse roles, and multi-agent systems in which heterogeneous specialized agents collaborate.
- It describes levels of increasing autonomy, from research assistants to collaborators to AI scientists capable of generating novel hypotheses.
- The roadmap involves developing perception, interaction, memory, and reasoning modules so agents can engage with environments and make decisions (a skeletal sketch follows this list).
- Key challenges include robustness, evaluation, dataset generation, governance, risks, and safeguards for the responsible development of AI agents in biomedicine.
- The vision is for AI agents to power platforms such as hybrid cell simulators, programmable control of phenotypes, and the development of new therapies through continuous human-AI interaction and learning from mistakes.
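A skeletal sketch of that module decomposition, with perception, memory, and reasoning wired into a decision loop; every component here is a hypothetical stand-in, not the paper's implementation.

```python
# Perception/memory/reasoning modules composed into a minimal agent loop.
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Structured memory for continual learning across steps."""
    episodes: list = field(default_factory=list)

    def recall(self, k: int = 5) -> list:
        return self.episodes[-k:]  # most recent (observation, action) pairs

@dataclass
class Agent:
    perceive: callable  # environment -> observation
    reason: callable    # (observation, recent memory) -> action
    memory: AgentMemory = field(default_factory=AgentMemory)

    def step(self, environment):
        obs = self.perceive(environment)                 # perception module
        action = self.reason(obs, self.memory.recall())  # reasoning module
        self.memory.episodes.append((obs, action))       # write to memory
        return action                                    # act on environment
```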
The researchers define AI assistants as “artificial agents with natural language interfaces, whose function is to plan and execute sequences of actions on behalf of a user — across one or more domains — in line with the user’s expectations.”
Scientific Research, vital for improving human life, is hindered by its inherent complexity, slow pace, and the need for specialized experts. To enhance its productivity, we propose a ResearchAgent, a large language model-powered research idea writing agent, which automatically generates problems, methods, and experiment designs while iteratively refining them based on scientific literature. Specifically, starting with a core paper as the primary focus to generate ideas, our ResearchAgent is augmented not only with relevant publications through connecting information over an academic graph but also entities retrieved from an entity-centric knowledge store based on their underlying concepts, mined and shared across numerous papers. In addition, mirroring the human approach to iteratively improving ideas with peer discussions, we leverage multiple ReviewingAgents that provide reviews and feedback iteratively. Further, they are instantiated with human preference-aligned large language models whose criteria for evaluation are derived from actual human judgments. We experimentally validate our ResearchAgent on scientific publications across multiple disciplines, showcasing its effectiveness in generating novel, clear, and valid research ideas based on human and model-based evaluation results.
This paper proposes ResearchAgent, a system powered by large language models (LLMs) to automatically generate research ideas in scientific domains. The key components are:
1. Identifying research problems, developing methods, and designing experiments mirroring the human research process, using a core scientific paper and related papers from citations as input to the LLM.
2. Augmenting the LLM with an entity-centric knowledge store extracted from a large corpus of scientific papers to provide additional relevant knowledge beyond the input papers.
3. Iteratively refining the generated research ideas through reviews and feedback from multiple LLM-based “reviewing agents” whose evaluation criteria are aligned with human preferences.
4. The authors experimentally validate ResearchAgent on scientific publications across disciplines, showing it can generate novel, clear and valid research ideas better than baselines, based on human and model-based evaluations.
The key novelties are the systematic multi-step approach to research idea generation, the knowledge augmentation from a cross-paper entity store, and the iterative refinement process mimicking peer review. Limitations include the scoped entity extraction and lack of experimental validation of the generated ideas.
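To make the iterative refinement process concrete, here is a hedged sketch of a generate-review-revise loop; `idea_llm` and `review_llms` are hypothetical callables, not the authors' actual ResearchAgent code.

```python
# Generate-review-revise loop mimicking peer review. All callables and
# prompts are illustrative placeholders.
def refine_research_idea(idea_llm, review_llms, paper: str, rounds: int = 3):
    idea = idea_llm(f"Propose a research problem, method, and experiment "
                    f"design based on this paper:\n{paper}")
    for _ in range(rounds):
        # Each reviewing agent critiques against human-aligned criteria.
        reviews = [r(f"Review this research idea for novelty, clarity, "
                     f"and validity:\n{idea}") for r in review_llms]
        # The idea generator revises its own proposal against the reviews.
        idea = idea_llm("Revise the idea to address these reviews:\n"
                        + "\n---\n".join(reviews) + f"\n\nIdea:\n{idea}")
    return idea
```

Using several reviewers with different evaluation criteria is what keeps the refinement from collapsing into the generator agreeing with itself.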
AI agent frameworks
Voice assistants/agents
This is not complete, working code.
This is strictly a v0, scrappy proof of concept for the first version of a personal AI Assistant working end to end in just ~322 LOC.
It’s only a frame of reference for you to consume the core ideas of how to build a POC of a personal AI Assistant.
To see how this works at a high level, check out the explanation video. To follow our agentic journey, check out the @IndyDevDan channel.
Stay focused, keep building.
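For orientation, here is a minimal sketch of the end-to-end loop such a personal assistant runs (listen, transcribe, think, speak); the callables and the wake word are placeholders for whatever STT, LLM, and TTS components you wire in.

```python
# End-to-end voice assistant loop. listen/transcribe/llm/speak are
# hypothetical placeholders; the wake word is an arbitrary example.
def assistant_loop(listen, transcribe, llm, speak, wake_word: str = "ada"):
    history = []  # rolling conversation state
    while True:
        audio = listen()            # block until microphone input
        text = transcribe(audio)    # speech-to-text
        if wake_word not in text.lower():
            continue                # ignore ambient speech
        history.append({"role": "user", "content": text})
        reply = llm(history)        # generate a response with full context
        history.append({"role": "assistant", "content": reply})
        speak(reply)                # text-to-speech out
```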
Multi-agent systems
Large Language Models (LLMs) have emerged as integral tools for reasoning, planning, and decision-making, drawing upon their extensive world knowledge and proficiency in language-related tasks. LLMs thus hold tremendous potential for natural language interaction within multi-agent systems to foster cooperation. However, LLM agents tend to over-report and comply with any instruction, which may result in information redundancy and confusion in multi-agent cooperation. Inspired by human organizations, this paper introduces a framework that imposes prompt-based organization structures on LLM agents to mitigate these problems. Through a series of experiments with embodied LLM agents and human-agent collaboration, our results highlight the impact of designated leadership on team efficiency, shedding light on the leadership qualities displayed by LLM agents and their spontaneous cooperative behaviors. Further, we harness the potential of LLMs to propose enhanced organizational prompts, via a Criticize-Reflect process, resulting in novel organization structures that reduce communication costs and enhance team efficiency.
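A hedged sketch of what a prompt-based organization structure plus a Criticize-Reflect pass could look like; the `llm` callable and all prompts are illustrative assumptions, not the paper's code.

```python
# Prompt-imposed organization: a designated leader assigns subtasks and
# filters chatter; a criticize-reflect pass proposes a better org prompt.
LEADER_PROMPT = ("You are the team leader. Assign each worker exactly one "
                 "subtask and report only decisions, not raw observations.")
WORKER_PROMPT = ("You are a worker. Execute your assigned subtask and reply "
                 "with a single concise status message.")

def run_round(llm, task: str, n_workers: int = 3) -> str:
    plan = llm(LEADER_PROMPT + f"\nTask: {task}\nWorkers: {n_workers}")
    statuses = [llm(WORKER_PROMPT + f"\nAssignment:\n{plan}\nWorker id: {i}")
                for i in range(n_workers)]
    return llm(LEADER_PROMPT + "\nSummarize progress:\n" + "\n".join(statuses))

def criticize_reflect(llm, transcript: str) -> str:
    """Turn a round's transcript into an improved organizational prompt."""
    critique = llm("Criticize this team's communication for redundancy and "
                   "confusion:\n" + transcript)
    return llm("Reflect on the critique and propose an improved "
               "organizational prompt:\n" + critique)
```

The constrained leader/worker prompts illustrate the paper's point that structure, not extra capability, is what cuts redundant communication.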