Good old symbolic reasoning comes to the rescue of LLMs // but the fundamental limitations of building AI remain

Symbolic reasoning is a universal mechanism that can solve any problem, if you have infinite time and memory

sbagency
10 min read · Feb 23, 2024

Fundamental limitations of building AI systems:
1. You can’t compute everything in a reasonable amount of time; // System 2
2. You can’t pre-{compute,train} everything to generate answers; // System 1

Obviously, the solution is a combo of System 1 & 2, but how?

There have been countless attempts to overcome these limitations, but not much success so far.

https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/

AlphaGeometry adopts a neuro-symbolic approach

AlphaGeometry is a neuro-symbolic system made up of a neural language model and a symbolic deduction engine, which work together to find proofs for complex geometry theorems. Akin to the idea of “thinking, fast and slow”, one system provides fast, “intuitive” ideas, and the other, more deliberate, rational decision-making.
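The fast/slow division of labor can be sketched as a toy loop in plain Python: an exhaustive symbolic engine chains deductions until it gets stuck, then a "proposer" suggests an auxiliary construction. Everything below (rule names, facts, the proposer) is invented for illustration; this is not AlphaGeometry's actual engine.

```python
# Toy neuro-symbolic loop: fast proposer + slow, exhaustive deduction.

def deduce(facts, rules):
    """Forward-chain: repeatedly apply rules until no new facts appear."""
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def prove(goal, facts, rules, proposer, max_steps=5):
    """Alternate symbolic deduction with proposed auxiliary constructions."""
    facts = set(facts)
    for _ in range(max_steps):
        deduce(facts, rules)          # System 2: deliberate, exhaustive
        if goal in facts:
            return True
        aux = proposer(facts)         # System 1: fast, "intuitive" guess
        if aux is None:
            return False
        facts.add(aux)
    return goal in facts

# Hypothetical geometry-flavored rules: deduction is stuck until the
# auxiliary point "midpoint M" is introduced.
rules = [
    (frozenset({"midpoint M", "AB = AC"}), "AM ⊥ BC"),
    (frozenset({"AM ⊥ BC"}), "angle B = angle C"),
]
proposer = lambda facts: "midpoint M" if "midpoint M" not in facts else None

print(prove("angle B = angle C", {"AB = AC"}, rules, proposer))  # → True
```

Without the proposed auxiliary point the symbolic engine alone never reaches the goal, which is exactly the gap the language model fills in the real system.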

https://www.nature.com/articles/s41586-023-06747-5

Here are the key points from the paper:

- The paper “Solving olympiad geometry without human demonstrations” presents AlphaGeometry, a neuro-symbolic system for proving theorems in Euclidean plane geometry. It solves 25 out of 30 olympiad-level geometry problems, outperforming previous methods and approaching the performance of an average International Mathematical Olympiad (IMO) gold medalist.

- AlphaGeometry uses a neural language model trained on 100 million synthetic theorems and proofs generated automatically without human input. The language model guides a symbolic deduction engine to construct auxiliary points during proof search.

- On a test set of 30 IMO geometry problems, AlphaGeometry solves 25, compared to 10 solved by the previous best method. It solves all geometry problems from IMO 2000 and 2015, a notable milestone.

- AlphaGeometry produces human-readable proofs. An expert evaluation found its solutions for IMO 2000 and 2015 would receive full scores, passing the medal threshold.

- The method of generating synthetic training data provides a general framework applicable to other mathematical domains facing data scarcity, by combining randomized exploration and traceback algorithms.

- The paper represents progress towards human-level automated reasoning in mathematics. AlphaGeometry matches top human performance on olympiad geometry without need for human demonstrations.
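The randomized-exploration-plus-traceback recipe from the bullets above can be sketched in a toy arithmetic domain: explore premises forward, record how each new fact was derived, then trace a conclusion back to a minimal derivation, yielding a (theorem, proof) training pair. The domain and rules here are invented for illustration; AlphaGeometry does this over geometric constructions.

```python
# Toy synthetic-data generator: exploration + traceback.

def explore(premises, limit=20):
    """Derive every sum reachable under the limit, recording derivations."""
    facts = set(premises)
    steps = {}                           # derived fact -> pair that produced it
    while True:
        new = {}
        for x in facts:
            for y in facts:
                z = x + y
                if x <= y and z <= limit and z not in facts and z not in new:
                    new[z] = (x, y)
        if not new:
            return facts, steps
        facts |= new.keys()
        steps.update(new)

def traceback(goal, steps):
    """Recover an ordered derivation of `goal` from the recorded steps."""
    if goal not in steps:
        return []                        # goal is a premise, nothing to prove
    x, y = steps[goal]
    return traceback(x, steps) + traceback(y, steps) + [(x, y, goal)]

facts, steps = explore({2, 3})
proof = traceback(20, steps)
print(proof[-1][2])                      # → 20, the derived "theorem"
```

Each (premises, traceback) pair is a self-contained statement-plus-proof example, which is what lets the real system train on 100 million proofs with no human input.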

It sounds impressive, but independent assessment and testing are still required. And if it works so well, why not apply it to other topics?)

https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/

Here is a summary of the key points from the post:

- FunSearch is a new method that uses large language models (LLMs) to make discoveries in mathematics and computer science. It works by combining an LLM that proposes creative solutions as computer code, with an evaluator that weeds out incorrect solutions.

- FunSearch was able to make progress on the longstanding open math problem called the cap set problem. It found larger cap sets than previously known, representing the biggest advance in 20 years.

- FunSearch favors concise, interpretable code, which helps researchers gain insights. For the cap set problem, studying FunSearch’s code led to new problem insights.

- FunSearch was also applied to a practical computer science problem called bin packing, used in applications like data centers. It found better packing algorithms than standard human-designed heuristics.

- The authors see FunSearch as demonstrating the potential for LLMs to drive discovery in science, mathematics, and industry problems when properly guided. FunSearch outputs code explaining its solutions, rather than just solutions, enabling further human insight.
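The propose-then-evaluate loop can be mocked in a few lines of plain Python. Here a scripted random mutator stands in for the LLM proposer, and `evaluate` plays the evaluator's weeding-out role; the toy task (recover `2*x + 1`) and all names are invented for illustration, not DeepMind's code.

```python
import random

# Minimal mock of the FunSearch loop: evolutionary search over programs-as-text.
random.seed(0)

def evaluate(program: str) -> float:
    """Run the candidate and score it; broken programs score -inf."""
    env = {}
    try:
        exec(program, env)
        f = env["f"]
        # Toy task: approximate f(x) = 2*x + 1 on a few test inputs.
        return -sum(abs(f(x) - (2 * x + 1)) for x in range(5))
    except Exception:
        return float("-inf")

def mutate(program: str) -> str:
    """Stand-in for the LLM: propose a candidate with new constants."""
    a, b = random.randint(0, 4), random.randint(0, 4)
    return f"def f(x):\n    return {a} * x + {b}\n"

def funsearch(seed: str, iterations: int = 200) -> str:
    best, best_score = seed, evaluate(seed)
    for _ in range(iterations):
        child = mutate(best)
        score = evaluate(child)
        if score > best_score:        # evaluator weeds out worse programs
            best, best_score = child, score
    return best

best = funsearch("def f(x):\n    return 0\n")
print(evaluate(best))                 # score of the best program found
```

The key design choice mirrors the paper: the search is over *programs* scored by execution, so an unreliable proposer is harmless as long as the evaluator is sound.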

https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/Mathematical-discoveries-from-program-search-with-large-language-models.pdf

Large Language Models (LLMs) have demonstrated tremendous capabilities in solving complex tasks, from quantitative reasoning to understanding natural language. However, LLMs sometimes suffer from confabulations (or hallucinations) which can result in them making plausible but incorrect statements (Bang et al., 2023; Borji, 2023). This hinders the use of current large models in scientific discovery. Here we introduce FunSearch (short for searching in the function space), an evolutionary procedure based on pairing a pre-trained LLM with a systematic evaluator. We demonstrate the effectiveness of this approach to surpass the best known results in important problems, pushing the boundary of existing LLM-based approaches (Lehman et al., 2022). Applying FunSearch to a central problem in extremal combinatorics — the cap set problem — we discover new constructions of large cap sets going beyond the best known ones, both in finite dimensional and asymptotic cases. This represents the first discoveries made for established open problems using LLMs. We showcase the generality of FunSearch by applying it to an algorithmic problem, online bin packing, finding new heuristics that improve upon widely used baselines. In contrast to most computer search approaches, FunSearch searches for programs that describe how to solve a problem, rather than what the solution is. Beyond being an effective and scalable strategy, discovered programs tend to be more interpretable than raw solutions, enabling feedback loops between domain experts and FunSearch, and the deployment of such programs in real-world applications.

Code generation is a cool idea: generate and execute programs for particular tasks.
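The generate-and-execute pattern reduces to a small harness: run the generated source in a fresh namespace and only trust it if it passes checks. The `generated_code` string below stands in for an LLM's output (no model is called), and the function name and tests are illustrative.

```python
# Sketch of "generate, then execute and verify".

generated_code = """
def sort_unique(xs):
    return sorted(set(xs))
"""

tests = [
    (([3, 1, 2, 3],), [1, 2, 3]),
    (([],), []),
]

def run_and_check(src, tests, name="sort_unique"):
    """Execute generated source and accept it only if all tests pass."""
    env = {}
    try:
        exec(src, env)                       # run the generated program
        fn = env[name]
        return all(fn(*args) == want for args, want in tests)
    except Exception:
        return False                         # broken code never passes

print(run_and_check(generated_code, tests))  # → True
```

In a real system the `exec` would run in a sandbox with resource limits, since generated code is untrusted by definition.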

Universal reasoning engine(s) // remember limitations

ec.ai

Again, pure symbolic reasoning is 30+ years old, and it has a fundamental limitation: it can take unbounded time to generate an answer. That problem still has to be solved.

Research papers:

https://arxiv.org/pdf/2402.08064.pdf

Since the advent of Large Language Models a few years ago, they have often been considered the de facto solution for many AI problems. However, in addition to the many deficiencies of LLMs that prevent them from broad industry adoption, such as reliability, cost, and speed, there is a whole class of common real world problems that Large Language Models perform poorly on, namely, constraint satisfaction and optimization problems. These problems are ubiquitous and current solutions are highly specialized and expensive to implement. At Elemental Cognition, we developed our EC AI platform which takes a neuro-symbolic approach to solving constraint satisfaction and optimization problems. The platform employs, at its core, a precise and high performance logical reasoning engine, and leverages LLMs for knowledge acquisition and user interaction. This platform supports developers in specifying application logic in natural and concise language while generating application user interfaces to interact with users effectively. We evaluated LLMs against systems built on the EC AI platform in three domains and found the EC AI systems to significantly outperform LLMs on constructing valid and optimal solutions, on validating proposed solutions, and on repairing invalid solutions.

Multi-strategy reasoning engine

In order to ease the creation of applications that can reason and remove the need for expert programming skills, we developed a language, called Cogent, to serve as a bridge between the users and the reasoning engine. Cogent is a constrained subset of English with well defined semantics that allows users to capture optimization objectives as well as the rules and constraints governing the reasoning to be performed in unambiguous ways. It is processed by a compiler that can detect and rectify errors and report on logical inferences that need to be made to complete the understanding of an input specification, while allowing users to declare types and relationships explicitly to avoid possible incorrect inferences made by the compiler. Cogent is fundamentally a declarative language allowing the composition of complex specifications based on simpler specifications. Each sentence can be independently read and validated within the context of a specification irrespective of its length or complexity.

Many real world applications require constraint satisfaction or optimization as part of their solutions. Today these applications are typically implemented by expert developers using very specialized optimization engines such as Gurobi (Gurobi Optimization, LLC, n.d.) and Google’s OR-Tools (Google, n.d.), or in some cases the tasks are still carried out manually. We have identified several application areas where the EC AI platform makes a significant impact. We discuss some of these application areas along with systems we have developed or are developing.

Solvers (LP/CP):

https://www.gurobi.com/solutions/gurobi-optimizer/
https://developers.google.com/optimization
https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer

Constraint optimization, or constraint programming (CP), is the name given to identifying feasible solutions out of a very large set of candidates, where the problem can be modeled in terms of arbitrary constraints. CP problems arise in many scientific and engineering disciplines. (The word “programming” is a bit of a misnomer, similar to how “computer” once meant “a person who computes”. Here, “programming” refers to the arrangement of a plan, rather than programming in a computer language.)

CP is based on feasibility (finding a feasible solution) rather than optimization (finding an optimal solution) and focuses on the constraints and variables rather than the objective function. In fact, a CP problem may not even have an objective function — the goal may simply be to narrow down a very large set of possible solutions to a more manageable subset by adding constraints to the problem.

An example of a problem that is well-suited for CP is employee scheduling. The problem arises when companies that operate continuously — such as factories — need to create weekly schedules for their employees. Here’s a very simple example: a company runs three 8-hour shifts per day and assigns three of its four employees to different shifts each day, while giving the fourth the day off. Even in such a small case, the number of possible schedules is huge: each day, there are 4! = 4 * 3 * 2 * 1 = 24 possible employee assignments, so the number of possible weekly schedules is 24^7, which is over 4.5 billion. Usually there will be other constraints that reduce the number of feasible solutions — for example, that each employee works at least a minimum number of days per week. The CP method keeps track of which solutions remain feasible when you add new constraints, which makes it a powerful tool for solving large, real-world scheduling problems.
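The pruning idea can be shown without any solver library. Below the week is shrunk from 7 days to 3 so exhaustive search stays small (24^3 = 13,824 candidates), and the constraint "no employee is off two days in a row" is an invented stand-in for the minimum-days rule above.

```python
from itertools import permutations

EMPLOYEES = range(4)
DAYS = 3
daily_options = list(permutations(EMPLOYEES))   # 4! = 24 assignments per day

def feasible(schedule):
    """Constraint: nobody gets two consecutive days off (last slot = day off)."""
    offs = [day[3] for day in schedule]
    return all(offs[i] != offs[i + 1] for i in range(len(offs) - 1))

def count_schedules(schedule=()):
    """Backtracking search: prune as soon as a partial schedule is infeasible."""
    if not feasible(schedule):
        return 0
    if len(schedule) == DAYS:
        return 1
    return sum(count_schedules(schedule + (opt,)) for opt in daily_options)

print(24 ** DAYS)         # → 13824 candidates without constraints
print(count_schedules())  # → 7776 schedules remain feasible
```

Checking feasibility on *partial* schedules is the point: infeasible branches are cut before the remaining days are ever enumerated, which is how CP solvers tame the 4.5-billion-schedule version of the problem.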

CP has a widespread and very active community around the world with dedicated scientific journals, conferences, and an arsenal of different solving techniques. CP has been successfully applied in planning, scheduling, and numerous other domains with heterogeneous constraints.

https://developers.google.com/optimization/cp/cp_solver
https://www.scipopt.org/

SCIP is currently one of the fastest non-commercial solvers for mixed integer programming (MIP) and mixed integer nonlinear programming (MINLP). It is also a framework for constraint integer programming and branch-cut-and-price. It allows for total control of the solution process and the access of detailed information down to the guts of the solver.
By default, SCIP comes with a bouquet of different plugins for solving MIPs and MINLPs.

If you are new to SCIP, want to dive in and don’t know where to begin, then have a look at the Getting started page.

https://www.gnu.org/software/glpk/

The GLPK (GNU Linear Programming Kit) package is intended for solving large-scale linear programming (LP), mixed integer programming (MIP), and other related problems. It is a set of routines written in ANSI C and organized in the form of a callable library.

GLPK supports the GNU MathProg modeling language, which is a subset of the AMPL language.

The GLPK package includes the following main components:

primal and dual simplex methods

primal-dual interior-point method

branch-and-cut method

translator for GNU MathProg

application program interface (API)

stand-alone LP/MIP solver

https://developers.google.com/optimization/lp/lp_example
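Tutorials like that one walk through tiny two-variable LPs. For two variables such a problem can even be solved without any solver, by enumerating the vertices of the feasible region; the specific objective and constraints below are an illustrative toy problem, not necessarily the one on that page.

```python
from itertools import combinations

# Maximize 3x + 4y subject to: x + 2y <= 14, 3x - y >= 0, x - y <= 2.
# Each constraint is stored as (a, b, c) meaning a*x + b*y <= c.
constraints = [
    (1, 2, 14),    # x + 2y <= 14
    (-3, 1, 0),    # 3x - y >= 0, rewritten as -3x + y <= 0
    (1, -1, 2),    # x - y <= 2
]

def intersect(c1, c2):
    """Solve the 2x2 system where both constraint boundaries hold exactly."""
    (a1, b1, d1), (a2, b2, d2) = c1, c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None                     # parallel boundaries
    return (d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det

def feasible(p):
    return all(a * p[0] + b * p[1] <= c + 1e-9 for a, b, c in constraints)

# A bounded optimum lies at a vertex: an intersection of two boundaries.
vertices = [v for c1, c2 in combinations(constraints, 2)
            if (v := intersect(c1, c2)) and feasible(v)]
best = max(vertices, key=lambda p: 3 * p[0] + 4 * p[1])
print(best, 3 * best[0] + 4 * best[1])  # → (6.0, 4.0) 34.0
```

With more variables the vertex count explodes combinatorially, which is exactly why real LPs go through simplex or interior-point methods in GLPK, SCIP, Gurobi, or OR-Tools rather than enumeration.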

Not all problems can be solved by such “solvers”. A universal solver or reasoning engine should be multi-strategy and multi-approach.

https://www.mdpi.com/1999-4893/16/3/169

Incremental techniques aim at making it possible to improve the performance of the grounding and solving processes by reusing the results of previous executions. Clingo supports both incremental grounding and incremental solving computations. In order to leverage incremental computations in clingo, the incremental fragments of ASP programs must satisfy certain safety-related conditions. In a number of problem domains and reasoning tasks, these conditions can be satisfied in a fairly straightforward way. However, we have observed that in certain practical applications, satisfying the conditions becomes more challenging, to the point that it is sometimes unclear how or even if it is possible to leverage incremental computations. In this paper, we report our findings, and ultimate success, with the use of incremental grounding and solving techniques in one of these challenging cases. We describe the domain, which is linked to a large practical application, discuss the challenges we faced in attempting to leverage incremental computations, and then describe the techniques that we developed, in particular at the level of methods for encoding the domain knowledge and of algorithms supporting the intended interleaving of grounding and solving. We believe that our findings may provide valuable information to practitioners facing similar challenges and ultimately increase the adoption of clingo’s incremental capabilities for complex practical applications.

One of the approaches to solving reasoning problems is to generate and execute code!


Written by sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.
