AI reasoning improvements // Neuro-symbolic and other knowledge frameworks to the rescue from hallucinations

Reasoning engines are the future of AI computing, not just statistics

sbagency
8 min read · Feb 3, 2024
https://arxiv.org/pdf/2210.05050.pdf

Neurosymbolic Programming (NP) techniques have the potential to accelerate scientific discovery. These models combine neural and symbolic components to learn complex patterns and representations from data, using high-level concepts or known constraints. NP techniques can interface with symbolic domain knowledge from scientists, such as prior knowledge and experimental context, to produce interpretable outputs. We identify opportunities and challenges at the interface between current NP models and scientific workflows, with real-world examples from behavior analysis, aiming to enable the broad use of NP across workflows in the natural and social sciences.

- Neurosymbolic programming (NP) combines neural networks and symbolic reasoning to build models that incorporate expert knowledge and constraints. NP has potential to accelerate scientific discovery by representing hypotheses as programs that can be analyzed and manipulated.

- Behavior analysis is used as an example application area. Challenges include noisy/imperfect data, rare behaviors, distribution shifts, and subjectivity in annotations. NP has the potential to handle these issues better than black-box ML models.

- Two main requirements for effective NP are a good domain-specific language (DSL) and scalable learning algorithms. The DSL encodes expert knowledge; learning involves architecture search and parameter optimization (see the sketch after this list).

- Opportunities and challenges for NP in science include: dealing with raw/noisy data, encoding domain knowledge, scalability, optimizing discrete/continuous representations, evaluating interpretability, building cross-domain benchmarks and tools.

- Differentiable programs enable differentiable parameter optimization within symbolic NP architectures. But challenges remain in scalability, stability, comprehensive evaluation, and practical deployment.

- Overall, NP offers promise to integrate prior knowledge and produce interpretable solutions that accelerate science, but research gaps remain in scaling and optimizing these techniques. Cross-disciplinary efforts are needed.
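
To make the DSL-plus-learning requirement concrete, here is a minimal sketch, not taken from the paper, of a differentiable program in the NP spirit for behavior analysis: the program skeleton (an if-then-else over a distance feature) stays symbolic and human-readable, while the threshold and branch outputs are continuous parameters fit by gradient descent. The `soft_if` relaxation and the toy data are assumptions for illustration.

```python
import torch

# Symbolic DSL fragment: "if distance < threshold then behavior A else B".
# The condition is relaxed with a sigmoid so the threshold stays learnable.
def soft_if(feature: torch.Tensor, threshold: torch.Tensor,
            then_val: torch.Tensor, else_val: torch.Tensor,
            temperature: float = 0.1) -> torch.Tensor:
    gate = torch.sigmoid((threshold - feature) / temperature)  # ~1 when feature < threshold
    return gate * then_val + (1 - gate) * else_val

# Toy data: inter-animal distance -> probability of "chase" behavior.
dist = torch.tensor([0.2, 0.3, 1.5, 2.0])
label = torch.tensor([1.0, 1.0, 0.0, 0.0])

threshold = torch.tensor(1.0, requires_grad=True)  # learnable symbolic constant
then_p = torch.tensor(0.5, requires_grad=True)     # branch outputs (pre-sigmoid)
else_p = torch.tensor(0.5, requires_grad=True)

opt = torch.optim.Adam([threshold, then_p, else_p], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    pred = soft_if(dist, threshold, torch.sigmoid(then_p), torch.sigmoid(else_p))
    loss = torch.nn.functional.binary_cross_entropy(pred, label)
    loss.backward()
    opt.step()

print(f"learned threshold: {threshold.item():.2f}")  # interpretable: "chase if dist < ~1"
```

The learned program remains inspectable: a scientist can read off the threshold as a hypothesis, which is the interpretability argument the paper makes for NP over black-box models.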

https://twitter.com/DinuMariusC/status/1753443407836623266
https://arxiv.org/pdf/2402.00854.pdf

We introduce SymbolicAI, a versatile and modular framework employing a logic-based approach to concept learning and flow management in generative processes. SymbolicAI enables the seamless integration of generative models with a diverse range of solvers by treating large language models (LLMs) as semantic parsers that execute tasks based on both natural and formal language instructions, thus bridging the gap between symbolic reasoning and generative AI. We leverage probabilistic programming principles to tackle complex tasks, and utilize differentiable and classical programming paradigms with their respective strengths. The framework introduces a set of polymorphic, compositional, and self-referential operations for data stream manipulation, aligning LLM outputs with user objectives. As a result, we can transition between the capabilities of various foundation models endowed with zero- and few-shot learning capabilities and specialized, fine-tuned models or solvers proficient in addressing specific problems. In turn, the framework facilitates the creation and evaluation of explainable computational graphs. We conclude by introducing a quality measure and its empirical score for evaluating these computational graphs, and propose a benchmark that compares various state-of-the-art LLMs across a set of complex workflows. We refer to the empirical score as the "Vector Embedding for Relational Trajectory Evaluation through Cross-similarity", or VERTEX score for short. The framework codebase and benchmark are publicly available.

SymbolicAI is a neuro-symbolic framework that combines generative models such as large language models (LLMs) with symbolic reasoning engines. The goal is to leverage the strengths of both symbolic AI and neural approaches.

- Uses a logic-based approach to concept learning and flow management in generative processes. Enables integrating LLMs as semantic parsers with symbolic expressions.

- Facilitates creating complex computational graphs through a modular probabilistic programming paradigm.

- Allows composing hierarchical and self-referential structures. Uses LLMs for few-shot learning of concepts and operations.

- Introduces a quality measure and benchmark for evaluating multi-step generative processes involving both neural and symbolic components.

- Aims to enable seamless transitions between symbolic and differentiable programming, focusing on high-level abstractions.

- Provides a toolkit to develop extensible, domain-invariant problem solvers by integrating pre-existing algorithms.

The paper evaluates SymbolicAI on tasks such as logical reasoning, program synthesis, and computational graph construction using models including GPT-3.5 and GPT-4. The evaluation highlights the framework's strengths but also exposes limitations around model capabilities, error handling, and generalization.

Overall, SymbolicAI offers a way to combine neural and symbolic AI to create more robust and explainable systems. The introduced benchmark provides a means to evaluate progress in this area.
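
As an intuition for "LLMs as semantic parsers" behind polymorphic, compositional operations, consider the following schematic sketch. This is not the SymbolicAI API; the `Symbol` wrapper, its operations, and the `llm` placeholder are assumptions for illustration.

```python
# Schematic sketch (not the SymbolicAI API): wrap values in a Symbol whose
# operations delegate to an LLM, so expressions compose into a traceable,
# explainable computational graph. `llm` stands for any chat-completion client.
def llm(instruction: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

class Symbol:
    def __init__(self, value: str, trace=None):
        self.value = value
        self.trace = trace or []  # records each operation for explainability

    def _op(self, name: str, instruction: str) -> "Symbol":
        result = llm(instruction)
        return Symbol(result, self.trace + [(name, self.value, result)])

    def translate(self, lang: str) -> "Symbol":
        return self._op("translate", f"Translate into {lang}: {self.value}")

    def __or__(self, query: str) -> "Symbol":  # `sym | "question"` = semantic query
        return self._op("query", f"{self.value}\n\nAnswer: {query}")

# Usage: chained operations build a graph whose trace can be inspected and
# scored, which is the kind of structure the VERTEX benchmark evaluates.
# answer = (Symbol("Berlin is the capital of Germany.") | "Which country?").value
```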

https://browse.arxiv.org/pdf/2402.00414.pdf

Augmenting large language models (LLMs) with user-specific knowledge is crucial for real-world applications, such as personal AI assistants. However, LLMs inherently lack mechanisms for prompt-driven knowledge capture. This paper investigates utilizing the existing LLM capabilities to enable prompt-driven knowledge capture, with a particular emphasis on knowledge graphs. We address this challenge by focusing on prompt-to-triple (P2T) generation. We explore three methods: zero-shot prompting, few-shot prompting, and fine-tuning, and then assess their performance via a specialized synthetic dataset. Our code and datasets are publicly available at https://github.com/HaltiaAI/paper-PTSKC

P2T (subject, predicate, object)

Prompt-to-triple (P2T) generation produces triples of the form ('subject', 'predicate', 'object'). Each triple represents a distinct atom of knowledge, with the subject and object identifying entities and the predicate describing their relationship.

T2G (head, relation, tail)
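
A minimal sketch of what zero-shot P2T generation could look like, assuming an arbitrary `chat` completion function and a JSON output convention; the paper's actual prompts and parsing live in the linked repository.

```python
import json

# Hedged sketch of zero-shot prompt-to-triple (P2T) generation: ask the LLM
# to emit knowledge as JSON triples, then parse them into Python tuples.
P2T_PROMPT = (
    "Extract the knowledge stated in the user prompt as a JSON list of "
    'triples, each of the form ["subject", "predicate", "object"].\n\n'
    "Prompt: {prompt}\nTriples:"
)

def prompt_to_triples(prompt: str, chat) -> list[tuple[str, str, str]]:
    raw = chat(P2T_PROMPT.format(prompt=prompt))  # chat: any LLM call -> str
    return [tuple(t) for t in json.loads(raw)]

# e.g. "My sister Ann lives in Oslo." might yield
#   [("Ann", "sibling_of", "user"), ("Ann", "lives_in", "Oslo")]
```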

https://browse.arxiv.org/pdf/2401.14003.pdf

Reasoning over Commonsense Knowledge Bases (CSKB), i.e., CSKB reasoning, has been explored as a way to acquire new commonsense knowledge based on reference knowledge in the original CSKBs and external prior knowledge. Despite the advancement of Large Language Models (LLMs) and prompt engineering techniques in various reasoning tasks, they still struggle to deal with CSKB reasoning. One of the problems is that it is hard for them to acquire explicit relational constraints in CSKBs from only in-context exemplars, due to a lack of symbolic reasoning capabilities (Bengio et al., 2021). To this end, we propose ConstraintChecker, a plugin over prompting techniques to provide and check explicit constraints. When considering a new knowledge instance, ConstraintChecker employs a rule-based module to produce a list of constraints, then it uses a zero-shot learning module to check whether this knowledge instance satisfies all constraints. The acquired constraint-checking result is then aggregated with the output of the main prompting technique to produce the final output. Experimental results on CSKB reasoning benchmarks demonstrate the effectiveness of our method by bringing consistent improvements over all prompting methods. Codes and data are available at https://github.com/HKUST-KnowComp/ConstraintChecker.
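
The plugin structure can be sketched as follows, with illustrative constraint templates rather than the paper's exact rules; `chat` again stands for any zero-shot LLM call.

```python
# Hedged sketch of the ConstraintChecker idea: a rule-based module emits
# explicit constraints for a candidate triple, a zero-shot LLM call checks
# each one, and the verdicts gate the main prompting technique's answer.
CONSTRAINTS = {
    "xWant": ["Is '{head}' an event involving a person?",
              "Is '{tail}' something a person can want?"],
    "isBefore": ["Is '{head}' an event?", "Is '{tail}' an event?"],
}

def yes(question: str, chat) -> bool:
    return chat(question + " Answer Yes or No.").strip().lower().startswith("yes")

def constraint_checked(head: str, relation: str, tail: str,
                       main_verdict: bool, chat) -> bool:
    checks = (yes(q.format(head=head, tail=tail), chat)
              for q in CONSTRAINTS.get(relation, []))
    return main_verdict and all(checks)  # reject if any explicit constraint fails
```

Because `checks` is a lazy generator, the constraint calls are skipped when the main prompting technique already rejects the instance, keeping the plugin cheap.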

https://arxiv.org/pdf/2401.06853.pdf

Large language models (LLMs) learn temporal concepts from the co-occurrence of related tokens in a sequence. Compared with conventional text generation, temporal reasoning, which reaches a conclusion based on mathematical, logical and commonsense knowledge, is more challenging. In this paper, we propose TempGraph-LLM, a new paradigm towards text-based temporal reasoning. To be specific, we first teach LLMs to translate the context into a temporal graph. A synthetic dataset, which is fully controllable and requires minimal supervision, is constructed for pre-training on this task. We show in experiments that LLMs benefit from this pre-training on other tasks. On top of that, we guide LLMs to perform symbolic reasoning with the strategies of Chain of Thoughts (CoTs) bootstrapping and special data augmentation. We observe that CoTs with symbolic reasoning bring more consistent and reliable results than those using free text.

TempGraph (sub; rel; obj; start/end; time)

TempGraph-LLM, a new paradigm for language models, is proposed to improve performance on temporal reasoning. Besides producing the final answers, the framework also provides the temporal graph and the symbolic reasoning process. Experiments indicate that TempGraph-LLM outperforms existing methods. An interesting direction for future work is extending it to more complex applications such as inductive and abductive reasoning; given the graph structure and the capability for symbolic reasoning, it is promising to improve LLM performance on these tasks with good problem formalization and methodological extensions.
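
The (sub; rel; obj; start/end; time) representation lends itself to plain symbolic queries once the LLM has translated the text into a graph. A minimal sketch, with the field names, example edges, and the interval query all assumed for illustration:

```python
from dataclasses import dataclass

# Sketch of a temporal graph as (sub; rel; obj; start/end; time) edges:
# the LLM translates text into these edges, and symbolic code, not free-text
# generation, answers temporal-ordering questions over them.
@dataclass
class TemporalEdge:
    sub: str
    rel: str
    obj: str
    boundary: str  # "start" or "end"
    time: int

graph = [
    TemporalEdge("Alice", "works_at", "Acme", "start", 2010),
    TemporalEdge("Alice", "works_at", "Acme", "end", 2015),
    TemporalEdge("Alice", "works_at", "Globex", "start", 2015),
]

def active_in(sub: str, rel: str, obj: str, year: int) -> bool:
    """True if the relation holds at `year` given its start/end boundaries."""
    start = next((e.time for e in graph
                  if (e.sub, e.rel, e.obj, e.boundary) == (sub, rel, obj, "start")), None)
    end = next((e.time for e in graph
                if (e.sub, e.rel, e.obj, e.boundary) == (sub, rel, obj, "end")), None)
    return start is not None and start <= year and (end is None or year < end)

print(active_in("Alice", "works_at", "Acme", 2012))   # True
print(active_in("Alice", "works_at", "Acme", 2016))   # False
```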

https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/
https://arxiv.org/pdf/2402.00591.pdf

This paper presents sandra, a neuro-symbolic reasoner combining vectorial representations with deductive reasoning. Sandra builds a vector space constrained by an ontology and performs reasoning over it. The geometric nature of the reasoner allows its combination with neural networks, bridging the gap with symbolic knowledge representations. Sandra is based on the Description and Situation (DnS) ontology design pattern, a formalization of frame semantics. Given a set of facts (a situation), it allows inferring all possible perspectives (descriptions) that can provide a plausible interpretation for it, even in the presence of incomplete information. We prove that our method is correct with respect to the DnS model. We experiment with two different tasks and their standard benchmarks, demonstrating that, without increasing complexity, sandra (i) outperforms all the baselines, (ii) provides interpretability in the classification process, and (iii) allows control over the vector space, which is designed a priori.
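
The geometric idea can be caricatured as follows. This is a heavily simplified sketch of subspace-based interpretation, not sandra's actual construction: the role basis, the descriptions, and the projection score are all invented for illustration.

```python
import numpy as np

# Toy version of "descriptions as subspaces": the ontology fixes role vectors
# a priori; a situation (set of observed roles) is embedded in the same space;
# how well a description interprets the situation is scored by projecting the
# situation vector onto that description's subspace.
roles = {"agent": np.array([1., 0., 0.]),
         "instrument": np.array([0., 1., 0.]),
         "location": np.array([0., 0., 1.])}

descriptions = {"surgery": ["agent", "instrument"],
                "travel": ["agent", "location"]}

def interpret(observed_roles: list[str]) -> dict[str, float]:
    s = sum(roles[r] for r in observed_roles)        # situation vector
    scores = {}
    for name, rs in descriptions.items():
        basis = np.stack([roles[r] for r in rs])     # orthonormal rows here
        proj = basis.T @ (basis @ s)                 # projection onto subspace
        scores[name] = float(np.linalg.norm(proj) / np.linalg.norm(s))
    return scores

print(interpret(["agent", "instrument"]))  # "surgery" scores highest
```

Even with missing roles the scores stay defined, which mirrors the paper's point about plausible interpretation under incomplete information.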

https://arxiv.org/pdf/2402.00745.pdf

An increasing amount of research in Natural Language Inference (NLI) focuses on the application and evaluation of Large Language Models (LLMs) and their reasoning capabilities. Despite their success, however, LLMs are still prone to factual errors and inconsistencies in their explanations, offering limited control and interpretability for inference in complex domains. In this paper, we focus on ethical NLI, investigating how hybrid neurosymbolic techniques can enhance the logical validity and alignment of ethical explanations produced by LLMs. Specifically, we present an abductive-deductive framework named LogicExplainer, which integrates LLMs with an external backward-chaining solver to refine step-wise natural language explanations and jointly verify their correctness, reduce incompleteness and minimise redundancy. An extensive empirical analysis demonstrates that LogicExplainer can improve explanations generated via in-context learning methods and Chain-of-Thought (CoT) on challenging ethical NLI tasks, while, at the same time, producing formal proofs describing and supporting models' reasoning. As ethical NLI requires commonsense reasoning to identify underlying moral violations, our results suggest the effectiveness of neuro-symbolic methods for multi-step NLI more broadly, opening new opportunities to enhance the logical consistency, reliability, and alignment of LLMs.
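
The deductive half of such an abductive-deductive loop reduces to backward chaining over formalized facts and rules. A minimal propositional sketch, not the paper's solver, with the ethical facts and rules invented for illustration: the LLM abduces candidate explanatory statements, they are formalized like this, and the solver checks that the moral-violation conclusion actually follows, feeding gaps back for refinement.

```python
# Minimal propositional backward chaining: a goal is proved if it is a known
# fact, or if some rule's head matches it and every atom in the body is provable.
facts = {"tom_lies_to_anna"}
rules = [("tom_deceives_anna", ["tom_lies_to_anna"]),   # (head, body)
         ("tom_acts_wrongly", ["tom_deceives_anna"])]

def prove(goal: str, depth: int = 10) -> bool:
    if goal in facts:
        return True
    if depth == 0:                      # guard against cyclic rule sets
        return False
    return any(all(prove(atom, depth - 1) for atom in body)
               for head, body in rules if head == goal)

assert prove("tom_acts_wrongly")  # the step-wise explanation is logically valid
```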

--

sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.