KG reasoning // the only way to produce reasonable output

For reasoning, we can only work with [semi-]structured data, regardless of the particular way it is structured

sbagency
5 min read · Apr 16, 2024
https://arxiv.org/pdf/2403.11996.pdf

Leveraging generative Artificial Intelligence (AI), we have transformed a dataset comprising 1,000 scientific papers focused on biological materials into a comprehensive ontological knowledge graph. Through an in-depth structural analysis of this graph, we have calculated node degrees, identified communities along with their connectivities, and evaluated clustering coefficients and betweenness centrality of pivotal nodes, uncovering fascinating knowledge architectures. We find that the graph has an inherently scale-free nature, shows a high level of connectedness, and can be used as a rich source for downstream graph reasoning by taking advantage of transitive and isomorphic properties to reveal insights into unprecedented interdisciplinary relationships that can be used to answer queries, identify gaps in knowledge, propose never-before-seen material designs, and predict material behaviors. Using a large language embedding model we compute deep node representations and use combinatorial node similarity ranking to develop a path sampling strategy that allows us to link dissimilar concepts that have previously not been related. One comparison revealed detailed structural parallels between biological materials and Beethoven’s 9th Symphony, highlighting shared patterns of complexity through isomorphic mapping. In another example, the algorithm proposed an innovative hierarchical mycelium-based composite based on integrating path sampling with principles extracted from Kandinsky’s ‘Composition VII’ painting. The resulting material integrates an innovative set of concepts that include a balance of chaos and order, adjustable porosity, mechanical strength, and complex patterned chemical functionalization. We uncover other isomorphisms across science, technology and art, revealing a nuanced ontology of immanence that reveals a context-dependent heterarchical interplay of constituents. Because our method transcends established disciplinary boundaries through diverse data modalities (graphs, images, text, numerical data, etc.), graph-based generative AI achieves a far higher degree of novelty, explorative capacity, and technical detail than conventional approaches and establishes a widely useful framework for innovation by revealing hidden connections.
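As a rough illustration of the structural analyses mentioned in the abstract (node degrees, communities, clustering coefficients, betweenness centrality, and path sampling between distant concepts), here is a minimal sketch using networkx on a toy concept graph. The node names are made up for illustration and are not taken from the paper's data.

```python
# Toy sketch of the graph analyses described above; node names are illustrative.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.Graph()
G.add_edges_from([
    ("collagen", "hierarchical structure"),
    ("hierarchical structure", "toughness"),
    ("toughness", "nacre"),
    ("nacre", "brick-and-mortar architecture"),
    ("mycelium", "porosity"),
    ("porosity", "hierarchical structure"),
])

degrees = dict(G.degree())                             # node degrees
communities = list(greedy_modularity_communities(G))   # community detection
clustering = nx.clustering(G)                          # clustering coefficients
betweenness = nx.betweenness_centrality(G)             # pivotal "bridge" nodes

# The basic move behind path sampling: find a path linking two concepts
# that are not directly connected.
path = nx.shortest_path(G, "collagen", "brick-and-mortar architecture")
print(degrees)
print([sorted(c) for c in communities])
print(betweenness)
print(" -> ".join(path))
```

On a real, scale-free concept graph the same calls would surface hub nodes (high degree, high betweenness) and the sparse bridges between communities that the path-sampling strategy exploits.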

https://huggingface.co/lamm-mit/BioinspiredMixtral
https://pyvis.readthedocs.io/en/latest/
https://onlinelibrary.wiley.com/doi/10.1002/advs.202306724
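
The pyvis library linked just above can render such a concept graph as an interactive HTML page. A minimal, self-contained sketch (toy edges, illustrative names only):

```python
# Minimal pyvis sketch: export a small networkx concept graph to interactive HTML.
import networkx as nx
from pyvis.network import Network

G = nx.Graph()
G.add_edges_from([
    ("mycelium", "porosity"),
    ("porosity", "hierarchical structure"),
    ("hierarchical structure", "toughness"),
])

net = Network(height="600px", width="100%")
net.from_nx(G)                        # import nodes/edges from networkx
net.write_html("concept_graph.html")  # open the file in a browser to explore
```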

The study of biological materials and bio-inspired materials science is well established; however, surprisingly little knowledge is systematically translated to engineering solutions. To accelerate discovery and guide insights, an open-source autoregressive transformer large language model (LLM), BioinspiredLLM, is reported. The model is finetuned with a corpus of over a thousand peer-reviewed articles in the field of structural biological and bio-inspired materials and can be prompted to recall information, assist with research tasks, and function as an engine for creativity. The model has proven that it is able to accurately recall information about biological materials and is further strengthened with enhanced reasoning ability, as well as with Retrieval-Augmented Generation (RAG) to incorporate new data during generation that can also help to trace back sources, update the knowledge base, and connect knowledge domains. BioinspiredLLM has also been shown to develop sound hypotheses regarding biological materials design, remarkably so for materials that have never been explicitly studied before. Lastly, the model shows impressive promise in collaborating with other generative artificial intelligence models in a workflow that can reshape the traditional materials design process. This collaborative generative artificial intelligence method can stimulate and enhance bio-inspired materials design workflows. Biological materials are at a critical intersection of multiple scientific fields and models like BioinspiredLLM help to connect knowledge domains.
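
As a hedged sketch only: the BioinspiredMixtral checkpoint linked above can presumably be queried through the standard Hugging Face transformers API, and a RAG-style prompt simply prepends a retrieved passage to the question. The retrieved passage below is a placeholder string, not the authors' retrieval pipeline, and a Mixtral-sized model needs substantial GPU memory.

```python
# Hedged sketch: query the BioinspiredMixtral checkpoint with a RAG-style prompt.
# Assumes standard transformers loading; "retrieved_context" is a placeholder,
# not the authors' retrieval pipeline.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lamm-mit/BioinspiredMixtral"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

retrieved_context = "Nacre's brick-and-mortar microstructure deflects cracks and increases toughness."
question = "How could a nacre-inspired composite improve fracture toughness?"
prompt = f"Context:\n{retrieved_context}\n\nQuestion: {question}\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```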

https://neurosymbolic-ai-journal.com/content/call-papers-special-issue-knowledge-graphs-and-neurosymbolic-ai

Topics of Interest (include but are not limited to):
Neurosymbolic AI for knowledge engineering
Combination of Knowledge Engineering and Machine Learning
The use of ML techniques for creating, extending, improving or aligning knowledge graphs
Knowledge graphs for Neurosymbolic AI systems
Knowledge infusion in machine learning algorithms
Empirical results on the impact of enhancing ML systems with knowledge graphs
Knowledge graph quality and its influence on Neurosymbolic AI systems
Benchmarks, Quality Metrics
Knowledge graphs for trustworthy Neurosymbolic AI systems
Transparent/provenance-aware knowledge graphs
Policy-aware/compliant knowledge graphs
Commonsense knowledge
Explainable AI
The use of knowledge graphs for human-centric aspects of Neurosymbolic AI systems
Engineering knowledge-based Neurosymbolic AI systems
Design methodologies, e.g. Pattern-based Neurosymbolic AI systems
Applications of knowledge graphs-based Neurosymbolic AI systems
Domain-specific knowledge graphs for Neurosymbolic AI systems in domains such as medicine, biology, IoT, search, and others.

https://arxiv.org/pdf/2404.07103.pdf

Large language models (LLMs), while exhibiting exceptional performance, suffer from hallucinations, especially on knowledge-intensive tasks. Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora to alleviate the issue. However, in many domains, texts are interconnected (e.g., academic papers in a bibliographic graph are linked by citations and co-authorships) which form a (text-attributed) graph. The knowledge in such graphs is encoded not only in single texts/nodes but also in their associated connections. To facilitate the research of augmenting LLMs with graphs, we manually construct a Graph Reasoning Benchmark dataset called GRBENCH, containing 1,740 questions that can be answered with the knowledge from 10 domain graphs. Then, we propose a simple and effective framework called Graph Chain-of-thought (GRAPH-COT) to augment LLMs with graphs by encouraging LLMs to reason on the graph iteratively. Each GRAPH-COT iteration consists of three substeps: LLM reasoning, LLM-graph interaction, and graph execution. We conduct systematic experiments with three LLM backbones on GRBENCH, where GRAPH-COT outperforms the baselines consistently. The code is available at https://github.com/PeterGriffinJin/Graph-CoT.
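
To make the three substeps concrete, here is a rough sketch of one possible Graph-CoT-style loop. The llm() stub and the graph functions (RetrieveNode, Neighbors) are illustrative placeholders, not the interface of the linked repository or the GRBENCH graphs.

```python
# Illustrative Graph-CoT-style loop: LLM reasoning -> LLM-graph interaction ->
# graph execution, repeated until the LLM emits a final answer.
# llm() and the graph functions are placeholders, not the repo's actual interface.
import networkx as nx

def retrieve_node(G, name):
    """Graph function: check whether a named node exists."""
    return name if name in G else None

def neighbors(G, node):
    """Graph function: list a node's neighbors."""
    return list(G.neighbors(node)) if node in G else []

def llm(prompt: str) -> str:
    """Stand-in for a real LLM backend; plug in your own client here."""
    raise NotImplementedError

GRAPH_FUNCS = {"RetrieveNode": retrieve_node, "Neighbors": neighbors}

def graph_cot(G, question, max_iters=5):
    scratchpad = f"Question: {question}\n"
    for _ in range(max_iters):
        # 1. LLM reasoning: decide what information is still missing.
        thought = llm(scratchpad + "Thought: what information is still needed?")
        # 2. LLM-graph interaction: emit a graph function call or a final answer.
        action = llm(scratchpad + f"Thought: {thought}\nAction (FuncName[arg] or Finish[answer]):")
        if action.startswith("Finish["):
            return action[len("Finish["):-1]
        # 3. Graph execution: run the requested function and feed the result back.
        func_name, arg = action.split("[", 1)
        result = GRAPH_FUNCS[func_name](G, arg.rstrip("]"))
        scratchpad += f"Thought: {thought}\nAction: {action}\nObservation: {result}\n"
    return "No answer within the iteration budget."
```

The point of the loop is that the LLM never sees the whole graph; it asks for one hop of structure at a time and accumulates the observations in its scratchpad before answering.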

--

sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.