The article discusses the challenges of scalability in graph transformers, a type of model used in machine learning for graph-structured data. Graphs, which represent objects and their relations, are common in domains such as social networks and molecular structures. The traditional approach of using dense, fully connected interaction graphs in graph transformers faces quadratic computational and memory bottlenecks, limiting its applicability to small graphs.
The proposed solution, Exphormer, introduces a sparse attention framework built on expander graphs, which are sparse yet well-connected. Expander graphs enable effective communication between distant nodes, addressing the scalability challenge. Exphormer achieves strong empirical results on various datasets, outperforming previous models on long-range dependencies and scaling to larger graphs. The framework constructs an interaction graph by combining edges from the input graph with expander edges and virtual nodes, and the paper demonstrates both its theoretical properties and its empirical performance on datasets with 10,000+ nodes. Because the interaction graph is sparse, Exphormer's attention mechanism has cost linear in the size of the original input graph, enhancing the scalability of graph transformers.
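The sketch below illustrates, under stated assumptions, how such an interaction graph could be assembled: the input edges are kept, the edges of a random regular graph (an expander with high probability) are added as sparse long-range shortcuts, and every node is connected to a small number of virtual global nodes. This is a minimal illustration, not the authors' implementation; the function name, parameter names, and the choice of `networkx.random_regular_graph` as the expander construction are assumptions for the example.

```python
import networkx as nx

def build_interaction_graph(input_edges, num_nodes, expander_degree=3, num_virtual=1, seed=0):
    """Return an undirected edge list over num_nodes real nodes plus num_virtual virtual nodes."""
    def norm(u, v):
        return (min(u, v), max(u, v))

    edges = set()

    # 1. Keep every edge of the original input graph.
    for u, v in input_edges:
        edges.add(norm(u, v))

    # 2. Add the edges of a random d-regular graph; such graphs are expanders
    #    with high probability and provide sparse long-range "shortcut" edges.
    expander = nx.random_regular_graph(expander_degree, num_nodes, seed=seed)
    for u, v in expander.edges():
        edges.add(norm(u, v))

    # 3. Connect every real node to each virtual global node
    #    (virtual nodes take indices num_nodes, num_nodes + 1, ...).
    for k in range(num_virtual):
        virtual = num_nodes + k
        for u in range(num_nodes):
            edges.add(norm(u, virtual))

    # Total edge count is O(m + n * expander_degree + n * num_virtual),
    # i.e. linear in the size of the input graph.
    return sorted(edges)

# Example: a 6-node path graph augmented with a 3-regular expander and one virtual node.
path_edges = [(i, i + 1) for i in range(5)]
print(build_interaction_graph(path_edges, num_nodes=6))
```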
While transformers excel in language modeling, applying them to graphs requires defining attention over pairs of nodes, which is typically done with a fully connected interaction graph and is therefore memory-inefficient. Exphormer introduces a novel approach that uses expander graphs, sparse representations of relationships, combined with virtual global nodes. This architecture offers a more efficient way to capture long-range dependencies, overcoming the memory constraints of fully connected graphs. This innovation paves the way for scalable graph transformers, extending their applicability to large datasets with diverse relationships and long-distance dependencies.
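To make the memory argument concrete, the following sketch restricts attention to the edges of a given interaction graph, so only O(|E|) pairwise scores are computed instead of O(n²). It is a single-head, bias-free toy version written for clarity (the function and variable names are illustrative assumptions, not the paper's code).

```python
import numpy as np

def sparse_attention(X, edges, Wq, Wk, Wv):
    """X: (n, d) node features; edges: iterable of (dst, src) pairs allowed to attend."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[1]
    out = np.zeros_like(V)

    # Group incoming edges by destination so the softmax runs only over each
    # node's neighbours in the interaction graph, not over all n nodes.
    neighbours = {}
    for dst, src in edges:
        neighbours.setdefault(dst, []).append(src)

    for dst, srcs in neighbours.items():
        scores = np.array([Q[dst] @ K[s] for s in srcs]) / np.sqrt(d)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[dst] = sum(w * V[s] for w, s in zip(weights, srcs))
    return out

# Usage: a 4-node cycle where each node attends to itself and its two neighbours.
rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
edges = [(i, j) for i in range(n) for j in (i, (i - 1) % n, (i + 1) % n)]
print(sparse_attention(X, edges, Wq, Wk, Wv).shape)  # (4, 8)
```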
Graph transformers have emerged as a promising architecture for a variety of graph learning and representation tasks. Despite their successes, though, it remains challenging to scale graph transformers to large graphs while maintaining accuracy competitive with message-passing networks. In this paper, we introduce EXPHORMER, a framework for building powerful and scalable graph transformers. EXPHORMER consists of a sparse attention mechanism built from two components: virtual global nodes and expander graphs, whose mathematical characteristics, such as spectral expansion, pseudorandomness, and sparsity, yield graph transformers with complexity only linear in the size of the graph, while allowing us to prove desirable theoretical properties of the resulting transformer models. We show that incorporating EXPHORMER into the recently proposed GraphGPS framework produces models with competitive empirical results on a wide variety of graph datasets, including state-of-the-art results on three datasets. We also show that EXPHORMER can scale to datasets on larger graphs than shown in previous graph transformer architectures. Code can be found at https://github.com/hamed1375/Exphormer.
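As a rough sketch of how the sparse attention could sit inside a GraphGPS-style hybrid layer, the snippet below runs a local message-passing branch over the input edges in parallel with the sparse global attention over the Exphormer interaction graph and sums the two with a residual connection. This is not the GraphGPS codebase: the helper names are assumptions, and `sparse_attention` refers to the toy function from the previous sketch.

```python
import numpy as np

def mpnn_update(X, input_edges, W):
    """Mean aggregation over input-graph neighbours followed by a linear map."""
    n = X.shape[0]
    agg = np.zeros_like(X)
    deg = np.zeros(n)
    for u, v in input_edges:
        agg[u] += X[v]; agg[v] += X[u]
        deg[u] += 1.0;  deg[v] += 1.0
    return (agg / np.maximum(deg, 1.0)[:, None]) @ W

def hybrid_layer(X, input_edges, interaction_edges, Wm, Wq, Wk, Wv):
    # Local (message-passing) and global (sparse attention) branches run in
    # parallel and are summed with a residual connection, in the spirit of
    # GraphGPS; sparse_attention is the sketch from the previous block.
    local_branch = mpnn_update(X, input_edges, Wm)
    global_branch = sparse_attention(X, interaction_edges, Wq, Wk, Wv)
    return X + local_branch + global_branch
```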