Geometric Deep Learning

Unifying Framework for Deep Learning on Non-Euclidean Domains

By Rohan Sharma (Scribe) | Updated March 10, 2026

Geometric deep learning

arXiv

Field: Machine Learning, Deep Learning
Key authors: Michael Bronstein, Joan Bruna, Taco Cohen, Petar Velickovic
Introduced: 2017 (term coined)
Formalized: 2021 (5G proto-book)
Domains: Grids, Groups, Graphs, Geodesics, Gauges
Core principle: Symmetry and equivariance

Key Milestones

2014: Spectral graph convolutions proposed (Joan Bruna et al.)
2016: ChebNet: efficient spectral filtering on graphs (Defferrard, Bresson, Vandergheynst)
2016: Group Equivariant CNNs (Cohen and Welling)
2017: Graph Convolutional Networks (GCNs) (Kipf and Welling)
2017: Term 'geometric deep learning' coined (Bronstein et al., IEEE Signal Processing Magazine)
2017: Neural Message Passing framework (Gilmer et al.)
2020: AlphaFold2 wins CASP14 (DeepMind)
2021: Graphormer wins OGB-LSC at KDD Cup (Ying et al., Microsoft)
2021: 5G proto-book published (Bronstein, Bruna, Cohen, Velickovic)
2022: GPS Graph Transformer framework (Rampasek et al.)
2023: GraphCast outperforms ECMWF weather forecasting (Lam et al., DeepMind)
2023: GNoME discovers 2.2M new crystal structures (Merchant et al., DeepMind)

Related Concepts

Graph Neural Networks, Equivariance, Group Theory, Manifolds, Message Passing, Spectral Graph Theory, Transformers, Graph Transformers, SE(3) Symmetry, Gauge Theory, AlphaFold, GraphCast

Geometric deep learning (GDL) is a class of deep learning techniques that generalize neural network architectures from Euclidean-structured data — images, text, audio — to non-Euclidean domains such as graphs, manifolds, and point clouds.¹ The field connects successful deep learning architectures to the symmetries of their input domains using group theory and differential geometry, providing both a unifying mathematical framework for existing models and a constructive method for designing new ones.²

Motivation

Convolutional neural networks (CNNs) exploit translation symmetry: a filter that detects an edge works equally well at any position in an image, so weights can be shared across all positions.¹ Recurrent neural networks similarly exploit the sequential structure of chains. But social networks, molecular structures, protein interaction maps, 3D meshes, and the surface of the Earth are not grids — they are graphs or curved surfaces.³ Forcing this data into grid representations introduces distortions and discards structural information.¹

Geometric deep learning starts from the question: what properties make CNNs effective on grids, and how can those properties be preserved on non-Euclidean domains?²

The Erlangen Program for Deep Learning

Felix Klein's 1872 Erlangen Program classified geometries by their symmetry groups.² Bronstein, Bruna, Cohen, and Velickovic applied this idea to deep learning in their 2021 proto-book "Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges." They showed that CNNs, graph neural networks (GNNs), recurrent neural networks, and Transformers are all instances of a single design principle: learning functions that respect the symmetries of their domain.²

The central concept is equivariance. A layer is equivariant to a group of transformations if transforming the input produces a correspondingly transformed output.⁴ For a CNN, the group is translations on the pixel grid. For a GNN, it is permutations of nodes. For data on a sphere, it is the rotation group SO(3).²,⁴

This yields a taxonomy of architectures organized by domain (grids, groups, graphs, geodesics, gauges) and the symmetry group each architecture respects.²

Historical Development

The roots of geometric deep learning lie in spectral graph theory. In 2014, Joan Bruna and colleagues defined graph convolutions using filters in the Fourier domain of the graph Laplacian — the first principled generalization of CNNs to irregular graphs.⁵ These spectral methods were computationally expensive and did not transfer between graphs with different structures.⁶

Two developments in 2016-2017 made graph neural networks practical. Defferrard, Bresson, and Vandergheynst introduced ChebNet, which approximated spectral filters with Chebyshev polynomials, making graph convolutions efficient and spatially localized.⁶ Kipf and Welling simplified this further with Graph Convolutional Networks (GCNs): a first-order approximation of spectral graph convolutions that achieved state-of-the-art results on semi-supervised node classification benchmarks.¹¹

The term "geometric deep learning" was introduced by Michael Bronstein and collaborators in a 2017 IEEE Signal Processing Magazine review surveying generalizations of deep learning to non-Euclidean domains.³ In the same period, Monti et al. proposed mixture model CNNs (MoNet) as a unified framework for graph and manifold convolutions,⁷ and Cohen and Welling introduced group equivariant CNNs, extending translation equivariance to discrete rotation and reflection groups.⁸

The 2021 proto-book by Bronstein, Bruna, Cohen, and Velickovic consolidated these threads into a single geometric framework connecting grids, groups, graphs, geodesics, and gauges.²

Core Concepts

Symmetry, Invariance, and Equivariance

A symmetry is a transformation that preserves domain structure. Translations preserve pixel grids; node permutations preserving edges are graph symmetries; rotations are symmetries of the sphere.² Networks that encode these symmetries avoid redundant computation and require fewer training examples.⁴

Invariance means the output does not change under transformation: f(Tx) = f(x). Graph-level classification, for example, should be invariant to node relabeling.⁴ Equivariance means the output transforms correspondingly: f(Tx) = T'f(x). Predicting atomic force vectors on a rotated molecule should produce correspondingly rotated vectors.²
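The two definitions can be checked numerically for node permutations. The following is a minimal sketch, assuming a toy per-node layer with shared weights and a sum readout; the names `node_layer` and `graph_readout` are illustrative and not taken from any cited work.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))      # 5 nodes, 3 features each
W = rng.normal(size=(3, 4))      # weights shared by every node

perm = rng.permutation(5)        # a relabeling T of the nodes
P = np.eye(5)[perm]              # permutation matrix: (P @ X)[i] == X[perm[i]]

def node_layer(X):
    """Per-node map with shared weights (hypothetical example layer)."""
    return np.tanh(X @ W)

def graph_readout(X):
    """Sum over nodes, giving one vector per graph."""
    return node_layer(X).sum(axis=0)

# Equivariance: f(Tx) = T f(x)
equivariant = np.allclose(node_layer(P @ X), P @ node_layer(X))
# Invariance: f(Tx) = f(x)
invariant = np.allclose(graph_readout(P @ X), graph_readout(X))
print(equivariant, invariant)
```

Here the output transformation T' coincides with the input transformation T because the layer keeps one row per node; in general T and T' act on different spaces.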

Message Passing and Graph Neural Networks

Gilmer et al. formalized the message passing framework in 2017.⁹ In each layer of a message passing neural network (MPNN), every node aggregates features from its neighbors using a permutation-invariant function (sum, mean, or max), then updates its own representation. After k layers, each node's representation encodes its k-hop neighborhood.⁹
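A minimal sketch of one message passing step with sum aggregation, written with NumPy. The function name `mpnn_layer` and the identity weights are assumptions chosen so that the k-hop claim is easy to trace; real MPNNs use learned message and update functions.

```python
import numpy as np

def mpnn_layer(A, H, W_self, W_nbr):
    """One message passing step with sum aggregation (illustrative sketch).

    A: (n, n) adjacency matrix, H: (n, d) node features.
    """
    messages = A @ H                                       # row i sums the features of i's neighbors
    return np.maximum(0.0, H @ W_self + messages @ W_nbr)  # ReLU update

# Path graph 0 - 1 - 2 - 3
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

# A signal placed on node 0 only; identity weights keep the arithmetic transparent.
H = np.zeros((4, 1))
H[0, 0] = 1.0
W = np.eye(1)

reach = []
for k in range(3):
    H = mpnn_layer(A, H, W, W)
    reach.append((H[:, 0] > 0).tolist())   # which nodes have received the signal

for k, r in enumerate(reach, start=1):
    print(k, r)
```

After layer k the signal from node 0 has reached exactly the nodes within k hops, which is the sense in which a node's representation encodes its k-hop neighborhood.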

Graph Convolutional Networks (GCNs) are a specific case. A GCN layer multiplies node features by the symmetrically normalized adjacency matrix (with self-loops added), applies a learned weight matrix, and passes the result through a nonlinearity.¹¹ Kipf and Welling observed that this operation can be read as a differentiable, parameterized generalization of the Weisfeiler-Lehman graph isomorphism test: even with random, untrained weights, a 3-layer GCN on Zachary's karate club network produces embeddings that reflect community structure.⁵
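The layer described above can be sketched in a few lines of NumPy. This is an illustrative dense implementation on a toy graph, assuming the self-loop variant of the normalization; the helper names are hypothetical, and practical libraries use sparse operations instead.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

def gcn_layer(A, H, W):
    """One GCN layer: ReLU(A_norm @ H @ W)."""
    return np.maximum(0.0, normalized_adjacency(A) @ H @ W)

# Toy path graph 0 - 1 - 2 with 2 input and 2 output features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
W = np.random.default_rng(0).normal(size=(2, 2))   # random, untrained weights

A_norm = normalized_adjacency(A)
H_next = gcn_layer(A, H, W)
print(A_norm.round(3))
print(H_next.shape)
```

Stacking three such layers with random weights mirrors the karate club experiment in spirit, although that dataset is not reproduced here.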

Graph Attention Networks (GATs) replace fixed normalization with learned attention weights per neighbor.⁹
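As a rough sketch of what "learned attention weights per neighbor" means, the code below computes single-head attention coefficients in the style of the original GAT formulation: a LeakyReLU of a learned linear score on each concatenated pair of projected features, followed by a softmax over each node's neighborhood. The function name, the dense masking, and the inclusion of self-loops are assumptions of this example.

```python
import numpy as np

def gat_attention(A, H, W, a, slope=0.2):
    """Single-head GAT-style attention on a dense adjacency matrix (sketch)."""
    Z = H @ W                                          # projected features, shape (n, d)
    d = Z.shape[1]
    # a^T [z_i || z_j] splits into a term for node i and a term for node j.
    e = (Z @ a[:d])[:, None] + (Z @ a[d:])[None, :]
    e = np.where(e > 0, e, slope * e)                  # LeakyReLU
    mask = (A + np.eye(A.shape[0])) > 0                # attend to neighbors and self
    e = np.where(mask, e, -np.inf)                     # non-edges get zero weight
    e = e - e.max(axis=1, keepdims=True)               # numerical stability
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)   # softmax over each neighborhood
    return alpha, alpha @ Z

rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])                           # path graph 0 - 1 - 2
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
a = rng.normal(size=4)                                 # attention vector of length 2 * d

alpha, H_next = gat_attention(A, H, W, a)
print(alpha.round(3))
```

Compared with the fixed coefficients of a GCN, each row of `alpha` depends on the node features, so two neighbors of the same node can receive different weights.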

Equivariant Neural Networks

Geometric deep learning also addresses equivariance to continuous symmetry groups. SE(3)-equivariant networks produce outputs that transform correctly under 3D rotations and translations — a requirement for molecular and physical simulations.⁴

Constructing equivariant layers uses representation theory: an equivariant linear map between feature spaces is an intertwiner between representations of the symmetry group G.⁴ For the rotation group SO(3), this involves constraining convolution kernels using spherical harmonics and Wigner matrices.⁴
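The equivariance requirement itself can be tested numerically. The sketch below does not use spherical harmonics or Wigner matrices; it swaps in a much simpler construction (distance-weighted sums of relative position vectors) that is equivariant to rotations and invariant to translations by design. All names are hypothetical.

```python
import numpy as np

def pairwise_vector_field(X):
    """Assign each point a vector built from relative positions (illustrative)."""
    diff = X[:, None, :] - X[None, :, :]          # diff[i, j] = x_i - x_j, shape (n, n, 3)
    dist = np.linalg.norm(diff, axis=-1)          # unchanged by rotations and translations
    weight = np.exp(-dist ** 2)                   # any function of distance would do
    return (weight[..., None] * diff).sum(axis=1)

def random_rotation(rng):
    """Draw a 3x3 rotation matrix via QR decomposition."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1                             # flip one axis so that det(Q) = +1
    return Q

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))                       # 6 points in 3D, one per row
R, t = random_rotation(rng), rng.normal(size=3)

F = pairwise_vector_field(X)
F_moved = pairwise_vector_field(X @ R.T + t)      # rotate and translate the input

is_equivariant = np.allclose(F_moved, F @ R.T)    # output rotates with the input
print(is_equivariant)
```

Because the output vectors are differences of positions, the translation t cancels and only the rotation R acts on the output, which is the behavior expected of force-like quantities.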

Gauge Equivariance on Manifolds

On curved surfaces and general manifolds, there is no global coordinate system, so feature vectors at different points cannot be directly compared.⁴ Gauge equivariant neural networks solve this by defining convolutions on fiber bundles using connections and parallel transport from differential geometry. The network's output is then independent of arbitrary local coordinate choices.²,⁴

Architectures as Special Cases

The geometric framework reveals that several well-known architectures share common structure.²

CNNs are group convolutions over the translation group on a grid.² GNNs are permutation-equivariant message passing networks on graphs.² Transformers operate on fully connected graphs — self-attention is message passing where every token attends to every other token, with edge weights computed dynamically.² Recurrent neural networks process sequences, which are chains (a graph with linear connectivity) with causal ordering.² Spherical CNNs are SO(3)-equivariant networks on the sphere, using spherical harmonics as a Fourier basis.⁴
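The reading of self-attention as message passing on a fully connected graph can be made concrete. In this minimal single-head sketch (no positional encodings, hypothetical weight names), the softmax matrix plays the role of a dense, input-dependent adjacency matrix, and the layer is permutation-equivariant like a GNN.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention viewed as message passing (sketch)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    edge_weights = softmax(Q @ K.T / np.sqrt(K.shape[1]))  # dense (n, n) "adjacency"
    return edge_weights, edge_weights @ V                  # weighted sum of messages

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                                # 5 tokens, 8 features
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))

edge_weights, out = self_attention(X, Wq, Wk, Wv)

# Relabel the tokens and check that the output is relabeled the same way.
P = np.eye(5)[rng.permutation(5)]
_, out_perm = self_attention(P @ X, Wq, Wk, Wv)
permutation_equivariant = np.allclose(out_perm, P @ out)
print(permutation_equivariant)
```

Every entry of `edge_weights` is strictly positive, so each token receives a message from every other token; positional encodings are what break this permutation symmetry in sequence models.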

Graph Transformers

Standard message passing networks are limited to local neighborhoods: each layer aggregates information from immediate neighbors, so capturing long-range dependencies requires stacking many layers, which leads to over-smoothing.⁹ Graph Transformers address this by applying global self-attention over all nodes, allowing every node to attend to every other node in a single layer.¹²

Graphormer, introduced by Ying et al. at Microsoft in 2021, encoded graph structure into a standard Transformer via three additions: centrality encoding (learned degree embeddings added to the node features), spatial encoding (a learned attention bias indexed by shortest-path distance), and edge encoding (an attention bias aggregated from edge features along shortest paths).¹² It took first place in the quantum chemistry track of the OGB Large-Scale Challenge at KDD Cup 2021, which used the PCQM4M molecular property prediction benchmark.¹²
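The spatial encoding idea can be illustrated with a small sketch: compute shortest-path distances, look up one scalar per distance, and add it to the attention logits before the softmax. The fixed `bias_table` below stands in for parameters that would be learned, and the function names are hypothetical rather than taken from the Graphormer code base.

```python
import numpy as np
from collections import deque

def shortest_path_distances(adj_list):
    """All-pairs hop distances by BFS from every node; -1 marks unreachable pairs."""
    n = len(adj_list)
    dist = np.full((n, n), -1, dtype=int)
    for s in range(n):
        dist[s, s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj_list[u]:
                if dist[s, v] < 0:
                    dist[s, v] = dist[s, u] + 1
                    queue.append(v)
    return dist

def spatially_biased_attention(scores, spd, bias_table):
    """Softmax over attention logits plus a per-distance scalar bias (sketch)."""
    far = len(bias_table) - 1                          # bucket for distant or unreachable pairs
    idx = np.where(spd < 0, far, np.minimum(spd, far))
    logits = scores + bias_table[idx]
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

path = [[1], [0, 2], [1, 3], [2]]                      # path graph 0 - 1 - 2 - 3
spd = shortest_path_distances(path)
bias_table = np.array([0.0, -1.0, -2.0, -3.0])         # stand-in for learned scalars
attn = spatially_biased_attention(np.zeros((4, 4)), spd, bias_table)
print(spd[0], attn[0].round(3))
```

With zero content scores and a decreasing bias table, attention from node 0 decays with graph distance; a trained model is free to learn any other dependence on distance.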

The GPS (General, Powerful, Scalable) framework by Rampasek et al. at NeurIPS 2022 proposed a modular recipe combining three ingredients: positional/structural encodings (such as random walk or Laplacian eigenvector encodings), a local message passing layer, and a global attention layer.¹³ Each GPS layer runs the local and global components in parallel and sums their outputs. When the global component uses a linear-attention variant such as Performer or BigBird, the layer's cost is linear in the number of nodes and edges, while the local component retains the inductive biases of message passing.¹³
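A heavily simplified sketch of this recipe follows. It assumes a random-walk structural encoding built from return probabilities, a sum-aggregation local step, and dense softmax attention for the global step (quadratic in the number of nodes, not a linear approximation). Normalization layers, the feed-forward block, and edge features are omitted, and the names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def random_walk_encoding(A, steps):
    """Column t holds each node's (t+1)-step return probability. Assumes no isolated nodes."""
    RW = A / A.sum(axis=1, keepdims=True)           # row-normalized random-walk matrix
    M = np.eye(A.shape[0])
    cols = []
    for _ in range(steps):
        M = M @ RW
        cols.append(np.diag(M))
    return np.stack(cols, axis=1)

def gps_like_layer(A, H, W_local, Wq, Wk, Wv):
    """Local message passing and global attention in parallel, then summed (sketch)."""
    local = A @ H @ W_local                         # messages along real edges only
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    global_ = softmax(Q @ K.T / np.sqrt(K.shape[1])) @ V   # every node attends to every node
    return np.maximum(0.0, H + local + global_)     # residual sum followed by ReLU

# 4-cycle 0 - 1 - 2 - 3 - 0
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
rng = np.random.default_rng(0)
rwse = random_walk_encoding(A, steps=3)
H = np.concatenate([rng.normal(size=(4, 2)), rwse], axis=1)  # features plus encoding
W_local, Wq, Wk, Wv = (0.1 * rng.normal(size=(5, 5)) for _ in range(4))

H_next = gps_like_layer(A, H, W_local, Wq, Wk, Wv)
print(rwse)
print(H_next.shape)
```

On the 4-cycle a walker can only return to its start after an even number of steps, so the encoding is 0, 0.5, 0 for every node; on less symmetric graphs the encoding differs across nodes and gives attention a structural signal.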

These architectures address the expressivity bottleneck of standard MPNNs, which cannot distinguish any pair of graphs that the 1-WL isomorphism test fails to distinguish, by incorporating global structural information directly into the attention mechanism.¹²,¹³
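The 1-WL limitation can be seen on a classic pair of graphs. The sketch below implements plain color refinement in pure Python, refining several graphs jointly so that their colors are comparable; the function name and the example graphs are choices made for this illustration.

```python
from collections import Counter

def wl_histograms(graphs, rounds=3):
    """Run 1-WL color refinement jointly on several graphs (adjacency lists).

    Returns one color histogram per graph; equal histograms mean 1-WL
    cannot tell the graphs apart.
    """
    colors = [[0] * len(g) for g in graphs]          # uniform initial coloring
    for _ in range(rounds):
        # A node's signature is its own color plus the multiset of neighbor colors.
        sigs = [[(c[v], tuple(sorted(c[u] for u in g[v]))) for v in range(len(g))]
                for g, c in zip(graphs, colors)]
        # One shared palette keeps colors comparable across graphs.
        palette = {s: i for i, s in enumerate(sorted({s for gs in sigs for s in gs}))}
        colors = [[palette[s] for s in gs] for gs in sigs]
    return [Counter(c) for c in colors]

two_triangles = [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]
hexagon = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
path = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]

hists = wl_histograms([two_triangles, hexagon, path])
print(hists[0] == hists[1])   # two triangles vs. hexagon
print(hists[0] == hists[2])   # two triangles vs. path
```

Two disjoint triangles and a six-node cycle are not isomorphic, yet every node in both has degree 2, so color refinement never separates them. Shortest-path distances do differ between the two graphs, which is one way to see why distance-based attention biases can add expressive power.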

Applications

Protein Structure Prediction

AlphaFold2, developed by DeepMind, predicts 3D protein structures from amino acid sequences with atomic-level accuracy.¹⁰ Its structure module treats residues as rigid frames in 3D space and updates them with invariant point attention, an attention mechanism whose output does not depend on the global rotation or translation of the protein, consistent with the SE(3) symmetry of 3D space.¹⁰ AlphaFold2 won CASP14 in 2020 with a median GDT score of 92.4. Its structure database has since been expanded to over 200 million predicted proteins.¹⁰

Molecular Property Prediction

Molecules are naturally represented as graphs with atoms as nodes and bonds as edges.⁹ GNNs predict properties such as toxicity, solubility, and binding affinity directly from molecular graphs without hand-crafted descriptors.⁹ SchNet and DimeNet extend this to 3D molecular geometry: SchNet uses continuous-filter convolutions driven by interatomic distances, and DimeNet adds directional message passing that also incorporates the angles between bonds.⁴

Particle Physics

Data from the Large Hadron Collider — jets of particles, detector hits, interaction patterns — is structured as point clouds and graphs. GNNs have been applied to jet classification, particle tracking, and event reconstruction at CERN.²

3D Shape Analysis

MeshCNN applies convolutions on mesh edges. PointNet processes raw point clouds with architectures invariant to point permutations.³ These models perform shape classification, segmentation, and correspondence on 3D geometry data.³

Recommender Systems

PinSage, deployed at Pinterest, learns item embeddings for recommendation with graph convolutions on a bipartite graph of pins and boards, using sampled neighborhoods to scale to a graph with billions of nodes.²

Weather Forecasting

GraphCast, developed by DeepMind and published in Science in 2023, models the Earth's atmosphere as a multi-mesh graph and uses a GNN-based encoder-processor-decoder architecture to predict hundreds of weather variables at 0.25° resolution globally.¹⁴ Trained on 39 years of ERA5 reanalysis data, GraphCast produces 10-day forecasts in under one minute on a single TPU, compared to hours of supercomputer time for traditional numerical weather prediction.¹⁴ It outperformed ECMWF's HRES operational system on 90% of 1,380 verification targets, including better prediction of tropical cyclone tracks, atmospheric rivers, and extreme temperature events.¹⁴

Materials Discovery

Graph Networks for Materials Exploration (GNoME), published in Nature in 2023 by Merchant et al. at DeepMind, used GNNs to predict the stability of inorganic crystal structures.¹⁵ GNoME represented crystals as graphs with atoms as nodes and bonds as edges, then iteratively trained on available DFT-computed data and used the model to filter candidates for further computation.¹⁵ The system discovered 2.2 million new crystal structures, of which 380,000 were predicted to be thermodynamically stable — an order of magnitude increase over all previously known stable crystals in the Materials Project database.¹⁵

Climate Science

Spherical CNNs process atmospheric and climate data on the Earth's surface using SO(3)-equivariant convolutions, avoiding distortions from map projections.⁴

Open Problems

In deep GNNs, the over-smoothing problem causes node representations to converge to indistinguishable states as layers increase, limiting effective depth.⁵ The expressivity of message passing networks is bounded by the Weisfeiler-Lehman hierarchy: standard MPNNs cannot distinguish non-isomorphic graphs that the 1-WL test fails to distinguish.⁹ Graph Transformers partially address this by incorporating global attention, though at higher computational cost.¹²,¹³ Scalability to graphs with billions of nodes requires subgraph sampling and distributed computation.⁵ The theoretical relationship between equivariance and generalization is an active area of research; early results suggest that equivariant networks can be provably more sample-efficient.⁴
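Over-smoothing can be reproduced in an idealized setting. The sketch below assumes purely linear propagation with mean aggregation over neighbors and self (no learned weights, no nonlinearity) on a small path graph, and tracks how far apart the node features remain after each step.

```python
import numpy as np

n = 6
A = np.zeros((n, n))
for i in range(n - 1):                       # path graph 0 - 1 - ... - 5
    A[i, i + 1] = A[i + 1, i] = 1.0

A_hat = A + np.eye(n)                        # self-loops
P = A_hat / A_hat.sum(axis=1, keepdims=True) # mean aggregation (row-stochastic)

X = np.random.default_rng(0).normal(size=(n, 4))
spread = []
for layer in range(200):
    X = P @ X                                # one round of neighborhood averaging
    spread.append(X.std(axis=0).max())       # largest per-feature spread across nodes

print(spread[0], spread[9], spread[-1])
```

The spread shrinks geometrically because repeated averaging on a connected graph drives every node toward the same weighted mean. Trained GNNs include weights and nonlinearities, so this shows the tendency rather than the exact behavior of any particular model.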

References

[1] Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs - Monti et al., CVPR 2017. Accessed Mar 10, 2026.
[2] Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges - Bronstein, Bruna, Cohen, Velickovic, 2021. Accessed Mar 10, 2026.
[3] Geometric Deep Learning - Official Website (Bronstein, Bruna, Cohen, Velickovic). Accessed Mar 10, 2026.
[4] Geometric Deep Learning and Equivariant Neural Networks - Gerken et al., Artificial Intelligence Review, 2023. Accessed Mar 10, 2026.
[5] Graph Convolutional Networks - Thomas Kipf (Blog), 2016. Accessed Mar 10, 2026.
[6] Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering - Defferrard, Bresson, Vandergheynst, NeurIPS 2016. Accessed Mar 10, 2026.
[7] Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs - Monti et al., CVPR 2017 (PDF). Accessed Mar 10, 2026.
[8] Group Equivariant Convolutional Networks - Cohen and Welling, ICML 2016. Accessed Mar 10, 2026.
[9] Graph Neural Network - Wikipedia. Accessed Mar 10, 2026.
[10] Highly Accurate Protein Structure Prediction with AlphaFold - Jumper et al., Nature, 2021. Accessed Mar 10, 2026.
[11] Semi-Supervised Classification with Graph Convolutional Networks - Kipf and Welling, ICLR 2017. Accessed Mar 10, 2026.
[12] Do Transformers Really Perform Badly for Graph Representation? (Graphormer) - Ying et al., NeurIPS 2021. Accessed Mar 10, 2026.
[13] Recipe for a General, Powerful, Scalable Graph Transformer - Rampasek et al., NeurIPS 2022. Accessed Mar 10, 2026.
[14] Learning Skillful Medium-Range Global Weather Forecasting (GraphCast) - Lam et al., Science, 2023. Accessed Mar 10, 2026.
[15] Scaling Deep Learning for Materials Discovery (GNoME) - Merchant et al., Nature, 2023. Accessed Mar 10, 2026.