GNNMoE combines Graph Neural Networks and Graph Transformers, creating a universal node classification model that adapts to various graph types while addressing over-smoothing and inefficiency issues.
-----
https://arxiv.org/abs/2412.08193
Original Problem 🤔:
Graph Neural Networks struggle with heterophilous data and long-range dependencies, while Graph Transformers face scalability and noise challenges on large-scale graphs.
-----
Solution in this Paper 💡:
→ GNNMoE introduces a universal model architecture for node classification.
→ It combines fine-grained message-passing operations with a mixture-of-experts mechanism.
→ The architecture incorporates soft and hard gating layers to assign suitable expert networks to each node.
→ Adaptive residual connections and an enhanced Feed-Forward Network (FFN) module improve the expressiveness of node representations.
→ The model uses stackable PT-Blocks and an enhanced FFN to process node features and adjacency information (an assembly sketch follows this list).
→ GNNMoE employs four message-passing experts (PP, PT, TP, TT) to handle graphs with different characteristics (see the PT-Block sketch below).
→ The enhanced FFN module uses three activation-function experts (SwishGLU, GEGLU, REGLU) for further feature encoding (also sketched below).
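A minimal PyTorch sketch of the PT-Block idea: four message-passing experts combined per node by a soft gating layer, with an adaptive residual connection. The concrete operators (P as normalized-adjacency propagation, T as a learned linear transformation), the gating network, and the residual weighting are illustrative assumptions rather than the paper's exact design; a hard gate would instead route each node to its single top-scoring expert.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PTBlock(nn.Module):
    """One expert-mixed message-passing block (illustrative sketch)."""
    def __init__(self, dim: int):
        super().__init__()
        self.transform = nn.Linear(dim, dim)           # T: learned feature transformation
        self.gate = nn.Linear(dim, 4)                  # soft gate over the 4 experts
        self.alpha = nn.Parameter(torch.tensor(0.5))   # adaptive residual weight (assumption)

    def propagate(self, x, adj):
        # P: one propagation step with a normalized sparse adjacency matrix
        return torch.sparse.mm(adj, x)

    def forward(self, x, adj):
        # Four fine-grained orderings of propagation (P) and transformation (T)
        pp = self.propagate(self.propagate(x, adj), adj)
        pt = self.transform(self.propagate(x, adj))
        tp = self.propagate(self.transform(x), adj)
        tt = self.transform(self.transform(x))
        experts = torch.stack([pp, pt, tp, tt], dim=1)          # [N, 4, dim]

        # Soft gating: per-node mixture weights over the experts
        # (a hard gate would pick the argmax expert instead)
        weights = F.softmax(self.gate(x), dim=-1)               # [N, 4]
        mixed = (weights.unsqueeze(-1) * experts).sum(dim=1)    # [N, dim]

        # Adaptive residual keeps the input signal, helping resist over-smoothing
        return self.alpha * x + (1.0 - self.alpha) * mixed
```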
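Continuing the same sketch, the enhanced FFN can be read as three GLU-variant experts whose outputs are mixed by another soft gate, again with a residual connection. The hidden width and the per-node gating granularity are assumptions for illustration.

```python
class GLUExpert(nn.Module):
    """GLU variant: activation(gating(x)) elementwise-scales value(x)."""
    def __init__(self, dim: int, hidden: int, activation):
        super().__init__()
        self.value = nn.Linear(dim, hidden)
        self.gating = nn.Linear(dim, hidden)
        self.out = nn.Linear(hidden, dim)
        self.activation = activation

    def forward(self, x):
        return self.out(self.activation(self.gating(x)) * self.value(x))

class EnhancedFFN(nn.Module):
    """FFN whose activation is mixed per node from three GLU-variant experts."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.experts = nn.ModuleList([
            GLUExpert(dim, hidden, F.silu),   # SwishGLU (Swish/SiLU gate)
            GLUExpert(dim, hidden, F.gelu),   # GEGLU (GELU gate)
            GLUExpert(dim, hidden, F.relu),   # REGLU (ReLU gate)
        ])
        self.gate = nn.Linear(dim, len(self.experts))

    def forward(self, x):
        weights = F.softmax(self.gate(x), dim=-1)                  # [N, 3]
        outs = torch.stack([e(x) for e in self.experts], dim=1)    # [N, 3, dim]
        return x + (weights.unsqueeze(-1) * outs).sum(dim=1)       # residual + mixture
```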
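Finally, a small assembly sketch showing how the stackable PT-Blocks and the enhanced FFN could compose into a node classifier; the block count and classification head are illustrative choices, not the paper's configuration.

```python
class GNNMoEClassifier(nn.Module):
    """Illustrative assembly: stacked PT-Blocks followed by the enhanced FFN."""
    def __init__(self, dim: int, hidden: int, num_classes: int, num_blocks: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(PTBlock(dim) for _ in range(num_blocks))
        self.ffn = EnhancedFFN(dim, hidden)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x, adj):
        for block in self.blocks:
            x = block(x, adj)        # repeated expert-mixed message passing
        x = self.ffn(x)              # further feature encoding
        return self.head(x)          # per-node class logits
```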
-----
Key Insights from this Paper:
→ Decoupling message-passing allows flexible combinations of operations for different graph types.
→ Adaptive architecture search mechanism enhances expressiveness and adaptability.
→ Combining GNN and GT strengths creates a more versatile and efficient model.
-----
Results 📊:
→ Outperforms existing methods across homophilous and heterophilous datasets.
→ Maintains consistent performance while stacking PT-Block message-passing modules.
→ Requires 2-7× less training time than spatial-domain GNN and GT-based methods.
→ Converges in fewer epochs than traditional GNN methods.