Granite 4.0 Mamba-2 Interpretability Tool

Visualize hidden attention in IBM Granite 4.0's hybrid Mamba-2/Transformer architecture.

Mamba-2 layers don't have explicit attention matrices — this tool extracts implicit "hidden attention" using the formula from Ali et al. (2024), making the black-box SSM layers interpretable alongside standard Transformer attention.
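
As a concrete illustration, here is a minimal PyTorch sketch of that extraction for a single Mamba-2 head, following the hidden-attention formulation of Ali et al. (2024): because Mamba-2's discretized state matrix is a scalar per step, the implicit attention weight between positions i and j reduces to the dot product of C_i and B_j times the product of the intermediate decays. The tensor names `C`, `B`, and `a` are assumptions standing in for values captured from a layer's forward pass (e.g. via hooks), with the Δ discretization assumed to be already folded into `B`.

```python
import torch

def hidden_attention(C, B, a):
    """Materialize the implicit "hidden attention" of one Mamba-2 head.

    For a selective SSM whose per-step decay is a scalar a_t (Mamba-2's
    A is scalar-times-identity), the sequence map is y = M x with

        M[i, j] = (C_i . B_j) * prod_{k=j+1..i} a_k   for j <= i.

    C, B: (seq_len, state_size) tensors captured from the forward pass.
    a:    (seq_len,) per-step decay values in (0, 1].
    """
    log_a = torch.log(a.clamp_min(1e-12))      # stable log of decays
    cum = torch.cumsum(log_a, dim=0)           # cum[i] = sum_{k<=i} log a_k
    # decay[i, j] = exp(sum_{k=j+1..i} log a_k) = exp(cum[i] - cum[j])
    decay = torch.exp(cum[:, None] - cum[None, :])
    scores = (C @ B.T) * decay                 # dot products modulated by decay
    return torch.tril(scores)                  # causal mask: zero out j > i
```

Each row of the returned matrix can be plotted exactly like a row of Transformer attention, which is what makes the two layer types comparable side by side.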

Architecture: 32 layers, 28 Mamba-2 blocks (rendered with the magma colormap) and 4 Transformer attention blocks (viridis)
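
That split could be encoded as a simple per-layer type list driving the colormap choice. The interleaving below is hypothetical, since the positions of the four attention layers within the stack aren't stated here:

```python
# Hypothetical layout: the model has 28 Mamba-2 + 4 Transformer layers,
# but exactly where the attention layers sit in the 32-layer stack is
# model-specific and not specified above.
layer_types = ["mamba2"] * 28 + ["attention"] * 4

# Mamba-2 layers are rendered with the magma colormap, Transformer
# attention layers with viridis.
cmaps = ["magma" if t == "mamba2" else "viridis" for t in layer_types]
```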

Controls

- Example Prompts
- View Mode
- Layer: slider over layers 0–31
- Head Aggregation (sketched below)
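
Where a layer has multiple heads, the Head Aggregation control presumably collapses the per-head maps into one matrix before plotting. A minimal sketch, with `mean` and `max` as assumed aggregation modes (the UI label above only names the control):

```python
import torch

def aggregate_heads(attn: torch.Tensor, mode: str = "mean") -> torch.Tensor:
    """Collapse (num_heads, seq_len, seq_len) per-head maps into one matrix.

    The available modes here are assumptions; the tool's UI only tells us
    that heads are aggregated somehow.
    """
    if mode == "mean":
        return attn.mean(dim=0)   # average attention across heads
    if mode == "max":
        return attn.amax(dim=0)   # strongest per-position link across heads
    raise ValueError(f"unknown aggregation mode: {mode}")
```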