👀 BONUS MODULE: Attention Visualizer — See How the Model “Thinks”

Estimated duration of this module: 1 - 1.5 hours
Objective: Visualize Transformer attention matrices to understand which words the model “looked at” when generating an output. We’ll use libraries like bertviz and matplotlib to create interactive and static visualizations.
Requirements: Completed Module 5 (using Hugging Face). Module 6 is recommended but not required.


Bonus Lesson 1 — Why Visualize Attention? A Window into the Model’s “Reasoning”

Transformers are not black boxes. Their decisions are based on attention weights — numbers indicating how much one word “looked at” another.

Visualizing these weights allows you to:

  • Understand errors: Why did the model answer incorrectly? Maybe it looked at the wrong words.
  • Validate logic: Is it reasoning like a human? Or taking spurious shortcuts?
  • Teach and explain: Show others how the model works (ideal for presentations, classes, or reports).
  • Debug models: In advanced projects, see which layers or heads are “asleep” or biased.

🔹 Useful analogy:

It’s like seeing the eye-tracking heatmap of a reader while reading a text.
Where do they pause most? Which words do they relate? What do they ignore?
That’s exactly what we’ll do with the model.


Bonus Lesson 2 — Installing bertviz: Your Magnifying Glass for Attention

bertviz is an open-source library created by researcher Jesse Vig, designed specifically to visualize attention in Transformer models.

Install it with:

pip install bertviz

Note: If using Google Colab, run !pip install bertviz in a cell.
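
A quick sanity check that the installation worked (run it in a notebook cell):

# If these imports succeed, the installation worked
from bertviz import head_view, model_view
print("bertviz is ready")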


Bonus Lesson 3 — Visualization 1: Head View — See Attention by Head and Layer

The most popular view. For a chosen layer, it shows each head’s attention pattern as lines connecting the tokens in the sentence.

Let’s visualize attention in a simple sentence.

from transformers import AutoTokenizer, AutoModel
from bertviz import head_view

# Load model and tokenizer (we’ll use BERT base)
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)  # IMPORTANT: without this, the model will not return attention weights

# Example text
text = "The cat sat on the mat because it was tired."

# Tokenize
inputs = tokenizer(text, return_tensors="pt")

# Pass through model
outputs = model(**inputs)

# Get attention matrices
# Shape: (batch_size, num_heads, seq_len, seq_len)
attentions = outputs.attentions  # tuple of tensors, one per layer

# Visualize with head_view
head_view(attentions, tokenizer.convert_ids_to_tokens(inputs.input_ids[0]))

🔹 What you’ll see:

  • An interactive interface (if in Jupyter/Colab).
  • A drop-down to choose the layer and colored tiles to toggle individual heads.
  • Lines connecting each word to the words it attends to.
  • Thicker, more opaque lines = higher attention; each head has its own color.

🔹 Example of what to expect:

  • In lower layers: local attention (nearby words).
  • In higher layers: global attention (distant but related words).
  • In several heads, the word “it” attends strongly to “cat.”
  • “because” often attends to “tired” and “sat.”
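
bertviz covers the interactive side; for a static figure (handy for reports or slides) you can plot one layer/head yourself with matplotlib, as the module intro mentions. A minimal sketch, reusing the tokenizer, inputs, and attentions variables from the code above; picking layer 5, head 0 is an arbitrary choice:

import matplotlib.pyplot as plt

# One attention matrix: layer 5, head 0 (arbitrary choice), shape (seq_len, seq_len)
tokens = tokenizer.convert_ids_to_tokens(inputs.input_ids[0])
attn = attentions[5][0, 0].detach().numpy()

fig, ax = plt.subplots(figsize=(6, 6))
im = ax.imshow(attn, cmap="Reds")  # darker red = higher attention
ax.set_xticks(range(len(tokens)))
ax.set_yticks(range(len(tokens)))
ax.set_xticklabels(tokens, rotation=90)
ax.set_yticklabels(tokens)
ax.set_xlabel("Attended-to token (Key)")
ax.set_ylabel("Attending token (Query)")
fig.colorbar(im, label="Attention weight")
ax.set_title("BERT attention: layer 5, head 0")
plt.tight_layout()
plt.show()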

Bonus Lesson 4 — Visualization 2: Model View — See All Layers and Heads at Once

Ideal for getting an overall view of attention flow through the entire model.

from bertviz import model_view

model_view(attentions, tokenizer.convert_ids_to_tokens(inputs.input_ids[0]))

🔹 What you’ll see:

  • All layers stacked vertically.
  • All heads of each layer horizontally.
  • A bird’s-eye view of the “information journey” through the model.

Bonus Lesson 5 — Visualization 3: Neuron View — See Attention at the Neuron Level (Advanced)

Shows how attention is calculated for a specific word pair, breaking down the dot product between Query and Key.

from bertviz.transformers_neuron_view import BertModel, BertTokenizer
from bertviz.neuron_view import show

# The neuron view needs bertviz's own model and tokenizer classes
neuron_model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
neuron_tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=True)
# Inspect the Query-Key breakdown at layer 5, head 0
show(neuron_model, "bert", neuron_tokenizer, text, display_mode="dark", layer=5, head=0)

Note: The neuron view requires these special model and tokenizer classes bundled with bertviz; the attentions tuple alone is not enough.


Bonus Lesson 6 — Practical Application: Visualize Attention in Our QA System

Let’s apply bertviz to the QA model from Module 6 (the code below loads it from scratch, so you can follow along even if you skipped that module).

from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from bertviz import head_view

# Load QA model (with output_attentions=True!)
model_name = "deepset/roberta-base-squad2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name, output_attentions=True)

# Context and question
context = "Barcelona is famous for Gaudí's architecture like the Sagrada Familia."
question = "Who is the famous architect in Barcelona?"

# Prepare input (question + context)
inputs = tokenizer(question, context, return_tensors="pt")

# Pass through model
outputs = model(**inputs)

# Visualize attentions
head_view(outputs.attentions, tokenizer.convert_ids_to_tokens(inputs.input_ids[0]))

🔹 Observe:

  • Which words in the question attend to “Gaudí”?
  • Does “architect” attend to “Gaudí”?
  • How does information flow between question and context?

This gives you powerful intuition into how the model “reasons” to extract the answer.
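
To connect the attention plot with what the model actually predicts, you can also decode the answer span from the QA head’s logits. A small sketch, reusing the outputs and inputs variables from the code above:

import torch

# The QA head scores every token as a possible start and end of the answer
start_idx = torch.argmax(outputs.start_logits).item()
end_idx = torch.argmax(outputs.end_logits).item()

# Decode the tokens between the predicted start and end positions
answer_ids = inputs.input_ids[0][start_idx : end_idx + 1]
print(tokenizer.decode(answer_ids, skip_special_tokens=True))  # likely something like "Gaudí"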


Bonus Lesson 7 — Interpretation: What Do Attention Patterns Mean?

Not all heads do the same thing. Common patterns:

🔹 Local Attention:
Words attending to neighbors. Common in lower layers.

e.g., “the cat” → “the” attends to “cat.”

🔹 Syntactic Attention:
Subject → verb, verb → object.

e.g., “cat sat” → “sat” attends to “cat.”

🔹 Anaphoric Attention:
Pronouns → antecedent nouns.

e.g., “it” → “cat.”

🔹 Semantic Attention:
Words related by meaning.

e.g., “tired” → “sleep,” “exhausted.”

🔹 Separator Attention:
Words attending heavily to [SEP] or [CLS]. Often acts as a “do nothing” default when a head finds nothing relevant, though attention to [CLS] can also reflect sentence-level aggregation used for classification.
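
You can also check these patterns numerically instead of only by eye. A minimal sketch that scans every layer and head for the strongest anaphoric link from “it” to “cat”, reusing the BERT tokenizer, inputs, and attentions from Bonus Lesson 3 (re-run that cell first if you have since executed the QA example, because those variable names were reused there):

# Find the (layer, head) where "it" attends most strongly to "cat"
tokens = tokenizer.convert_ids_to_tokens(inputs.input_ids[0])
src, tgt = tokens.index("it"), tokens.index("cat")

best_layer, best_head, best_weight = max(
    ((layer, head, layer_attn[0, head, src, tgt].item())
     for layer, layer_attn in enumerate(attentions)
     for head in range(layer_attn.shape[1])),
    key=lambda t: t[2],
)
print(f"Strongest 'it' -> 'cat' link: layer {best_layer}, head {best_head}, weight {best_weight:.3f}")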


Bonus Lesson 8 — Limitations of Attention Visualizations

⚠️ Important: Attention is not always interpretable.

  • High attention weights don’t always imply causality.
  • The model may use multiple heads together (not just one).
  • Attention is only part of the final calculation (FFN comes after).

🔹 Advice: Use visualizations as hypotheses, not absolute truths. They are an exploration tool, not a certainty.


✍️ Bonus Reflection Exercise 1

Take an ambiguous sentence:
“I saw the man with the telescope.”
Who has the telescope? Me or the man?

Visualize attention in a BERT model.
What word does “with” attend to? “I” or “man”?
Which layer and head show a clearer pattern?
Does it match your intuition?
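
A possible starting point, reusing the BERT model and tokenizer loaded in Bonus Lesson 3 (make sure those variables still point to BERT and not to the QA model from Bonus Lesson 6):

# Tokenize the ambiguous sentence and visualize its attention
sentence = "I saw the man with the telescope."
inputs = tokenizer(sentence, return_tensors="pt")
outputs = model(**inputs)
head_view(outputs.attentions, tokenizer.convert_ids_to_tokens(inputs.input_ids[0]))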


📊 Conceptual Diagram Bonus 1 — Visualization Flow (described)

Text → Tokenizer → input_ids → Model (with output_attentions=True) → attentions (tuple of tensors)
                                              ↓
                        bertviz.head_view() → interactive attention visualization
                                              ↓
                        Interpretation: which words does the model relate, and in which layers/heads?

🧠 Bonus Module Conclusion

You’ve learned to open the “black box” of Transformers.
You no longer just use models — you understand them.
You know how they look at text, which words they relate, and how they build understanding layer by layer, head by head.

This skill is invaluable:

  • For research.
  • For debugging models.
  • For explainability to users or regulators.
  • And above all, for your own curiosity and mastery of the subject.

Congratulations! You’ve completed the course “Transformers Without Mystery” with a depth few achieve.

