Estimated duration of this module: 1.5 - 2 hours
Objective: Enable the student to independently load any Transformer model from Hugging Face, understand its inputs and outputs, and use it for common tasks like classification, generation, or question-answering — without fine-tuning.
Hugging Face is not just a library. It’s a complete ecosystem for language models.
🔹 What it offers:
- transformers: the main library for loading, using, and training models.
- datasets: access to thousands of ready-to-use datasets.
- tokenizers: tools to convert text into tokens (the input models understand).
- evaluate: standardized metrics for evaluating models.
🔹 Useful analogy:
Hugging Face is like the “App Store for AI models.”
Need a model to summarize text? There are dozens.
One to detect emotions? Hundreds.
One in Spanish? Also available.
One small enough to run on your laptop? Absolutely!
Just search, install, and use.
Before we begin, we need to install the libraries. We’ll do it in a clean environment (recommended: Google Colab or a local virtual environment).
pip install torch transformers datasets
Note:
torch (PyTorch) is the deep learning framework Hugging Face uses by default. You can also use TensorFlow, but PyTorch is more common in the community.
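To confirm the installation worked, here is a quick sanity check (a minimal sketch; the exact version numbers printed will depend on what pip installed):
import torch
import transformers

# Print the installed versions to confirm the environment is ready
print("PyTorch:", torch.__version__)
print("Transformers:", transformers.__version__)

# Check whether a GPU is available; everything in this module also works on CPU
print("CUDA available:", torch.cuda.is_available())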
Let’s start simple: a model that reads text and tells you if it’s positive or negative.
We’ll use distilbert-base-uncased-finetuned-sst-2-english, a small, fast, pretrained model for this task.
from transformers import pipeline
# Create a text classification pipeline with the model named above
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
# Test with a phrase
result = classifier("I love this course! It's amazing and very clear.")
print(result)
# Output: [{'label': 'POSITIVE', 'score': 0.9998}]
That’s it! In 3 lines, you have a working AI model.
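The same pipeline object also accepts a list of texts and returns one result per item, which is handy for scoring several phrases at once (the second sentence below is made up for illustration):
results = classifier([
    "I love this course! It's amazing and very clear.",
    "This explanation is confusing and way too long.",
])
for r in results:
    print(r["label"], round(r["score"], 4))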
🔹 What did pipeline do?
Behind the scenes, it downloaded the model and its tokenizer, converted your text into tokens, ran the model, and turned the raw output into a readable label with a score.
Hugging Face offers pipelines for common tasks:
- "sentiment-analysis" → sentiment classification
- "text-generation" → text generation (GPT-2, etc.)
- "question-answering" → question answering (BERT, etc.)
- "translation" → translation
- "summarization" → summarization
- "ner" → named entity recognition (people, places, etc.)
Example: text generation with GPT-2
generator = pipeline("text-generation", model="gpt2")
text = "The future of artificial intelligence is"
result = generator(text, max_length=50, num_return_sequences=1)
print(result[0]['generated_text'])
Possible output:
“The future of artificial intelligence is one of the most exciting areas of research today. It has the potential to revolutionize...”
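The other tasks in the list work the same way. For instance, a question-answering pipeline takes a question plus a context passage and extracts the answer from that passage (the question and context below are just examples; if you don't pass a model, the pipeline downloads a default one fine-tuned for extractive QA):
from transformers import pipeline

qa = pipeline("question-answering")

result = qa(
    question="What does Hugging Face offer?",
    context="Hugging Face offers pretrained Transformer models, datasets, and tokenizers.",
)
print(result)
# Prints a dict with 'answer', 'score', and the 'start'/'end' character positions in the context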
Pipelines are magical… but to truly understand what’s happening, we need to see the manual process.
Let’s take the same classifier, step by step.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# 1. Load tokenizer and model
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# 2. Tokenize the text
text = "This movie is absolutely wonderful!"
inputs = tokenizer(text, return_tensors="pt") # "pt" = PyTorch tensors
print(inputs)
Example output:
{
'input_ids': tensor([[ 101, 2023, 3042, 2003, 2675, 12712, 1029, 102]]),
'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1]])
}
🔹 Explanation:
- input_ids: each number is a token. 101 = [CLS], 102 = [SEP], the rest are words.
- attention_mask: indicates which tokens are real (1) and which are padding (0). Here, all are real.
# 3. Pass inputs to the model
outputs = model(**inputs)
# 4. Get logits (unnormalized outputs)
logits = outputs.logits
print("Logits:", logits) # e.g., tensor([[-4.5, 3.2]])
# 5. Apply softmax to get probabilities
probabilities = torch.softmax(logits, dim=-1)
print("Probabilities:", probabilities) # e.g., tensor([[0.001, 0.999]])
# 6. Get predicted label
predicted_class = torch.argmax(probabilities, dim=-1).item()
labels = ["NEGATIVE", "POSITIVE"]
print("Predicted label:", labels[predicted_class]) # → POSITIVE
🔹 Why do it manually? To understand how it works, debug errors, or customize the process (e.g., change the decision threshold).
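For instance, changing the decision threshold only requires a condition on the probabilities computed above (the 0.90 cutoff here is an arbitrary value chosen for illustration):
# Only accept the POSITIVE label when the model is very confident;
# otherwise fall back to an "UNSURE" answer.
positive_prob = probabilities[0][1].item()
threshold = 0.90  # arbitrary value for illustration

if positive_prob >= threshold:
    decision = "POSITIVE"
elif positive_prob <= 1 - threshold:
    decision = "NEGATIVE"
else:
    decision = "UNSURE"

print(f"P(positive) = {positive_prob:.3f} -> {decision}")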
Not all models are suitable for everything. Here’s a quick guide:
| Task | Model Type | Example Models on Hugging Face |
|---|---|---|
| Text Classification | Encoder-only (BERT-style) | bert-base-uncased, distilbert-base-uncased, nlptown/bert-base-multilingual-uncased-sentiment |
| Text Generation | Decoder-only (GPT-style) | gpt2, facebook/opt-350m, mistralai/Mistral-7B-v0.1 (requires more RAM) |
| Question Answering (extractive) | Encoder-only | bert-large-uncased-whole-word-masking-finetuned-squad, deepset/roberta-base-squad2 |
| Translation / Summarization | Encoder-Decoder | t5-small, facebook/bart-large-cnn, Helsinki-NLP/opus-mt-en-es |
| Named Entity Recognition (NER) | Encoder-only | dslim/bert-base-NER, Jean-Baptiste/roberta-large-ner-english |
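To try one row of the table, here is a translation pipeline using one of the encoder-decoder models listed above (the input sentence is just an example; the model weights are downloaded the first time you run it):
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")

result = translator("Hugging Face makes Transformer models easy to use.")
print(result[0]["translation_text"])  # prints the Spanish translation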
🔍 Tip: On the Hugging Face Model Hub, you can filter by:
- Task
- Language
- Model size
- License
- Framework (PyTorch, TensorFlow)
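The same filters can also be used programmatically through the huggingface_hub library, which transformers already depends on (a minimal sketch; the search term and sort order are just examples):
from huggingface_hub import list_models

# List the five most-downloaded models matching a search term
for model in list_models(search="sentiment", sort="downloads", limit=5):
    print(model.id)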
Yes, there are many models in Spanish!
Example: sentiment classification for Spanish tweets.
classifier_es = pipeline(
    "sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment"
)
result = classifier_es("Este curso es increíble, ¡lo recomiendo mucho!")
print(result)
# Output: [{'label': '5 stars', 'score': 0.8742}]
Another, more specialized model for social-media text is finiteautomata/bertweet-base-sentiment-analysis (note: it is trained on English tweets).
Some common problems you may run into:

Problem: the model is too large for your machine (out-of-memory errors).
Solution: Use a smaller model (distilbert instead of bert-large), or run on CPU (slower, but it works).
model = AutoModelForSequenceClassification.from_pretrained(model_name).to("cpu")
Problem: the input text is longer than the model's maximum length (512 tokens for BERT-style models).
Solution: Truncate the text.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
Problem: the model name is not found when downloading.
Solution: Verify the exact name on the Model Hub. Use autocomplete on the website.
Exercise:
1. Choose a task that interests you (e.g., summarizing news, detecting spam, translating phrases).
2. Go to the Hugging Face Model Hub and find 2-3 candidate models.
3. Compare their metrics, size, language, and license.
4. Which one would you choose? Why?
Text → Tokenizer → input_ids + attention_mask → Model → logits → softmax → label + probability

- Tokenizer: converts text to numbers (token IDs).
- attention_mask: indicates real tokens vs padding.
- Model: the pretrained neural network.
- softmax: converts logits into probabilities.
Hugging Face has removed the barrier to entry for using Transformer models.
You no longer need weeks of setup, expensive GPUs, or deep knowledge of deep learning to get started.
With a few lines of code, you can have an AI model generating text, classifying emotions, or answering questions.
But… this is just the beginning! In the next module, we'll apply everything learned in a guided project: we'll build a question-answering system that answers queries about a given text, using a pretrained model and no fine-tuning.