Image Classifier - V: Model Optimization, Quantization, Deployment, Serving, Visualization, and Debugging Tips in PyTorch

Building Your First Image Classifier with PyTorch: A Step-by-Step Guide Using the MNIST Dataset - V

Content: 

17: Model Optimization and Quantization in PyTorch
18: Model Deployment and Serving in PyTorch
19: Common Mistakes and Debugging Tips
20: Visualizing Model Predictions on MNIST


⚙️ Section 17: Model Optimization and Quantization in PyTorch

Making Deep Learning Models Smaller, Faster, and Deployment-Ready


Deep learning models often deliver impressive accuracy — but that power comes at a cost: large memory size, high latency, and slow inference speed.
When deploying models to mobile devices, IoT systems, or edge hardware, these issues can make real-time AI nearly impossible.

That’s where model optimization and quantization come in — allowing you to make your models lighter, faster, and more efficient without losing much accuracy.

In this section, we’ll explore PyTorch’s built-in tools for model optimization and demonstrate how to apply them effectively.

💡 1️⃣ What is Model Optimization?

Model optimization refers to techniques that make a neural network run efficiently — reducing size, memory footprint, and inference time.

In essence:

“Optimization helps your deep learning model think faster without losing its intelligence.”

Common optimization goals:

  • 💾 Reduce model size

  • ⚡ Improve inference speed

  • 🔋 Lower power consumption

  • 🧠 Maintain accuracy


🔍 2️⃣ Techniques for Model Optimization

PyTorch supports several optimization techniques, including:

  • Pruning: Remove redundant weights and neurons

  • Quantization: Convert 32-bit floating-point weights to smaller formats (e.g., int8)

  • Knowledge Distillation: Train a smaller “student” model to mimic a larger “teacher”

  • Mixed Precision Training: Use half-precision (float16) for faster computation

  • Model Scripting/Tracing: Convert models to TorchScript for deployment optimization

🧠 3️⃣ Understanding Quantization

Quantization is one of the most effective optimization methods.

It reduces model size and improves inference speed by converting high-precision numbers (like float32) to lower-precision integers (like int8).

🧮 Example:

Original weight: 0.152347 (float32 → 4 bytes)
Quantized weight: stored as an 8-bit integer plus a shared scale factor (int8 → 1 byte)

This gives a 4× reduction in memory usage; at inference time, the integer is mapped back to an approximate float value (e.g., ≈ 0.15).
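To make the mapping concrete, here is a tiny sketch of the affine quantization arithmetic; the scale and zero point below are illustrative values, not ones PyTorch would necessarily choose:

```python
import torch

# Affine quantization maps a float range onto int8 [-128, 127] using a
# scale and a zero point (both illustrative values here).
w = torch.tensor([0.152347, -0.5, 0.25])
scale = 0.01       # assumed step size between representable values
zero_point = 0     # symmetric quantization around zero

q = torch.clamp(torch.round(w / scale) + zero_point, -128, 127).to(torch.int8)
w_restored = (q.to(torch.float32) - zero_point) * scale  # dequantize

print(q.tolist())           # [15, -50, 25]
print(w_restored.tolist())  # approximately [0.15, -0.5, 0.25]
```

Each weight now occupies one byte instead of four; the shared scale (and zero point) is all that is needed to recover an approximate float value.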


🔹 Types of Quantization in PyTorch

  • Dynamic Quantization: Quantizes weights post-training (on-the-fly during inference). Typical use case: RNNs, Transformers

  • Static Quantization: Quantizes both weights and activations using calibration data. Typical use case: CNNs

  • Quantization-Aware Training (QAT): Simulates quantization during training for best accuracy. Typical use case: edge deployment

⚙️ 4️⃣ Dynamic Quantization Example

Let’s start with a Dynamic Quantization example using a pre-trained model like LSTM or BERT.

import torch
import torch.nn as nn
import torch.quantization

# Define a simple model
class SimpleLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(SimpleLSTM, self).__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.fc = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        out, _ = self.lstm(x)
        out = self.fc(out[-1])
        return out

model_fp32 = SimpleLSTM(10, 20, 2, 2)

# Apply dynamic quantization
model_quantized = torch.quantization.quantize_dynamic(
    model_fp32, {nn.LSTM, nn.Linear}, dtype=torch.qint8
)

# Compare size on disk (parameter counts alone don't show the savings,
# since quantized weights are stored differently)
import os

torch.save(model_fp32.state_dict(), "fp32.pth")
torch.save(model_quantized.state_dict(), "int8.pth")
print("Original Model Size:", os.path.getsize("fp32.pth"), "bytes")
print("Quantized Model Size:", os.path.getsize("int8.pth"), "bytes")

🧩 Result: The model now consumes significantly less memory while maintaining nearly the same accuracy.


⚙️ 5️⃣ Static Quantization Example

For CNN-based models (like ResNet), Static Quantization provides even better optimization.

It requires calibration data to estimate activation ranges.

from torchvision import models

# Load pretrained model (newer torchvision versions use the weights= argument)
model = models.resnet18(pretrained=True)
model.eval()

# Define quantization configuration ('fbgemm' targets x86 server CPUs)
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')

# Note: eager-mode static quantization also needs QuantStub/DeQuantStub
# around the model and fused Conv+BN+ReLU modules; torchvision ships a
# ready-made quantizable ResNet (torchvision.models.quantization.resnet18).

# Prepare for calibration (inserts observers to record activation ranges)
torch.quantization.prepare(model, inplace=True)

# Run some calibration data (representative real samples work best)
for _ in range(5):
    dummy_input = torch.randn(1, 3, 224, 224)
    model(dummy_input)

# Convert to quantized model
torch.quantization.convert(model, inplace=True)

print("Model successfully quantized!")

Result:

  • Model size reduced by ~70%

  • Inference latency improved by ~2-3x

  • Accuracy drop: typically < 1%


🧠 6️⃣ Quantization-Aware Training (QAT)

In QAT, the model simulates quantization during training, allowing the network to adapt to lower precision.

This yields the best accuracy for deployment.

import torch.quantization

# Define model and configuration
model = models.resnet18(pretrained=False)
model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')

# Prepare for QAT (inserts fake-quantization modules; model must be in train mode)
torch.quantization.prepare_qat(model.train(), inplace=True)

# Train normally (toy loop with random data, for illustration only)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for epoch in range(2):
    inputs = torch.randn(16, 3, 224, 224)
    outputs = model(inputs)
    loss = outputs.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Convert to quantized model
torch.quantization.convert(model.eval(), inplace=True)
print("QAT model ready for deployment!")

✅ Result: QAT provides a balance between accuracy and efficiency.


🧩 7️⃣ Pruning: Removing Redundant Weights

Pruning helps by removing neurons that contribute little to the final prediction.

import torch.nn.utils.prune as prune

# Define a simple model
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 2))

# Randomly prune 40% of the first layer's weight entries
# (use prune.l1_unstructured to drop the smallest-magnitude weights instead)
prune.random_unstructured(model[0], name="weight", amount=0.4)

# Remove pruning mask and finalize the model
prune.remove(model[0], 'weight')

print("Pruning complete! Non-zero weights:", torch.count_nonzero(model[0].weight))

🧠 Result: Model becomes lighter and faster with minimal accuracy loss.


⚙️ 8️⃣ Mixed Precision Training

Using float16 (half precision) instead of float32 can significantly reduce training time, especially on GPUs that support Tensor Cores.

from torch.cuda.amp import autocast, GradScaler

model = models.resnet18().cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scaler = GradScaler()

for epoch in range(5):
    inputs = torch.randn(16, 3, 224, 224).cuda()
    labels = torch.randint(0, 10, (16,)).cuda()

    optimizer.zero_grad()
    with autocast():
        outputs = model(inputs)
        loss = nn.CrossEntropyLoss()(outputs, labels)
    
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    
    print(f"Epoch {epoch+1} | Loss: {loss.item():.4f}")

⚡ Mixed precision typically yields up to a 2× speedup and roughly half the memory usage, with negligible accuracy difference.


🧱 9️⃣ Model Scripting and Tracing (TorchScript)

To deploy models on C++, mobile, or edge devices, convert your model to TorchScript format.

# Example model
model = models.resnet18(pretrained=True).eval()

# Convert model using tracing
example_input = torch.randn(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example_input)

# Save model
traced_model.save("optimized_resnet18.pt")
print("Model saved for mobile or C++ deployment!")

TorchScript optimizes your model by:

  • Removing Python overhead

  • Precomputing operations

  • Enabling deployment on devices without Python
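Once saved, the TorchScript file can be loaded without the original Python class definition. A minimal sketch, using a tiny stand-in model and an illustrative file name:

```python
import torch
import torch.nn as nn

# Tiny stand-in model, traced and reloaded the same way as the
# resnet18 example above (the file name is illustrative).
model = nn.Sequential(nn.Linear(4, 2)).eval()
traced = torch.jit.trace(model, torch.randn(1, 4))
traced.save("tiny_traced.pt")

# In the deployment environment, no Python class definition is needed:
loaded = torch.jit.load("tiny_traced.pt")
with torch.no_grad():
    out = loaded(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])
```

The same `torch.jit.load` call works on any machine with libtorch, which is what makes TorchScript suitable for C++ and mobile targets.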

This content is sponsored by SBO Digital Marketing.


🌍 🔟 Real-World Applications

  • Mobile AI: Face recognition on smartphones (Quantization + TorchScript)

  • IoT Devices: Smart cameras / sensors (Pruning + Static Quantization)

  • Healthcare: Portable medical imaging (Mixed Precision Training)

  • Autonomous Vehicles: Real-time object detection (Quantization-Aware Training)

  • Voice Assistants: On-device NLP (Dynamic Quantization)

💡 Tips for Effective Optimization

✅ Always measure latency and accuracy before and after optimization.
✅ Use PyTorch Profiler (torch.profiler) to identify performance bottlenecks.
✅ Combine multiple techniques — e.g., pruning + quantization.
✅ Fine-tune your quantized model if accuracy drops significantly.
✅ Use TorchScript for deployment on mobile or edge devices.
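As a sketch of the profiler tip above, here is a minimal torch.profiler run on a small stand-in model; the layer and batch sizes are arbitrary:

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

# Stand-in model and batch; sizes are arbitrary.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
x = torch.randn(32, 128)

with profile(activities=[ProfilerActivity.CPU]) as prof:
    with torch.no_grad():
        model(x)

# Print the most expensive operators, sorted by total CPU time
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

On a GPU machine you would add `ProfilerActivity.CUDA` to the activities list; the resulting table shows exactly which operators dominate your inference time.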


Conclusion of Section 17

In this section, you learned how to:

  • Use quantization (dynamic, static, QAT) to shrink model size.

  • Apply pruning and mixed precision training for faster computation.

  • Convert models into TorchScript for real-world deployment.

  • Optimize AI systems for mobile, edge, and embedded platforms.

With these optimization techniques, you can make your PyTorch models production-ready — balancing speed, accuracy, and efficiency.


🌐 Section 18: Model Deployment and Serving in PyTorch

From Jupyter Notebook to Real-World Production Environments


After building, training, and optimizing your model — the next crucial step is deployment.
A model’s true value lies not in accuracy numbers during training, but in how it performs in the real world — serving predictions to users or systems in real time.

In this section, we’ll explore how to deploy PyTorch models to web APIs, mobile devices, and edge environments — turning your trained models into scalable, production-ready services.


🚀 1️⃣ What is Model Deployment?

Model deployment is the process of making a trained model available for inference — where users, applications, or other systems can send input data and receive predictions.

In simple terms:

“Training teaches your model how to think. Deployment lets it speak to the world.”


🔍 2️⃣ Model Deployment Workflow

A typical deployment workflow consists of the following stages:

  1. Model Training: Develop and train a model in PyTorch.

  2. Model Saving: Export the trained model (.pt or .pth file).

  3. Model Optimization: Apply quantization or pruning to improve efficiency.

  4. Model Serving: Expose the model through an API or server.

  5. Monitoring: Track performance, latency, and accuracy over time.


🔹 Deployment Targets

  • Web Server / Cloud API: Serve predictions via REST APIs (Flask/FastAPI)

  • Mobile (Android/iOS): Deploy using TorchScript or PyTorch Mobile

  • Edge Devices: Run models on Raspberry Pi, Jetson Nano, or IoT devices

  • Cloud Services: Use AWS, GCP, or Azure for scalable serving

  • Browser: Use ONNX.js or TorchScript for browser inference

🧠 3️⃣ Saving and Loading PyTorch Models

Before deployment, the first step is saving your trained model.

import torch

# Save trained model
torch.save(model.state_dict(), "mnist_model.pth")

# Load model for inference
model.load_state_dict(torch.load("mnist_model.pth"))
model.eval()

Tip: Always call model.eval() before inference to disable dropout and batch normalization updates.


⚙️ 4️⃣ Deploying with Flask (Web API)

One of the easiest ways to deploy a PyTorch model is by serving it via a Flask API.

Let’s create a simple image classifier API for the MNIST digit recognition model.


Step 1: Install Dependencies

pip install flask torch torchvision pillow

Step 2: Create Flask App (app.py)

from flask import Flask, request, jsonify
from torchvision import transforms
from PIL import Image
import torch
import torch.nn as nn

# Define the same simple fully connected model used during training
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28*28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 28*28)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Load trained model
model = Net()
model.load_state_dict(torch.load("mnist_model.pth", map_location="cpu"))
model.eval()

# Define image transform
transform = transforms.Compose([
    transforms.Grayscale(),
    transforms.Resize((28, 28)),
    transforms.ToTensor()
])

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    file = request.files['image']
    img = Image.open(file).convert('L')
    img = transform(img).unsqueeze(0)

    with torch.no_grad():
        outputs = model(img)
        _, predicted = torch.max(outputs, 1)

    return jsonify({'prediction': int(predicted.item())})

if __name__ == '__main__':
    app.run(debug=True)

Step 3: Test the API

curl -X POST -F "image=@digit.png" http://127.0.0.1:5000/predict

✅ Output:

{"prediction": 7}

Now your model is live and ready to serve predictions through an API!


5️⃣ Scalable Deployment with FastAPI

For faster, production-grade performance, use FastAPI instead of Flask.

from fastapi import FastAPI, File, UploadFile
from torchvision import transforms
from PIL import Image
import torch

app = FastAPI()

@app.post("/predict/")
async def predict(file: UploadFile = File(...)):
    image = Image.open(file.file).convert('L')
    transform = transforms.Compose([transforms.Resize((28, 28)), transforms.ToTensor()])
    img_tensor = transform(image).unsqueeze(0)
    with torch.no_grad():  # model loaded and set to eval() as in the Flask example
        outputs = model(img_tensor)
    _, pred = torch.max(outputs, 1)
    return {"prediction": int(pred.item())}

Run it using:

uvicorn app:app --reload

🌐 Access at: http://127.0.0.1:8000/docs (interactive Swagger UI)


🧩 6️⃣ Deploying Models on Mobile (PyTorch Mobile)

PyTorch supports mobile deployment using TorchScript.

Convert model to TorchScript:

scripted_model = torch.jit.script(model)
scripted_model.save("mnist_mobile.pt")

Now, this model can be used in:

  • Android (Java/Kotlin) via org.pytorch.Module

  • iOS (Swift) via TorchModule

Example (Android Kotlin):

val module = Module.load(assetFilePath(this, "mnist_mobile.pt"))
val inputTensor = TensorImageUtils.bitmapToFloat32Tensor(bitmap)
val outputTensor = module.forward(IValue.from(inputTensor)).toTensor()
val scores = outputTensor.dataAsFloatArray
val predictedDigit = scores.indices.maxByOrNull { scores[it] } ?: -1

✅ Your MNIST model now runs offline on smartphones!



💡 7️⃣ Deploying on Edge Devices (e.g., Raspberry Pi, Jetson Nano)

Deploying on edge hardware helps achieve low latency and offline inference.

Steps:

  1. Optimize model using quantization

  2. Install PyTorch for ARM/Linux

  3. Deploy using TorchScript model

Example command on Raspberry Pi:

python3 infer.py --model optimized_mnist.pt --input test.png

Edge devices benefit from:

  • Static quantization

  • Reduced memory consumption

  • Real-time inference


🌍 8️⃣ Cloud Deployment with TorchServe

For enterprise-grade deployment, use TorchServe — an official PyTorch model serving framework.

Install TorchServe:

pip install torchserve torch-model-archiver

Archive Model:

torch-model-archiver --model-name mnist --version 1.0 \
--model-file model.py --serialized-file mnist_model.pth \
--handler image_classifier

(Here model.py contains the model class definition, needed because mnist_model.pth stores only the state_dict.)

Start TorchServe:

torchserve --start --ncs --model-store model_store --models mnist=mnist.mar

Access predictions via REST API:

curl -X POST http://127.0.0.1:8080/predictions/mnist -T digit.png

✅ Output:

Prediction: 5

TorchServe also provides:

  • Batch inference

  • Logging & metrics

  • Multi-model management


🧠 9️⃣ Model Conversion for Cross-Platform Deployment

Sometimes, you may need to convert PyTorch models for other frameworks:

  • ONNX (targets TensorFlow, Caffe2, OpenVINO): torch.onnx.export()

  • CoreML (targets iOS/macOS): coremltools

  • TensorRT (targets NVIDIA GPUs): torch2trt

Example (PyTorch → ONNX)

dummy_input = torch.randn(1, 1, 28, 28)
torch.onnx.export(model, dummy_input, "mnist_model.onnx", input_names=['input'], output_names=['output'])
print("Model converted to ONNX format!")

📊 🔟 Monitoring and Maintaining Models

After deployment, continuous monitoring ensures the model performs as expected.

Key metrics to track:

  • Latency: Time per prediction

  • Throughput: Predictions per second

  • Accuracy drift: Deviation from training accuracy

  • Hardware utilization: GPU/CPU efficiency

Tools:

  • Prometheus + Grafana for monitoring

  • MLflow for model tracking and versioning

  • Sentry or Datadog for performance alerts
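As a rough sketch of measuring the first two metrics above, here is a simple timing loop around a stand-in model; in production you would wrap the real inference call:

```python
import time

import torch
import torch.nn as nn

# Stand-in model; in production, time the real inference call instead.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10)).eval()
batch = torch.randn(32, 784)

n_runs = 50
with torch.no_grad():
    model(batch)  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(n_runs):
        model(batch)
    elapsed = time.perf_counter() - start

latency_ms = elapsed / n_runs * 1000          # time per prediction batch
throughput = n_runs * batch.shape[0] / elapsed  # predictions per second
print(f"Latency: {latency_ms:.3f} ms/batch | Throughput: {throughput:.0f} samples/s")
```

Record these numbers before and after every optimization step so regressions show up immediately.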


Conclusion of Section 18

In this section, you learned how to:

  • Serve models via Flask or FastAPI APIs

  • Deploy to mobile and edge devices

  • Use TorchServe for scalable cloud deployments

  • Convert PyTorch models to ONNX, CoreML, or TorchScript

  • Monitor performance for reliable real-world usage

Deployment is the bridge between research and impact — it turns your trained model into a real-world AI service used by thousands of users or devices.


🧩 Section 19: Common Mistakes and Debugging Tips

Even though training a neural network on MNIST seems simple, beginners often face subtle bugs that can completely derail their results. Let’s go through the most common mistakes and how to fix them, along with real debugging strategies you can use in PyTorch projects.


🧠 1. Forgetting to Normalize the Data

Problem:
MNIST images have pixel values in the range 0–255, while most neural networks expect inputs roughly between 0 and 1 or −1 and 1. If you skip normalization, the gradients may explode, and the network will take longer to converge.

Fix:
Use transforms.Normalize() when loading the dataset.

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

✅ Always normalize your inputs — it stabilizes training and helps the optimizer converge faster.



⚙️ 2. Forgetting to Call model.train() or model.eval()

Problem:
PyTorch models behave differently during training and evaluation:

  • model.train() enables dropout and batch normalization updates.

  • model.eval() freezes them for inference.

If you forget these, your test results might fluctuate randomly.

Fix:

# During training
model.train()

# During evaluation
model.eval()
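Dropout makes the difference between the two modes easy to see; a small sketch:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 10)

drop.train()
noisy = drop(x)   # random entries zeroed; survivors scaled by 1/(1-p) = 2

drop.eval()
clean = drop(x)   # identity: the input passes through unchanged

print(torch.equal(clean, x))  # True
```

The same toggle also switches batch normalization between updating its running statistics (train) and using them frozen (eval).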

🔢 3. Mixing Up Target Labels

Problem:
Loss functions like nn.CrossEntropyLoss expect integer class labels, not one-hot encoded vectors. Passing a one-hot target causes shape mismatch errors.

Fix:

  • Use integer labels (0–9 for MNIST).

  • Do not apply softmax() before CrossEntropyLoss.

criterion = nn.CrossEntropyLoss()

PyTorch’s CrossEntropyLoss already applies log_softmax internally.
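A quick check of that claim: CrossEntropyLoss on raw logits matches NLLLoss applied after log_softmax (the logits and targets below are arbitrary):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 10)            # raw model outputs, no softmax applied
targets = torch.tensor([3, 7, 0, 9])   # integer class labels, not one-hot

ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(ce, nll))  # True
```

This is why applying softmax() yourself before CrossEntropyLoss is wrong: it would be applied twice.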


🚫 4. Forgetting to Zero the Gradients

Problem:
Gradients in PyTorch accumulate by default. If you don’t clear them between batches, the updates become incorrect.

Fix:

optimizer.zero_grad()
loss.backward()
optimizer.step()

💡 Always call optimizer.zero_grad() before backpropagation in every training iteration.


🧩 5. Too High or Too Low Learning Rate

Problem:

  • Too high: The model oscillates or diverges.

  • Too low: The model learns extremely slowly or gets stuck.

Fix:
Start with a learning rate around 0.001–0.01 for Adam or SGD and monitor the loss.

Example:

optimizer = optim.Adam(model.parameters(), lr=0.001)

You can also use a learning rate scheduler:

scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

📊 6. Incorrect Accuracy Calculation

Problem:
Sometimes people forget to use torch.max() when converting predicted logits to class labels.

Fix:

_, predicted = torch.max(outputs, 1)
correct += (predicted == labels).sum().item()

🕵️ 7. Overfitting to Training Data

Problem:
The model performs extremely well on the training set but poorly on unseen data.

Fixes:

  • Use dropout layers in the model.

  • Apply data augmentation (RandomRotation, RandomAffine).

  • Implement early stopping.

Example:

self.dropout = nn.Dropout(0.5)
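Early stopping can be sketched in plain Python; the val_losses list here is made-up data standing in for per-epoch validation losses, and the checkpoint path is illustrative:

```python
# Plain-Python early stopping: stop when validation loss has not
# improved for `patience` consecutive epochs.
patience = 2
best_loss = float("inf")
epochs_without_improvement = 0
stopped_at = None

val_losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8]  # fake per-epoch validation losses

for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best_loss:
        best_loss = val_loss
        epochs_without_improvement = 0
        # torch.save(model.state_dict(), "best_model.pth")  # checkpoint the best model
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            stopped_at = epoch
            break

print(f"Stopped at epoch {stopped_at}, best validation loss {best_loss}")
```

In a real training loop, val_loss would come from evaluating on the validation set each epoch, and you would reload the checkpointed weights after stopping.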

🧩 8. Using CPU Instead of GPU

Problem:
Training is very slow because the code runs on the CPU.

Fix:
Always move your model and data to the appropriate device:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

🧠 9. Forgetting with torch.no_grad() During Evaluation

Problem:
If you forget this, PyTorch keeps track of gradients even during testing — wasting memory.

Fix:

with torch.no_grad():
    # evaluation loop

🧰 10. Not Saving and Loading Models Properly

Problem:
Many beginners save the model incorrectly (for example, pickling the entire model object) and then fail to reload it reliably on another machine.

Fix:
Save:

torch.save(model.state_dict(), "mnist_model.pth")

Load:

model.load_state_dict(torch.load("mnist_model.pth"))
model.eval()

💬 Real-World Debugging Mindset

When your model isn’t working as expected:

  1. Check data shapes — most bugs are tensor mismatches.

  2. Print layer outputs — find where values vanish or explode.

  3. Track gradients — use loss.item() and for p in model.parameters(): print(p.grad) for debugging.

  4. Simplify — start with a smaller model and dataset slice.

  5. Visualize — plot loss and accuracy after each epoch.
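Step 3 above can be wrapped in a small helper; a sketch that flags vanishing, exploding, or NaN gradients (the stand-in model and data are arbitrary):

```python
import torch
import torch.nn as nn

def report_gradients(model: nn.Module) -> None:
    """Print per-parameter gradient magnitudes, flagging NaN/Inf values."""
    for name, p in model.named_parameters():
        if p.grad is None:
            print(f"{name}: no gradient (parameter unused in loss?)")
            continue
        status = "NaN/Inf!" if not torch.isfinite(p.grad).all() else "ok"
        print(f"{name}: mean abs grad = {p.grad.abs().mean():.2e} [{status}]")

# Stand-in model and data, for illustration only
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
loss = model(torch.randn(3, 8)).sum()
loss.backward()
report_gradients(model)
```

Call it right after loss.backward() whenever training behaves strangely; mean gradient magnitudes near zero or blowing up point you to the offending layer.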


Summary Table: MNIST Debugging Cheatsheet

  • Data not normalized: Loss doesn’t decrease → Use transforms.Normalize()

  • Missing model.train(): Poor accuracy → Add model.train() during training

  • Gradients accumulating: NaN/Inf loss → Use optimizer.zero_grad()

  • Wrong targets: Runtime error → Pass integer class labels

  • Too high LR: Loss diverges → Lower the learning rate

  • Wrong accuracy calc: Accuracy always 0 → Use torch.max() for predictions

  • Overfitting: Validation loss increases → Add dropout / augmentation

  • Using CPU: Slow training → Move to GPU

  • No torch.no_grad(): Memory overflow → Wrap eval in with torch.no_grad()

  • Model not saved: Lost progress → Use torch.save() and torch.load()

🧠 Wrapping Up “Train a Simple Image Classifier (MNIST)”

With debugging covered, let’s now dive into Section 20: Visualizing Model Predictions on MNIST, an exciting and essential step that makes our model’s results interpretable and engaging for readers.


🎨 Section 20: Visualizing Model Predictions on MNIST

Training a model isn’t enough — as data scientists and AI engineers, we need to see what our model has learned. Visualization allows us to:

  • Validate the quality of predictions.

  • Detect misclassifications.

  • Build intuition about model confidence.

In this section, we’ll visualize predictions made by our trained PyTorch MNIST classifier.


🧩 1. Why Visualize Predictions?

Visualizing results helps to:

  • Confirm that digits are correctly recognized.

  • Identify confusing patterns (e.g., distinguishing “3” from “8”).

  • Evaluate how confident the model is for each class.

💡 Neural networks can achieve high accuracy, but still make “humanly obvious” mistakes — visualization makes these errors clear.


🧠 2. Getting Model Predictions

We’ll use the test dataset and display both the image and the predicted label from the trained model.

import matplotlib.pyplot as plt
import numpy as np
import torch

# Make sure the model is in evaluation mode
model.eval()

# Select a small batch of test images
dataiter = iter(test_loader)
images, labels = next(dataiter)

# Move data to the same device as the model
images, labels = images.to(device), labels.to(device)

# Get model outputs
with torch.no_grad():
    outputs = model(images)
    _, preds = torch.max(outputs, 1)

Now we have:

  • images → actual handwritten digits

  • labels → ground truth

  • preds → model predictions


📊 3. Visualizing a Grid of Predictions

Let’s plot a 4×4 grid showing random predictions.

# Move tensors to CPU and convert to NumPy
images = images.cpu().numpy()
preds = preds.cpu().numpy()
labels = labels.cpu().numpy()

fig = plt.figure(figsize=(8,8))
for idx in np.arange(16):
    ax = fig.add_subplot(4, 4, idx+1)
    ax.imshow(np.squeeze(images[idx]), cmap='gray')
    ax.set_title(f"Pred: {preds[idx]}, True: {labels[idx]}")
    ax.axis('off')
plt.show()

🖼️ Output:
A 4×4 grid displaying digits with:

  • Pred: predicted label

  • True: actual label

✅ If most labels match, congratulations — your model is performing well!



📉 4. Visualizing Model Confidence (Softmax Probabilities)

To understand how confident our model is, we can apply softmax to get probabilities for each class.

import torch.nn.functional as F

# Choose one image
idx = 0
image = images[idx]
label = labels[idx]

# Convert to tensor and pass through model
img_tensor = torch.tensor(image).unsqueeze(0).to(device)
output = model(img_tensor)
probs = F.softmax(output, dim=1).cpu().numpy().squeeze()

# Plot probabilities
plt.bar(np.arange(10), probs)
plt.title(f"True Label: {label}, Predicted: {np.argmax(probs)}")
plt.xlabel("Digit Class")
plt.ylabel("Probability")
plt.show()

🧩 Interpretation:

  • A tall, single bar indicates high confidence.

  • Multiple high bars indicate confusion (e.g., the model is unsure between 5 and 6).


🕵️ 5. Visualizing Misclassified Images

Let’s analyze what the model got wrong — this is crucial for improving future models.

# Predict all test samples
all_preds = torch.tensor([], dtype=torch.long)
all_labels = torch.tensor([], dtype=torch.long)

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, preds = torch.max(outputs, 1)
        all_preds = torch.cat((all_preds, preds.cpu()), dim=0)
        all_labels = torch.cat((all_labels, labels.cpu()), dim=0)

# Find misclassified examples
misclassified_idx = (all_preds != all_labels).nonzero().squeeze()

# Display first 16 misclassified samples
fig = plt.figure(figsize=(8,8))
for idx in np.arange(16):
    i = misclassified_idx[idx].item()
    img, pred, true = test_data[i][0], all_preds[i].item(), all_labels[i].item()
    ax = fig.add_subplot(4, 4, idx+1)
    ax.imshow(img.squeeze(), cmap='gray')
    ax.set_title(f"Pred: {pred}, True: {true}", color='red')
    ax.axis('off')
plt.show()

🔍 Observation Example:

  • Model might confuse 4 ↔ 9 or 3 ↔ 8, as they have similar shapes.

  • This reveals the biases and limitations of your training data or model architecture.


📈 6. Confusion Matrix Visualization

A confusion matrix gives a big-picture view of performance across all classes.

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

cm = confusion_matrix(all_labels, all_preds)
disp = ConfusionMatrixDisplay(confusion_matrix=cm)
disp.plot(cmap=plt.cm.Blues)
plt.title("MNIST Confusion Matrix")
plt.show()

💬 Interpretation:

  • The diagonal cells represent correct predictions.

  • Off-diagonal cells show misclassifications — e.g., how often “3” was mistaken for “8”.
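From the same confusion matrix you can also read off per-class accuracy; a sketch using a small made-up 3-class matrix (the real MNIST matrix would be 10×10):

```python
import numpy as np

# Small made-up 3-class confusion matrix (rows = true class, cols = predicted)
cm = np.array([[97, 2, 1],
               [3, 90, 7],
               [0, 5, 95]])

# Diagonal counts divided by each row's total = per-class accuracy (recall)
per_class_acc = cm.diagonal() / cm.sum(axis=1)
for cls, acc in enumerate(per_class_acc):
    print(f"Class {cls}: {acc:.1%}")
```

Classes with noticeably lower per-class accuracy are the ones worth inspecting in the misclassified-sample grid above.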


🧠 7. Real-World Insight

Visualizing outputs helps in:

  • Model explainability — seeing where AI goes wrong.

  • Improving datasets — adding more examples for commonly confused classes.

  • Debugging model bias — checking if the model fails systematically on certain digits.

In real-world AI systems (like handwriting recognition for bank cheques or digitized forms), visualization is vital to validate correctness before deployment.


Summary: Visualization Techniques

Visualization Purpose Library Used
Sample Images See data quality matplotlib
Prediction Grid Compare predicted vs actual labels matplotlib
Probability Bars View model confidence matplotlib
Misclassified Samples Debug training issues matplotlib
Confusion Matrix Assess class-level performance sklearn
