First Image Classifier with PyTorch - IV: Hyperparameter Tuning & Optimization Techniques, Model Evaluation & Confusion Matrix Analysis, and Saving, Loading, and Deploying the PyTorch Model

Building Your First Image Classifier with PyTorch: A Step-by-Step Guide Using the MNIST Dataset - IV

Contents:

13: Hyperparameter Tuning & Optimization Techniques in PyTorch

14: Model Evaluation & Confusion Matrix Analysis in PyTorch

15: Model Saving, Loading, and Deployment in PyTorch

16: Transfer Learning with Pretrained Models in PyTorch


⚙️ Section 13: Hyperparameter Tuning & Optimization Techniques in PyTorch

Once you have your model training successfully, the next step is to optimize its performance.
Even a small change in learning rate, batch size, or optimizer can dramatically improve accuracy.
This section explores Hyperparameter Tuning — the art and science of finding the best configuration for your neural network.


🎯 What Are Hyperparameters?

Hyperparameters are the external configurations of your model — they are not learned during training but directly influence how your model learns.

Type | Examples | Description
Model Hyperparameters | Number of layers, neurons per layer, activation functions | Define the model’s architecture
Training Hyperparameters | Learning rate, batch size, epochs | Control how the model is trained
Regularization Hyperparameters | Dropout rate, L2 penalty | Prevent overfitting
Optimizer Hyperparameters | Momentum, beta values (for Adam) | Fine-tune optimizer behavior

🧠 Why Hyperparameter Tuning Matters

A model’s accuracy can rise from 85% to 95% simply by adjusting hyperparameters — no architecture change needed.

Example:
For MNIST:

  • Learning rate = 0.01 → accuracy = 89%

  • Learning rate = 0.001 → accuracy = 96%

Just tuning one number led to a 7-percentage-point accuracy boost!

Sponsor Key-Word

"This Content Sponsored by SBO Digital Marketing.

Mobile-Based Part-Time Job Opportunity by SBO!

Earn money online by doing simple content publishing and sharing tasks. Here's how:

Job Type: Mobile-based part-time work

Work Involves:

Content publishing

Content sharing on social media

Time Required: As little as 1 hour a day

Earnings: ₹300 or more daily

Requirements:

Active Facebook and Instagram account

Basic knowledge of using mobile and social media

For more details:

WhatsApp your Name and Qualification to 9994104160

a.Online Part Time Jobs from Home

b.Work from Home Jobs Without Investment

c.Freelance Jobs Online for Students

d.Mobile Based Online Jobs

e.Daily Payment Online Jobs

Keyword & Tag: #OnlinePartTimeJob #WorkFromHome #EarnMoneyOnline #PartTimeJob #jobs #jobalerts #withoutinvestmentjob"


🧮 Key Hyperparameters in PyTorch

Let’s discuss the most critical ones in detail:

1️⃣ Learning Rate (lr)

Controls how big a step the optimizer takes in each iteration.

  • Too high → model overshoots minima.

  • Too low → model converges very slowly.

Example:

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

Visualization Concept:

  • Learning rate too high = bouncing around the minima.

  • Learning rate too low = slow crawl to the minima.
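Both failure modes can be reproduced on a toy one-dimensional problem. The function f(x) = x² and the step sizes below are illustrative choices, not part of the MNIST setup:

```python
def gradient_descent(lr, steps=20, x=5.0):
    """Minimize f(x) = x**2 by plain gradient descent; the gradient is 2x."""
    for _ in range(steps):
        x = x - lr * 2 * x  # one update step
    return x

print(abs(gradient_descent(lr=0.1)))  # small step size: x ends up near the minimum at 0
print(abs(gradient_descent(lr=1.5)))  # oversized step size: x bounces away and diverges
```

With lr = 0.1 each step shrinks x by a constant factor, while lr = 1.5 overshoots the minimum so badly that the iterate grows without bound.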

You can also use Learning Rate Schedulers to dynamically adjust it during training:

scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
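StepLR multiplies the learning rate by gamma every step_size epochs. Here is a plain-Python sketch of that schedule (the function name and numbers are illustrative):

```python
def step_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate in effect after `epoch` epochs of a StepLR schedule."""
    return base_lr * (gamma ** (epoch // step_size))

for epoch in (0, 9, 10, 25):
    print(epoch, step_lr(0.01, epoch))  # drops to 0.001 at epoch 10, 0.0001 at epoch 20
```

In a real training loop you call scheduler.step() once per epoch and PyTorch applies this decay for you.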

2️⃣ Batch Size

Defines how many samples are used to compute each gradient update.

Batch Size | Description
Small (e.g., 16–32) | Noisier gradients; often generalizes better
Large (e.g., 256–512) | Stable gradients, but needs more memory

train_loader = DataLoader(dataset=train_data, batch_size=64, shuffle=True)

3️⃣ Number of Epochs

One epoch = one complete pass through the dataset.

  • Too few → underfitting

  • Too many → overfitting

Use Early Stopping to avoid training too long:

if val_loss < best_val_loss:
    best_val_loss = val_loss
    trigger_times = 0  # improvement: reset the patience counter
else:
    trigger_times += 1
    if trigger_times >= patience:
        print("Early stopping triggered!")
        break

4️⃣ Optimizer Choice

PyTorch offers several optimizers. Each suits different learning patterns.

Optimizer | Description | Code Example
SGD | Basic gradient descent | torch.optim.SGD(model.parameters(), lr=0.01)
Adam | Adaptive learning, widely used | torch.optim.Adam(model.parameters(), lr=0.001)
RMSprop | Great for RNNs | torch.optim.RMSprop(model.parameters(), lr=0.001)

5️⃣ Activation Functions

These define nonlinearities and help models learn complex relationships.

Activation | Use Case | PyTorch Example
ReLU | Most CNNs, general networks | torch.nn.ReLU()
Sigmoid | Binary classification | torch.nn.Sigmoid()
Tanh | Normalized outputs (-1 to 1) | torch.nn.Tanh()
LeakyReLU | Avoids “dead ReLU” issue | torch.nn.LeakyReLU(0.01)

6️⃣ Regularization (Dropout, Weight Decay)

Prevents overfitting by adding constraints.

  • Dropout: Randomly drops neurons during training.

torch.nn.Dropout(p=0.3)
  • Weight Decay: Penalizes large weights (L2 regularization).

optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)
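For plain SGD, weight decay simply adds weight_decay * w to each parameter's gradient before the update, which steadily pulls weights toward zero (PyTorch's Adam folds the same L2 term into its adaptive update). A one-line sketch with made-up numbers:

```python
def sgd_step(w, grad, lr=0.1, weight_decay=0.0):
    """One SGD update; weight decay contributes an extra gradient of wd * w."""
    return w - lr * (grad + weight_decay * w)

print(sgd_step(2.0, grad=0.5))                    # plain update
print(sgd_step(2.0, grad=0.5, weight_decay=0.1))  # decay shrinks w slightly more
```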

🔬 Manual Hyperparameter Search (Grid Search)

You can loop through combinations manually:

learning_rates = [0.1, 0.01, 0.001]
batch_sizes = [32, 64, 128]

for lr in learning_rates:
    for batch in batch_sizes:
        print(f"Training with lr={lr}, batch={batch}")
        model = Net()  # re-initialize so each combination starts fresh
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        train_loader = DataLoader(train_data, batch_size=batch, shuffle=True)
        train_model(model, train_loader, optimizer)

This approach is simple but computationally expensive for larger models.


🤖 Automated Hyperparameter Tuning

For larger projects, libraries like Optuna, Ray Tune, or Weights & Biases (W&B) automate the search.

Example (Optuna):

import optuna

def objective(trial):
    lr = trial.suggest_float('lr', 1e-5, 1e-1, log=True)
    dropout = trial.suggest_float('dropout', 0.1, 0.5)
    
    model = Net(dropout=dropout)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    
    acc = train_and_validate(model, optimizer)
    return acc

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params)

This allows you to automatically find the best hyperparameters.


📈 Visualization: Learning Rate vs Accuracy

You can visualize the effect of hyperparameters:

import matplotlib.pyplot as plt

lrs = [0.1, 0.01, 0.001, 0.0001]
accuracy = [0.80, 0.88, 0.95, 0.91]

plt.plot(lrs, accuracy, 'bo-')
plt.xscale('log')
plt.xlabel('Learning Rate')
plt.ylabel('Validation Accuracy')
plt.title('Learning Rate vs Accuracy')
plt.show()

💡 Best Practices for Hyperparameter Tuning

  1. Start with defaults – frameworks like PyTorch choose reasonable defaults.

  2. Tune one parameter at a time – isolate effects.

  3. Use validation sets – never tune on test data.

  4. Track experiments – log results systematically.

  5. Leverage GPU resources – use CUDA to parallelize training.

  6. Use early stopping – avoid unnecessary computation.


🧠 Real-World Example:

In Google’s BERT model, the choice of learning rate (2e-5 to 5e-5) drastically affected convergence.
In ResNet training, batch size scaling (32 → 512) required learning rate warmup and adaptive scheduling.
This shows how hyperparameter tuning is an integral part of every production-grade AI system.


Conclusion of Section 13

Hyperparameter tuning transforms a working model into a high-performing one.
It’s not guesswork — it’s a process of systematic experimentation, observation, and optimization.


📊 Section 14: Model Evaluation & Confusion Matrix Analysis in PyTorch

After successfully training and tuning your model, the next crucial step is to evaluate its performance.
Model evaluation ensures that your neural network not only performs well on the training data but also generalizes effectively to unseen (test) data.

In this section, you’ll learn:
✅ How to measure model performance using metrics like accuracy, precision, recall, F1-score
✅ How to generate and visualize a confusion matrix
✅ How to interpret evaluation metrics for real-world decision-making


🧠 1️⃣ The Need for Model Evaluation

Training accuracy alone doesn’t tell the full story.
A model can have high training accuracy but low test accuracy, meaning it’s memorizing rather than learning — a classic case of overfitting.

Hence, we use evaluation metrics to:

  • Understand model performance on unseen data

  • Identify biases or misclassifications

  • Compare multiple models systematically


⚙️ 2️⃣ Basic Evaluation Metrics

Let’s go through key metrics used in deep learning model evaluation.

Metric | Formula | Description
Accuracy | (TP + TN) / (TP + TN + FP + FN) | Overall correctness of the model
Precision | TP / (TP + FP) | How many predicted positives are actually positive
Recall (Sensitivity) | TP / (TP + FN) | How many actual positives were correctly predicted
F1-Score | 2 × (Precision × Recall) / (Precision + Recall) | Balance between precision and recall

Where:

  • TP = True Positives

  • TN = True Negatives

  • FP = False Positives

  • FN = False Negatives
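The four formulas can be checked directly in plain Python; the counts below are invented for illustration:

```python
def metrics(tp, tn, fp, fn):
    """Compute the four basic classification metrics from raw counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = metrics(tp=90, tn=50, fp=10, fn=5)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```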


🔍 3️⃣ Confusion Matrix Explained

A Confusion Matrix gives a detailed breakdown of correct and incorrect predictions.

                 | Predicted: Positive | Predicted: Negative
Actual: Positive | True Positive (TP)  | False Negative (FN)
Actual: Negative | False Positive (FP) | True Negative (TN)

For example:

  • In a digit classifier, predicting 7 when the true label is 7 → ✅ (TP)

  • Predicting 3 when the true label is 7 → ❌ (a false negative for class 7 and, at the same time, a false positive for class 3 — in multiclass settings each error is counted per class)
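The bookkeeping becomes concrete if you build the matrix by hand: row index = actual class, column index = predicted class. A pure-Python sketch with made-up labels:

```python
def build_confusion_matrix(y_true, y_pred, num_classes):
    """Row = actual class, column = predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for actual, predicted in zip(y_true, y_pred):
        cm[actual][predicted] += 1
    return cm

y_true = [7, 7, 3, 7, 3]
y_pred = [7, 3, 3, 7, 3]
cm = build_confusion_matrix(y_true, y_pred, num_classes=10)
print(cm[7][7], cm[7][3])  # correctly classified 7s, and 7s misread as 3
```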


🧩 4️⃣ Implementing Model Evaluation in PyTorch

Let’s evaluate our MNIST image classifier trained earlier.

import torch
from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

# Switch model to evaluation mode
model.eval()

y_true = []
y_pred = []

with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        y_true.extend(labels.cpu().numpy())
        y_pred.extend(predicted.cpu().numpy())

# Print classification report
print(classification_report(y_true, y_pred))

# Confusion Matrix
cm = confusion_matrix(y_true, y_pred)
plt.figure(figsize=(10,8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix for MNIST Classifier")
plt.show()

🧮 5️⃣ Example Output

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      980
           1       0.98      0.98      0.98     1135
           2       0.96      0.97      0.97     1032
           3       0.97      0.95      0.96     1010
           4       0.98      0.97      0.98      982
           5       0.95      0.95      0.95      892
           6       0.98      0.99      0.99      958
           7       0.97      0.97      0.97     1028
           8       0.96      0.95      0.96      974
           9       0.96      0.97      0.97     1009

    accuracy                           0.97     10000
   macro avg       0.97      0.97      0.97     10000
weighted avg       0.97      0.97      0.97     10000

Here, the macro average gives equal weight to each class, while weighted average considers class imbalance.
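The distinction is easiest to see with two imbalanced classes; the per-class F1 scores and supports below are invented for illustration:

```python
f1_scores = [0.90, 0.60]  # per-class F1
supports = [900, 100]     # number of samples in each class

macro = sum(f1_scores) / len(f1_scores)
weighted = sum(f * s for f, s in zip(f1_scores, supports)) / sum(supports)
print(macro, weighted)  # macro treats classes equally; weighted favors the large class
```

Here the weighted average (0.87) is pulled toward the well-represented class, while the macro average (0.75) exposes the weak minority class.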


📈 6️⃣ Visualizing Confusion Matrix

A heatmap representation makes misclassifications visible.

Example visualization:

🟦 Diagonal cells = correct predictions (TP)
🟥 Off-diagonal cells = misclassifications

If most of the heatmap’s intensity is concentrated along the diagonal, your model is performing well.

This visual cue helps you spot systematic errors — e.g., the model confusing 3 with 8, or 5 with 6.


🧠 7️⃣ Real-World Analogy

Think of a COVID-19 test:

Prediction / Reality | Positive | Negative
Predicted Positive | True Positive (correctly identified infected) | False Positive (healthy person incorrectly flagged)
Predicted Negative | False Negative (infected person missed) | True Negative (correctly identified healthy)

Depending on the use case, you might prioritize:

  • Recall → For medical tests (catch all positives)

  • Precision → For spam filters (avoid false alarms)

Similarly, in AI systems:

  • Fraud detection → prioritize recall

  • Email classification → prioritize precision


🧪 8️⃣ Practical Tip — Normalize Confusion Matrix

For clearer visual comparison between classes:

import numpy as np

cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
sns.heatmap(cm_normalized, annot=True, fmt='.2f', cmap='Greens')
plt.title("Normalized Confusion Matrix")
plt.show()

This displays percentage-based performance per class — helpful when class counts differ.



💡 9️⃣ Save Metrics for Reporting

You can log evaluation metrics for reproducibility or further analysis:

import pandas as pd

metrics = classification_report(y_true, y_pred, output_dict=True)
df_metrics = pd.DataFrame(metrics).transpose()
df_metrics.to_csv("mnist_metrics.csv", index=True)

This can later be integrated into dashboards or monitoring systems for continuous evaluation.


🧠 10️⃣ Beyond Accuracy: Advanced Metrics

In advanced use cases (like medical imaging or NLP), you might use:

  • AUC-ROC Curve → Tradeoff between True Positive Rate & False Positive Rate

  • PR Curve (Precision-Recall) → Especially useful for imbalanced data

  • Top-k Accuracy → For multi-class classification (e.g., ImageNet)

Example:

from sklearn.metrics import roc_curve, auc

# ROC analysis needs continuous scores (e.g., predicted probabilities for the
# positive class) and binary ground-truth labels, not hard 0/1 predictions
fpr, tpr, _ = roc_curve(y_true_binary, y_scores)
roc_auc = auc(fpr, tpr)
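Top-k accuracy counts a prediction as correct when the true label appears among the k highest-scoring classes. A plain-Python sketch with made-up score vectors:

```python
def top_k_accuracy(scores, labels, k=2):
    """scores: one list of per-class scores per sample; labels: true class indices."""
    hits = 0
    for row, label in zip(scores, labels):
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
labels = [2, 0, 1]
print(top_k_accuracy(scores, labels, k=2))  # 2 of the 3 true labels land in the top 2
```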

Conclusion of Section 14

Model evaluation is where training meets reality — it’s how you confirm that your neural network truly understands patterns rather than memorizing data.

You now know:

  • How to compute precision, recall, and F1-score

  • How to generate and interpret confusion matrices

  • How to visualize and export evaluation metrics for real-world reporting


🚀 Section 15: Model Saving, Loading, and Deployment in PyTorch

You’ve trained, tuned, and evaluated your PyTorch model — now it’s time to make it useful in the real world.
Model deployment bridges the gap between training and inference — turning your research code into a production-ready AI application.

In this section, we’ll explore how to:
✅ Save and load PyTorch models safely
✅ Perform inference on new (unseen) data
✅ Export models for deployment (TorchScript / ONNX)
✅ Integrate with APIs and real-world applications


🧠 1️⃣ Why Model Saving and Loading Matters

Training a model can take hours or even days, depending on data and architecture.
If you don’t save your model, you’ll lose all your learned weights!

By saving the model:

  • You can pause and resume training anytime.

  • You can share the model with other developers.

  • You can deploy it for inference on different platforms.


💾 2️⃣ Saving a PyTorch Model

There are two main methods to save models in PyTorch:

A) Save Only Model Weights (Recommended)

This is the most common and flexible approach.

import torch

# Save model weights
torch.save(model.state_dict(), "mnist_model_weights.pth")
print("Model weights saved successfully!")

This saves all the parameters (weights and biases) learned during training — not the model class itself.


B) Save the Entire Model

You can also save both the model structure and weights together (not recommended for production due to serialization issues).

torch.save(model, "mnist_full_model.pth")

However, this approach makes it harder to reload the model across PyTorch versions or systems.


🔄 3️⃣ Loading the Model

To reuse a model for inference or further training, you load it as follows:

If you saved weights only:

model = Net()  # recreate the model architecture
# map_location lets a CPU-only machine load weights that were saved on a GPU
state_dict = torch.load("mnist_model_weights.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # switch to evaluation mode
print("Model loaded successfully!")

If you saved the full model:

model = torch.load("mnist_full_model.pth")
model.eval()

The .eval() method is crucial — it disables dropout and switches batch normalization to its stored running statistics, ensuring consistent predictions during inference.
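The effect is easy to verify with a standalone Dropout layer: in training mode it zeroes random entries (and rescales the survivors), while in eval mode it passes the input through unchanged:

```python
import torch
import torch.nn as nn

dropout = nn.Dropout(p=0.5)
x = torch.ones(1, 10)

dropout.train()  # training mode: entries are randomly zeroed, survivors scaled by 1/(1-p)
print(dropout(x))

dropout.eval()   # evaluation mode: dropout becomes a no-op
print(torch.equal(dropout(x), x))  # True
```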



🔍 4️⃣ Making Predictions (Inference Mode)

Once your model is loaded, you can use it to make predictions on new data:

import torch
from torchvision import transforms
from PIL import Image

# Load image and preprocess
image = Image.open("digit_sample.png").convert("L")
transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

image = transform(image).unsqueeze(0)  # add batch dimension

# Predict
model.eval()
with torch.no_grad():
    output = model(image)
    predicted_label = output.argmax(1).item()

print(f"Predicted Digit: {predicted_label}")

Output Example:

Predicted Digit: 7

⚙️ 5️⃣ Deploying PyTorch Models Efficiently

When deploying in production, performance and portability matter.
Two key export formats make deployment easier:


🔹 A) TorchScript — Deployment within the PyTorch Ecosystem

TorchScript allows you to convert PyTorch models into a format that runs without Python, enabling deployment on mobile or embedded systems.

# Convert to TorchScript
traced_model = torch.jit.trace(model, torch.randn(1, 1, 28, 28))
traced_model.save("mnist_traced_model.pt")
print("Model exported to TorchScript format!")

To load and run:

loaded_model = torch.jit.load("mnist_traced_model.pt")
output = loaded_model(torch.randn(1, 1, 28, 28))

Benefits:

  • Faster inference

  • Runs in C++ environments

  • No need for Python runtime


🔹 B) ONNX — Open Neural Network Exchange Format

ONNX is an open standard for AI models, compatible with multiple frameworks like TensorFlow, Caffe2, or even edge devices.

Export to ONNX:

dummy_input = torch.randn(1, 1, 28, 28)
torch.onnx.export(model, dummy_input, "mnist_model.onnx", 
                  input_names=['input'], output_names=['output'])
print("Model exported to ONNX format!")

You can then load the ONNX model using frameworks like ONNX Runtime, TensorRT, or deploy it on Azure, AWS, or NVIDIA Jetson.


🧩 6️⃣ Real-World Deployment Example — Flask API

You can deploy your trained PyTorch model as a REST API using Flask.

Example:

from flask import Flask, request, jsonify
import torch

app = Flask(__name__)
model = Net()
model.load_state_dict(torch.load("mnist_model_weights.pth"))
model.eval()

@app.route('/predict', methods=['POST'])
def predict():
    data = request.json['image']  # expects a 28x28 nested list of pixel values
    tensor = torch.tensor(data).float().unsqueeze(0).unsqueeze(0)  # shape (1, 1, 28, 28)
    with torch.no_grad():
        output = model(tensor)
        pred = output.argmax(1).item()
    return jsonify({'prediction': pred})

if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5000)  # bind all interfaces so the API also works inside Docker

You can now send a POST request with image data, and the API returns a predicted label in JSON.


🧠 7️⃣ Model Deployment Options

Depending on your needs, here are popular deployment paths:

Platform | Description | Example
Local Inference | Run directly on CPU/GPU locally | Ideal for testing
Flask/FastAPI | Lightweight REST API | Quick deployment
Docker | Containerized deployment | For production
ONNX Runtime | Fast cross-platform inference | Cloud or edge devices
TorchServe | Official PyTorch model server | Enterprise-grade deployment

🧮 8️⃣ Example: Dockerizing Your PyTorch Model

Docker enables your model to run consistently across environments.

Dockerfile Example:

FROM pytorch/pytorch:latest
COPY mnist_model_weights.pth /app/
COPY app.py /app/
WORKDIR /app
RUN pip install flask torchvision
CMD ["python", "app.py"]

Then build and run:

docker build -t mnist-api .
docker run -p 5000:5000 mnist-api

Your model is now live inside a containerized Flask API!



💡 9️⃣ Model Versioning and Monitoring

In real-world ML systems:

  • You’ll retrain models frequently as new data arrives.

  • You’ll track metrics like accuracy drift or prediction latency.

Use tools like:

  • Weights & Biases (wandb) – Track experiments and versions.

  • MLflow – Manage model lifecycle (training → serving).

  • TorchServe + Prometheus – Monitor inference performance.


Conclusion of Section 15

You’ve now learned how to:

  • Save and load models efficiently

  • Perform inference on new data

  • Export to TorchScript or ONNX for deployment

  • Serve your model using Flask and Docker

With this, your PyTorch model transitions from an academic project to a production-ready AI service.


🚀 Section 16: Transfer Learning with Pretrained Models in PyTorch

Training deep neural networks from scratch requires huge datasets and extensive computational resources.
However, in most real-world scenarios, we can leverage existing models that have already learned useful features from massive datasets (like ImageNet).

This is where Transfer Learning comes in — a game-changing technique that helps you build accurate models quickly and efficiently.


🧠 1️⃣ What is Transfer Learning?

Transfer Learning is the process of taking a pre-trained model (already trained on a large dataset) and adapting it to a new, but related task.

Think of it like this:

“Why reinvent the wheel when you can fine-tune it for your own car?”

A pre-trained model already understands features like edges, shapes, and textures, which can be reused for new image classification tasks.


🔹 Key Benefits of Transfer Learning:

Benefit | Description
Faster Training | You start from pre-learned weights, not random initialization.
Better Accuracy | Beneficial when you have limited data.
Reduced Computational Cost | Requires less training time and resources.
Proven Architectures | Leverage powerful models like ResNet, VGG, or MobileNet.

🧩 2️⃣ Types of Transfer Learning

There are two main strategies:

A) Feature Extraction

You freeze all layers of the pretrained model and only train the final classifier layer.

  • Used when your dataset is small.

  • The pretrained model acts as a fixed feature extractor.

B) Fine-Tuning

You unfreeze some layers and retrain them along with the final layer.

  • Used when your dataset is large or similar to the original dataset.

  • The model adjusts its internal features to fit the new task.
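Which regime you are in is easy to check by counting trainable parameters. A sketch with a tiny stand-in network (the layer sizes are illustrative, not a real pretrained backbone):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(8, 4),  # pretend "pretrained" backbone: 8*4 + 4 = 36 parameters
    nn.ReLU(),
    nn.Linear(4, 2),  # new classifier head: 4*2 + 2 = 10 parameters
)

# Feature extraction: freeze the backbone, leave only the head trainable
for param in model[0].parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"{trainable} of {total} parameters are trainable")
```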


🧠 3️⃣ Common Pretrained Models in PyTorch

PyTorch provides a wide range of pretrained models through the torchvision.models library, including:

Model | Description
ResNet | Deep residual network with skip connections
VGG | Deep network with large convolutional layers
AlexNet | Classic CNN for image classification
DenseNet | Uses dense connections between layers
MobileNet | Lightweight model for mobile applications
EfficientNet | Scalable model balancing accuracy and efficiency

⚙️ 4️⃣ Example: Transfer Learning with ResNet18

Let’s train a model to classify cats and dogs using a pretrained ResNet18 model.


Step 1: Import Libraries

import torch
import torch.nn as nn
from torchvision import models, transforms, datasets
from torch.utils.data import DataLoader

Step 2: Load Pretrained Model

# Load pretrained ResNet18 (the weights= argument replaces the deprecated pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze all layers (Feature Extraction Mode)
for param in model.parameters():
    param.requires_grad = False

# Modify the final layer for 2 output classes (cat/dog)
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 2)

Step 3: Define Transforms and Load Data

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

train_data = datasets.ImageFolder('data/train', transform=transform)
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)

Step 4: Train the Modified Layer

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=0.001)

for epoch in range(5):
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")

Step 5: Save and Test

torch.save(model.state_dict(), "resnet18_cats_dogs.pth")
print("Fine-tuned ResNet18 model saved successfully!")

🧩 5️⃣ Fine-Tuning Selected Layers

If you have more data, you can unfreeze some deeper layers:

# Unfreeze the last residual block (layer4)
for name, param in model.named_parameters():
    if "layer4" in name:
        param.requires_grad = True

Then adjust the optimizer to train both new and unfrozen parameters:

optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.0001)

This allows the model to adapt to your dataset while retaining its powerful pretrained knowledge.


🧠 6️⃣ Evaluating Transfer Learning Performance

After training, evaluate the model on the test set:

correct = 0
total = 0

with torch.no_grad():
    for inputs, labels in test_loader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy: {100 * correct / total:.2f}%')

Output Example:

Accuracy: 97.85%

🔍 7️⃣ Visualizing Model Predictions

Visualizations make it easier to understand how your model performs.

import matplotlib.pyplot as plt
import numpy as np

def imshow(inp, title=None):
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title:
        plt.title(title)
    plt.show()

inputs, classes = next(iter(test_loader))
with torch.no_grad():
    outputs = model(inputs)
_, preds = torch.max(outputs, 1)

imshow(inputs[0], title=f"Predicted: {preds[0].item()}")

8️⃣ Transfer Learning in NLP

Transfer learning isn’t limited to computer vision — it’s also dominant in Natural Language Processing (NLP).

Popular pretrained models include:

  • BERT (Bidirectional Encoder Representations from Transformers)

  • GPT (Generative Pretrained Transformer)

  • RoBERTa

  • DistilBERT

Using Hugging Face’s Transformers library:

from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

Fine-tuning BERT for text classification follows a process similar to fine-tuning ResNet for images.


๐ŸŒ 9️⃣ Real-World Applications of Transfer Learning

Domain | Application | Pretrained Model
Healthcare | Medical image classification | ResNet, DenseNet
Retail | Product recommendation | EfficientNet
Finance | Fraud detection | TabNet
NLP | Sentiment analysis | BERT, RoBERTa
Autonomous Vehicles | Object detection | YOLO, Faster R-CNN

🧠 🔟 Tips for Effective Transfer Learning

✅ Choose a model pretrained on a similar domain
✅ Use smaller learning rates when fine-tuning
✅ Normalize images properly to match pretrained model stats
✅ Freeze early layers when your dataset is small
✅ Always evaluate before and after fine-tuning to measure improvement


Conclusion of Section 16

You’ve learned how to:

  • Reuse pretrained models like ResNet18

  • Apply feature extraction and fine-tuning strategies

  • Perform training and evaluation efficiently

  • Extend transfer learning to NLP models like BERT

  • Understand real-world deployment scenarios

Transfer learning allows you to achieve state-of-the-art performance without massive datasets or compute power.
