Autoencoders Explained: A Complete Guide - II

Contents:

5. Building Your First Autoencoder in PyTorch (Full Code + Explanation)
6. Building a Denoising Autoencoder (Theory + Full PyTorch Code)
7. Types of Autoencoders
8. Building an Autoencoder — Step-by-Step (Conceptual Walkthrough)

📘 Section 5: Building Your First Autoencoder in PyTorch (Full Code + Explanation)

Now that we understand the theory and math, it’s time to build a real autoencoder using PyTorch.
In this section, we’ll walk step-by-step through:

✔ Preparing the dataset
✔ Writing the Autoencoder class
✔ Training the model
✔ Evaluating reconstructions
✔ Visualizing outputs

This is your first working autoencoder, and it becomes the foundation for all advanced versions (Denoising AE, VAE, CVAE, etc.).


🔶 1. Import Dependencies

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
  • torch.nn → layers and activations for building the model

  • torch.optim → optimizers (we use Adam)

  • torchvision.datasets → the MNIST dataset

  • matplotlib → visualizing reconstructions


🔶 2. Preparing the MNIST Dataset

We use MNIST's 28×28 grayscale handwritten digits, a dataset that is perfect for beginners.

transform = transforms.Compose([
    transforms.ToTensor()
])

train_data = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_data  = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_data, batch_size=128, shuffle=True)
test_loader  = DataLoader(test_data, batch_size=128, shuffle=False)

✔ Pixel values converted to tensors in the range 0–1
✔ No extra normalization required: ToTensor already scales pixels to [0, 1], matching the Sigmoid output


🔶 3. Define the Autoencoder Class

Here’s a simple fully connected (dense) autoencoder:

Architecture:

  • Input: 784 (flattened 28×28)

  • Hidden: 256 → 64 → bottleneck = 16

  • Decoder: reverse of encoder

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        
        # Encoder
        self.encoder = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 16)   # bottleneck
        )
        
        # Decoder
        self.decoder = nn.Sequential(
            nn.Linear(16, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            nn.Sigmoid()       # outputs 0–1
        )
    
    def forward(self, x):
        x = x.view(-1, 784)  # flatten
        z = self.encoder(x)
        out = self.decoder(z)
        out = out.view(-1, 1, 28, 28)  # reshape back
        return out, z

encoder compresses the input into a 16-dimensional latent vector
decoder reconstructs the input from that vector
✔ Sigmoid output keeps pixels in [0, 1], matching the input range
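
A quick sanity check (a minimal snippet, assuming the class above and the imports from step 1) confirms that shapes round-trip correctly:

# Sanity check: a dummy batch of 4 images should reconstruct to the
# same shape, with one 16-dimensional latent vector per image
dummy = torch.rand(4, 1, 28, 28)
out, z = Autoencoder()(dummy)
print(out.shape)  # torch.Size([4, 1, 28, 28])
print(z.shape)    # torch.Size([4, 16])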


🔶 4. Initialize Model, Loss Function & Optimizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = Autoencoder().to(device)

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
  • MSE works well for pixel-level reconstruction

  • Adam is stable and efficient


🔶 5. Training Loop

num_epochs = 10

for epoch in range(num_epochs):
    total_loss = 0
    
    for images, _ in train_loader:
        images = images.to(device)
        
        optimizer.zero_grad()
        
        outputs, latent = model(images)
        loss = criterion(outputs, images)
        
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
    
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(train_loader):.4f}")

✔ Forward pass
✔ Compute reconstruction loss
✔ Backpropagation
✔ Parameter update


🔶 6. Reconstruct Images (Qualitative Evaluation)

def show_images(original, reconstructed, n=10):
    plt.figure(figsize=(15, 4))
    
    for i in range(n):
        # Original
        plt.subplot(2, n, i+1)
        plt.imshow(original[i].squeeze().cpu().numpy(), cmap='gray')
        plt.axis('off')
        
        # Reconstructed
        plt.subplot(2, n, i+1+n)
        plt.imshow(reconstructed[i].squeeze().cpu().detach().numpy(), cmap='gray')
        plt.axis('off')

    plt.show()

Now test:

model.eval()

test_images, _ = next(iter(test_loader))
test_images = test_images.to(device)

with torch.no_grad():
    reconstructed, _ = model(test_images)

show_images(test_images, reconstructed)

🔶 7. Inspecting the Latent Space (Optional)

The latent vector (size 16) is accessible via:

_, latent_vectors = model(test_images)
print(latent_vectors.shape)

Output:

torch.Size([128, 16])

Each image is now compressed from 784 → 16 dimensions.
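
To see whether the latent space is organized by digit class, here is a minimal optional sketch that projects the 16-dimensional latent vectors to 2-D with PCA (via torch.pca_lowrank) and colors each point by its label; the variable names are our own:

# Encode one test batch and project the latents to 2-D with PCA
test_images, test_labels = next(iter(test_loader))

model.eval()
with torch.no_grad():
    _, latents = model(test_images.to(device))

latents = latents.cpu()
U, S, V = torch.pca_lowrank(latents, q=2)             # top-2 principal directions
coords = (latents - latents.mean(dim=0)) @ V[:, :2]   # centered projection

plt.scatter(coords[:, 0], coords[:, 1], c=test_labels, cmap='tab10', s=10)
plt.colorbar(label='digit')
plt.title('Latent space (2-D PCA projection)')
plt.show()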


🔶 8. Example Reconstruction Results

You will typically see:

✔ Blurry but recognizable digit reconstructions
✔ Clear retention of the original shapes
✔ Mild smoothing of noise, even though the model was not trained as a denoiser

This confirms that the autoencoder has learned the essential features.


🔶 Section 5 Summary

In this section, we:

✔ Loaded MNIST
✔ Built a fully connected autoencoder
✔ Defined encoder + decoder architecture
✔ Trained it for 10 epochs
✔ Reconstructed test images
✔ Saw latent vectors representing compressed data

This is the foundational model upon which all advanced autoencoders are built.


📘 Section 6: Building a Denoising Autoencoder (Theory + Full PyTorch Code)

So far, we have built a vanilla autoencoder that learns to reconstruct images.
But real-world data is often corrupted, noisy, or incomplete.

To handle this, researchers introduced a powerful variation:

🔶 Denoising Autoencoder (DAE)

A model trained to remove noise and recover original clean data.

This turns the autoencoder into a robust feature extractor.


🔶 1. What Is a Denoising Autoencoder?

A Denoising Autoencoder works like this:

  1. Start with a clean input \( x \)

  2. Add noise to get a corrupted input \( \tilde{x} \)

  3. Feed \( \tilde{x} \) into the encoder

  4. The decoder reconstructs a clean output \( \hat{x} \approx x \)

Formally:

\[
\tilde{x} = x + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^2)
\]

\[
\hat{x} = \text{Decoder}(\text{Encoder}(\tilde{x}))
\]

The loss is still the Mean Squared Error (MSE), measured against the clean input:

\[
\mathcal{L} = \| x - \hat{x} \|^2
\]

✔ Why is the DAE powerful?

  • Learns robust, noise-invariant features

  • Avoids learning a trivial identity function

  • Generalizes better than a vanilla AE



🔶 2. Adding Noise to Images

The most common way:

Gaussian Noise

\[
\tilde{x} = x + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^2), \quad \text{e.g. } \sigma = 0.1
\]

Salt & Pepper Noise

Random black/white pixels.

In this tutorial, we use Gaussian noise.
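
For comparison, here is a minimal salt-and-pepper transform (not used in this tutorial); the class name and default corruption probability are our own choices, and it assumes torch is imported as in the next step:

# Randomly set a fraction of pixels to pure black or pure white
class AddSaltPepperNoise(object):
    def __init__(self, prob=0.05):
        self.prob = prob  # fraction of pixels to corrupt

    def __call__(self, tensor):
        mask = torch.rand(tensor.size())
        tensor = tensor.clone()
        tensor[mask < self.prob / 2] = 0.0        # "pepper": black pixels
        tensor[mask > 1 - self.prob / 2] = 1.0    # "salt": white pixels
        return tensor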


🔶 3. Dataset with Added Noise

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import matplotlib.pyplot as plt

# Add Gaussian Noise
class AddGaussianNoise(object):
    def __init__(self, mean=0., std=0.2):
        self.mean = mean
        self.std = std
        
    def __call__(self, tensor):
        noise = torch.randn(tensor.size()) * self.std + self.mean
        return torch.clamp(tensor + noise, 0., 1.)

The training input is the noisy image; the training target is the clean image:

train_transform = transforms.Compose([
    transforms.ToTensor(),
    AddGaussianNoise(0., 0.3)
])

clean_transform = transforms.ToTensor()

train_data_noisy = datasets.MNIST(root='./data', train=True, download=True, transform=train_transform)
train_data_clean = datasets.MNIST(root='./data', train=True, download=True, transform=clean_transform)

# Pair each noisy sample with its clean counterpart at the same index;
# each dataset element becomes ((noisy_img, label), (clean_img, label))
train_loader = DataLoader(list(zip(train_data_noisy, train_data_clean)), batch_size=128, shuffle=True)

🔶 4. Denoising Autoencoder Architecture

We’ll reuse the same autoencoder structure from Section 5.

class DenoisingAutoencoder(nn.Module):
    def __init__(self):
        super(DenoisingAutoencoder, self).__init__()
        
        self.encoder = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 16),
        )
        
        self.decoder = nn.Sequential(
            nn.Linear(16, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            nn.Sigmoid()
        )
    
    def forward(self, x):
        x = x.view(-1, 784)
        z = self.encoder(x)
        out = self.decoder(z)
        return out.view(-1, 1, 28, 28)

🔶 5. Training the Denoising Autoencoder

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = DenoisingAutoencoder().to(device)

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 10

for epoch in range(num_epochs):
    total_loss = 0
    
    for (noisy_imgs, _), (clean_imgs, _) in train_loader:  # unpack (image, label) pairs
        noisy_imgs = noisy_imgs.to(device)
        clean_imgs = clean_imgs.to(device)
        
        optimizer.zero_grad()
        
        reconstructed = model(noisy_imgs)
        loss = criterion(reconstructed, clean_imgs)
        
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
    
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(train_loader):.4f}")

✔ The model learns to map noisy → clean
✔ Loss steadily decreases over epochs


🔶 6. Testing the Denoising Model

test_data = datasets.MNIST(root='./data', train=False, download=True,
                           transform=transforms.ToTensor())

test_loader = DataLoader(test_data, batch_size=10, shuffle=True)

# Add noise manually for testing
noise = AddGaussianNoise(0., 0.3)

images, _ = next(iter(test_loader))
noisy_images = noise(images)

images = images.to(device)
noisy_images = noisy_images.to(device)

model.eval()
with torch.no_grad():
    output = model(noisy_images)

🔶 7. Visualizing Results

def show_denoising(original, noisy, reconstructed):
    plt.figure(figsize=(15, 5))
    
    for i in range(10):
        # Original
        plt.subplot(3, 10, i+1)
        plt.imshow(original[i].cpu().squeeze(), cmap='gray')
        plt.axis('off')

        # Noisy
        plt.subplot(3, 10, i+11)
        plt.imshow(noisy[i].cpu().squeeze(), cmap='gray')
        plt.axis('off')

        # Reconstructed
        plt.subplot(3, 10, i+21)
        plt.imshow(reconstructed[i].detach().cpu().squeeze(), cmap='gray')
        plt.axis('off')

    plt.show()

show_denoising(images, noisy_images, output)

🔶 8. Expected Result

Your output visuals will show:

  • Top row → original digits

  • Middle row → noisy corrupted digits

  • Bottom row → cleaned images produced by the autoencoder

A Denoising Autoencoder can remove:

✔ Additive Gaussian noise
✔ Random pixel noise
✔ Light distortions


🔶 Section 6 Summary

You now have:

✔ Full theory of denoising autoencoders
✔ Noise injection pipeline
✔ Complete PyTorch implementation
✔ Training + visualization
✔ Reconstructed clean images

This model is extremely useful and forms the basis for:

  • Deepfake cleaners

  • Speech denoisers

  • Image restoration tools

  • Medical image cleanup

  • Real-world preprocessing pipelines



📘 Section 7: Types of Autoencoders

Autoencoders come in several variants, each designed to improve reconstruction quality, generalization, or latent space structure. Below are the most important types used in modern AI.


7.1 Undercomplete Autoencoder

Definition

An autoencoder where the latent space (bottleneck) has fewer dimensions than the input.

Why it exists

To force the model to learn the most important features, not memorize.

Use Cases

  • Feature extraction

  • Dimensionality reduction (PCA alternative)

  • Noise removal

Diagram (conceptually)

Input → Encoder → Small bottleneck → Decoder → Output


7.2 Overcomplete Autoencoder

Definition

Latent space has more dimensions than the input.

Why it exists

For tasks where richer representations are needed.

Risk

The model may memorize the data.

Fix

Use regularization:

  • Sparse autoencoder

  • Denoising autoencoder

  • Contractive autoencoder


7.3 Sparse Autoencoder

Definition

Uses a sparsity constraint (such as L1 regularization or a KL-divergence penalty) so that only a small number of neurons activate at once.

Why it exists

Mimics biological neurons → leads to features like edge detection.

Use Cases

  • Extracting meaningful features

  • Pretraining deep networks

  • Speech feature extraction
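
To make this concrete, here is a minimal sketch of the L1 variant; it plugs into the Section 5 training loop (model, images, and criterion as defined there), and sparsity_weight is a hypothetical hyperparameter:

sparsity_weight = 1e-4  # hypothetical value; tune per task

outputs, latent = model(images)
recon_loss = criterion(outputs, images)

# L1 penalty pushes most latent activations toward zero,
# so only a few neurons stay active for each input
l1_penalty = latent.abs().mean()

loss = recon_loss + sparsity_weight * l1_penalty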


7.4 Denoising Autoencoder (DAE)

Definition

The model removes noise:
Noisy Input → Autoencoder → Clean Output

Why it exists

To create a robust encoder that can recover from corrupted data.

Use Cases

  • Noise removal in images

  • Improving robustness

  • Pretraining deep networks



7.5 Contractive Autoencoder (CAE)

Definition

Adds a penalty on the encoder gradients to ensure the mapping is stable.

Why it exists

To make the representation less sensitive to small changes in input.

Use Cases

  • Semantic feature extraction

  • Smooth latent space learning
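
A minimal sketch of the penalty, assuming the Section 5 model and training loop: it adds the squared Frobenius norm of the encoder Jacobian, computed one latent unit at a time with autograd, and contractive_weight is a hypothetical hyperparameter:

contractive_weight = 1e-4  # hypothetical value; tune per task

x = images.view(-1, 784).requires_grad_(True)
z = model.encoder(x)

# Summing squared input-gradients over all latent units gives the
# squared Frobenius norm of the encoder Jacobian (summed over the batch)
jacobian_penalty = 0.0
for j in range(z.size(1)):
    grads = torch.autograd.grad(z[:, j].sum(), x,
                                create_graph=True, retain_graph=True)[0]
    jacobian_penalty = jacobian_penalty + (grads ** 2).sum()

recon = model.decoder(z).view(-1, 1, 28, 28)
loss = criterion(recon, images) + contractive_weight * jacobian_penalty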


7.6 Variational Autoencoder (VAE)

Definition

A probabilistic autoencoder that learns a distribution of latent variables instead of fixed values.

Why it exists

For generating new images, not just reconstructing old ones.

Use Cases

  • Image generation

  • Synthetic dataset creation

  • Style transfer

  • Anomaly detection
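
To make this concrete, here is a minimal sketch of the reparameterization trick at the heart of a VAE; the layer sizes mirror Section 5, the class name is our own, and the usual torch / torch.nn imports are assumed:

class VAEEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(784, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, 16)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(256, 16)  # log-variance of q(z|x)

    def forward(self, x):
        h = self.body(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        eps = torch.randn_like(mu)              # noise ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * eps  # differentiable sample
        return z, mu, logvar

# The VAE loss adds a KL term to the reconstruction loss:
# kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())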


7.7 Convolutional Autoencoder

Definition

Uses Conv2D layers instead of dense layers → works well for images.

Why it exists

Captures spatial structure.

Use Cases

  • Image compression

  • Image denoising

  • Feature extraction from images
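
Here is a minimal convolutional autoencoder sketch for 28×28 MNIST images; the channel counts and strides are our own choices:

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 28x28 -> 14x14
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),  # 7x7 -> 14x14
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),   # 14x14 -> 28x28
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))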


7.8 Sequence Autoencoder

Definition

Uses LSTM/GRU for sequence-to-sequence autoencoding.

Why it exists

To handle sequential data.

Use Cases

  • Text embedding

  • Speech compression

  • Time-series anomaly detection
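
A minimal sketch, treating each 28×28 image as a sequence of 28 rows; the hidden size and the "repeat the latent code" decoding scheme are our own illustrative choices:

class SeqAutoencoder(nn.Module):
    def __init__(self, input_size=28, hidden_size=64, seq_len=28):
        super().__init__()
        self.seq_len = seq_len
        self.encoder = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, input_size)

    def forward(self, x):                       # x: (batch, seq_len, input_size)
        _, (h, _) = self.encoder(x)             # h: (1, batch, hidden_size)
        z = h[-1]                               # one latent code per sequence
        dec_in = z.unsqueeze(1).repeat(1, self.seq_len, 1)  # feed it at every step
        dec_out, _ = self.decoder(dec_in)
        return self.out(dec_out)                # reconstructed sequence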


7.9 Multimodal Autoencoders

Definition

Autoencoders that process multiple data types (e.g., image + text).

Use Cases

  • Vision + Language models

  • Cross-modal retrieval

  • Multimodal representation learning


✔ Summary Table

| Autoencoder Type | Key Idea                   | Best For                 |
| ---------------- | -------------------------- | ------------------------ |
| Undercomplete    | Latent space < input       | Compact feature learning |
| Overcomplete     | Latent space > input       | Rich representations     |
| Sparse           | Few active neurons         | Meaningful features      |
| Denoising        | Recover from noise         | Image denoising          |
| Contractive      | Stable mapping             | Smooth latent space      |
| Convolutional    | Conv layers                | Image tasks              |
| VAE              | Probabilistic latent space | Generative models        |
| Sequence         | LSTM-based                 | NLP, time series         |
| Multimodal       | Multi-input                | Vision–Language          |

📘 Section 8: Building an Autoencoder — Step-by-Step (Conceptual Walkthrough)

In this section, we break down exactly how an autoencoder is built, from data preparation to training. This gives you a clear roadmap before jumping into the code.

Autoencoders follow a simple pipeline:
Input → Encoder → Latent Space → Decoder → Output (Reconstruction)

Here is the complete conceptual workflow:


8.1 Step 1 — Choose Your Dataset

Autoencoders work best on:

  • Images (MNIST, CIFAR-10, Fashion MNIST)

  • Tabular data

  • Text (for sequence autoencoders)

  • Time-series data

For our running example, we use:
👉 the MNIST Handwritten Digits dataset
because it is simple, clean, and widely used for autoencoder demos.


8.2 Step 2 — Normalize the Data

Autoencoders are sensitive to scale.

For image datasets (0–255 pixel values):

x = x / 255.0

Why normalization?

  • Faster training

  • More stable gradients

  • Better reconstruction quality


8.3 Step 3 — Define the Encoder Architecture

The encoder compresses data into a small representation.

Typical choices:

  • Dense layers (for simple examples)

  • Conv2D layers (for image autoencoders)

Example encoder (conceptually):

Input (28×28)
→ Dense(128)
→ Dense(64)
→ Dense(32)  ← latent dimension

Key rule:
Each layer gets smaller, funneling down into the bottleneck.


8.4 Step 4 — Define the Latent Space (Bottleneck)

This is the heart of the autoencoder.

Latent space determines:

  • How much the model compresses

  • What features it learns

  • How well it can reconstruct inputs

Typically:

  • 16, 32, 64 dimensions

  • Much smaller than input (784 for MNIST)

Good rule of thumb:
👉 Start with 16 or 32 latent dims for MNIST.


8.5 Step 5 — Define the Decoder Architecture

The decoder mirrors the encoder.

Conceptually:

Latent vector (32)
→ Dense(64)
→ Dense(128)
→ Dense(784) → Reshape to 28×28

Key idea:
The decoder gradually expands compressed data back to original shape.



8.6 Step 6 — Choose a Loss Function

For autoencoders, the most common is:

Mean Squared Error (MSE)

Measures pixel-by-pixel difference:

MSE = mean((original - reconstructed)^2)

Ideal when:

  • Output is continuous

  • Data is normalized

Alternative losses:

  • Binary Cross-Entropy (BCE)

  • MAE (L1 loss)

  • SSIM (for better image quality)
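
In PyTorch, the two most common choices are one-liners. A minimal sketch with placeholder tensors (both in [0, 1], as a Sigmoid output layer would produce):

import torch
import torch.nn as nn

original = torch.rand(8, 1, 28, 28)       # placeholder target batch
reconstructed = torch.rand(8, 1, 28, 28)  # placeholder model output

mse = nn.MSELoss()(reconstructed, original)  # pixel-wise squared error
bce = nn.BCELoss()(reconstructed, original)  # per-pixel Bernoulli loss
print(mse.item(), bce.item())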


8.7 Step 7 — Choose Optimizer

Common choices:

  • Adam (best for beginners)

  • RMSprop

  • SGD (slower but stable)

Recommended (in PyTorch, matching the earlier sections):

optimizer = optim.Adam(model.parameters(), lr=0.001)

8.8 Step 8 — Train the Autoencoder

Key hyperparameters:

  • Batch size: 32–128

  • Epochs: 20–50

  • Validation split: 10–20%

Training goal:
Minimize reconstruction loss
Learn essential features automatically


8.9 Step 9 — Evaluate Reconstruction Quality

You should check:

  • Loss value

  • Reconstructed images

  • Latent space distribution

  • Underfitting or overfitting

For images:
👉 Visual comparison is the best evaluator.


8.10 Step 10 — Use the Autoencoder for Applications

After training, autoencoders can be used for:

  • Denoising images

  • Dimensionality reduction

  • Anomaly detection (compare reconstruction error)

  • Feature extraction

  • Image compression

This is where the model becomes practical.
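
For example, anomaly detection via reconstruction error can be sketched as follows, assuming a trained model like the one from Section 5 and a batch of images; the threshold heuristic is our own illustrative choice:

model.eval()
with torch.no_grad():
    reconstructed, _ = model(images)

# Mean squared error per image, averaged over channel and pixels
errors = ((reconstructed - images) ** 2).mean(dim=(1, 2, 3))

# Flag images whose error is far above the batch average
threshold = errors.mean() + 2 * errors.std()
anomalies = errors > threshold
print(f"Flagged {anomalies.sum().item()} of {len(errors)} images as anomalous")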


✔ Section 8 Summary

| Step | Description                   |
| ---- | ----------------------------- |
| 1    | Select dataset                |
| 2    | Normalize data                |
| 3    | Build encoder                 |
| 4    | Define latent space           |
| 5    | Build decoder                 |
| 6    | Choose loss                   |
| 7    | Choose optimizer              |
| 8    | Train model                   |
| 9    | Evaluate reconstruction       |
| 10   | Apply the trained autoencoder |

