Autoencoders Explained: A Complete Guide - Part II
Building your first autoencoder, denoising autoencoders, and the main types of autoencoders.
📘 Section 5: Building Your First Autoencoder in PyTorch (Full Code + Explanation)
Now that we understand the theory and math, it’s time to build a real autoencoder using PyTorch.
In this section, we’ll walk step-by-step through:
✔ Preparing the dataset
✔ Writing the Autoencoder class
✔ Training the model
✔ Evaluating reconstructions
✔ Visualizing outputs
This is your first working autoencoder, and it becomes the foundation for all advanced versions (Denoising AE, VAE, CVAE, etc.).
🔶 1. Import Dependencies
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
- torch.nn → building the model
- torch.optim → optimizers
- torchvision.datasets → MNIST dataset
- matplotlib → visualization
🔶 2. Preparing the MNIST Dataset
We use 28×28 grayscale handwritten digits, perfect for beginners.
transform = transforms.Compose([
transforms.ToTensor()
])
train_data = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_data = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
train_loader = DataLoader(train_data, batch_size=128, shuffle=True)
test_loader = DataLoader(test_data, batch_size=128, shuffle=False)
✔ Pixel values converted to tensors in [0, 1] (ToTensor does this scaling)
✔ No extra normalization needed; the [0, 1] range already matches the Sigmoid output
🔶 3. Define the Autoencoder Class
Here’s a simple fully connected (dense) autoencoder:
Architecture:
- Input: 784 (flattened 28×28)
- Hidden: 256 → 64 → bottleneck = 16
- Decoder: mirror of the encoder
class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        # Encoder
        self.encoder = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 16)  # bottleneck
        )
        # Decoder
        self.decoder = nn.Sequential(
            nn.Linear(16, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            nn.Sigmoid()  # outputs in [0, 1]
        )

    def forward(self, x):
        x = x.view(-1, 784)            # flatten
        z = self.encoder(x)
        out = self.decoder(z)
        out = out.view(-1, 1, 28, 28)  # reshape back
        return out, z
✔ encoder() compresses input
✔ decoder() reconstructs input
✔ Sigmoid output suits image reconstruction
🔶 4. Initialize Model, Loss Function & Optimizer
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = Autoencoder().to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
- MSE works well for pixel reconstruction
- Adam is stable and efficient
🔶 5. Training Loop
num_epochs = 10
for epoch in range(num_epochs):
    total_loss = 0
    for images, _ in train_loader:
        images = images.to(device)
        optimizer.zero_grad()
        outputs, latent = model(images)
        loss = criterion(outputs, images)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(train_loader):.4f}")
✔ Forward pass
✔ Compute reconstruction loss
✔ Backpropagation
✔ Parameter update
🔶 6. Reconstruct Images (Qualitative Evaluation)
def show_images(original, reconstructed, n=10):
    plt.figure(figsize=(15, 4))
    for i in range(n):
        # Original
        plt.subplot(2, n, i+1)
        plt.imshow(original[i].squeeze().cpu().numpy(), cmap='gray')
        plt.axis('off')
        # Reconstructed
        plt.subplot(2, n, i+1+n)
        plt.imshow(reconstructed[i].squeeze().cpu().detach().numpy(), cmap='gray')
        plt.axis('off')
    plt.show()
Now test:
test_images, _ = next(iter(test_loader))
test_images = test_images.to(device)
with torch.no_grad():
    reconstructed, _ = model(test_images)
show_images(test_images, reconstructed)
🔶 7. Inspecting the Latent Space (Optional)
The latent vector (size 16) is accessible via:
_, latent_vectors = model(test_images)
print(latent_vectors.shape)
Output:
torch.Size([128, 16])
Each image is now compressed from 784 → 16 dimensions.
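A useful follow-up experiment is to decode points you construct yourself in latent space. The sketch below interpolates between two latent codes and decodes the blends; for illustration it builds a fresh (untrained) decoder with the same layer sizes as our model, so in practice you would call `model.decoder` from the trained network instead.

```python
import torch
import torch.nn as nn

# Stand-in decoder with the tutorial's layer sizes (untrained, illustration only)
decoder = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Sigmoid(),
)

z1, z2 = torch.randn(16), torch.randn(16)
alphas = torch.linspace(0, 1, steps=5)
# Linear interpolation between the two latent codes
blends = torch.stack([(1 - a) * z1 + a * z2 for a in alphas])
images = decoder(blends).view(-1, 1, 28, 28)
print(images.shape)  # torch.Size([5, 1, 28, 28])
```

With a trained model, the decoded frames morph smoothly from one digit into the other, which is a quick visual check that the latent space is meaningful.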
🔶 8. Example Reconstruction Results
You will typically see:
✔ Blurry but recognizable digit reconstructions
✔ Clear retention of the original shapes
✔ Some smoothing of fine detail, simply because the bottleneck discards it
This confirms that the autoencoder has learned the essential features.
🔶 Section 5 Summary
In this section, we:
✔ Loaded MNIST
✔ Built a fully connected autoencoder
✔ Defined encoder + decoder architecture
✔ Trained it for 10 epochs
✔ Reconstructed test images
✔ Saw latent vectors representing compressed data
This is the foundational model upon which all advanced autoencoders are built.
📘 Section 6: Building a Denoising Autoencoder (Theory + Full PyTorch Code)
So far, we have built a vanilla autoencoder that learns to reconstruct images.
But real-world data is often corrupted, noisy, or incomplete.
To handle this, researchers introduced a powerful variation:
🔶 Denoising Autoencoder (DAE)
A model trained to remove noise and recover original clean data.
This turns the autoencoder into a robust feature extractor.
🔶 1. What Is a Denoising Autoencoder?
A Denoising Autoencoder works like this:
1. Start with clean input \( x \)
2. Add noise → corrupted input \( \tilde{x} \)
3. Feed \( \tilde{x} \) into the encoder
4. The decoder reconstructs clean output \( \hat{x} \approx x \)
Formally:
\[
\tilde{x} = x + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^2)
\]
\[
\hat{x} = \text{Decoder}(\text{Encoder}(\tilde{x}))
\]
The loss is still Mean Squared Error (MSE):
\[
\mathcal{L} = \| x - \hat{x} \|^2
\]
✔ Why is the DAE powerful?
- Learns robust, noise-invariant features
- Avoids the trivial identity function
- Generalizes better than a vanilla AE
🔶 2. Adding Noise to Images
The most common approach:

Gaussian noise
\[
\tilde{x} = x + \epsilon, \quad \epsilon \sim \mathcal{N}(0, 0.1^2)
\]

Salt & pepper noise
Random pixels forced to black or white.

In this tutorial, we use Gaussian noise.
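If you want to try salt & pepper corruption instead, a minimal sketch could look like this (the `add_salt_pepper` name and the 10% corruption amount are my own illustrative choices):

```python
import torch

def add_salt_pepper(x: torch.Tensor, amount: float = 0.05) -> torch.Tensor:
    """Corrupt a [0, 1] image tensor: `amount` of the pixels become pure black or white."""
    noisy = x.clone()
    mask = torch.rand_like(x)
    noisy[mask < amount / 2] = 0.0       # pepper (black)
    noisy[mask > 1 - amount / 2] = 1.0   # salt (white)
    return noisy

img = torch.rand(1, 28, 28)
noisy = add_salt_pepper(img, amount=0.1)
print(noisy.shape)  # torch.Size([1, 28, 28])
```

The same function could be dropped into a transform pipeline in place of AddGaussianNoise below.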
🔶 3. Dataset with Added Noise
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
# Add Gaussian noise, then clamp back to the valid [0, 1] range
class AddGaussianNoise(object):
    def __init__(self, mean=0., std=0.2):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):
        noise = torch.randn(tensor.size()) * self.std + self.mean
        return torch.clamp(tensor + noise, 0., 1.)
Training input = noisy image, training target = clean image:
train_transform = transforms.Compose([
    transforms.ToTensor(),
    AddGaussianNoise(0., 0.3)
])
clean_transform = transforms.ToTensor()

train_data_noisy = datasets.MNIST(root='./data', train=True, download=True, transform=train_transform)
train_data_clean = datasets.MNIST(root='./data', train=True, download=True, transform=clean_transform)

# Each element of the zipped dataset is ((noisy_img, label), (clean_img, label))
train_loader = DataLoader(list(zip(train_data_noisy, train_data_clean)), batch_size=128, shuffle=True)
🔶 4. Denoising Autoencoder Architecture
We’ll reuse the same autoencoder structure from Section 5.
class DenoisingAutoencoder(nn.Module):
    def __init__(self):
        super(DenoisingAutoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 16),
        )
        self.decoder = nn.Sequential(
            nn.Linear(16, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            nn.Sigmoid()
        )

    def forward(self, x):
        x = x.view(-1, 784)
        z = self.encoder(x)
        out = self.decoder(z)
        return out.view(-1, 1, 28, 28)
🔶 5. Training the Denoising Autoencoder
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = DenoisingAutoencoder().to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
num_epochs = 10
for epoch in range(num_epochs):
    total_loss = 0
    # Each batch from the zipped dataset arrives as ((noisy, labels), (clean, labels))
    for (noisy_imgs, _), (clean_imgs, _) in train_loader:
        noisy_imgs = noisy_imgs.to(device)
        clean_imgs = clean_imgs.to(device)
        optimizer.zero_grad()
        reconstructed = model(noisy_imgs)
        loss = criterion(reconstructed, clean_imgs)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(train_loader):.4f}")
✔ The model learns to map noisy → clean
✔ Loss steadily decreases over epochs
🔶 6. Testing the Denoising Model
test_data = datasets.MNIST(root='./data', train=False, download=True,
                           transform=transforms.ToTensor())
test_loader = DataLoader(test_data, batch_size=10, shuffle=True)

# Add noise manually for testing
noise = AddGaussianNoise(0., 0.3)
images, _ = next(iter(test_loader))
noisy_images = noise(images)

images = images.to(device)
noisy_images = noisy_images.to(device)
with torch.no_grad():
    output = model(noisy_images)
🔶 7. Visualizing Results
def show_denoising(original, noisy, reconstructed):
    plt.figure(figsize=(15, 5))
    for i in range(10):
        # Original
        plt.subplot(3, 10, i+1)
        plt.imshow(original[i].cpu().squeeze(), cmap='gray')
        plt.axis('off')
        # Noisy
        plt.subplot(3, 10, i+11)
        plt.imshow(noisy[i].cpu().squeeze(), cmap='gray')
        plt.axis('off')
        # Reconstructed
        plt.subplot(3, 10, i+21)
        plt.imshow(reconstructed[i].detach().cpu().squeeze(), cmap='gray')
        plt.axis('off')
    plt.show()

show_denoising(images, noisy_images, output)
🔶 8. Expected Result
Your output visuals will show:
- Top row → original digits
- Middle row → noisy, corrupted digits
- Bottom row → cleaned images produced by the autoencoder
A Denoising Autoencoder trained this way can remove:
✔ Gaussian noise
✔ Random pixel corruption
✔ Light distortions
🔶 Section 6 Summary
You now have:
✔ Full theory of denoising autoencoders
✔ Noise injection pipeline
✔ Complete PyTorch implementation
✔ Training + visualization
✔ Reconstructed clean images
This model is extremely useful and forms the basis for:
- Deepfake cleaners
- Speech denoisers
- Image restoration tools
- Medical image cleanup
- Real-world preprocessing pipelines
📘 Section 7: Types of Autoencoders
Autoencoders come in several variants, each designed to improve reconstruction quality, generalization, or latent space structure. Below are the most important types used in modern AI.
7.1 Undercomplete Autoencoder
Definition
An autoencoder where the latent space (bottleneck) has fewer dimensions than the input.
Why it exists
To force the model to learn the most important features, not memorize.
Use Cases
- Feature extraction
- Dimensionality reduction (a nonlinear alternative to PCA)
- Noise removal
Diagram (conceptually)
Input → Encoder → Small bottleneck → Decoder → Output
7.2 Overcomplete Autoencoder
Definition
Latent space has more dimensions than the input.
Why it exists
For tasks where richer representations are needed.
Risk
The model may memorize the data.
Fix
Use regularization:
- Sparse autoencoder
- Denoising autoencoder
- Contractive autoencoder
7.3 Sparse Autoencoder
Definition
Uses sparsity constraint (like L1 regularization or KL divergence):
Only a small number of neurons activate at once.
Why it exists
Mimics biological neurons → leads to features like edge detection.
Use Cases
- Extracting meaningful features
- Pretraining deep networks
- Speech feature extraction
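As a sketch of how a sparsity constraint can be wired in, an L1 penalty on the latent activations is simply added to the reconstruction loss (the layer sizes and the 1e-3 penalty weight here are illustrative, not tuned values):

```python
import torch
import torch.nn as nn

# Tiny illustrative encoder/decoder pair
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())
decoder = nn.Sequential(nn.Linear(64, 784), nn.Sigmoid())

x = torch.rand(32, 784)
z = encoder(x)
recon = decoder(z)

mse = nn.functional.mse_loss(recon, x)
sparsity = 1e-3 * z.abs().mean()  # L1 penalty pushes activations toward zero
loss = mse + sparsity
loss.backward()
```

Increasing the penalty weight drives more latent units to zero for any given input, at the cost of reconstruction quality; the KL-divergence formulation mentioned above replaces the L1 term with a target average activation per unit.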
7.4 Denoising Autoencoder (DAE)
Definition
The model removes noise:
Noisy Input → Autoencoder → Clean Output
Why it exists
To create a robust encoder that can recover from corrupted data.
Use Cases
- Noise removal in images
- Improving robustness
- Pretraining deep networks
7.5 Contractive Autoencoder (CAE)
Definition
Adds a penalty on the encoder gradients to ensure the mapping is stable.
Why it exists
To make the representation less sensitive to small changes in input.
Use Cases
- Semantic feature extraction
- Smooth latent space learning
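The contractive penalty is the squared Frobenius norm of the encoder's Jacobian with respect to the input. A minimal autograd-based sketch follows (tiny illustrative sizes; real implementations often use the closed form available for a single sigmoid layer instead of looping):

```python
import torch
import torch.nn as nn

# Tiny illustrative encoder
encoder = nn.Sequential(nn.Linear(10, 4), nn.Sigmoid())

x = torch.rand(8, 10, requires_grad=True)
z = encoder(x)

# Sum of squared partial derivatives of each latent unit w.r.t. the input
penalty = torch.zeros(())
for j in range(z.shape[1]):
    grads = torch.autograd.grad(z[:, j].sum(), x, create_graph=True)[0]
    penalty = penalty + (grads ** 2).sum()
```

Adding `penalty` (scaled by a small weight) to the reconstruction loss penalizes encoders whose output changes sharply with small input perturbations, which is exactly the "stable mapping" property described above.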
7.6 Variational Autoencoder (VAE)
Definition
A probabilistic autoencoder that learns a distribution of latent variables instead of fixed values.
Why it exists
For generating new images, not just reconstructing old ones.
Use Cases
- Image generation
- Synthetic dataset creation
- Style transfer
- Anomaly detection
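The key mechanism of the VAE is the reparameterization trick: sample z = mu + sigma * eps so that gradients flow through mu and sigma. A minimal sketch (layer sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Two heads mapping encoder features to the latent distribution's parameters
fc_mu = nn.Linear(64, 16)
fc_logvar = nn.Linear(64, 16)

h = torch.randn(8, 64)                 # stand-in for encoder features
mu, logvar = fc_mu(h), fc_logvar(h)
std = torch.exp(0.5 * logvar)
z = mu + std * torch.randn_like(std)   # z ~ N(mu, std^2), differentiable w.r.t. mu and std

# KL divergence of N(mu, std^2) from the standard normal prior, averaged over the batch
kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
```

The VAE's training loss is the reconstruction term plus this KL term; sampling z from the prior and decoding it is what makes the model generative. A full VAE walkthrough belongs in its own section.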
7.7 Convolutional Autoencoder
Definition
Uses convolutional (Conv2d) layers instead of dense layers, which works well for images.
Why it exists
Captures spatial structure.
Use Cases
- Image compression
- Image denoising
- Feature extraction from images
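A minimal convolutional encoder/decoder pair for 28×28 images might look like this (channel counts are illustrative): strided convolutions downsample, and transposed convolutions upsample back to the input shape.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 28x28 -> 14x14
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 14x14 -> 7x7
    nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),  # 7x7 -> 14x14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),   # 14x14 -> 28x28
    nn.Sigmoid(),
)

x = torch.rand(4, 1, 28, 28)
out = decoder(encoder(x))
print(out.shape)  # torch.Size([4, 1, 28, 28])
```

Because convolutions share weights across spatial positions, this version usually reconstructs images noticeably better than the dense model of Section 5 at a similar parameter count.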
7.8 Sequence Autoencoder
Definition
Uses LSTM/GRU for sequence-to-sequence autoencoding.
Why it exists
To handle sequential data.
Use Cases
- Text embedding
- Speech compression
- Time-series anomaly detection
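A minimal sequence-autoencoder sketch (all sizes are illustrative): the encoder LSTM's final hidden state becomes the latent code, which is then repeated at every time step as input to the decoder LSTM.

```python
import torch
import torch.nn as nn

enc = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
dec = nn.LSTM(input_size=16, hidden_size=8, batch_first=True)

seq = torch.randn(4, 10, 8)               # (batch, time, features)
_, (h, _) = enc(seq)                      # h: (1, batch, 16) summarizes each sequence
z = h[-1]                                 # one latent vector per sequence
dec_in = z.unsqueeze(1).repeat(1, 10, 1)  # feed the latent code at every step
recon, _ = dec(dec_in)
print(recon.shape)  # torch.Size([4, 10, 8])
```

Training minimizes MSE between `recon` and `seq`; the per-sequence latent vector `z` doubles as a fixed-size sequence embedding.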
7.9 Multimodal Autoencoders
Definition
Autoencoders that process multiple data types (e.g., image + text).
Use Cases
- Vision + language models
- Cross-modal retrieval
- Multimodal representation learning
✔ Summary Table
| Autoencoder Type | Key Idea | Best For |
|---|---|---|
| Undercomplete | Latent space < input | Compact feature learning |
| Overcomplete | Latent space > input | Rich representations |
| Sparse | Few active neurons | Meaningful features |
| Denoising | Recover from noise | Image denoising |
| Contractive | Stable mapping | Smooth latent space |
| Convolutional | Conv layers | Image tasks |
| VAE | Probabilistic latent space | Generative models |
| Sequence | LSTM-based | NLP, time series |
| Multimodal | Multi-input | Vision–Language |
📘 Section 8: Building an Autoencoder — Step-by-Step (Conceptual Walkthrough)
In this section, we break down exactly how an autoencoder is built, from data preparation to training: a clear roadmap before jumping into code.
Autoencoders follow a simple pipeline:
Input → Encoder → Latent Space → Decoder → Output (Reconstruction)
Here is the complete conceptual workflow:
8.1 Step 1 — Choose Your Dataset
Autoencoders work best on:
- Images (MNIST, CIFAR-10, Fashion-MNIST)
- Tabular data
- Text (for sequence autoencoders)
- Time-series data
For our examples we use the MNIST handwritten digits dataset, because it is simple, clean, and widely used for autoencoder demos.
8.2 Step 2 — Normalize the Data
Autoencoders are sensitive to scale.
For image datasets (0–255 pixel values):
x = x / 255.0
Why normalization?
- Faster training
- More stable gradients
- Better reconstruction quality
8.3 Step 3 — Define the Encoder Architecture
The encoder compresses data into a small representation.
Typical choices:
- Dense layers (for simple examples)
- Convolutional (Conv2d) layers (for image autoencoders)
Example encoder (conceptually):
Input (28×28)
→ Dense(128)
→ Dense(64)
→ Dense(32) ← latent dimension
Key rule:
➡ Each layer gets smaller, funneling down into the bottleneck.
8.4 Step 4 — Define the Latent Space (Bottleneck)
This is the heart of the autoencoder.
The latent space determines:
- How much the model compresses
- What features it learns
- How well it can reconstruct inputs
Typical sizes:
- 16, 32, or 64 dimensions
- Much smaller than the input (784 for MNIST)
Good rule of thumb:
👉 Start with 16 or 32 latent dims for MNIST.
8.5 Step 5 — Define the Decoder Architecture
The decoder mirrors the encoder.
Conceptually:
Latent vector (32)
→ Dense(64)
→ Dense(128)
→ Dense(784) → Reshape to 28×28
Key idea:
➡ The decoder gradually expands compressed data back to original shape.
8.6 Step 6 — Choose a Loss Function
For autoencoders, the most common is:
Mean Squared Error (MSE)
Measures pixel-by-pixel difference:
MSE = mean((original - reconstructed)^2)
Ideal when:
- The output is continuous
- The data is normalized
Alternative losses:
- Binary Cross-Entropy (BCE)
- MAE (L1 loss)
- SSIM (for better perceptual image quality)
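In PyTorch these losses are one-liners. The sketch below evaluates MSE, L1, and BCE on random [0, 1] tensors standing in for targets and sigmoid outputs (SSIM is not in core PyTorch; libraries such as torchmetrics provide it):

```python
import torch
import torch.nn as nn

x = torch.rand(16, 784)      # targets in [0, 1]
x_hat = torch.rand(16, 784)  # stand-in for sigmoid outputs in [0, 1]

mse = nn.functional.mse_loss(x_hat, x)
mae = nn.functional.l1_loss(x_hat, x)
bce = nn.functional.binary_cross_entropy(x_hat, x)  # requires both tensors in [0, 1]
print(mse.item(), mae.item(), bce.item())
```

Note that BCE assumes the targets can be read as per-pixel probabilities, which works for near-binary images like MNIST.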
8.7 Step 7 — Choose Optimizer
Common choices:
- Adam (a strong default for beginners)
- RMSprop
- SGD (slower but stable)
Recommended (PyTorch):
optimizer = optim.Adam(model.parameters(), lr=0.001)
8.8 Step 8 — Train the Autoencoder
Key hyperparameters:
- Batch size: 32–128
- Epochs: 20–50
- Validation split: 10–20%
Training goal:
➡ Minimize reconstruction loss
➡ Learn essential features automatically
8.9 Step 9 — Evaluate Reconstruction Quality
You should check:
- The loss value
- Reconstructed images
- The latent space distribution
- Signs of underfitting or overfitting
For images:
👉 Visual comparison is the best evaluator.
8.10 Step 10 — Use the Autoencoder for Applications
After training, autoencoders can be used for:
- Denoising images
- Dimensionality reduction
- Anomaly detection (compare reconstruction error)
- Feature extraction
- Image compression
This is where the model becomes practical.
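As an example of the anomaly-detection use case, the sketch below scores each sample by its mean squared reconstruction error and flags outliers; the untrained stand-in model and the mean + 2·std threshold are illustrative assumptions, not a recipe.

```python
import torch
import torch.nn as nn

# Stand-in for a trained autoencoder (illustration only)
model = nn.Sequential(nn.Linear(784, 16), nn.ReLU(), nn.Linear(16, 784), nn.Sigmoid())

x = torch.rand(32, 784)
with torch.no_grad():
    recon = model(x)

errors = ((recon - x) ** 2).mean(dim=1)      # one reconstruction error per sample
threshold = errors.mean() + 2 * errors.std() # illustrative cutoff
anomalies = errors > threshold               # flag unusually poor reconstructions
print(anomalies.shape)  # torch.Size([32])
```

The intuition: an autoencoder trained only on normal data reconstructs normal samples well and anomalous samples poorly, so high reconstruction error is the anomaly signal.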
✔ Section 8 Summary
| Step | Description |
|---|---|
| 1 | Select dataset |
| 2 | Normalize data |
| 3 | Build encoder |
| 4 | Define latent space |
| 5 | Build decoder |
| 6 | Choose loss |
| 7 | Choose optimizer |
| 8 | Train model |
| 9 | Evaluate reconstruction |
| 10 | Apply the trained autoencoder |