
    ians projects

Old 4 Feb 2010, 10:39 PM EST #1
ian ryan
forum dweller

ian ryan's Avatar

join date: Apr 2007
Posts: 9,000
ian ryan is on a distinguished road

Watermark Removal Deep Learning CNN (Self-Supervised) via PyTorch


https://github.com/ianhenryryan/watermarkcnn

Machine Learning Engineer (aspiring): Ian H. Ryan
Version: 0.1
Timeline: May 22, 2025 - June 24, 2025

While building several image scrapers for the web, I noticed that a portion of the images I collected contained watermarks. This sparked the idea of creating a deep learning model for removing watermarks from images. I found a research paper on arXiv called A self-supervised CNN for image watermark removal and used it as a reference for this project. This is an Image Restoration (Regression) Deep Learning Model.

Environment


  • Make & Model: Alienware m15 R7
  • GPU: NVIDIA GeForce RTX 3060 Mobile (6 GB VRAM)
  • Secondary GPU: Integrated AMD Radeon Graphics
  • CPU: AMD Ryzen 7 6800H (16 threads @ 4.78 GHz)
  • RAM: 16 GB DDR5
  • CUDA Version: 12.8
  • Operating System: Pop!_OS 22.04 LTS
  • Kernel: 6.12.10-76061203-generic

Notebook Index


Table of Contents

  • Libraries & Imports
    • Libraries
    • Imports
  • CUDA
    • Check GPU Availability for CUDA
    • CUDA allocation limiting ~64mb
  • Seed
  • Dataset Description
  • CNN Model
    • Model Architecture
    • ColorAwareLoss Loss Function
    • Total Parameters
  • Data Generation (Self-Supervised)
    • Dataset Class
  • Hyperparameters
    • Batch Size, Epochs, Learning Rate, Weight Decay
  • Transform
  • Pathing
  • Data Loader
  • Criterion, Optimizer, & Scaler for AMP
  • Training CNN Model
  • Save Summary for Regression Task Report JSON
  • Log Training History
  • Save Model Weights or Save Model of Training
  • Test Accuracy of CNN Training
  • Visualizations
    • Metrics
      • Summary - Everything
      • Model Performance Function
      • Training & Validation Loss Plots
      • Confusion Matrices
      • Feature Maps
      • Kernel Visualizations
      • Gradient Visualizations
      • CAM / Grad-CAM
      • Training/Validation Curves
      • Explainability Tools
    • Visuals
      • Residual Histogram
      • Local PSNR/SSIM Maps
      • Side by side Comparison
      • Watermark Residual
      • Training Progress
      • Watermark Attention
      • Batch Processing
      • Learning Rate Visual
      • Train Loss, Peak Signal-to-Noise Ratio, Structural Similarity Index
      • Residual Error
      • Multi-Layer Activation
      • Feature Activation Maps (Decoder Focus)
      • Kernel/Visualizations
      • Gradient Visualization
  • Literature Cited
  • Environment
  • Recommended Resources
  • Permission

Dataset Description


The model uses self-supervised pairing and watermark synthesis, so it does not rely on finding a large set of paired images with and without watermarks.

The training data was scraped from various internet sources while I was building image scrapers. For that reason, I will not be providing downloads of the data.
  • 1000 images: 996 JPG and 4 PNG files.
If you build your own dataset, aim for one that is diverse in textures, colors, brightness, edges, and backgrounds.

If you need datasets consider checking out: Kaggle | Roboflow | COCO


Heterogeneous U-Net CNN Architecture


  • Encoder-Decoder Backbone - multi-resolution feature extraction.
  • DoubleConv Blocks - ReLU and LeakyReLU to capture diverse activations.
  • Attention Gates - at each skip connection for feature relevance gating.
  • Learnable Upsampling (Transpose Convs) - avoids the interpolation artifacts of bilinear upsampling.
  • Self-Attention Bottleneck - global spatial context.
  • Perceptual Feature Extractor (VGG16) - texture-aware loss.

Model Diagram


Loss Function


There are multiple components to the loss function.

1) Masked L1 Loss

Penalizes per pixel differences between predicted & ground truth images, optionally focusing on the masked region (where the watermark is located).

Formula:
L1_masked = (1/N) * Σ_i M_i * |P_i − T_i|   (sum over all pixels i)

Where:
- P_i: Predicted Pixel
- T_i: Target Pixel
- M_i: Binary Mask (1=focus, 0=ignore)
- N: Sum of Mask Values (number of non-zero mask pixels)
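
The masked L1 formula above can be sketched in plain Python. This is only an illustration on flat lists of pixel values, not the notebook's PyTorch code; the function name `masked_l1` and the small `eps` guard against an empty mask are assumptions.

```python
def masked_l1(pred, target, mask, eps=1e-8):
    """Mean absolute error over the pixels selected by a binary mask.

    pred, target and mask are flat sequences of equal length.
    eps (an assumption, not from the post) avoids dividing by zero
    when the mask selects no pixels.
    """
    if not (len(pred) == len(target) == len(mask)):
        raise ValueError("pred, target and mask must have the same length")
    # Numerator: sum of M_i * |P_i - T_i| over all pixels.
    weighted_error = sum(m * abs(p - t) for p, t, m in zip(pred, target, mask))
    # N: number of pixels where the mask is 1.
    n = sum(mask)
    return weighted_error / (n + eps)


pred = [0.2, 0.9, 0.5, 0.1]
target = [0.2, 0.5, 0.1, 0.9]
mask = [0, 1, 1, 0]  # only the two middle pixels are "watermarked"
print(round(masked_l1(pred, target, mask), 4))
```

The last pixel has a large error but is ignored because its mask value is 0, which is the point of restricting the loss to the watermark region.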

2) LAB Color Loss

Transforms images to LAB color space & weights L-channel more (70%) to prioritize luminance, which human eyes are more sensitive to.

Formula:
LAB_loss = 0.7 * L1(L_p, L_t) + 0.3 * L1(AB_p, AB_t)

Where:
- L_p, L_t: Luminance channels (predicted, target)
- AB_p, AB_t: Chrominance channels (predicted, target)

3) Perceptual Loss (VGG16)

Uses pre-trained VGG16 to extract deep features from intermediate layers and compares their activations.

Formula:
Perceptual_loss = L1(ϕ(P),ϕ(T))

Where:
- ϕ: Feature extractor from pretrained VGG16
- P,T: Predicted and target images

4) SSIM (Structural Similarity Index)

Captures structural similarity of images over patches.

Formula:
SSIM_loss = (1 - SSIM(P,T))

SSIM compares means, variances, and covariances of image patches. It’s luminance-driven and more perceptually aligned than L1.
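
To make the "means, variances, and covariances" description concrete, here is a simplified single-window SSIM in plain Python. Real SSIM implementations average this quantity over many local windows; treating the whole patch as one window, the [0, 1] intensity range, and the names `global_ssim` and `ssim_loss` are assumptions for illustration. The constants 0.01 and 0.03 are the ones conventionally used in the SSIM definition.

```python
def global_ssim(x, y, data_range=1.0):
    """SSIM computed over one window (the whole patch), for flat pixel lists."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilising constants from the standard SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(x, y):
    """SSIM_loss = 1 - SSIM(P, T)."""
    return 1.0 - global_ssim(x, y)


patch = [0.1, 0.9, 0.1, 0.9]
inverted = [0.9, 0.1, 0.9, 0.1]
print(ssim_loss(patch, patch))     # near 0: identical structure
print(ssim_loss(patch, inverted))  # large: structure is reversed
```

Both patches have the same mean and variance, so a plain comparison of statistics would not separate them; the covariance term is what makes the inverted patch score poorly.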

5) Laplacian Edge Loss

Applies a Laplacian kernel to both predicted and target images to extract edges, then uses L1 on the resulting edge maps.

Formula:
Edge_loss = L1((∇^2)P,(∇^2)T)

Where:
- ∇^2 is the Laplacian operator.
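
A small plain-Python sketch of the edge loss, assuming the common 4-neighbour Laplacian kernel and no padding (the post does not say which kernel or padding the notebook uses). Images are 2D lists of grayscale values; `laplacian` and `edge_loss` are hypothetical names.

```python
# 4-neighbour Laplacian kernel (an assumed choice; 8-neighbour variants exist).
LAPLACIAN_KERNEL = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian(img):
    """Apply the 3x3 kernel without padding; output is 2 smaller per axis."""
    height, width = len(img), len(img[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += LAPLACIAN_KERNEL[dr + 1][dc + 1] * img[r + dr][c + dc]
            row.append(acc)
        out.append(row)
    return out


def edge_loss(pred, target):
    """L1 distance between the Laplacian edge maps of two images."""
    lap_p, lap_t = laplacian(pred), laplacian(target)
    diffs = [abs(a - b) for rp, rt in zip(lap_p, lap_t) for a, b in zip(rp, rt)]
    return sum(diffs) / len(diffs)


edge_img = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # vertical edge
flat_img = [[0.5, 0.5, 0.5, 0.5] for _ in range(4)]  # no edges at all
print(edge_loss(edge_img, edge_img))  # identical edge maps
print(edge_loss(flat_img, edge_img))  # the blurred-out edge is penalised
```

A prediction that smooths an edge away has a similar average brightness to the target, so plain L1 barely notices; the Laplacian term penalises the missing edge directly.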

Total Loss


Formula:
Total Loss = L1 + α * Perceptual + β * LAB + γ * SSIM + δ * Laplacian

Weights:
- α=0.3
- β=0.3
- γ=0.4
- δ=0.1
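
As a worked example of the weighted sum, the sketch below combines five already-computed component values with the weights listed above. The component values are made up for illustration, and `LossWeights` and `total_loss` are hypothetical names rather than the notebook's `ColorAwareLoss` implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    alpha: float = 0.3  # perceptual
    beta: float = 0.3   # LAB color
    gamma: float = 0.4  # SSIM
    delta: float = 0.1  # Laplacian edge


def total_loss(l1, perceptual, lab, ssim, laplacian, w=LossWeights()):
    """Total = L1 + alpha*Perceptual + beta*LAB + gamma*SSIM + delta*Laplacian."""
    return (l1
            + w.alpha * perceptual
            + w.beta * lab
            + w.gamma * ssim
            + w.delta * laplacian)


# Made-up component values, only to show the arithmetic.
example = total_loss(l1=0.05, perceptual=1.2, lab=0.4, ssim=0.3, laplacian=0.6)
print(round(example, 4))
```

Note that the L1 term has an implicit weight of 1, so the other weights are relative to it.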

Loss Diagram


Total Parameters


Total Parameters: 5383366
Trainable Parameters: 3647878

Only 3,647,878 parameters are trainable because the remaining 1,735,488 belong to the VGG16 feature extractor used for the perceptual loss, which is not trained.
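
The arithmetic behind these counts can be checked directly. The post does not say which VGG16 layers are used; the sketch below only observes that 1,735,488 matches the parameter count of VGG16's first seven 3x3 convolution layers (through the third block), so treating that as the frozen slice is an assumption.

```python
def conv3x3_params(in_channels, out_channels):
    """Weights (3*3 per in/out channel pair) plus one bias per output channel."""
    return in_channels * out_channels * 9 + out_channels


# (in_channels, out_channels) of the VGG16 conv layers in blocks 1 to 3.
# Assumption: this is the slice used for the perceptual loss.
VGG16_BLOCKS_1_TO_3 = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
]

vgg_params = sum(conv3x3_params(i, o) for i, o in VGG16_BLOCKS_1_TO_3)

total_params = 5383366      # from the post
trainable_params = 3647878  # from the post
non_trainable = total_params - trainable_params

print(non_trainable, vgg_params)
```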

Data Generation (Self-Supervised)


This section contains: Watermark Generation, Watermark Application, Generate Pair, Data Augmentation, & Watermark Dataset Class.
The model learns from synthetic watermark pairs generated from the 1000 clean images in the dataset.

Watermark Generation

Creates a diverse variety of synthetic watermarks.
  • Watermark text is drawn from 25 multilingual phrases, covering languages such as English, Spanish, French, German, Chinese, Japanese, Korean, Russian, Arabic, and Hindi.
  • The watermarks vary in size, rotation, opacity, and position, and some are partially occluded.
  • A small share of the generated watermarks are lines, rectangles, or circles rather than text, for more diversity.
  • Distortions: post-processing with image filters to replicate low-quality scans or camera captures.

Semi-Random Alpha Blending (Applying Watermark)

The watermark is integrated into the clean image via a spatially varying alpha mask:
  • Random opacity levels across image.
  • Occasional brightness & contrast adjustments.
  • Binary supervision mask generated simultaneously to mark affected regions.
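
The blending step can be sketched as per-pixel alpha compositing. This is a simplified grayscale illustration on flat lists, not the notebook's code; `random_alpha`, `blend_watermark`, and the 0.2 to 0.7 opacity range are assumptions.

```python
import random


def random_alpha(shape_mask, low=0.2, high=0.7, rng=None):
    """Draw a different opacity for every pixel covered by the watermark shape.

    shape_mask is 1 where the watermark is drawn and 0 elsewhere.
    The opacity range is a hypothetical choice.
    """
    rng = rng or random.Random()
    return [rng.uniform(low, high) if s else 0.0 for s in shape_mask]


def blend_watermark(clean, watermark, alpha):
    """out = (1 - a) * clean + a * watermark, plus the binary supervision mask."""
    blended = [(1 - a) * c + a * w for c, w, a in zip(clean, watermark, alpha)]
    mask = [1 if a > 0 else 0 for a in alpha]
    return blended, mask


clean = [0.3, 0.3, 0.8, 0.8]
watermark = [1.0, 1.0, 1.0, 1.0]  # white overlay
shape_mask = [0, 1, 1, 0]         # watermark covers the two middle pixels
alpha = random_alpha(shape_mask, rng=random.Random(0))
blended, mask = blend_watermark(clean, watermark, alpha)
print(blended, mask)
```

Because the mask is derived from the same alpha values used for blending, it marks exactly the pixels the watermark changed.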

Generate Pair (Pairwise Training)

Each training sample is a triplet:
  • wm_img1: The input with a synthetic watermark
  • wm_img2: A second watermarked version (reference)
  • mask: The region where the watermark is present (used for masked L1 and Laplacian loss)
This mirrors the core idea of the referenced research paper: watermark removal is learned in the absence of clean ground truth, with the learning signal coming from different but related watermarked views of the same clean image.

Data Augmentation Pipeline (Albumentations)

Goal of enhancing robustness & domain generalization:
  • Geometric: crop, rotate, flip
  • Photometric: jitter, grayscale, gamma
  • Noise/Artifacts: blur, compression, rain, shadows, pixel dropout

Hyperparameters


  • batch_size = 8
  • epochs = 25
  • learning_rate = 0.001
  • weight_decay = 1e-5

Transform


    Create the transform variable and set the image size to 256 pixels.
    transform = get_augmentations(image_size=256)

    Pathing


    data/clean_images/.jpg & data/clean_images/.png
    convert clean image data to RGB

    split %:
    train: 70%
    validate: 20%
    test: 10%

    Train: 699, Val: 200, Test: 100

    Loss, Optimizer, GradScaler for AMP & Learning Rate Scheduler


    criterion = ColorAwareLoss(alpha=0.3, beta=0.3, gamma=0.4, delta=0.1).to(device)

    Perceptual weight (alpha) = 0.3
    Color weight (beta) = 0.3
    SSIM weight (gamma) = 0.4
    Laplacian weight (delta) = 0.1

    AdamW Optimizer Algorithm


    optimizer = torch.optim.AdamW(net.parameters(), lr=learning_rate, weight_decay=weight_decay)

    The model is trained with the AdamW optimization algorithm. Unlike Adam with L2 regularization, which adds the decay term to the gradients, AdamW applies weight decay directly to the parameters. Decoupling the decay from the gradient update tends to improve generalization and training stability.
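
The difference between decoupled decay and L2-in-the-gradient is easiest to see on a single scalar parameter. The sketch below is a hand-written illustration of the two update rules, not PyTorch's implementation; the function names are hypothetical, and the default `lr` and `wd` simply reuse the hyperparameters listed earlier.

```python
import math


def _adam_update(param, grad, m, v, t, lr, b1, b2, eps):
    """Shared Adam moment update and bias-corrected step (t counts from 1)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


def adamw_step(param, grad, m, v, t, lr=1e-3, wd=1e-5,
               b1=0.9, b2=0.999, eps=1e-8):
    """AdamW: shrink the parameter directly, then do a plain Adam step."""
    param = param * (1 - lr * wd)
    return _adam_update(param, grad, m, v, t, lr, b1, b2, eps)


def adam_l2_step(param, grad, m, v, t, lr=1e-3, wd=1e-5,
                 b1=0.9, b2=0.999, eps=1e-8):
    """Adam with L2: the decay term is folded into the gradient first."""
    grad = grad + wd * param
    return _adam_update(param, grad, m, v, t, lr, b1, b2, eps)


# With a zero task gradient, only the regularisation acts on the weight.
p_adamw, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1)
p_adam_l2, _, _ = adam_l2_step(1.0, 0.0, 0.0, 0.0, t=1)
print(p_adamw, p_adam_l2)
```

In the L2 variant the tiny decay gradient is rescaled by Adam's adaptive normalisation, so the first step is close to a full `lr`-sized move regardless of how small `wd` is. In the AdamW variant the shrinkage stays proportional to `lr * wd`.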

    GradScaler for AMP


    scaler = torch.cuda.amp.GradScaler()

    Keeping GPU memory & training speed in mind, I utilized Automatic Mixed Precision (AMP) through the GradScaler. It dynamically scales the loss before the backward pass, which keeps small gradients from underflowing during mixed precision training.
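
Why scaling helps can be shown without a GPU by rounding numbers through half precision with the standard library. This is only a numeric illustration of the underflow problem, not how GradScaler is implemented; `to_fp16` is a hypothetical helper, and the gradient value is made up. The scale of 2**16 is used here as an example value.

```python
import struct


def to_fp16(x):
    """Round a Python float to IEEE half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


tiny_grad = 1e-8  # made-up gradient, smaller than fp16 can represent
scale = 65536.0   # 2**16, example scale factor

# Without scaling the value is flushed to zero in half precision.
unscaled = to_fp16(tiny_grad)

# With scaling it survives the fp16 round trip and is unscaled afterwards
# in higher precision, before the optimizer step.
recovered = to_fp16(tiny_grad * scale) / scale

print(unscaled, recovered)
```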

    OneCycleLR Learning Rate Scheduler


    scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=0.001,
    steps_per_epoch=len(train_loader),
    epochs=epochs + 2,
    pct_start=0.3,
    div_factor=10,
    final_div_factor=100
    )

    LR Scheduler Diagram

    Training uses the OneCycleLR learning rate scheduler. It is a dynamic scheduler that:
    • Warms up the learning rate early in training.
    • Anneals to a lower value near the end of training.
    • Improves convergence & generalization.
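
The shape of this schedule can be reproduced with a few lines of plain Python. The sketch below follows the two-phase cosine form of OneCycleLR with the settings shown above (max_lr=0.001, pct_start=0.3, div_factor=10, final_div_factor=100, 25 + 2 epochs). The 88 steps per epoch (699 training images at batch size 8) is an assumption, as is the function name `one_cycle_lr`.

```python
import math


def cos_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)


def one_cycle_lr(step, total_steps, max_lr=1e-3, pct_start=0.3,
                 div_factor=10.0, final_div_factor=100.0):
    """Learning rate after `step` scheduler steps of a one-cycle policy."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1
    last_step = total_steps - 1
    if step <= warmup_end:
        # Phase 1: warm up from initial_lr to max_lr.
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    # Phase 2: anneal from max_lr down to min_lr.
    pct = (step - warmup_end) / (last_step - warmup_end)
    return cos_anneal(max_lr, min_lr, pct)


STEPS_PER_EPOCH = 88  # assumption: ceil(699 / 8)
TOTAL_STEPS = (25 + 2) * STEPS_PER_EPOCH

for epoch in (1, 8, 25):
    lr = one_cycle_lr(epoch * STEPS_PER_EPOCH, TOTAL_STEPS)
    print(f"epoch {epoch}: lr={lr:.6f}")
```

Under these assumptions the values for epochs 1, 8, and 25 come out at about 0.000134, 0.001000, and 0.000028, in line with the LR column of the training log. The final value stays above the schedule's minimum because the scheduler was sized for 27 epochs while the main loop runs 25.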

    Training


    Forward Propagation, Loss Calculation, Backward Propagation, Weight Update.

    Warmup Loop

    Changes to the hyperparameters, architecture, or loss functions can make the early epochs unstable while the model adjusts, so I implemented a warmup loop.
    • A 2-epoch warmup with a reduced learning rate (1e-4) using a separate AdamW optimizer to stabilize initial convergence.

    Training Loop

    Training runs for 25 epochs (27 including the 2 warmup epochs) using the main AdamW optimizer & OneCycleLR learning rate scheduler.
    • Manual learning rate decay is applied at epoch 15 (10x drop) to encourage finer convergence. The logged LR nevertheless keeps following the OneCycleLR curve after that point (0.000705 at epoch 15, 0.000627 at epoch 16), which suggests the scheduler overrides the manual drop on its next step.
    • Each epoch computes: Train Loss, Train PSNR, Train SSIM, Val Loss, Val PSNR, Val SSIM, LR
    • Best-performing model (lowest val loss) is saved to wm_best_model.pth
    • Training halts early if no improvement is seen for early_stop_patience epochs.
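
The checkpointing and early-stopping rule can be sketched independently of the model. The function below replays the validation losses from the log that follows and reports which epoch would be checkpointed and where training would stop. The post does not give the value of early_stop_patience, so the patience values used here are hypothetical, as is the function name `track_best`.

```python
def track_best(val_losses, patience):
    """Return (best_epoch, last_epoch_run) under a patience-based stop rule."""
    best_loss = float("inf")
    best_epoch = None
    stale_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # A real loop would save the checkpoint here.
            best_loss, best_epoch, stale_epochs = loss, epoch, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


# Val Loss column of the training log, epochs 1 to 25.
VAL_LOSSES = [
    4.1361, 2.8606, 1.6640, 1.2799, 1.1182, 0.9333, 1.0313, 1.0174, 0.9617,
    0.9782, 0.8795, 0.9174, 0.8914, 0.8762, 0.8976, 0.9213, 0.9165, 0.8613,
    0.8495, 0.8530, 0.8109, 0.8479, 0.8687, 0.8570, 0.8431,
]

print(track_best(VAL_LOSSES, patience=5))  # hypothetical patience
print(track_best(VAL_LOSSES, patience=3))  # a stricter hypothetical patience
```

On this log the longest run without improvement is four epochs, so any patience of five or more lets all 25 epochs run and keeps the epoch 21 checkpoint. A patience of three would have stopped at epoch 9.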

    Results:

    Warmup Epoch 1 | Loss: 6.72903
    Warmup Epoch 2 | Loss: 3.8560
    Epoch 1: Train Loss=4.7524, Train PSNR=14.79, Train SSIM=0.3273 , Val Loss=4.1361, Val PSNR=15.20, Val SSIM=0.3542, LR=0.000134
    Epoch 2: Train Loss=3.8627, Train PSNR=16.61, Train SSIM=0.3853 , Val Loss=2.8606, Val PSNR=18.33, Val SSIM=0.4380, LR=0.000229
    Epoch 3: Train Loss=2.6319, Train PSNR=19.94, Train SSIM=0.4862 , Val Loss=1.6640, Val PSNR=22.47, Val SSIM=0.5786, LR=0.000372
    Epoch 4: Train Loss=1.5781, Train PSNR=23.88, Train SSIM=0.6011 , Val Loss=1.2799, Val PSNR=24.28, Val SSIM=0.6117, LR=0.000542
    Epoch 5: Train Loss=1.3406, Train PSNR=25.09, Train SSIM=0.6366 , Val Loss=1.1182, Val PSNR=25.67, Val SSIM=0.6214, LR=0.000713
    Epoch 6: Train Loss=1.3366, Train PSNR=25.31, Train SSIM=0.6567 , Val Loss=0.9333, Val PSNR=26.98, Val SSIM=0.6998, LR=0.000860
    Epoch 7: Train Loss=1.3522, Train PSNR=25.02, Train SSIM=0.6596 , Val Loss=1.0313, Val PSNR=26.46, Val SSIM=0.6463, LR=0.000960
    Epoch 8: Train Loss=1.2893, Train PSNR=25.61, Train SSIM=0.6700 , Val Loss=1.0174, Val PSNR=26.29, Val SSIM=0.6912, LR=0.001000
    Epoch 9: Train Loss=1.2259, Train PSNR=25.95, Train SSIM=0.6788 , Val Loss=0.9617, Val PSNR=26.69, Val SSIM=0.6800, LR=0.000994
    Epoch 10: Train Loss=1.2163, Train PSNR=26.00, Train SSIM=0.6792 , Val Loss=0.9782, Val PSNR=26.75, Val SSIM=0.6751, LR=0.000975
    Epoch 11: Train Loss=1.2225, Train PSNR=26.00, Train SSIM=0.6952 , Val Loss=0.8795, Val PSNR=27.65, Val SSIM=0.7164, LR=0.000943
    Epoch 12: Train Loss=1.2029, Train PSNR=26.03, Train SSIM=0.6865 , Val Loss=0.9174, Val PSNR=27.46, Val SSIM=0.6902, LR=0.000898
    Epoch 13: Train Loss=1.1444, Train PSNR=26.32, Train SSIM=0.6899 , Val Loss=0.8914, Val PSNR=27.29, Val SSIM=0.6994, LR=0.000843
    Epoch 14: Train Loss=1.2011, Train PSNR=26.00, Train SSIM=0.6932 , Val Loss=0.8762, Val PSNR=27.71, Val SSIM=0.7081, LR=0.000778
    Epoch 15: Train Loss=1.1981, Train PSNR=26.01, Train SSIM=0.6880 , Val Loss=0.8976, Val PSNR=27.41, Val SSIM=0.7213, LR=0.000705
    Lowering learning rate by 10x for fine-tuning phase.
    Epoch 16: Train Loss=1.1729, Train PSNR=26.21, Train SSIM=0.6848 , Val Loss=0.9213, Val PSNR=27.10, Val SSIM=0.6916, LR=0.000627
    Epoch 17: Train Loss=1.1742, Train PSNR=26.08, Train SSIM=0.6929 , Val Loss=0.9165, Val PSNR=26.96, Val SSIM=0.7061, LR=0.000545
    Epoch 18: Train Loss=1.1629, Train PSNR=26.25, Train SSIM=0.6963 , Val Loss=0.8613, Val PSNR=27.75, Val SSIM=0.7229, LR=0.000462
    Epoch 19: Train Loss=1.1667, Train PSNR=26.12, Train SSIM=0.6958 , Val Loss=0.8495, Val PSNR=27.56, Val SSIM=0.7105, LR=0.000380
    Epoch 20: Train Loss=1.1428, Train PSNR=26.28, Train SSIM=0.6810 , Val Loss=0.8530, Val PSNR=27.83, Val SSIM=0.7173, LR=0.000302
    Epoch 21: Train Loss=1.1608, Train PSNR=26.27, Train SSIM=0.6959 , Val Loss=0.8109, Val PSNR=27.90, Val SSIM=0.7294, LR=0.000229
    Epoch 22: Train Loss=1.1382, Train PSNR=26.19, Train SSIM=0.6985 , Val Loss=0.8479, Val PSNR=27.54, Val SSIM=0.7122, LR=0.000163
    Epoch 23: Train Loss=1.0897, Train PSNR=26.64, Train SSIM=0.7037 , Val Loss=0.8687, Val PSNR=27.52, Val SSIM=0.7099, LR=0.000107
    Epoch 24: Train Loss=1.1151, Train PSNR=26.60, Train SSIM=0.7008 , Val Loss=0.8570, Val PSNR=27.33, Val SSIM=0.7125, LR=0.000061
    Epoch 25: Train Loss=1.1098, Train PSNR=26.65, Train SSIM=0.6970 , Val Loss=0.8431, Val PSNR=27.54, Val SSIM=0.7151, LR=0.000028
    
    Training Time Elapsed: 228.00 Minutes  
    			
    Summary:
    • The warmup phase helped stabilize initial training as seen with the loss going from 6.72903 -> 3.8560 from first to second warmup epoch.
    • Train PSNR started at ~14.8 dB and reached ~26.6 dB
    • Val PSNR settled around 27.3 to 27.9 dB from epoch 18 onward, peaking at 27.90 dB at epoch 21.
    • Val SSIM reached ~0.72, indicating perceptual quality and structure recovery are good.
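
For readers less familiar with PSNR, the sketch below shows how it is computed from the mean squared error and what the reported values mean in terms of pixel error. It assumes pixel intensities in the range [0, 1] (the post does not state the range); `psnr` and `rmse_for_psnr` are hypothetical helper names.

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for flat lists of pixel values."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)


def rmse_for_psnr(psnr_db, max_val=1.0):
    """Root-mean-square pixel error that corresponds to a PSNR value."""
    return max_val * 10 ** (-psnr_db / 20.0)


print(round(psnr([0.5, 0.5], [0.6, 0.4]), 2))
print(round(rmse_for_psnr(14.8), 3), round(rmse_for_psnr(27.5), 3))
```

On a 0 to 1 scale, moving from about 14.8 dB to about 27.5 dB corresponds to the RMS pixel error falling from roughly 0.18 to roughly 0.04.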

    Metrics & Visuals


    All of the outputs that I used to make adjustments to the model as needed after each training run.

    Metrics


    Evaluation
    Test Loss: 0.7706
    Test PSNR: 28.67 dB
    Test SSIM: 0.7202

    Visuals


    Train & Val Loss, PSNR Progress

    Plots show loss & PSNR over time for both training and validation sets. These curves verify the convergence behavior of the network, confirm generalization to unseen validation data, and track performance metrics epoch-by-epoch.

    Training Loss, Peak Signal to Noise Ratio, & Structural Similarity Plots

    Three synchronized subplots provide a compact summary of the network’s learning trajectory:
    • Loss Curve: steady convergence from high initial error.
    • PSNR Curve: clear climb toward 26–27 dB, a common benchmark in high-fidelity restoration.
    • SSIM Curve: shows perceptual similarity nearing ~0.70, indicating structural retention of clean images.


    Learning Rate Scheduler

    The OneCycleLR plot shows the learning rate schedule used during training. The LR increases for the first 30% of epochs (warm-up), then decays gradually, helping the model converge quickly while avoiding sharp minima.

    Local PSNR & Local SSIM Maps

    These heatmaps depict spatially localized Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) metrics across image patches. Brighter regions indicate higher fidelity restoration, offering insight into performance variability across different spatial contexts.

    Residual Histogram

    The histogram displays the distribution of absolute pixel-wise residuals between the predicted image and the original watermark-free image. A sharp peak near zero indicates strong restoration fidelity, where most pixels are accurately reconstructed with minimal error.

    Side by Side Comparison

    Visual triplets are shown in three columns: the clean original, the watermarked version, and the predicted output. These qualitative results help highlight how well the model removes diverse watermark styles, maintaining visual structure and content.

    Watermark Residual

    A three-panel plot visualizes the residual heatmap, a binary thresholded mask, and the original vs predicted absolute difference. This highlights exactly where watermark artifacts remain and how well they’re suppressed in the output.

    Watermark Attention

    Decoder-level attention maps (heatmaps) illustrate how different layers in the network respond to watermark perturbations. This diagnostic tool verifies that deeper layers are focusing on regions corrupted by watermarks—demonstrating task-specific feature learning.

    Batch Processing

    This view shows the entire training batch as input, reference, and predicted samples across multiple augmentations. It’s useful to understand the model’s robustness to various synthetic watermark styles generated on-the-fly.

    Residual Error

    Highlights pixel-level discrepancies between the original and predicted outputs using three subplots:
    • Residual Heatmap: This shows the absolute error (|original - predicted|) in pixel intensities. Brighter areas signify larger prediction errors, often correlated with watermark regions.
    • Thresholded Mask (> 0.1): Highlights only those pixels where the residual exceeds a certain perceptual threshold (0.1). This isolates major prediction faults or watermark remnants.
    • Overlay of Original vs Predicted: Gives a side-by-side visual of the predicted error distribution relative to the original image, making it easy to spot spatial trends in reconstruction quality.
    Helps pinpoint where the model struggles, and guides improvements to the network or loss function.
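
The residual heatmap and thresholded mask described above reduce to two small operations. The sketch below works on 2D lists of grayscale values in [0, 1] (an assumption) and uses the 0.1 threshold from the text; the function names and the tiny example images are hypothetical.

```python
def residual_map(original, predicted):
    """Absolute per-pixel error |original - predicted|."""
    return [[abs(o - p) for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(original, predicted)]


def threshold_mask(residual, threshold=0.1):
    """1 where the residual exceeds the threshold, 0 elsewhere."""
    return [[1 if r > threshold else 0 for r in row] for row in residual]


def flagged_fraction(mask):
    """Share of pixels whose error exceeds the threshold."""
    flat = [m for row in mask for m in row]
    return sum(flat) / len(flat)


original = [[0.5, 0.5], [0.5, 0.5]]
predicted = [[0.5, 0.55], [0.8, 0.5]]  # one small and one large error

residual = residual_map(original, predicted)
mask = threshold_mask(residual)
print(mask, flagged_fraction(mask))
```

Tracking the flagged fraction across training runs gives a single number to compare alongside the heatmaps.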

    Multi-Layer Activation

    Visualizes encoder and decoder layer outputs (e.g., enc1, enc2, dec1, dec2, dec3), revealing internal representation learning and feature abstraction across layers. It is useful for understanding how watermark-related features propagate through the network.

    Feature Activation Maps (Decoder Focus)

    Feature maps from deeper decoder layers of the U-Net architecture:
    • The right-side plot shows the activation map from dec3 (final decoder layer before output), scaled from 0 to 1.
    • It reveals where the model is focusing its attention during reconstruction, particularly in recovering watermark-affected areas.
    The decoder's high response in watermark-heavy zones suggests successful spatial attention learning. This also acts as a diagnostic tool to ensure the decoder is not ignoring important visual features.

    Referenced Paper


    @article{liu2024heterogeneous,
      title={Heterogeneous U-Net for Image Restoration},
      author={Liu, Jiang and Zhang, Yulun and Li, Wangmeng and others},
      journal={arXiv preprint arXiv:2403.05807},
      year={2024},
      url={https://arxiv.org/abs/2403.05807}
    }
    


    Grá Mór

    Old 4 Feb 2010, 10:39 PM EST #1
    ian ryan
    forum dweller

    ian ryan's Avatar

    join date: Apr 2007
    Posts: 9,000
    ian ryan is on a distinguished road

    Capstone - Classification Convolutional Neural Network by Ian Ryan


    https://github.com/ianhenryryan/capstone

    Welcome to My Fall 2024 Capstone Project advised by Dr. Soltys.

    The objective of this semester-long project was to create a Convolutional Neural Network (CNN) from scratch using PyTorch, rather than using pre-trained models like ResNet-18. After developing an acceptable classification model, the next goal was to adapt it into an object detection model by implementing YOLOv8.

    Although my primary interest was detecting humans, I chose to create a multiclass model to avoid building a simple perceptron. To up the ante, I also made the dataset that the model was trained on an imbalanced dataset. Developing my own model presented a challenging learning opportunity and helped strengthen my understanding of CNN fundamentals for computer vision, a field in which I aspire to build a career.

    This repository contains the majority of my work, including various iterations of fine-tuning, visualizations, the CNN architecture schematic, datasets used, comprehensive Jupyter Notebooks, my capstone poster, final capstone pitch, literature citations, and more. The Jupyter Notebooks feature detailed indexes for easy navigation through different sections of the programs.

    Convolutional Neural Network Architecture


    IR Architecture

    Notebook Index


    Table of Contents

    • Libraries & Imports
      • Libraries
      • Imports
    • CUDA GPU Availability
    • Download & Load Datasets
      • First Dataset Resource
      • Second Dataset Resource
    • Functions
      • Path Datasets
      • First Dataset
      • Second Dataset
    • Pre-Processing The Datasets
      • Combine Datasets
      • Save Combo Set
      • Redistribute Images Dynamically Function
    • Pathing (Jump here after Libraries & Imports if using own dataset and not combining into imbalanced set.)
      • Normalize File Extensions
      • Extensive Pathing Checking
    • Transforms & Loaders
    • Classification Or Object Detection Model?
      • Classification Task
      • Regression Task
    • CNN Model
      • Define Class Weights
      • Define CNN Model Architecture
      • Weights & Biases
      • Weight Initialization
      • Hyperparameters
    • Class Confirmation in Training Data
    • Profile Memory Usage
    • Training Prep
    • Training CNN Model
    • Test Accuracy of Convolutional Neural Network
    • Visualizations
      • Summary - Everything
      • Model Performance Function
      • Training & Validation Loss Plots
      • Confusion Matrices
      • Feature Maps
      • Kernel Visualizations
      • Gradient Visualizations
      • CAM / Grad-CAM
      • Training/Validation Curves
      • Explainability Tools
    • Acknowledgements
    • Literature Cited
    • Environment
    • Recommended Resources
    • Creator Information

    Acknowledgements


    I would like to express my gratitude to Dr. Michael Soltys and Dr. William Barber for sparking my interest in Machine Learning and Artificial Intelligence. Over the past two years at California State University Channel Islands (CSUCI), it has been a pleasure taking multiple courses with both professors, each bringing their own perspectives and experiences. They both were able to articulate complex information in a digestible way effortlessly.

    Dr. William Barber
    Dr. Barber, with his background in Physics and a distinguished career in Medical Imaging and Research, provided a scientific and mathematical foundation that enhanced my understanding of computational models, data analysis, and data extraction. His extensive industry experience includes serving as a Director of Medical Imaging. His engaging teaching style and passion for image processing techniques and pattern recognition concepts inspired me to explore this field further.

    Dr. Michael Soltys
    Dr. Soltys, with his comprehensive background in Algorithms, Machine Learning, and Cloud Computing, fostered my interest in pursuing a Computer Vision Capstone Project. As an accomplished author of two books and more than 60 published research papers, and as a Principal Scientist and Software Engineer, Dr. Soltys brings a wealth of in-depth knowledge and real-world applications to his teaching. His courses and recommendations proved to be instrumental during the development of my Convolutional Neural Network (CNN) Capstone Project.

    Bayne H. Ryan
    Bayne, my dear nephew. You arrived the first week of December, right in the middle of preparations for my Capstone Showcase. I truly admire your impeccable sense of urgency to stop being a submarine and surface just in time to appreciate my Convolutional Neural Network (CNN) Capstone project. Your surprise guest appearance was undoubtedly the highlight of my year. I cannot wait to forcibly teach you calculus when you reach the prestigious age of juice boxes and nap time (also known as four years old).


    Literature Cited


    Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv preprint arXiv:1905.11946.

    Brownlee, J. How to Develop Convolutional Neural Network Models for Time Series Forecasting. Machine Learning Mastery. Available at: https://machinelearningmastery.com/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting/.

    Dougherty, G. Pattern Recognition and Classification: An Introduction. Springer, 2012.

    Godoy, D. V. Deep Learning with PyTorch Step-by-Step: A Beginner's Guide, Volume I: Fundamentals. Self-published, 2020.

    Jocher, G., Chaurasia, A., & Qiu, J. (2023). Ultralytics YOLOv8. Available at: https://github.com/ultralytics/ultralytics.


    Environment


    AWS SageMaker Jupyter Notebook


  • Notebook Instance Type: ml.g4dn.2xlarge
  • Processor: NVIDIA T4 Tensor Core GPU with 16 GB GDDR6 VRAM
  • Memory: 32 GB System RAM
  • Storage: 5 GB EBS Volume
  • Operating System: Amazon Linux 2 with Jupyter Lab 3 (notebook-al2-v2)
  • Lifecycle Configuration: None
  • Elastic Inference: Not Enabled
  • Minimum IMDS Version: 2

    Recommended Resources


    Courses for Computer Science Students at California State University Channel Islands (CSUCI) Interested in Machine Learning & AI

    Channel Islands-Specific Courses:
    • COMP 345 - Digital Image Processing (Fall Course)
    • COMP 354 - Analysis of Algorithms
    • COMP 445 - Image Analysis and Pattern Recognition (Spring Course)
    • COMP 454 - Automata, Languages & Computation
    • COMP 469 - Intro to Artificial Intelligence / Machine Learning (Fall Course)
    • COMP 491 - Capstone Prep (Specifically with Dr. Soltys)
    • COMP 499 - Capstone (Specifically with Dr. Soltys)
    LinkedIn Learning Courses (if offered):
    If you are a college or university student, it is worth checking what resources your institution puts at your disposal. For example, CSUCI offers LinkedIn Learning and access to the O'Reilly website.
    • Deep Learning and Generative AI: Data Prep, Analysis, and Visualization with Python - Leverage Generative AI for Analytics and Insights by Gwendolyn Stripling
    • Deep Learning: Image Recognition - Learning Image Recognition by Isil Berkun
    • Advanced AI: Transformers for Computer Vision by Jonathan Fernandes
    • Building Computer Vision Applications with Python by Eduardo Corpeño
    • Applied Machine Learning: Algorithms by Matt Harrison
    • Building Deep Learning Applications with Keras by Isil Berkun
    Helpful Resources:
    • Think Python: How to Think Like a Computer Scientist (2nd Edition, Version 2.4.0) by Allen Downey
    • Deep Learning with PyTorch Step-by-Step: A Beginner's Guide - Volume I: Fundamentals by Daniel Voigt Godoy
    • Pattern Recognition and Classification: An Introduction (Springer, 2012) by Geoff Dougherty
    • An Introduction to the Analysis of Algorithms (3rd Edition, 2018) by Michael Soltys
    • Ultralytics YOLOv8 Documentation
    • Object Detection and Localization with YOLOv3 by B. Rupadevi and J. Pallavi
    • Digital Image Processing for Medical Applications by Geoff Dougherty

    Creator Information


    • GitHub: https://github.com/ianhenryryan
    • LinkedIn: https://linkedin.com/in/ianhenryryan/
    • Websites:
      • http://ianhryan.com/
    • Kaggle: https://kaggle.com/ianryan
    • Hugging Face: https://huggingface.co/Ianryan

    Permission


    Students, educators, and anyone else keen on learning about Convolutional Neural Networks (CNNs), Computer Vision, Supervised Learning Algorithms, Machine Learning, Deep Learning, AI, or related fields are welcome to use this notebook in any capacity as a learning resource.

    This notebook is part of my Fall 2024 Capstone Project for my Bachelor’s degree in Computer Science at California State University Channel Islands. I structured the content to be approachable and digestible, reflecting my own learning journey in AI and Machine Learning.

    I hope this notebook can be of use to those exploring similar topics. This specific Jupyter Notebook focuses on Image Classification and demonstrates combining two datasets to create a class imbalance for training purposes. A separate notebook dedicated to Object Detection will be available soon (if not already).

    Important Note on Datasets: The datasets used in this project are not my property. Credit is given to the original dataset creators, and their links are provided within the notebook in the dataset section at the beginning.

    I understand that grasping the fundamentals of CNNs and related AI concepts can be overwhelming at first. My goal is to make these topics more accessible through these notebooks.

    Permissions:
    You are free to download, use, edit, and reference the notebooks, Python code, and Markdown content. I aim for accuracy in the explanations provided, though I acknowledge that scientific understanding is always evolving. I welcome constructive feedback and corrections.

    This project is intended as a learning resource.

    Ian Ryan

    Grá Mór

    Unread 4 Feb 2010, 11:24 PM EST #2
    ian ryan
    forum dweller


    Deconvolution (De-blur) & Enhancement by Ian Ryan



    A deep dive into digital image processing using ImageJ to correct blurry images through deconvolution and enhance poorly illuminated images. Techniques used include Wiener filtering, custom PSFs, and enhancement via brightness/contrast normalization, often layered with macros to test results efficiently.

    Table of Contents

    • Introduction
    • Methods
    • Deconvolution Results
      • blurred.tif (License Plate)
      • brainGB2.tif (Skull)
      • carnivalride.png (Rotational Blur)
      • vintagephotographer.tif
    • Enhancement Results
      • bad_background.jpg
      • dark_foreground.jpg
    • Conclusion
    • Resources & Environment
    • Downloadable Macros

    Introduction


    The project focused on sharpening blurry images and enhancing dark ones using various digital image processing techniques. This was done in ImageJ, with careful experimentation on gamma values, deconvolution filters, and visual comparison of results.

    Figure: Deconvolution (De-Blur) initial images.

    Figure: Enhancement initial images.


    Methods


    • Software
      • ImageJ (Fiji distribution)
    • Filters
      • Wiener Filter (Deconvolution)
      • Enhance Contrast
      • Sharpen
      • Despeckle
      • Brightness/Contrast
      • Shadow Enhancement
    • Point Spread Functions (PSF)
      • pillbox.tif
      • GaussianBlur.tif (with custom modifications)
      • Handcrafted disc-shaped PSF (for carnivalride.png)
    • Custom Macros
      • One macro per image with step-by-step automation
      • Used for documentation and reproducibility
    • Hardware
      • Alienware m15 (NVIDIA GeForce RTX 3060)
      • Windows 11 with ImageJ
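
    To make the Wiener filter entry above concrete, here is a minimal 1-D sketch, assuming the usual frequency-domain form F = conj(H) * G / (|H|^2 + gamma). The exact formula inside the ImageJ plugin may differ, and the signal, the box-shaped PSF, and the function names are all made up for illustration. It only shows how the gamma term keeps weak frequencies from blowing up.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short demo signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out

def circular_blur(signal, psf):
    """Blur by circular convolution: multiply the two spectra, then invert."""
    S, H = dft(signal), dft(psf)
    return [v.real for v in dft([s * h for s, h in zip(S, H)], inverse=True)]

def wiener_deconvolve(blurred, psf, gamma):
    """Apply conj(H) * G / (|H|^2 + gamma) at every frequency."""
    G, H = dft(blurred), dft(psf)
    F = [h.conjugate() * g / (abs(h) ** 2 + gamma) for g, h in zip(G, H)]
    return [v.real for v in dft(F, inverse=True)]

def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical scan line: two bright features on a dark background.
signal = [0.0] * 16
signal[4] = signal[5] = 1.0
signal[10] = 1.0

# 1-D stand-in for a pillbox PSF: a flat 3-sample box that sums to 1,
# zero-padded to the signal length.
psf = [1 / 3, 1 / 3, 1 / 3] + [0.0] * 13

blurred = circular_blur(signal, psf)
restored = wiener_deconvolve(blurred, psf, gamma=0.005)

print("error before:", round(rmse(blurred, signal), 4))
print("error after: ", round(rmse(restored, signal), 4))
```

    A larger gamma suppresses noise more but leaves more blur behind, which is why the value has to be tuned per image.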

    Deconvolution Results


    • blurred.tif (License Plate)
      • Used: Wiener Filter (gamma = 0.05), pillbox PSF
      • Enhancements: Sharpen → Shadows (SW) → Invert
      • Goal: Make the plate readable
      • Before / After: Seed Class Example
    • brainGB2.tif (Skull)
      • Custom PSF based on modified Gaussian blur
      • Steps: Wiener Filter (0.005) → Sharpen → Enhance Contrast
      • Also explored a non-convolution deblur trick using subtraction
      • Before / After: Seed Class Example
    • carnivalride.png (Rotational Blur)
      • Most difficult case (rotational motion blur)
      • Hand-drawn disc-shaped PSF used
      • Multiple iterations of deconvolution and blending
      • Before / After: Seed Class Example
    • vintagephotographer.tif
      • Modified Gaussian PSF
      • Complex multi-step process involving mean filter, math gamma, image subtraction, despeckle, etc.
      • Before / After: Seed Class Example
    Custom-Made Point Spread Functions for Project
    Seed Class Example
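
    For readers who have not built a PSF before, here is a rough sketch of the two basic shapes named in Methods, a pillbox (flat disc) and a Gaussian. The sizes, radius, and sigma are arbitrary assumptions, not the values in pillbox.tif or GaussianBlur.tif, and the function names are mine.

```python
import math

def normalize(kernel):
    """Scale a 2-D kernel so its weights sum to 1 (keeps image brightness unchanged)."""
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def pillbox_psf(size, radius):
    """Flat disc: equal weight inside the radius, zero outside."""
    c = (size - 1) / 2
    kernel = [[1.0 if math.hypot(x - c, y - c) <= radius else 0.0
               for x in range(size)] for y in range(size)]
    return normalize(kernel)

def gaussian_psf(size, sigma):
    """Bell-shaped falloff from the center, controlled by sigma (in pixels)."""
    c = (size - 1) / 2
    kernel = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    return normalize(kernel)

pillbox = pillbox_psf(size=7, radius=2.0)   # hypothetical size and radius
gaussian = gaussian_psf(size=7, sigma=1.5)  # hypothetical size and sigma

for row in pillbox:
    print(" ".join(f"{v:.3f}" for v in row))
```

    A pillbox models out-of-focus blur, while a Gaussian is a common starting point when the blur has soft edges.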

    Enhancement Results


    • bad_background.jpg
      • Brightness adjusted → Enhance Contrast
      • Result: Partial face revealed
      • Before / After: Seed Class Example
    • dark_foreground.jpg
      • Math gamma adjustment failed; fell back to brightness + histogram equalization
      • Result: Visible facial details without losing the background
      • Before / After: Seed Class Example
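
    The two enhancement ideas mentioned for dark_foreground.jpg can be sketched on a handful of pixel values. This uses the textbook forms of gamma correction and histogram equalization; ImageJ's own implementations may differ in detail, and the pixel values are invented.

```python
def gamma_correct(pixels, gamma):
    """Map each 8-bit value v to 255 * (v / 255) ** gamma; gamma < 1 brightens shadows."""
    return [round(255 * (v / 255) ** gamma) for v in pixels]

def equalize(pixels, levels=256):
    """Textbook histogram equalization: remap values through the cumulative histogram."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # every pixel has the same value, nothing to stretch
        return list(pixels)
    return [round((cdf[v] - cdf_min) / (n - cdf_min) * (levels - 1)) for v in pixels]

# Hypothetical under-exposed pixels, all crowded near black.
dark = [10, 12, 12, 15, 20, 20, 20, 30]

brightened = gamma_correct(dark, gamma=0.5)
equalized = equalize(dark)

print("gamma 0.5:", brightened)
print("equalized:", equalized)
```

    Gamma applies one fixed curve to every image, while equalization adapts to the histogram of the image at hand, which is one plausible reason it can work where a gamma adjustment does not.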

    Conclusion


    The project was a rewarding challenge, though I despised dealing with the rotational blur in the carnival ride image. Experimenting with the Wiener filter, custom PSFs, and enhancement orders provided valuable insight into digital image restoration.

    Environment


    • Software
      • ImageJ (Fiji distribution)
    • System
      • Alienware m15 Nvidia GeForce RTX 3060
      • Windows 11
    • File Types
      • .tif
      • .jpg
      • .png

    Macros


    I can send a zip of the macros on request.



    Grá Mór

    Unread 5 Feb 2010, 12:15 AM EST #3
    ian ryan
    forum dweller


    Seed Pattern Recognition by Ian Ryan


    This project explores seed classification using pattern recognition techniques. I used self-captured seed images, processed them in ImageJ to extract measurable features, and then trained a K-Means Clustering model in JMP. The goal was to accurately differentiate between 5 seed types based on shape-based features.

    Table of Contents

    • Seed Classes
    • Image Capturing
    • Image Processing + Segmentation + Feature Extraction
    • Classification Models
      • K-Means Clustering
      • Neural Net
    • Visuals
    • Conclusion

    Seed Classes


    This project contains 5 classes: Weed, Carrot, Pumpkin, Pea, Spinach
    Training Data: 5 classes - 4 images per class
    Testing Data: 5 classes - 3 images per class
    Seed Class Example

    Image Capturing


    This project focuses on classifying a variety of seeds. Conveniently, my parents grow vegetables, and my dad, reliving his glory days from the 1970s, grows weed. This left a plethora of seeds to photograph and extract features from. The seed classes chosen are weed, carrot, pumpkin, pea, and spinach.
    Seed Class Example: the first raw training image of each class.

    To capture the images, I placed a piece of 8.5x11 white paper into an In-N-Out tray. With the environment set up, I poured seeds into the tray and captured 4 images per seed type with an iPhone 13 to gather the training data. I then repeated the process and captured 3 images per seed type as testing data, since you cannot test with the data you trained on. All images were captured at varying distances, so area cannot be used as a feature for the classes.
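
    A quick way to see why area is unreliable here while shape descriptors still work: the sketch below treats a seed as a simple rectangle (a deliberate simplification) and measures it at two made-up pixel scales. Circularity is the standard 4π·area/perimeter² descriptor; the dimensions and function names are hypothetical.

```python
import math

def circularity(area, perimeter):
    """4 * pi * area / perimeter^2: 1.0 for a perfect circle, lower for elongated shapes."""
    return 4 * math.pi * area / perimeter ** 2

def rectangle_features(width, height):
    """Features of a width x height outline, standing in for a segmented seed."""
    area = width * height
    perimeter = 2 * (width + height)
    return {
        "area": area,
        "circularity": circularity(area, perimeter),
        "aspect_ratio": max(width, height) / min(width, height),
    }

near = rectangle_features(30, 12)  # seed photographed up close (hypothetical pixels)
far = rectangle_features(10, 4)    # the same seed photographed from farther away

for name in ("area", "circularity", "aspect_ratio"):
    print(f"{name:>12}: near={near[name]:.3f}  far={far[name]:.3f}")
```

    Area changes ninefold between the two shots, while circularity and aspect ratio stay the same, so only the last two are safe to feed to a classifier when camera distance varies.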

    Image Processing + Segmentation + Feature Extraction


    I processed the training and testing images with the same standard sequence of techniques, making minor adjustments per image as needed. After a few hours of tinkering in ImageJ, I had segmented both sets of images and extracted their features.

    Techniques Performed
    • 8-bit
    • Duplicate
    • Subtract Background
    • Enhance Contrast
    • Duplicate
    • Gaussian Blur
    • ImageCalc: Duplicate of Duplicate subtract Duplicate
    • Enhance Contrast
    • Threshold
    • Remove Outliers
    • Fill Holes
    • Set Measurements
    • Analyze Particles
    • Invert LUT
    Train Weed before/after:
    Seed Class Example Seed Class Example

    Train Carrot before/after:
    Seed Class Example Seed Class Example

    Train Pumpkin before/after:
    Seed Class Example Seed Class Example

    Train Pea before/after:
    Seed Class Example Seed Class Example

    Train Spinach before/after:
    Seed Class Example Seed Class Example
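
    The Threshold and Analyze Particles steps from the technique list can be approximated in a few lines. This is a simplified sketch, not what ImageJ does internally: it uses a fixed cutoff and 4-connected flood fill, and the tiny image and the min_area parameter are invented for illustration.

```python
from collections import deque

def threshold(image, cutoff):
    """Binary mask: True where a pixel is darker than the cutoff (dark seeds on white paper)."""
    return [[value < cutoff for value in row] for row in image]

def analyze_particles(mask, min_area=1):
    """Label 4-connected foreground regions and return the pixel area of each one."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    areas = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            area = 0
            while queue:  # breadth-first flood fill over one particle
                cy, cx = queue.popleft()
                area += 1
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:  # drop specks smaller than min_area
                areas.append(area)
    return areas

# Hypothetical 8-bit image: bright paper (250) with three dark blobs.
image = [
    [250, 250, 250, 250, 250, 250],
    [250,  40,  45, 250, 250, 250],
    [250,  42,  50, 250,  60, 250],
    [250, 250, 250, 250, 250, 250],
    [250, 250,  55,  58,  52, 250],
]

mask = threshold(image, cutoff=128)
print(analyze_particles(mask))              # all particles
print(analyze_particles(mask, min_area=2))  # single-pixel speck filtered out
```

    Filtering by a minimum area plays a role similar to the Remove Outliers step: it keeps stray dark pixels from being counted as seeds.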


    Classification Models


    Now to put the extracted training data to use. After image processing and segmentation, we have usable features and measurements to train the model with. The next step is to export all the data from Google Sheets to an XLS file. Once that is done, I open the statistical software JMP (Student Edition) and click New Data Table, which brings up an empty data table. Then I press Open, select the All Training Data XLS file created earlier, import it, and save it as a JMP file.
    Seed Class Example

    K-Means Clustering


    • Used JMP Statistical Software (Student Version)
    • Imported all labeled feature data into a data table
    • Set number of clusters = 5 (one per seed class)
    • Results:
      • Decent clustering overall
      • Noticeable overlap between certain seed types
      • Visually similar seeds (like carrot and pea) contributed to the overlap
    Seed Class Example
    Here is the scatterplot matrix. As you can see, there is a lot of overlap between the clusters.
    Seed Class Example
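
    For anyone without JMP, the clustering step can be sketched in plain Python. This is a minimal Lloyd-style K-Means with 5 clusters, not JMP's implementation: the farthest-point initialization is my own choice to keep the demo deterministic, and the (circularity, aspect ratio) values are invented, not measurements from this project.

```python
import math

def farthest_point_init(points, k):
    """Pick the first point, then repeatedly the point farthest from all chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iterations=50):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = farthest_point_init(points, k)
    assignments = []
    for _ in range(iterations):
        assignments = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid in place
        if updated == centroids:  # converged: centroids stopped moving
            break
        centroids = updated
    return assignments, centroids

# Hypothetical (circularity, aspect_ratio) pairs, two objects per seed class.
points = [
    (0.80, 1.30), (0.82, 1.28),  # weed
    (0.45, 2.60), (0.47, 2.55),  # carrot
    (0.65, 1.80), (0.66, 1.85),  # pumpkin
    (0.95, 1.05), (0.96, 1.02),  # pea
    (0.75, 1.50), (0.74, 1.52),  # spinach
]

assignments, centroids = kmeans(points, k=5)
print(assignments)
```

    With clean, well-separated values like these, each pair lands in its own cluster. Real measurements are noisier, which is where the overlap in the scatterplot matrix comes from.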

    Neural Net (out of curiosity; not enough data for an accurate NN)


    Seed Class Example Seed Class Example

    Conclusion


    In conclusion, the segmentation portion of the project took the longest to complete. In some cases I had to reprocess an image over and over until it produced an acceptable number of objects. It is important to crop the training and testing images so there is less noise to eliminate. It is also a good idea to record more measurements than you expect to need, since extra features give more options for tuning the model. Using seeds that vary more in circularity and shape would have helped, as the models misclassified much of my data.

    Specifications


    • Camera: iPhone 13
    • Alienware m15 R7 Laptop
    • Processor:
      • AMD Ryzen 7 6800H (20 MB total cache, 8 cores, 16 threads)
    • Graphics Card:
      • NVIDIA GeForce RTX 3060, 6GB GDDR6
    • Memory:
      • 16 GB (2 x 8 GB, DDR5, 4800 MHz)
    • Hard Drive:
      • Crucial T500 2TB PCIe Gen4 NVMe M.2 SSD
    • Operating Systems:
      • Windows 11
    • Software:
      • ImageJ
      • Google Sheets
      • JMP Statistical Software (Student Version)

    Resources


    • Images:
      • Training Images:
        • 4 x Images (self-taken) x 5 different classes (seed types)
      • Testing Images:
        • 3 x Images (self-taken) x 5 different classes (seed types)
    • Code:
      • ImageJ Macros:
        • 4 x Segmentation Training Macros x 5 different classes XLS files
        • 3 x Segmentation Testing Macros x 5 different classes XLS files
    • Data:
      • Training Data:
        • 4 x Classes (images) x 5 different classes (seed types) XLS files
        • 1 x All Training Data XLS file with label
      • Testing Data:
        • 3 x Classes (images) x 5 different classes (seed types) XLS files
        • 1 x All Testing Data XLS file with label


    Please reach out if interested in the project zip file containing all resources.

    Grá Mór


    Powered by: vBulletin, Copyright ©2000 - 2010, Jelsoft Enterprises Ltd.
    Copyright © 2010 Major League Gaming, Inc. All Rights Reserved.