CS180 Project 1

Introduction

In this project, we are required to implement an image alignment algorithm that colorizes the glass plate images from the Prokudin-Gorskii photo collection by aligning the three color channels of each image.

Part 1: Simple implementation on smaller .jpg images

In this part, my implementation is simple: I search over a displacement window of \([-15, 15]\) pixels in each direction to find the displacements of the red and green channels that minimize the error against the blue channel. Below are my results, each with the displacements of the red and green channels:

Part1 image 1
Cathedral
R: (12, 3) G: (5, 2)
Runtime: 0.63s
Part1 image 2
Monastery
R: (3, 2) G: (-3, 2)
Runtime: 0.57s
Part1 image 3
Tobolsk
R: (6, 3) G: (3, 3)
Runtime: 1.67s
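
The exhaustive search described in this part can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact code behind the results above: the function names are hypothetical, displacements are taken as (rows, columns), and shifting is done with `np.roll`, which wraps pixels around the border.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two equally sized arrays."""
    return float(np.mean((a - b) ** 2))

def align_exhaustive(channel, reference, radius=15):
    """Return the (dy, dx) shift in [-radius, radius]^2 that minimizes MSE."""
    best_shift, best_err = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption of this sketch: np.roll wraps pixels around the border.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            err = mse(shifted, reference)
            if err < best_err:
                best_shift, best_err = (dy, dx), err
    return best_shift
```

With the blue channel as `reference`, calling `align_exhaustive(red, blue)` and `align_exhaustive(green, blue)` would give the two displacements reported per image.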

I tried two scoring metrics: MSE (Mean Squared Error, \(\frac{1}{hw}\sum_{i=1}^h\sum_{j=1}^w (\mathbf{X}_{ij} - \mathbf{Y}_{ij})^2\)), which is minimized, and NCC (Normalized Cross-Correlation, \(\frac{\mathbf{x}^T\mathbf{y}}{\Vert\mathbf{x}\Vert_2\Vert\mathbf{y}\Vert_2}\)), which is maximized. Both gave visually good alignments, but NCC was slower (1.11s on cathedral.jpg versus 0.63s for MSE), so I used MSE for all of the images. It is also important to crop the image before processing it, because the black edges distort the error calculation and make the alignment less accurate. I cropped 5% of the height and 10% of the width of each image before processing.

Part1 image 4.1
Without cropping
Part1 image 4.2
With cropping
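
The NCC score and the border crop discussed above might look like the sketch below. The helper names are hypothetical, and the sketch assumes the 5% and 10% fractions are trimmed from each side of the image, which the text does not specify. NCC here follows the formula given above (no mean subtraction) and is undefined for an all-zero image.

```python
import numpy as np

def ncc(a, b):
    """x^T y / (||x|| ||y||) over the flattened images; higher is better."""
    x, y = a.ravel(), b.ravel()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def crop_borders(img, frac_h=0.05, frac_w=0.10):
    """Trim frac_h of the height and frac_w of the width from each side.

    Per-side trimming is an assumption of this sketch.
    """
    h, w = img.shape
    dh, dw = int(h * frac_h), int(w * frac_w)
    return img[dh:h - dh, dw:w - dw]
```

Because NCC is a similarity rather than an error, a search that uses it keeps the shift with the largest score instead of the smallest.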

Part 2: Coarse-to-fine pyramid speedup on large .tif images

Next, I applied the algorithm to the larger images. To reduce runtime, I downsampled each image over 4 levels, each level averaging the pixels in 2x2 blocks. At level \(k\), the optimal shift \((i_k, j_k)\) is found within a search window \(([-windowH_k, windowH_k], [-windowW_k, windowW_k])\). The shift \((i_{k+1}, j_{k+1})\) at the next level, which has a finer image, is then found within a search window centered on the doubled coarse shift: \(([-windowH_{k+1} + 2i_k, windowH_{k+1} + 2i_k], [-windowW_{k+1} + 2j_k, windowW_{k+1} + 2j_k])\). The window sizes at each level (from coarse to fine) were set to [8, 6, 4, 2], [16, 8, 4, 2], and [18, 9, 6, 3]. Here are the results:

Part2 image 1
Church
R: (58, -4) G: (25, 3)
Runtime: 10.26s
Part2 image 2
Emir
R: (107, 40) G: (49, 23)
Runtime: 16.21s
Part2 image 3
Harvesters
R: (124, 13) G: (61, 16)
Runtime: 13.09s
Part2 image 4
Icon
R: (90, 23) G: (41, 17)
Runtime: 13.44s
Part2 image 5
Italil
R: (77, 35) G: (38, 21)
Runtime: 13.40s
Part2 image 6
Lastochikino
R: (76, -9) G: (-3, -2)
Runtime: 13.30s
Part2 image 7
Lugano
R: (93, -29) G: (41, -17)
Runtime: 13.36s
Part2 image 8
Melons
R: (179, 12) G: (85, 9)
Runtime: 22.56s
Part2 image 9
Self Portrait
R: (176, 36) G: (81, 29)
Runtime: 23.20s
Part2 image 10
Siren
R: (97, -25) G: (50, -7)
Runtime: 13.92s
Part2 image 11
Three Generations
R: (112, 10) G: (55, 13)
Runtime: 13.22s

Extra self-selected images:

Part2 image 12
Kivach
R: (126, 19) G: (37, 11)
Runtime: 14.02s
Part2 image 13
Isfandiyar
R: (105, 1) G: (41, 5)
Runtime: 13.45s
Part2 image 14
Religious Painting
R: (52, 38) G: (35, 24)
Runtime: 13.18s
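
The coarse-to-fine search used in this part can be sketched as below. This is an illustration under assumptions, not the code that produced the numbers above: all names are hypothetical, the 4 levels are taken to include the full-resolution image, the same window radius is used for height and width, shifts use `np.roll` (wrap-around), and the score is MSE.

```python
import numpy as np

def downsample2(img):
    """Halve resolution by averaging each 2x2 block (odd edges are trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, win_h, win_w):
    """Best (dy, dx) by MSE within win_h / win_w of `center`."""
    best, best_err = center, float("inf")
    for dy in range(center[0] - win_h, center[0] + win_h + 1):
        for dx in range(center[1] - win_w, center[1] + win_w + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            err = np.mean((shifted - reference) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def align_pyramid(channel, reference, windows=(8, 6, 4, 2)):
    """Coarse-to-fine alignment; windows[k] is the radius at level k, coarse first."""
    levels = [(channel, reference)]
    for _ in range(len(windows) - 1):
        c, r = levels[-1]
        levels.append((downsample2(c), downsample2(r)))
    shift = (0, 0)
    for k, (c, r) in enumerate(reversed(levels)):
        if k > 0:
            # A shift of (i, j) at the coarser level is (2i, 2j) at this level.
            shift = (2 * shift[0], 2 * shift[1])
        shift = search(c, r, shift, windows[k], windows[k])
    return shift
```

Under these assumptions, the largest shift reachable with radii (8, 6, 4, 2) is 8·8 + 6·4 + 4·2 + 2 = 98 pixels, which is one way to see why images with larger displacements need wider windows.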

Part 3: Bells and whistles

Aligning emir.tif was harder than I expected: the output image was still poorly aligned after several refinements to my algorithm. Therefore, I decided to use the gradient computed by the Sobel operator as the alignment cue:

Part3 image 1.1
Without using gradient
Part3 image 1.2
After using gradient
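
A gradient-based cue of this kind could be computed as in the sketch below, which applies the standard 3x3 Sobel kernels with plain NumPy slicing. The function name, the edge-replicated padding, and the choice of gradient magnitude (rather than separate x and y gradients) are assumptions of this sketch. The idea is that edges tend to line up across channels even when the raw intensities differ.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from 3x3 Sobel kernels, with edge-replicated padding."""
    p = np.pad(img.astype(float), 1, mode="edge")
    # Horizontal derivative: right column minus left column, rows weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical derivative: bottom row minus top row, columns weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)
```

The alignment search then runs on `sobel_magnitude(channel)` and `sobel_magnitude(reference)` instead of the raw pixel values, and the resulting shift is applied to the original channel.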

Meanwhile, the colors of some of the images were not as realistic as expected, being either too bluish or too yellowish. I attempted to adjust the white balance of the images by matching each channel's mean to that of an assigned channel:

Part3 image 2.1
Church, before adjusting WB
Part3 image 2.2
Church, after adjusting WB
Part3 image 2.3
Lastochikino, before adjusting WB
Part3 image 2.4
Lastochikino, after adjusting WB
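
One way to match channel means is sketched below. It is an illustration under assumptions: the function name is hypothetical, pixel values are taken to lie in [0, 1], the channels are stacked along the last axis, and the match is done with a multiplicative gain per channel (an additive offset would be another option).

```python
import numpy as np

def match_channel_means(rgb, ref_channel=1):
    """Scale every channel so its mean equals that of rgb[..., ref_channel]."""
    rgb = rgb.astype(float)
    means = rgb.mean(axis=(0, 1))
    # Assumption: multiplicative gains; the reference channel gets a gain of 1.
    gains = means[ref_channel] / means
    # Clipping keeps values displayable but can pull the means slightly apart.
    return np.clip(rgb * gains, 0.0, 1.0)
```

Here `ref_channel` plays the role of the assigned channel; changing it changes which channel the other two are pulled toward.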

Summary

This project is not very hard, since the way to build the algorithm is clear. Still, it is an interesting one, even though I have done other image processing projects before. I believe the projects in this course will become more and more interesting later on.