Intro

Computer vision models such as convolutional neural networks use image processing techniques to extract the important features from an image. One of the first steps is often smoothing, or blurring, the image. This makes the distribution of the pixel values closer to normal, which is an assumption of many statistical methods. In this post we will look at how to achieve the kind of blurring effect seen in Instagram or Adobe Photoshop.

Images are represented in a computer using tensors or arrays. An image is stored as an array whose dimensions are its height, width, and number of components. The number of components determines whether the image is color or grayscale: color images typically have 3 components (red, green, and blue), while grayscale images have one. Pixel values may range over [0, 1], [0, 255], or another interval, depending on the image's configuration. In the red component, for example, 0 means the pixel has no red at that position and the maximum value (such as 1) means it is fully red.
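
As a minimal sketch of this representation, the snippet below builds a small synthetic color image instead of reading a file. The 4 x 6 size and the random pixel values are assumptions made purely for illustration.

```r
# Synthetic 4 x 6 RGB image with values in [0, 1] (a stand-in for a real photo).
set.seed(1)
img <- array(runif(4 * 6 * 3), dim = c(4, 6, 3))

dim(img)  # height, width, number of components

# Each component is a 2 dimensional matrix of intensities.
red   <- img[, , 1]
green <- img[, , 2]
blue  <- img[, , 3]

# Rescale from the [0, 1] range to the 8-bit range [0, 255].
img_255 <- round(img * 255)

# A simple grayscale version: the unweighted mean of the three components.
gray <- (red + green + blue) / 3
```

The unweighted mean is only one way to collapse the components into a single one; weighted averages are also common.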

Rotating images

Images fed to a computer vision model can be taken at all angles, and a model may misclassify an image simply because it was taken at a strange angle. A flexible model takes these edge cases into account. Here we read in an image and store it as a 3 dimensional array. There are three components for red, green, and blue, and each component is a 2 dimensional array with width 159 and height 240. Using operations like the matrix transpose, we can rotate the image.

#Clockwise rotations
library(jpeg) #provides readJPEG
img <- readJPEG('images/mud.jpeg')
dims <- dim(img)
red <- img[,,1]
green <- img[,,2]
blue <- img[,,3]
#90 degrees 
#Reverse order of rows, take transpose
r <- t(red[nrow(red):1,])
g <- t(green[nrow(green):1,])
b <- t(blue[nrow(blue):1,])
ninety <- array(c(r, g, b), dim=c(dim(b), 3))
#180 degrees 
#Reverse order of rows and cols
r <- red[nrow(red):1, ncol(red):1]
g <- green[nrow(green):1, ncol(green):1]
b <- blue[nrow(blue):1, ncol(blue):1]
oneeighty <- array(c(r, g, b), dim=c(dim(b), 3))
#270 degrees 
#Take transpose, then reverse its rows (the original columns)
r <- t(red)[ncol(red):1,]
g <- t(green)[ncol(green):1,]
b <- t(blue)[ncol(blue):1,]
twoseventy <- array(c(r, g, b), dim=c(dim(b), 3))
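
The rotation rules above can be checked on a tiny matrix where the result is easy to read. This is only a sketch: `rotate90` and `rotate_image90` are hypothetical helper names introduced here, not part of the code above.

```r
# Rotate a matrix 90 degrees clockwise: reverse the row order, then transpose.
rotate90 <- function(m) t(m[nrow(m):1, , drop = FALSE])

m <- matrix(1:6, nrow = 2, byrow = TRUE)
# m is
#   1 2 3
#   4 5 6

r90  <- rotate90(m)     # 3 x 2, first row is 4 1
r180 <- rotate90(r90)   # same as m[nrow(m):1, ncol(m):1]
r270 <- rotate90(r180)  # same as t(m)[ncol(m):1, ]
r360 <- rotate90(r270)  # back to the original matrix

# Apply the same rotation to every component of a height x width x 3 array.
rotate_image90 <- function(img) {
  d <- dim(img)
  out <- array(0, dim = c(d[2], d[1], d[3]))
  for (k in seq_len(d[3])) out[, , k] <- rotate90(img[, , k])
  out
}
```

Applying the 90 degree rotation four times returns the original matrix, which is a quick sanity check for the row-reversal and transpose order.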

Gaussian Smoothing

A computer vision model applies various mathematical techniques to extract the most important information from an image. A common first step is smoothing the image, which reduces its noise. One way to do this is to apply a Gaussian filter.

The distribution of pixel values in this image is quite irregular, with many peaks and noticeable skew, so computer vision models will struggle to detect anything.

After applying a 2D convolution to the image using a Gaussian filter, we can see that the distribution of the image is much closer to normal!

Smoothing the image is done by performing a 2d convolution. We set values in a 5x5 kernel using a Gaussian discrete approximation. We can see how adjusting the standard deviation of the Gaussian distribution affects the blur of the image. Next, we will explain what the Gaussian distribution is and later what a convolution is.

Gaussian Distribution aka Normal Distribution

When working with images, we will use a 2 dimensional Gaussian function. \[G(x, y) = \frac{1}{2\pi\sigma^2}e^{-\frac{x^2+y^2}{2\sigma^2}}\]

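gaussian2d <- function(x, y) {
  #Sigma is estimated from the data as the standard deviation of x*y
  sigma <- sd(x*y)
  #Normalizing constant 1 / (2*pi*sigma^2)
  part1 <- 1 / (2*pi*sigma^2)
  #Exponential term exp(-(x^2 + y^2) / (2*sigma^2))
  part2 <- exp(-(x^2+y^2) / (2*sigma^2))
  return(part1*part2)
}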

This is what a 2 dimensional Gaussian function looks like. The Gaussian function is continuous, so we need a discrete approximation of it in order to apply it over the image. This works because the values within 1, 2, and 3 standard deviations of the center of the bell curve account for about 68%, 95%, and 99.7% of the Gaussian distribution, so almost all of its weight lies close to the center. We can therefore get a fairly accurate kernel with just 3x3 or 5x5 dimensions.
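
One way to build such a discrete approximation is to evaluate G(x, y) at integer offsets from the kernel center and rescale the weights so they sum to 1. The sketch below assumes an explicit standard deviation `sigma` rather than one estimated from the data, and `gaussian_kernel` is a hypothetical helper name, not a function from this post or from any package.

```r
# Hypothetical helper: size x size Gaussian kernel with an explicit sigma.
gaussian_kernel <- function(size, sigma) {
  stopifnot(size %% 2 == 1, sigma > 0)
  half <- (size - 1) / 2
  offsets <- -half:half  # e.g. -2..2 for a 5x5 kernel
  # Evaluate G(x, y) at every pair of integer offsets from the center.
  g <- outer(offsets, offsets, function(x, y) {
    exp(-(x^2 + y^2) / (2 * sigma^2)) / (2 * pi * sigma^2)
  })
  g / sum(g)  # normalize so the weights sum to 1
}

k3 <- gaussian_kernel(3, sigma = 1)
k5 <- gaussian_kernel(5, sigma = 1)

# Share of a 1 dimensional normal distribution within 1, 2, and 3
# standard deviations of the mean.
coverage <- pnorm(1:3) - pnorm(-(1:3))
round(coverage, 3)  # 0.683 0.954 0.997

# A larger sigma spreads weight away from the center, giving a stronger blur.
center_weights <- sapply(c(0.5, 1, 2), function(s) gaussian_kernel(5, s)[3, 3])
```

Normalizing the weights keeps the overall brightness of the image unchanged after filtering, which is why the sketch divides by the sum instead of using the raw function values.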

Convolution

Convolution is the process of applying the kernel over the original image. We place the anchor of the kernel over every pixel in the image, multiply each kernel value element-wise with the corresponding pixel in the neighborhood (not a matrix product), and then sum the products. That sum is the new pixel value in the transformed image.

2D convolution formula

\[y[i, j] = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} h[m, n] \cdot x[i-m, j-n]\]

Here x is the input image, y is the output (the new image), and h is the kernel matrix. The indices i and j iterate over the image, while m and n iterate over the kernel.
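
For a finite image and kernel, the infinite sums reduce to a loop over the kernel entries. The following is a rough base R sketch of that idea for the "valid" region only, meaning positions where the kernel fits entirely inside the image. `conv2d_valid` is a hypothetical name chosen for this illustration.

```r
# Hypothetical helper: "valid" 2D convolution of image x with kernel h.
conv2d_valid <- function(x, h) {
  kr <- nrow(h)
  kc <- ncol(h)
  out_r <- nrow(x) - kr + 1
  out_c <- ncol(x) - kc + 1
  stopifnot(out_r >= 1, out_c >= 1)
  # The x[i-m, j-n] indexing means the kernel is flipped in both directions.
  h_flip <- h[kr:1, kc:1, drop = FALSE]
  y <- matrix(0, out_r, out_c)
  for (i in seq_len(out_r)) {
    for (j in seq_len(out_c)) {
      patch <- x[i:(i + kr - 1), j:(j + kc - 1), drop = FALSE]
      y[i, j] <- sum(patch * h_flip)  # element-wise product, then sum
    }
  }
  y
}

x <- matrix(1:16, nrow = 4, byrow = TRUE)
box <- matrix(1 / 9, 3, 3)  # 3x3 averaging kernel
y <- conv2d_valid(x, box)   # 2 x 2 result, each entry a local mean
```

For a symmetric kernel such as a Gaussian, the flip has no effect, so convolution and the simpler "slide and sum" description give the same result.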

Applying a 2D convolution with a Gaussian filter reduces the noise in the image, which gives the effect of blurring it. One can do this in R using the following code.

#Make a kernel of size 3x3
#x and y hold each cell's horizontal and vertical distance from the center
x <- c(1, 0, 1, 
       1, 0, 1, 
       1, 0, 1)
y <- c(1, 1, 1,
       0, 0, 0, 
       1, 1, 1)
sd <- 1.0
#Fixed because we are setting the standard deviation
kernel <- gaussian2d_fixed(x, y, sd)
#Apply the convolution
img <- readJPEG('images/noisy_lady.jpeg')
blur <- conv2(img, kernel, 'valid')
blur <- array(blur, dim=c(dim(blur)))
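
To see the noise reduction without needing an image file, here is a self-contained sketch on synthetic data. The gradient image, the noise level, and the plain loops are all assumptions made for illustration. It does not use `gaussian2d_fixed` or `conv2`, whose definitions are not shown in this post.

```r
set.seed(42)
# Synthetic grayscale "image": a smooth left-to-right gradient plus noise.
clean <- matrix(rep(seq(0.2, 0.8, length.out = 40), each = 40), nrow = 40)
noisy <- clean + matrix(rnorm(40 * 40, sd = 0.1), nrow = 40)

# 3x3 Gaussian kernel with sigma = 1, normalized to sum to 1.
offsets <- -1:1
kernel <- outer(offsets, offsets, function(x, y) exp(-(x^2 + y^2) / 2))
kernel <- kernel / sum(kernel)

# "Valid" convolution: visit every position where the kernel fits.
# The Gaussian kernel is symmetric, so flipping it would change nothing.
blur <- matrix(0, nrow(noisy) - 2, ncol(noisy) - 2)
for (i in seq_len(nrow(blur))) {
  for (j in seq_len(ncol(blur))) {
    blur[i, j] <- sum(noisy[i:(i + 2), j:(j + 2)] * kernel)
  }
}

# Compare the remaining noise on the interior pixels shared by both images.
interior <- 2:39
noise_before <- sd(noisy[interior, interior] - clean[interior, interior])
noise_after  <- sd(blur - clean[interior, interior])
```

In this setup `noise_after` comes out smaller than `noise_before`, because each output pixel is a weighted average of nine noisy neighbors.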

Conclusion

Computer vision depends on mathematical formulas that are commonly found in statistics as well. Understanding these statistical and mathematical formulas can give one a deeper understanding of what the computer vision model is doing. I hope you found this application of the Gaussian distribution interesting! Next, we will look at other distributions and see their effect on images.
