(Scikit-) Image processing, via NumPy and SciPy#

This page will explore foundational image processing techniques, as operations on the values in a NumPy image array. First, we will explore how to achieve specific effects using NumPy and SciPy. We will demonstrate what these operations are doing to an image at the level of the array pixels. After that, we will show how more sophisticated extensions of these techniques can be implemented with Scikit-image. We will focus on the way that Scikit-image often uses NumPy and SciPy operations “under the hood”.

Remember that “image processing” is when we do something that analyzes or changes the numbers inside the image array? Well, in fact, all that even the fanciest image processing software is doing is changing the pixel values inside image arrays, in various ways. This is true for image processing software with a graphical user interface, like Adobe Photoshop and the GNU Image Manipulation Program, as well as for code-based image processing software like Scikit-image.

Let’s again build a simple image array, and look at the ways we can use NumPy alone to achieve some pretty radical changes to the original image. We will then look at the specific purposes that such changes are used for, with more complex images.

First, we create do our usual imports, and create our image array:

import numpy as np
import matplotlib.pyplot as plt
import skimage as ski

# Set precision for float numbers
%precision 2

# Set 'gray' as the default colormap
plt.rcParams['image.cmap'] = 'gray'

# Import hints for some exercises, and custom function for showing image
# attributes.
from skitut import hints, show_attributes

# Create our image array.
i_img = np.array([[0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0]],
                  dtype=float)

# Show the image array.
plt.imshow(i_img);
_images/e2d5f31f4442465d729851f9024f8732311e3e3be101d081d87bc275a38d9edb.png

We have already encountered the use of np.flip() as a tool for rudimentary image manipulation. We use it to, well, flip an image array on its head:

# Flip the array.
flipped_i = np.flip(i_img)

# Show the "raw" array pixel values.
flipped_i
array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])
# Display the array with Matplotlib.
plt.imshow(flipped_i);
_images/209add903c33efd7aa20bfa370c7f14595adb25270945d89cd43948cd77a2740.png

Resizing by repeating#

Now, any operation that changes the numbers in the array is a form of image manipulation. The term image processing generally means we are applying image manipulations to achieve a specific purpose - such as improving image quality or clarity.

Let’s say we want to resize our image array. Using NumPy, there are many ways to this same destination. Provided we want to double (or triple, or quadruple) the size along a given dimension, we can achieve what we want using np.repeat().

# Double the image, by repeating each row.
doubled_i_rows = np.repeat(i_img,
                           repeats=2,
                           axis=0)

# Show the "raw" array pixel values.
doubled_i_rows
array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])
# Display the array with Matplotlib.
plt.imshow(doubled_i_rows);
_images/ecdc223ae9a06603c4bf5947849d794ca65b3c8f0ce739a3c74c862eed290d4b.png

We can compare the attributes, including the shape of each array, using a custom function we defined in the first cell of this notebook:

show_attributes(i_img)
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (15, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
show_attributes(doubled_i_rows)
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

We can see that we have twice the number of rows in the doubled_i_rows image.

We can also double along the columns, by setting axis=1:

# Double along the columns.
doubled_i_cols = np.repeat(i_img,
                           repeats=2,
                           axis=1)
doubled_i_cols
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
plt.imshow(doubled_i_cols);
_images/a321a3b41957cba07cac178cc7bcb8eb8859bf2d25170f8848d683e4c1ce7152.png
# Indeed, the columns have doubled.
show_attributes(doubled_i_cols)
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (15, 16)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

By combining these operations, we can double along both the rows and the columns:

# Double the whole image.
doubled_i =  np.repeat(i_img,
                       repeats=2,
                       axis=0)
double_doubled_i =  np.repeat(doubled_i,
                              repeats=2,
                              axis=1)
double_doubled_i
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
plt.imshow(doubled_i);
_images/ecdc223ae9a06603c4bf5947849d794ca65b3c8f0ce739a3c74c862eed290d4b.png
# The original image size was (15, 8).
show_attributes(double_doubled_i)
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 16)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

Resizing an image with skimage#

The ski.transform module contains a function called resize. Somewhat obviously, ski.transform.resize() takes an input image and a requested image shape, and returns an output image of the requested size. Because all computer images are at least 2D arrays, this involves changing the shape of the image. Let’s demonstrate this with the following image array:

# Create an image array.
squares = np.array([[1, 0,],
                    [0, 1,]],
                   dtype=float)

# Show the array ("raw" output from NumPy)
squares
array([[1., 0.],
       [0., 1.]])
# Show the array, visualised with Matplotlib
plt.matshow(squares);
_images/537da698d538dc2e027b8fef4e66635a37eb47d3d8fe0b431d863547b3f90def.png

What happens if we resize squares to (10, 10)? We will use the optional Boolean preserve_range argument for forward compatibility with the next big release of the Scikit-image package. It has the effect of preventing some automatic processing of the range of values in the image array on input.

# Pass our `squares` array to the `ski.transform.resize()` function.
squares_ten_by_ten = ski.transform.resize(squares,
                                          output_shape=(10, 10),
                                          preserve_range=True)

# Show the resized array.
squares_ten_by_ten
array([[0.52, 0.56, 0.6 , 0.56, 0.52, 0.48, 0.44, 0.4 , 0.44, 0.48],
       [0.56, 0.68, 0.8 , 0.68, 0.56, 0.44, 0.32, 0.2 , 0.32, 0.44],
       [0.6 , 0.8 , 1.  , 0.8 , 0.6 , 0.4 , 0.2 , 0.  , 0.2 , 0.4 ],
       [0.56, 0.68, 0.8 , 0.68, 0.56, 0.44, 0.32, 0.2 , 0.32, 0.44],
       [0.52, 0.56, 0.6 , 0.56, 0.52, 0.48, 0.44, 0.4 , 0.44, 0.48],
       [0.48, 0.44, 0.4 , 0.44, 0.48, 0.52, 0.56, 0.6 , 0.56, 0.52],
       [0.44, 0.32, 0.2 , 0.32, 0.44, 0.56, 0.68, 0.8 , 0.68, 0.56],
       [0.4 , 0.2 , 0.  , 0.2 , 0.4 , 0.6 , 0.8 , 1.  , 0.8 , 0.6 ],
       [0.44, 0.32, 0.2 , 0.32, 0.44, 0.56, 0.68, 0.8 , 0.68, 0.56],
       [0.48, 0.44, 0.4 , 0.44, 0.48, 0.52, 0.56, 0.6 , 0.56, 0.52]])
# Show the image attributes.
show_attributes(squares_ten_by_ten)
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (10, 10)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
# Display the image.
plt.imshow(squares_ten_by_ten);
_images/f59de3b41a018accdd230c34e8570ba26c4b3afb325ab5afa567d3a9a52de9b5.png

Well, that is certainly more artistic than the original!

We now have many more unique values in the output array than there were in the input array (the input array contained only 0’s and 1’s), because skimage is interpolating for many new pixels. Interpolation is the process of estimating values for the new pixels which fall in between the array pixels from the original array image, based on the weighted average of the values of the original pixels to which they are nearest.

# Show the `unique` values.
np.unique(squares_ten_by_ten)
array([0.  , 0.2 , 0.2 , 0.2 , 0.32, 0.32, 0.32, 0.32, 0.4 , 0.4 , 0.4 ,
       0.44, 0.44, 0.48, 0.48, 0.52, 0.52, 0.56, 0.56, 0.56, 0.6 , 0.6 ,
       0.6 , 0.68, 0.68, 0.68, 0.68, 0.8 , 0.8 , 0.8 , 1.  ])

The array pixels highlighted in red are the original pixels from the (2, 2) original array:

All the other pixels have been added by skimage during the resize-ing process. Pixels closer to the original pixels share closer intensity values to the original pixel (meaning they are more black or more white, depending on the original pixel). Images further from the original pixels become more gray.

We can control the type of interpolation that skimage uses by changing the (somewhat cryptically named) order argument. Setting order=0 will activate nearest neighbor interpolation. This method of interpolation (estimation) merely uses the nearest existing pixel to give the value for any new pixel in the output image.

# Pass our `squares` array to the `ski.transform.resize()` function.
squares_ten_by_ten = ski.transform.resize(squares,
                                          output_shape=(10, 10),
                                          preserve_range=True,
                                          order=0) # Nearest neighbor

# Show the resized array.
plt.imshow(squares_ten_by_ten);
_images/b5c2bb53083ac8e767a9af251102588858fc42ed2ba1975298e2d2439d264a17.png

This seems much closer to what we want when we resize the image. However, the results of image processing are highly context-dependent, and there may be images for which the default interpolation setting works better…

Rotation#

Another common image manipulation we may want to do is to rotate an image.

Rotations in 90 degree increments#

Should we only want to rotate by increments of 90 degrees, we can use the helpfully named np.rot90() function:

# Rotate the image.
rotated_i = np.rot90(i_img)
rotated_i
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
plt.imshow(rotated_i);
_images/19d9b78b04a9a44c84a6d7e2cc6901ec92132457f2ad81f200fc24d78199fe8e.png

We can control the number of rotations with the k argument:

# Rotate the image, twice!
rotated_i_180 = np.rot90(i_img,
                         k=2) # Two 90 degree rotations.
rotated_i_180
array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])
plt.imshow(rotated_i_180);
_images/209add903c33efd7aa20bfa370c7f14595adb25270945d89cd43948cd77a2740.png

Rotating in increments of 90 degrees will not change the size (e.g. number of pixels) in the array, however it will change the integer index location of the pixel values:

# Show the shape of the original image and both 90 degree rotated images.
plt.subplot(1, 3, 1)
plt.title(f"`.shape` = {i_img.shape}")
plt.imshow(i_img)
plt.subplot(1, 3, 2)
plt.title(f"`.shape` = {rotated_i.shape}")
plt.imshow(rotated_i)
plt.subplot(1, 3, 3)
plt.title(f"`.shape` = {rotated_i_180.shape}")
plt.imshow(rotated_i_180);
_images/3f6d04921436781726746514f2f9d0a7a3db36776e9838e361b2e2a85ae68f8c.png
# Original image and 90 degree rotations all have the same number of elements (15*8 = 120)
i_img.size == rotated_i.size == rotated_i_180.size
True

Rotations by arbitrary angles with Scipy#

To rotate an image by more flexible increments than 90 degrees, we need to bring in SciPy, another foundation library for Scikit-image. The SciPy function ndimage.rotate() offers more flexible rotation. However, rotating by other angles will alter both the shape and size of the output image:

# Import SciPy using the conventional name (`sp`).
import scipy as sp

# Rotate the image by 193 degrees.
rotated_i_193 = sp.ndimage.rotate(i_img,
                                  angle=193) # Specify the rotation angle.

# Show the "raw" array.
rotated_i_193
array([[ 0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,
         0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  9.19e-02,  3.28e-01,  1.46e-02,
        -9.30e-03,  1.14e-03,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  1.19e-02, -7.40e-02,  8.85e-01,  1.30e+00,  2.34e-01,
        -5.79e-02,  9.74e-03,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  2.30e-02, -9.94e-02,  5.32e-01,  1.16e+00,  4.46e-01,
        -9.18e-02,  1.97e-02,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  1.33e-02, -7.05e-02,  2.94e-01,  1.18e+00,  7.22e-01,
        -9.12e-02,  2.01e-02,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  3.35e-03, -2.49e-02,  7.87e-02,  1.06e+00,  9.45e-01,
        -2.01e-02,  1.59e-03,  3.04e-05,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  1.05e-02, -5.31e-02,  8.77e-01,  1.11e+00,
         1.43e-01, -4.04e-02,  6.37e-03,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  2.16e-02, -9.77e-02,  6.31e-01,  1.19e+00,
         3.73e-01, -8.34e-02,  1.70e-02,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  1.69e-02, -8.35e-02,  3.76e-01,  1.20e+00,
         6.38e-01, -9.80e-02,  2.14e-02,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  6.43e-03, -4.04e-02,  1.40e-01,  1.07e+00,
         8.46e-01, -5.22e-02,  1.08e-02,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  2.94e-05,  1.59e-03, -2.13e-02,  1.05e+00,
         1.21e+00,  8.46e-02, -2.23e-02,  2.08e-03,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  2.07e-02, -7.40e-02,  3.77e-01,
         2.86e-01,  1.34e-02, -6.91e-04,  2.08e-03,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  2.41e-04, -3.74e-03,  1.05e-01,
         6.11e-01,  4.42e-01, -1.04e-01,  2.57e-02,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  8.91e-03, -6.26e-02,  2.63e-01,
         1.43e+00,  9.24e-01, -7.36e-02,  1.14e-02,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  1.17e-03, -9.12e-03,  1.40e-02,
         3.06e-01,  8.45e-02,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,
         0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00]])
# Render the image graphically.
plt.imshow(rotated_i_193);
_images/de3e5f8637aec11dcc177d81ca1ecc1b4807d581f026f3f9e4c0f9755df82e98.png
# Show the attributes of the rotated image.
show_attributes(rotated_i_193)
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (16, 11)
Max Pixel Value: 1.43
Min Pixel Value: -0.1

The cell below will loop through some different rotation angles, the shape of each image is shown below each plot:

# A for loop to show multiple rotations, and the effect on
# the shape of the resultant image array.
plt.figure(figsize=(12, 4))
for i, i_2 in enumerate(np.arange(361, step=45)):
    plt.subplot(1, 9, i+1)
    current_rot = sp.ndimage.rotate(i_img, 
                                    angle=i_2)
    plt.imshow(current_rot)
    plt.title(f"{i_2}°")
    plt.xlabel(f"{current_rot.shape}")
    plt.xticks([])
    plt.yticks([])
_images/fdfa5c78f0619ed0b28ffc66329546631ca2c93d82615786d078be9b4a1207c9.png

By default, the shape is altered so that the rotated original array is shown within the output array. SciPy uses interpolation to estimate the values of the pixels it adds, where the shape of the output image is larger than the shape of the input image.

We can disable this behaviour by settings reshape=False, however, this means that we will clip any parts of the array that have been rotated out of the field of view of the original array shape.

# A for loop to show multiple rotations, and the effect on
# the shape of the resultant image array, but this time
# we do not allow SciPy to reshape the output arrays.
plt.figure(figsize=(12, 4))
for i, i_2 in enumerate(np.arange(361, step=45)):
    plt.subplot(1, 9, i+1)
    current_rot = sp.ndimage.rotate(i_img,
                                    angle=i_2,
                                    reshape=False) # Don't reshape output.
    plt.imshow(current_rot)
    plt.title(f"{i_2}°")
    plt.xlabel(f"{current_rot.shape}")
    plt.xticks([])
    plt.yticks([])
_images/c2db327cff9c8b44a106d8cd0da85c14dc3ff00d94738258441ca8639ce19bf3.png

Rotating with skimage#

Let’s now look now at how rotating image arrays is handled in skimage. Image rotation, which we saw above using np.rot90 and scipy.ndimage.rotate() can be achieved using the straightforwardly named ski.transform.rotate(), and the syntax works identically to scipy.ndimage.rotate(). All this rotating has left us thirsty and caffeine-deprived, so let’s get some coffee:

# Import and show an image.
coffee = ski.data.coffee()
plt.imshow(coffee);
_images/97fe6e20ae416a866bb7ec14a2e5b284f2a37fea6a05fbac65c3646d0ddd8270.png

We can achieve easy and flexible rotation with ski.transform.rotate():

# Rotate the `coffee` image with `skimage`.
# resize=True ensures all the original image fits inside the output.
rotated_coffee = ski.transform.rotate(coffee,
                                      angle=75,
                                      resize=True)

plt.imshow(rotated_coffee);
_images/909af6b8f1420db6e08295f4a0d49fe47fc56fc2ac257555339ed188f7ae74e0.png

The cell below plots a variety of rotations, using skimage.transform.rotate() to perform each rotation, this time disabling resize of the output to fit the rotated input.

# Many rotations...
plt.figure(figsize=(16, 10))
for i, i_2 in enumerate(np.arange(361, step=45)):
    plt.subplot(3, 3, i+1)
    current_rot = ski.transform.rotate(coffee,
                                       angle=i_2,
                                       resize=False)
    plt.imshow(current_rot)
    plt.title(f"{i_2}°")
    plt.xticks([])
    plt.yticks([])
_images/b295c0df6206cd0d1707bb5343ebdcb2fe9c586d1a1bdde8ac30f68bd41fae04.png

Rotation compared to flips and transpos#

Rotating is a different operation that flipping the image with np.flip. Flipping causes a reflection in the image around its center. The difference between rotation and applying a flip becomes obvious with an image that is not left-right symmetrical.

The cell below demonstrates np.flip-ping an image, as well as np.rot-ating an image by 180 degrees:

# Load in `camera`
camera = ski.data.camera()

# Rotate, flip 'n' plot!
plt.figure(figsize=(14, 4))
plt.subplot(1, 3, 1)
plt.imshow(camera)
plt.title('Original')
plt.subplot(1, 3, 2)
plt.imshow(np.rot90(camera, k=2))
plt.title('np.rot90(k=2)')
plt.subplot(1, 3, 3)
plt.imshow(np.flip(camera))
plt.title('np.flip()');
_images/eca27dd0ebcff97883ac79b3f3eeced5d3f9d29f54615c2aa01f60cbafaf501a.png

Similarly, rotations differ from transpose operations on the array.

Specifically for 90 degree rotations, you might be tempted to use a NumPy shortcut, and use the .T (transpose) method. This however, will do something different to rotation. The cell below demonstrates the .T method, with the camera image:

# Transpose `camera`.
camera_transposed = camera.T
show_attributes(camera_transposed)
plt.imshow(camera_transposed)
plt.title("camera.T");
Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 255
Min Pixel Value: 0
_images/b37afc8a257d6d001851fc9fdfd890e388f15a70509f111756a988fa5931a358.png

We now show a 90 degree rotation, using ski.transform.rotate()

# Rotate by 90 degrees.
plt.imshow(ski.transform.rotate(camera, 
                                angle=90))
plt.title('ski.transform.rotate()');
_images/c81025159ea6474d497721608e43edf5e4dfa324f0571b40dc573b4a6cdd3b4b.png

We can see that the cameraman is facing a different direction in each image (taking a photo of the bottom of the image for the .T method, and taking a photo of the top of the image for a 90 degree rotation via skimage).

The difference here is that transposing an image switches the rows and columns, such that the first row becomes the first column etc. Conversely, skimage.transform.rotate() pivots the pixels around a central point. Essentially, transposing gives a mirroring effect which is different from a rotation.

Pay attention to the location of the spoon in the coffee image. First, we ski.transform.rotate() it by 90 degrees. Then, we show it .transposed, switching the rows and columns. As coffee is a 3D image, the .T method will produce an error, because the color channels will be moved into the wrong dimension - to avoid this we use the .transpose method, to keep the color channels in the third dimension, whilst switching the rows and columns:

# The `shape` of the original `coffee` image.
coffee.shape
(400, 600, 3)
# Why we cannot use the `.T` method. We get an array which
# is the wrong `shape` for a color image!
coffee.T.shape
(3, 600, 400)
# Move the columns into the rows, the rows into the columns, and leave the
# color channels in the third dimension.
coffee_transposed = coffee.transpose((1, 0, 2))
plt.imshow(coffee_transposed)

# Show the attributes (not that the `shape` is still correct for a color
# image).
show_attributes(coffee_transposed)
Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (600, 400, 3)
Max Pixel Value: 255
Min Pixel Value: 0
_images/e092896a32985974d8bfe34dddd4251f5a45d69285dddc58178c10b94f948bea.png

Compare this to a 90 degree rotation via skimage; pay attention to the spoon!

# Show the difference between rotating and transposing.
plt.subplot(1, 2, 1)
plt.imshow(ski.transform.rotate(coffee, angle=90, resize=True))
plt.title("ski.transform.rotate()")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(coffee_transposed)
plt.title(".transpose()")
plt.axis('off');
_images/3f1caf516bedb2502b0d5360fbbac189ca3cb8009774db1a991c2ab8f1c8b592.png

Unless you specifically want a mirroring transformation, then use .rotate()!

Cropping#

The process of cropping is the removal of areas of pixels from an image.

Because our images are just NumPy arrays, cropping is just NumPy indexing (duh!). As such, we can crop images just with indexing operations, without using specific NumPy (or skimage) functions.

For instance, we can “shave” our i_img array in half, along the columns by slicing along the columns:

# Cut in half.
half_i = i_img[:, 4:8]
half_i
array([[0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 0., 0., 0.]])
plt.imshow(half_i);
_images/d49cf5fc6278b914715513a6ba4e3407bb4b73221ae93baf52d055534f43b0e1.png

Likewise along the rows (albeit the number of rows is odd!):

plt.imshow(i_img[0:8, :]);
_images/53e4ef5e55ffd250477e11cfac8c56455e60026b5dd8764ec05cf70ca3103380.png

Masks#

In image processing, a mask is an array where the elements express weights or binary (0 or 1) values to select areas in another image (array).

A mask is “placed” on an image, and one typically then applies operations to the pixels indicated by the mask. For example, one might use the mask with an image array to replace pixels indicated by the mask with a specific value. Let’s demonstrate with a real image — a grayscale version of the standard coffee image:

# Make coffee RGB image into single-channel image.
coffee_gray = ski.color.rgb2gray(ski.data.coffee())
show_attributes(coffee_gray)
plt.imshow(coffee_gray);
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (400, 600)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
_images/d82d52811692c04712ff3dd4c3535ca3c131a987f9aaed636b100e92840c3bfd.png

In this case we are going to create a new mask image, that corresponds to the coffee_gray image (has the same shape), but where we will create the mask values with a mathematical formula using the row and column indices. In fact we’ll do this to create a circular mask. Bear with us, all should be come clear as we go.

# Unpack and store the number of rows and number of columns.
dim_0, dim_1 = coffee_gray.shape
dim_0, dim_1
(400, 600)

We are going to create a circular mask, using the formula for a circle.

That formula needs the i (row) and j coordinates for each pixel, the center coordinate of the circle, and the radius r of the desired circle.

Call the row and column center coordinates \(c_i, c_j\) respectively.

We can tell if a particular pixel at (i, j) is outside the circle by testing whether the Euclidean distance of the pixel position (i, j) from the center is greater than \(r\).

\[ \sqrt{(i - c_i)^2 + (j - c_j)^2} > r \]

First we use the Numpy meshgrid function to return two arrays, one containing all the i (row) coordinates at each pixel, and another containing all the j (column) coordinates at each pixel.

# indexing='ij' tells meshgrid to return `i` and `j` coordinates.  There are other modes, not relevant here.
i_coords, j_coords = np.meshgrid(np.arange(dim_0), np.arange(dim_1), indexing='ij')
# i coordinate for each element.
i_coords
array([[  0,   0,   0, ...,   0,   0,   0],
       [  1,   1,   1, ...,   1,   1,   1],
       [  2,   2,   2, ...,   2,   2,   2],
       ...,
       [397, 397, 397, ..., 397, 397, 397],
       [398, 398, 398, ..., 398, 398, 398],
       [399, 399, 399, ..., 399, 399, 399]], shape=(400, 600))
# j coordinate for each element.
j_coords
array([[  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599],
       ...,
       [  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599]], shape=(400, 600))
# The coordinate of the image center in pixels.  Remember that pixel indices
# start at 0.
c_i, c_j = (dim_0 - 1) / 2, (dim_1 - 1) / 2
# Radius
r = 275

We can then use the formula above to generate a 2D array where True (== 1) means outside the circle and False (== 0) means inside the circle.

# Create a circular mask.
mask = np.sqrt((i_coords - c_i) ** 2 + (j_coords - c_j) ** 2) > r
show_attributes(mask)
plt.imshow(mask);
Type: <class 'numpy.ndarray'>
dtype: bool
Shape: (400, 600)
Max Pixel Value: True
Min Pixel Value: False
_images/c4a9148ba2313342c28dca009ce893a938f0fc0681e4ef2e74a754e3d1f87436.png

Once we have our mask - which is just a Boolean array - it is just a matter of Boolean indexing to set all the pixel values in the image to the same value, where there is a True in the corresponding element in the mask:

# Apply the mask.
coffee_gray_masked = coffee_gray.copy()
coffee_gray_masked[mask] = 0

plt.matshow(coffee_gray_masked);
_images/30aecc7b34cc5409048f2e563d4e56f26945cc833cd94251a78d05a77fa6e915.png

Inverting image colors with Numpy#

We saw color inversion on an earlier page. This is where all the pixel values in an image are, shockingly, inverted: high numbers become low numbers and vice versa:

For a binary image, this involves swapping 1s and 0s…

# Original image
i_img
array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])
plt.imshow(i_img);
_images/e2d5f31f4442465d729851f9024f8732311e3e3be101d081d87bc275a38d9edb.png

…which can be accomplished with some simple numeric operations:

inverted_i = 1 - i_img
inverted_i
array([[1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1.]])
plt.imshow(inverted_i);
_images/2d8e811c0a14ecbc1fab91f7354a8bc4fd87a807f43c8dc62f8372118b4eb837.png

What about a color image?

colorwheel = ski.data.colorwheel()
show_attributes(colorwheel)
plt.imshow(colorwheel);
Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (370, 371, 3)
Max Pixel Value: 255
Min Pixel Value: 0
_images/a45f666e5776e35adfcfa813430b2f2d3006553ddc5420ec486e87d226653365.png

Because the maximum value is now 255, we can subtract each array pixel value in each color channel from 255 to “reverse” the values:

# Invert the color image, manually.
inverted_colorwheel = colorwheel.copy()
for i in np.arange(3):
    inverted_colorwheel[:, :, i] = 255 - inverted_colorwheel[:, :, i] 

plt.imshow(inverted_colorwheel);
_images/23537d3a744b91511a23625f2f053c610e2d91c008f648dc04efa41f8920764f.png

Inverting colors with skimage#

ski.util.invert() handles color inversion, we simply pass it a color NumPy image array and voilà!:

# Invert the color image, with `skimage`.
invert_colorwheel_with_skimage = ski.util.invert(colorwheel)
plt.imshow(invert_colorwheel_with_skimage);
_images/23537d3a744b91511a23625f2f053c610e2d91c008f648dc04efa41f8920764f.png

Greyscale to binary conversion#

Greyscale to binary conversion can be achieved using comparison operators (<, >, <=, >=, ==). The resulting Boolean array will always be binary (True or False, 1 or 0).

Let’s demonstrate with a grayscale image:

# Create a grayscale image.
# Make a random number generator, with predictable outputs.
rng = np.random.default_rng(10)
random_check = np.array([[1, 0, 1, 0],
                         [0, 1, 0, 1],
                         [1, 0, 1, 0],
                         [0, 1, 0, 1]], dtype=int)
# Replace ones with random integers.
n_checks = np.count_nonzero(random_check)
random_check[random_check == 1] = rng.integers(3, 12,  # From 3 through 11.
                                               size=n_checks)
plt.matshow(random_check);
_images/3443bf1db602944693def16c48eec84dc7e0cb1f18466125b591d329b9ad9204.png
# Convert to a binary image:
binary_check = random_check > np.median(random_check)
binary_check
array([[ True, False,  True, False],
       [False,  True, False,  True],
       [ True, False,  True, False],
       [False,  True, False,  True]])
plt.matshow(binary_check);
_images/c1818365f2f5e6dfe7ddae7e6cbbba5c1ce017291453bdc561b981468d1a7324.png

We can also use skimage to do this work - see the filtering page for more detail. For now, we can use the ski.filters.threshold_minimum() function. This supplies us a recommended threshold value to attempt to divide the array pixels into two classes e.g. two classes where the pixels in each class are maximally different from pixels in the other class:

# Get a recommended threshold from `skimage`.
threshold = ski.filters.threshold_minimum(random_check)
threshold
np.int64(2)

We can then use this threshold to create a binary array, successfully binarizing our grayscale image:

# Binarize the array, based on the threshold.
binary_check_from_ski = random_check > threshold
show_attributes(binary_check_from_ski)
plt.matshow(binary_check_from_ski);
Type: <class 'numpy.ndarray'>
dtype: bool
Shape: (4, 4)
Max Pixel Value: True
Min Pixel Value: False
_images/c1818365f2f5e6dfe7ddae7e6cbbba5c1ce017291453bdc561b981468d1a7324.png

Color to grayscale conversion with Numpy#

To downgrade a color image to grayscale we can use a brute force method of taking the mean of the three color channels, to produce a 2D monochrome image array:

gray_wheel = np.mean(colorwheel, axis=2)
plt.imshow(gray_wheel);
_images/b48f40f237e716152b1b6ca7866c7e1183b7cb5a4800bbfc1a262e1fffc306f7.png

A better option, that we encountered on the previous page, is to use the luminance formula:

\[ Y = 0.2126R + 0.7152G + 0.0722B \]

This collapses a 3D color image into a 2D grayscale image via a weighted sum of the channels.

Color to grayscale conversion with skimage#

As always, the ski.color module has us covered here with the rgb2gray() function. We simply pass it the color array that we want to convert to grayscale, without the direct need for the luminance formula:

# Convert color to grayscale.
gray_colorwheel_from_ski = ski.color.rgb2gray(colorwheel)
show_attributes(gray_colorwheel_from_ski)
plt.imshow(gray_colorwheel_from_ski);
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (370, 371)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
_images/1e80785bd9b842a80dd933939a9c22ab33656fa8602973c0c9d03912f0b630c8.png

Summary#

This page has shown how to implement some fundamental image processing operations with NumPy, SciPy and Scikit-image. The next page will delve into image filtering.

References#

Gulati, J. (2024) NumPy for Image Processing. KDnuggets. Available from: https://www.kdnuggets.com/numpy-for-image-processing

Also see color images page references.