(Scikit-) Image processing, via NumPy and SciPy

Note

This tutorial is adapted from “Image manipulation and processing using NumPy and SciPy” by Emmanuelle Gouillart and Gaël Varoquaux, and “scikit-image: image processing” by Emmanuelle Gouillart. Please see the References section at the end of the page for other sources and resources.

This page will explore foundational image processing techniques, as operations on the values in a NumPy image array. First, we will explore how to achieve specific effects using NumPy and SciPy. We will demonstrate what these operations are doing to an image at the level of the array pixels. After that, we will show how more sophisticated extensions of these techniques can be implemented with Scikit-image. We will focus on the way that Scikit-image often uses NumPy and SciPy operations “under the hood”.

Remember that “image processing” is when we do something that analyzes or changes the numbers inside the image array? Well, in fact, all that even the fanciest image processing software is doing is changing the pixel values inside image arrays, in various ways. This is true for image processing software with a graphical user interface, like Adobe Photoshop and the GNU Image Manipulation Program, as well as for code-based image processing software like Scikit-image.

Let’s again build a simple image array, and look at the ways we can use NumPy alone to achieve some pretty radical changes to the original image. We will then look at the specific purposes that such changes are used for, with more complex images.

First, we do our usual imports and create our image array:

import numpy as np
import matplotlib.pyplot as plt
import skimage as ski

# Set precision for float numbers
%precision 2

# Set 'gray' as the default colormap
plt.rcParams['image.cmap'] = 'gray'

# Import hints for some exercises, and custom function for showing image
# attributes.
from skitut import hints, show_attributes

# Create our image array.
i_img = np.array([[0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0]],
                  dtype=float)

# Show the image array.
plt.imshow(i_img);

_images/e2d5f31f4442465d729851f9024f8732311e3e3be101d081d87bc275a38d9edb.png

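We have already encountered the use of np.flip() as a tool for rudimentary image manipulation. We use it to, well, flip an image array on its head. Called without an axis argument, np.flip() reverses the array along every axis, so both the rows and the columns are reversed: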

# Flip the array.
flipped_i = np.flip(i_img)

# Show the "raw" array pixel values.
flipped_i

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

# Display the array with Matplotlib.
plt.imshow(flipped_i);

_images/209add903c33efd7aa20bfa370c7f14595adb25270945d89cd43948cd77a2740.png
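To see what np.flip() does to the array indices, the sketch below compares a flip along each axis with a flip along both. The array `small` is a tiny hypothetical example, not part of the tutorial; the `axis` argument is part of np.flip() itself.

```python
import numpy as np

# A tiny, hypothetical array whose values make each flip easy to follow.
small = np.array([[1, 2, 3],
                  [4, 5, 6]])

# Reverse the order of the rows (top-to-bottom flip).
flipped_rows = np.flip(small, axis=0)

# Reverse the order of the columns (left-to-right flip).
flipped_cols = np.flip(small, axis=1)

# With no `axis` argument, both the rows and the columns are reversed.
flipped_both = np.flip(small)

print(flipped_rows)
print(flipped_cols)
print(flipped_both)
```

Because i_img is left-right symmetrical, only the reversal of the rows is visible in the flipped image above.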

Resizing by repeating#

Now, any operation that changes the numbers in the array is a form of image manipulation. The term image processing generally means we are applying image manipulations to achieve a specific purpose - such as improving image quality or clarity.

Let’s say we want to resize our image array. Using NumPy, there are many ways to reach this same destination. Provided we want to double (or triple, or quadruple) the size along a given dimension, we can achieve what we want using np.repeat().

# Double the image, by repeating each row.
doubled_i_rows = np.repeat(i_img,
                           repeats=2,
                           axis=0)

# Show the "raw" array pixel values.
doubled_i_rows

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

# Display the array with Matplotlib.
plt.imshow(doubled_i_rows);

_images/ecdc223ae9a06603c4bf5947849d794ca65b3c8f0ce739a3c74c862eed290d4b.png

We can compare the attributes, including the shape of each array, using a custom function we imported in the first cell of this notebook:

show_attributes(i_img)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (15, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

show_attributes(doubled_i_rows)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

We can see that we have twice the number of rows in the doubled_i_rows image.

We can also double along the columns, by setting axis=1:

# Double along the columns.
doubled_i_cols = np.repeat(i_img,
                           repeats=2,
                           axis=1)
doubled_i_cols

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

plt.imshow(doubled_i_cols);

_images/a321a3b41957cba07cac178cc7bcb8eb8859bf2d25170f8848d683e4c1ce7152.png

# Indeed, the columns have doubled.
show_attributes(doubled_i_cols)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (15, 16)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

By combining these operations, we can double along both the rows and the columns:

# Double the whole image.
doubled_i =  np.repeat(i_img,
                       repeats=2,
                       axis=0)
double_doubled_i =  np.repeat(doubled_i,
                              repeats=2,
                              axis=1)
double_doubled_i

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

plt.imshow(double_doubled_i);

# The original image size was (15, 8).
show_attributes(double_doubled_i)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 16)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
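The two-step doubling above can also be written as one nested call. The sketch below does this on a small hypothetical array (`small` is an assumption for illustration) and checks the result against np.kron(), which is another NumPy route to the same block-wise enlargement.

```python
import numpy as np

# A small, hypothetical stand-in for an image array.
small = np.array([[0., 1.],
                  [1., 0.]])

# Repeat each row twice, then each column twice.
doubled = np.repeat(np.repeat(small, repeats=2, axis=0),
                    repeats=2, axis=1)

# The same result via the Kronecker product with a 2 x 2 block of ones.
doubled_kron = np.kron(small, np.ones((2, 2)))

print(doubled)
print(doubled.shape)
print(np.array_equal(doubled, doubled_kron))
```

Either way, each original pixel becomes a 2 x 2 block of identical pixels, so no new values appear in the array.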

Exercise 11

Use NumPy operations only to create the following image array, using the i_img array as your starting point:

Your final array should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

Hint: you may want to investigate NumPy functions for combining arrays

i_exclam = i_img.copy()
# YOUR CODE HERE

Solution to Exercise 11

You can do this in one line of code by using a combination of np.concat() and np.flip(), with plt.imshow() to display the result. Notably, np.concat() will “stick” (concatenate) arrays together, rather than np.repeat()-ing the values. (np.concat() is available from NumPy 2.0; on older versions, np.concatenate() does the same job.)

Combined with np.flip() we can produce the desired result:

# Combine the two `i_img` arrays, flip the second.
plt.imshow(np.concat([i_img, np.flip(i_img)]));

_images/89f1b41ae46927713fdc46e31adcfd1ed7c0ab38fbde611fd8a595fc499fae15.png
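For readers on a NumPy version older than 2.0, a sketch of the same idea using np.concatenate() and np.vstack() follows. The array `small` is a hypothetical stand-in for i_img.

```python
import numpy as np

# A small, hypothetical array standing in for `i_img`.
small = np.array([[0., 1.],
                  [0., 0.],
                  [0., 1.]])

# Stack the array on top of a flipped copy of itself, along the rows.
stacked = np.concatenate([small, np.flip(small)], axis=0)

# `np.vstack()` also stacks along the rows, and gives the same result here.
stacked_v = np.vstack([small, np.flip(small)])

print(stacked.shape)
print(np.array_equal(stacked, stacked_v))
```

The number of rows doubles and the number of columns is unchanged, which matches the (30, 8) target shape when the input is the (15, 8) i_img array.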

Exercise 12

Here again is the i_img array, and a printout of its attributes:

plt.imshow(i_img)
show_attributes(i_img);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (15, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

Your task is to find a way, again using only NumPy, to alter the i_img array so it becomes this target image:

The output image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (15, 7)
Max Pixel Value: 1.0
Min Pixel Value: -1.0

Note: notice how we have lost a column, relative to the original i_img array… We also now have -1 values in the array. Have a think about which colors in the displayed image you think that these negative values will correspond to.

Hint: there are various ways to do this, but the most efficient way we could think of is one short line of NumPy processing. You might consider having a look at Functions on arrays for inspiration.

Hint: run the function hints.split_i(), which was imported at the beginning of this notebook, to see a helpful hint.

# YOUR CODE HERE
split_i = ...

Solution to Exercise 12

The target image can be created using the np.diff() function, which calculates the difference between adjacent values in an array. By default np.diff() operates in steps of 1 across the columns:

# Calculate the difference between values along the columns, in steps of 1.
split_i_solution = np.diff(i_img)

# Show the "raw" NumPy array.
print(split_i_solution)

# Display the image.
plt.imshow(split_i_solution);

[[ 0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.]]

_images/1fda69c491389043f1e3da1f976816e5e32fc3394dfc9c272a52af0ffa1d9da6.png

Why is the background now gray? Well, remember that, for a single-channel image, colors arise from applying the current colormap. Visualised with the gray colormap: the most intense (highest value) pixels are white, and the least intense (lowest value) pixels are black, and intermediate values are a shade of gray.

When we subtract along the columns with np.diff(), each value is subtracted from the value to its right (the third value is subtracted from the fourth value, etc.), and so the resulting differences are calculated as:

1 - 0 = 1 for 1 values on the left-hand side of the original image, and…
…0 - 1 = -1 for 1 values on the right-hand side of the original image…
…and 1 - 1 = 0 for adjacent 1’s; 0 - 0 = 0 for adjacent 0’s.
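The arithmetic above can be checked on a single row of the image. This is a minimal sketch; the names `row`, `row_diff` and `row_diff_manual` are introduced here for illustration only.

```python
import numpy as np

# One row of the image that crosses the white bar.
row = np.array([0., 0., 0., 1., 1., 0., 0., 0.])

# `np.diff()` subtracts each value from the value that follows it.
row_diff = np.diff(row)

# The same calculation written with slicing.
row_diff_manual = row[1:] - row[:-1]

print(row_diff)
print(np.array_equal(row_diff, row_diff_manual))
```

Eight values give seven differences, which is why the output image has one column fewer than the original.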

Essentially we are detecting edges (big changes in intensity between adjacent pixels) in the horizontal direction (i.e. within each row, along the columns):

# Show before/after applying `np.diff()` to the image array.
print(f'\nBefore:\n {i_img}')
print(f'\nAfter:\n {split_i_solution}')

Before:
 [[0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]]

After:
 [[ 0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  1.  0. -1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.]]

So our highest values are now 1, our lowest values are -1, and our intermediate values are 0 (i.e. where the difference between elements was calculated from 1 - 1 = 0 or 0 - 0 = 0, indicating no change in the gradient of pixel intensity). A range from -1 to 1 is still supported in skimage for the float64 dtype.

White areas (pixel = 1) indicate a change between pixels in the horizontal direction (moving left-to-right) from black-to-white, in the original image. Black areas (pixel = -1) indicate a change in the horizontal direction from white-to-black. Grey areas indicate no change (pixel = 0).

We can detect big changes (edges) in the vertical direction by telling np.diff() to operate along the rows, rather than along the columns:

vertical_gradients = np.diff(i_img,
                             axis=0) # Subtract along the rows, not the columns.

plt.imshow(vertical_gradients);

_images/66e99086e79991cf258b69bab0223484ba3b77d4ccac5529339b3eb5807013fa.png
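As a sketch of how the two directions compare, the cell below applies np.diff() along each axis of a small hypothetical image (`small`, a white square on a black background). It also takes the absolute value, which is one simple way to mark an edge regardless of whether it goes from black to white or from white to black.

```python
import numpy as np

# A small, hypothetical image: a white square on a black background.
small = np.zeros((4, 4))
small[1:3, 1:3] = 1

# Differences between neighboring columns (one fewer column).
horizontal = np.diff(small, axis=1)

# Differences between neighboring rows (one fewer row).
vertical = np.diff(small, axis=0)

# The absolute value marks edges regardless of their direction.
vertical_edges = np.abs(vertical)

print(horizontal.shape, vertical.shape)
print(vertical_edges)
```

Note that the axis we difference along is the one that loses an element, so the two results have different shapes.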

Resizing an image with `skimage`#

The ski.transform module contains a function called resize. Somewhat obviously, ski.transform.resize() takes an input image and a requested image shape, and returns an output image of the requested size. Because all computer images are at least 2D arrays, this involves changing the shape of the image. Let’s demonstrate this with the following image array:

# Create an image array.
squares = np.array([[1, 0,],
                    [0, 1,]],
                   dtype=float)

# Show the array ("raw" output from NumPy)
squares

array([[1., 0.],
       [0., 1.]])

# Show the array, visualised with Matplotlib
plt.matshow(squares);

_images/537da698d538dc2e027b8fef4e66635a37eb47d3d8fe0b431d863547b3f90def.png

What happens if we resize squares to (10, 10)? We will use the optional Boolean preserve_range argument for forward compatibility with the next big release of the Scikit-image package. It has the effect of preventing some automatic processing of the range of values in the image array on input.

# Pass our `squares` array to the `ski.transform.resize()` function.
squares_ten_by_ten = ski.transform.resize(squares,
                                          output_shape=(10, 10),
                                          preserve_range=True)

# Show the resized array.
squares_ten_by_ten

array([[0.52, 0.56, 0.6 , 0.56, 0.52, 0.48, 0.44, 0.4 , 0.44, 0.48],
       [0.56, 0.68, 0.8 , 0.68, 0.56, 0.44, 0.32, 0.2 , 0.32, 0.44],
       [0.6 , 0.8 , 1.  , 0.8 , 0.6 , 0.4 , 0.2 , 0.  , 0.2 , 0.4 ],
       [0.56, 0.68, 0.8 , 0.68, 0.56, 0.44, 0.32, 0.2 , 0.32, 0.44],
       [0.52, 0.56, 0.6 , 0.56, 0.52, 0.48, 0.44, 0.4 , 0.44, 0.48],
       [0.48, 0.44, 0.4 , 0.44, 0.48, 0.52, 0.56, 0.6 , 0.56, 0.52],
       [0.44, 0.32, 0.2 , 0.32, 0.44, 0.56, 0.68, 0.8 , 0.68, 0.56],
       [0.4 , 0.2 , 0.  , 0.2 , 0.4 , 0.6 , 0.8 , 1.  , 0.8 , 0.6 ],
       [0.44, 0.32, 0.2 , 0.32, 0.44, 0.56, 0.68, 0.8 , 0.68, 0.56],
       [0.48, 0.44, 0.4 , 0.44, 0.48, 0.52, 0.56, 0.6 , 0.56, 0.52]])

# Show the image attributes.
show_attributes(squares_ten_by_ten)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (10, 10)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

# Display the image.
plt.imshow(squares_ten_by_ten);

_images/f59de3b41a018accdd230c34e8570ba26c4b3afb325ab5afa567d3a9a52de9b5.png

Well, that is certainly more artistic than the original!

We now have many more unique values in the output array than there were in the input array (the input array contained only 0’s and 1’s), because skimage is interpolating for many new pixels. Interpolation is the process of estimating values for the new pixels which fall in between the pixels of the original image, based on a weighted average of the values of the original pixels to which they are nearest.

# Show the `unique` values.
np.unique(squares_ten_by_ten)

array([0.  , 0.2 , 0.2 , 0.2 , 0.32, 0.32, 0.32, 0.32, 0.4 , 0.4 , 0.4 ,
       0.44, 0.44, 0.48, 0.48, 0.52, 0.52, 0.56, 0.56, 0.56, 0.6 , 0.6 ,
       0.6 , 0.68, 0.68, 0.68, 0.68, 0.8 , 0.8 , 0.8 , 1.  ])
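To make the idea of a weighted average concrete, here is a one-dimensional sketch using np.interp(), which performs plain linear interpolation between known values. This illustrates the principle only: ski.transform.resize() works in two dimensions and treats the image edges in its own way, so the numbers below are not meant to reproduce the array above. All variable names are assumptions for illustration.

```python
import numpy as np

# Two known pixel values: white (1) at position 0, black (0) at position 1.
known_positions = np.array([0., 1.])
known_values = np.array([1., 0.])

# Estimate values at new positions that fall in between the known pixels.
new_positions = np.array([0.25, 0.5, 0.75])
estimates = np.interp(new_positions, known_positions, known_values)

print(estimates)
```

A new pixel a quarter of the way from the white pixel to the black pixel gets the value 0.75 * 1 + 0.25 * 0 = 0.75: the nearer original pixel receives the larger weight.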

The array pixels highlighted in red are the original pixels from the (2, 2) original array:

All the other pixels have been added by skimage during the resize-ing process. Pixels closer to the original pixels share closer intensity values to the original pixel (meaning they are more black or more white, depending on the original pixel). Pixels further from the original pixels become more gray.

We can control the type of interpolation that skimage uses by changing the (somewhat cryptically named) order argument. Setting order=0 will activate nearest neighbor interpolation. This method of interpolation (estimation) merely uses the nearest existing pixel to give the value for any new pixel in the output image.

# Pass our `squares` array to the `ski.transform.resize()` function.
squares_ten_by_ten = ski.transform.resize(squares,
                                          output_shape=(10, 10),
                                          preserve_range=True,
                                          order=0) # Nearest neighbor

# Show the resized array.
plt.imshow(squares_ten_by_ten);

_images/b5c2bb53083ac8e767a9af251102588858fc42ed2ba1975298e2d2439d264a17.png

This seems much closer to what we want when we resize the image. However, the results of image processing are highly context-dependent, and there may be images for which the default interpolation setting works better…
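When the scale factor is a whole number, the idea behind nearest neighbor enlargement can be sketched with np.repeat() alone. This is an illustration of the principle, not a claim about how skimage implements order=0 internally.

```python
import numpy as np

squares = np.array([[1., 0.],
                    [0., 1.]])

# Enlarge by a factor of 5 along each axis: every new pixel simply copies
# the value of the original pixel it came from.
squares_big = np.repeat(np.repeat(squares, 5, axis=0), 5, axis=1)

print(squares_big.shape)
print(np.unique(squares_big))
```

No new intensity values are created, so the enlarged image still contains only 0’s and 1’s.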

Rotation#

Another common image manipulation we may want to do is to rotate an image.

Rotations in 90 degree increments#

Should we only want to rotate by increments of 90 degrees, we can use the helpfully named np.rot90() function:

# Rotate the image.
rotated_i = np.rot90(i_img)
rotated_i

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

plt.imshow(rotated_i);

_images/19d9b78b04a9a44c84a6d7e2cc6901ec92132457f2ad81f200fc24d78199fe8e.png

We can control the number of rotations with the k argument:

# Rotate the image, twice!
rotated_i_180 = np.rot90(i_img,
                         k=2) # Two 90 degree rotations.
rotated_i_180

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

plt.imshow(rotated_i_180);

Rotating in increments of 90 degrees will not change the size (i.e. the number of pixels) of the array; however, it will change the integer index location of the pixel values:

# Show the shape of the original image and both 90 degree rotated images.
plt.subplot(1, 3, 1)
plt.title(f"`.shape` = {i_img.shape}")
plt.imshow(i_img)
plt.subplot(1, 3, 2)
plt.title(f"`.shape` = {rotated_i.shape}")
plt.imshow(rotated_i)
plt.subplot(1, 3, 3)
plt.title(f"`.shape` = {rotated_i_180.shape}")
plt.imshow(rotated_i_180);

_images/3f6d04921436781726746514f2f9d0a7a3db36776e9838e361b2e2a85ae68f8c.png

# Original image and 90 degree rotations all have the same number of elements (15*8 = 120)
i_img.size == rotated_i.size == rotated_i_180.size

True
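The k argument also accepts negative values. The sketch below uses a small hypothetical array (`small`) to show the direction of rotation, and that four quarter turns return the original array.

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

# One anticlockwise quarter turn swaps the number of rows and columns.
quarter_turn = np.rot90(small)

# A negative `k` rotates clockwise instead.
clockwise_turn = np.rot90(small, k=-1)

# Four quarter turns bring us back to where we started.
full_turn = np.rot90(small, k=4)

print(quarter_turn)
print(clockwise_turn)
print(np.array_equal(full_turn, small))
```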

Rotations by arbitrary angles with SciPy#

To rotate an image by more flexible increments than 90 degrees, we need to bring in SciPy, another foundation library for Scikit-image. The SciPy function ndimage.rotate() offers more flexible rotation. However, rotating by other angles will alter both the shape and size of the output image:

# Import SciPy using the conventional name (`sp`).
import scipy as sp

# Rotate the image by 193 degrees.
rotated_i_193 = sp.ndimage.rotate(i_img,
                                  angle=193) # Specify the rotation angle.

# Show the "raw" array.
rotated_i_193

array([[ 0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,
         0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  9.19e-02,  3.28e-01,  1.46e-02,
        -9.30e-03,  1.14e-03,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  1.19e-02, -7.40e-02,  8.85e-01,  1.30e+00,  2.34e-01,
        -5.79e-02,  9.74e-03,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  2.30e-02, -9.94e-02,  5.32e-01,  1.16e+00,  4.46e-01,
        -9.18e-02,  1.97e-02,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  1.33e-02, -7.05e-02,  2.94e-01,  1.18e+00,  7.22e-01,
        -9.12e-02,  2.01e-02,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  3.35e-03, -2.49e-02,  7.87e-02,  1.06e+00,  9.45e-01,
        -2.01e-02,  1.59e-03,  3.04e-05,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  1.05e-02, -5.31e-02,  8.77e-01,  1.11e+00,
         1.43e-01, -4.04e-02,  6.37e-03,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  2.16e-02, -9.77e-02,  6.31e-01,  1.19e+00,
         3.73e-01, -8.34e-02,  1.70e-02,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  1.69e-02, -8.35e-02,  3.76e-01,  1.20e+00,
         6.38e-01, -9.80e-02,  2.14e-02,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  6.43e-03, -4.04e-02,  1.40e-01,  1.07e+00,
         8.46e-01, -5.22e-02,  1.08e-02,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  2.94e-05,  1.59e-03, -2.13e-02,  1.05e+00,
         1.21e+00,  8.46e-02, -2.23e-02,  2.08e-03,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  2.07e-02, -7.40e-02,  3.77e-01,
         2.86e-01,  1.34e-02, -6.91e-04,  2.08e-03,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  2.41e-04, -3.74e-03,  1.05e-01,
         6.11e-01,  4.42e-01, -1.04e-01,  2.57e-02,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  8.91e-03, -6.26e-02,  2.63e-01,
         1.43e+00,  9.24e-01, -7.36e-02,  1.14e-02,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  1.17e-03, -9.12e-03,  1.40e-02,
         3.06e-01,  8.45e-02,  0.00e+00,  0.00e+00,  0.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,
         0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00,  0.00e+00]])

# Render the image graphically.
plt.imshow(rotated_i_193);

_images/de3e5f8637aec11dcc177d81ca1ecc1b4807d581f026f3f9e4c0f9755df82e98.png

# Show the attributes of the rotated image.
show_attributes(rotated_i_193)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (16, 11)
Max Pixel Value: 1.43
Min Pixel Value: -0.1
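The maximum and minimum values above fall outside the 0 to 1 range of the input. This comes from the default cubic spline interpolation (order=3) used by ndimage.rotate(), which can overshoot near sharp edges. As a sketch, the cell below rotates a small hypothetical image (`bar`) with the default settings and with nearest neighbor interpolation (order=0), which only copies existing values.

```python
import numpy as np
from scipy import ndimage

# A small, hypothetical image: a white bar on a black background.
bar = np.zeros((9, 9))
bar[2:7, 4] = 1

# Default settings: cubic spline interpolation (order=3).
rot_default = ndimage.rotate(bar, angle=30)

# Nearest neighbor interpolation (order=0) only copies existing values.
rot_nearest = ndimage.rotate(bar, angle=30, order=0)

print(rot_default.min(), rot_default.max())
print(np.unique(rot_nearest))
```

If later processing expects values between 0 and 1, a lower order (or clipping the result with np.clip()) is one way to keep the output in range.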

The cell below loops through some different rotation angles. The shape of each image is shown below each plot:

# A for loop to show multiple rotations, and the effect on
# the shape of the resultant image array.
plt.figure(figsize=(12, 4))
for i, i_2 in enumerate(np.arange(361, step=45)):
    plt.subplot(1, 9, i+1)
    current_rot = sp.ndimage.rotate(i_img, 
                                    angle=i_2)
    plt.imshow(current_rot)
    plt.title(f"{i_2}°")
    plt.xlabel(f"{current_rot.shape}")
    plt.xticks([])
    plt.yticks([])

_images/fdfa5c78f0619ed0b28ffc66329546631ca2c93d82615786d078be9b4a1207c9.png

By default, the shape is altered so that the whole rotated original array is shown within the output array. SciPy uses interpolation (a cubic spline, by default) to estimate the values of the rotated pixels, and fills any output pixels that fall outside the original image with a constant value (0 by default).

We can disable this behaviour by setting reshape=False. However, this means that we will clip any parts of the array that have been rotated out of the field of view of the original array shape.

# A for loop to show multiple rotations, and the effect on
# the shape of the resultant image array, but this time
# we do not allow SciPy to reshape the output arrays.
plt.figure(figsize=(12, 4))
for i, i_2 in enumerate(np.arange(361, step=45)):
    plt.subplot(1, 9, i+1)
    current_rot = sp.ndimage.rotate(i_img,
                                    angle=i_2,
                                    reshape=False) # Don't reshape output.
    plt.imshow(current_rot)
    plt.title(f"{i_2}°")
    plt.xlabel(f"{current_rot.shape}")
    plt.xticks([])
    plt.yticks([])

_images/c2db327cff9c8b44a106d8cd0da85c14dc3ff00d94738258441ca8639ce19bf3.png
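The effect of reshape on the output shape can also be checked directly, without plotting. This is a minimal sketch on a hypothetical array (`small`) with the same shape as i_img.

```python
import numpy as np
from scipy import ndimage

# A hypothetical image with the same shape as `i_img`.
small = np.zeros((15, 8))
small[1:14, 3:5] = 1

# Default (reshape=True): the output grows to hold the whole rotated image.
rot_reshaped = ndimage.rotate(small, angle=45)

# reshape=False: the output keeps the input shape, so corners are clipped.
rot_clipped = ndimage.rotate(small, angle=45, reshape=False)

print(rot_reshaped.shape)
print(rot_clipped.shape)
```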

Rotating with `skimage`#

Let’s now look at how rotating image arrays is handled in skimage. Image rotation, which we saw above using np.rot90() and scipy.ndimage.rotate(), can be achieved using the straightforwardly named ski.transform.rotate(). The syntax is very similar to that of scipy.ndimage.rotate(), although the argument that controls the output shape is called resize rather than reshape, and it defaults to False. All this rotating has left us thirsty and caffeine-deprived, so let’s get some coffee:

# Import and show an image.
coffee = ski.data.coffee()
plt.imshow(coffee);

_images/97fe6e20ae416a866bb7ec14a2e5b284f2a37fea6a05fbac65c3646d0ddd8270.png

We can achieve easy and flexible rotation with ski.transform.rotate():

# Rotate the `coffee` image with `skimage`.
# resize=True ensures all the original image fits inside the output.
rotated_coffee = ski.transform.rotate(coffee,
                                      angle=75,
                                      resize=True)

plt.imshow(rotated_coffee);

_images/909af6b8f1420db6e08295f4a0d49fe47fc56fc2ac257555339ed188f7ae74e0.png

The cell below plots a variety of rotations, performing each one with skimage.transform.rotate(), this time without resizing the output to fit the rotated input.

# Many rotations...
plt.figure(figsize=(16, 10))
for i, i_2 in enumerate(np.arange(361, step=45)):
    plt.subplot(3, 3, i+1)
    current_rot = ski.transform.rotate(coffee,
                                       angle=i_2,
                                       resize=False)
    plt.imshow(current_rot)
    plt.title(f"{i_2}°")
    plt.xticks([])
    plt.yticks([])

_images/b295c0df6206cd0d1707bb5343ebdcb2fe9c586d1a1bdde8ac30f68bd41fae04.png

Rotation compared to flips and transposes#

Rotating is a different operation than flipping the image with np.flip(). Flipping along a single axis reflects the image, giving a mirror image. The difference between rotation and a flip becomes obvious with an image that is not left-right symmetrical. Note that calling np.flip() without an axis argument flips along every axis, which, for a 2D array, gives the same result as a 180 degree rotation. We therefore specify axis=1 here, to get a left-right mirror image.

The cell below demonstrates np.flip-ping an image left-to-right, as well as np.rot90-ating an image by 180 degrees:

# Load in `camera`
camera = ski.data.camera()

# Rotate, flip 'n' plot!
plt.figure(figsize=(14, 4))
plt.subplot(1, 3, 1)
plt.imshow(camera)
plt.title('Original')
plt.subplot(1, 3, 2)
plt.imshow(np.rot90(camera, k=2))
plt.title('np.rot90(k=2)')
plt.subplot(1, 3, 3)
# Flip along the columns only (left-to-right mirror image).
plt.imshow(np.flip(camera, axis=1))
plt.title('np.flip(axis=1)');

_images/eca27dd0ebcff97883ac79b3f3eeced5d3f9d29f54615c2aa01f60cbafaf501a.png
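The relationship between flips and a 180 degree rotation can be checked on a small array. In this sketch `small` is a hypothetical array with no symmetry, so each result is distinct.

```python
import numpy as np

# A small, hypothetical array with no symmetry, so each result is distinct.
small = np.array([[1, 2, 3],
                  [4, 5, 6]])

rotated_180 = np.rot90(small, k=2)
flipped_all = np.flip(small)             # Reverses rows and columns.
flipped_lr = np.flip(small, axis=1)      # Reverses the columns only.

# Flipping every axis of a 2D array matches a 180 degree rotation...
print(np.array_equal(flipped_all, rotated_180))

# ...but a flip along a single axis is a mirror image, not a rotation.
print(np.array_equal(flipped_lr, rotated_180))
```

So a mirror image needs a flip along exactly one axis; flipping both axes of a 2D image is a rotation in disguise.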

Similarly, rotations differ from transpose operations on the array.

Specifically for 90 degree rotations, you might be tempted to use a NumPy shortcut, the .T (transpose) attribute. This, however, does something different from a rotation. The cell below demonstrates .T with the camera image:

# Transpose `camera`.
camera_transposed = camera.T
show_attributes(camera_transposed)
plt.imshow(camera_transposed)
plt.title("camera.T");

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 255
Min Pixel Value: 0

_images/b37afc8a257d6d001851fc9fdfd890e388f15a70509f111756a988fa5931a358.png

We now show a 90 degree rotation, using ski.transform.rotate():

# Rotate by 90 degrees.
plt.imshow(ski.transform.rotate(camera, 
                                angle=90))
plt.title('ski.transform.rotate()');

_images/c81025159ea6474d497721608e43edf5e4dfa324f0571b40dc573b4a6cdd3b4b.png

We can see that the cameraman is facing a different direction in each image (taking a photo of the bottom of the image for the .T method, and taking a photo of the top of the image for a 90 degree rotation via skimage).

The difference here is that transposing an image switches the rows and columns, such that the first row becomes the first column etc. Conversely, skimage.transform.rotate() pivots the pixels around a central point. Essentially, transposing gives a mirroring effect which is different from a rotation.

Pay attention to the location of the spoon in the coffee image. First, we show it transposed, switching the rows and columns. Then, we ski.transform.rotate() it by 90 degrees, for comparison. As coffee is a 3D array, the .T method reverses all three axes, which moves the color channels into the first dimension and gives an array that cannot be shown as a color image. To avoid this, we use the .transpose method to switch the rows and columns, whilst keeping the color channels in the third dimension:

# The `shape` of the original `coffee` image.
coffee.shape

(400, 600, 3)

# Why we cannot use the `.T` method. We get an array which
# is the wrong `shape` for a color image!
coffee.T.shape

(3, 600, 400)

# Move the columns into the rows, the rows into the columns, and leave the
# color channels in the third dimension.
coffee_transposed = coffee.transpose((1, 0, 2))
plt.imshow(coffee_transposed)

# Show the attributes (note that the `shape` is still correct for a color
# image).
show_attributes(coffee_transposed)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (600, 400, 3)
Max Pixel Value: 255
Min Pixel Value: 0

_images/e092896a32985974d8bfe34dddd4251f5a45d69285dddc58178c10b94f948bea.png
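To make the axis bookkeeping concrete, here is a minimal sketch on a made-up array standing in for a color image (2 rows, 3 columns, 3 channels). It also shows np.swapaxes, which is another way to write the same row/column swap:

```python
import numpy as np

# Hypothetical stand-in for a color image: 2 rows, 3 columns, 3 channels.
rgb = np.arange(2 * 3 * 3).reshape((2, 3, 3))

# `.T` reverses all the axes: the channels end up first.
print(rgb.T.shape)

# Swap only the rows and the columns; channels stay last.
swapped_transpose = rgb.transpose((1, 0, 2))
swapped_swapaxes = np.swapaxes(rgb, 0, 1)
print(swapped_transpose.shape)

# Both approaches give the same array, and each pixel keeps its
# three channel values together.
print(np.array_equal(swapped_transpose, swapped_swapaxes))
print(np.array_equal(swapped_transpose[2, 1], rgb[1, 2]))
```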

Compare this to a 90 degree rotation via skimage; pay attention to the spoon!

# Show the difference between rotating and transposing.
plt.subplot(1, 2, 1)
plt.imshow(ski.transform.rotate(coffee, angle=90, resize=True))
plt.title("ski.transform.rotate()")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(coffee_transposed)
plt.title(".transpose()")
plt.axis('off');

_images/3f1caf516bedb2502b0d5360fbbac189ca3cb8009774db1a991c2ab8f1c8b592.png

Unless you specifically want a mirroring transformation, use ski.transform.rotate()!

Exercise 13

Your mission now is to transform camera into this slightly brain-bending image:

For comparison, here is the original camera image and its attributes:

camera = ski.data.camera()
plt.imshow(camera);

_images/35f9dfe13bfb22923a0ffb4e30a0a71877c6b8ffb9cf0556ebe377899835cbf3.png

show_attributes(camera)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 255
Min Pixel Value: 0

Your final image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 1024)
Max Pixel Value: 255
Min Pixel Value: 0

Hint: if you did not complete the earlier exercise involving combining two i_img arrays, then you may want to investigate NumPy functions for combining arrays together to complete the current exercise…

Caution: you may run into some errors/odd outcomes because of dtypes here… so use the ski.util conversion functions if you need to…

# YOUR CODE HERE
look_at_me = ski.data.camera()

Solution to Exercise 13

Here we need a transpose operation, followed by a rotation, followed by a concatenation.

That is a mouthful, so let’s break it down. First we transpose the camera array. We can use .T here rather than .transpose, as we are dealing with a 2D grayscale image:

# Transpose `camera`.
camera = ski.data.camera()
mirror_camera = camera.T
plt.imshow(mirror_camera);

_images/a312af2ed7b8df0254fb872f41b139923a568f305478e438ea3bcf2dd4cd594a.png

Now the image is transposed, we have achieved our “mirror” effect. To get this mirrored cameraman to look the other guy in the eye, we now need to rotate the image 270 degrees, for which we use ski.transform.rotate():

# Rotate and show.
rotated_mirror_camera = ski.transform.rotate(mirror_camera,
                                             angle=270,
                                             resize=True)  # Fit the whole rotated image.
plt.imshow(rotated_mirror_camera);

_images/2ab979f7a4113c3bbdf5667a0609830c8e63ab4173326bd53040963c70f48477.png

Now, we must combine the images using np.concat(), joining along the columns (axis=1). First, however, we need to ensure the dtypes match! If we don’t, we get an unexpected result:

# Combine the images, non-matching `dtype`s...
look_at_me = np.concat([camera, rotated_mirror_camera], axis=1)
plt.imshow(look_at_me);

_images/cebd976eee86ca6c8ce8e6934ec15ffbef529cad220047e876a92c88967cb578.png

Oops. This has occurred because ski.transform.rotate() has converted the image to the float64 dtype, whereas the original is uint8:

# The attributes of the original image (look at the `dtype`).
show_attributes(camera)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 255
Min Pixel Value: 0

# The attributes of the rotated mirror image (look at the `dtype`).
show_attributes(rotated_mirror_camera)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (512, 512)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
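The sketch below reproduces the problem in miniature, using made-up one-row arrays. It uses np.concatenate, of which np.concat is a newer alias. When a uint8 array and a float64 array are joined, NumPy promotes the result to float64, but the values themselves are not rescaled:

```python
import numpy as np

# A hypothetical one-row "image" in `uint8` (values 0 through 255).
as_uint8 = np.array([[0, 128, 255]], dtype=np.uint8)
# The same pixels as `float64` (values 0 through 1).
as_float = as_uint8 / 255

# Join along the columns.
combined = np.concatenate([as_uint8, as_float], axis=1)

print(combined.dtype)
print(combined.min(), combined.max())
```

The combined array spans 0 through 255, so the half that came from the float image (at most 1.0) appears almost entirely black when displayed.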

Using ski.util.img_as_ubyte() converts the rotated image back to uint8, so that the dtypes (and the pixel value ranges) match, giving us our final, surreal result:

# Ensure matching `dtype`s.
rotated_mirror_camera = ski.util.img_as_ubyte(rotated_mirror_camera)

# Combine the images and show the result.
look_at_me = np.concat([camera, rotated_mirror_camera], axis=1)
plt.imshow(look_at_me);

_images/9e24e2a902819adf6f68ee36efb7d589a3dd9612d87549c17ba336017d693fa6.png

If in doubt, check the dtype!
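As a NumPy-only cross-check of the solution above, here is a sketch on a small made-up array. It assumes that np.rot90 with k=3 corresponds to the 270 degree rotation used above (both rotate counter-clockwise). Under that assumption, a transpose followed by a 270 degree rotation is the same as a left-right flip, and because no skimage function is involved, the dtype stays as uint8:

```python
import numpy as np

# A small, hypothetical `uint8` array.
arr = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.uint8)

# Transpose, then rotate by 270 degrees (three 90 degree turns).
rot_of_transpose = np.rot90(arr.T, k=3)
# A plain left-right flip.
mirrored = np.fliplr(arr)

print(np.array_equal(rot_of_transpose, mirrored))

# Concatenating keeps the `uint8` dtype, as no conversion took place.
side_by_side = np.concatenate([arr, mirrored], axis=1)
print(side_by_side.dtype, side_by_side.shape)
```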

Exercise 14

For this exercise, you should load in the cat image from ski.data. Here is the original cat image:

cat = ski.data.cat()
plt.imshow(cat);

_images/4ee7477b739aa9fda7066966d5336ad7e5db317659e8f8e26fc7d7318b27b680.png

The original image has the following attributes:

show_attributes(cat)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (300, 451, 3)
Max Pixel Value: 231
Min Pixel Value: 0

Now, using only numpy and skimage, try to recreate this target image:

…poor cat!

Your output image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 30, 3)
Max Pixel Value: 0.76
Min Pixel Value: 0.0

Hint: Note the yellow tinge to the image. You can achieve this tinge by modifying the weights of the colors in the image.

# YOUR CODE HERE

Solution to Exercise 14

The solution here again involves multiple steps. First we load in the cat image:

# Load in and show the `cat` image.
cat_solution = ski.data.cat()
show_attributes(cat_solution)
plt.imshow(cat_solution);

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (300, 451, 3)
Max Pixel Value: 231
Min Pixel Value: 0

Next, to replicate the yellow-y tone of the target image, we “switch off” the blue color channel (because red + green = yellow):

# Make it yellow!
cat_solution[:, :, 2] = 0
plt.imshow(cat_solution);

_images/08e8e5dd7c0f42b2197dd3c364efb4f5c7bfef36fc7dd37628114738eeb39988.png

The target image has the .shape (30, 30, 3); the original image has the .shape (300, 451, 3).

To get the correct output image size, we must downscale the image. To do this, we can use ski.transform.resize():

# Downscale the image.
cat_solution = ski.transform.resize(cat_solution,
                                    output_shape=(30, 30, 3))
# Show the result.
show_attributes(cat_solution)
plt.imshow(cat_solution);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 30, 3)
Max Pixel Value: 0.8
Min Pixel Value: 0.0

_images/0c810295b59256c67247892171b3accec01450632a09ae8127ffa82cdb398a3b.png

You’ll notice that downsizing the image has seriously downgraded the quality. This makes sense, because the original high-resolution image utilized its many pixels to depict intricate details. After resizing to a lower-resolution, this detail is lost, because now we are trying to show the same image using substantially fewer pixels. skimage has done some averaging behind the scenes, to work out what should go in the remaining pixels. At least we can still see the image is of a cat…
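To get a feel for what "averaging" means here, the sketch below downscales a made-up 4-by-4 array by taking the mean of each 2-by-2 block. This is only an illustration of the idea in plain NumPy; it is not a description of how ski.transform.resize() is implemented, which uses interpolation:

```python
import numpy as np

# A hypothetical 4x4 grayscale image.
img = np.arange(16, dtype=float).reshape((4, 4))

# Split the rows and columns into 2x2 blocks. The new axes are
# (block row, row within block, block column, column within block).
blocks = img.reshape((2, 2, 2, 2))

# Average over the "within block" axes, leaving one value per block.
downscaled = blocks.mean(axis=(1, 3))
print(downscaled)
```

Each output pixel summarizes four input pixels, so any detail finer than a block is lost.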

Now, we can see that the target image requires a rotation. Through good judgment, or just trial and error, we can work out that this needs to be 45 degrees. We can also see that the image of the downscaled cat does not “fit” inside the final image. As such, we need to rotate without resizing the image. By default, resize=False, so we can leave it on the default setting:

# Rotate and show the result.
cat_solution = ski.transform.rotate(cat_solution,
                                    angle=45)
show_attributes(cat_solution)
plt.imshow(cat_solution);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (30, 30, 3)
Max Pixel Value: 0.76
Min Pixel Value: 0.0

_images/d736118e2b0467a26d5167a0c8eb1d58c57c60beb92e3ab5ccdbf70ec1dd90fd.png

We will leave it to you to use this image as a logo for your “Pixel Cat” meme coin.

Cropping#

The process of cropping is the removal of areas of pixels from an image.

Because our images are just NumPy arrays, cropping is just NumPy indexing (duh!). As such, we can crop images just with indexing operations, without using specific NumPy (or skimage) functions.

For instance, we can “shave” our i_img array in half by slicing along the columns, keeping only the right-hand four columns:

# Cut in half.
half_i = i_img[:, 4:8]
half_i

array([[0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 0., 0., 0.]])

plt.imshow(half_i);

_images/d49cf5fc6278b914715513a6ba4e3407bb4b73221ae93baf52d055534f43b0e1.png

Likewise along the rows (albeit the number of rows is odd!):

plt.imshow(i_img[0:8, :]);

_images/53e4ef5e55ffd250477e11cfac8c56455e60026b5dd8764ec05cf70ca3103380.png
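One thing to keep in mind when cropping with slices is that NumPy slicing returns a view onto the original array, not a new array. The sketch below, on a made-up array of zeros, shows that changing a cropped view also changes the original, and that calling .copy() avoids this:

```python
import numpy as np

# A hypothetical 4x6 image of zeros.
img = np.zeros((4, 6))

# Crop the right-hand half: once as a view, once as a copy.
crop_view = img[:, 3:6]
crop_copy = img[:, 3:6].copy()

crop_view[0, 0] = 1  # Also changes `img`.
crop_copy[1, 0] = 1  # Does not change `img`.

print(img[0, 3], img[1, 3])
print(np.shares_memory(img, crop_view), np.shares_memory(img, crop_copy))
```

If you plan to modify a cropped image and want to keep the original intact, take a copy first.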

Exercise 15

Take the original camera image:

camera = ski.data.camera()
plt.imshow(camera);

…and crop it down to this target image, using only NumPy indexing:

These are the attributes that your final image should possess:

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (30, 80)
Max Pixel Value: 250
Min Pixel Value: 23

Hint: using plt.grid() might be of use in identifying the part of the image you need.

# YOUR CODE HERE
camera_crop_exercise = camera.copy()
plt.imshow(camera);

Solution to Exercise 15

This is a simple task once you know what indices you need. Using plt.grid() can make this information easier to obtain:

# Show `camera`, with a helpful grid.
plt.imshow(camera)
plt.grid();

_images/24d8a74b83195f651da0c97b76a0e53f3df346ca709308f6365c3dc858d2f0b5.png

The exact indices we used are here:

# Solution.
plt.imshow(camera[130:160, 240:320]);

_images/8a1b9f3b223b1ddea3392ea77c0e7d61f519ec7c253d44e2a45a03e5c26e17a9.png

# Solution (attributes).
show_attributes(camera[130:160, 240:320])

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (30, 80)
Max Pixel Value: 250
Min Pixel Value: 23

Masks#

In image processing, a mask is an array where the elements express weights or binary (0 or 1) values to select areas in another image (array).

A mask is “placed” on an image, and one typically then applies operations to the pixels indicated by the mask. For example, one might use the mask with an image array to replace pixels indicated by the mask with a specific value. Let’s demonstrate with a real image — a grayscale version of the standard coffee image:

# Make coffee RGB image into single-channel image.
coffee_gray = ski.color.rgb2gray(ski.data.coffee())
show_attributes(coffee_gray)
plt.imshow(coffee_gray);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (400, 600)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

_images/d82d52811692c04712ff3dd4c3535ca3c131a987f9aaed636b100e92840c3bfd.png

In this case we are going to create a new mask image, that corresponds to the coffee_gray image (has the same shape), but where we will create the mask values with a mathematical formula using the row and column indices. In fact we’ll do this to create a circular mask. Bear with us, all should become clear as we go.

# Unpack and store the number of rows and number of columns.
dim_0, dim_1 = coffee_gray.shape
dim_0, dim_1

(400, 600)

We are going to create a circular mask, using the formula for a circle.

That formula needs the i (row) and j (column) coordinates for each pixel, the center coordinate of the circle, and the radius r of the desired circle.

Call the row and column center coordinates \(c_i, c_j\) respectively.

We can tell if a particular pixel at (i, j) is outside the circle by testing whether the Euclidean distance of the pixel position (i, j) from the center is greater than \(r\).

\[ \sqrt{(i - c_i)^2 + (j - c_j)^2} > r \]

First we use the NumPy meshgrid function to return two arrays, one containing all the i (row) coordinates at each pixel, and another containing all the j (column) coordinates at each pixel.

# indexing='ij' tells meshgrid to return `i` and `j` coordinates.  There are other modes, not relevant here.
i_coords, j_coords = np.meshgrid(np.arange(dim_0), np.arange(dim_1), indexing='ij')
# i coordinate for each element.
i_coords

array([[  0,   0,   0, ...,   0,   0,   0],
       [  1,   1,   1, ...,   1,   1,   1],
       [  2,   2,   2, ...,   2,   2,   2],
       ...,
       [397, 397, 397, ..., 397, 397, 397],
       [398, 398, 398, ..., 398, 398, 398],
       [399, 399, 399, ..., 399, 399, 399]], shape=(400, 600))

# j coordinate for each element.
j_coords

array([[  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599],
       ...,
       [  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599],
       [  0,   1,   2, ..., 597, 598, 599]], shape=(400, 600))

# The coordinate of the image center in pixels.  Remember that pixel indices
# start at 0.
c_i, c_j = (dim_0 - 1) / 2, (dim_1 - 1) / 2
# Radius
r = 275

We can then use the formula above to generate a 2D array where True (== 1) means outside the circle and False (== 0) means inside the circle.

# Create a circular mask.
mask = np.sqrt((i_coords - c_i) ** 2 + (j_coords - c_j) ** 2) > r
show_attributes(mask)
plt.imshow(mask);

Type: <class 'numpy.ndarray'>
dtype: bool
Shape: (400, 600)
Max Pixel Value: True
Min Pixel Value: False

_images/c4a9148ba2313342c28dca009ce893a938f0fc0681e4ef2e74a754e3d1f87436.png

Once we have our mask - which is just a Boolean array - it is just a matter of Boolean indexing to set all the pixel values in the image to the same value, where there is a True in the corresponding element in the mask:

# Apply the mask.
coffee_gray_masked = coffee_gray.copy()
coffee_gray_masked[mask] = 0

plt.matshow(coffee_gray_masked);

_images/30aecc7b34cc5409048f2e563d4e56f26945cc833cd94251a78d05a77fa6e915.png
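As an aside, the full coordinate arrays from meshgrid are not strictly needed. The sketch below, on a small made-up image shape, builds the same kind of circular mask using np.ogrid, which returns a column of row coordinates and a row of column coordinates, and then relies on broadcasting to fill in the rest. The shape and radius here are arbitrary choices for illustration:

```python
import numpy as np

# Hypothetical small image shape, center and radius.
n_rows, n_cols = 5, 7
c_i, c_j = (n_rows - 1) / 2, (n_cols - 1) / 2
r = 2

# Full coordinate arrays, as above.
i_coords, j_coords = np.meshgrid(np.arange(n_rows), np.arange(n_cols),
                                 indexing='ij')
mask_meshgrid = np.sqrt((i_coords - c_i) ** 2 + (j_coords - c_j) ** 2) > r

# "Open" coordinate arrays of shape (5, 1) and (1, 7); broadcasting
# expands them to (5, 7) during the calculation.
i_open, j_open = np.ogrid[:n_rows, :n_cols]
mask_ogrid = np.sqrt((i_open - c_i) ** 2 + (j_open - c_j) ** 2) > r

print(i_open.shape, j_open.shape)
print(np.array_equal(mask_meshgrid, mask_ogrid))
```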

Exercise 16

Start with the camera image:

camera = ski.data.camera()
plt.imshow(camera);

Now consider this formula, where \(p\) is some constant:

\[ ((i - c_i)^3 + (j - c_j)^3)^{1/3} > p \]

(where \(x^{1/3}\) is the cube-root of \(x\)).

Use that formula, with some suitable value for p, and masking as above, to create the following image:

Solution to Exercise 16

We apply the formula above to the i and j coordinates of the camera image:

# Solution pt 1.
cam_c_i, cam_c_j = (np.array(camera.shape) - 1) / 2
p = 55  # By experiment.
i_cam, j_cam = np.meshgrid(np.arange(camera.shape[0]),
                           np.arange(camera.shape[1]),
                           indexing='ij')
# Apply formula to make mask.
mask_camera = ((i_cam - cam_c_i) ** 3 + (j_cam - cam_c_j) ** 3) ** (1 / 3) > p
plt.imshow(mask_camera);

/tmp/ipykernel_2783/2746963989.py:8: RuntimeWarning: invalid value encountered in power
  mask_camera = ((i_cam - cam_c_i) ** 3 + (j_cam - cam_c_j) ** 3) ** (1 / 3) > p

_images/123c665ff25db3b701165b4ffb58e9f44a742dbaa44d265ba5b49357045e5f3c.png
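The RuntimeWarning above arises because raising a negative floating point number to the power 1 / 3 gives nan in NumPy. Any comparison with nan is False, so those pixels simply end up outside the mask. The sketch below, on a few made-up values, shows this, and compares it with np.cbrt, which returns a real cube root for negative numbers. For these values the two masks agree, because a negative cube root is also not greater than a positive threshold:

```python
import numpy as np

# Hypothetical values standing in for the summed cubes, and a threshold.
values = np.array([-8.0, 0.0, 8.0, 27.0, 1000.0])
p = 2.5

# Fractional power: negative inputs give `nan` (warning silenced here).
with np.errstate(invalid='ignore'):
    roots_power = values ** (1 / 3)
    mask_power = roots_power > p

# Real cube root: negative inputs give negative outputs.
roots_cbrt = np.cbrt(values)
mask_cbrt = roots_cbrt > p

print(roots_power[0], roots_cbrt[0])
print(np.array_equal(mask_power, mask_cbrt))
```

Using np.cbrt would therefore be one way to avoid the warning, although tiny rounding differences between the two methods could, in principle, change pixels that sit exactly on the threshold.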

# Solution pt 2.
camera_masked_solution = camera.copy()
camera_masked_solution[mask_camera] = 1
plt.imshow(camera_masked_solution);

_images/d9c0f0d9fb1e93ccdd88566f8cbed377629ded1ef6b06cde3d6939945b5be4c3.png

Inverting image colors with Numpy#

We saw color inversion on an earlier page. This is where all the pixel values in an image are, shockingly, inverted: high numbers become low numbers and vice versa:

For a binary image, this involves swapping 1s and 0s…

# Original image
i_img

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

plt.imshow(i_img);

…which can be accomplished with some simple numeric operations:

inverted_i = 1 - i_img
inverted_i

array([[1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 0., 0., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1.]])

plt.imshow(inverted_i);

_images/2d8e811c0a14ecbc1fab91f7354a8bc4fd87a807f43c8dc62f8372118b4eb837.png

What about a color image?

colorwheel = ski.data.colorwheel()
show_attributes(colorwheel)
plt.imshow(colorwheel);

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (370, 371, 3)
Max Pixel Value: 255
Min Pixel Value: 0

_images/a45f666e5776e35adfcfa813430b2f2d3006553ddc5420ec486e87d226653365.png

Because the maximum value is now 255, we can subtract each array pixel value in each color channel from 255 to “reverse” the values:

# Invert the color image, manually.
inverted_colorwheel = colorwheel.copy()
for i in np.arange(3):
    inverted_colorwheel[:, :, i] = 255 - inverted_colorwheel[:, :, i] 

plt.imshow(inverted_colorwheel);

_images/23537d3a744b91511a23625f2f053c610e2d91c008f648dc04efa41f8920764f.png
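The loop over the color channels makes the per-channel logic explicit, but it is not required: subtraction applies to every element of the array at once. A minimal sketch, on a made-up 1-by-2 pixel uint8 color array:

```python
import numpy as np

# A hypothetical 1x2 pixel RGB image.
rgb = np.array([[[0, 128, 255],
                 [10, 20, 30]]], dtype=np.uint8)

# Invert every channel of every pixel in one step.
inverted = 255 - rgb

# The same thing, one channel at a time, as in the loop above.
inverted_loop = rgb.copy()
for i in np.arange(3):
    inverted_loop[:, :, i] = 255 - inverted_loop[:, :, i]

print(inverted)
print(inverted.dtype, np.array_equal(inverted, inverted_loop))
```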

Inverting colors with `skimage`#

ski.util.invert() handles color inversion. We simply pass it a color NumPy image array and voilà!:

# Invert the color image, with `skimage`.
invert_colorwheel_with_skimage = ski.util.invert(colorwheel)
plt.imshow(invert_colorwheel_with_skimage);

Exercise 17

Now over to you. You will be working on the brick image from ski.data:

brick = ski.data.brick()
show_attributes(brick)
plt.imshow(brick);

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 207
Min Pixel Value: 63

_images/0517c8f9ad13e398cd75c28dd662d822fea0b72500e97428433c29c274fc04bc.png

You should then invert every 2nd element on even numbered rows…

So on even numbered rows (row 0, 2, 4, 6, etc.), if you went through the elements in pairs along the row, the second element in each pair should be inverted, vs the original image.

Your final image should look like this:

Your new image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 207
Min Pixel Value: 48 

Use only NumPy indexing and Scikit-image functions to do this…

Hint: remember that smaller NumPy arrays indexed out of larger NumPy arrays are still NumPy arrays, and so can be passed as arguments to most skimage functions.

# YOUR CODE HERE

Solution to Exercise 17

The solution here is to use slicing to select the elements we need, and to apply ski.util.invert() to just those elements.

To get every 2nd element on even numbered rows, we can use brick[0::2, 1::2].

This is easier to appreciate in a smaller array:

small_arr = np.zeros((8, 8))
small_arr

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

# Get every second element, on even numbered rows.
small_arr[0::2, 1::2] = 1
small_arr

array([[0., 1., 0., 1., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

# Show the small array (0 elements are now gray).
plt.imshow(small_arr);

_images/2697ea749e42f87a627f12547ca7809059e02614df4c19e74e11a47d301d4b15.png

We can invert just the pixel values at these locations, by passing the slices to ski.util.invert():

# Solution.
brick_solution = ski.data.brick()

brick_solution[0::2, 1::2] = ski.util.invert(brick_solution[0::2, 1::2])

show_attributes(brick_solution)
plt.imshow(brick_solution);

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 207
Min Pixel Value: 48

_images/33fe090b66fabdc9777fcd49837fc431410d9aff9128553784b6b65877a93e5f.png

Exercise 18

Again using the cat image, try to recreate the following target image, using only numpy and skimage:

Your output image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (300, 451, 3)
Max Pixel Value: 255
Min Pixel Value: 24

Notice the color change from the standard cat image.

# YOUR CODE HERE
cat = ski.data.cat()
plt.imshow(cat);
show_attributes(cat)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (300, 451, 3)
Max Pixel Value: 231
Min Pixel Value: 0

Solution to Exercise 18

The target image cannot be created using just ski.util.invert() on the whole cat image array:

cat_solution_2 = ski.data.cat()
plt.imshow(ski.util.invert(cat_solution_2));

_images/e41e84913b0e023c43fb6e289c55a4b52b724a24a38ec5039d5ffa4f43df6f62.png

This looks cool - if anything cooler than the target image - but it is not what we want.

To recreate the target image, we must “switch” off the green color channel before we invert the colors:

cat_purple = cat_solution_2.copy()
cat_purple[:, :, 1] = 0
plt.imshow(cat_purple);

_images/e62479deb4fbcaff7dd3ed6345766a706e09355e48ba448575e5d5348ba732a8.png

Now, when we invert the purple cat image, we get the target image:

# Solution.
inverted_cat_purple = ski.util.invert(cat_purple)
plt.imshow(inverted_cat_purple);

_images/f24138b0cdd47cb5338b4891429c156c8631b62e59cf4c9d060c8ff6d5be0a74.png

Grayscale to binary conversion#

Grayscale to binary conversion can be achieved using comparison operators (<, >, <=, >=, ==). The resulting Boolean array will always be binary (True or False, 1 or 0).

Let’s demonstrate with a grayscale image:

# Create a grayscale image.
# Make a random number generator, with predictable outputs.
rng = np.random.default_rng(10)
random_check = np.array([[1, 0, 1, 0],
                         [0, 1, 0, 1],
                         [1, 0, 1, 0],
                         [0, 1, 0, 1]], dtype=int)
# Replace ones with random integers.
n_checks = np.count_nonzero(random_check)
random_check[random_check == 1] = rng.integers(3, 12,  # From 3 through 11.
                                               size=n_checks)
plt.matshow(random_check);

_images/3443bf1db602944693def16c48eec84dc7e0cb1f18466125b591d329b9ad9204.png

# Convert to a binary image:
binary_check = random_check > np.median(random_check)
binary_check

array([[ True, False,  True, False],
       [False,  True, False,  True],
       [ True, False,  True, False],
       [False,  True, False,  True]])

plt.matshow(binary_check);

_images/c1818365f2f5e6dfe7ddae7e6cbbba5c1ce017291453bdc561b981468d1a7324.png
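The same pattern works with any threshold. The sketch below uses a made-up grayscale array and the mean as the threshold (an arbitrary choice, for illustration), and then converts the Boolean result to 0s and 1s of an integer dtype, which some later operations may prefer:

```python
import numpy as np

# A hypothetical small grayscale image.
gray = np.array([[0, 3, 0, 9],
                 [7, 0, 5, 0]])

# Use the mean pixel value as the threshold.
threshold = gray.mean()

# Pixels strictly above the threshold become True.
binary = gray > threshold
# Convert True / False to 1 / 0.
binary_uint8 = binary.astype(np.uint8)

print(threshold)
print(binary_uint8)
```

Note that a pixel exactly equal to the threshold (the 3 here) is not selected by >, but would be by >=.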

We can also use skimage to do this work (see the filtering page for more detail). For now, we can use the ski.filters.threshold_minimum() function. This supplies a recommended threshold value that attempts to divide the array pixels into two classes. It does this by smoothing the histogram of pixel values until only two peaks remain, and taking the minimum between those two peaks as the threshold:

# Get a recommended threshold from `skimage`.
threshold = ski.filters.threshold_minimum(random_check)
threshold

np.int64(2)

We can then use this threshold to create a binary array, successfully binarizing our grayscale image:

# Binarize the array, based on the threshold.
binary_check_from_ski = random_check > threshold
show_attributes(binary_check_from_ski)
plt.matshow(binary_check_from_ski);

Type: <class 'numpy.ndarray'>
dtype: bool
Shape: (4, 4)
Max Pixel Value: True
Min Pixel Value: False

Color to grayscale conversion with Numpy#

To downgrade a color image to grayscale we can use a brute force method of taking the mean of the three color channels, to produce a 2D monochrome image array:

gray_wheel = np.mean(colorwheel, axis=2)
plt.imshow(gray_wheel);

_images/b48f40f237e716152b1b6ca7866c7e1183b7cb5a4800bbfc1a262e1fffc306f7.png

A better option, that we encountered on the previous page, is to use the luminance formula:

\[ Y = 0.2126R + 0.7152G + 0.0722B \]

This collapses a 3D color image into a 2D grayscale image via a weighted sum of the channels.

Exercise 19

Here is the original colorwheel image:

colorwheel = ski.data.colorwheel()
plt.imshow(colorwheel);

It has the following attributes:

show_attributes(colorwheel)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (370, 371, 3)
Max Pixel Value: 255
Min Pixel Value: 0

Using the luminance formula, and any other required numpy, scipy or skimage operations, recreate the target image below, starting from the colorwheel array:

Your final image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (370, 371)
Max Pixel Value: 255
Min Pixel Value: 0

Hint: For NumPy practice, try to use matrix multiplication (@), np.dot(), or NumPy broadcasting to apply the luminance formula, rather than slicing.

Hint: you may need to rescale the intensity values to match the target image attributes. You may recall there is a ski.exposure function which can help you do this…

# YOUR CODE HERE

Solution to Exercise 19

The solution here requires some close attention to dtype. Applying the luminance formula with matrix multiplication (@) is pretty straightforward:

# Solution pt 1.

# Luminance formula coefficients
weights = np.array([0.2126, 0.7152, 0.0722])

# Run the numbers...
gray_wheel = colorwheel @ weights

# Show the result.
show_attributes(gray_wheel)
plt.imshow(gray_wheel);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (370, 371)
Max Pixel Value: 255.0
Min Pixel Value: 0.0

_images/cc7da6d230fa7a4390d69d975dd2de27650cebca5350f762a3e579dd885e0098.png

Notice that the matrix multiplication above left us with float values, as the weights in the luminance formula are floating point numbers.
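For comparison, the sketch below applies the same weights to a made-up 1-by-2 pixel image in three equivalent ways: with @, with np.dot(), and with broadcasting followed by a sum over the channel axis. All three give float64 results:

```python
import numpy as np

# Luminance formula coefficients.
weights = np.array([0.2126, 0.7152, 0.0722])

# A hypothetical 1x2 pixel RGB image: one white pixel, one pure red pixel.
pixels = np.array([[[255, 255, 255],
                    [255, 0, 0]]], dtype=np.uint8)

# Three equivalent ways to take the weighted sum over the channels.
gray_matmul = pixels @ weights
gray_dot = np.dot(pixels, weights)
gray_broadcast = (pixels * weights).sum(axis=2)

print(gray_matmul)
print(gray_matmul.dtype, gray_matmul.shape)
```

Because the weights sum to 1, the white pixel keeps (approximately) its full intensity of 255, while the pure red pixel becomes 255 * 0.2126.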

But this gives us min/max pixel intensity values which are not standard for the float64 dtype, and will generate an error when we try to convert to uint8 to match the target image:

# Ouch!
ski.util.img_as_ubyte(gray_wheel)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[115], line 2
      1 # Ouch!
----> 2 ski.util.img_as_ubyte(gray_wheel)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/util/dtype.py:578, in img_as_ubyte(image, force_copy)
    557 def img_as_ubyte(image, force_copy=False):
    558     """Convert an image to 8-bit unsigned integer format.
    559 
    560     Parameters
   (...)    576 
    577     """
--> 578     return _convert(image, np.uint8, force_copy)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/util/dtype.py:314, in _convert(image, dtype, force_copy, uniform)
    311     return image.astype(dtype_out)
    313 if np.min(image) < -1.0 or np.max(image) > 1.0:
--> 314     raise ValueError("Images of type float must be between -1 and 1.")
    315 # floating point -> integer
    316 # use float type that can represent output integer type
    317 computation_type = _dtype_itemsize(
    318     itemsize_out, dtype_in, np.float32, np.float64
    319 )

ValueError: Images of type float must be between -1 and 1.

The ski.exposure.rescale_intensity() function can save us again here. First, we make the values standard for a float64 image.

# Rescale the pixel intensity values into a legal range, for the current `dtype`.
gray_wheel = ski.exposure.rescale_intensity(gray_wheel,
                              out_range=(0, 1))

show_attributes(gray_wheel)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (370, 371)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
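To see what this kind of rescaling does to the numbers, here is a plain NumPy sketch of min-max rescaling on a made-up array. It is meant as an illustration of the arithmetic, under the assumption that the input minimum and maximum are mapped to 0 and 1; it is not a copy of the skimage implementation:

```python
import numpy as np

# A hypothetical float image with values in the 0 through 255 range.
gray = np.array([[0.0, 54.213],
                 [127.5, 255.0]])

# Map the minimum to 0 and the maximum to 1.
rescaled = (gray - gray.min()) / (gray.max() - gray.min())
print(rescaled)

# With values in 0 through 1, scaling up to `uint8` is straightforward.
as_uint8 = np.round(rescaled * 255).astype(np.uint8)
print(as_uint8.dtype, as_uint8.min(), as_uint8.max())
```

Note that this stretches the values to fill the whole output range. If the input did not already span 0 through 255, the relative brightness would be preserved, but the absolute values would change.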

Now, we can painlessly convert to uint8. Finally, to match the target image, we must ski.util.invert() our depressingly dreary colorwheel:

# Convert to `uint8` `dtype`.
gray_wheel = ski.util.img_as_ubyte(gray_wheel)

# Invert the color and show the result.
gray_wheel = ski.util.invert(gray_wheel)
show_attributes(gray_wheel)
plt.imshow(gray_wheel);

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (370, 371)
Max Pixel Value: 255
Min Pixel Value: 0

_images/cb91fc836ffeaec418d3d950b517f3836106992093e43d92d6bd2e78658b0f4d.png

Color to grayscale conversion with `skimage`#

As always, the ski.color module has us covered here with the rgb2gray() function. We simply pass it the color array that we want to convert to grayscale, without the direct need for the luminance formula:

# Convert color to grayscale.
gray_colorwheel_from_ski = ski.color.rgb2gray(colorwheel)
show_attributes(gray_colorwheel_from_ski)
plt.imshow(gray_colorwheel_from_ski);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (370, 371)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

_images/1e80785bd9b842a80dd933939a9c22ab33656fa8602973c0c9d03912f0b630c8.png

Summary#

This page has shown how to implement some fundamental image processing operations with NumPy, SciPy and Scikit-image. The next page will delve into image filtering.

References#

Gulati, J. (2024) NumPy for Image Processing. KDnuggets. Available from: https://www.kdnuggets.com/numpy-for-image-processing

Also see color images page references.