Introducing Scikit-image

Note

This tutorial is adapted from “Image manipulation and processing using NumPy and SciPy” by Emmanuelle Gouillart and Gaël Varoquaux, and “scikit-image: image processing” by Emmanuelle Gouillart. Please see the References section at the end of the page for other sources and resources.

Scikit-image (skimage) is a powerful image processing package which is built in large part on NumPy and SciPy, and uses NumPy arrays as its fundamental image representation. This page will show some foundational aspects of skimage, focusing on its coherence with the NumPy image array concepts we saw on the last two pages.

Images are arrays, arrays are images

As we have seen, we can think of any (at least) two-dimensional array as an image — call this an image array. We know that each element of an image array (an “array pixel”) contains information about the intensity or color of that element. Perhaps obviously, these fundamental principles apply equally to Scikit-image, given that it also represents images as NumPy arrays. The table below outlines how images are represented in skimage - you’ll notice it is identical to the principles we have seen for representing images in NumPy:


Image	`np.ndarray`
Pixels	array values e.g. `img_arr[2, 3]`
Image Array `dtypes`	(`np.uint8`, `np.float64`, many others)

Before exploring how skimage works, let’s start by building some image objects to perform image manipulations on. As normal, we begin by importing some libraries:

# Import libraries.
import numpy as np
import matplotlib.pyplot as plt

# Set the default colormap for this session.
plt.rcParams['image.cmap'] = 'gray'

# Import a custom function to give hints for some exercises, and
# a custom function to quickly report array image attributes.
from skitut import hints, show_attributes

We can use np.array() to create our familiar, exceedingly artistic, image array:

# Create an image array.
squares = np.array([[1, 0,],
                    [0, 1,]],
                   dtype=float)

# Show the array ("raw" output from NumPy)
squares

array([[1., 0.],
       [0., 1.]])

# Display the array as an image with Matplotlib
plt.matshow(squares);

You will recall that this array, by virtue of being 2D and having only one numeric value per array pixel, is a single-channel array. This means that it contains only information about pixel intensity (representable, for example, with gray-level), and gives no information about color.

To include color information, our array must be of at least three dimensions. A three-dimensional color image array typically has three “slices” in the third dimension. Each element of each slice contains a number between 0 and 1 (for float64 data), or between 0 and 255 (for uint8 data). For the standard RGB (Red-Green-Blue) format, the number in each element of each slice in the third dimension tells that array pixel what color to be (e.g. a mix of red, green and blue, across the three slices). Hence these slices, in the third dimension, are called “color channels”.

As we saw previously, we can control the amount of color in the image by manipulating each channel in the third dimension:

# Using `np.stack()` to create a multi-channel array.
red_squares = np.stack([squares, # 1's are only present in the red channel.
                        squares * 0, # The green channel is "switched off".
                        squares * 0], # The blue channel is "switched off".
                        axis=2)
show_attributes(red_squares)
plt.matshow(red_squares);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (2, 2, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

We show this principle again below, for each color channel. The top row of the plot shows the channels for uint8 data with 255 as the maximum pixel intensity value. The second row of the plot shows each channel for float64 data, using 1 as the maximum pixel intensity value:

# Use a custom function to plot the maximum intensity values in the diagonals, for `int` and `float` data,
# using our familiar 1's and 0's array.
from skitut.random_colors import plot_int_float

plot_int_float();

These are the basic aspects of images represented as NumPy arrays. Other ways of representing color exist and are supported by skimage, and indeed, not all 3D arrays representing images contain color information (e.g. the structural brain images we saw earlier). We will look at these concepts in later sections, but skimage can work with these images as easily as the basic colored squares we have just made: all these images are just NumPy arrays and can be handled like any other NumPy array.
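
The plot_int_float() helper used above is a custom function whose code is not shown on this page. As a rough sketch of the idea (not that function’s implementation), the same red squares can be written in both value conventions with plain NumPy. The names `red_float` and `red_uint8` are ours, for illustration only:

```python
import numpy as np

squares = np.array([[1, 0],
                    [0, 1]], dtype=float)

# float64 convention: channel values run from 0 to 1.
red_float = np.stack([squares, squares * 0, squares * 0], axis=2)

# uint8 convention: the same image, with values running from 0 to 255.
# (The values here are exactly 0 or 1, so scaling by 255 is safe.)
red_uint8 = (red_float * 255).astype(np.uint8)

print(red_float.dtype, red_float.max())
print(red_uint8.dtype, red_uint8.max())
```

Both arrays describe the same picture; only the numeric scale of the channel values differs.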

Going `ski`-ing: arrays as arguments

So far so good: we have an image array to experiment with. Let’s import skimage itself. The convention for importing skimage (and the convention you will see in most other people’s code) is to name the import ski:

# Import statement for Scikit-image and the conventional shorthand.
import skimage as ski

Because Scikit-image represents images as NumPy arrays, the majority of Scikit-image functions take NumPy ndarrays as arguments. For instance, the rgb2gray() function from the ski.color module can take a multi-channel array, and convert it to a single channel, grayscale image. We will show this with the red_squares array from above:

# Show the original `red_squares` array
show_attributes(red_squares)
plt.matshow(red_squares);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (2, 2, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

Now, we just pass the red_squares array to ski.color.rgb2gray() as an argument:

# Convert the `red_squares` array to grayscale.
back_to_black = ski.color.rgb2gray(red_squares)
show_attributes(back_to_black)
plt.matshow(back_to_black);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (2, 2)
Max Pixel Value: 0.21
Min Pixel Value: 0.0

Many Scikit-image functions work like this - they take a NumPy array as input, do something to the array pixel values, then return an altered image array. As we mentioned above, under the hood skimage is often using NumPy and SciPy functions to do the image manipulation, something we will explore in more detail on the next page.

You may ask why the maximum pixel value is now 0.21, rather than 1. This is because of the formula that rgb2gray() uses to do the conversion. We will show this in detail in the Colorspaces section later on this page.
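
As a small preview, and assuming the luminance weights quoted in the Colorspaces section further down this page (0.2126 for red, 0.7152 for green, 0.0722 for blue), a weighted sum over the channels of a pure red pixel gives a value close to the maximum we saw above. This is a plain NumPy sketch of that arithmetic, not the internals of rgb2gray():

```python
import numpy as np

# Luminance weights for the red, green and blue channels
# (an assumption here: taken from the formula in the Colorspaces section).
weights = np.array([0.2126, 0.7152, 0.0722])

# A pure red pixel: full red, no green, no blue.
red_pixel = np.array([1.0, 0.0, 0.0])

# Weighted sum across the three channels.
gray_value = np.sum(red_pixel * weights)
print(round(float(gray_value), 2))
```

Only the red weight contributes, so a fully red pixel ends up much darker than a fully white one, whose three weighted channels sum to 1.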

Scikit-image ships with built-in images

Conveniently, skimage ships with example image arrays included. It should come as no surprise that these are also NumPy arrays. We can access all of the built-in images through the ski.data module. Here we use dir() to show all of the available module attributes, most of which are example image arrays:

dir(ski.data)

['astronaut',
 'binary_blobs',
 'brain',
 'brick',
 'camera',
 'cat',
 'cell',
 'cells3d',
 'checkerboard',
 'chelsea',
 'clock',
 'coffee',
 'coins',
 'colorwheel',
 'data_dir',
 'download_all',
 'eagle',
 'file_hash',
 'grass',
 'gravel',
 'horse',
 'hubble_deep_field',
 'human_mitosis',
 'immunohistochemistry',
 'kidney',
 'lbp_frontal_face_cascade_filename',
 'lfw_subset',
 'lily',
 'logo',
 'microaneurysms',
 'moon',
 'nickel_solidification',
 'page',
 'palisades_of_vogt',
 'protein_transport',
 'retina',
 'rocket',
 'shepp_logan_phantom',
 'skin',
 'stereo_motorcycle',
 'text',
 'vortex']

Here we will load the coffee image from ski.data:

# Load in the image.
coffee = ski.data.coffee()

# Show the image array.
coffee

array([[[ 21,  13,   8],
        [ 21,  13,   9],
        [ 20,  11,   8],
        ...,
        [228, 182, 138],
        [231, 185, 142],
        [228, 184, 140]],

       [[ 21,  13,   7],
        [ 21,  13,   9],
        [ 20,  14,   7],
        ...,
        [228, 182, 136],
        [231, 185, 139],
        [229, 183, 137]],

       [[ 21,  14,   7],
        [ 23,  13,  10],
        [ 20,  14,   9],
        ...,
        [228, 182, 136],
        [228, 184, 137],
        [229, 185, 138]],

       ...,

       [[189, 124,  77],
        [214, 155, 109],
        [197, 141, 100],
        ...,
        [165,  86,  37],
        [161,  82,  41],
        [143,  67,  29]],

       [[207, 148, 102],
        [201, 142,  99],
        [196, 140,  97],
        ...,
        [154,  74,  37],
        [147,  66,  33],
        [145,  65,  31]],

       [[197, 141, 100],
        [195, 137,  99],
        [193, 138,  98],
        ...,
        [158,  73,  38],
        [144,  64,  30],
        [143,  60,  29]]], shape=(400, 600, 3), dtype=uint8)

# Inspect the attributes of the image.
show_attributes(coffee)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (400, 600, 3)
Max Pixel Value: 255
Min Pixel Value: 0

# Show the image with Matplotlib.
plt.imshow(coffee);

Because these images are nothing but NumPy arrays, we can use standard array slicing to interact with them. For instance, we can use slicing to ruin the coffee image by placing a huge green square over it:

ruined_coffee = coffee.copy()
ruined_coffee[100:300, 200:400, :] = [0, 255, 0]  # [Red channel, Green channel, Blue channel]
plt.imshow(ruined_coffee);

Remember our maxim: “image processing” is when we do something that analyzes or changes the numbers inside the image array.

When we explore different image processing operations, using smaller arrays often makes it easier to understand what a given processing operation is doing to individual array pixels. However, in larger, more complex arrays, it is often easier to appreciate the “global” visual/perceptual effect of an image processing operation.

As such, we will use a variety of simple arrays (like squares) along with a variety of more complex images from ski.data, like coffee, to show the effect of different skimage manipulations, as well as their constituent NumPy and SciPy operations.
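
To illustrate the point about small arrays, here is the same kind of slice assignment we used to ruin the coffee image, applied to a tiny all-black array so that every changed pixel can be read directly. The array `tiny` and its size are hypothetical, chosen only for this sketch:

```python
import numpy as np

# A tiny all-black RGB image: 4 rows, 4 columns, 3 color channels.
tiny = np.zeros((4, 4, 3), dtype=np.uint8)

# Paint the central 2-by-2 block pure green.
tiny[1:3, 1:3, :] = [0, 255, 0]  # [Red channel, Green channel, Blue channel]

# Show the green channel only.
print(tiny[:, :, 1])
```

With only 16 array pixels, it is easy to check that exactly four of them changed, and only in the green channel.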

Input/output and `dtype`s in `skimage`

Before manipulating and processing images, we will need to load them into memory. After we are finished with our high-concept digital art, we will want to save our creations. To help us with this, input and output (i.e. loading and saving image files) is handled by the skimage.io module.

We have already met this module on earlier pages but we will discuss some of its finer details here. skimage supports multiple image dtypes (i.e. the types of numbers within the image array), and loading and saving files can require close attention to, or even conversion of, the image dtype, as we will see shortly.

For now, let’s use ski.io.imread() to load a .png of the terrifying smile we hand-crafted in the earlier tutorials:

# Read in an image file, as a single-channel image.
smiley_from_file = ski.io.imread("images/smiley.png",
                                 as_gray=True)

# Show the "raw" NumPy output
smiley_from_file

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 1., 0., 0.],
       [0., 0., 1., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 1., 0.],
       [0., 1., 1., 0., 0., 1., 1., 0.],
       [0., 0., 1., 1., 1., 1., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

# Show the array (graphically).
plt.imshow(smiley_from_file);

# Show the attributes of the image.
show_attributes(smiley_from_file)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (11, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

No less terrifying than when we first saw it…

Note that ski.io.imread() loads image files as NumPy arrays by default. This is not the case with some other Python image libraries (like Pillow), which have their own ways of representing images.

Now, a lot of skimage functions serve specific purposes — like improving the quality or clarity of an image. Others are just there to look cool. The ski.transform.swirl() function falls into the latter category. According to the documentation, this function performs a non-linear deformation creating a whirlpool effect. Because skimage has loaded smiley.png as a NumPy array, we can pass the image straight to the swirl function:

#  Swirl the `smiley_from_file` array.
smiley_swirled = ski.transform.swirl(smiley_from_file,
                                     center=(3, 6), # Central pixel coordinate.
                                     radius=10)  # Extent of the `swirl`
                                                 # (in number of pixels)
# Show the "raw" NumPy output.
smiley_swirled

array([[0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.00338509,
        0.03785073, 0.        , 0.        ],
       [0.        , 0.14817294, 0.75800676, 0.        , 0.20296116,
        0.84274916, 0.        , 0.        ],
       [0.        , 0.22793052, 0.68880311, 0.        , 0.26538503,
        0.66049489, 0.        , 0.        ],
       [0.        , 0.06692754, 0.09253997, 0.        , 0.        ,
        0.06692754, 0.19601409, 0.01223244],
       [0.08225702, 0.47564046, 0.        , 0.        , 0.12032263,
        0.50828468, 0.91365098, 0.0458475 ],
       [0.01976743, 0.97398067, 0.47110121, 0.        , 0.46717838,
        0.97044561, 0.66297409, 0.00484172],
       [0.02687155, 0.34989506, 0.83183389, 0.93827959, 1.        ,
        0.71320245, 0.08225702, 0.        ],
       [0.        , 0.        , 0.14820102, 0.55782448, 0.84161038,
        0.20920003, 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.01297095, 0.        ,
        0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        ]])

# Show the swirled array (graphically).
plt.matshow(smiley_swirled);

smiley only gets more terrifying with each manipulation, it seems…

What swirl has done here is, well, swirled the pixels around a central point, resulting in this crooked, wonky smile. Now, if we want to save our persistently terrifying creation, we can use ski.io.imsave() to save images…

# OUCH!
ski.io.imsave("images/smiley_swirled.png", # Path to save image to.
              smiley_swirled)              # Image array to save.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/PIL/PngImagePlugin.py:1380, in _save(im, fp, filename, chunk, save_all)
   1379 try:
-> 1380     rawmode, bit_depth, color_type = _OUTMODES[outmode]
   1381 except KeyError as e:

KeyError: 'F'

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
Cell In[19], line 2
      1 # OUCH!
----> 2 ski.io.imsave("images/smiley_swirled.png", # Path to save image to.
      3               smiley_swirled)              # Image array to save.

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/_shared/utils.py:386, in deprecate_parameter.__call__.<locals>.fixed_func(*args, **kwargs)
    382     elif self.new_name is not None:
    383         # Assign old value to new one
    384         kwargs[self.new_name] = deprecated_value
--> 386 return func(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/io/_io.py:206, in imsave(fname, arr, plugin, check_contrast, **plugin_args)
    203     warn(f'{fname} is a low contrast image')
    205 with _hide_plugin_deprecation_warnings():
--> 206     return call_plugin('imsave', fname, arr, plugin=plugin, **plugin_args)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/_shared/utils.py:690, in deprecate_func.__call__.<locals>.wrapped(*args, **kwargs)
    684 stacklevel = (
    685     self.stacklevel
    686     if self.stacklevel is not None
    687     else _warning_stacklevel(func)
    688 )
    689 warnings.warn(message, category=FutureWarning, stacklevel=stacklevel)
--> 690 return func(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/io/manage_plugins.py:254, in call_plugin(kind, *args, **kwargs)
    251     except IndexError:
    252         raise RuntimeError(f'Could not find the plugin "{plugin}" for {kind}.')
--> 254 return func(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/v3.py:139, in imwrite(uri, image, plugin, extension, format_hint, **kwargs)
    104 def imwrite(uri, image, *, plugin=None, extension=None, format_hint=None, **kwargs):
    105     """Write an ndimage to the given URI.
    106 
    107     The exact behavior depends on the file type and plugin used. To learn about
   (...)    136 
    137     """
--> 139     with imopen(
    140         uri,
    141         "w",
    142         legacy_mode=False,
    143         plugin=plugin,
    144         format_hint=format_hint,
    145         extension=extension,
    146     ) as img_file:
    147         encoded = img_file.write(image, **kwargs)
    149     return encoded

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/core/v3_plugin_api.py:367, in PluginV3.__exit__(self, type, value, traceback)
    366 def __exit__(self, type, value, traceback) -> None:
--> 367     self.close()

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/plugins/pillow.py:145, in PillowPlugin.close(self)
    144 def close(self) -> None:
--> 145     self._flush_writer()
    147     if self._image:
    148         self._image.close()

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/plugins/pillow.py:486, in PillowPlugin._flush_writer(self)
    483     self.save_args["save_all"] = True
    484     self.save_args["append_images"] = self.images_to_write
--> 486 primary_image.save(self._request.get_file(), **self.save_args)
    487 self.images_to_write.clear()
    488 self.save_args.clear()

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/PIL/Image.py:2590, in Image.save(self, fp, format, **params)
   2587     fp = cast(IO[bytes], fp)
   2589 try:
-> 2590     save_handler(self, fp, filename)
   2591 except Exception:
   2592     if open_fp:

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/PIL/PngImagePlugin.py:1383, in _save(im, fp, filename, chunk, save_all)
   1381 except KeyError as e:
   1382     msg = f"cannot write mode {mode} as PNG"
-> 1383     raise OSError(msg) from e
   1384 if outmode == "I":
   1385     deprecate("Saving I mode images as PNG", 13, stacklevel=4)

OSError: cannot write mode F as PNG

Oh dear, what a horrible looking error for such a simple request. What has happened here? The error message is cryptic…

OSError: cannot write mode F as PNG

…but it is telling us that there is an issue with the dtype of the array we are trying to save:

smiley_swirled.dtype

dtype('float64')

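skimage supports the following dtypes, each with a conventional range of pixel values:

Data type	Range
`uint8`	0 to 255
`uint16`	0 to 65535
`uint32`	0 to 2^32 - 1
`float`	-1 to 1 or 0 to 1
`int8`	-128 to 127
`int16`	-32768 to 32767
`int32`	-2^31 to 2^31 - 1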

However, although skimage supports all these dtypes, the PNG image format does not. In particular, PNG does not support floating point image values, such as float64.
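
For the integer dtypes, NumPy itself can report the range of values each dtype can hold, via np.iinfo(). The short sketch below simply looks these ranges up. The selection of dtypes and the `ranges` dictionary are ours, for illustration:

```python
import numpy as np

# Collect the smallest and largest representable value for a few
# integer dtypes commonly used for images.
ranges = {}
for dtype in (np.uint8, np.uint16, np.int8, np.int16):
    info = np.iinfo(dtype)
    ranges[dtype.__name__] = (info.min, info.max)
    print(dtype.__name__, info.min, info.max)
```

Floating point dtypes have no such built-in image range, which is why the 0 to 1 range for float images is a convention rather than a hard limit.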

Issues like these are common, and as such skimage has a variety of functions to address them.

In this case, we can convert our image array from float64 to uint8 using the ski.util.img_as_ubyte() function:

smiley_swirled = ski.util.img_as_ubyte(smiley_swirled)

smiley_swirled.dtype

dtype('uint8')

This format is supported for .png files, and we can painlessly save our image using ski.io.imsave():

# Saving our image (successfully).
ski.io.imsave("images/smiley_swirled.png", # Path to save image to.
              smiley_swirled)              # Image array to save.

We can now use ski.io.imread() to read the file we just saved back into memory:

# Load the file back in.
load_back_in = ski.io.imread("images/smiley_swirled.png")

# Show the file.
plt.imshow(load_back_in);

We can see that the dtype of the freshly saved, freshly loaded image is indeed uint8:

show_attributes(load_back_in)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (11, 8)
Max Pixel Value: 255
Min Pixel Value: 0

Another option is to save to a different image format, like .jpg. Note that smiley_swirled now refers to the uint8 array we created above (we overwrote the float64 version), so the save below completes without the ugly error message:

# Save as `.jpg`.
ski.io.imsave("images/smiley_swirled.jpg", # Path to save image to.
              smiley_swirled)              # Image array to save.

Issues with dtype are a common source of errors, so it is important to be aware of which dtype your image arrays are using. Fortunately, as we have seen, skimage makes it easy to convert between dtypes when such errors do occur.

Note: when altering image dtypes, you should prefer the ski.util conversion functions to the NumPy ndarray.astype() method. This is because the skimage functions will respect the min/max pixel intensity value conventions shown in the table above, whereas .astype() may not. Here is a list of the skimage conversion functions (you will need them!):

# Show the `skimage` `dtype` conversion functions.
[func for func in dir(ski.util) if func.startswith('img')]

['img_as_bool',
 'img_as_float',
 'img_as_float32',
 'img_as_float64',
 'img_as_int',
 'img_as_ubyte',
 'img_as_uint']
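
To see why .astype() on its own can mislead, consider a few float pixel values between 0 and 1. The sketch below compares a bare .astype() with a manual rescale to the 0 to 255 range. The rescale is only a rough NumPy approximation of the float-to-uint8 convention, not skimage’s actual implementation, and the variable names are ours:

```python
import numpy as np

# Three float pixel intensities, using the 0-to-1 convention.
float_pixels = np.array([0.0, 0.2, 1.0])

# `.astype()` simply truncates each value to an integer, so almost
# all of the intensity information is lost.
truncated = float_pixels.astype(np.uint8)

# Rescaling to the 0-to-255 range first keeps the relative intensities.
rescaled = np.rint(float_pixels * 255).astype(np.uint8)

print(truncated)
print(rescaled)
```

The truncated array contains only 0s and a single 1, which would display as an almost entirely black uint8 image.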

Small image arrays, big image arrays

Before we move on, a note on swirl: we said earlier that it is easier to understand a manipulation at the array pixel-level using a small (low-resolution) array, but easier to appreciate its global visual effect on a larger (high-resolution) image. We saw the effect of swirl on the smiley image, but the nature of the visual effect can be seen more clearly when we apply it to coffee:

# `swirl` the coffee image
plt.imshow(ski.transform.swirl(coffee, strength=100));

Pretty trippy…

Exercise 8

Now investigate the use of ski.util.random_noise() to add extra noise to an image. Remember that in the context of image processing, noise is randomness. Here is the original camera image from ski.data:

camera = ski.data.camera()
plt.imshow(camera);

Try to recreate something like the image below, by adding noise to the image with ski.util.random_noise(). Like most skimage functions, random_noise() takes a NumPy image array as an argument, and adds random quantities to the array pixel values. To match the target image below, you will need to adjust an optional argument. See if you can work out which argument it is by reading the documentation.

Make sure you pay attention to the colors (RGB) of the noise…

Your new image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (512, 512, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

# YOUR CODE HERE

Solution to Exercise 8

The solution here is to np.stack() the camera image, to introduce color channels. (You could also use np.repeat). We then pass our 3D array to ski.util.random_noise(), manipulating the var argument to control the “level” of the noise. Specifically, var sets the variance of the random noise. Higher values of var mean that larger random values can potentially be added to, or subtracted from, each pixel:

# Stack to 3D.
camera_with_noise  = np.stack([camera,
                               camera,
                               camera],
                               axis=2)

# Add noise to every color channel.
camera_with_noise = ski.util.random_noise(camera_with_noise, var=3)

# Show the result.
plt.imshow(camera_with_noise)
show_attributes(camera_with_noise);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (512, 512, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
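
As a rough picture of what adding noise with a given variance amounts to, the sketch below adds zero-mean Gaussian noise to a small stand-in array and clips the result back into the 0 to 1 range. This is a plain NumPy approximation under our own assumptions (a seeded generator, a flat gray stand-in image), not the implementation of random_noise():

```python
import numpy as np

rng = np.random.default_rng(0)  # Seeded so the sketch is repeatable.

# A stand-in for a float image with values between 0 and 1.
image = np.full((4, 4, 3), 0.5)

# Draw zero-mean Gaussian noise with the chosen variance, add it to
# every pixel value, then clip back into the valid 0-to-1 range.
var = 3
noise = rng.normal(loc=0.0, scale=np.sqrt(var), size=image.shape)
noisy = np.clip(image + noise, 0, 1)

print(noisy.min(), noisy.max())
```

Because each channel receives its own independent random values, the noise appears colored rather than gray.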

Colorspaces

Colorspaces are different ways of mapping values to colors.

So far we have looked primarily at binary/monochrome single-channel image arrays and three-channel Red-Green-Blue (RGB) color image arrays. We can use the shape conventions from NumPy to think about these image types. Single channel images have a shape of (n, m) where n is the number of rows and m is the number of columns. RGB images have a shape of (n, m, 3), so n rows, m columns and 3 slices (the color channels).

RGB is a common standard, but color images of shape (n, m, 3) can use different representations of color to the RGB method we have seen. We call these different color representation formats colorspaces. skimage supports many of them, and contains many functions for converting image arrays between colorspaces. These functions are contained in the ski.color module:

# Show the functions in `ski.color`.
dir(ski.color)

['ahx_from_rgb',
 'bex_from_rgb',
 'bpx_from_rgb',
 'bro_from_rgb',
 'color_dict',
 'combine_stains',
 'convert_colorspace',
 'deltaE_cie76',
 'deltaE_ciede2000',
 'deltaE_ciede94',
 'deltaE_cmc',
 'fgx_from_rgb',
 'gdx_from_rgb',
 'gray2rgb',
 'gray2rgba',
 'hax_from_rgb',
 'hdx_from_rgb',
 'hed2rgb',
 'hed_from_rgb',
 'hpx_from_rgb',
 'hsv2rgb',
 'lab2lch',
 'lab2rgb',
 'lab2xyz',
 'label2rgb',
 'lch2lab',
 'luv2rgb',
 'luv2xyz',
 'rbd_from_rgb',
 'rgb2gray',
 'rgb2hed',
 'rgb2hsv',
 'rgb2lab',
 'rgb2luv',
 'rgb2rgbcie',
 'rgb2xyz',
 'rgb2ycbcr',
 'rgb2ydbdr',
 'rgb2yiq',
 'rgb2ypbpr',
 'rgb2yuv',
 'rgb_from_ahx',
 'rgb_from_bex',
 'rgb_from_bpx',
 'rgb_from_bro',
 'rgb_from_fgx',
 'rgb_from_gdx',
 'rgb_from_hax',
 'rgb_from_hdx',
 'rgb_from_hed',
 'rgb_from_hpx',
 'rgb_from_rbd',
 'rgba2rgb',
 'rgbcie2rgb',
 'separate_stains',
 'xyz2lab',
 'xyz2luv',
 'xyz2rgb',
 'xyz_tristimulus_values',
 'ycbcr2rgb',
 'ydbdr2rgb',
 'yiq2rgb',
 'ypbpr2rgb',
 'yuv2rgb']

Let’s load in the cat image from ski.data, to look at a different colorspace:

# Load the image and show it.
cat = ski.data.cat()
show_attributes(cat)
plt.imshow(cat);

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (300, 451, 3)
Max Pixel Value: 231
Min Pixel Value: 0

On the previous page we manually converted color images to grayscale, using NumPy operations. As we saw earlier on the present page, functions from the ski.color module, like ski.color.rgb2gray(), can do this more elegantly, with less code:

# From RGB to grayscale.
gray_cat = ski.color.rgb2gray(cat)
show_attributes(gray_cat)
plt.imshow(gray_cat);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (300, 451)
Max Pixel Value: 0.76
Min Pixel Value: 0.02

This grayscale conversion is achieved by calculating the luminance of the image via a weighted sum of the information in each color channel:

\[ L = 0.2126R + 0.7152G + 0.0722B \]

This has the effect of removing the color channels, so the output array is 2D. Why do we use these specific numbers? You could go and knock on the door of your physicist friend to ask (don’t worry, they’ll be in…), or you can see this page for a detailed explanation of where the weights come from.

Exercise 9

Re-color the cat image using the luminance formula provided above. Your output image should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (300, 451)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

Hint: you may want to investigate ski.exposure.rescale_intensity() to help ensure that you match the attributes of the target image, above …

The cat image array is copied for you in the cell below:

# YOUR CODE HERE
cat_for_exercise = cat.copy()

Solution to Exercise 9

This is a very simple operation! We can just “plug” each color channel of the cat array into the formula, using NumPy array indexing:

# Apply the formula.        Red channel.           Green channel.         Blue channel.
cat_manual_gray = 0.2126 * cat[:, :, 0] + 0.7152 * cat[:, :, 1] + 0.0722 * cat[:, :, 2]
# Or the same result using NumPy broadcasting.
cat_manual_gray = np.sum(cat * [[[0.2126, 0.7152, 0.0722]]], axis=2)
plt.imshow(cat_manual_gray);

However, this leaves our image with a non-standard pixel intensity range, given the dtype:

show_attributes(cat_manual_gray)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (300, 451)
Max Pixel Value: 192.68
Min Pixel Value: 3.86

It is fairly standard in skimage for float64 images to have a pixel intensity range between 0 and 1 (or -1 to 1, but we digress). Among other things, this matches the convention that Matplotlib uses to display floating point color images (although, as we have seen, Matplotlib will render single-channel images with any value range, using the colormap). We might therefore choose to rescale the image to have the range 0 through 1. We can use ski.exposure.rescale_intensity() to do that:

# Ensure the correct pixel intensity range.
cat_manual_gray = ski.exposure.rescale_intensity(cat_manual_gray,
                                                 out_range=(0, 1)) # Set the desired pixel intensity range.

# Show the result.
show_attributes(cat_manual_gray)
plt.imshow(cat_manual_gray);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (300, 451)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
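
With an output range of 0 to 1, this kind of rescaling amounts to min-max scaling: subtract the smallest value, then divide by the spread between the largest and smallest values. The sketch below shows that arithmetic with plain NumPy on a small hypothetical array. It is an illustration of the idea, not the code inside rescale_intensity():

```python
import numpy as np

# A small stand-in array whose values span a non-standard range.
values = np.array([[3.86, 50.0],
                   [100.0, 192.68]])

# Min-max scaling: the minimum maps to 0 and the maximum maps to 1.
rescaled = (values - values.min()) / (values.max() - values.min())

print(rescaled.min(), rescaled.max())
```

The ordering and relative spacing of the pixel values is preserved; only the scale changes.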

We can also convert an image into a different colorspace (i.e. not grayscale). ski.color.rgb2hsv() will convert an RGB image to an image in the HSV colorspace. HSV stands for Hue, Saturation, Value. To get a feel for how HSV specifies color, have a look at the Wikipedia page link, and try the HSV color picker.

# Convert `cat` to the HSV colorspace.
hsv_cat = ski.color.rgb2hsv(cat)
show_attributes(hsv_cat)
plt.imshow(hsv_cat);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (300, 451, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

Psychedelic! Maybe this is how cats look from the perspective of other cats (i.e. from within their “umwelt”), but sadly this is probably not the case…
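
To get a feel for what the three HSV numbers mean for a single pixel, the sketch below converts a few individual RGB colors using colorsys from the Python standard library. We use colorsys here rather than skimage purely for illustration, on the assumption that both scale hue, saturation and value to the range 0 to 1:

```python
import colorsys

# A few RGB colors, with each channel between 0 and 1.
colors = {"red": (1.0, 0.0, 0.0),
          "green": (0.0, 1.0, 0.0),
          "gray": (0.5, 0.5, 0.5)}

# Convert each color to a (hue, saturation, value) triple.
hsv_colors = {name: colorsys.rgb_to_hsv(*rgb) for name, rgb in colors.items()}

for name, hsv in hsv_colors.items():
    print(name, hsv)
```

Pure red and pure green differ only in hue, while gray has zero saturation: it has no “color” at all, only a value (brightness).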

Let’s extract the individual channels with some array indexing operations. Each channel is 2D, and therefore will render as a grayscale image when displayed with Matplotlib:

# Extract the HSV channels.
hue_slice = hsv_cat[:, :, 0]
saturation_slice = hsv_cat[:, :, 1]
value_slice = hsv_cat[:, :, 2]

# Plot them for comparison.
plt.figure(figsize=(12, 7))
plt.subplot(2, 2, 1)
plt.imshow(cat)
plt.title('Original (RGB)')
plt.subplot(2, 2, 2)
plt.imshow(hue_slice)
plt.title('Hue Channel')
plt.subplot(2, 2, 3)
plt.imshow(saturation_slice)
plt.title('Saturation Channel')
plt.subplot(2, 2, 4)
plt.imshow(value_slice)
plt.title('Value Channel')
plt.tight_layout();

Colorspaces 2: transparency in the 3rd dimension

Now, other colorspaces involve arrays of different shapes. For instance, some image arrays are (n, m, 4) - so that’s n rows, m columns and 4 slices in the third dimension.

When we have four values in the third dimension, we can interpret these as Red-Green-Blue-Alpha (RGBA), where Alpha is the opacity of the color.

Let’s see what that extra slice does, using our tried and true squares image array:

# Show the array.
plt.matshow(squares);

Let’s use np.stack() to build a four-channel image array from it, setting only the 1st and 4th channels to have nonzero values (i.e. all values in the other channels, green and blue, are 0):

# Create an array with 4 channels (i.e. 4 slices in the third dimension).
four_channel_stack = np.stack([squares,
                               squares * 0, # All 0's in the green channel.
                               squares * 0, # All 0's in the blue channel.
                               squares], # Add a fourth slice in the third dimension.
                               axis=2)

plt.matshow(four_channel_stack);

_images/fb13f0ca0ead2d649bcb7368dcf9238d34960190c3400236d189782b7b956234.png

Ok, so we just get red nonzero pixels. So what does the 4th channel do?

It controls transparency. Setting it to 1 gives maximum opacity, i.e. solid, non-see-through color.

Let’s set it lower:

# Scale the fourth slice so that its nonzero values equal 0.25.
four_channel_stack[:, :, 3] = four_channel_stack[:, :, 3] * 0.25

plt.matshow(four_channel_stack);

_images/fead857b3222e1f3d4d5e89c66d584e87b69fd2e734fc79b72854c69fe624d32.png
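One way to think about what the alpha channel does: when the image is drawn, each pixel's color is blended with whatever lies behind it, weighted by alpha. The sketch below works through that blend by hand for a single red pixel, assuming a plain white background and the standard "over" blend. The blend_over helper is hypothetical, and this illustrates the idea rather than describing Matplotlib's internals.

```python
import numpy as np

def blend_over(rgba_pixel, background_rgb):
    """Blend one RGBA pixel over an opaque background (hypothetical helper)."""
    rgb = np.asarray(rgba_pixel[:3], dtype=float)
    alpha = float(rgba_pixel[3])
    background = np.asarray(background_rgb, dtype=float)
    # The pixel contributes `alpha` of the result; the background the rest.
    return alpha * rgb + (1 - alpha) * background

white = [1.0, 1.0, 1.0]
solid_red = blend_over([1.0, 0.0, 0.0, 1.0], white)
faint_red = blend_over([1.0, 0.0, 0.0, 0.25], white)
print("alpha = 1.00:", solid_red)
print("alpha = 0.25:", faint_red)
```

With alpha at 0.25, three quarters of the result comes from the background, so on a white background the red pixel would look pale pink.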

This new transparency channel is called an alpha channel. Let’s add one to our cat image. We’ll duplicate the first slice of the third dimension, as the fourth slice:

four_channel_stack_cat = np.stack([cat[:, :, 0],
                                   cat[:, :, 1],
                                   cat[:, :, 2],
                                   cat[:, :, 0]], # Duplicate the first slice as the fourth slice...
                                   axis=2)

plt.imshow(four_channel_stack_cat);

_images/ebe0525cf3f8415499da93d19f643424610c66d06b48167451c46fce4a4754d0.png

Pretty ghostly…maybe this is how cats look to one another…

Exercise 10

Your task now is to manipulate camera to make it look like the target image below. Use whatever NumPy, skimage and numerical operations you need, based on what you have seen on this page and the previous pages. Here is the original camera image:

camera = ski.data.camera()
plt.imshow(camera);

The original image has the following attributes:

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (512, 512)
Max Pixel Value: 255
Min Pixel Value: 0

Here is your target image:

Your final image array should have the following attributes:

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (512, 512, 4)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

Here are some things to pay attention to, in the target image, to help you think about what operations you need here:

the shape of the target image.
the color of the target image in different regions.
the dtype of the target image.

Hint 1: run the function hints.camera() for some additional help if you get stuck.

# YOUR CODE HERE

Solution to Exercise 10

There are several steps here - this was not an easy task! First, we need to convert our image to the float64 dtype. We need the ski.util.img_as_float() function to do this. If you tried to use the NumPy .astype(float) method instead, you will have run into a thorny problem (more on this below).

If you look at the target image, you will notice that it has a green-ish tone, but not so green as we would get if the other color channels were “switched off”. As such, you can get the correct effect by reducing the impact of the red and blue channels, in fact by halving their values.

The left-hand side of the image varies in transparency - pixels where the image is darker are more transparent. This is because, in the target image, the alpha channel is a copy of the original 2D image.

The distinction between the left-hand side and right-hand side of the image is that in the right-hand side, the transparency channel values are all uniform, though less than maximum opacity, giving a hazy effect. This can be achieved by NumPy array indexing, and setting values in the alpha channel in the right-hand side of the image to all equal 0.5:

# Solution.

# Convert the image to `float64`.
camera_as_float = ski.util.img_as_float(camera)

# Create a four-channel stack. Halve the values in the Red and Blue channels.
# Duplicate the 2D image to get the alpha channel.
camera_4D = np.stack([camera_as_float * 0.5,  # Red
                      camera_as_float,        # Green
                      camera_as_float * 0.5,  # Blue
                      camera_as_float],       # Alpha
                      axis=2)

# Set roughly half of the pixels to have a uniform transparency value.
camera_4D[:, 240:512, 3] = 0.5

# Show the result.
show_attributes(camera_4D)
plt.imshow(camera_4D);

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (512, 512, 4)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

_images/65e3ea9693d950ad691b32556c3910b16e8ff0eb5b604e8e95b9710b5930edb3.png

Now, if you tried to do the float64 conversion using the numpy .astype() method, things will seem OK at first:

# DO NOT DO THIS!
camera_as_float_error = camera.astype(float)
show_attributes(camera_as_float_error)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (512, 512)
Max Pixel Value: 255.0
Min Pixel Value: 0.0

BUT the original camera image has dtype uint8, with a maximum array value of 255. You can see the maximum value of 255.0 in the printout from the cell above.

This maximum value violates the conventions, in skimage and Matplotlib, for float64 images. We can again np.stack() and manipulate this mutant array, without error at first…

camera_4D_error = np.stack([camera_as_float_error * 0.5,
                            camera_as_float_error,
                            camera_as_float_error * 0.5,
                            camera_as_float_error],
                            axis=2)
camera_4D_error[:, 240:512, 3] = 0.5
show_attributes(camera_4D_error)

Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (512, 512, 4)
Max Pixel Value: 255.0
Min Pixel Value: 0.0

…until we try to display it with Matplotlib:

plt.imshow(camera_4D_error);

Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..255.0].

_images/766557909954b2541c9232dc92525c9831709714ae24e858d8cd31b8ef0f222e.png

Matplotlib complains that the maximum values (255.0) violate the required 0 to 1 range of float64 images. It tries to clip the data to the valid range, but with horrible visual results. Use the skimage conversion functions to ensure the proper pixel intensity range for the current dtype, and so avoid these sorts of errors…
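For uint8 input, the rescaling that the conversion has to do amounts to dividing by the maximum uint8 value, 255. The sketch below does this by hand on a tiny made-up array, to show how it differs from a bare .astype(float). In real code, prefer ski.util.img_as_float(), which also handles other dtypes.

```python
import numpy as np

# A tiny made-up `uint8` image, spanning the full 0 to 255 range.
tiny_uint8 = np.array([[0, 51],
                       [102, 255]], dtype=np.uint8)

# Changing only the dtype keeps the 0 to 255 values.
only_recast = tiny_uint8.astype(float)

# Dividing by 255 moves the values into the 0 to 1 range
# expected of float images.
rescaled = tiny_uint8 / 255

print("Max after .astype(float):", only_recast.max())
print("Max after rescaling:     ", rescaled.max())
```

Both arrays have dtype float64, but only the rescaled one respects the 0 to 1 range that Matplotlib expects for float color images.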

Not all 3D images are in color…#

We noted earlier that it is important to know that not all 3D images contain color information. What sort of images are these? One set of examples are images obtained during medical imaging scans, like brain imaging. The “slices” in the third dimension, for these images, are literally slices of a 3D object (like a brain, chest, arm or leg!) rather than color channels.

To show this, below we use ski.io.imread() to load an image in one of these formats. The image is an X-ray of the head of a desert iguana.

Note: here we are loading a .png image. This is a highly atypical format for storing medical images, but we use it here to illustrate, in a familiar image format, a 3D image that does not contain color… The original image file, in the more standard NIfTI format, is here.

# Read in the `iguana` image, show its attributes.
iguana = ski.io.imread("images/brainy.png")
show_attributes(iguana)

Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (210, 256, 179)
Max Pixel Value: 229
Min Pixel Value: 0

We can see now that there are 179 slices in the third dimension - however, we do not have 179 color channels! Each “slice” is a different literal “slice” of the iguana’s head. Below, we “walk” through ten of the slices, in steps of 8, from early in the stack (iguana[:, :, 10]) to near the middle (iguana[:, :, 82]):

# Show multiple slices of the iguana's head...
plt.figure(figsize=(16, 6))
for count, k in enumerate(range(10, 84, 8)):
    plt.subplot(2, 5, count + 1)
    plt.imshow(iguana[:, :, k])
    plt.title(f"`iguana[:, :, {k}]`")
    plt.axis('off')

_images/c740f16456e13bd9cd7b3e88ce8c0ec11ea16c0e6e2e616d5e318ca2c74b6d32.png

You can see that we are walking through the iguana’s head, in the vertical direction.

Because this image file contains information about a literal 3D object (e.g. an iguana’s head), we can refer to it as a volumetric image. In such images, the third dimension contains spatial information, and not color information.

If you try to display an image that has more than four elements on the third dimension, Matplotlib will give an error: “Invalid shape … for image data”. Make a new cell and try plt.imshow(iguana) to trigger this error.
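The rule behind that error can be written down as a small check. The can_imshow helper below is hypothetical (it is not part of Matplotlib or skimage), and assumes that plt.imshow() accepts only 2D arrays, and 3D arrays with 3 or 4 elements on the last axis. An all-zero array with the same shape as the iguana image stands in for the real data.

```python
import numpy as np

def can_imshow(img_arr):
    """Return True if the array has a shape `plt.imshow()` should accept."""
    if img_arr.ndim == 2:
        return True
    return img_arr.ndim == 3 and img_arr.shape[2] in (3, 4)

# A stand-in with the same shape and dtype as `iguana`.
volume = np.zeros((210, 256, 179), dtype=np.uint8)

print(can_imshow(volume))            # The whole volume.
print(can_imshow(volume[:, :, 84]))  # A single 2D slice.
```

A check like this makes the point that a volumetric image has to be sliced down to 2D before it is displayed.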

Summary#

On this page we have seen that:

skimage represents images as NumPy arrays.
Many skimage functions take NumPy arrays as arguments.
The ski.io module handles input and output (e.g. reading and saving image files).
skimage supports multiple dtypes in image arrays, and contains convenience functions for converting between dtypes.
skimage supports multiple colorspaces — ways of mapping array values to colors.
Matplotlib supports both three-channel color image arrays with shape (n, m, 3) and four-channel color image arrays with shape (n, m, 4), where an extra channel codes for transparency/opacity.
Some 3D images are volumetric, and do not contain color channels, despite having a third dimension. skimage supports these image formats as well, but you will often have to slice these images to display them in Matplotlib.

On the next page we will look at how image processing can be performed using NumPy, SciPy and Scikit-image.

References#

See color images page references.