Introducing Scikit-image#

Scikit-image (skimage) is a powerful image processing package that is built largely on NumPy and SciPy, and uses NumPy arrays as its fundamental image representation. This page shows some foundational aspects of skimage, focusing on how it fits with the NumPy image array concepts we saw on the last two pages.

Images are arrays, arrays are images#

As we have seen, we can think of any (at least) two-dimensional array as an image — call this an image array. We know that each element of an image array (an “array pixel”) contains information about the intensity or color of that element. Perhaps obviously, these fundamental principles apply equally to Scikit-image, given that it also represents images as NumPy arrays. The table below outlines how images are represented in skimage - you’ll notice it is identical to the principles we have seen for representing images in NumPy:

Image | np.ndarray
Pixels | array values, e.g. img_arr[2, 3]
Image array dtypes | np.uint8, np.float64, many others
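As a minimal sketch of the three rows in the table, the snippet below builds a tiny array and inspects it. The array values are chosen only for illustration, and `img_arr` is just the placeholder name used in the table.

```python
import numpy as np

# A tiny 4x5 single-channel image array (values chosen only for illustration).
img_arr = np.arange(20, dtype=np.uint8).reshape(4, 5)

# One array pixel: row 2, column 3 (both counted from 0).
pixel = img_arr[2, 3]

print(type(img_arr))   # the image is a plain NumPy array
print(img_arr.dtype)   # the dtype shared by every array pixel
print(pixel)           # the value stored at row 2, column 3
```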

Before exploring how skimage works, let’s start by building some image objects to perform image manipulations on. As normal, we begin by importing some libraries:

# Import libraries.
import numpy as np
import matplotlib.pyplot as plt

# Set the default colormap for this session.
plt.rcParams['image.cmap'] = 'gray'

# Import a custom function to give hints for some exercises, and
# a custom function to quickly report array image attributes.
from skitut import hints, show_attributes

We can use some simple NumPy functions and the familiar array indexing syntax to create our familiar, exceedingly artistic, image array:

# Create an image array.
squares = np.array([[1, 0,],
                    [0, 1,]],
                   dtype=float)

# Show the array ("raw" output from NumPy)
squares
array([[1., 0.],
       [0., 1.]])
# Display the array as an image with Matplotlib
plt.matshow(squares);
_images/537da698d538dc2e027b8fef4e66635a37eb47d3d8fe0b431d863547b3f90def.png

You will recall that this array — in virtue of being 2D and having only one numeric value per array pixel — is a single-channel array. This means that it contains only information about pixel intensity (representable, for example, with gray-level), and gives no information about color.

To include color information, our array must have at least three dimensions. A three-dimensional color image array typically has three “slices” in the third dimension. Each element of each slice contains a number between 0 and 1 (for float64 data), or between 0 and 255 (for uint8 data). In the standard RGB (Red-Green-Blue) format, the three numbers for an array pixel, one from each slice, tell that pixel how much red, green and blue to mix. Hence these slices in the third dimension are called “color channels”.
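The two value conventions can be checked directly in NumPy. The sketch below writes the same pure-red pixel in both conventions; rounding with `np.rint` before casting is our choice for this illustration, not something prescribed by the tutorial.

```python
import numpy as np

# The same pure-red pixel under the two common conventions.
red_float = np.array([1.0, 0.0, 0.0], dtype=np.float64)  # values in 0..1
red_uint8 = np.array([255, 0, 0], dtype=np.uint8)        # values in 0..255

# The limits of uint8 come straight from the dtype itself.
uint8_info = np.iinfo(np.uint8)
print(uint8_info.min, uint8_info.max)

# Scaling the float pixel by 255 gives the uint8 pixel.
scaled = np.rint(red_float * 255).astype(np.uint8)
print(np.array_equal(scaled, red_uint8))
```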

As we saw previously, we can control the amount of color in the image by manipulating each channel in the third dimension:

# Using `np.stack()` to create a multi-channel array.
red_squares = np.stack([squares, # 1's are only present in the red channel.
                        squares * 0, # The green channel is "switched off".
                        squares * 0], # The blue channel is "switched off".
                        axis=2)
show_attributes(red_squares)
plt.matshow(red_squares);
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (2, 2, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
_images/9c8008b193d754d16733b519dc26fb71b21c6070e548f56afcecdc04e5cff7c5.png

We show this principle again below, for each color channel. The top row of the plot shows the channels for uint8 data with 255 as the maximum pixel intensity value. The second row of the plot shows each channel for float64 data, using 1 as the maximum pixel intensity value:

# Use a custom function to plot the maximum intensity values in the diagonals, for `int` and `float` data,
# using our familiar 1's and 0's array.
from skitut.random_colors import plot_int_float

plot_int_float();
_images/0a7388245c4028131fc7f69ef773feb2d8c17a5d39372297df6fd5878c038ef2.png

These are the basic aspects of images represented as NumPy arrays. Other ways of representing color exist and are supported by skimage, and not all 3D arrays representing images contain color information (e.g. the structural brain images we saw earlier). We will look at these concepts in later sections, but skimage can work with these images as easily as the basic colored squares we have just made: all these images are just NumPy arrays and can be handled like any other NumPy array.

Going ski-ing : arrays as arguments#

So far so good: we have an image array to experiment with. Let’s import skimage itself. The convention for importing skimage (and the convention you will see in most other people’s code) is to name the import ski:

# Import statement for Scikit-image and the conventional shorthand.
import skimage as ski

Because Scikit-image represents images as NumPy arrays, the majority of Scikit-image functions take NumPy ndarrays as arguments. For instance, the rgb2gray() function from the ski.color module can take a multi-channel array, and convert it to a single channel, grayscale image. We will show this with the red_squares array from above:

# Show the original `red_squares` array
show_attributes(red_squares)
plt.matshow(red_squares);
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (2, 2, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
_images/9c8008b193d754d16733b519dc26fb71b21c6070e548f56afcecdc04e5cff7c5.png

Now, we just pass the red_squares array to ski.color.rgb2gray() as an argument:

# Convert the `red_squares` array to grayscale.
back_to_black = ski.color.rgb2gray(red_squares)
show_attributes(back_to_black)
plt.matshow(back_to_black);
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (2, 2)
Max Pixel Value: 0.21
Min Pixel Value: 0.0
_images/537da698d538dc2e027b8fef4e66635a37eb47d3d8fe0b431d863547b3f90def.png

Many Scikit-image functions work like this - they take a NumPy array as input, do something to the array pixel values, then return an altered image array. As we mentioned above, under the hood skimage is often using NumPy and SciPy functions to do the image manipulation, something we will explore in more detail on the next page.

You may ask why the maximum pixel value is now 0.21, rather than 1. This is because of the formula that rgb2gray() uses to do the conversion. We will show this in detail in the Colorspaces section later in this tutorial.
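Returning to the array-in, array-out pattern described above, here is a minimal sketch of a function written in the same style. `invert_float_image` is a hypothetical helper invented for this illustration; it is not part of skimage, and it assumes a float image with values between 0 and 1.

```python
import numpy as np

def invert_float_image(image):
    """Hypothetical helper: return a new array with intensities flipped.

    Assumes a float image with values between 0 and 1.
    """
    return 1.0 - image

squares = np.array([[1, 0],
                    [0, 1]], dtype=float)

# The function takes an array and returns a new, altered array.
inverted = invert_float_image(squares)
print(inverted)
```

The input array is left untouched; the function returns a new array, which is how the skimage functions on this page are used as well.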

Scikit-image ships with built-in images#

Conveniently, skimage ships with example image arrays included. It should come as no surprise that these are also NumPy arrays. We can access all of the built-in images through the ski.data module. Here we use dir() to show all of the available module attributes, most of which are example image arrays:

dir(ski.data)
['astronaut',
 'binary_blobs',
 'brain',
 'brick',
 'camera',
 'cat',
 'cell',
 'cells3d',
 'checkerboard',
 'chelsea',
 'clock',
 'coffee',
 'coins',
 'colorwheel',
 'data_dir',
 'download_all',
 'eagle',
 'file_hash',
 'grass',
 'gravel',
 'horse',
 'hubble_deep_field',
 'human_mitosis',
 'immunohistochemistry',
 'kidney',
 'lbp_frontal_face_cascade_filename',
 'lfw_subset',
 'lily',
 'logo',
 'microaneurysms',
 'moon',
 'nickel_solidification',
 'page',
 'palisades_of_vogt',
 'protein_transport',
 'retina',
 'rocket',
 'shepp_logan_phantom',
 'skin',
 'stereo_motorcycle',
 'text',
 'vortex']

Here we will load the coffee image from ski.data:

# Load in the image.
coffee = ski.data.coffee()

# Show the image array.
coffee
array([[[ 21,  13,   8],
        [ 21,  13,   9],
        [ 20,  11,   8],
        ...,
        [228, 182, 138],
        [231, 185, 142],
        [228, 184, 140]],

       [[ 21,  13,   7],
        [ 21,  13,   9],
        [ 20,  14,   7],
        ...,
        [228, 182, 136],
        [231, 185, 139],
        [229, 183, 137]],

       [[ 21,  14,   7],
        [ 23,  13,  10],
        [ 20,  14,   9],
        ...,
        [228, 182, 136],
        [228, 184, 137],
        [229, 185, 138]],

       ...,

       [[189, 124,  77],
        [214, 155, 109],
        [197, 141, 100],
        ...,
        [165,  86,  37],
        [161,  82,  41],
        [143,  67,  29]],

       [[207, 148, 102],
        [201, 142,  99],
        [196, 140,  97],
        ...,
        [154,  74,  37],
        [147,  66,  33],
        [145,  65,  31]],

       [[197, 141, 100],
        [195, 137,  99],
        [193, 138,  98],
        ...,
        [158,  73,  38],
        [144,  64,  30],
        [143,  60,  29]]], shape=(400, 600, 3), dtype=uint8)
# Inspect the attributes of the image.
show_attributes(coffee)
Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (400, 600, 3)
Max Pixel Value: 255
Min Pixel Value: 0
# Show the image with Matplotlib.
plt.imshow(coffee);
_images/97fe6e20ae416a866bb7ec14a2e5b284f2a37fea6a05fbac65c3646d0ddd8270.png

Because these images are nothing but NumPy arrays, we can use standard array slicing to interact with them. For instance, we can use slicing to ruin the coffee image by placing a huge green square over it:

ruined_coffee = coffee.copy()
ruined_coffee[100:300, 200:400, :] = [0, 255, 0]  # [Red channel, Green channel, Blue channel]
plt.imshow(ruined_coffee);
_images/1d4d8d5b3cafce28fff5afe31a7ca85878a1cf0978fd1387838b13dccca9ad31.png
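The same slicing assignment is easier to follow on a small array, where every array pixel can be printed. This is a sketch using a made-up 4-by-6 image rather than coffee.

```python
import numpy as np

# A small all-black RGB image: 4 rows, 6 columns, 3 color channels.
small = np.zeros((4, 6, 3), dtype=np.uint8)

# Paint rows 1-2 and columns 2-4 green. The three-element list is
# broadcast to every array pixel in the selected region.
small[1:3, 2:5, :] = [0, 255, 0]

print(small[:, :, 1])  # the green channel
```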

Remember our maxim: “image processing” is when we do something that analyzes or changes the numbers inside the image array.

When we explore different image processing operations, using smaller arrays often makes it easier to understand what a given processing operation is doing to individual array pixels. However, in larger, more complex arrays, it is often easier to appreciate the “global” visual/perceptual effect of an image processing operation.

As such, we will use a variety of simple arrays (like squares) along with a variety of more complex images from ski.data, like coffee, to show the effect of different skimage manipulations, as well as their constituent NumPy and SciPy operations.

Input/output and dtypes in skimage#

Before manipulating and processing images, we will need to load them into memory. After we are finished with our high concept digital art, we will want to save our creations. To help us with this, input and output (e.g. loading and saving image files) is handled by the skimage.io module.

We have already met this module on earlier pages, but we will discuss some of its finer details here. skimage supports multiple image dtypes (i.e. the types of numbers within the image array), and loading and saving files can require close attention to, or even conversion of, the image dtype, as we will see shortly.

For now, let’s use ski.io.imread() to load a .png of the terrifying smile we hand-crafted in the earlier tutorials:

# Read in an image file, as a single-channel image.
smiley_from_file = ski.io.imread("images/smiley.png",
                                 as_gray=True)

# Show the "raw" NumPy output
smiley_from_file
array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 1., 0., 0.],
       [0., 0., 1., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 1., 0.],
       [0., 1., 1., 0., 0., 1., 1., 0.],
       [0., 0., 1., 1., 1., 1., 0., 0.],
       [0., 0., 0., 1., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])
# Show the array (graphically).
plt.imshow(smiley_from_file);
_images/297417f252b33580627496dc96632dbea90dfdee6aa49d13f19d2eee0af24a22.png
# Show the attributes of the image.
show_attributes(smiley_from_file)
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (11, 8)
Max Pixel Value: 1.0
Min Pixel Value: 0.0

No less terrifying than when we first saw it…

Note that ski.io.imread() loads image files as NumPy arrays by default. This is not the case with some other Python image libraries (like Pillow), which have their own ways of representing images.

Now, a lot of skimage functions serve specific purposes — like improving the quality or clarity of an image. Others are just there to look cool. The ski.transform.swirl() function falls into the latter category. According to the documentation, this function performs a non-linear deformation creating a whirlpool effect. Because skimage has loaded smiley.png as a NumPy array, we can pass the image straight to the swirl function:

#  Swirl the `smiley_from_file` array.
smiley_swirled = ski.transform.swirl(smiley_from_file,
                                     center=(3, 6), # Central pixel coordinate.
                                     radius=10)  # Extent of the `swirl`
                                                 # (in number of pixels)
# Show the "raw" NumPy output.
smiley_swirled
array([[0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.00338509,
        0.03785073, 0.        , 0.        ],
       [0.        , 0.14817294, 0.75800676, 0.        , 0.20296116,
        0.84274916, 0.        , 0.        ],
       [0.        , 0.22793052, 0.68880311, 0.        , 0.26538503,
        0.66049489, 0.        , 0.        ],
       [0.        , 0.06692754, 0.09253997, 0.        , 0.        ,
        0.06692754, 0.19601409, 0.01223244],
       [0.08225702, 0.47564046, 0.        , 0.        , 0.12032263,
        0.50828468, 0.91365098, 0.0458475 ],
       [0.01976743, 0.97398067, 0.47110121, 0.        , 0.46717838,
        0.97044561, 0.66297409, 0.00484172],
       [0.02687155, 0.34989506, 0.83183389, 0.93827959, 1.        ,
        0.71320245, 0.08225702, 0.        ],
       [0.        , 0.        , 0.14820102, 0.55782448, 0.84161038,
        0.20920003, 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.01297095, 0.        ,
        0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        ]])
# Show the swirled array (graphically).
plt.matshow(smiley_swirled);
_images/40b470d016f2edce9a78e7627cf9b6edb8e4a3a891dab5b74dc6bcaaa8e61c79.png

smiley only gets more terrifying with each manipulation, it seems…

What swirl has done here is, well, swirled the pixels around a central point, resulting in this crooked, wonky smile. Now, if we want to save our persistently terrifying creation, we can use ski.io.imsave() to save images…

# OUCH!
ski.io.imsave("images/smiley_swirled.png", # Path to save image to.
              smiley_swirled)              # Image array to save.
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/PIL/PngImagePlugin.py:1380, in _save(im, fp, filename, chunk, save_all)
   1379 try:
-> 1380     rawmode, bit_depth, color_type = _OUTMODES[outmode]
   1381 except KeyError as e:

KeyError: 'F'

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
Cell In[19], line 2
      1 # OUCH!
----> 2 ski.io.imsave("images/smiley_swirled.png", # Path to save image to.
      3               smiley_swirled)              # Image array to save.

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/_shared/utils.py:386, in deprecate_parameter.__call__.<locals>.fixed_func(*args, **kwargs)
    382     elif self.new_name is not None:
    383         # Assign old value to new one
    384         kwargs[self.new_name] = deprecated_value
--> 386 return func(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/io/_io.py:206, in imsave(fname, arr, plugin, check_contrast, **plugin_args)
    203     warn(f'{fname} is a low contrast image')
    205 with _hide_plugin_deprecation_warnings():
--> 206     return call_plugin('imsave', fname, arr, plugin=plugin, **plugin_args)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/_shared/utils.py:690, in deprecate_func.__call__.<locals>.wrapped(*args, **kwargs)
    684 stacklevel = (
    685     self.stacklevel
    686     if self.stacklevel is not None
    687     else _warning_stacklevel(func)
    688 )
    689 warnings.warn(message, category=FutureWarning, stacklevel=stacklevel)
--> 690 return func(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/skimage/io/manage_plugins.py:254, in call_plugin(kind, *args, **kwargs)
    251     except IndexError:
    252         raise RuntimeError(f'Could not find the plugin "{plugin}" for {kind}.')
--> 254 return func(*args, **kwargs)

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/v3.py:139, in imwrite(uri, image, plugin, extension, format_hint, **kwargs)
    104 def imwrite(uri, image, *, plugin=None, extension=None, format_hint=None, **kwargs):
    105     """Write an ndimage to the given URI.
    106 
    107     The exact behavior depends on the file type and plugin used. To learn about
   (...)    136 
    137     """
--> 139     with imopen(
    140         uri,
    141         "w",
    142         legacy_mode=False,
    143         plugin=plugin,
    144         format_hint=format_hint,
    145         extension=extension,
    146     ) as img_file:
    147         encoded = img_file.write(image, **kwargs)
    149     return encoded

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/core/v3_plugin_api.py:367, in PluginV3.__exit__(self, type, value, traceback)
    366 def __exit__(self, type, value, traceback) -> None:
--> 367     self.close()

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/plugins/pillow.py:145, in PillowPlugin.close(self)
    144 def close(self) -> None:
--> 145     self._flush_writer()
    147     if self._image:
    148         self._image.close()

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/imageio/plugins/pillow.py:486, in PillowPlugin._flush_writer(self)
    483     self.save_args["save_all"] = True
    484     self.save_args["append_images"] = self.images_to_write
--> 486 primary_image.save(self._request.get_file(), **self.save_args)
    487 self.images_to_write.clear()
    488 self.save_args.clear()

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/PIL/Image.py:2590, in Image.save(self, fp, format, **params)
   2587     fp = cast(IO[bytes], fp)
   2589 try:
-> 2590     save_handler(self, fp, filename)
   2591 except Exception:
   2592     if open_fp:

File /opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/PIL/PngImagePlugin.py:1383, in _save(im, fp, filename, chunk, save_all)
   1381 except KeyError as e:
   1382     msg = f"cannot write mode {mode} as PNG"
-> 1383     raise OSError(msg) from e
   1384 if outmode == "I":
   1385     deprecate("Saving I mode images as PNG", 13, stacklevel=4)

OSError: cannot write mode F as PNG

Oh dear, what a horrible looking error for such a simple request. What has happened here? The error message is cryptic…

OSError: cannot write mode F as PNG

…but it is telling us that there is an issue with the dtype of the array we are trying to save:

smiley_swirled.dtype
dtype('float64')

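skimage supports the following dtypes, each with a conventional range of pixel values:

uint8 | 0 to 255
uint16 | 0 to 65535
uint32 | 0 to 2^32 - 1
float | -1 to 1 or 0 to 1
int8 | -128 to 127
int16 | -32768 to 32767
int32 | -2^31 to 2^31 - 1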

However, although skimage supports all these dtypes, the PNG image format does not. In particular, PNG does not support floating point image values, such as float64.

Issues like these are common, and as such skimage has a variety of functions to address them.
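One way to spot this kind of problem before saving is to check whether the array holds floating point values. The sketch below uses only NumPy; `is_float_image` is a hypothetical helper written for this illustration, not a skimage function.

```python
import numpy as np

def is_float_image(image):
    """Hypothetical helper: True if the array holds floating point values."""
    return bool(np.issubdtype(image.dtype, np.floating))

float_img = np.zeros((2, 2), dtype=np.float64)
uint8_img = np.zeros((2, 2), dtype=np.uint8)

print(is_float_image(float_img), is_float_image(uint8_img))
```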

In this case, we can convert our image array from float64 to uint8 using the ski.util.img_as_ubyte() function:

smiley_swirled = ski.util.img_as_ubyte(smiley_swirled)

smiley_swirled.dtype
dtype('uint8')

This dtype is supported for .png files, and we can painlessly save our image using ski.io.imsave():

# Saving our image (successfully).
ski.io.imsave("images/smiley_swirled.png", # Path to save image to.
              smiley_swirled)              # Image array to save.

We can now use ski.io.imread() to read the file we just saved back into memory:

# Load the file back in.
load_back_in = ski.io.imread("images/smiley_swirled.png")

# Show the file.
plt.imshow(load_back_in);
_images/5c5dff20914097f082ba000ba46b21d7905e628daae0e0e77305efcbcb1c556d.png

We can see that the dtype of the freshly saved, freshly loaded image is indeed uint8:

show_attributes(load_back_in)
Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (11, 8)
Max Pixel Value: 255
Min Pixel Value: 0

We can also save to other image formats, like .jpg. Below we save the smiley_swirled array (which is now uint8) as a .jpg file:

# Save as `.jpg`.
ski.io.imsave("images/smiley_swirled.jpg", # Path to save image to.
              smiley_swirled)              # Image array to save.

Issues with dtype are a common source of errors, so it is important to be aware of what dtype your image arrays are using. Fortunately, as we have seen, skimage makes it easy to convert between dtypes when such errors do occur.

Note: when altering image dtypes, you should prefer the ski.util conversion functions to the NumPy ndarray.astype() method. This is because the skimage functions respect the min/max pixel intensity value conventions shown in the table above, whereas .astype() may not. Here is a list of the skimage conversion functions (you will need them!):

# Show the `skimage` `dtype` conversion functions.
[func for func in dir(ski.util) if func.startswith('img')]
['img_as_bool',
 'img_as_float',
 'img_as_float32',
 'img_as_float64',
 'img_as_int',
 'img_as_ubyte',
 'img_as_uint']
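To see why a plain cast can go wrong, the NumPy-only sketch below compares .astype() with a rescale-then-cast. The rescaling here only illustrates the idea behind a conversion such as img_as_ubyte; the exact steps skimage takes internally are not shown, and the use of `np.rint` is our assumption for this example.

```python
import numpy as np

float_pixels = np.array([0.0, 0.25, 1.0])  # float convention: 0..1

# Plain casting truncates the values; the image becomes almost black.
naive = float_pixels.astype(np.uint8)

# Rescaling to 0..255 first preserves the relative intensities.
rescaled = np.rint(float_pixels * 255).astype(np.uint8)

print(naive)
print(rescaled)
```

With the plain cast, the brightest float pixel (1.0) ends up as 1 out of a possible 255, which is why the cast image would look nearly black.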

Small image arrays, big image arrays#

Before we move on, a note on swirl. We said earlier that it is easier to understand a manipulation at the array pixel-level using a small (low-resolution) array, but easier to appreciate its global visual effect on a larger (high-resolution) image. We saw the effect of swirl in smiley_swirled, but the nature of the visual effect can be seen more clearly when we apply it to coffee:

# `swirl` the coffee image
plt.imshow(ski.transform.swirl(coffee, strength=100));
_images/2459635bcf7e3471c9634473e92f050c485c20c9ba1ef7898ab427c002f50722.png

Pretty trippy…

Colorspaces#

Colorspaces are different ways of mapping values to colors.

So far we have looked primarily at binary/monochrome single-channel image arrays and three-channel Red-Green-Blue (RGB) color image arrays. We can use the shape conventions from NumPy to think about these image types. Single channel images have a shape of (n, m) where n is the number of rows and m is the number of columns. RGB images have a shape of (n, m, 3), so n rows, m columns and 3 slices (the color channels).
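These shape conventions can be checked in code. The sketch below uses a hypothetical helper, `describe_image_shape`, written only for this illustration; note that a shape of (n, m, 3) does not by itself guarantee that the channels are RGB.

```python
import numpy as np

def describe_image_shape(image):
    """Hypothetical helper: guess the image type from the array shape."""
    if image.ndim == 2:
        return "single-channel"
    if image.ndim == 3 and image.shape[2] == 3:
        return "three-channel (e.g. RGB)"
    return "other"

single = np.zeros((11, 8))       # n rows, m columns
color = np.zeros((400, 600, 3))  # n rows, m columns, 3 channels

print(describe_image_shape(single))
print(describe_image_shape(color))
```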

RGB is a common standard, but color images of shape (n, m, 3) can use different representations of color to the RGB method we have seen. We call these different color representation formats colorspaces. skimage supports many of them, and contains many functions for converting image arrays between colorspaces. These functions are contained in the ski.color module:

# Show the functions in `ski.color`.
dir(ski.color)
['ahx_from_rgb',
 'bex_from_rgb',
 'bpx_from_rgb',
 'bro_from_rgb',
 'color_dict',
 'combine_stains',
 'convert_colorspace',
 'deltaE_cie76',
 'deltaE_ciede2000',
 'deltaE_ciede94',
 'deltaE_cmc',
 'fgx_from_rgb',
 'gdx_from_rgb',
 'gray2rgb',
 'gray2rgba',
 'hax_from_rgb',
 'hdx_from_rgb',
 'hed2rgb',
 'hed_from_rgb',
 'hpx_from_rgb',
 'hsv2rgb',
 'lab2lch',
 'lab2rgb',
 'lab2xyz',
 'label2rgb',
 'lch2lab',
 'luv2rgb',
 'luv2xyz',
 'rbd_from_rgb',
 'rgb2gray',
 'rgb2hed',
 'rgb2hsv',
 'rgb2lab',
 'rgb2luv',
 'rgb2rgbcie',
 'rgb2xyz',
 'rgb2ycbcr',
 'rgb2ydbdr',
 'rgb2yiq',
 'rgb2ypbpr',
 'rgb2yuv',
 'rgb_from_ahx',
 'rgb_from_bex',
 'rgb_from_bpx',
 'rgb_from_bro',
 'rgb_from_fgx',
 'rgb_from_gdx',
 'rgb_from_hax',
 'rgb_from_hdx',
 'rgb_from_hed',
 'rgb_from_hpx',
 'rgb_from_rbd',
 'rgba2rgb',
 'rgbcie2rgb',
 'separate_stains',
 'xyz2lab',
 'xyz2luv',
 'xyz2rgb',
 'xyz_tristimulus_values',
 'ycbcr2rgb',
 'ydbdr2rgb',
 'yiq2rgb',
 'ypbpr2rgb',
 'yuv2rgb']

Let’s load in the cat image from ski.data, to look at a different colorspace:

# Load the image and show it.
cat = ski.data.cat()
show_attributes(cat)
plt.imshow(cat);
Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (300, 451, 3)
Max Pixel Value: 231
Min Pixel Value: 0
_images/4ee7477b739aa9fda7066966d5336ad7e5db317659e8f8e26fc7d7318b27b680.png

On the previous page we manually converted color images to grayscale, using NumPy operations. As we saw earlier on the present page, functions from the ski.color module, like ski.color.rgb2gray(), can do this more elegantly, with less code:

# From RGB to grayscale.
gray_cat = ski.color.rgb2gray(cat)
show_attributes(gray_cat)
plt.imshow(gray_cat);
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (300, 451)
Max Pixel Value: 0.76
Min Pixel Value: 0.02
_images/9c3ae295adbaf0628ca3aed667e93d398aa75b4efab21b06d3bf6f5eccb172c6.png

This grayscale conversion is achieved by calculating the luminance of the image via a weighted sum of the information in each color channel:

\[ L = 0.2126R + 0.7152G + 0.0722B \]

This has the effect of removing the color channels, so the output array is 2D. Why do we use these specific numbers? You could go and knock on the door of your physicist friend to ask (don’t worry, they’ll be in…), or you can see this page for a detailed explanation of where the weights come from.
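The weighted sum can be sketched with plain NumPy. The weights below are copied from the formula above; skimage's own implementation may differ in the last decimal places, so treat this as an illustration of the idea rather than a reproduction of rgb2gray().

```python
import numpy as np

# Weights taken from the formula above (red, green, blue).
weights = np.array([0.2126, 0.7152, 0.0722])

# A 2x2 float RGB image with pure red on the diagonal.
squares = np.array([[1, 0],
                    [0, 1]], dtype=float)
red_squares = np.stack([squares, squares * 0, squares * 0], axis=2)

# A weighted sum over the last axis collapses (2, 2, 3) to (2, 2).
luminance = red_squares @ weights

print(luminance.shape)
print(luminance.max())
```

A pure red pixel keeps only the red weight, which is why the maximum value of the grayscale red_squares array rounds to 0.21.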

We can also convert an image into a different colorspace (i.e. one other than grayscale). ski.color.rgb2hsv() will convert an RGB image to an image in the HSV colorspace. HSV stands for Hue, Saturation, Value. To get a feel for how HSV specifies color, have a look at the Wikipedia page, and try the HSV color picker.

# Convert `cat` to the HSV colorspace
hsv_cat = ski.color.rgb2hsv(cat)
show_attributes(hsv_cat)
plt.imshow(hsv_cat);
Type: <class 'numpy.ndarray'>
dtype: float64
Shape: (300, 451, 3)
Max Pixel Value: 1.0
Min Pixel Value: 0.0
_images/5c0fd3bcc2fedb56d4a43734a3fdf855c12e9eac10bcb112d639640f18e7e3dd.png

Psychedelic! Maybe this is how cats look, from the perspective of other cats (i.e. from within their “umwelt”), but sadly this is probably not the case…

Let’s extract the individual channels with some array indexing operations. Each channel is 2D, and therefore will render as a grayscale image when displayed with Matplotlib:

# Extract the HSV channels.
hue_slice = hsv_cat[:, :, 0]
saturation_slice = hsv_cat[:, :, 1]
value_slice = hsv_cat[:, :, 2]

# Plot them for comparison.
plt.figure(figsize=(12, 7))
plt.subplot(2, 2, 1)
plt.imshow(cat)
plt.title('Original (RGB)')
plt.subplot(2, 2, 2)
plt.imshow(hue_slice)
plt.title('Hue Channel')
plt.subplot(2, 2, 3)
plt.imshow(saturation_slice)
plt.title('Saturation Channel')
plt.subplot(2, 2, 4)
plt.imshow(value_slice)
plt.title('Value Channel')
plt.tight_layout();
_images/877be7df3ccb4fac5cd43ca9a030dbc2b71b0ed94f5ff9ebb03973056be34184.png
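To get a feel for what the three HSV channels encode, it can help to convert a few single colors by hand. The sketch below uses `colorsys` from the Python standard library instead of skimage, so that it works on one pixel at a time; it takes and returns floats between 0 and 1.

```python
import colorsys

# Convert a few single RGB colors to (hue, saturation, value).
for name, rgb in [("red", (1.0, 0.0, 0.0)),
                  ("green", (0.0, 1.0, 0.0)),
                  ("gray", (0.5, 0.5, 0.5))]:
    hue, saturation, value = colorsys.rgb_to_hsv(*rgb)
    print(name, round(hue, 3), saturation, value)
```

Red and green differ only in hue, while gray has zero saturation: it has brightness (value) but no color.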

Colorspaces 2: transparency in the 3rd dimension#

Now, other colorspaces involve arrays of different shapes. For instance, some image arrays are (n, m, 4) - so that’s n rows, m columns and 4 slices in the third dimension.

When we have four values in the third dimension, we can interpret these as Red-Green-Blue-Alpha (RGBA), where Alpha is the opacity of the color.

Let’s see what that extra slice does, using our tried and true squares image array:

# Show the array.
plt.matshow(squares);
_images/537da698d538dc2e027b8fef4e66635a37eb47d3d8fe0b431d863547b3f90def.png

Let’s np.stack() this image array into a 3D array with four channels, and set only the 1st and 4th channels to have nonzero values (i.e. so all values in the other, green and blue, channels are 0’s):

# Create an array with 4 channels (e.g. 4 slices in the third dimension).
four_channel_stack = np.stack([squares,
                               squares * 0, # All 0's in the green channel.
                               squares * 0, # All 0's in the blue channel.
                               squares], # Add a fourth slice in the third dimension.
                               axis=2)

plt.matshow(four_channel_stack);
_images/fb13f0ca0ead2d649bcb7368dcf9238d34960190c3400236d189782b7b956234.png

Ok, so we just get red nonzero pixels. So what does the 4th channel do?

It controls transparency. Setting it to 1 gives maximum opacity, i.e. solid, non-see-through color.

Let’s set it lower:

four_channel_stack[:, :, 3] = four_channel_stack[:, :, 3] * 0.25     # Fourth slice nonzero values to equal .25

plt.matshow(four_channel_stack);
_images/fead857b3222e1f3d4d5e89c66d584e87b69fd2e734fc79b72854c69fe624d32.png
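One common way to think about what an alpha value does is as a weighted mix of the pixel's color and whatever lies behind it. The sketch below assumes a white background and this simple "over" blending rule; both are assumptions made for illustration, not a description of Matplotlib's rendering code, and `blend_over` is a hypothetical helper.

```python
import numpy as np

def blend_over(foreground_rgb, alpha, background_rgb):
    """Hypothetical helper: blend a color over a background.

    alpha = 1 shows only the foreground, alpha = 0 only the background.
    """
    foreground_rgb = np.asarray(foreground_rgb, dtype=float)
    background_rgb = np.asarray(background_rgb, dtype=float)
    return alpha * foreground_rgb + (1 - alpha) * background_rgb

red = [1.0, 0.0, 0.0]
white = [1.0, 1.0, 1.0]

print(blend_over(red, 1.0, white))   # fully opaque
print(blend_over(red, 0.25, white))  # mostly background
```

Under these assumptions, red at an alpha of 0.25 becomes a pale pink, because three quarters of the result comes from the white background.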

This new transparency channel is called an alpha channel. Let’s add one to our cat image. We’ll duplicate the first slice of the third dimension, as the fourth slice:

four_channel_stack_cat = np.stack([cat[:, :, 0],
                                   cat[:, :, 1],
                                   cat[:, :, 2],
                                   cat[:, :, 0]], # Duplicate the first slice as the fourth slice...
                                   axis=2)

plt.imshow(four_channel_stack_cat);
_images/ebe0525cf3f8415499da93d19f643424610c66d06b48167451c46fce4a4754d0.png

Pretty ghostly…maybe this is how cats look to one another…
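For comparison, here is a sketch of adding a fully opaque alpha channel to a uint8 RGB array, where 255 plays the role that 1 plays for float data. The small random array is only a stand-in for a real image such as cat.

```python
import numpy as np

# A stand-in 2x3 RGB image of dtype uint8 (random values for illustration).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(2, 3, 3), dtype=np.uint8)

# For uint8 data, 255 means fully opaque.
alpha = np.full(rgb.shape[:2], 255, dtype=np.uint8)

# Append the alpha slice as a fourth channel.
rgba = np.dstack([rgb, alpha])

print(rgba.shape, rgba.dtype)
```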

Not all 3D images are in color…#

We noted earlier that it is important to know that not all 3D images contain color information. What sort of images are these? One set of examples is the images obtained during medical imaging scans, like brain imaging. The “slices” in the third dimension, for these images, are literally slices of a 3D object (like a brain, chest, arm or leg!) rather than color channels.

To show this, below we use ski.io.imread() to load an image in one of these formats. The image is an X-ray of the head of a desert iguana.

Note: here we are loading a .png image. This is a highly atypical format for storing medical images, but we use it to illustrate a 3D image that does not contain color, using a familiar image format. The original image file, in the more standard NIfTI format, is here.

# Read in the `iguana` image, show its attributes.
iguana = ski.io.imread("images/brainy.png")
show_attributes(iguana)
Type: <class 'numpy.ndarray'>
dtype: uint8
Shape: (210, 256, 179)
Max Pixel Value: 229
Min Pixel Value: 0

We can see now that there are 179 slices in the third dimension. However, we do not have 179 color channels! Each “slice” is a different literal “slice” of the iguana’s head. Below, we “walk” through the slices in steps of 8, from early in the stack (iguana[:, :, 10]) to near the middle (iguana[:, :, 82]):

# Show multiple slices of the iguana's head...
plt.figure(figsize=(16, 6))
for count, k in enumerate(range(10, 84, 8)):
    plt.subplot(2, 5, count + 1)
    plt.imshow(iguana[:, :, k])
    plt.title(f"`iguana[:, :, {k}]`")
    plt.axis('off')
_images/c740f16456e13bd9cd7b3e88ce8c0ec11ea16c0e6e2e616d5e318ca2c74b6d32.png

You can see that we are walking through the iguana’s head, in the vertical direction.

Because this image file contains information about a literal 3D object (here, an iguana’s head), we can refer to it as a volumetric image. In such images, the third dimension contains spatial information, and not color information.
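The slicing used in the plotting loop can be tried without the image file. The sketch below uses an all-zeros stand-in with the same shape as the iguana array, lists the slice indices that range(10, 84, 8) visits, and confirms that each slice is an ordinary 2D image array.

```python
import numpy as np

# A stand-in volume with the same shape as the iguana array (all zeros).
volume = np.zeros((210, 256, 179), dtype=np.uint8)

# The slice indices visited by range(10, 84, 8).
slice_indices = list(range(10, 84, 8))
print(slice_indices)

# Each slice along the third axis is an ordinary 2D image array.
one_slice = volume[:, :, slice_indices[0]]
print(one_slice.shape)
```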

If you try to display images that have more than four elements on the third dimension, Matplotlib will give an error “Invalid shape … for image data”. Make a new cell and try plt.imshow(iguana) to trigger this error.

Summary#

On this page we have seen that:

  • skimage represents images as NumPy arrays.

  • Many skimage functions take NumPy arrays as arguments.

  • The ski.io module handles input and output (e.g. reading and saving image files).

  • skimage supports multiple dtypes in image arrays, and contains convenience functions for converting between dtypes.

  • skimage supports multiple colorspaces — ways of mapping array values to colors.

  • Matplotlib supports both three-channel color image arrays with shape (n, m, 3) and four-channel color image arrays with shape (n, m, 4), where an extra channel codes for transparency/opacity.

  • Some 3D images are volumetric, and do not contain color channels, despite having a third dimension. skimage supports these image formats as well, but you will often have to slice these images to display them in Matplotlib.

On the next page we will look at how image processing can be performed using NumPy, SciPy and Scikit-image.

References#

See color images page references.