Attributes for Connected Components

Attributes for
Connected Components

Connected components are fundamental to the fields of image processing and computer vision, providing the basis for segmenting and analyzing images. In my Introduction to Max-Trees I delve into the methods of extracting these components from images in order to construct a Max-Tree — a data structure centered around the concept of connected components. However, the true utility of connected components emerges when we compute attributes that provide meaningful information about the image.

This article compiles a set of attributes that can be computed relatively easily, offering insights into the properties and characteristics of connected components. Additionally, a comprehensive collection of GNU Octave implementations for these attributes is provided, facilitating practical application. Some of these implementations are included in the body of this article as well.

While many of these attributes are particularly relevant for connected components within a max-tree structure, they are also applicable in other contexts. The content of this article is derived from my master’s thesis, “Detecting Astronomical Objects with Machine Learning” and aims to make its insights more accessible to a broader audience.

Area

An attribute that can easily be computed is the area. Berger et al. define this attribute as the number of elements within the component.¹ Tushabe provides the formal mathematical definition of area $A (X)$ of component $X$ as²

A (X) = \sum_{x \in X} 1_{X} (x) .

Here, $1_{X} (x)$ is the so-called identicator function, which is $1$ if $x \in X$ or $0$ otherwise. While the term “area” originated from two-dimensional components, it can also apply to higher dimensions, such as indicating the volume of a cube in three dimensions. Berger et al. provide an algorithm for computing the area of a component in the context of a max-tree, which is included below in GNU Octave code.¹

function a = area(f, R, parent)
    a = ones(size(R));

    [_, v] = sort(R(:));

    for x = v'
        if parent(x) == x
            continue
        end

        a(parent(x)) = a(parent(x)) + a(x);
    end
end

Perimeter

The perimeter is closely related to the area attribute. Gonzalez and Woods describe the perimeter as the length of the boundary of a component.³ Interpreting the boundary as the number of elements not exclusively connected to elements within the same component, a novel approach to computing the perimeter is presented in my master’s thesis.⁶ The complexity of computing the perimeter arises from the fact that an element on the boundary of its own component can also contribute to the boundary of its parent, its parent’s parent, etc. To address this, a boundary vector $ℬ (X)$ is constructed, indicating the contribution of a component $X$ to each layer in the tree. These contributions apply only to the direct and indirect parents of the component. First, we compute the intermediate contribution set $Λ (x)$ for each element $x$ , composed of the layers where $x$ is part of the perimeter:

Λ (x) = {y \in ℤ : \underset{n \in 𝒩 (x)}{argmin} λ (n) < u \leq λ (x)} .

Here, $𝒩 (x)$ are the neighbors of $x$ and $λ (x)$ its level. With $Λ (x)$ computed for every element $x$ , the contribution of $X$ to the perimeter of layer $λ$ is the number of $λ$ inclusions in the contribution set of elements in $X$ :

ℬ (X)_{λ} = | {x \in X : λ \in Λ (x)} | .

The perimeter $𝔅 (X_{λ}^{μ})$ of component $X_{λ}^{μ}$ in branch $μ$ and layer $λ$ is determined by summing the contributions of all components to $λ$ :

𝔅 (X_{λ}^{μ}) = \sum_{ν \in N} ℬ {(X_{γ}^{v})}_{λ},

where $N = ν \in ℤ : μ ⪯ v$ and $γ \geq λ$ . The set of branches $N$ includes all branches $ν$ that are either equal to or divarications of branch $μ$ (written as $μ ⪯ ν$ ).

Below is a GNU Octave implementation of the perimeter attribute:

function l = per(f, R, parent)
    b = zeros(size(f));
    l = zeros(size(f));

    for a = unique(f)'
        g = f >= a;
        c = zeros(size(f));

        c = c | g > [zeros(1, size(g, 2)); g(1:end-1, :)]; % n
        c = c | g > [zeros(size(g, 1), 1), g(:, 1:end-1)]; % e
        c = c | g > [g(2:end, :); zeros(1, size(g, 2))];   % s
        c = c | g > [g(:, 2:end), zeros(size(f, 1), 1)];   % w

        b = b + c;
    end

    [_, s] = sort(f(:), "descend");

    for x = s'
        y = x;

        while b(x) > 0
            l(y) = l(y) + 1;
            b(x) = b(x) - 1;
            y = parent(y);
        end
    end
end

Compactness & Circularity Ratio

Using the area $A (X)$ and perimeter $𝔅 (X)$ attributes, composite attributes like compactness and the circularity ratio can be computed. Gonzales and Woods provide the following formal definitions of compactnes $C (X)$ and circularity ratio $R_{c} (X)$ :³

\begin{matrix} C (X) = & \frac{𝔅 (X)^{2}}{A (X)}, and \\ R_{c} (X) = & \frac{4 π A (X)}{𝔅 (X)^{2}} . \end{matrix}

These attributes indicate the shape of the components they represent. For example, $R_{c} (X)$ approaches a value of $1$ as the shape of the component $X$ becomes more circular.

Grayscale, Power & Volume

In addition to positional information, the value or intensity represented by elements can be indicative of a component attribute. Many examples are straightforward, such as the sum of (squared) intensities and grayscale (average value within a component). Tushabe provides the formal definition of the latter:

G (X) = \frac{\sum_{x \in X} f (x)}{A (X)} .

This can be implemented in GNU Octave code as follows:

function g = grayscale(f, R, parent)
    g = zeros(size(f));
    a = area(f, R, parent);

    [_, s] = sort(f(:), "descend");

    for x = s'
        y = x;

        while true
            g(y) = g(y) + f(x) / a(y);

            if y == parent(y)
                break;

            end

            y = parent(y);
        end
    end
end

More sophisticated measurements based on intensity levels include power

𝒫 (X, f, α) = \sum_{x \in X} {(f (x) - α)}^{2}

for an image $f$ and parent intensity level $α$ , and volume

V (X, f, α) = \sum_{x \in X} (f (x) - α) .

Entropy

Gonzalez and Woods define the entropy attribute as the average amount of information conveyed by each element in a component.³ Given a discrete set of distinct grayscale values ${a_{1}, a_{2}, \dots, a_{J}}$ in a component of size $M \times N$ , the probability of encountering a grayscale value $a_{j}$ in the component is

P (a_{j}) = \frac{a_{J}}{M N} .

This function can be used to compute a histogram of component $X$ . Using this definition, Gonzalez and Woods provide the formal definition of entropy:³

H (X) = - \sum_{j = 1}^{J} P (a_{j}) \log P (a_{j}) .

Invariant Moments

Westenberg, Roerdink, and Wilkinson provide definition for the four invariant moments: non-compactness, elongation, flatness and sparseness.⁴ These are suitable for usage in two- and three-dimensional components, unlike those presented by Hu, which only apply to two-dimensional components.⁵ Westenberg, Roerdink, and Wilkinson define the non-compactness attribute $𝒩 (X)$ as

𝒩 (X) = \frac{Tr 𝐈 (X)^{\frac{3}{5}}}{A (X)}

with the moment of inertia tensor $𝐈 (X)$ defined as

𝐈_{i j} (X) = {\begin{matrix} \sum_{X} (i - \bar{i})^{2} + \frac{A (X)}{12} & if i = j \\ \sum_{X} (i - \bar{i}) (j - \bar{j}) & otherwise \end{matrix}

for $i, j \in {x, y, z}$ . Computing eigenvalues $λ_{i} (X)$ of $𝐈 (X)$ and ordering them such that

| λ_{1} (X) | \geq | λ_{2} (X) | \geq | λ_{3} (X) | .

allows the computation of the attributes elongation $ℰ (X)$ , flatness $ℱ (X)$ and sparseness $𝒮 (X)$ :

\begin{matrix} ℰ (X) = & | \frac{λ_{1} (X)}{λ_{2} (X)} |, \\ ℱ (X) = & | \frac{λ_{2} (X)}{λ_{3} (X)} |, and \\ 𝒮 (X) = & \frac{π}{6 A (X)} \prod_{i = 1}^{3} \sqrt{\frac{20 | λ_{i} (X) |}{A (X)}} . \end{matrix}

Christophe Berger, Theirry Géraud, Roland Levillain, Nicolas Widynski, Anthony Baillard and E. Berin. “Effective Component Tree Computation with Application to Pattern Recognition in Astronomical Imaging”. In Proceedings of the International Conference of image Processing, vol. 4, 2007, pp. 41 – 44. ISBN: 978-1-4244-1436-9. DOI: 10.1109/ICIP2007.4379949 ↩↩
Florence F. Tushabe. “Extending Attribute Filters to Color Processing and Multi-Media Applications”. 2010. ISBN: 978-90-367-4630-4. ↩
Rafael C. Gonzalez and Richard E. Woods. “Digital Image Processing”. 3rd edition. 2008. ISBN: 978-0-13-505267-9. ↩↩↩↩
Michael A. Westenberg, Jos B. Roerdink and Michael H.F. Wilkinson. “Volumetric Attribute Filtering and Interactive Visualization Using the Max-Tree Representation”. In: IEEE Transactions on Image Processing, vol. 16, issue 12, 2007, pp. 2943 – 2952. DOI: 10.1109/TIP.2007.909317. ↩
Ming-Kuei Hu. “Visual Pattern Recognition by Moment Invariants”. In: IRE Transactions on Informational Technology, vol. 8, issue 2, 1962, pp. 179 – 187. DOI: 10.1109/tit.1962.1057692. ↩
Michaël P. van de Weerd. “Detecting Astronomical Objects with Machine Learning”. 2021. URL: https://fse.studenttheses.ub.rug.nl/23854/. ↩