In the previous article we have examined some of the most commonly used convolution smoothing filters, which are linear as they are applied through convolution (a linear operator) to generate the filtered image. However, many filters do not use convolution as they are non-linear by design and find application in cases where a linear operation is not suitable to solve the problem. Most of those examined here are based on statistical computations within a region of the original image in order to account for local characteristics and increase the discrimination power between features of interest and unwanted signals. The smoothing process of these filters generally leads to the computation of pixel values that are more representative of the local features compared to linear filters, especially in the presence of noise, leading to a better preservation of the details.
Median filter
The median filter is probably the simplest among non-linear filters. Unlike convolution filters, it does not perform a linear combination between the weights of a kernel and the pixels of the image. Instead, the filtered image is obtained by computing the median in a neighborhood of each pixel in the original image, which requires a non-linear operation.
The estimation of the median involves the use of a sliding window (here simply called “window”) representing the neighborhood around the considered pixel from which to gather the statistical information used in the calculation. The pixels in the window are first sorted in numerical order, the median is then chosen as the middle element in the sorted array and considered as the final pixel in the resulting image, as depicted in Fig.14

The processing is similar to the one used in the convolution in that all pixels in the original image must be scanned and local information analyzed to compute the final value but the whole operation is non-linear and the complexity is generally higher compared to linear filters, depending on the chosen algorithm. Given a window \(W\) of size \(m \times m\), a median filter in its simplest form computes each pixel \(g_f(x,y)\) in the filtered image as follows
$$ \begin{align} g_f(x,y) & = median \bigl(W(x,y)\bigr) \\ & = mid(sort(W(x,y))) \end{align} \tag{9}$$
where \(sort()\) is a sorting function that returns a vector \(\mathbf v\) of numerically ordered elements from the window \(W(x,y)\), and \(mid(\mathbf v)\) is a function that returns the middle element(s) of a vector, defined as follows
$$ mid(\mathbf v) = \begin{cases} \mathbf v_{\frac{m^2 + 1}{2}} & \text{if $m$ is odd} \\ \frac{1}{2} \left( \mathbf v_{\frac{m^2}{2}} + \mathbf v_{\frac{m^2}{2}+1} \right) & \text{if $m$ is even} \end{cases} \tag{10}$$
Fig.14.1 shows an example response of this filter. As can be noticed, compared to linear smoothing filters the median filter does not produce blurring (but it can smear the details if the size is not suitably chosen). It also tends to better preserve many visual cues that are often analyzed to extract perceptual features, such as edges.

An implementation in Python code is given below
import numpy as np import cv2 as cv import matplotlib.pyplot as plt ipath="/path/to/image" def median(g, m=5): """Median filter""" if m % 2 == 0: raise ValueError("m must be odd") rm = m // 2 gf = np.zeros_like(g) ge = np.pad(g, rm, "constant") Dx = range(rm, ge.shape[1] - rm) Dy = range(rm, ge.shape[0] - rm) for y in Dy: for x in Dx: W = ge[y - rm: y + rm + 1, x - rm:x + rm + 1] v = np.sort(W.ravel()) med = v[(m * m + 1) // 2] gf[y - rm][x - rm] = med return gf g = cv.imread(ipath, cv.IMREAD_GRAYSCALE) gf1 = median(g, 3) gf2 = median(g, 5) implot([ [(g, "original"), (gf1, "smoothed (3x3)"), (gf2, "smoothed (5x5)")] ])
The non linearity of this type of filter has several interesting effects. First of all, since the pixels in the processed image are not “averaged out” but simply substituted (with the median), impulsive high frequency noise can be totally eliminated because noisy pixels are generally not statistically representative of the local features and will not be selected. To better understand how this works, consider the example in Fig.15

Depicted in Fig.15 is a small area extracted from an image affected by salt & pepper type of noise, with a window of size m=3 centered at 3 different pixels. When the median filter processes these windows it produces the 3 vectors of ordered pixel values shown on the side. The noisy pixels (numbers in red) are pushed to the margins of the vector as they are very dissimilar from the local features. As a consequence, they are never chosen to be the median (numbers in bold) and will not affect the filter’s response.
This computational technique is known as n-order statistics and is more powerful against impulsive noise than the linear operation of convolution, where the noisy pixels are always included in the evaluation of the filter’s response. For these reasons, the median filter is a much better choice than linear smoothing filters to regularize images with this type of noise, as shown in Fig.16

It can be seen how, unlike linear filters, the noise is already completely removed with the smallest filter size and most of the details in the image have been preserved. In particular, the edges are not blurred, which makes it suitable in all those cases where precise segmentation needs to be performed at subsequent steps. Another interesting property of this filter is that since the details are mostly preserved it can be reiterated to remove noise that may be more resistant to a single pass.
However, this filter has some issues too. Like linear filters, its effect increases with the size of the processed region, so if the window is too big for the image the smaller details are eroded, as can be observed in Fig.16.1

This effect is caused by the fact that any detail smaller than the size of the window will be completely erased or thinned, and this usually happens with lines and points. It also tends to round sharp corners. All these side effects are depicted in Fig.16.2 (processed using a window with m=5)

These issues can be reduced using an appropriate window size (smaller windows tend to mitigate these side effects), cross-shaped windows, or one of the many variants with more sophisticated approaches, such as Adaptive Median Filters, Switching Median Filters, etc.
Alpha-trimmed mean filter
The alpha-trimmed mean filter (ATM for brevity) is designed on the idea of building a filter that’s effective on signals affected by both non-impulsive (Gaussian, etc.) and impulsive (s&p, etc.) kind of noise and is able to preserve as much details as possible. To implement such a filter we can combine the linear approach of the box (average) filter with the non-linear order statistics of the median filter.
To better understand how it works let’s recall that box filters use weights drawn from a uniform distribution, which means that all pixels processed by the kernel are given the same probability of being representative of the local characteristics of the image, with the consequence that pixels related to noise will influence the result of the computation in the same manner as pixels related to features.
The idea is that of excluding noisy pixels from the filter response while still retaining the smoothing effect of a linear filter. This can be achieved by trimming the distribution so that the range of allowed intensity values is limited within a set range. That is by cutting the distribution \(I_W(x,y)\) within the window \(W(x,y)\) at both ends, where noisy pixels are most likely to be present, while retaining the middle range where is more likely to find pixels correlated with the local features, as shown in Fig.16

The retained range of intensities \(I_{W_T}\) depends on the extent of the trimming in the high and low ranges, \(I^L_\alpha\) and \(I^H_\alpha\), which is controlled by a parameter \(\alpha\) (the “trimming parameter”). The filter response is then computed by taking the arithmetic mean of the pixels within the “trimmed window” \(W_T\) (that is the subset of pixels in \(W\) having intensity values within the allowed range).
Since the distribution of intensities in \(W\) varies depending on the local features, it’s not possible to set a configuration of weights that works for every scanning window. The most common approach is that of using a combination of non-linear (statistical) and linear methods. Formally, we can define the alpha-trimmed mean filter as follows
$$\begin{align} g_f(x,y) & = atmean\bigl(W(x,y) ; \alpha\bigr) \\ & = trim(sort(W(x,y)) ; \alpha) \end{align} \tag{11}$$
where \(sort()\) is a sorting function that returns a vector \(\mathbf v\) of numerically ordered elements from the window \(W(x,y)\), and \(trim(\mathbf v)\) is a function defined as follows
$$ trim(\mathbf v) = \frac{1}{m^2-2\alpha} \sum_{i=1}^{m^2-2\alpha} \mathbf v_{\alpha+i} \tag{12}$$
where
$$ \begin{cases} 0 \leq \alpha \leq \frac{m^2-1}{2} & \text{if $m$ is odd} \\ 0 \leq \alpha \leq \frac{m^2}{2}+1 & \text{if $m$ is even} \end{cases} \tag{13}$$
We then get a parametric filter where the parameter \(\alpha\) can be set according to the desired effect. In particular, if \(\alpha\) is set to 0 then the filter behaves like a linear box filter, whereas if it’s set to the maximum allowed value then the filter becomes a median filter. Any value in between determines the extent of the trimming, that is the intensity values that need be suppressed. So, if we expect the images to be affected mainly by random additive noise and the preservation of the details is not a priority, then \(\alpha\) can be set to a small value or zero. On the other hand, if we expect the images to be affected mainly by impulsive noise and require preservation of the edges as much as possible, then \(\alpha\) can be set to a higher value or to its maximum. Fig.17 shows an example of the ATM response at varying values of \(\alpha\)

The first image in Fig.17 (top left) illustrates different types of features and noise that can be found in images: impulsive noise, random noise, a line pattern, and shaded edges. The other images show the ATM filter response at different values of the trimming parameter using a window of size m=3. With \(\alpha=0\) the filter behaves exactly like a box filter, the noise has been reduced but is still visible and the details (lines and edges) have been blurred. Increasing the trimming makes the filter behave more and more like a median filter, with most of the impulsive noise being eliminated, the edges more preserved and the line pattern gradually eroded (a side effect of the median filter). With \(\alpha=4\) (the maximum) the filter behaves exactly like a median filter, the impulsive noise is completely suppressed, the shaded edges preserved but the line pattern further degraded.
It is evident that the optimal setup is application-dependent, based on what the priorities are. As a rule of thumb, it is a good choice to set the trimming parameter to some value around the midpoint between its minimum and maximum. Even better would be to make it adaptive according to the local statistics for a more “dynamic” behavior.
Following is a non-optimized Python implementation of the ATM filter
import numpy as np import cv2 as cv import matplotlib.pyplot as plt ipath="/path/to/image" def atmean(g, a, m=3): """Alpha-trimmed mean filter""" if m % 2 == 0: raise ValueError("m must be odd") if not (0 <= a <= (m*m - 1) / 2): raise ValueError("a out of range") rm = m // 2 t = m**2 - 2 * a k = 1 / t gf = np.zeros_like(g) ge = np.pad(g, rm, "constant") Dx = range(rm, ge.shape[1] - rm) Dy = range(rm, ge.shape[0] - rm) for y in Dy: for x in Dx: W = ge[y - rm: y + rm + 1, x - rm: x + rm + 1] v = np.sort(W, None) gf[y - rm][x - rm] = k * np.sum(v[a: a + t]) return gf g = cv.imread(ipath, cv.IMREAD_GRAYSCALE) gf1 = atmean(g, 2, 3) gf2 = atmean(g, 6, 5) implot([ [(g, "original"), (gf1, "smoothed (3x3)"), (gf2, "smoothed (5x5)")] ])
Kuwahara filter
The idea behind this filter is that of overcoming the issues of linear smoothing filters that operate in a location-invariant mode, that is by performing the same linear operation across the whole image (using the same kernel weights) without regards for local variations, resulting in the blurring of many important visual cues, such as edges. By using local statistical parameters to compute the output pixel value, the kuwahara filter accounts for the characteristics of the regions being processed thus becoming adaptive to the local features.
The peculiarity of this filter is its ability to smooth areas with high spatial frequency (such as textures) by making them homogeneous while preserving or even enhancing their boundaries. The results are images that appear to be composed by flat patches with well defined borders, and that have a characteristic painting-like appearance.
In its original implementation the Kuwahara filter is defined using a symmetric window \(W\) with a squared shape centered at a point (x, y) in the image and partitioned into four equal sub windows \(Q_i\). Each of these sub windows captures local characteristics in a specific direction around the central pixel and the response of the filter is computed by comparing the statistical parameters of all the sub windows. Specifically, the final pixel value is computed as the arithmetic mean intensity of the sub window with the lowest variance. In the discrete space of digital images the sub windows are overlapping so that the central pixel belongs to all of them, as illustrated in Fig.18

A way to think of it is as a small window Q anchored at the central pixel and rotating in the four diagonal direction to scan its surrounding. With this design principles the filter’s response can be equally estimated in all possible directions, making it rotation-invariant (that is isotropic). Now, let \(M=\{\mu_i (x,y)\}\) and \(S=\{\sigma_i (x,y)\}\) be the means and variances respectively computed for each sub window \(Q_i (x,y)\), then the response of the Kuwahara filter can be defined as follows
$$ g_f (x,y) = \sum_i \mu_i (x,y) h_i (x,y) \tag{14}$$
where \(h_i (x,y)\) is the function defined below
$$ h_i (x,y) = \begin{cases} 1 & \text{if } \sigma_i (x,y) = min(S) \\ 0 & \text{otherwise} \end{cases} \tag{15}$$
The filter works on the assumption that in proximity of transitions between two different regions (e.g. the edge between two distinct objects) the response should not be an average between these adjacent regions as it would blur the edge. The filter makes sure that this doesn’t happen because the sub windows containing the edge (having a mean value that is an average of the adjacent regions) will have a higher variance than those lying outside of the edge, hence they will not be selected. Instead, the filter will select the sub window with the lowest variance lying on either side of the edge and whose mean represents one of the adjacent regions, not an average of them. This working principle is depicted in Fig.19

In Fig.19 the sub windows 1 and 3 lie on the edge and their mean is an average of the two regions A and B, so if one of them were selected it would blur the edge. However, their variances are the highest so they will surely be excluded from the filter’s response. The sub window 2 lies outside of the edge, and 4 only minimally on it, so their variances are lower and the filter will select the lowest of them, which in this example would be that of sub window 2. This results in the final pixel having a value that represents region A, whereas a linear filter would have computed an average between A and B thus blurring the edge.
If the window lied completely in a region with high spatial frequency (high variance) not associated with edges (e.g. textures, noise) the filter would select again the most homogeneous sub window and will use its mean to flatten the region. This process of “selective averaging” controlled by the variance gives the filter an adaptive behavior that allows to avoid the borders while still reducing high frequency components in the image. Fig.20 shows an example of image processed with the Kuwahara filter.

In Fig.20 we can observe the main characteristics of this filter, that is producing images with homogeneous flat regions with very sharp borders, even sharper than in the original image, and an overall artistic effect as if the image was made up of paintbrush strokes. A non-optimized implementation in Python is given below
import numpy as np import cv2 as cv import matplotlib.pyplot as plt ipath="/path/to/image" def kuwahara(g, m=5): if m % 2 == 0: raise ValueError("m must be odd") rm = m // 2 gf = np.zeros_like(g) ge = np.pad(g, rm, "constant") Dx = range(rm, ge.shape[1] - rm) Dy = range(rm, ge.shape[0] - rm) for y in Dy: for x in Dx: Q =[ge[y - rm: y + 1, x: x + rm + 1], ge[y - rm: y + 1, x - rm: x + 1], ge[y: y + rm + 1, x - rm: x + 1], ge[y: y + rm + 1, x: x + rm + 1]] S = [np.std(w) for w in Q] Smin = np.argmin(S) gf[y - rm][x - rm] = np.mean(Q[Smin]) return gf g = cv.imread(ipath, cv.IMREAD_GRAYSCALE) gf1 = kuwahara(g, 3) gf2 = kuwahara(g, 5) implot([ [(g, "original"), (gf1, "smoothed (3x3)"), (gf2, "smoothed (5x5)")] ])
As with linear smoothing filters, the size of the window \(W\) affects the quantity of details eroded from the image. Specifically, bigger windows tend to to remove more details, and cause the image to become more “flattened”. But, unlike linear filters, the borders of the different regions are generally retained or even enhanced. This property makes the Kuwahara filter an excellent choice to prepare images for segmentation or contours extraction.
The Kuwahara filter is very effective at reducing noise of both impulsive and non-impulsive type, as can be seen in Fig.22 where it is compared to the median filter. The first row shows Gaussian noise (left) and impulsive noise (right). The second and third row show the results of Kuwahara and median filter respectively using a window of size m=5. The performances are comparable for both types of noise, with the median filter performing slightly better. In particular, the median filter is able to completely remove impulsive noise, while the kuwahara filter has heavily reduced it but not totally suppressed, as confirmed by the standard deviation. This is probably due to the fact that, after all, the Kuwahara filter performs the smoothing using an average, which is not as effective as the order statistics approach of the median filter on impulsive noise.

This filter, however, suffers from some issues, at least in its original implementation. The first problem is due to the use of the arithmetic mean in the computation of the filter’s response. The uniform averaging causes a “blocky” pattern that is clearly visible especially in high frequency regions (Figure 23(a)). Another limitation is the fact that when more than one sub window have the same variance a situation of uncertainty arises as of which one should be selected, and since two regions with the same variance may have very different means, in the presence of noise the choice will be random and this will show as artifacts in the image (Figure 23(b)).

Fig.23(b) shows a gradient image (left), a Kuwahara-filtered version of the original (centre) and a Kuwahara-filtered version of the original corrupted with additive noise at SNR about 50db (right). It can be seen how even in the presence of a very low power noise the filter is unstable and produces artifacts. This is because the variance is the same across the original image, so the algorithm always chooses the same sub window even if there are more than one with the same variance and assigns the same mean to the pixel in the vertical direction. In the presence of even a tiny amount of noise the variances will be different and the sub window with the lowest value will vary randomly with the noise, so the algorithm will chose random means for the response producing the artifacts visible in Fig.23(b).
This latter issue can be mitigated by taking the average mean of the sub windows that have a very close variance, that is if their differences fall below a set threshold. The drawback is that while this solution makes the response more stable it also produces blurring in the edges. The blocky pattern issue, instead, can be solved by using circular windows and weighted averages (e.g. taking the weights from a Gaussian distribution).
There are many variants of the Kuwahara filter that have been developed since the original algorithm was published, most of them trying to overcome the two main issues mentioned earlier. One of the most interesting is the Generalized Kuwahara filter which is based on the idea that the original filter is just a special case of a more general class of filters where a set of fixed sub windows, with a certain shape and displacement, perform a local estimate of some parameter and produce a confidence value. The response will then be computed using the estimate of the sub window providing the highest confidence.
Conclusions
Non-linear smoothing filters are a powerful weapon that should be in the arsenal of every image processing practitioner. They find application in several fields, from computer vision to astronomy, medical imaging, geology, digital art, photography and many more. Despite being certainly more sophisticated than linear filters, their potential is undoubtedly superior compared to simple linear operations, providing far more flexibility in the solution of image processing problems. Furthermore, here we have just seen a few of the most commonly used but there are actually many others equally (if not more) effective, such as the bilateral filter, mean shift, anisotropic diffusion to cite some, and a plethora of variants that a detailed discussion of all of them would take an entire chapter in a DSP book, if not an entire book altogether.