GDAL includes a resampling method beyond the normal mix of nearest neighbor, bilinear, cubic and splines: "Lanczos windowed sinc resampling". I understand that it's a convolution filter, but unlike images, where results tend to be subjective, the resampling used for spatial data has other implications. What's Lanczos and how does using it affect the output?
Lanczos Resampling – What Is It Useful for in Raster Data
gdal raster resampling
Related Solutions
Aerial photos are continuous data. Each pixel represents the response of a region of a sensor to light directed at it and as that light varies, the response varies continuously. The result is usually discretized (often into 255 or 256 categories), but that doesn't change the nature of the data. Therefore you want to interpolate rather than use categorical algorithms like nearest neighbor or majority. Bilinear interpolation is usually just fine; at some cost in execution time, cubic convolution will retain local contrast a tiny bit better. A small amount of additional blurriness is unavoidable, but that's almost impossible to notice until the image has undergone many such transformations. The errors made with nearest neighbor are much worse in comparison.
Michael Miles-Stimson is correct in his comment above; nearest-neighbour (NN) and majority resampling methods should only be applied to categorical data, i.e. nominal and ordinal level data. Elevation, even when it is presented as integer values (which is a practice that I wish we could make illegal and punishable by lengthy jail terms), is not categorical. Elevation is ratio level, e.g. an elevation of 20 m asl is twice as high as an elevation of 10 m asl. Elevation is also a phenomenon that is a continuous variable, i.e. 10.432467533 m is perfectly legitimate as an elevation (darn those integer valued DEMs!). Therefore, you should be using bilinear (BL) or cubic convolution (CC) resampling methods when dealing with these data. The difference between BL and CC resampling is essentially that BL resampling will be slightly faster and result in slightly less smoothing of the resulting surface because CC interpolates the output value using a greater number of input values within a local neighbourhood. Both methods are however good options for elevation data.
I should note that this is not simply a preference. The inappropriate use of NN or majority resampling on continuous variables like elevation will result in subtle, sometimes difficult to discern, artifacts resulting from the duplication of rows/columns at regular intervals in the output image. Here is a very good example of the kind of artefacts that can result in a DEM when choosing NN inappropriately. Notice that in this case, the artefact was only noticeable in a derivative of the DEM (curvature) and not the DEM itself. Often it becomes apparent that something isn't quite right with your NN-resampled DEM when you create a hillshade image.
Best Answer
What is Lanczos resampling?
Although the theory is described in an early paper and the Wikipedia article, a "feel" for resampling methods is best obtained by computing them on simple or standard images. This can be a vast topic, requiring extensive experimentation, but some simplifications are available:
These operators work separately in each color channel. Therefore it suffices to study how they work on a monochromatic ("black and white") image.
Most convolution operators used in image processing work the same way in the x and y directions and independently in both directions. In effect, they are really one dimensional operators applied first to the rows and then to the columns. This means we can study them by studying "1D" images, which can be plotted in detail.
Everything we need to know about a linear operator (which includes all convolution operators) can be inferred from how an operator works on the simplest non-constant image of all: this is a sudden jump from one value to another.
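To make the one-dimensional view concrete, here is a minimal sketch of the Lanczos kernel itself, written in plain Python. It assumes the usual textbook definition, sinc(x) * sinc(x/a) for |x| < a and zero elsewhere, with a = 3 as a commonly used order; the function names are illustrative and are not taken from GDAL or any other library. The 2D weight is simply the product of two 1D weights, which is what "applied first to the rows and then to the columns" amounts to.

```python
import math

def lanczos(x, a=3):
    """Lanczos kernel of order a: sinc(x) * sinc(x / a) for |x| < a, else 0.

    The order a = 3 is an assumption made for illustration.
    """
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    # sinc(x) * sinc(x / a), with sinc(t) = sin(pi t) / (pi t)
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def lanczos_2d(dx, dy, a=3):
    """Separable 2D weight: the product of the row and column weights."""
    return lanczos(dx, a) * lanczos(dy, a)

# Tabulate the kernel at half-cell offsets; note the sign changes between lobes.
for offset in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"offset {offset:3.1f}  weight {lanczos(offset):+.4f}")
```

The negative lobe between offsets 1 and 2 is what distinguishes this kernel from all-positive ones such as the bilinear (triangle) or Gaussian kernels.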
Let's look at an illustration of several popular resampling methods. Actually, we need two illustrations: one to show what happens in "downsampling," where the new image is coarser than the old, and another to look at "upsampling," where the new image is a refinement of the old. Let's start with the latter, because it shows more detail.
Upsampling
The original 7 by 7 image on the left is really one-dimensional because each row is the same. The resampling occurs across the columns. The dimension of the other five images is 80 by 80, showing in detail how each method interpolates between the original coarse pixels. Nearest-neighbor sampling retains the sharp division between dark and light while the other four methods blur the intervening region to some extent. Notably, the Lanczos resampler creates some regions that are darker than any in the original and others that are lighter than any in the original. (This can have implications for GIS work, because such an extrapolation of the original values can potentially cause the new values to be invalid. They can also extend beyond the range of the original color map, sometimes causing the extreme values in the resampling image to be rendered incorrectly. This is a problem with bicubic convolution resampling in ArcGIS, for example.)
(NB: The "bicubic" method shown here is a bicubic spline, not the "bicubic convolution" of ArcGIS.)
Using lightness to depict image values, although natural, is not very precise. The next illustration rectifies this by graphing the cell values (vertical axis) by column (horizontal axis).
Lower values on the graphs correspond to darker parts of the images. A thoughtful examination of the original uncovers a hidden assumption: although the original image looks like a sharp jump from dark to light, the jump actually occurs over one-seventh (1/7) of the extent of the columns. Who is to say what really happens in that interval in the original scene the image is depicting? We therefore should not be too critical of differences among the resampling methods that occur within this short interval: each one is giving a different but potentially equally valid rendering of what might be occurring in the original scene. In this sense, it is no longer apparent that nearest neighbor sampling is the most faithful interpolation method.
One conclusion we should draw is that the accuracy of any upsampling method depends on the nature of the underlying scene. If the scene consists of values that should smoothly vary from one point to the next, then the nearest neighbor method is likely to be the least faithful way of resampling among those shown.
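The upsampling experiment can be imitated in a few lines of plain Python. This is only a sketch under stated assumptions: the 7-cell step values are made up, cells are aligned by their centres, borders are handled by repeating the edge cell, and the Lanczos order is 3. It is not the code that produced the figures, and none of the function names come from a GIS library.

```python
import math

def lanczos(x, a=3):
    """Lanczos kernel of order a (a = 3 assumed)."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def box(x):
    """Nearest neighbor: all weight goes to the closest input cell."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0

def triangle(x):
    """Linear interpolation, the 1D counterpart of bilinear."""
    return max(0.0, 1.0 - abs(x))

def upsample(values, n_out, kernel, radius):
    """Resample a 1D row to n_out cells with normalized kernel weights."""
    n_in = len(values)
    out = []
    for j in range(n_out):
        x = (j + 0.5) * n_in / n_out - 0.5       # output centre in input cell units
        base = math.floor(x)
        num = den = 0.0
        for i in range(base - radius + 1, base + radius + 1):
            w = kernel(x - i)
            num += w * values[min(max(i, 0), n_in - 1)]  # repeat edge cells
            den += w
        out.append(num / den)
    return out

step = [0, 0, 0, 1, 1, 1, 1]                     # hypothetical test row
for name, kernel, radius in (("nearest", box, 1),
                             ("linear", triangle, 1),
                             ("lanczos", lanczos, 3)):
    row = upsample(step, 80, kernel, radius)
    print(f"{name:8s} min {min(row):+.3f}  max {max(row):+.3f}")
```

In this sketch the nearest-neighbor and linear results stay within the original range of 0 to 1, while the Lanczos result dips below 0 on the dark side of the jump and rises above 1 on the light side, which is the extrapolation discussed above.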
Downsampling
Here we see the result of downsampling a 16 by 16 image to 8 by 8 images (a 2 by 2 aggregation). Nearest neighbor accurately retains the sharp boundary. Lanczos differs from the others by enhancing the apparent sharpness. A close look shows that it darkens the dark area on one side of the boundary and lightens the light area on the other side. The graphs clarify this:
The bilinear, bicubic, and Gaussian resamplers show characteristics of convolution operators that have all positive weights (or very small negative weights): they average, or "smear," neighboring values. In downsampling this causes sharp features to be blurred. The extent of the blur depends on the width of the kernel. Like these others, the Lanczos resampler also blurs the jump, but it "overshoots" it on both sides. That's the contrast enhancement seen just above in the images themselves. Because of this tendency to increase contrast (the local differences between the highs and lows in the image), the Lanczos resampler is often called a "sharpening filter." These graphs show that this characterization requires a nuanced understanding, because evidently it does not actually reduce the averaging of values on both sides of the jump. At pixel 4, its value of 0.56 is comparable to the values computed by the other convolution filters.
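The overshoot on both sides of the jump can be reproduced with a small downsampling sketch. The assumptions here are mine: the kernel is stretched by the downsampling factor so that it acts as a low-pass filter (a common way to implement downsampling, not necessarily what any particular package does), the 16-cell step is invented, and the edge cells are repeated at the borders. The numbers will therefore not match the graphs exactly.

```python
import math

def lanczos(x, a=3):
    """Lanczos kernel of order a (a = 3 assumed)."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def triangle(x):
    """Linear (bilinear in 1D) kernel: all weights are non-negative."""
    return max(0.0, 1.0 - abs(x))

def downsample(values, n_out, kernel, radius):
    """Downsample a 1D row, stretching the kernel by the scale factor."""
    n_in = len(values)
    scale = n_in / n_out                      # e.g. 2.0 for 16 -> 8
    support = radius * scale                  # kernel half-width in input cells
    out = []
    for j in range(n_out):
        x = (j + 0.5) * scale - 0.5           # output centre in input cell units
        num = den = 0.0
        for i in range(math.ceil(x - support), math.floor(x + support) + 1):
            w = kernel((x - i) / scale)
            num += w * values[min(max(i, 0), n_in - 1)]  # repeat edge cells
            den += w
        out.append(num / den)
    return out

step = [0] * 7 + [1] * 9                      # hypothetical 16-cell test row
print("linear ", [round(v, 3) for v in downsample(step, 8, triangle, 1)])
print("lanczos", [round(v, 3) for v in downsample(step, 8, lanczos, 3)])
```

Both kernels average across the jump, giving an intermediate value in the output cell that straddles it. Only the Lanczos row also contains a value below 0 just before the jump and a value above 1 just after it.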
How does using it affect the output?
Let's take a look at what happens in a more complex image.
The original, which is a 13 by 13 image, now includes a pattern with the highest possible spatial frequency (alternating between light and dark with every column at the right). We cannot hope to reproduce such features when downsampling: the smaller number of pixels simply cannot hold all this information. Let's focus, then, on what happens when such an image is upsampled. If we care about faithful reproduction of the scene, we will want this high-frequency pattern to be reproduced accurately.
The smaller images are resampled to 25 by 25 pixels: almost, but not quite, a 2:1 refinement. To my eye, the Lanczos and bilinear methods reproduce the stripes most sharply among the four convolution resamplers. Nearest neighbor is, of course, the most faithful (because it cannot average values at all).
These graphs of the same results show that the Lanczos resampler was able to maintain the contrast in the stripes (as seen by the size of the vertical swings from lows to highs) at the expense of introducing a variation of intensity within the constant-value light area in the middle of the image (pixels 5, 6, 7 of the original). This variation shows up as stripe-like artifacts within the light part of the image (the middle). Of the resamplers shown here, it is alone in introducing such spurious detail.
What is it useful for in a spatial application?
Evidently, Lanczos resampling is not a panacea or omnibus solution to resampling. It is superior to many other convolution resamplers in maintaining (or even enhancing) local contrast. This can be useful when the resampled image is intended for viewing or for identifying detailed features or boundaries. When the resampled image will subsequently be analyzed or processed, Lanczos resampling may increase the ability to detect edges and linear features.
When the resampled image will be analyzed in other ways, though, the benefits of Lanczos resampling are doubtful. It typically will (artificially) increase local measures of spatial variability, such as focal ranges and focal standard deviations. It will not affect spatial means on the whole--like the other convolution resamplers, it is usually normalized (which means it's a local weighted average, with no bias introduced)--but it may increase some local averages and decrease others compared to the other resamplers.
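A short sketch shows why normalization and out-of-range values can coexist. It computes the six normalized Lanczos weights for a point halfway between two cell centres (order a = 3 assumed; the function names and the sample windows are hypothetical). The weights sum to 1, so a constant neighborhood is returned unchanged, but some weights are negative, so the "weighted average" of a non-constant neighborhood can fall outside the range of its inputs.

```python
import math

def lanczos(x, a=3):
    """Lanczos kernel of order a (a = 3 assumed)."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def lanczos_weights(frac, a=3):
    """Normalized weights for the 2a cells around a point located `frac`
    (0 <= frac < 1) of a cell to the right of a cell centre.

    Weights are ordered left to right, matching the window passed to
    weighted_average below.
    """
    raw = [lanczos(frac - k, a) for k in range(-a + 1, a + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_average(weights, window):
    """Apply the weights to a window of neighboring cell values."""
    return sum(w * v for w, v in zip(weights, window))

w = lanczos_weights(0.5)
print("weights        ", [round(v, 4) for v in w])
print("sum of weights ", round(sum(w), 12))
print("constant window", weighted_average(w, [7, 7, 7, 7, 7, 7]))
print("near a jump    ", weighted_average(w, [0, 0, 0, 0, 1, 1]))
```

The last line evaluates a point in the dark area one and a half cells before a jump from 0 to 1. Because only negative and small positive weights fall on the light cells, the result is negative, which is the local extrapolation that inflates focal ranges and standard deviations.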
The (necessarily brief) evaluation here suggests the Lanczos resampler generally should not be used for downsampling: for that application, it appears to offer nothing that simpler (and more commonly available) methods lack, while retaining the potential disadvantage of extrapolating beyond the original range of data values.
Afterword: a general comment
The investigation described here is an example of what anybody can do when they have a question about how a GIS operation works. It uses the GIS itself as the subject of the investigation: to know what some operation or analytical method does, simply apply it under controlled experimental conditions. In this case that amounts to constructing simple test images, resampling them according to available methods, and examining the results.
There are three critical aspects of this approach to learning about how GIS works: