MATLAB: How to read a large set of multi-page TIFF files into a tall array using a datastore


I have a directory of N TIFF files, each with M pages of resolution a x b. How can I read these TIFF files such that the result is an (N * M) x a x b tall array?
The directory has too many images to load all of them into memory at one time.

Best Answer

This task is best separated into two steps. The first step is to import the multi-page TIFF images using a datastore. The second step is to create the desired tall array from the datastore.
(1) Import multi-page TIFF images using a datastore
Use 'fileDatastore' with a custom 'ReadFcn' function. The custom function should use the 'Tiff' class to read every page of a TIFF file. The following code demonstrates how to read a multi-page, monochrome TIFF file in which every page has the same resolution:
function subimages = readMultipageTiff(filename)
% Read a multi-page TIFF, assuming every page in the file is the same size.
% Returns an M x a x b array in which the first dimension selects the page.
t = Tiff(filename, 'r');
cleanup = onCleanup(@() close(t)); % Close the file even if a read fails.
subimages(:,:,1) = t.read(); % Read the first page to get the array dimensions correct.
% Read all remaining pages (directories) in the file.
% If the file only contains one page, the loop body never runs.
while ~t.lastDirectory()
    t.nextDirectory();
    subimages(:,:,end+1) = t.read();
end
% Move the page index to the first dimension so that vertical
% concatenation of the results stacks the pages.
subimages = permute(subimages, [3 1 2]);
end
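Growing the array one page at a time can be slow for files with many pages. The following is a minimal sketch of an alternative reader that preallocates the output, assuming 'imfinfo' reports one entry per page and 'imread(filename, k)' returns the k-th page. The function name 'readMultipageTiffPrealloc' and the test file name are hypothetical and are not part of the original answer. Local functions in scripts require R2016b or later; on older releases, save the function in its own file.

```matlab
% Build a small 3-page test file (hypothetical name) so the sketch is self-contained.
testFile = fullfile(tempdir, 'multipage_demo.tif');
if exist(testFile, 'file'), delete(testFile); end
for k = 1:3
    page = repmat(uint8(k), 4, 5);   % 4 x 5 page filled with the value k
    if k == 1
        imwrite(page, testFile);
    else
        imwrite(page, testFile, 'WriteMode', 'append');
    end
end

pages = readMultipageTiffPrealloc(testFile);
disp(size(pages))   % pages is 3 x 4 x 5: page index first

function pages = readMultipageTiffPrealloc(filename)
% Hypothetical alternative reader: preallocate, then fill page by page.
% Assumes a monochrome file in which every page has the same size and class.
info     = imfinfo(filename);        % one struct element per page
numPages = numel(info);
first    = imread(filename, 1);
pages    = zeros([numPages, size(first)], 'like', first);
pages(1,:,:) = first;
for k = 2:numPages
    pages(k,:,:) = imread(filename, k);
end
end
```

Because the page index is already the first dimension, a reader like this could be passed as the 'ReadFcn' without a final 'permute'.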
(2) Create a tall array of size (N * M) x a x b from the datastore
Here a x b is the size of each image and N * M is the total number of images (N files with M pages each). Because the custom read function used in (1) returns arrays in which the first dimension selects a sub-image, 'cell2mat' stacks the per-file results along that dimension and produces an (N * M) x a x b tall array. Tall arrays are evaluated lazily and in chunks, so this approach does not require loading the entire image set into memory. The following gives an example of this approach:
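ds = fileDatastore({'<DirectoryHere>'}, 'ReadFcn', @readMultipageTiff);
tds = tall(ds);        % Tall cell array with one cell per file
tds = cell2mat(tds);   % Stack the per-file arrays along the first dimension
write('tds.tall', tds); % Evaluate the tall array and save it to the folder 'tds.tall'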
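To check the resulting dimensions before running on the full image set, the same pipeline can be tried on a small synthetic folder. The sketch below is an assumption-laden illustration, not part of the original answer: the folder name, the file names, and the stand-in reader 'readPagesFirst' are hypothetical, and 'mapreducer(0)' is used only to keep evaluation in the local MATLAB session.

```matlab
% Run tall-array evaluation in the local MATLAB session (no parallel pool).
mapreducer(0);

% Create a temporary folder (hypothetical name) with N = 2 files of M = 3 pages each.
demoDir = fullfile(tempdir, 'tiff_tall_demo');
if ~exist(demoDir, 'dir'), mkdir(demoDir); end
for n = 1:2
    f = fullfile(demoDir, sprintf('stack%d.tif', n));
    if exist(f, 'file'), delete(f); end
    for m = 1:3
        page = repmat(uint8(10*n + m), 4, 5);   % 4 x 5 page with a recognizable value
        if m == 1
            imwrite(page, f);
        else
            imwrite(page, f, 'WriteMode', 'append');
        end
    end
end

ds  = fileDatastore(demoDir, 'ReadFcn', @readPagesFirst, 'FileExtensions', '.tif');
tds = cell2mat(tall(ds));                    % deferred: nothing is read yet

sz        = gather(size(tds));               % expected to be [6 4 5], i.e. (N * M) x a x b
meanImage = squeeze(gather(mean(tds, 1)));   % 4 x 5 mean over all pages

function pages = readPagesFirst(filename)
% Hypothetical stand-in reader that returns an M x a x b array for one file.
numPages = numel(imfinfo(filename));
first    = imread(filename, 1);
pages    = zeros([numPages, size(first)], 'like', first);
pages(1,:,:) = first;
for k = 2:numPages
    pages(k,:,:) = imread(filename, k);
end
end
```

Only the 'gather' calls trigger reading the files. Reductions such as 'mean' along the first dimension return results small enough to hold in memory even when the full tall array is not.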