MATLAB: How To Load Multiple Text Files (specific context)


Hello MATLAB community, I would like to load many text files (all with the same number of rows and columns) contained in the same folder and collect the 2nd column of each into one matrix.
Here's an example: for 30 text files, the resulting matrix would have 30 columns and as many rows as the files contain (specifically, they all have 2048 rows).
But here's the catch: there's a multi-line header (about 8 lines) before the data, and the data is separated by semicolons (';').
One of the text files is attached as an example.
Also, the names of the text files do NOT follow any pattern; they are quite random. I've already asked a very similar question here, but I wasn't considering the header. One helpful guy wrote the script below and I'd like to tweak it a little bit to include the right parameters for textscan().
% Set input folder
input_folder = 'C:\Users\Cotet\Downloads';
% Read all *.txt files from input folder
% NOTE: This creates a MATLAB struct with a bunch of info about each text file
% in the folder you specified.
files = dir(fullfile(input_folder, '*.txt'));
% Get full path names for each text file
file_paths = fullfile({files.folder}, {files.name});
% Read data from files, keep second column
for i = 1 : numel(file_paths)
    % Read data from ith file.
    % NOTE: If your file has a text header, missing data, or
    % uses non white-space delimiters, you should check out the
    % documentation for textread to determine which options to use.
    data = textscan(file_paths{i}, '');
    % Save second data column to matrix
    % NOTE: Your data files all need to have the same number of rows for this to work
    A(:, i) = data(:, 2);
end
The part I'm concerned with is this note:
% NOTE: If your file has a text header, missing data, or
% uses non white-space delimiters, you should check out the
% documentation for textread to determine which options to use.
I've tried many things, but was ultimately unsuccessful.
Thank you so much in advance.

Best Answer

As dpb suggested, use one of the modern file import functions such as readtable or readmatrix instead of the old textscan. These can figure out the format of your file on their own or, if they're struggling a bit, have plenty of easy-to-understand options to help them along. They're also a lot more configurable, particularly if you use detectImportOptions.
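If you do want to stay with textscan, a minimal sketch might look like the following. It assumes exactly 8 header lines and four numeric columns separated by semicolons; the sample file written at the top is made up purely so the snippet runs on its own. Note that textscan expects a file identifier from fopen (or a character vector of text to parse), so passing it a file path does not read the file.

```matlab
% Hypothetical sample file: 8 header lines, then 4 numeric columns
% separated by semicolons (layout assumed from the question).
sample_path = [tempname '.txt'];
fid = fopen(sample_path, 'w');
for k = 1:8
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '%.2f;%.2f;%.2f;%.2f\n', [400 1 2 3; 401 4 5 6; 402 7 8 9]');
fclose(fid);

% Open the file first: textscan reads from a file identifier.
fid = fopen(sample_path, 'r');
data = textscan(fid, '%f %f %f %f', 'Delimiter', ';', 'HeaderLines', 8);
fclose(fid);

% textscan returns a cell array with one cell per column.
second_column = data{2};

delete(sample_path);   % clean up the sample file
```

The format string has to match the number of columns by hand, which is one of the things readtable and readmatrix work out for you.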
For example, your text file is easily decoded with:
spectrum = readtable('1903395U1_04Jun19_154040_0001.Raw8.txt', 'HeaderLines', 8)
or for a neater table:
opts = detectImportOptions('1903395U1_04Jun19_154040_0001.Raw8.txt', 'ExpectedNumVariables', 4); %only needed once for all the files that follow the same format
spectrum = readtable('1903395U1_04Jun19_154040_0001.Raw8.txt', opts)
detectImportOptions automatically figures out that the header is 8 lines, that the delimiter is ; and that the column names are on the 6th row. I've told it that there are only 4 variables despite the header having 5 names (why is there a 'scope'?).
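If the automatic detection ever guesses wrong, you can state the layout explicitly and inspect or adjust the options object before reading. The sketch below is only an illustration: the sample file is invented, and the 8 header lines and ';' delimiter are passed explicitly rather than detected.

```matlab
% Hypothetical sample file: 8 header lines, then 4 numeric columns
% separated by semicolons.
sample_path = [tempname '.txt'];
fid = fopen(sample_path, 'w');
for k = 1:8
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '%.2f;%.2f;%.2f;%.2f\n', [400 1 2 3; 401 4 5 6; 402 7 8 9]');
fclose(fid);

% State the layout explicitly instead of relying on detection.
opts = detectImportOptions(sample_path, ...
    'NumHeaderLines', 8, 'Delimiter', ';');

% Inspect what the options object now holds.
disp(opts.Delimiter);
disp(opts.DataLines);
disp(opts.VariableNames);

% Import only the 2nd column.
opts.SelectedVariableNames = opts.VariableNames(2);
spectrum = readtable(sample_path, opts);
second_column = spectrum{:, 1};

delete(sample_path);   % clean up the sample file
```

Restricting SelectedVariableNames is handy when only one column per file is needed, since the other columns are never imported.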
You can easily wrap that in a loop over all the files. The detectImportOptions call is only needed once if all the files follow the same format. You can store the table from each file in a cell array, but if your aim is to run statistics across the files then you'd be better off storing it all as one flat table with an additional variable indicating which file the data comes from. After that you can use groupsummary or similar to compute your statistics all at once.
So the code would be something like:
%Get list of files. You haven't explained how these can be obtained.
filelist = dir('C:\somefolder\*.txt');
%Loop to read all files:
spectra = cell(size(filelist)); %stored in a cell array at first
opts = detectImportOptions(fullfile(filelist(1).folder, filelist(1).name), 'ExpectedNumVariables', 4);
for fileidx = 1:numel(filelist)
    spectrum = readtable(fullfile(filelist(fileidx).folder, filelist(fileidx).name), opts); %read file
    spectrum.Source = repmat({filelist(fileidx).name}, height(spectrum), 1); %add a variable indicating the source. Maybe you want to use only part of the filename
    spectra{fileidx} = spectrum;
end
%flatten it all into one table
spectra = vertcat(spectra{:});
%compute some stats, e.g. mean and standard deviation of spectra at each wavelength across the files
groupsummary(spectra, 'Wave', {'mean', 'std'}, {'Sample', 'Dark', 'Reference'})
Code untested. There might be typos. Read the error messages carefully. Note that I'm using meaningful variable names instead of the utterly useless A.
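If what you really want is the matrix described in the question (one column per file, taken from each file's 2nd column), a minimal sketch with readmatrix could look like this. The temporary folder, the file names and the 5-row files are invented so the snippet runs on its own; with the real data you would point input_folder at your own folder and use 2048 rows.

```matlab
% Build a hypothetical folder with three small sample files, each with
% 8 header lines followed by 4 semicolon-separated numeric columns.
input_folder = tempname;
mkdir(input_folder);
num_rows = 5;   % 2048 in the real files
for k = 1:3
    fid = fopen(fullfile(input_folder, sprintf('sample_%d.txt', k)), 'w');
    for h = 1:8
        fprintf(fid, 'header line %d\n', h);
    end
    rows = [(1:num_rows)', k * ones(num_rows, 1), zeros(num_rows, 2)];
    fprintf(fid, '%d;%d;%d;%d\n', rows');
    fclose(fid);
end

% List the files and preallocate one column per file.
files = dir(fullfile(input_folder, '*.txt'));
second_columns = zeros(num_rows, numel(files));

for i = 1:numel(files)
    file_path = fullfile(files(i).folder, files(i).name);
    data = readmatrix(file_path, 'NumHeaderLines', 8, 'Delimiter', ';');
    second_columns(:, i) = data(:, 2);   % keep only the 2nd column
end

rmdir(input_folder, 's');   % clean up the sample folder
```

Preallocating second_columns means a file with a different number of rows raises an error at the assignment instead of silently producing a misshapen matrix.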