MATLAB: How To Load Multiple Text Files (specific context)


Hello MATLAB community, I would like to load many text files (all with the same number of rows and columns) contained in the same folder and collect the 2nd column of each file into one matrix.

Here's an example: for 30 text files, the resulting matrix would have 30 columns and as many rows as the files contain (specifically, they all have 2048 rows).

But here's the catch: there's a multi-line header (something like 8 lines) before the data, and the data is separated by a semicolon (;).

One of the text files is attached as an example.

Also, the names of the text files do NOT follow a pattern; they are quite random. I've already asked a very similar question here, but I wasn't considering the header. One helpful guy wrote the script below and I'd like to tweak it a little bit to include the right parameters for textscan().

% Set input folder
input_folder = 'C:\Users\Cotet\Downloads';
% Read all *.txt files from input folder
% NOTE: This creates a MATLAB struct with a bunch of info about each text file
% in the folder you specified. 
files = dir(fullfile(input_folder, '*.txt'));
% Get full path names for each text file
file_paths = fullfile({files.folder}, {files.name});
% Read data from files, keep second column
for i = 1 : numel(file_paths)
    
    
    % Read data from ith file. 
    % NOTE: If your file has a text header, missing data, or 
    % uses non white-space delimiters, you should check out the
    % documentation for textread to determine which options to use.
    data = textscan(file_paths{i}, '');
    
    % Save second data column to matrix
    % NOTE: Your data files all need to have the same number of rows for this to work
    A(:, i) = data(:, 2);
    
end

The part with which I'm concerned is this note :

% NOTE: If your file has a text header, missing data, or
% uses non white-space delimiters, you should check out the
% documentation for textread to determine which options to use.

I've tried many things, but was ultimately unsuccessful.

Thank you so much in advance.

Best Answer

As dpb suggested, use one of the modern file import functions such as readtable or readmatrix instead of the old textscan. These can figure out the format of your file on their own or, if they're struggling a bit, have plenty of easy-to-understand options to help them along. They're also a lot more configurable, particularly if you use detectImportOptions.

For example, your text file is easily decoded with:

spectrum = readtable('1903395U1_04Jun19_154040_0001.Raw8.txt', 'HeaderLines', 8)

or for a neater table:

opts = detectImportOptions('1903395U1_04Jun19_154040_0001.Raw8.txt', 'ExpectedNumVariables', 4);  %only needed once for all the files that follow the same format
spectrum = readtable('1903395U1_04Jun19_154040_0001.Raw8.txt', opts)

detectImportOptions automatically figures out that the header is 8 lines, that the delimiter is ; and that the names of the columns are on the 6th row. I've told it that there are only 4 variables despite the header having 5 names (why is there a 'scope'?).

You can easily wrap that in a loop over all the files. The detectImportOptions call is only needed once if all the files follow the same format. You can store the table from each file in a cell array, but if your aim is to run statistics across the files then you'd be better off storing it all as one flat table with an additional variable indicating which file the data comes from. After that you can use groupsummary or similar to compute your statistics all at once.

So the code would be something like:

%Get list of files. You haven't explained how these can be obtained.
filelist = dir('C:\somefolder\*.txt');
%Loop to read all files:
spectra = cell(size(filelist));  %stored in a cell array at first
%detect the import options once, from the first file (note the parenthesis closing fullfile)
opts = detectImportOptions(fullfile(filelist(1).folder, filelist(1).name), 'ExpectedNumVariables', 4);
for fileidx = 1:numel(filelist)
    spectrum = readtable(fullfile(filelist(fileidx).folder, filelist(fileidx).name), opts);  %read file
    spectrum.Source = repmat({filelist(fileidx).name}, height(spectrum), 1);  %add a variable indicating the source. Maybe you want to use only part of the filename
    spectra{fileidx} = spectrum;
end
%flatten it all into one table
spectra = vertcat(spectra{:});
%compute some stats, e.g. mean and standard deviation of spectra at each wavelength across the files
groupsummary(spectra, 'Wave', {'mean', 'std'}, {'Sample', 'Dark', 'Reference'})

Code untested. There might be typos. Read the error messages carefully. Note that I'm using meaningful variable names instead of the utterly useless A.
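If all you need is the matrix described in the question (the 2nd column of every file, side by side), a loop over readmatrix is probably enough. The following is only a minimal sketch: it first writes three small synthetic files to a temporary folder so that it runs on its own, and the file names, the row count and the exact layout (8 header lines, then 4 values per row separated by ;) are assumptions made for illustration, not taken from the attached file.

```matlab
% Write a few synthetic files (hypothetical names and layout) so the
% sketch is self-contained: 8 header lines, then 4 ';'-separated values per row.
input_folder = tempname;
mkdir(input_folder);
n_rows = 5;                                   % the real files have 2048 rows
names = {'alpha.txt', 'zulu.txt', 'mike.txt'};
for k = 1:numel(names)
    fid = fopen(fullfile(input_folder, names{k}), 'w');
    fprintf(fid, 'header line %d\n', 1:8);    % 8 header lines
    % each column of the matrix becomes one line: index; k*index; 0; 0
    fprintf(fid, '%g;%g;%g;%g\n', [(1:n_rows); k*(1:n_rows); zeros(2, n_rows)]);
    fclose(fid);
end

% List every *.txt file, whatever its name, and keep the 2nd column of each.
files = dir(fullfile(input_folder, '*.txt'));
second_columns = zeros(n_rows, numel(files)); % one column per file
for k = 1:numel(files)
    data = readmatrix(fullfile(files(k).folder, files(k).name), ...
        'FileType', 'text', 'NumHeaderLines', 8, 'Delimiter', ';');
    second_columns(:, k) = data(:, 2);
end

rmdir(input_folder, 's');                     % remove the synthetic files
```

For the real data, point input_folder at your own folder, drop the part that writes the synthetic files and the final rmdir, and set n_rows to 2048. If you would rather stay with textscan, note that it expects a file identifier from fopen rather than a file name, so the call would be along the lines of textscan(fid, '%f %f %f %f', 'HeaderLines', 8, 'Delimiter', ';'), followed by fclose(fid).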

Related Solutions

MATLAB: How to put (tab delimited) text files together removing header text

So, the answer is the same as originally given, then. Use something like:

fmto=['%12.3f' repmat('%12.3f',1,nCols-1)];  
fido=fopen(youroutputfilename,'w');
fprintf(fido,'%s\n', yourheadertext)
for j=1:length(fileList)
  fid = fopen(fileList(j).name,'r');
  d=cell2mat(textscan(fid,'%f','headerlines', 6, 'treatasempty',{'n/a';'N/A'}));
  fid=fclose(fid);
  fprintf(fido,fmto,d')
end
fido=fclose(fido);

Adjust the various parameters to suit.

doc textscan % and friends

for more detail on the various options for empty values, and

doc fprintf % etc.

for detail of format strings to match your desired output formats. With a regular file format it is really pretty straightforward. The other respondent's use of save is somewhat less verbose at the cost of less control over the output format; your choice depending on wants/needs.

ERRATUM:

Forgot the \n character for the output format...

fmto=['%12.3f' repmat('%12.3f',1,nCols-1) '\n'];

Also, if you do want the tab-delimited form retained, then the tab is needed as well...

fmto=['%12.3f' repmat('\t%12.3f',1,nCols-1) '\n'];
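To see what the tab-delimited format string produces, here is a small sketch using sprintf in place of fprintf so that the result can be inspected. The values of nCols and d are made up for illustration.

```matlab
% Hypothetical values, chosen only to make the format string visible.
nCols = 3;
d = [1 2 3; 4 5 6];                        % two data rows, three columns
fmto = ['%12.3f' repmat('\t%12.3f', 1, nCols-1) '\n'];
% sprintf/fprintf consume the values column by column, hence the transpose.
out = sprintf(fmto, d');
rows = strsplit(strtrim(out), newline);    % one cell per output line
fields = strsplit(rows{1}, sprintf('\t')); % tab-separated fields of line 1
```

Each data row ends up on its own output line with nCols tab-separated fields. Without the transpose the values would be written in column order, i.e. 1, 4, 2 on the first line.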

MATLAB: How to read begin to read data after a string

filename = 'ConcreteStresses.txt';
S = fileread(filename);
idx = regexp(S, '^\s*\d', 'once', 'lineanchors');
fmt = repmat('%f', 1, 7);
data = cell2mat( textscan(S(idx:end), fmt) );
fcz = data(:,4);
