MATLAB: Textread format help – identifying error issue

Hi!

I am a grad student working with ground-penetrating radar (GPR). I'm using a code developed by John Bradford for time-zero corrections. The code is a little dated and uses textread instead of textscan. The GPR files I am loading are newer and are formatted differently from the older versions. MATLAB throws an error whenever I try to run the code on the newer GPR files, and I've identified the source of the error.

Older versions of the GPR file are formatted like:

"NUMBER OF TRACES = 657

NUMBER OF PTS/TRC = 3600

TIMEZERO AT POINT = 198.18 "

Newer versions are formatted the same way but without the blank line between entries. If I add a blank line between the entries in the newer files, the code runs perfectly. However, I have a lot of files and I don't want to edit every single one of them, so I was hoping to better understand textread.

The line that triggers the error is:

 [value,hdr]=textread([name '.hd'],'%21c%f%*[^\n]',7,'headerlines',5)

Can someone help me understand how '%21c%f%*[^\n]' works to identify text in the file?

Thanks!

Best Answer

%21c matches exactly 21 characters including \r or \n characters and any Delimiter character you might have specified. The characters are returned.

However, since you have not changed the Whitespace property, leading whitespace (including spaces and newlines) will be skipped before the %21c is considered to start.

In your sample text, the %21c runs up to and including the space after the '='.

%f matches a number, which could be in integer or floating-point syntax. Since you have not changed the Whitespace property, leading whitespace (including spaces and newlines) will be skipped before the %f is considered to start. In your sample text, the %f picks up right at the digits, and they are converted and returned. On the third of those lines, where you have indicated that there is a space after the number and before the end of the line, textread would be left positioned at that space.

%*[^\n] matches any number of characters up to but excluding the newline (but including any carriage return), and does not return the characters, because of the * after the %. textread would be left positioned at the newline and would not consume it. But when textread cycles back to the start of the format and begins the %21c over again, remember that it skips leading whitespace, so the newline will be consumed at that point.
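
To make the three specifiers concrete, the split they perform on one header line can be mimicked by hand. This is only a sketch of what the format does, not a call to textread itself, and it assumes the label is padded with spaces so that the '= ' ends at column 21 (that padding is not visible in the sample text above). The variable names are made up for illustration.

```matlab
% Hand emulation of '%21c%f%*[^\n]' on a single header line.
% Assumption: the label is space-padded so that '= ' ends at column 21.
hdrLine = 'TIMEZERO AT POINT  = 198.18 ';

label = hdrLine(1:21);                     % %21c: exactly 21 characters, returned
value = sscanf(hdrLine(22:end), '%f', 1);  % %f: one number, leading blanks skipped
% %*[^\n]: anything left on the line after the number is read and discarded

fprintf('label = [%s]\nvalue = %g\n', label, value);
```

If the labels in the newer files were not padded to the same width, the 21-character window would cut into the number or stop short of the '=', which would be worth checking as well.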

I would expect that format to work no matter how many empty lines there were between usable lines.

However, what I would not expect is for the HeaderLines to count properly if there is a difference in empty lines in the header range. That is what I would look into: the possibility that your format still works but that the header positioning needs refinement.
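
One way to check the header positioning is to list the first lines of an old file and a new file with their line numbers, and compare where the first entry lands in each. The sketch below assumes `name` holds the file name without its extension, as in the question; the base name shown and the limit of 15 lines are arbitrary placeholders.

```matlab
% List the first lines of a .hd file with line numbers, marking blank lines.
name = 'myprofile';   % hypothetical base name; replace with your own
fid = fopen([name '.hd'], 'r');
if fid < 0
    error('Could not open %s.hd', name);
end
closeFile = onCleanup(@() fclose(fid));   % close the file even if an error occurs

for k = 1:15
    ln = fgetl(fid);
    if ~ischar(ln)                % fgetl returns -1 at end of file
        break;
    end
    if isempty(strtrim(ln))
        fprintf('%2d: <blank>\n', k);
    else
        fprintf('%2d: %s\n', k, ln);
    end
end
```

If the first entry falls on a different line number in the newer files, the 5 passed with 'headerlines' is the value to revisit.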
