MATLAB: Regular Expressions using regexp

Tags: expression, MATLAB, regexp, string

Hello, I am having trouble understanding regexp expressions.

I have some names: ["T_24_UZK500.txt"; "FWD_T80_UZK500.txt"; "T80_UZK700.txt"]

How can I get the numbers after "T" and after "UZK"?

I need a rule that will describe only the numbers after the designated patterns.

Best Answer

Matching only integer numbers after 'UZK' or 'T_' (it is unclear in your question if the underscore is permitted or not, but the regular expression below is easy to adapt):

>> S = {'T_24_UZK500.txt';'FWD_T80_UZK500.txt';'T80_UZK700.txt'};
>> C = regexp(S,'(?<=(T_?|UZK))\d+','match');
>> C{:}
ans = 
    '24'    '500'
ans = 
    '80'    '500'
ans = 
    '80'    '700'

Or simply by matching every integer number (this gives the same result only because these names contain no other digits):

>> C = regexp(S,'\d+','match');
>> C{:}
ans = 
    '24'    '500'
ans = 
    '80'    '500'
ans = 
    '80'    '700'
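
If each number needs to stay associated with its prefix, capturing tokens is one possible alternative to the lookbehind. The sketch below assumes that every name contains a T number followed by _UZK and its number, in that order; the variable names tok and nums are illustrative only.

```matlab
S = {'T_24_UZK500.txt'; 'FWD_T80_UZK500.txt'; 'T80_UZK700.txt'};

% One match per name ('once'), with two captured groups:
% the digits after T (optionally T_) and the digits after UZK.
tok = regexp(S, 'T_?(\d+)_UZK(\d+)', 'tokens', 'once');

% Stack the 1x2 token cells into a 3x2 cell array and convert to numbers.
nums = str2double(vertcat(tok{:}));
```

Column 1 of nums then holds the numbers after T and column 2 the numbers after UZK. A name that does not match the pattern yields an empty token cell, so the rows of nums would no longer line up with S and that case would need separate handling.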

Related Solutions

MATLAB: Read a number after a specific string in a txt file

As it's a bit more elaborate than your previous question, it might be time to go for a regexp solution (even though you can always use STRFIND, SSCANF, etc).

Are these parameters listed in increasing order? That is, could we detect "parameter is" and read what follows each occurrence in turn, building an array P whose first element is what you call P1, whose second element is what you call P2, and so on?

I'm asking, because you could have a solution like

 >> buffer = fileread('theFile.txt') ;
 >> P = str2double(regexpi(buffer, '(?<=parameter is\s*)\d*', 'match'))
 P =
     1     5

If parameters are not ordered, we have to match them more specifically though.
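
As a rough sketch of such a more specific match, the parameter index can be captured together with its value. The file layout is not shown in the question, so the text in buffer below is purely hypothetical (lines of the form "P2 parameter is 5"), and the pattern would have to be adapted to the real file.

```matlab
% Hypothetical content standing in for fileread('theFile.txt');
% the parameters are deliberately listed out of order.
buffer = sprintf('P2 parameter is 5\nP1 parameter is 1\n');

% Two tokens per match: the parameter index and its value.
tok = regexpi(buffer, 'P(\d+)\s+parameter is\s*(\d+)', 'tokens');
tok = str2double(vertcat(tok{:}));   % one row per match: [index, value]

% Place each value at the position given by its index.
P = nan(1, max(tok(:,1)));
P(tok(:,1)) = tok(:,2);
```

With this made-up input, P(1) is the value of P1 and P(2) the value of P2, regardless of the order in which they appear; any index missing from the file stays NaN.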

MATLAB: String Match for Plotting in Excel

If states had no space in their names, or if you had commas as delimiters in the CSV file, you could go for a variant of

 [state, temp] = textread('myFile.csv', '%s %d', 'delimiter', ',', ...
                          'headerlines', 1) ;

Since it seems that there are spaces in the names and no comma as delimiter, you can instead read the file line by line and extract states and temperatures with more specific functions (TEXTSCAN, FSCANF, FGETL+SSCANF, etc.), based on position if needed (e.g. reading temperatures from character 15 onward). You can then use STRCMPI to find the indices of the relevant states and get the corresponding temperatures from these indices. You could also go for a solution based on regular expressions (a less common approach for this kind of structured data), which I illustrate below:

 >> buffer = fileread('myFile.csv') ;
 >> state = 'New York' ;
 >> temp = str2double(regexpi(buffer, sprintf('(?<=%s\\s*)\\d*', state), ...
                              'match'))
 temp =
    83    55
 >> state = 'California' ;
 >> temp = str2double(regexpi(buffer, sprintf('(?<=%s\\s*)\\d*', state), ...
                              'match'))
 temp =
    80    92

Note that I wrote this concisely; it is equivalent to the following three steps:

 >> pattern = sprintf('(?<=%s\\s*)\\d*', state) ;
 >> match   = regexpi(buffer, pattern, 'match') ;
 >> temp    = str2double(match) ;

If you look at the pattern for New York:

 >> pattern
 pattern =
 (?<=New York\s*)\d*

It tells regexp to match:

- as many numeric characters as possible: \d*
- preceded by (positive lookbehind, (?<=...)) the literal New York followed by any amount of whitespace, possibly none: New York\s*
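
The pattern construction described above can be tried without the CSV file. In the sketch below the content of buffer is made up for illustration (the real file is not shown), \d+ is used instead of \d* so that at least one digit is required, and regexptranslate('escape', ...) is an added precaution in case a state name ever contains characters that are special in a regular expression.

```matlab
% Hypothetical file content standing in for fileread('myFile.csv').
buffer = sprintf(['State Temp\nNew York 83\nCalifornia 80\n' ...
                  'New York 55\nCalifornia 92\n']);
state = 'New York';

% Escape the state name, then build the lookbehind pattern around it.
pattern = sprintf('(?<=%s\\s*)\\d+', regexptranslate('escape', state));

% Case-insensitive match of the digits that follow the state name.
temp = str2double(regexpi(buffer, pattern, 'match'));
```

With this sample text, temp contains the two New York temperatures in the order in which they appear in buffer.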
