I have a bunch of fixed-width data with spaces in it. It is formatted such that each "column" is a fixed number of characters wide (including space characters). I've been trying to use "textscan" with a format like this: '%6c %6c %6c %6c %6c' but it seems to A) ignore spaces at the beginning of the string (this throws off the character count when, for example, the first number goes from 9 to 10) and B) not recognize blank fields in the middle or at the end. E.g.: "___8.5|___9.2|______|______|___7.6", where "_" represents one space and the "|" are added for clarity. I need this to read in with 8.5 in the first cell, 9.2 in the second, and 7.6 in the fifth cell. I can tolerate it being a string (and converting later) as long as the column placement is preserved.
MATLAB: Reading in data with spaces
Tags: file read, MATLAB, string, text, textscan
Related Solutions
Here is a concise way:
    files  = {'A.txt', 'B.txt', 'C.txt', 'D.txt', 'E.txt'} ;
    buffer = cell( size( files )) ;
    for k = 1 : numel( files )
        buffer{k} = sscanf( fileread( files{k} ), '%f' ) ;
    end
    fId = fopen( 'Merged.txt', 'w' ) ;
    fprintf( fId, '%f\t%f\t%f\t%f\t%f\r\n', horzcat( buffer{:} ).' ) ;
    fclose( fId ) ;
You will have to tailor the formatSpec '%f\t%f\t%f\t%f\t%f\r\n' to your needs, i.e. define a more specific numeric format (e.g. %.3f instead of %f) and define the separator (here \t for tab, but you may need/want a simple comma instead).
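As a small illustration of that tailoring, here is a sketch with three decimals and a comma separator. It uses sprintf, which accepts the same formatSpec as fprintf, so the result can be inspected as a string. The matrix M is made-up sample data, not the content of the files above.

```matlab
% Hypothetical sample data: each row of M is one output record.
M = [ 1.5   2.25   3    ;
      4     5.125  6.75 ] ;

% sprintf/fprintf consume the data column by column, so M is transposed
% to emit one row of M per output line.
txt = sprintf( '%.3f,%.3f,%.3f\r\n', M.' ) ;
disp( txt )
```

The transpose plays the same role as the `.'` after horzcat in the snippet above.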
The documentation of textscan doesn't cover fixed-width very well. However, textscan has "undocumented"/hidden capabilities.
Approach
- Read the first three columns to one string, since they shall only be copied to the output file.
- Read the following six columns to a double array.
- Add 1 to the selected elements of the array.
- Use the same format string to write the data (don't forget the new-line).
Run example code (I use R2013b)
fixed_width_format(13)
where
    function fixed_width_format( N )
        fid = fopen( 'fixed_width_format.txt' );
        format_spec = '%20s%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f';
        cac = textscan( fid, format_spec, N     ...
            ,   'Whitespace'    ,   ''          ...
            ,   'Delimiter'     ,   ''          ...
            ,   'CollectOutput' ,   true        );
        fclose( fid );
        RowHead = cac{1};
        Data    = cac{2};
        % add 1 to the numbers in the columns 4 and 5 and rows 3 to 8.
        Data( 3:8, 4:5 ) = Data( 3:8, 4:5 ) + 1;
        fid = fopen( 'fixed_width_format_out.txt', 'w' );
        for rr = 1 : N
            fprintf( fid, [format_spec,'\n'], RowHead{rr}, Data(rr,:) );
        end
        fclose( fid );
    end
and where fixed_width_format.txt contains
    1SOL     OW    1   4.309   5.254   4.135 -0.2790  0.3440  0.2064
    1SOL    HW1    2   4.314   5.169   4.082 -1.5406  0.3918 -0.0293
    1SOL    HW2    3   4.388   5.312   4.114 -1.3375  0.9272 -2.6151
    2SOL     OW    4   1.743   1.687   2.366  0.2136  0.2777  0.3181
    2SOL    HW1    5   1.818   1.750   2.387  0.3115  0.1542  0.3431
 4502OCTA     H13545   2.108   5.326   1.045 -1.2169  0.4890 -2.6144
 4502OCTA     H13546   2.068   5.492   1.036  0.7609  0.6650  0.8612
 4502OCTA     H13547   2.285   5.388   1.207  3.0144  2.5562  1.0920
 4502OCTA     H13548   2.121   5.425   1.265 -1.2460 -1.3635  1.4829
 4502OCTA    Oc13549   2.131   5.677   1.238 -0.0183 -0.0221 -1.0402
 4502OCTA    Oh13550   2.353   5.635   1.208 -0.6036  0.2241 -0.8140
 4502OCTA     H13551   2.383   5.198   0.399  0.4893  0.7154 -0.9915
 4502OCTA    Ho13552   2.413   5.565   1.189 -0.4685 -0.0421 -2.1107
---  0---|--- 10---|--- 20---|--- 30---|--- 40---|--- 50---|--- 60---
123456789|123456789|123456789|123456789|123456789|123456789|123456789
'%20s%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f%8.4f'
and where fixed_width_format_out.txt contains
    1SOL     OW    1   4.309   5.254   4.135 -0.2790  0.3440  0.2064
    1SOL    HW1    2   4.314   5.169   4.082 -1.5406  0.3918 -0.0293
    1SOL    HW2    3   4.388   5.312   4.114 -0.3375  1.9272 -2.6151
    2SOL     OW    4   1.743   1.687   2.366  1.2136  1.2777  0.3181
    2SOL    HW1    5   1.818   1.750   2.387  1.3115  1.1542  0.3431
 4502OCTA     H13545   2.108   5.326   1.045 -0.2169  1.4890 -2.6144
 4502OCTA     H13546   2.068   5.492   1.036  1.7609  1.6650  0.8612
 4502OCTA     H13547   2.285   5.388   1.207  4.0144  3.5562  1.0920
 4502OCTA     H13548   2.121   5.425   1.265 -1.2460 -1.3635  1.4829
 4502OCTA    Oc13549   2.131   5.677   1.238 -0.0183 -0.0221 -1.0402
 4502OCTA    Oh13550   2.353   5.635   1.208 -0.6036  0.2241 -0.8140
 4502OCTA     H13551   2.383   5.198   0.399  0.4893  0.7154 -0.9915
 4502OCTA    Ho13552   2.413   5.565   1.189 -0.4685 -0.0421 -2.1107
 
Finally, does this example rely on undocumented features of textscan?
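For comparison, here is a minimal sketch that avoids textscan options altogether and uses only plain indexing plus str2double. It is not the approach shown above, just an alternative under stated assumptions: every field is exactly 6 characters wide and the line length is a multiple of that width. The sample line is the one from the question; the variable names (rowText, width, fields, values) are made up for illustration.

```matlab
% Sample line from the question: five fields, each 6 characters wide.
% Built by concatenation so the blank fields stay visible.
rowText = [ '   8.5', '   9.2', '      ', '      ', '   7.6' ] ;
width   = 6 ;                              % assumed fixed field width
nCol    = numel( rowText ) / width ;       % 30 / 6 = 5 fields

% reshape fills column by column, so each column holds one field;
% transpose to get one field per row of the char matrix.
fields = reshape( rowText, width, nCol ).' ;

% str2double returns NaN for blank fields, so the column positions
% are preserved.
values = str2double( cellstr( fields ) ).' ;
disp( values )
```

Blank fields come back as NaN, so 8.5 stays in the first position, 9.2 in the second and 7.6 in the fifth. For a whole file, the same slicing could be applied line by line, for example on lines read with fgetl.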
 
Addendum triggered by comment
    >> data = fixed_width_format(7);
    >> data(1:5,:)
    ans =
        4.3090    5.2540    4.1350   -0.2790    0.3440    0.2064
        4.3140    5.1690    4.0820   -1.5406    0.3918   -0.0293
        4.3880    5.3120    4.1140   -0.3375    1.9272   -2.6151
        1.7430    1.6870    2.3660    1.2136    1.2777    0.3181
        1.8180    1.7500    2.3870    1.3115    1.1542    0.3431
    >> data(6:7,:)
    ans =
               0         111      111111     1222222     2222333     3333333
          123456      789012      345678     9012345     6789012     3456789
    >>
where
    function data = fixed_width_format(N)
        fid = fopen( 'fixed_width_format_dpb.txt' );
        format_spec = '%6f%6f%6f%7f%7f%7f';
        cac = textscan( fid, format_spec, N     ...
            ,   'Whitespace'    ,   ''          ...
            ,   'Delimiter'     ,   ''          ...
            ,   'CollectOutput' ,   true        );
        fclose( fid );
        data = cac{1};
    end
and where fixed_width_format_dpb.txt contains
 4.309 5.254 4.135-0.2790 0.3440 0.2064
 4.314 5.169 4.082-1.5406 0.3918-0.0293
 4.388 5.312 4.114-0.3375 1.9272-2.6151
 1.743 1.687 2.366 1.2136 1.2777 0.3181
 1.818 1.750 2.387 1.3115 1.1542 0.3431
000000000111111111122222222223333333333
123456789012345678901234567890123456789
Best Answer