MATLAB: Extract date and time preceded by strings with variable length from text file

datetimereadtable

Hi all,
I am facing the challenge of extracting date and time from the following table.
>>460110419137703 0 2018-09-28 01:23:10 0.000000 0.000000
>>460028186815737 0368ceefe32555 2018-09-28 01:19:25 113.944056 22.504925
I have tried the following command but it only manages to extract the second line correctly.
>>tmp = readtable('1.txt','ReadVariableNames',false,'Format','%f%s%{yyyy-MM-dd HH:mm:ss}D%f%f');
One way to go around this is to replace '0' in the first line with a string of fourteen 0's to match the string length and then use the following command.
>>tmp = readtable('1.txt','ReadVariableNames',false,'Format','%f%14c%{yyyy-MM-dd HH:mm:ss}D%f%f');
But I think there may be a single command that can handle both lines without manual modification as above. Could somebody give me an idea how this could be done? Thanks!

Best Answer

fmt='%f%s%{yyyy-MM-dd HH:mm:ss}D%f%f';
dat=textscan(fid,fmt,'delimiter','\t','whitespace','');
This is directly related tp the discussion of parsing datetime with other fields we just had at Answer_339313>, and more particularly at your previous Q? discussion that Walter linked to there.
You've got to clear the whitespace from being seen inside the datetime field so textscan won't break the date and time into two pieces; fortunately your file here is tab-delimited.
Unfortunately, there's another insurmountable issue with the hex data without reverting to the fscanf level; I'd recommend you also submit enhancement request for missing '%x' conversion format string. I've complained about it previously; I've no idea why TMW would have left it out...one isn't supposed to have hex data???.
fmt='%f%x%{yyyy-MM-dd HH:mm:ss}D%f%f';
fid=fopen('bayer.dat','r');
>> textscan(fid,fmt,'delimiter','\t','whitespace','')
Error using textscan
Unable to parse the format character vector at position 3 ==> %x%{yyyy-MM-dd HH:mm:ss}D%f%f
Unsupported format specifier '%x'. See the documentation for TEXTSCAN for supported formats.
>> frewind(fid)
>> fmt='%f%s%{yyyy-MM-dd HH:mm:ss}D%f%f';
>> textscan(fid,fmt,'delimiter','\t','whitespace','')
ans =
1×5 cell array
{2×1 double} {2×1 cell} {2×1 datetime} {2×1 double} {2×1 double}
>> fid=fclose(fid);
>> t=readtable('bayer.dat')
t =
3×5 table
Var1 Var2 Var3 Var4 Var5
__________ ____ ___________________ ______ ______
4.6011e+14 0 2018-09-28 01:23:10 0 0
4.6003e+14 368 NaT 2018 -9
-28 1 NaT 113.94 22.505
>> opt=setvartype(opt,'Var2','char');
>> t=readtable('bayer.dat',opt)
t =
2×5 table
Var1 Var2 Var3 Var4 Var5
__________ ________________ ____________________ ______ ______
4.6011e+14 '0' 28-Sep-2018 01:23:10 0 0
4.6003e+14 '0368ceefe32555' 28-Sep-2018 01:19:25 113.94 22.505
>>
Your choice of how to handle the hex string; either convert to decimal or categorical depending on how it is to be used. One might consider also converting the first column to long integer as well...or it might also be categorical depending on just what it represents.