Importing DAT file with different space width delimiter in QGIS 3

csvdelimiterimportqgis

I'm trying to import a point located geophysics data file (.dat) in QGIS 3.18, which uses fixed width columns instead of character based delimiters. Excel has a feature to import these data, but I could not find a matching method in QGIS.

An example of the data, which is "Point located data (dat) [100.7 MB]" from http://pid.geoscience.gov.au/dataset/ga/133023

   195099   1950999925.00         5338572  115.112162  -21.662860           2.950          -7.859         -10.809        9787594.02           2.950          -7.859            0.05            2.67          211.91          220.82          245.01          241.71         500.000           1.500           1.500            0.06            6.00           1.500           1.500             0.5               4             1               7               7               7               3               7               7               7               3 ONSLOW A/S PS25 WRKSHP          B  TR  GDA94  MSL         GRS80    B   DEG  m  m     m       um      m  m     um        m  m      m      m      um      m     m      um        MAN    SUR   ELN  W98    SUR     ELN  DFA     AUSGEOID98  GRS80    PEO                 99999 31-DEC-99      31-DEC-99  Pre March 2003 value:978773.71mg (Isogal65 "O")                                         
   195099   1950999926.00         4427037  118.631315  -20.375324           8.000           6.110          -1.890        9786311.72           8.000           6.110            0.05            2.67         -249.54         -256.47         -243.97         -252.93         500.000           1.500           1.500            0.06            6.00           1.500           1.500             0.5               4             1               7               7               7               3               7               7               7               3 PORT HEDLAND A/S TERM  PS26     B  TR  GDA94  MSL         GRS80    B   DEG  m  m     m       um      m  m     um        m  m      m      m      um      m     m      um        MAN    SUR   ELN  W98    SUR     ELN  DFA     AUSGEOID98  GRS80    PEO                 99999 31-DEC-99      31-DEC-99  Pre March 2003 value:978645.56mg (Isogal65 "O")                                         
   195105   1951053320.00         5338583  116.114682  -23.759529         335.760         324.240         -11.520        9787736.92         335.760         324.240            0.18            2.67           32.32         -335.10           67.88         -307.83         100.000          10.000          10.000            0.06            5.00          10.000          10.000             0.5               3             0               4               4               4               3               3               4               4               3 EDMUND GATE 33-20               B  TR  GDA94  MSL         GRS80    B   DEG  m  m     m       um      m  m     um        m  m      m      m      um      m     m      um        MAN    PGE   ELN  W98    PGE     ELN  DFA     AUSGEOID98  GRS80    NOR          TNK413       31-DEC-99      31-DEC-99  Pre March 2003 value:978787.96mg (Isogal65 "O")                                         
   195205   1952050030.00         5338837  115.780630  -32.103506          12.800         -19.849         -32.649        9793973.77          12.800         -19.849            0.70            2.67        -1006.26         -983.74         -905.64         -919.96         100.000           2.000           2.060            0.06            2.00           2.000           2.060             0.5               3             0               7               7               1               3               7               7               1               3 3                               D  TR  GDA94  MSL         GRS80    B   DEG  m  m     m       um      m  m     um        m  m      m      m      um      m     m      um        DIG    MIB   ELN  W98    ALT     ELN  DFA     AUSGEOID98  GRS80           99999        99999 31-DEC-99      31-DEC-99  Comment is null                                                                         
   195205   1952050100.00         5803151  116.011741  -32.149894          56.690          26.251         -30.439        9794633.12          56.690          26.251            4.17            2.67         -242.42         -272.19         -148.57         -212.01         100.000           2.000           2.060            0.06            2.00           2.000           2.060             0.5               3             0               7               7               1               3               7               7               1               3 10                              D  TR  GDA94  MSL         GRS80    B   DEG  m  m     m       um      m  m     um        m  m      m      m      um      m     m      um        DIG    MIB   ELN  W98    ALT     ELN  DFA     AUSGEOID98  GRS80           99999        99999 31-DEC-99      31-DEC-99  Comment is null                                                                         
   195205   1952050150.00         4428660  115.898128  -31.961563          16.150         -15.925         -32.075        9793758.65          16.150         -15.925            0.87            2.67        -1093.95        -1075.89         -995.10        -1013.17         100.000           2.000           2.060            0.06            2.00           2.000           2.060             0.5               3             0               7               7               1               3               7               7               1               3 15                              D  TR  GDA94  MSL         GRS80    B   DEG  m  m     m       um      m  m     um        m  m      m      m      um      m     m      um        DIG    MIB   ELN  W98    ALT     ELN  DFA     AUSGEOID98  GRS80           99999        99999 31-DEC-99      31-DEC-99  Comment is null                                                                         

I've tried a mix of tabs, spaces, etc delimiter options, but the comment / text fields include arbitrary space characters. Additionally, the full file is too large to open in Excel and save as a character delimited text file.

I would like to successfully import this file. Perhaps I've missed an option, plugin, or a regular expression I could use? I could edit the file to add delimiters using a script, but that would not be a generalised solution.

Best Answer

As pointed out above and in the comments the ddf file in the download contains the file structure which can be used to construct the following regex:

(.{14}) (.{16})(.{15})(.{12})(.{12})(.{16})(.{16})(.{16})(.{18})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{14})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{16})(.{32})(.{3})(.{4})(.{7})(.{12})(.{9})(.{4})(.{5})(.{3})(.{6})(.{8})(.{8})(.{3})(.{6})(.{10})(.{3})(.{7})(.{7})(.{7})(.{8})(.{6})(.{7})(.{10})(.{7})(.{7})(.{5})(.{7})(.{8})(.{5})(.{8})(.{12})(.{9})(.{13})(.{13})(.{15})(.{11})

which will parse your data. It's a bit messy in that it doesn't include any field names, but it does seem to work when used with the settings shown on this screenshot: enter image description here

Related Question