I have a string that I want to parse into numbers and non-numbers.
For my purposes:
A Number is EITHER a sequence of consecutive digits OR a sequence of consecutive digits followed by a . and another sequence of digits.
A Non-Number is anything that is not a Number.
For example
ljksadflh23898129hfafh0324.22234
should be parsed into something like
ljksadflh, 23898129, hfafh, 0324.22234
or
ljksadflh/23898129/hfafh/0324.22234
or whatever floats your boat as long as the list retains the same ordering.
Best Answer
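A minimal sketch of the splitting approach explained in this answer. It assumes a TeX distribution recent enough that the regex functions are available once xparse (and with it expl3) is loaded; on older installations the l3regex package would have to be loaded as well. The names \uiy and \l_uiy_result_seq follow the explanation, while the one-match-per-line output is only an illustration.

```latex
\documentclass{article}
\usepackage{xparse} % loads expl3; assumed to provide the regex functions
\ExplSyntaxOn
\seq_new:N \l_uiy_result_seq
\NewDocumentCommand \uiy { m }
  {
    % Collect every match: a run of non-digits, or a run of digits
    % optionally followed by a dot and further digits.
    \regex_extract_all:nnN { \D+ | \d+ (?: \. \d* )? } {#1} \l_uiy_result_seq
    % Typeset each match in its own paragraph; ##1 is a single match.
    \seq_map_inline:Nn \l_uiy_result_seq { ##1 \par }
  }
\ExplSyntaxOff
\begin{document}
\uiy{ljksadflh23898129hfafh0324.22234}
\end{document}
```

Because the group after the digits is non-capturing, each match contributes exactly one item to the sequence, so the items keep the order in which they appear in the input.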
With the experimental (but pretty much ready for release) package l3regex (found in the l3experimental bundle on CTAN), this task is a piece of cake.

The \regex line splits the user input #1 into pieces which consist either of one or more (+) non-digits (\D), or (|) of one or more digits (\d), followed maybe by a dot (\., an escaped dot, because the dot has a special meaning) and zero or more digits (\d*). The "maybe" is the ? acting on the group (...), which we want to be "non-capturing", done using (?:...). The line below it maps through all the matches we found, with ##1 being a single match. Of course, you can do whatever you want to do with the items of the sequence \l_uiy_result_seq.

Edit: The module also provides regular expression replacements. If I remember the syntax correctly, the following should work.
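A sketch of what such a replacement could look like, again assuming that loading xparse makes the regex functions available. The token list name \l_uiy_result_tl and the commands \apple and \pear are hypothetical, introduced only so that the example has something to call.

```latex
\documentclass{article}
\usepackage{xparse} % loads expl3; assumed to provide the regex functions
\ExplSyntaxOn
\tl_new:N \l_uiy_result_tl % hypothetical name for the working token list
% Build the command named #1 and hand it #2 as its argument.
\cs_new:Npn \uiy_do:nn #1#2 { \use:c {#1} {#2} }
\NewDocumentCommand \uiy { m }
  {
    \tl_set:Nn \l_uiy_result_tl {#1}
    % Rewrite each "non-digits, number" pair as \uiy_do:nn {non-digits} {number}.
    \regex_replace_all:nnN
      { (\D+) (\d+ (?: \. \d* )? ) }
      { \c{uiy_do:nn} \cB\{ \1 \cE\} \cB\{ \2 \cE\} }
      \l_uiy_result_tl
    % Execute the rewritten token list.
    \tl_use:N \l_uiy_result_tl
  }
\ExplSyntaxOff
% Hypothetical commands for the demonstration.
\newcommand{\apple}[1]{(apple: #1)}
\newcommand{\pear}[1]{(pear: #1)}
\begin{document}
% After the replacement the token list should read
% \uiy_do:nn {apple} {12} \uiy_do:nn {pear} {3.5}
\uiy{apple12pear3.5}
\end{document}
```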
This time, I catch both the sequence of non-digits and the number as captured groups, \1 and \2. Each such occurrence is replaced by the macro \uiy_do:nn (the \c escape in this case indicates "build a command"), then a begin-group (\cB) character { (this time, \c indicates the category code), then the non-digits (\1), then an end-group (\cE) character }, then another \cB{, the number, and a closing \cE}.

After that, the token list looks like \uiy_do:nn {ljksadflh} {1}. We then simply use its contents with \tl_use:N. The final step is to actually define \uiy_do:nn. Here, I defined it as simply building a command from #1 and giving it the argument #2. This very simple action could be done at the replacement step using \c{\1} for "build a command from the contents of group \1", and technically it would be slightly better, producing an "undefined control sequence" error if the relevant command is not defined. Another option for that error detection to happen is to replace \use:c {#1} {#2} by \cs_if_exist_use:cF {#1} { \msg_error:nnx { uiy } { undefined-command } {#1} } {#2}, with an appropriately defined error message.
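The error-detecting variant could be sketched as follows. The message text, the token list name \l_uiy_result_tl and the command \apple are assumptions made for illustration. The sketch uses \msg_error:nnn, since the argument needs no expansion here, and defines \uiy_do:nn as protected, because raising an error is not expandable.

```latex
\documentclass{article}
\usepackage{xparse} % loads expl3; assumed to provide the regex functions
\ExplSyntaxOn
\tl_new:N \l_uiy_result_tl % hypothetical name for the working token list
% Hypothetical wording for the error message; #1 is the command name.
\msg_new:nnn { uiy } { undefined-command }
  { The~command~named~'#1'~is~not~defined. }
\cs_new_protected:Npn \uiy_do:nn #1#2
  {
    % Use the command named #1 if it exists, otherwise raise the error;
    % in both cases {#2} follows in the input stream.
    \cs_if_exist_use:cF {#1}
      { \msg_error:nnn { uiy } { undefined-command } {#1} }
    {#2}
  }
\NewDocumentCommand \uiy { m }
  {
    \tl_set:Nn \l_uiy_result_tl {#1}
    \regex_replace_all:nnN
      { (\D+) (\d+ (?: \. \d* )? ) }
      { \c{uiy_do:nn} \cB\{ \1 \cE\} \cB\{ \2 \cE\} }
      \l_uiy_result_tl
    \tl_use:N \l_uiy_result_tl
  }
\ExplSyntaxOff
\newcommand{\apple}[1]{(apple: #1)} % hypothetical command
\begin{document}
\uiy{apple12}
% \uiy{plum7} % \plum is not defined, so this should raise the error above
\end{document}
```

With this definition an unknown name is reported through the message system, and the number is still typeset on its own after the error.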