I would say that parsing integer and floating point values is the most complex part of the tokenizer.
The simplest way to parse would be similar to what I have done before:
a small state machine that tracks which kinds of literal are still possible.
As each character is read, determine what the literal can no longer be. The
only unambiguous literal is a binary one, which starts with 0b. Anything
else could still be a floating point value. Once a decimal point shows up,
the literal is known to be a float, so parsing is mostly a matter of
removing the candidates that no longer apply.
Either way, literal parsing will be quite complex, especially with underscores and such.
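
A minimal sketch of the elimination idea, written in Python purely for
illustration. The names Kind, DIGITS and classify_literal are hypothetical,
and the underscore rule (an underscore may only follow a digit) is an
assumption rather than a settled design. Exponents and other prefixes are
left out on purpose.

```python
from enum import Enum, auto


class Kind(Enum):
    BINARY = auto()
    INT = auto()
    FLOAT = auto()


DIGITS = frozenset("0123456789")


def classify_literal(text: str) -> Kind:
    """Classify a numeric literal by ruling out kinds as characters are read."""
    if not text:
        raise ValueError("empty literal")

    # A 0b prefix is unambiguous: only a binary literal remains possible.
    if text[:2] == "0b":
        digits = text[2:].replace("_", "")  # underscores are simply skipped here
        if not digits or any(c not in "01" for c in digits):
            raise ValueError(f"malformed binary literal: {text!r}")
        return Kind.BINARY

    # Otherwise start with every remaining candidate and eliminate.
    possible = {Kind.INT, Kind.FLOAT}
    seen_digit = False
    prev = ""
    for ch in text:
        if ch in DIGITS:
            seen_digit = True
        elif ch == "_":
            # Assumed rule: an underscore may only follow a digit.
            if prev not in DIGITS:
                raise ValueError(f"misplaced underscore in {text!r}")
        elif ch == ".":
            # A decimal point rules out the integer candidate.
            if Kind.INT not in possible:
                raise ValueError(f"second decimal point in {text!r}")
            possible.discard(Kind.INT)
        else:
            raise ValueError(f"unexpected character {ch!r} in {text!r}")
        prev = ch

    if not seen_digit or prev == "_":
        raise ValueError(f"malformed literal: {text!r}")
    return Kind.INT if Kind.INT in possible else Kind.FLOAT
```

With this sketch, classify_literal("1_000.5") ends up as Kind.FLOAT because
the decimal point removes Kind.INT from the candidate set, while
classify_literal("0b1010") is decided by the prefix alone.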