I am trying to process a very large XML file. At some point it seems to contain an invalid character that makes the processing script fail.
I'd like to see what is on that line, but Python (3.6.9) reports a negative line number:
xml.parsers.expat.ExpatError: not well-formed (invalid token): line -1503625011, column 60
I assume that the line number is negative because it is above the max integer value.
How can I "convert" this negative number to a positive number, so I can feed it to head file -n (number) | tail -n1
in order to isolate that faulty line?
Looks like it's incorrectly using a signed 32-bit int for the line number, which has wrapped around.
Converting -1503625011 to an unsigned 32-bit int gives 2791342285 (that is, -1503625011 + 2^32).
To 'un-sign' integers like this, see How to convert signed to unsigned integer in python
Note: This would only affect line numbers >= 2^31 (2,147,483,648), i.e. anything past the signed 32-bit maximum of 2,147,483,647.
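The conversion can be sketched in Python as below. This assumes the parser's counter really is a signed 32-bit int that wrapped exactly once; the helper name to_unsigned32 is made up for illustration and is not part of any library.

```python
def to_unsigned32(n):
    """Reinterpret a signed 32-bit value as unsigned (two's complement).

    Assumption: the reported line number overflowed a 32-bit signed counter.
    """
    # Masking with 0xFFFFFFFF keeps the low 32 bits, which is equivalent
    # to adding 2**32 to a negative value in the signed 32-bit range.
    return n & 0xFFFFFFFF


reported_line = -1503625011  # value taken from the ExpatError message
actual_line = to_unsigned32(reported_line)
print(actual_line)  # 2791342285
```

The result can then go into the command from the question, e.g. head -n 2791342285 file | tail -n 1. If the file has more than 2^32 lines, the counter may have wrapped more than once, in which case the real line would be this value plus some multiple of 4294967296.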