I'm working with ENDF data (see here). The files sometimes use what is quite possibly the most annoying encoding of scientific-notation floating-point numbers I have ever seen [1]: instead of 1.234e-3, a value is often written as 1.234-3, omitting the "e".
I've seen a library that simply changes - into e- and + into e+ by plain substitution. But that doesn't work when some of the numbers are negative: you end up with nonsense like e-5.122e-5 when the input was -5.122-5.
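A minimal reproduction of that failure, assuming the substitution amounts to two plain str.replace calls (the library's actual code is not shown here, and naive_fix is a made-up name for illustration):

```python
def naive_fix(text):
    # Assumed behaviour: blindly put an 'e' in front of every sign,
    # including the leading sign of a negative number.
    return text.replace("-", "e-").replace("+", "e+")

print(naive_fix("1.234-3"))   # 1.234e-3
print(naive_fix("-5.122-5"))  # e-5.122e-5
```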
So I guess I need to move on to regex? I'm open to a simpler solution, but it's the best I can think of right now. I am using Python's re library. I can do a simple substitution where I look for [0-9]-[0-9] and replace it like this:
import re
str1 = '-5.634-5'
# The replacement text is hard-coded, so this only happens to work for this input.
x = re.sub('[0-9]-[0-9]', '4e-5', str1)
print(x)
But obviously this won't work in general, because the digits before and after the - need to stay what they were, not be something I made up. I've used capturing groups before, but what would be the fastest way here to capture the digits before and after the - and feed them back into the substitution using Python's re library?
[1] Yes, I know, Fortran... 80 characters... save space... punch cards... nobody cares anymore.
I probably wouldn't reach for regex here, since some simple string operations should work:
s.replace("-", "e-").replace("+", "e+").lstrip("e")
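For the capturing-group version the question asks about, a minimal sketch might look like the following. The helper name endf_to_float and the exact pattern are illustrative assumptions, not part of any ENDF library. It assumes each string holds a single number and that the exponent sign always directly follows a digit or a decimal point.

```python
import re

# A sign that sits between a digit (or '.') and a digit is taken to be the
# exponent sign. A leading sign never matches because nothing precedes it.
_EXP_SIGN = re.compile(r"([0-9.])([+-])([0-9])")

def endf_to_float(text):
    """Convert an ENDF-style number such as '-5.122-5' to a float (hypothetical helper)."""
    # \1 and \3 put the captured neighbours back unchanged; \2 is the sign.
    return float(_EXP_SIGN.sub(r"\1e\2\3", text.strip()))

print(endf_to_float("-5.122-5"))
print(endf_to_float("1.234+3"))
```

Unlike the chained replace one-liner, this leaves numbers that already contain an "e" untouched and tolerates surrounding whitespace, at the cost of a regex.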