I have many strings, each one the path of a file. I would like to extract the number written in exponential notation from each string.
For example, I have:
../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_7.27168772219203e-07/wm_up
and I would like to extract the float number: 7.27168772219203e-07
I would like to avoid using the split method (with _ as separator).
So I tried a Python regexp, but I can't work out which method to use (findall, search or sub).
How can I achieve this in a simple or short way, independently of the wm_up substring, since other substrings may appear in its place (wm_dw, for example)?
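As a minimal sketch of the extraction step, re.search can return the first substring that looks like a float with an exponent, without any reference to the trailing wm_up or wm_dw part. The pattern name FLOAT_EXP and the pattern itself are assumptions made for this illustration: it requires a decimal point and an exponent, so plain integers such as 13 or 22 in the path are skipped.

```python
import re

# Hypothetical pattern: digits, a decimal point, digits, then a mandatory exponent.
FLOAT_EXP = re.compile(r"[-+]?\d+\.\d+[eE][-+]?\d+")

path = ("../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/"
        "Archive_WP_Pk_der_3_pts_step_7.27168772219203e-07/wm_up")

# search() scans the whole string and stops at the first match;
# it returns None when nothing matches, so check before using it.
match = FLOAT_EXP.search(path)
if match:
    text = match.group(0)   # the matched substring
    value = float(text)     # the same number as a Python float
    print(text)
```

findall would return every match as a list, and sub is meant for replacing text, so search seems the closest fit when exactly one such number is expected per path.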
I want to extract the number because I need to sort all these long strings in ascending order. I would like to use natsorted.
For example, I initially have:
../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.301510038746646e-06/wm_up
../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.301510038746646e-06/wm_dw
../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.437191487625705e-05/wm_up
../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.437191487625705e-05/wm_dw
This is the result of applying natsorted to the array of paths. As you can see, the ascending order takes the leading digits into account rather than the actual value of the number in exponential notation, which is what I would like to extract. I would like to sort in ascending order of this value.
Here is the code:
import re

l = [
'../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.301510038746646e-06/wm_up',
'../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.301510038746646e-06/wm_dw',
'../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.437191487625705e-05/wm_up',
'../../Analysis_Pk_vs_Step_BEFORE_NEW_LAUNCH_13_DECEMBRE_22h57/Archive_WP_Pk_der_3_pts_step_9.437191487625705e-05/wm_dw'
] # the input that we have
# regex from https://stackoverflow.com/a/4703508/7434857
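numeric_const_pattern = r'[-+]? (?: (?: \d* \. \d+ ) | (?: \d+ \.? ) )(?: [Ee] [+-]? \d+ ) ?'
rx = re.compile(numeric_const_pattern, re.VERBOSE) # re.VERBOSE ignores the spaces in the pattern
# sort by the last number found in each path (the step value), then by the path itself
l.sort(key=lambda x: (float(rx.findall(x)[-1]), x))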
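The same idea can be sketched with a small key function that does not raise when a path contains no such number. The helper name step_value, the pattern FLOAT_EXP and the shortened paths below are all hypothetical, chosen so that ordering by leading digits and ordering by value differ.

```python
import re

# Hypothetical pattern: a float with a decimal point and a mandatory exponent.
FLOAT_EXP = re.compile(r"[-+]?\d+\.\d+[eE][-+]?\d+")

def step_value(path):
    """Return the first exponent-format number in path as a float.

    Paths without such a number get +inf, so they sort last
    instead of raising an exception.
    """
    m = FLOAT_EXP.search(path)
    return float(m.group(0)) if m else float("inf")

# Hypothetical shortened paths, for illustration only.
paths = [
    "run/step_9.3e-06/wm_up",
    "run/step_2.5e-05/wm_dw",
    "run/step_7.2e-07/wm_up",
    "run/step_9.3e-06/wm_dw",
]

# Sort by numeric value first, then by the path text to break ties.
ordered = sorted(paths, key=lambda p: (step_value(p), p))
for p in ordered:
    print(p)
```

Here 2.5e-05 ends up last even though its leading digit is the smallest, because the key compares the float values. natsorted also accepts a key argument, so the same key function could presumably be passed to it in place of sorted.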