This is my first question here; thanks in advance for your help and for taking the time to read it.
I am experimenting with regular expressions in Notepad++ (NP++).
I would like to transform the lines below (from) into the formatted lines (to) with a more attractive and smarter regex than mine (see the unattractive solution below).
(from) => (to)
H04B0001240000; => H04B 1/24;
H04B0010300000; => H04B 10/30;
H04B0011301000; => H04B 11/301;
H04B0111300000; => H04B 111/30;
H04B0101303400; => H04B 101/3034;
H04B0100300010; => H04B 100/30001;
H04B0110300000; => H04B 110/30;
For a given code, the rules are:
H04B0001240000;
-Cut it into three parts of 4, 4 and 6 characters
H04B 0001/240000;
-Remove all padding 0s at the beginning of the second group (the second group should keep at least one digit)
H04B 1/240000;
-Remove all padding 0s at the end of the third group (the third group should keep at least two digits)
H04B 1/24;
So the 0s considered useless are those at the beginning of the second group and at the end of the third group. The number of padding 0s varies.
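The rules above can also be written out without any regex, which makes a handy reference when checking a pattern. A minimal sketch in Python, assuming every input is a 14-character code optionally followed by a semicolon; the function name format_code is made up for illustration:

```python
def format_code(line: str) -> str:
    """Turn 'H04B0001240000;' into 'H04B 1/24;' (hypothetical helper)."""
    code = line.strip().rstrip(";")
    if len(code) != 14:
        raise ValueError(f"expected a 14-character code, got {code!r}")
    # Cut into three parts of 4, 4 and 6 characters.
    first, second, third = code[:4], code[4:8], code[8:]
    # Remove leading zeros from the second group, keeping at least one digit.
    second = second.lstrip("0") or "0"
    # Remove trailing zeros from the third group, keeping at least two digits.
    third = third[:2] + third[2:].rstrip("0")
    return f"{first} {second}/{third};"


if __name__ == "__main__":
    for sample in ("H04B0001240000;", "H04B0100300010;"):
        print(sample, "=>", format_code(sample))
```

The sketch always re-appends the semicolon, which matches the sample lines but is an assumption about the real data.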
In NP++, I found a solution, but I find it unattractive.
In the 'Search' field:
([A-Z])((?:0{3}([1-9]))|(?:0{2}([1-9][0-9]))|(?:0([1-9][0-9]{2})))([0-9]{2})([0-9]*[1-9])?0{1,4}(;)
In the 'Replace' field:
\1 \3\4\5\/\6\7\8
Explanation, using H04B 0001/240000;
==============================
([A-Z])
matches one capital letter from A to Z, i.e. the last letter of the first group (H04B)
((?:0{3}([1-9]))|(?:0{2}([1-9][0-9]))|(?:0([1-9][0-9]{2})))
should match 0002, 0020 or 0201, but not 2011. It detects the second group (0001)
([0-9]{2})([0-9]*[1-9])?0{1,4}(;)
handles the third group of 6 digits (240000), with the intention of discarding all padding 0s at the end. The third group should keep at least two digits: ([0-9]{2})
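To try this search/replace pair outside NP++, here is a small sketch in Python. It assumes Python 3.5 or later, where a group that did not take part in the match is replaced by an empty string, and it writes the escaped \/ of the NP++ replacement as a plain slash. The names ASKER_PATTERN, ASKER_REPLACEMENT and apply_asker_regex are only for illustration:

```python
import re

# Search expression from the question, split over two lines for readability.
ASKER_PATTERN = re.compile(
    r"([A-Z])((?:0{3}([1-9]))|(?:0{2}([1-9][0-9]))|(?:0([1-9][0-9]{2})))"
    r"([0-9]{2})([0-9]*[1-9])?0{1,4}(;)"
)

# Only one of groups 3, 4 and 5 takes part in a given match; the others are empty.
ASKER_REPLACEMENT = r"\1 \3\4\5/\6\7\8"


def apply_asker_regex(text: str) -> str:
    """Apply the question's search/replace pair to every match in text."""
    return ASKER_PATTERN.sub(ASKER_REPLACEMENT, text)


if __name__ == "__main__":
    print(apply_asker_regex("H04B0001240000;"))
    print(apply_asker_regex("H04B0100300010;"))
```

Because the second group must start with at least one 0, a code such as H04B1234123456; is left unchanged by this pattern.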
Do you know a more attractive and smarter regex to achieve this result?
You can do it like this:
(?m)^(\S{4})0*(\d\d*?)(?<=^.{8})(\d{2}\d*?)0*;
https://regex101.com/r/7pTjkB/2
(?m)
^
( \S{4} ) # (1)
0*
( \d \d*? ) # (2)
(?<= ^ .{8} )
( # (3 start)
\d{2}
\d*?
) # (3 end)
0*
; # Or, (?<= ^ .{14} )
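The replacement string is not stated here; a natural guess is \1 \2/\3; (group 1, a space, group 2, a slash, group 3, then the semicolon that the match consumed). A minimal sketch in Python to try the pattern under that assumption. Python's re module accepts the fixed-width lookbehind (?<=^.{8}), and the names below are made up for illustration:

```python
import re

# First pattern from the answer, copied unchanged.
PATTERN_WITH_SEMICOLON = re.compile(
    r"(?m)^(\S{4})0*(\d\d*?)(?<=^.{8})(\d{2}\d*?)0*;"
)

# Assumed replacement: the match consumes the ';', so it is added back.
REPLACEMENT_WITH_SEMICOLON = r"\1 \2/\3;"


def reformat_codes(text: str) -> str:
    """Reformat every code line in a (possibly multi-line) string."""
    return PATTERN_WITH_SEMICOLON.sub(REPLACEMENT_WITH_SEMICOLON, text)


if __name__ == "__main__":
    sample = "H04B0001240000;\nH04B0100300010;"
    print(reformat_codes(sample))
```

The lookbehind is what pins the end of group 2 to column 8: the greedy 0* first swallows the padding, and the lazy \d*? then grows only as far as needed for the position check to succeed.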
Or like this:
(?m)^(\S{4})0*(\d\d*?)(?<=^.{8})(\d{2}\d*?)0*(?<=^.{14})
https://regex101.com/r/7pTjkB/3
(?m)
^
( \S{4} ) # (1)
0*
( \d \d*? ) # (2)
(?<= ^ .{8} )
( # (3 start)
\d{2}
\d*?
) # (3 end)
0*
(?<= ^ .{14} )
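In this second variant the match stops at column 14 without consuming the semicolon, so the replacement presumably should not add one back. A minimal sketch in Python, assuming \1 \2/\3 as the replacement (it is not given above); the names are made up for illustration:

```python
import re

# Second pattern from the answer, copied unchanged.
PATTERN_FIXED_WIDTH = re.compile(
    r"(?m)^(\S{4})0*(\d\d*?)(?<=^.{8})(\d{2}\d*?)0*(?<=^.{14})"
)

# Assumed replacement: whatever follows column 14 (here ';') is left in place.
REPLACEMENT_FIXED_WIDTH = r"\1 \2/\3"


def reformat_fixed_width(text: str) -> str:
    """Reformat the first 14 characters of every code line in text."""
    return PATTERN_FIXED_WIDTH.sub(REPLACEMENT_FIXED_WIDTH, text)


if __name__ == "__main__":
    print(reformat_fixed_width("H04B0101303400;"))
    print(reformat_fixed_width("H04B0100300010"))
```

Since the end of the match is fixed by position and not by the semicolon, this form also handles a code that has no trailing semicolon.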