I need to print the two missing strings, KMJD23KN0008393 and KMJD23KN0008394, but what I am receiving is KMJD23KN8393 and KMJD23KN8394. I need those missing zeros in the list as well.
import re

ll = ['KMJD23KN0008391', 'KMJD23KN0008392', 'KMJD23KN0008395', 'KMJD23KN0008396']
missList = []
for i in ll:
    # Split each item into its alternating runs of letters and digits
    reList = re.findall(r"[^\W\d_]+|\d+", i)
    print(reList)
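The issue can be decomposed into three parts: extracting the number at the end of each item, finding the gaps in the resulting sequence of numbers, and formatting each missing number with the original prefix and leading zeros.
There are multiple assumptions implicit in these items. You need to be aware of these, and ideally make them explicit. I’ve worked with the following assumptions: each item ends in a run of digits, all items share the same prefix, and the list is sorted in ascending order.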
The following code implements these assumptions:
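import re

# ll is the list from the question
all_missing = []
# Start from the last item's number so the first iteration yields an empty range
last_num = int(re.search(r'\d+$', ll[-1])[0])
# Everything up to and including the last non-digit character
prefix = re.match(r'.*\D', ll[0])[0]
for item in ll:
    num_str = re.search(r'\d+$', item)[0]
    num = int(num_str)
    num_width = len(num_str)
    # Every number strictly between the previous item and this one is missing
    for missing in range(last_num + 1, num):
        all_missing.append(f'{prefix}{missing:0{num_width}}')
    last_num = num
print(all_missing)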
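If you would rather have the assumptions checked than silently relied upon, the same logic can be wrapped in a function. This is only a minimal sketch: the name find_missing and the choice to raise ValueError are my own, and the prefix is taken by slicing off the trailing digits rather than with a second regex.

```python
import re

def find_missing(items):
    """Return the strings missing from a sorted list of prefix+number items."""
    missing = []
    prefix = None
    prev_num = None
    for item in items:
        match = re.search(r'\d+$', item)
        if match is None:
            raise ValueError(f'no trailing number in {item!r}')
        item_prefix = item[:match.start()]
        num_str = match[0]
        num = int(num_str)
        # Assumption: all items share the same prefix
        if prefix is None:
            prefix = item_prefix
        elif item_prefix != prefix:
            raise ValueError(f'prefix of {item!r} differs from {prefix!r}')
        if prev_num is not None:
            # Assumption: the list is sorted in strictly ascending order
            if num <= prev_num:
                raise ValueError('items are not in strictly ascending order')
            width = len(num_str)
            missing.extend(f'{prefix}{n:0{width}}' for n in range(prev_num + 1, num))
        prev_num = num
    return missing

ll = ['KMJD23KN0008391', 'KMJD23KN0008392', 'KMJD23KN0008395', 'KMJD23KN0008396']
print(find_missing(ll))  # ['KMJD23KN0008393', 'KMJD23KN0008394']
```

With this version, an unsorted list or an item with a different prefix fails loudly instead of producing a wrong result.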
Some notes here:
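The trailing number is extracted with the regular expression \d+$. That is: one or more digits, until the end of the string.
The prefix is extracted with .*\D. That is: everything up to and including the last non-digit character.
The missing number is padded with leading zeros via the format specification '0{num_width}'. That is: pad with zeros to the same width as the original number string.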
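To see where the zeros get lost and how the format specification brings them back, here is a small standalone sketch. The variable names plain, padded and zfilled are just for illustration; str.zfill is shown as an equivalent alternative, not something the code above uses.

```python
num_str = '0008393'
num = int(num_str)        # converting to int drops the leading zeros
num_width = len(num_str)  # 7

plain = str(num)                     # '8393'
padded = f'{num:0{num_width}}'       # same as f'{num:07}' -> '0008393'
zfilled = str(num).zfill(num_width)  # '0008393'

print(plain, padded, zfilled)
```

The nested braces in the f-string are evaluated first, so the width is taken from the original string rather than hard-coded.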