I have two lists.
files = ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf', '344D4Q_v1.8.ppt', '33ADNL_v3.0.pdf']
baseline_documents = ['26ZJ35', '2EPWW9']
I want to find every entry in files whose ID part exactly matches a string in baseline_documents, and collect those entries in a new list.
Output desired:
list3 = ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf']
Code so far:
import csv
import os
import re

metadata = []
with open('D:/meta_demo.csv', 'r') as f:
    rows = csv.reader(f)
    for i in rows:
        metadata.append(i)
        #print(i)
baseline_documents = metadata[1:20]

DIR = 'D:/demo_files/'
files = [i for i in os.listdir(r"D:\demo_files")]

list3 = []
for i in files:
    if re.search(r"[^_]*", i) in baseline_documents:
        list3.append(files)

list3 = [i for i in baseline_documents if re.search(r"[^_]*", i) in files]
You can use str.startswith, which also accepts a tuple of prefixes
Ex:
files = ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf', '344D4Q_v1.8.ppt', '33ADNL_v3.0.pdf']
baseline_documents = ['26ZJ35', '2EPWW9']
result = [i for i in files if i.startswith(tuple(baseline_documents))]
print(result)
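Note that startswith is a prefix test, so a name such as '26ZJ35X_v2.0.doc' would also be accepted for the ID '26ZJ35'. If the ID has to match exactly, one possible sketch is to compare the part before the first underscore against a set. This assumes every file name has the form ID_version.ext; the extra file name and the helper document_id below are hypothetical and only there for illustration.

```python
# The last file name is a made-up near-miss to show the difference from a prefix test.
files = ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf', '344D4Q_v1.8.ppt',
         '33ADNL_v3.0.pdf', '26ZJ35X_v2.0.doc']
baseline_ids = {'26ZJ35', '2EPWW9'}  # a set gives fast membership checks


def document_id(filename):
    """Return everything before the first underscore (assumed to be the ID)."""
    return filename.split('_', 1)[0]


# Keep only files whose ID is exactly one of the baseline IDs.
exact_matches = [name for name in files if document_id(name) in baseline_ids]
print(exact_matches)  # ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf']
```

The near-miss '26ZJ35X_v2.0.doc' is dropped here, whereas the startswith version would keep it.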
If you need a regex, use re.match, which only matches at the start of the string.
Ex:
import re
files = ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf', '344D4Q_v1.8.ppt', '33ADNL_v3.0.pdf']
baseline_documents = ['26ZJ35', '2EPWW9']
# re.escape guards against IDs that contain regex metacharacters
pattern = re.compile("|".join(map(re.escape, baseline_documents)))
result = [i for i in files if pattern.match(i)]
print(result)
Output:
['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf']
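One more thing to watch for in the question's code: csv.reader yields each row as a list of strings, so metadata[1:20] is a list of lists rather than a list of IDs. A minimal sketch of flattening it first, assuming the ID sits in the first column and the first row is a header (the question does not say either way). The in-memory CSV and its column names are hypothetical stand-ins for D:/meta_demo.csv.

```python
import csv
import io

# Hypothetical stand-in for the real CSV file: a header row, then one ID per row.
sample_csv = io.StringIO("doc_id,title\n26ZJ35,Spec\n2EPWW9,Report\n")

rows = list(csv.reader(sample_csv))
# Each row looks like ['26ZJ35', 'Spec']; take the first field and skip the header.
baseline_documents = [row[0] for row in rows[1:] if row]

files = ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf', '344D4Q_v1.8.ppt', '33ADNL_v3.0.pdf']
result = [name for name in files if name.startswith(tuple(baseline_documents))]
print(result)  # ['26ZJ35_v1.4.doc', '2EPWW9_v1.1.pdf']
```

With the real file, replace sample_csv with the handle from open('D:/meta_demo.csv', newline='') and adjust the column index if the ID is stored elsewhere.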