
I'm looking for a way to extract strings from a text file using specific criteria


I have a text file containing random strings. I want to extract the strings that match specific criteria.

Example text :

B311-SG-1700-ASJND83-ANSDN762 BAKSJD873-JAN-1293

Example criteria :

All the strings that contain characters separated by hyphens this way: XXX-XX-XXXX

Output : 'B311-SG-1700'

I tried creating a function, but I can't figure out how to express criteria for strings or how to apply them.


Solution

  • Based on your comment, here is a Python script that might do what you want (I'm not that familiar with Python).

    import re
    
    p = re.compile(r'\b(.{4}-.{2}-.{4})')
    
    results = p.findall('B111-SG-1700-ASJND83-ANSDN762 BAKSJD873-JAN-1293\nB211-SG-1700-ASJND83-ANSDN762 BAKSJD873-JAN-1293 B311-SG-1700-ASJND83-ANSDN762 BAKSJD873-JAN-1293')
    
    print(results)
    

    Output: ['B111-SG-1700', 'B211-SG-1700', 'B311-SG-1700']

    You can read a file as a string like this:

    # Read the whole file into one string; the with block closes the file afterwards
    with open("file.txt", "r") as text_file:
        data = text_file.read()
    

    Then use findall over that. Depending on the size of the file, it might require a bit more work (e.g. reading line by line).
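For the line-by-line case, here is a minimal sketch, assuming the same pattern as in the script above. The function names `extract_matches` and `extract_from_file` are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as in the script above: 4 characters, 2 characters,
# 4 characters, separated by hyphens. Compiled once and reused per line.
PATTERN = re.compile(r'\b(.{4}-.{2}-.{4})')


def extract_matches(lines):
    """Collect every match of PATTERN from an iterable of text lines."""
    matches = []
    for line in lines:
        matches.extend(PATTERN.findall(line))
    return matches


def extract_from_file(path):
    """Hypothetical helper: scan a file one line at a time."""
    # Iterating over the file object yields one line at a time,
    # so the whole file is never held in memory at once.
    with open(path, "r") as text_file:
        return extract_matches(text_file)
```

Calling `extract_from_file("file.txt")` should give the same list as running findall on the whole string, provided no match needs to span a line break (which this pattern cannot do anyway, since `.` does not match a newline by default).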