
Getting ASIN from text using Python and re


I'm trying to extract ASINs from a text file, but I haven't managed it yet. I tried using re, but it always returns None.

Here's my text:

/data.txt

FolkArt One Stroke Palette  
FolkArt One Stroke Palette
B0007TZY3W
AMZRSG1890951279
4.6 review rate122

Deleter | Manga Tool Kit SPDX   
Deleter | Manga Tool Kit SPDX
B000DZTROC
AMZRSG1890951289
4.6 review rate46

And I want to get this result:

['B0007TZY3W', 'B000DZTROC']

Here's what I tried:

import re

# Get ASINS from data text file
with open('data.txt', 'r', encoding="utf8") as file:
    data = file.read()
    data = re.search(r'B(.*) AMZRSG', str(data))
    print(data)

The result is:

None

How can I get the list of ASINs shown above? Thanks.


Solution

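  • Your pattern expects a space between the ASIN and AMZRSG, but in the file they are separated by a line break, and . does not match a newline by default. Match the newline character explicitly in your regex: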

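    import re
    
    with open('data.txt', 'r', encoding="utf8") as file:
        data = file.read()
        # file.read() already returns a str, so no str() conversion is needed
        asins = re.findall(r'B(.*)\nAMZRSG', data)
    
        for asin in asins:
            # The capture group excludes the leading 'B', so add it back
            print(f'B{asin}')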
    

    Out:

    B0007TZY3W
    B000DZTROC
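
    If you want the list from the question rather than printed lines, a stricter variant may be worth trying. This is only a sketch: it assumes every ASIN in the file is exactly 10 characters, starts with B, uses only digits and uppercase letters, and sits on its own line directly above an AMZRSG line. The helper name extract_asins and the inline sample string (standing in for the contents of data.txt) are made up for illustration.

```python
import re

# Stand-in for the contents of data.txt, copied from the question
sample = """FolkArt One Stroke Palette
FolkArt One Stroke Palette
B0007TZY3W
AMZRSG1890951279
4.6 review rate122

Deleter | Manga Tool Kit SPDX
Deleter | Manga Tool Kit SPDX
B000DZTROC
AMZRSG1890951289
4.6 review rate46
"""


def extract_asins(text):
    """Return every ASIN-looking line that is directly followed by an AMZRSG line.

    Assumption: an ASIN is 'B' plus 9 digits or uppercase letters.
    """
    # re.MULTILINE lets ^ match at the start of each line.
    # The group includes the leading 'B', so nothing has to be added back.
    # (?=AMZRSG) checks the next line without consuming it.
    pattern = r'^(B[0-9A-Z]{9})\n(?=AMZRSG)'
    return re.findall(pattern, text, flags=re.MULTILINE)


asins = extract_asins(sample)
print(asins)
```

    Anchoring the pattern to a whole line keeps a stray B inside a product title from being picked up, which the looser B(.*) pattern could do if a title line happened to sit directly above an AMZRSG line.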