Tags: python, pandas, design-patterns, numbers, analysis

Python (pandas): Find pattern in numbers


I have a few CSV files containing large numbers (up to 512 bits). I would like to write a program that searches the numbers for repeating digit patterns. For example, the output should tell me that there are 100 numbers with "123" in the last position. Patterns can also occur in the middle of a number.

As I'm quite new to Python, I was wondering whether the pandas library is the right tool for this or whether there is something better.

I'm thankful for every suggestion!!


Solution

  • This returns the rows of the dataframe whose value in the chosen column contains '123'. Input dataframe df:

       a    b
    451234  '123'
     1234   '4123'
      512   '4'
    

    If the column already holds strings:

    print(df[df['b'].str.contains('123')]['b'])
    

    output:

    0     '123'
    1    '4123'
    

    If the column does not hold strings, convert it with astype(str) first:

    print(df[df['a'].astype(str).str.contains('123')]['a'])
    

    output:

    0    451234
    1      1234
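
Since the question also asks about counting matches and about patterns at the end of a number, the following is a minimal sketch of how that might look. It assumes the numbers are read as strings (values of up to 512 bits do not fit in a 64-bit integer, so `dtype=str` avoids any loss of digits). The column name `number`, the sample values, and the helper `substring_counts` are made up for illustration and are not part of the original question or answer.

```python
import io
from collections import Counter

import pandas as pd

# Hypothetical sample data standing in for one of the CSV files;
# the column name "number" is an assumption.
csv_text = "number\n" + "\n".join([
    "98765432109876543210987654321123",
    "11112222333344445555666677778123",
    "12300000000000000000000000000000",
    "55555555555555555555555555555555",
])

# dtype=str keeps every digit exactly as written in the file.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)
numbers = df["number"]

pattern = "123"
# How many numbers end with the pattern.
ends_with = numbers.str.endswith(pattern).sum()
# How many numbers contain the pattern anywhere (plain text match, no regex).
contains = numbers.str.contains(pattern, regex=False).sum()


def substring_counts(values, length):
    """For each digit substring of the given length, count how many values contain it."""
    counts = Counter()
    for value in values:
        # A set ensures each value is counted at most once per substring.
        counts.update({value[i:i + length] for i in range(len(value) - length + 1)})
    return counts


# The most frequent 3-digit substrings across all numbers.
top = substring_counts(numbers, 3).most_common(3)
print(ends_with, contains, top)
```

To run this against a real file, the `io.StringIO(csv_text)` argument would be replaced by the file path. For an unknown pattern, `substring_counts` can be called with several lengths to see which digit sequences recur most often; for very large files a plain loop over the lines with `Counter` may be enough without pandas at all.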