Tags: python, regex, finance, yahoo-finance

How can I consistently convert strings like "3.71B" and "4M" to numbers in Python?


I have some rather mangled code that almost produces the tangible price/book from Yahoo Finance for companies (a nice module called ystockquote gets the intangible price/book value already).

My problem is this:

For one of the variables in the calculation, shares outstanding, I'm getting strings like 10.89B and 4.9M, where B and M stand for billion and million, respectively. I'm having trouble converting them to numbers. Here's where I'm at:

shares = [''.join(node.findAll(text=True)).strip().replace('M','000000').replace('B','000000000').replace('.','') for node in soup2.findAll('td')[110:112]]

This is pretty messy, but I think it would work if, instead of

.replace('M','000000').replace('B','000000000').replace('.','') 

I used a regular expression with variables. I guess the question is simply which regular expression and which variables. Other suggestions are also welcome.
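To make the problem concrete, here is a small sketch of what the chained replace calls do to the sample strings. The function name naive_convert is only an illustrative wrapper around the same three replace calls. It is not part of the scraping code.

```python
def naive_convert(text):
    # Same chained replace calls as in the snippet above (hypothetical wrapper).
    return text.replace('M', '000000').replace('B', '000000000').replace('.', '')

# Whole numbers come out right:
print(naive_convert('4M'))      # 4000000

# Anything with a decimal point does not, because dropping the '.'
# shifts the value up by one power of ten per decimal digit:
print(naive_convert('4.9M'))    # 49000000 (should be 4900000)
print(naive_convert('10.89B'))  # 1089000000000 (should be 10890000000)
```

So the suffix has to be treated as a multiplier applied to the parsed number, rather than as a run of zeros pasted onto the string.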

EDIT:

To be specific, I'm hoping for something that works for numbers with zero, one, or two decimal places, but these answers all look helpful.


Solution

  • >>> from decimal import Decimal
    >>> d = {
            'K': 3,
            'M': 6,
            'B': 9
    }
    >>> def text_to_num(text):
            if text[-1] in d:
                num, magnitude = text[:-1], text[-1]
                return Decimal(num) * 10 ** d[magnitude]
            else:
                return Decimal(text)
    
    >>> text_to_num('3.17B')
    Decimal('3170000000.00')
    >>> text_to_num('4M')
    Decimal('4000000')
    >>> text_to_num('4.1234567891234B')
    Decimal('4123456789.1234000000000')
    

    You can also call int() on the result if you want an integer.
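Since the question asked about a regular expression, here is one possible sketch of a regex-based variant. The names parse_shares, PATTERN and EXPONENTS are made up for illustration, and the pattern assumes the input is plain digits with an optional decimal part and an optional K, M or B suffix (no commas, signs or lowercase suffixes). It rejects anything else instead of guessing.

```python
import re
from decimal import Decimal

# Power of ten for each suffix; 'K' is included as in the dictionary above.
EXPONENTS = {'K': 3, 'M': 6, 'B': 9}

# Digits, an optional fractional part, then an optional suffix letter.
# Surrounding whitespace is tolerated.
PATTERN = re.compile(r'^\s*(\d+(?:\.\d+)?)\s*([KMB])?\s*$')

def parse_shares(text):
    match = PATTERN.match(text)
    if match is None:
        raise ValueError('unrecognized number: %r' % text)
    number, suffix = match.groups()
    # Decimal avoids binary floating-point rounding on inputs like '10.89'.
    value = Decimal(number)
    if suffix:
        value *= 10 ** EXPONENTS[suffix]
    # int() truncates any remaining fraction, e.g. '1.2345K' gives 1234.
    return int(value)

print(parse_shares('10.89B'))  # 10890000000
print(parse_shares('4.9M'))    # 4900000
print(parse_shares('4M'))      # 4000000
```

This works for zero, one, or two decimal places (or more), and a string such as 'N/A' raises ValueError rather than silently producing a wrong number.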