Websites like http://www.easysurf.cc/cnvert18.htm and http://www.calculatorsoup.com/calculators/conversions/numberstowords.php try to convert a numerical string into English words, but they do not give natural-sounding output.
For example, on http://www.easysurf.cc/cnvert18.htm:
[in]: 100456
[out]: one hundred thousand four hundred fifty-six
This website is a little better, http://www.calculator.org/calculate-online/mathematics/text-number.aspx:
[in]: 100456
[out]: one hundred thousand, four hundred and fifty-six
[in]: 10123124001
[out]: ten billion, one hundred and twenty-three million, one hundred and twenty-four thousand, one
But it breaks at some point:
[in]: 10000000001
[out]: ten billion, , , one
I've written my own version, but it involves lots of rules and it caps at one billion (from http://pastebin.com/WwFCjYtt):
import codecs

def num2word (num):
    ones = {1:"one",2:"two",3:"three",4:"four",
            5:"five",6:"six",7:"seven",8:"eight",
            9:"nine",0:"zero",10:"ten"}
    teens = {11:"eleven",12:"twelve",13:"thirteen",
             14:"fourteen",15:"fifteen"}
    tens = {2:"twenty",3:"thirty",4:"forty",
            5:"fifty",6:"sixty",7:"seventy",
            8:"eighty",9:"ninety"}
    lens = {3:"hundred",4:"thousand",6:"hundred",7:"million",
            8:"million", 9:"million",10:"billion"#,13:"trillion",11:"googol",
            }

    if num > 999999999:
        return "Number more than 1 billion"

    # Ones
    if num < 11:
        return ones[num]
    # Teens
    if num < 20:
        word = ones[num%10] + "teen" if num > 15 else teens[num]
        return word
    # Tens
    if num > 19 and num < 100:
        word = tens[int(str(num)[0])]
        if str(num)[1] == "0":
            return word
        else:
            word = word + " " + ones[num%10]
            return word

    # First digit for thousands,hundred-thousands.
    if len(str(num)) in lens and len(str(num)) != 3:
        word = ones[int(str(num)[0])] + " " + lens[len(str(num))]
    else:
        word = ""

    # Hundred to Million
    if num < 1000000:
        # First and Second digit for ten thousands.
        if len(str(num)) == 5:
            word = num2word(int(str(num)[0:2])) + " thousand"
        # How many hundred-thousand(s).
        if len(str(num)) == 6:
            word = word + " " + num2word(int(str(num)[1:3])) + \
                   " " + lens[len(str(num))-2]
        # How many hundred(s)?
        thousand_pt = len(str(num)) - 3
        word = word + " " + ones[int(str(num)[thousand_pt])] + \
               " " + lens[len(str(num))-thousand_pt]
        # Last 2 digits.
        last2 = num2word(int(str(num)[-2:]))
        if last2 != "zero":
            word = word + " and " + last2
        word = word.replace(" zero hundred","")
        return word.strip()

    left, right = '',''
    # Less than 1 million.
    if num < 100000000:
        left = num2word(int(str(num)[:-6])) + " " + lens[len(str(num))]
        right = num2word(int(str(num)[-6:]))
    # From 1 million to 1 billion.
    if num > 100000000 and num < 1000000000:
        left = num2word(int(str(num)[:3])) + " " + lens[len(str(num))]
        right = num2word(int(str(num)[-6:]))
    if int(str(num)[-6:]) < 100:
        word = left + " and " + right
    else:
        word = left + " " + right
    word = word.replace(" zero hundred","").replace(" zero thousand"," thousand")
    return word

print num2word(int(raw_input("Give me a number:\n")))
How can I make the script I've written accept numbers greater than one billion?
Is there any other way to get the same output?
Can my code be written in a less verbose way?
A more general approach to this problem uses repeated division (i.e. divmod) and only hardcodes the special/edge cases necessary.
For example, divmod(1034393, 1000000) -> (1, 34393), so you've effectively found the number of millions and are left with a remainder for further calculations.
A possibly more illustrative example: divmod(1034393, 1000) -> (1034, 393), which allows you to take off groups of 3 decimal digits at a time from the right.
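To make the group-peeling idea concrete, here is a minimal sketch that applies divmod repeatedly until nothing is left. The helper name split_groups is hypothetical and is not part of the question's code.

```python
def split_groups(num):
    """Split a non-negative integer into groups of three decimal digits.

    The least significant group comes first, so the list index is also
    the power of 1000 that the group belongs to.
    """
    groups = []
    while True:
        # Peel off the rightmost three digits; keep the rest for the next pass.
        num, group = divmod(num, 1000)
        groups.append(group)
        if num == 0:
            break
    return groups


print(split_groups(1034393))      # [393, 34, 1]
print(split_groups(10000000001))  # [1, 0, 0, 10]
```

The second call shows why empty groups need care: the two zero groups are what produce the "ten billion, , , one" output when they are not skipped.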
In English we tend to group digits in threes, and similar rules apply. This should be parameterized and not hard coded. For example, "303" could be three hundred and three million, three hundred and three thousand, or three hundred and three. The logic should be the same except for the suffix, depending on what place you're in. Edit: looks like this is sort of there due to recursion.
Here is a partial example of the kind of approach I mean, using a generator and operating on integers rather than doing lots of int(str(i)[..]) everywhere.
say_base = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven',
            'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen',
            'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen']
say_tens = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy',
            'eighty', 'ninety']

def hundreds_i(num):
    # Yield the pieces of the English name for 0 < num < 1000.
    hundreds, rest = divmod(num, 100)
    if hundreds:
        yield say_base[hundreds]
        yield ' hundred'
    if rest and hundreds:
        # Only join with "and" when a hundreds part precedes the rest.
        yield ' and '
    if 0 < rest < len(say_base):
        # 1 to 19 are looked up directly.
        yield say_base[rest]
    elif rest != 0:
        tens, ones = divmod(rest, 10)
        yield say_tens[tens]
        if ones > 0:
            yield '-'
            yield say_base[ones]

assert "".join(hundreds_i(245)) == "two hundred and forty-five"
assert "".join(hundreds_i(999)) == 'nine hundred and ninety-nine'
assert "".join(hundreds_i(200)) == 'two hundred'
assert "".join(hundreds_i(45)) == 'forty-five'
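One way the pieces could be combined to go past one billion is sketched below. It is an illustration under stated assumptions, not a definitive implementation: the names say_scale, say_below_1000 and say_number are hypothetical, the scale words assume the short scale (billion = 10^9), and placing "and" before a final group smaller than 100 is a style choice borrowed from the question's own code.

```python
say_base = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven',
            'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen',
            'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen']
say_tens = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy',
            'eighty', 'ninety']
# Assumption: short-scale names. Append more entries to raise the limit.
say_scale = ['', 'thousand', 'million', 'billion', 'trillion']


def say_below_1000(num):
    """Return the words for 0 < num < 1000."""
    hundreds, rest = divmod(num, 100)
    parts = []
    if hundreds:
        parts.append(say_base[hundreds] + ' hundred')
    if rest:
        if rest < len(say_base):
            words = say_base[rest]
        else:
            tens, ones = divmod(rest, 10)
            words = say_tens[tens]
            if ones:
                words += '-' + say_base[ones]
        parts.append(words)
    return ' and '.join(parts)


def say_number(num):
    """Return the words for a non-negative integer covered by say_scale."""
    if num < 0:
        raise ValueError('negative numbers are not handled in this sketch')
    if num == 0:
        return say_base[0]
    # Peel off three-digit groups, least significant first.
    groups = []
    while num:
        num, group = divmod(num, 1000)
        groups.append(group)
    if len(groups) > len(say_scale):
        raise ValueError('number too large for say_scale')
    parts = []
    for index in reversed(range(len(groups))):
        if groups[index] == 0:
            continue  # skipping empty groups avoids 'ten billion, , , one'
        words = say_below_1000(groups[index])
        if say_scale[index]:
            words += ' ' + say_scale[index]
        parts.append(words)
    # Style assumption: "and" before a final group below one hundred.
    if len(parts) > 1 and 0 < groups[0] < 100:
        return ', '.join(parts[:-1]) + ' and ' + parts[-1]
    return ', '.join(parts)


print(say_number(100456))       # one hundred thousand, four hundred and fifty-six
print(say_number(10000000001))  # ten billion and one
```

Because the same say_below_1000 logic is reused for every group and only the suffix changes, supporting larger numbers is a matter of adding names to say_scale rather than adding new rules.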