Search code examples
pythonmathscale

Convert small numbers to a scale from 0 to 10


I have a list of urls->small numbers. numbers represents the importance of each URL

url, value
https://mywebsite.com/p/1, 0.00212
https://mywebsite.com/p/2, 0.00208
https://mywebsite.com/p/3, 0.00201
https://mywebsite.com/p/4, 0.00138
https://mywebsite.com/p/5, 0.00067
https://mywebsite.com/p/1, 0.00001
...

let's say the sum of values = 1 I would like to represent those number within a scale of 0-10 and maintaining a kind of ratio difference between numbers

url, value, scaled_value
https://mywebsite.com/p/1, 0.00212, 10
https://mywebsite.com/p/2, 0.00208, 9
https://mywebsite.com/p/3, 0.00201, 9
https://mywebsite.com/p/4, 0.00138, 6
https://mywebsite.com/p/5, 0.00067, 3
https://mywebsite.com/p/1, 0.00001, 1
...

something like this (I don't know if the ratio difference here is maintained tho) any one can help with the maths ? thanks

#Update

thanks to @annZen help, I tried two approach, but the results are different I don't know why. if someone can help ?

enter image description here

here are the two formulas I used :

res1 = round(x*9/maxpri)+1
res2 = round(((x-minpri)/(maxpri-minpri))*10, 2)

Solution

  • Here is a way:

    with open('file.txt', 'r') as p:
        lst = p.read().splitlines() # List all the lines of the file
    
    lst2 = [float(i.split(', ')[1]) for i in lst[1:]] # List all the floats
    
    num = [round(a*9/max(lst2))+1 for a in lst2] # List all the scaled numbers
    
    for i,(l,n) in enumerate(zip(lst,['scaled_value']+num)):
        lst[i] = f"{l}, {n}" # Add the 'scaled_value' column
    
    with open('file.txt', 'w') as p:
        p.write('\n'.join(lst)) # Write the updated data into the file
    

    Before:

    url, value
    https://mywebsite.com/p/1, 0.00212
    https://mywebsite.com/p/2, 0.00208
    https://mywebsite.com/p/3, 0.00201
    https://mywebsite.com/p/4, 0.00138
    https://mywebsite.com/p/5, 0.00067
    https://mywebsite.com/p/1, 0.00001
    

    After:

    url, value, scaled_value
    https://mywebsite.com/p/1, 0.00212, 10
    https://mywebsite.com/p/2, 0.00208, 10
    https://mywebsite.com/p/3, 0.00201, 10
    https://mywebsite.com/p/4, 0.00138, 7
    https://mywebsite.com/p/5, 0.00067, 4
    https://mywebsite.com/p/1, 0.00001, 1
    



    UPDATE:

    The part of my code that does the converting is:

    num = [round(a*9/max(lst2))+1 for a in lst2]
    

    where lst2 is simply the list of floats extracted from the file. You updated the question for me to explain the difference between

    res1 = round(x*9/maxpri)+1
    res2 = round(((x-minpri)/(maxpri-minpri))*10, 2)
    

    Lets first see then in my list comprehension:

    num1 = [round(x*9/max(lst2))+1 for x in lst2]
    num2 = [round(((x-min(lst2))/(max(lst2)-min(lst2)))*10, 2) for x in lst2]
    print(num1)
    print(num2)
    

    Output:

    [10, 10, 10, 7, 4, 1]
    [10.0, 9.81, 9.48, 6.49, 3.13, 0.0]
    

    The first clearest difference is that i rounded my answer to the nearest integer. Without it, it would be:

    num1 = [round(x*9/max(lst2), 2)+1 for x in lst2]
    num2 = [round(((x-min(lst2))/(max(lst2)-min(lst2)))*10, 2) for x in lst2]
    print(num1)
    print(num2)
    

    Output:

    [10.0, 9.83, 9.53, 6.86, 3.84, 1.04]
    [10.0, 9.81, 9.48, 6.49, 3.13, 0.0]
    

    The values are now very close, but there's one more thing. My code assumes that the minimum value for the scaled values is 1, because I saw in your post https://mywebsite.com/p/1, 0.00001, 1. I now realize that you stated 0-10, not 1-10. So that another is to change the 9 (10-1=9) to a 10(10-0=10), and removing the +1:

    round(x*9/max(lst2), 2)+1
    round(x*10/max(lst2), 2)
    

    num1 = [round(x*10/max(lst2), 2) for x in lst2]
    num2 = [round(((x-min(lst2))/(max(lst2)-min(lst2)))*10, 2) for x in lst2]
    print(num1)
    print(num2)
    

    Output:

    [10.0, 9.81, 9.48, 6.51, 3.16, 0.05]
    [10.0, 9.81, 9.48, 6.49, 3.13, 0.0]
    

    Still a little different, that's because I assumed the minimum value in your column is 0, because you didn't show your whole array. But in this case, it's 0.00001. So, go with:

    num = [round(((x-min(lst2))/(max(lst2)-min(lst2)))*10, 2) for x in lst2]
    

    Summary: My code assumed that you wanted the numbers scaled from 1 to 10, instead of 0 to 10, and my code assumed that the minimum value for your data is 0, which might not be the case.