
In matplotlib, is there a way to fix the order of x-values that mix letters and digits?


Several existing Q&As cover x-values in matplotlib, and they show that when the x-values are int or float, matplotlib plots the figure in the correct order of x. When the values are strings, however, the plot shows the x-values in the order

1 15 17 2 21 7 etc

but once they are converted to int, the order becomes

1 2 7 15 17 21 etc

which is the human order. If the x-values mix letters and digits, such as

NN8 NN10 NN15 NN20 NN22 etc

the plot shows them in the order

NN10 NN15 NN20 NN22 NN8 etc

Is there a way to keep the x-values in human order, or in the order they already have in the x list, without removing 'NN' from the x-values?

In more detail, the x-values are directory names. The results come from a Linux shell function (which uses grep) piped through sort; they are displayed in the terminal as follows and can be saved to a text file.

joonho@login:~/NDataNpowN$ get_TEFrmse NN 2 | sort -n -t N -k 3
NN7 0.3311
NN8 0.3221
NN9 0.2457
NN10 0.2462
NN12 0.2607
NN14 0.2635

Without sort, the Linux shell displays them in machine order:

NN10 0.2462
NN12 0.2607
NN14 0.2635
NN7 0.3311
NN8 0.3221
NN9 0.2457

Solution

  • As I said, pandas would make this task easier than dealing with base Python lists and such:

    import matplotlib.pyplot as plt
    import pandas as pd
    
    #imports the text file, assuming your data are separated by whitespace, as in your example above
    #(sep=r"\s+" replaces delim_whitespace=True, which is deprecated in recent pandas versions)
    df = pd.read_csv("test.txt", sep=r"\s+", names=["X", "Y"])
    #extracting the number in a separate column, assuming you do not have terms like NN1B3X5
    df["N"] = df.X.str.replace(r"\D", "", regex=True).astype(int)
    #this step is only necessary, if your file is not pre-sorted by Linux
    df = df.sort_values(by="N")
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 6))
    
    #categorical plotting
    df.plot(x="X", y="Y", ax=ax1)
    ax1.set_title("Evenly spaced")
    
    #numerical plotting
    df.plot(x="N", y="Y", ax=ax2)
    ax2.set_xticks(df.N)
    ax2.set_xticklabels(df.X)
    ax2.set_title("Numerical spacing")
    
    plt.show()
    

    Sample output: (image not available; two panels titled "Evenly spaced" and "Numerical spacing")
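The comment in the code above assumes there are no labels like NN1B3X5. A minimal sketch of why that matters, using only the standard `re` module: deleting every non-digit merges separate digit runs into one number, while taking just the first run keeps them apart. The label `NN1B3X5` is a hypothetical edge case, not part of the original data.

```python
import re

# The last label is a hypothetical edge case, not taken from the original file.
labels = ["NN8", "NN10", "NN1B3X5"]

# Deleting every non-digit character merges separate digit runs: "NN1B3X5" -> 135.
stripped = [int(re.sub(r"\D", "", s)) for s in labels]

# Taking only the first run of digits keeps the runs apart: "NN1B3X5" -> 1.
first_run = [int(re.search(r"\d+", s).group()) for s in labels]

print(stripped)
print(first_run)
```

Which behaviour is right depends on what the labels mean. In pandas, `Series.str.extract(r"(\d+)")` should give the first-run behaviour, but check it against your own labels.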

    Since you asked whether there is a non-pandas solution: of course. Pandas just makes some things more convenient. In this case, I would fall back on numpy. Numpy is a matplotlib dependency, so, in contrast to pandas, it is already installed if you use matplotlib:

    import matplotlib.pyplot as plt
    import numpy as np
    import re
    
    #read file as strings
    arr = np.genfromtxt("test.txt", dtype="U15")
    #strip all non-digit characters (here the "NN" prefix) and convert to int
    Xnums = np.asarray([re.sub(r"\D", "", i) for i in arr[:, 0]], dtype=int)
    #sort array 
    arr = arr[np.argsort(Xnums)]
    #extract x-values as strings...
    Xstr = arr[:, 0]
    #...and y-values as float
    Yvals = arr[:, 1].astype(float)
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 6))
    
    #categorical plotting
    ax1.plot(Xstr, Yvals)
    ax1.set_title("Evenly spaced")
    
    #numerical plotting
    ax2.plot(np.sort(Xnums), Yvals)
    ax2.set_xticks(np.sort(Xnums))
    ax2.set_xticklabels(Xstr)
    ax2.set_title("Numerical spacing")
    
    plt.show()
    

    Sample output: (image not available; two panels titled "Evenly spaced" and "Numerical spacing")
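If you want neither pandas nor numpy for the sorting step, a plain-Python "natural sort" key is another option. The following is a minimal sketch, not part of the solutions above: `natural_key` is a hypothetical helper name, and the rows are the unsorted values from the question.

```python
import re

def natural_key(label):
    """Split a label into text and integer chunks, e.g. 'NN10' -> ['nn', 10, ''].

    re.split with a capturing group puts the digit runs at the odd indices,
    so text is always compared with text and numbers with numbers.
    """
    parts = re.split(r"(\d+)", label)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

# Rows in machine order, as printed by the shell without sort.
rows = [("NN10", 0.2462), ("NN12", 0.2607), ("NN14", 0.2635),
        ("NN7", 0.3311), ("NN8", 0.3221), ("NN9", 0.2457)]

# Sort the (label, value) pairs together so each value stays with its label.
rows.sort(key=lambda row: natural_key(row[0]))

labels = [label for label, _ in rows]
values = [value for _, value in rows]

print(labels)
```

The sorted `labels` and `values` can then be passed to `ax.plot(labels, values)`. Matplotlib's categorical axis places strings in the order they first appear, so the 'NN' prefix stays in the tick labels and the human order is preserved.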