python matplotlib astronomy

How do I separate a data set on a scatter plot?


I'm very new to Python and would like to learn how to plot the data points in a scatter plot with different markers according to where they fall in the plot.

My specific example is much like this: http://www.astroml.org/examples/datasets/plot_sdss_line_ratios.html

I have a BPT plot and want to split the data along the demarcation line.

I have a data set in this format:

data = [[a,b,c],
        [a,b,c],
        [a,b,c]
]

And I also have the following for the demarcation line:

NII   = np.linspace(-3.0, 0.35)

def log_OIII_Hb_NII(log_NII_Ha, eps=0):
    return 1.19 + eps + 0.61 / (log_NII_Ha - eps - 0.47)

Any help would be great!


Solution

  • I assume you have the pixel coordinates as a, b in your example. The column of c values is then what is used to decide which of the two groups a point belongs to.

    Make your data first an ndarray:

    import numpy as np
    
    data = np.array(data)
    

    Now you may create two arrays by checking which part of the data belongs to which area:

    dataselector = log_OIII_Hb_NII(data[:,2]) > 0
    

    This creates a boolean vector that is True wherever the value in the third column (column 2) gives a positive result from the function, and False everywhere else. The length of the vector equals the number of rows in data.
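    As a quick check of what the selector looks like, here is a minimal sketch on three made-up rows. The numbers are purely illustrative and not taken from the question; only log_OIII_Hb_NII comes from the original post.

```python
import numpy as np

def log_OIII_Hb_NII(log_NII_Ha, eps=0):
    # demarcation function copied from the question
    return 1.19 + eps + 0.61 / (log_NII_Ha - eps - 0.47)

# Hypothetical rows in the [a, b, c] layout; values are invented for illustration
data = np.array([[0.0, 1.0, -1.0],
                 [1.0, 2.0,  0.0],
                 [2.0, 3.0,  0.2]])

# One boolean per row: True where the function of column 2 is positive
dataselector = log_OIII_Hb_NII(data[:, 2]) > 0

# Boolean indexing keeps only the rows where the mask is True
group_true = data[dataselector]
# ~ inverts the mask, so this keeps the remaining rows
group_false = data[~dataselector]

print(dataselector)
print(group_true.shape, group_false.shape)
```

    Every row ends up in exactly one of the two groups, so the two selections together always cover the whole array.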

    Then you can plot the two data sets:

    import matplotlib.pyplot as plt
    
    fig = plt.figure()
    ax = fig.add_subplot(111)
    
    # the plotting part
    # ~ inverts the boolean mask; unary minus on a boolean array raises a TypeError in current NumPy
    ax.plot(data[dataselector,0], data[dataselector,1], 'ro')
    ax.plot(data[~dataselector,0], data[~dataselector,1], 'bo')
    plt.show()
    

    I.e.:

    • create a boolean array which tells which rows of data belong to which group
    • plot the two groups (~dataselector means "all the rows where there is a False in dataselector")
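If the first two columns are instead the plotted line ratios themselves (an assumption that differs from the one made above), the same masking idea can compare each point's y value with the demarcation curve evaluated at its x value. The following is only a sketch under that assumption; the sample values, the variable name above, and the helper plot_split are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

def log_OIII_Hb_NII(log_NII_Ha, eps=0):
    # demarcation function copied from the question
    return 1.19 + eps + 0.61 / (log_NII_Ha - eps - 0.47)

# Hypothetical data: column 0 is taken as x, column 1 as y, column 2 is unused here
data = np.array([[-1.0,  0.0, 0.0],
                 [-0.5,  1.0, 0.0],
                 [ 0.0, -0.5, 0.0],
                 [ 0.2,  0.5, 0.0]])
x, y = data[:, 0], data[:, 1]

# True where a point lies above the curve. The curve has a vertical
# asymptote at x = 0.47 (for eps=0), so this comparison is only
# meaningful for points to the left of it, as in this sample.
above = y > log_OIII_Hb_NII(x)

def plot_split(x, y, above):
    """Draw the two groups in different colours together with the curve."""
    fig, ax = plt.subplots()
    ax.plot(x[above], y[above], 'ro')    # points above the line
    ax.plot(x[~above], y[~above], 'bo')  # points on or below the line
    NII = np.linspace(-3.0, 0.35)        # same range as in the question
    ax.plot(NII, log_OIII_Hb_NII(NII), 'k-')
    return fig
```

Calling plot_split(x, y, above) followed by plt.show() displays the figure. Which of the two selectors is appropriate depends on what the columns of data actually contain.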