How do I separate a data set on a scatter plot?

I'm very new to Python, but I'd like to learn how to give the data points in a scatter plot different markers according to where they fall in the plot.

My specific example is much like this one: http://www.astroml.org/examples/datasets/plot_sdss_line_ratios.html

I have a BPT plot and want to split the data along the demarcation line.

I have a data set in this format:

data = [[a,b,c],
        [a,b,c],
        [a,b,c]
]

And I also have the following for the demarcation line:

import numpy as np

NII = np.linspace(-3.0, 0.35)

def log_OIII_Hb_NII(log_NII_Ha, eps=0):
    return 1.19 + eps + 0.61 / (log_NII_Ha - eps - 0.47)

Any help would be great!

Solution

I assume that a and b in your example are the plot coordinates of each point, and that the third column, c, is the value used to decide which of the two groups the point belongs to.

First, convert your data to an ndarray:

import numpy as np

data = np.array(data)

Now you may create two arrays by checking which part of the data belongs to which area:

dataselector = log_OIII_Hb_NII(data[:,2]) > 0

This creates a boolean vector that is True wherever the function returns a positive value for the data in the third column (column 2), and False elsewhere. The length of the vector equals the number of rows in data.
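To make the boolean indexing concrete, here is a minimal sketch on made-up numbers. The toy array and the names mask, selected and rejected are only for illustration; a plain "> 0" test on the third column stands in for the call to your function.

```python
import numpy as np

# Hypothetical toy data: three rows, columns a, b, c
data = np.array([[0.0, 1.0, -2.0],
                 [1.0, 2.0,  0.5],
                 [2.0, 3.0,  3.0]])

# One boolean per row: True where the third column is positive
mask = data[:, 2] > 0

# Indexing with the mask keeps the True rows;
# ~mask inverts it and keeps the False rows
selected = data[mask]
rejected = data[~mask]
```

Every row ends up in exactly one of the two arrays, so selected and rejected together always contain all the rows of data.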

Then you can plot the two data sets:

import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111)

# the plotting part
ax.plot(data[dataselector,0], data[dataselector,1], 'ro')
# ~ inverts a boolean array (unary minus raises a TypeError on boolean arrays in current NumPy)
ax.plot(data[~dataselector,0], data[~dataselector,1], 'bo')
plt.show()

In short:

1. Create an array of True/False values that tells which rows of data belong to which group.
2. Plot the two groups (~dataselector means "all the rows where there is a False in dataselector").
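If the first two columns are instead the two line ratios themselves, so that the split depends on which side of the demarcation curve a point lies, the same masking idea still applies. The following is only a sketch under that assumption: the helper above_line and the sample points are hypothetical, and only log_OIII_Hb_NII comes from the question.

```python
import numpy as np

def log_OIII_Hb_NII(log_NII_Ha, eps=0):
    # Demarcation curve from the question
    return 1.19 + eps + 0.61 / (log_NII_Ha - eps - 0.47)

def above_line(x, y, eps=0):
    """Hypothetical helper: True where the point (x, y) lies above the curve.

    Assumes x is log([NII]/Ha) and y is log([OIII]/Hb).
    The curve has a pole at x = 0.47 + eps, so this is only
    meaningful for x values below that.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return y > log_OIII_Hb_NII(x, eps)

# Made-up sample points, all with x < 0.47
x = np.array([-1.0, -0.5, 0.0, 0.2])
y = np.array([-0.5, 1.0, -1.0, 0.5])

above = above_line(x, y)   # boolean mask, one entry per point
x_above, y_above = x[above], y[above]
x_below, y_below = x[~above], y[~above]
```

The two pairs of arrays can then be passed to ax.plot with different markers, exactly as in the plotting code above.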