Tags: image-processing, image-segmentation, data-fitting, polynomial-approximations

How do I 'fit a line' to a cluster of pixels?


I would like to generate a polynomial 'fit' to the cluster of colored pixels in the image here

(The point being that I would like to measure how much that cluster approximates a horizontal line.) I thought of using grabit or something similar and then treating this as a cloud of points in a graph. But is there a quicker function to do so directly on the image file? Thanks!


Solution

  • Here is a Python implementation. Basically, we find all (xi, yi) coordinates of the colored regions, then set up a regularized least squares problem in which we want to find the vector of weights (w0, ..., wd) such that yi = w0 + w1 xi + w2 xi^2 + ... + wd xi^d holds "as close as possible" in the least squares sense.

    import numpy as np
    import matplotlib.pyplot as plt
    
    def rgb2gray(rgb):
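        # Standard ITU-R BT.601 luma weights for RGB -> grayscale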
        return np.dot(rgb[...,:3], [0.299, 0.587, 0.114])
    
    def feature(x, order=3):
        """Generate polynomial feature of the form
        [1, x, x^2, ..., x^order] where x is the column of x-coordinates
        and 1 is the column of ones for the intercept.
        """
        x = x.reshape(-1, 1)
        return np.power(x, np.arange(order+1).reshape(1, -1)) 
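    # For example, feature(np.array([0., 1., 2.]), order=2) returns
    # [[1., 0., 0.],
    #  [1., 1., 1.],
    #  [1., 2., 4.]]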
    
    I_orig = plt.imread("2Md7v.jpg")
    # Convert to grayscale
    I = rgb2gray(I_orig)
    
    # Mask out the bright (colored) region by thresholding the grayscale image
    mask = I > 20
    
    # Get coordinates of pixels corresponding to marked region
    X = np.argwhere(mask)
    
    # Use the normalized pixel intensities as weights later
    weights = I[mask] / float(I.max())
    # Convert to diagonal matrix
    W = np.diag(weights)
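    # Note: np.diag builds a dense N x N matrix (N = number of masked pixels),
    # which can become memory-hungry for large masked regions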
    
    # Column indices
    x = X[:, 1].reshape(-1, 1)
    # Row indices to predict. Note origin is at top left corner
    y = X[:, 0]
    

    We want to find the vector w that minimizes || Aw - y ||^2 (plus an l2 penalty on w), so that we can use it to predict y = w . [1, x, x^2, ..., x^d].

    Here are two versions: one is vanilla least squares with l2 regularization, and the other is weighted least squares with l2 regularization.
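    In matrix form (with A the feature matrix, W = diag(weights), and alpha the regularization strength), the two objectives and their closed-form solutions are:

    \min_w \|Aw - y\|^2 + \alpha \|w\|^2 \quad\Rightarrow\quad w = (A^T A + \alpha I)^{-1} A^T y

    \min_w (Aw - y)^T W (Aw - y) + \alpha \|w\|^2 \quad\Rightarrow\quad w = (A^T W A + \alpha I)^{-1} A^T W y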

    # Ridge regression, i.e., least squares with l2 regularization. 
    # Should probably use a more numerically stable implementation, 
    # e.g., that in Scikit-Learn
    # alpha is regularization parameter. Larger alpha => less flexible curve
    alpha = 0.01
    
    # Construct data matrix, A
    order = 3
    A = feature(x, order)
    # w = inv (A^T A + alpha * I) A^T y
    w_unweighted = np.linalg.pinv( A.T.dot(A) + alpha * np.eye(A.shape[1])).dot(A.T).dot(y)
    # w = inv (A^T W A + alpha * I) A^T W y
    w_weighted = np.linalg.pinv( A.T.dot(W).dot(A) + alpha * \
                                 np.eye(A.shape[1])).dot(A.T).dot(W).dot(y)
    

    The result

    # Generate test points
    n_samples = 50
    x_test = np.linspace(0, I_orig.shape[1], n_samples)
    X_test = feature(x_test, order)
    
    # Predict y coordinates at test points
    y_test_unweighted = X_test.dot(w_unweighted)
    y_test_weighted = X_test.dot(w_weighted)
    
    # Display
    fig, ax = plt.subplots(1, 1, figsize=(10, 5))
    ax.imshow(I_orig)
    ax.plot(x_test, y_test_unweighted, color="green", marker='o', label="Unweighted")
    ax.plot(x_test, y_test_weighted, color="blue", marker='x', label="Weighted")
    fig.legend()
    fig.savefig("curve.png")
    

    For a simple straight-line fit, set the order argument of feature to 1. You can then use the slope of the fitted line to get a sense of how close it is to a horizontal line (e.g., by checking the angle it makes with the x-axis).
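    For example, a minimal sketch (reusing x, y, feature, and alpha from the code above) that reports that angle, where 0 degrees means perfectly horizontal:

    # Fit a straight line (order 1) and report its slope angle
    A1 = feature(x, order=1)
    w1 = np.linalg.pinv(A1.T.dot(A1) + alpha * np.eye(2)).dot(A1.T).dot(y)
    angle_deg = np.degrees(np.arctan(w1[1]))  # 0 => perfectly horizontal
    print("slope: %.4f, angle: %.2f degrees" % (w1[1], angle_deg))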

    It is also possible to set this to any degree of polynomial you want. I find that degree 3 looks pretty good. In that case, 6 times the absolute value of the coefficient corresponding to x^3 (w_unweighted[3] or w_weighted[3]) is one measure of the curvature of the line, since the third derivative of the fitted cubic is the constant 6 w3.
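    As a sketch, using w_unweighted from the fit above:

    # |6 * w3| is the magnitude of the cubic fit's third derivative,
    # a simple proxy for how strongly the fitted curve bends
    curvature = 6 * abs(w_unweighted[3])
    print("curvature measure: %.6f" % curvature)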

    See "A measure for the curvature of a quadratic polynomial in Matlab" for additional details.

    [figure: fitted curves (unweighted and weighted) overlaid on the original image]