
Circular data in machine learning algorithms


I have circular data (the hue component of an HSL color) and I need to use it as a predictor in a machine learning algorithm. How can I convert it to a regular continuous variable?

To clarify the problem, suppose we have a red object. Its hue values fall into two separate ranges, [0, 60] and [300, 359]. Many machine learning algorithms compute the mean (average) of a predictor. For this data the mean will land in the range [150, 210], which corresponds to cyan! That happens because hue is circular data.
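To make the problem concrete, here is a minimal sketch of the naive average. The hue samples are made up for illustration (they are not measurements); they sit on both sides of the 0/360 seam, as hues of a red object would.

```python
# Hypothetical hue samples (in degrees) for a red object,
# spread on both sides of the 0/360 wrap-around point.
red_hues = [350.0, 10.0, 355.0, 5.0]

# Plain arithmetic mean, treating hue as an ordinary linear variable.
naive_mean = sum(red_hues) / len(red_hues)

print(naive_mean)  # 180.0, a cyan hue, even though every sample is red
```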

Any help would be appreciated!


Solution

  • Decompose the one-dimensional circular data into two-dimensional data: x, y or cos θ / sin θ.

    Imagine time as data.

    11:59.35... PM (14399) is a minute away from 12:00 AM (00000),

    but the algorithm interprets 14399.35.. as far away from 00000, when in fact the two values should be close.

    The option I suggest is to map the data onto points on a unit circle. From there, there are two ways to transform the data.

    1. Get the x, y coordinates of the data on the unit circle, e.g. 14399.35 = [-0.01, 0.99], 00000.00 = [0.00, 1.00]

    2. Get the sin/cos of the points on the unit circle with respect to the center, e.g. 14399.35 = [0.1, -0.9], 00000.00 = [0.89, -0.4]

    Thus we get a result where the circular data has values that are comparable with each other.

    Note: these are not the exact values; they are only here for demonstration.
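The unit-circle mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `encode_circular` and `circular_mean` are hypothetical, the hue samples are made up, and the time example assumes time of day is measured in seconds with a period of 86400, which is not the scale used in the rough numbers above.

```python
import math

def encode_circular(value, period):
    """Map a circular value to a point (cos, sin) on the unit circle.

    `period` is the length of one full cycle: 360 for hue in degrees,
    or 86400 for time of day in seconds (an assumption of this sketch).
    """
    theta = 2.0 * math.pi * (value % period) / period
    return math.cos(theta), math.sin(theta)

def circular_mean(values, period):
    """Average the encoded points, then turn the mean vector back into a value."""
    points = [encode_circular(v, period) for v in values]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    angle = math.atan2(mean_y, mean_x)  # angle of the mean vector, in (-pi, pi]
    return (angle / (2.0 * math.pi) * period) % period

# Hypothetical red hues on both sides of the 0/360 seam.
red_hues = [350.0, 10.0, 355.0, 5.0]

# Two features per sample; these replace the single hue column as predictors.
hue_features = [encode_circular(h, 360.0) for h in red_hues]

# The mean now stays near 0 (equivalently 360), which is red, not cyan.
mean_hue = circular_mean(red_hues, 360.0)

# One second before midnight and midnight itself end up close together.
late = encode_circular(86399, 86400)
midnight = encode_circular(0, 86400)
gap = math.dist(late, midnight)
```

Both components have to be fed to the model together: the cosine or the sine on its own is ambiguous, because two different angles share the same value.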