
Python get specific value from HDF table


I have two tables. The first contains 300 rows; each row describes a case using 3 columns: two constant values that characterise the case, plus the case label. The second table holds data collected from sensors and has the same indicators as the first, but no case column. I want to determine which case each row of the second table belongs to, knowing that the sensor values do not match the reference values exactly but fall within their range.

Example:

First table:

    [[1500, 22, 0], [1100, 40, 1], [2400, 19, 2]]
    columns=['analog', 'temperature', 'case']

Second table:

    [[1420, 20], [1000, 39], [2300, 29]]
    columns=['analog', 'temperature']

How can I detect which case my first row (1420, 20) belongs to?


Solution

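  • You can simply use a classifier, for instance k-nearest neighbours (k-NN) with a single neighbour: each sensor row is assigned the case of the closest reference row.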

    import pandas as pd
    from sklearn.neighbors import KNeighborsClassifier

    # Reference cases (first table) and sensor readings to classify (second table)
    df = pd.DataFrame([[1500, 22, 0], [1100, 40, 1], [2400, 19, 2]], columns=['analog', 'temperature', 'case'])
    df1 = pd.DataFrame([[1420, 20], [1000, 39], [2300, 29]], columns=['analog', 'temperature'])

    # 1-NN with Euclidean distance (Minkowski metric, p=2)
    features = ['analog', 'temperature']
    classifier = KNeighborsClassifier(n_neighbors=1, metric='minkowski', p=2)
    classifier.fit(df[features], df['case'])
    df1['case'] = classifier.predict(df1[features])
    

    Output of df1:

       analog  temperature  case
    0    1420           20     0
    1    1000           39     1
    2    2300           29     2
    

    So the first row (1420, 20) of df1 (the second table) belongs to case 0.
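
To see what a 1-nearest-neighbour classifier with Euclidean distance is doing, the same assignment can be sketched by hand with NumPy. This is only an illustration using the example values from the question; the names `cases`, `readings`, and the extra `distance` column are assumptions introduced here, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Example data from the question (names are illustrative)
cases = pd.DataFrame([[1500, 22, 0], [1100, 40, 1], [2400, 19, 2]],
                     columns=['analog', 'temperature', 'case'])
readings = pd.DataFrame([[1420, 20], [1000, 39], [2300, 29]],
                        columns=['analog', 'temperature'])

features = ['analog', 'temperature']
ref = cases[features].to_numpy(dtype=float)     # shape (n_cases, 2)
obs = readings[features].to_numpy(dtype=float)  # shape (n_readings, 2)

# Pairwise Euclidean distances: dist[i, j] is the distance
# from reading i to reference case j (computed via broadcasting)
dist = np.linalg.norm(obs[:, None, :] - ref[None, :, :], axis=2)

# Assign each reading the label of its closest reference row
nearest = dist.argmin(axis=1)
readings['case'] = cases['case'].to_numpy()[nearest]
readings['distance'] = dist.min(axis=1)

print(readings)
```

Keeping the distance alongside the label makes it possible to flag readings that are far from every reference case. Note that with raw values the `analog` column dominates the distance because its scale is much larger than that of `temperature`; if both indicators should count equally, standardising the features first (for example with scikit-learn's `StandardScaler`) is one common option.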