Using KNIME, I am trying to remove duplicate values within each row across a set of columns, using the GroupBy node. Can you tell me how to implement this, or whether I can use another node to get this done?
I have divided my table into sets of columns: set 1 is Col1, Col2, Col3, Col4; set 2 is Col5, Col6, Col7, Col8; and so on, for 10 sets of 4 columns each. Now I want to check whether the same value appears more than once within any particular set. For example, say a row has these values in set 1: Col1 has 4, Col2 has 4, Col3 has 4, Col4 has 4.
Then I will keep Col1 as 4, and the values in Col2, Col3, and Col4 will be 'null'.
Can you please tell me how to do this with the GroupBy node in KNIME?
I have tried other nodes such as Constant Value Column Filter, Math Formula, and Rule Engine, but nothing seems to be working.
You can't do this with a GroupBy node. GroupBy can give you unique values, but here you need logic that decides whether a value is a duplicate of another column in the same row and, if so, replaces it with null or some other identifier. I advise using a Rule Engine node with the following rules for the last column:
$column4$ MATCHES $column1$ OR $column4$ MATCHES $column2$ OR $column4$ MATCHES $column3$ => "null"
TRUE => $column4$
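After that, add two more Rule Engine nodes with the corresponding rules for column3 (compared against column1 and column2) and column2 (compared against column1). You don't need to do anything for column1, since that is the value being kept.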
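To make the intended row-wise logic concrete, here is a minimal sketch in plain Python. It is not KNIME code. The function name `blank_repeats_in_sets` and the `set_size` parameter are made up for illustration, and it assumes each row is a flat list whose columns are already ordered set by set (Col1..Col4, Col5..Col8, and so on). It uses `None` in place of the "null" marker.

```python
from typing import Any, List, Optional


def blank_repeats_in_sets(row: List[Any], set_size: int = 4) -> List[Optional[Any]]:
    """Within each consecutive group of `set_size` columns, keep the first
    occurrence of a value and replace later repeats with None.

    Hypothetical helper for illustration only; values must be hashable.
    """
    result: List[Optional[Any]] = []
    for start in range(0, len(row), set_size):
        seen = set()  # values already encountered in this set of columns
        for value in row[start:start + set_size]:
            if value is not None and value in seen:
                result.append(None)   # repeat of an earlier column in the set
            else:
                result.append(value)  # first occurrence (or already empty)
                if value is not None:
                    seen.add(value)
    return result


# Two sets of four columns in one row
print(blank_repeats_in_sets([4, 4, 4, 4, 1, 2, 1, 3]))
```

Each column is compared only against the columns to its left within the same set, which mirrors the chain of Rule Engine nodes described above: column4 is checked against columns 1 to 3, column3 against columns 1 and 2, and column2 against column1.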