The table Locations has the following items :
The problem is that there are some rows which are "semi-repeated" (all the elements are equals except for the attribute attb that's an integer). I want to delete all repeated rows and append all the attributes attb, separated by commas.
For example if I have these two rows :
Attb | City | County | Latitude | Longitude
--------------------------------------------
1 | c1 | co1 | l1 | long1
--------------------------------------------
2 | c1 | co1 | l1 | long1
Using OpenRefine, I want to delete the second row and append the attb value of the second row to the first row attb'value. So the output expected should be :
Attb | City | County | Latitude | Longitude
--------------------------------------------
1,2 | c1 | co1 | l1 | long1
Let me know if I was clear enough.
Thanks in advance.
Create a key for each row which combines the values that will be replicated. You can do this using:
Edit Column->Add column based on this column
Then use a GREL expression something like:
cells["City"].value + cells["County"].value + cells["Latitude"].value + cells ["Longitude"].value
Call the column 'Key' or similar.
Note that the key doesn't include the content of the Attb column. Move the new 'Key' column to be the first column in the project Sort by the 'Key' column and apply the Sort permanently Use 'Blank down' on the Key column
Make sure OR is in 'Record' mode (towards top right of grid). You should see that those duplicate rows are now part of the same OR record.
On the Attb column use Edit Cells->Join Multi-valued cells and join them with a comma
Remove the 'Key' column and switch OpenRefine back into 'Row' mode. Now use 'Facet by blank' on the Attb column, find those rows with a blank Attb and remove those rows.
You should now have the merged rows as you want.