Search code examples
openrefine

Easiest way to merge rows in Google Refine (OpenRefine) if all columns are identical


I'm cleaning data with OpenRefine (was Google Refine) from multiple sources. I have files from different sources which contain companies, column definitions are identical i.e.

UNID  | Name      | Street    | City    | Country   | Phone | ...
sg52d | Company a | A street  | a city  | c country | 12345
sg52d | Company a | A street  | a city  | c country | 0099835
dfnsd | Company B | B Street  | City B  | c country | 33445
dfnsd | Company B | Different | Another | c country | 33445
xxbb3 | Company C | C Street  | City B  | Country A | 1111
xxbb3 | Company C | C Street  | City B  | Country A | 1111

What I want is this result (only the last Company is merged, all columns were identical)

UNID  | Name      | Street    | City    | Country   | Phone | ...
sg52d | Company a | A street  | a city  | c country | 12345
sg52d | Company a | A street  | a city  | c country | 0099835
dfnsd | Company B | B Street  | City B  | c country | 33445
dfnsd | Company B | Different | Another | c country | 33445
xxbb3 | Company C | C Street  | City B  | Country A | 1111

Is there a simple way to do this?

I understand that I can concatenate all columns into a new column, but this is a little PITA, because of the number of columns.

Perhaps there is a way for the new column definition to loop through all other columns and merge it?


Solution

  • You could create new column with an expression like:

    forEach(["UNID", "Name", "Street", "City", "..." ],x,cells[x].value).join("")