CSV Find Nearly the same value and make them identical and filter the values

Community!

Story: I am trying to upload a CSV file with a huge batch of products to my e-commerce shop. But there are many very similar products, but all with every column slightly different. And luckily the plugin I use can handle this, but it needs the same title for the entire product range or some reference to its parent product. The reference is sadly not there.

Now I want to know how I can find values in a CSV file that are nearly the same (in SQL there was something called '%LIKE%') to structure the table appropriately. I can hardly describe what I want to achieve, but here is an example for what I'm looking for.

I basically want to transform this table:

+---------------+---------------+---------------+---------------+
|      ID       |     Title     |      EAN      |      ...      |
+---------------+---------------+---------------+---------------+
|      1        | AquaMat 3.6ft |   1234567890  |      ...      |
+---------------+---------------+---------------+---------------+
|      2        | AquaMat 3.8ft |   1234567891  |      ...      |
+---------------+---------------+---------------+---------------+
|      3        |  AquaMat 4ft  |   1234567892  |      ...      |
+---------------+---------------+---------------+---------------+

into this:

+---------------+---------------+---------------+---------------+
|      ID       |     Title     |      EAN      |      ...      |
+---------------+---------------+---------------+---------------+
|      1        |    AquaMat    |   1234567890  |      ...      |
+---------------+---------------+---------------+---------------+
|      2        |    AquaMat    |   1234567891  |      ...      |
+---------------+---------------+---------------+---------------+
|      3        |    AquaMat    |   1234567892  |      ...      |
+---------------+---------------+---------------+---------------+

The extra data can be scraped. Can I do this with Excel? With Macros? With Python?

Thank you for taking time and reading this.

If you have any questions, than feel free to ask.

EDIT:

The Title column contains products with completely different names and might even contain more whitespaces. And some products might have 1 attribute but others have up to 3 attributes. But this can be sorted manually.

And with nearly the same I mean as you can see in the table. The Title's are basically the same but not identical. I want to remove the attributes from them. Also, there are no other columns with any more details, only numbers and the attributes that I am trying to cut of the title!!!

Solution

Here's an idea using Python and .split():

import csv

with open('testfile.csv', 'r', encoding="utf-8-sig") as inputfile:

 csv_reader = csv.reader(inputfile, delimiter=',')
   with open('outputfile.csv', 'w', newline='') as outputfile:
      w = csv.writer(outputfile)
      header=['ID','Title','EAN','Product','Attr1','Attr2','Attr3']
      w.writerow(header)
      for row in csv_reader:
          if row[0]=='ID':
              header_row=True
              pass
          else:
              header_row=False
              list=row[1].split()
              for item in list:
                  # if you want, you can add some other conditions on the attribute (item) in here
                  row.append(item)
              if not header_row:
                  print('row: {}'.format(row))
                  w.writerow(row)