Search code examples
google-sheetssplitarray-formulas

CSV extracted automatically into a sheet. Need help for filtering


I have data from an internal software exported automatically into a sheet. The raw data looks like this:

Report Title:, FTP - Transcript Completed Report,,,,,, Report Created By:,"Newman, Paul",,,,,, Report Generated Date/Time:,06/24/2024 07:01 AM,,,,,, Record Count:,2,,,,,, Record Count Limit:,200000,,,,,, Report Source:,data.com,,,,,, ,,,,,,, Training - Training Title contains: FTP,,,,,,, "AND Transcript - Transcript Status is one of: Completed , Completed (Equivalent) ",,,,,,, ,,,,,,, User - User Full Name,User - Location,User - Manager - User Full Name,Training - Training Title,Transcript - Transcript Status,Transcript - Transcript Completed Date,Transcript - Transcript Assigned Date,User - User Status "Bob, Milton Jones",France - Location A,"Steffen, Jones",FTP: Location B - Lesson 1A: Building Layouts,Completed,06/19/2024 03:30 PM,06/19/2024 03:06 PM,Active "Jules, Finnegan",Ireland - Location B,"Miles, Morales",FTP: Location B - Lesson 1A: Building Layouts,Completed,06/20/2024 07:00 AM,06/20/2024 07:15 PM,Active

The datas spread into the sheet by using columns. The more people in the report, the more columns will be used.The datas will keep on coming in a different row for every update.

What I am trying to achieve is isolate the following on another tab:

Name Location Training name Date delivered Date completed
Bob, Milton Jones France - Location A Location B - Lesson 1A: Building Layouts 06/19/2024 03:06 PM 06/19/2024 03:30 PM
Jules, Finnegan Ireland - Location B Location B - Lesson 1A: Building Layouts 06/20/2024 07:00 AM 06/20/2024 07:15 PM

I tried Split, Byrow and more but cannot isolate these data the way I want Anybody can help?

Here is a test sheet showing the layout in the sheet: Test sheet


Solution

  • SUGGESTION

    Note: This answer is a workaround since there's no pattern in the cells of the provided data.

    OPTION 1

    Using Google Sheets Formulas, you may try this for:

    A2
    =SPLIT(RIGHT(Sheet1!BZ2,LEN(Sheet1!BZ2)-FIND("""",Sheet1!BZ2)) & "," & " " & LEFT(Sheet1!CA2,LEN(Sheet1!CA2)-1) & " " & "|" & " " & Sheet1!CB2 & " " & "|" & " " & REGEXREPLACE(Sheet1!CD2, "FTP:", "") & " " & "|" & " " & Sheet1!CF2 & " " & "|" & " " & Sheet1!CG2, "|", TRUE, TRUE)
    
    A3
    =SPLIT(RIGHT(Sheet1!CH2,LEN(Sheet1!CH2)-FIND("""",Sheet1!CH2)) & "," & " " & LEFT(Sheet1!CI2,LEN(Sheet1!CI2)-1) & " " & "|" & " " & Sheet1!CJ2 & " " & "|" & " " & REGEXREPLACE(Sheet1!CL2, "FTP:", "") & " " & "|" & " " & Sheet1!CN2 & " " & "|" & " " & Sheet1!CO2, "|", TRUE, TRUE)
    

    EXPLANATION

    I've used the vertical bar | as the delimiter for the SPLIT function to separate the data in the columns. The only concern is that you'll have to manually enter the reference cells into it for the other rows.

    OUTPUT

    image

    OPTION 2

    Using Google Apps Script, assuming that the data sets only use 8 cells, unlike in the example that uses 8 in the first and 9 in the second, you may use this script to achieve what you'd like to do:

    I have also added some explanations of what the code does

    const myFunction = () => {
      // Gets your current active spreadsheet
      var ss = SpreadsheetApp.getActiveSpreadsheet();
      // Gets the spreadsheet named "Sheet1"
      var sh1 = ss.getSheetByName("Sheet1");
      // Gets the spreadsheet named "Sheet2"
      var sh2 = ss.getSheetByName("Sheet2");
      // Gets the first row values from "BZ2" to the last column
      var vl = sh1.getRange(2, 78, 1, ss.getLastColumn() - 1).getValues()[0];
      // Filters the data and removes empty spaces
      var fd = vl.filter(dt => dt != "");
      // Empty array
      var op = [];
      // Loops through the filtered data in groups of 8
      for (var i = 0; i < fd.length; i += 8) {
        // Separates the data to have 8 cells each
        var arr = vl.slice(i, i + 8);
        // Removes the first quotation mark as well as everything before it
        arr[0] = arr[0].replace(/^.*"/, '"').replace(/"/, "");
        // Removes the second quotation mark
        arr[1] = arr[1].replace(/"/g, "");
        // Combines the processed first element with the second, adding a comma separator
        arr.unshift(arr[0] + ", " + arr[1]);
        // Removes the "[1, 1, 2, 3]" elements
        [1, 1, 2, 3].forEach(i => {
          arr.splice(i, 1);
        });
        // Removes "FTP:" from the third element
        arr[2] = arr[2].replace(/^FTP: /, "");
        // Pushes the processed data into the output array
        op.push(arr);
      }
      // Sets the values of the combined data in "Sheet2"
      sh2.getRange(2, 1, op.length, op[0].length).setValues(op);
    }
    

    Make sure that you go to Extensions > Apps Script and run the function there since it'll ask for authorization.

    OUTPUT

    image

    Active is the 9th cell in the second example, considered by the code as the first of a new group, which is why it's on A4.