AppleScript: Checking a list for an existing item


I can't figure out why this script is not working. I have a list of almost two thousand entries (categories) but there are many duplicates. I'm just trying to create a list of unique categories but I can't seem to get it to work.

Background: I am reading a CSV file that has a column titled CATEGORIES. I read the file, split it on newlines to get an array of rows, loop through the array, split each row on commas, and take the content of the CATEGORIES column. Each entry can be a single category, multiple categories entered as CAT1; CAT2, or, just to be more annoying, CAT1 > CAT2. Here's my code. For now I am ignoring the third form (with the > symbol) until I get the rest working.

...
set arrCategories to {}

set theCats to item 14 of arrThisLine
set oatd to AppleScript's text item delimiters
set AppleScript's text item delimiters to ";"
set subCats to every text item of theCats
        
repeat with thisSubCat in subCats
    if thisSubCat does not contain ">" then
        if arrCategories contains thisSubCat then
        else
            copy thisSubCat to end of arrCategories
            log arrCategories
        end if
    end if
end repeat
set AppleScript's text item delimiters to oatd

The log looks like this. Eventually there are thousands of entries in arrCategories (I have roughly 1000 lines in the CSV to loop over):

(*Design*)
(*Design, Design*)
(*Design, Design, Design*)
(*Design, Design, Design, Design*)
(*Design, Design, Design, Design, Revenue*)
(*Design, Design, Design, Design, Revenue, Learning & Development*)
(*Design, Design, Design, Design, Revenue, Learning & Development,  Product & Engineering*)
(*Design, Design, Design, Design, Revenue, Learning & Development,  Product & Engineering, Product & Engineering*)

I am sure it is just something simple I am missing but I cannot figure out why it is not picking up the duplicates. Any help would be appreciated.


Solution
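
  • The loop in the question most likely fails because repeat with thisSubCat in subCats makes thisSubCat a reference (item 1 of subCats, and so on) rather than the text itself, so the arrCategories contains thisSubCat test does not match the text already stored. Taking contents of thisSubCat before comparing should fix that. The log also shows an entry with a leading space (" Product & Engineering"), so trimming each entry is probably needed as well. A minimal sketch in plain AppleScript follows; sampleRows and the trimSpaces handler are made up for illustration and stand in for the real CSV data:

    ```applescript
    -- Hypothetical helper: strips leading and trailing spaces from a string
    on trimSpaces(t)
        repeat while t begins with " "
            if (count of t) is 1 then return ""
            set t to text 2 thru -1 of t
        end repeat
        repeat while t ends with " "
            set t to text 1 thru -2 of t
        end repeat
        return t
    end trimSpaces

    -- Hypothetical sample data: the CATEGORIES value of four CSV rows
    set sampleRows to {"Design", "Design; Revenue", "Learning & Development; Product & Engineering", "Design > UX"}

    set arrCategories to {}
    set oatd to AppleScript's text item delimiters
    set AppleScript's text item delimiters to ";"

    repeat with thisRow in sampleRows
        set subCats to every text item of (contents of thisRow)
        repeat with thisSubCat in subCats
            -- Dereference the loop variable so text is compared, not a reference
            set catText to my trimSpaces(contents of thisSubCat)
            if catText does not contain ">" and catText is not "" then
                if arrCategories does not contain catText then
                    set end of arrCategories to catText
                end if
            end if
        end repeat
    end repeat

    set AppleScript's text item delimiters to oatd
    return arrCategories
    --> {"Design", "Revenue", "Learning & Development", "Product & Engineering"}
    ```

    Note that arrCategories is initialised once, before the loop over the rows, so it keeps accumulating across the whole file.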

  • On Yosemite or later systems you can avoid repeat loops by using AppleScriptObjC:

    use AppleScript version "2.4" -- Yosemite or later
    use scripting additions
    use framework "Foundation"
    
    -- .... INSERT HERE THE BEGINNING OF YOUR SCRIPT
    
    set theCats to item 14 of arrThisLine
    set oatd to AppleScript's text item delimiters
    set AppleScript's text item delimiters to ";"
    set subCats to every text item of theCats
    set AppleScript's text item delimiters to oatd
    
    -- remove items with ">"
    set stringArray to current application's NSArray's arrayWithArray:subCats
    set thePred to current application's NSPredicate's predicateWithFormat:"NOT (SELF LIKE '*>*')"
    set bList to (stringArray's filteredArrayUsingPredicate:thePred) as list
    
    -- remove duplicates
    set aSet to current application's NSOrderedSet's orderedSetWithArray:bList
    set arrCategories to (aSet's array()) as list
    

    I tested the following script:

    use AppleScript version "2.4" -- Yosemite (10.10) or later
    use framework "Foundation"
    use scripting additions
    
    set theCats to {"Design", "Design", "Design > Design", "Revenue", "Learning & Development", "Product & Engineering", "Product & Engineering"}
    
    -- remove duplicates, retaining list's order
    set aSet to current application's NSOrderedSet's orderedSetWithArray:theCats
    set aList to (aSet's array()) as list
    
    -- remove strings with ">"
    set stringArray to current application's NSArray's arrayWithArray:aList
    set thePred to current application's NSPredicate's predicateWithFormat:"NOT (SELF LIKE '*>*')"
    set arrCategories to (stringArray's filteredArrayUsingPredicate:thePred) as list
    
    --> {"Design","Revenue","Learning & Development","Product & Engineering"}
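
    Both snippets above work on a single list, so in the real script arrCategories would be rebuilt for every CSV row, and entries that differ only by a leading space would still count as distinct. One possible way to collect unique categories across all rows is to keep a single NSMutableOrderedSet and add each row's trimmed entries to it. This is only a sketch: sampleRows is made-up data standing in for item 14 of each row, and it brings back one repeat loop over the rows.

    ```applescript
    use AppleScript version "2.4" -- Yosemite (10.10) or later
    use framework "Foundation"
    use scripting additions

    -- Hypothetical sample data: the CATEGORIES value of four CSV rows
    set sampleRows to {"Design", "Design; Revenue", "Learning & Development; Product & Engineering", "Design > UX; Product & Engineering"}

    -- One ordered set shared by all rows keeps first-seen order and drops duplicates
    set uniqueCats to current application's NSMutableOrderedSet's orderedSet()
    set thePred to current application's NSPredicate's predicateWithFormat:"NOT (SELF LIKE '*>*')"
    set ws to current application's NSCharacterSet's whitespaceCharacterSet()

    repeat with thisRow in sampleRows
        set rowString to (current application's NSString's stringWithString:(contents of thisRow))
        set parts to (rowString's componentsSeparatedByString:";")
        -- Drop entries containing ">" and convert back to an AppleScript list
        set kept to (parts's filteredArrayUsingPredicate:thePred) as list
        repeat with aPart in kept
            set rawText to (current application's NSString's stringWithString:(contents of aPart))
            -- Trim surrounding whitespace so " Revenue" and "Revenue" are treated as the same entry
            set trimmed to (rawText's stringByTrimmingCharactersInSet:ws)
            (uniqueCats's addObject:trimmed)
        end repeat
    end repeat

    set arrCategories to (uniqueCats's array()) as list
    ```

    The ordered set does the duplicate check itself, so no contains test is needed in the loop.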