Search code examples
excelstatesvba

VBA Excel Populating cells based on previous existence


I haven't seen this addressed yet, but I think that might be because I don't know how to phrase my problem concisely. Here's an example of what I'd like to try and do:

Given a column which holds state initials check output sheet if that state has been found before. If it hasn't then populate a new cell with that state's initials and initialize the count (number of times state has been found) to one. If the state's initials are found in a cell within the output sheet then increment the count by one.

With this, if we have a 50,000 (or however many) lined excel sheet that has states in random order (states may or may not be repeated) we will be able to create a clean table which outputs which states are in the raw data sheet and how many times they appeared. Another way to think about this is coding a pivot table, but with less information.

There's a couple of ways that I've thought about how to complete this, I personally think none of these are very good ideas but we'll see.

Algorithm 1, all 50 states:

  1. Create 50 string variables for each state, create 50 long variables for the counts
  2. Loop through raw data sheet, if specific state found then increment appropriate count (this would require 50 if-else statements)
  3. Output results

Overall..... terrible idea

Algorithm 2, flip-flop:

  1. Don't create any variables
  2. If a state is found in raw data sheet , look in output sheet to check if state has been found before
  3. If state has been found before, increment cell adjacent by one
  4. If state has not been found before, change next available blank cell to state initials and initialize cell adjacent to one
  5. Go back to raw data sheet

Overall..... this could work, but I feel as if it would take forever, even with raw data sheets that aren't very big but it has the benefit of not wasting memory like the 50 states algorithm and less lines of code

On a side note, is it possible to access a workbook's (or worksheet's) cells without activating that workbook? I ask because it would make the second algorithm run much quicker.

Thank you,

Jesse Smothermon


Solution

  • A couple of point that will speed up your code:

    1. You don't need to active workbooks, worksheets or ranges to access them eg

      DIM wb as workbook  
      DIM ws as worksheet  
      DIM rng as range
      
      Set wb = Workbooks.OpenText(Filename:=filePath, Tab:=True) ' or Workbooks("BookName")  
      Set ws = wb.Sheets("SheetName")  
      Set rng = ws.UsedRange ' or ws.[A1:B2], or many other ways of specifying a range  
      

    You can now refer to the workbook/sheet/range like

    rng.copy
    for each  cl in rng.cells
    etc
    
    1. Looping through cells is very slow. Much faster to copy the data to a variant array first, then loop through the array. Also, when creating a large amount of data on a sheet, better to create it in a variant array first then copy it to the sheet in one go.

      DIM v As Variant
      v = rng
      

    eg if rng refers to a range 10 rows by 5 columns, v becomes an array of dim 1 to 10, 1 to 5. The 5 minutes you mention would probably be reduced to seconds at most