java, performance, processing-efficiency

Most efficient way to compare two datasets and find results


I have two data sets that come from SQL tables and look like the examples below. Table A contains three possible values for a given item, and Table B contains the full path to a file.

Table A:
Column1    Column2        Column3
Value     SecondValue     ThirdValue
Value2    SecondValue2    ThirdValue2
Value3    SecondValue3    ThirdValue3

Table B:
Column1
PathToFile1\value.txt
PathToFile2\SecondValue2_ThirdValue.txt
PathToFile3\ThirdValue3_Value3.txt

I can extract any of the tables or columns to text, and I will use Java to find the full path (Table B) that contains any combination of the values in a row of Table A. Table B can hold values such as c:\directory\file.txt, c:\directory\directory2\filename.txt, or c:\filename.txt.

What is the most efficient way to search for the paths, given the filename?

I have two ideas from coworkers, but I am not sure if they are the optimal solution.

1. Store the file name and path parsed from Table B in a hash map, then look up the paths using the values from Table A as keys, repeating the lookup for each column of A.

2. Sort both data sets alphabetically and run a binary search over the sorted order.
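
The first idea can be sketched roughly as follows. This is only a sketch under assumptions that the question does not state: the class name PathIndex and its methods are hypothetical, the file name is taken to be everything after the last backslash, the extension is dropped, the remaining name is split on underscores, and matching ignores case (the sample data pairs Value with value.txt).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: indexes Table B paths by the tokens in their file names.
class PathIndex {

    // Lower-cased file-name token -> every path whose file name has that token.
    private final Map<String, List<String>> pathsByToken = new HashMap<>();

    PathIndex(List<String> tableBPaths) {
        for (String path : tableBPaths) {
            for (String token : tokens(path)) {
                pathsByToken.computeIfAbsent(token, k -> new ArrayList<>()).add(path);
            }
        }
    }

    // Assumption: Windows-style paths, one extension, underscore-separated tokens.
    static String[] tokens(String path) {
        String name = path.substring(path.lastIndexOf('\\') + 1);
        int dot = name.lastIndexOf('.');
        if (dot > 0) {
            name = name.substring(0, dot);
        }
        return name.toLowerCase(Locale.ROOT).split("_");
    }

    // Returns the paths that match any value of one Table A row, in
    // first-seen order and without duplicates.
    List<String> lookup(String... rowValues) {
        Set<String> hits = new LinkedHashSet<>();
        for (String value : rowValues) {
            String key = value.toLowerCase(Locale.ROOT);
            hits.addAll(pathsByToken.getOrDefault(key, Collections.emptyList()));
        }
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        List<String> tableB = new ArrayList<>();
        tableB.add("PathToFile1\\value.txt");
        tableB.add("PathToFile2\\SecondValue2_ThirdValue.txt");
        tableB.add("PathToFile3\\ThirdValue3_Value3.txt");

        PathIndex index = new PathIndex(tableB);
        System.out.println(index.lookup("Value", "SecondValue", "ThirdValue"));
        System.out.println(index.lookup("Value2", "SecondValue2", "ThirdValue2"));
        System.out.println(index.lookup("Value3", "SecondValue3", "ThirdValue3"));
    }
}
```

After one pass over Table B to build the map, each lookup takes expected constant time per value, whereas the sorted approach pays for the sort and then O(log n) per binary search. Note that a token index like this only finds values that appear as whole underscore-delimited parts of the file name; if a value may appear as an arbitrary substring of the path, this sketch would miss it.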

CLARIFICATION:

The path to the file in Table B can contain any one of the values from the columns in Table A. That is how the two tables relate. The solution eventually has to run in Java, so I wanted to explore the options in Java, even though I know SQL would be faster at relating the data. I also added some info to the table section. Please let me know if more info is needed.


Solution

  • I found this article helpful in working out my answer, although it does not answer my question specifically. I think the information in it can lead to the optimal practice.

    http://www.javacodegeeks.com/2010/08/java-best-practices-vector-arraylist.html