Search code examples
sqlsql-serverdatabasesql-server-2008inner-join

SQL extract IDS based on two columns on common


i need to figure out how i can i accomplish this task that was given to me, you see, i have imported an Excel, cleaned out the information and used this information to start joining the tables i need to, when i started i realized i needed to make it very precisely so i needed the id of the data i'm using which doesn't come in this Excel document i imported (since the id are stored in the database and the Excel was built by other people who don't handle databases) so i have a workmate whom i asked about how to do this task, he told me to do the inner join on the columns in common, but the way i did it appeared an error and logically didn't work, therefore i thought extracting the id from the table they are stored would be a good idea (well maybe not) but i don't know how to do it nor if it Will work, i'll give you some examples of how the tables would look like:

table 1 
----------------------
|ID|column_a|column_b|
|1 |2234    |3       |
|2 |41245   |23      |
|3 |442     |434     |
|4 |1243    |1       |
----------------------

table 2 
---------------------------------
|creation_date|column_a|column_b|
|1/12/2018    |2234    |3       |
|4/31/2011    |41245   |23      |
|7/22/2014    |442     |434     |
|10/14/2017   |1243    |1       |
---------------------------------

as you can see, the values of the columns a and b match perfectly, so there could be a bridge between the two tables, i tried to join the data by the column a but did not work since the output was much larger that i should, i also tried doing a simple query with an IN statement but did not work either since i brought up nearly the entire databases duplicated (i'm working with big databases the table 1 contains nearly 35.000 rows and the table 2 contains nearly 10.000) extracting the ids ad if they were row files won't work since they are very different from what is in the id tables in the actual table i'm working with, so what do you think it would be the best way to achieve this task? any kind of help i would be grateful, thanks in advance.

EDIT

Based on the answer of R3_ i tried his query but adapted to my needs and worked in some cases but in others i got the cartesian product, the example i'm using is that i have in table 2 in column_a the number 1000 and column_b has number 1, table 1 has 10 ids for that combination of numbers since the 1000-1 number is not the same (technically it is, but it has stored different information and is usually differenced by the ID) so the output is either 10 rows (assuming that it is only picking those with id) or 450 not the 45 i need as result, the query i'm using is like this:

SELECT DISTINCT table_1.id, table_2.column_a, table_2.column_b --if i pick the columns from table 1 returns 10 rows if i pick them from table 2 it returns 450
    FROM            table_2
    INNER JOIN      table_1 ON table_2.column_a = table_1.column_a AND table_1.column_b = table_2.column_b
    WHERE           table_2.column_a = 1022 AND table_2.column_b = 1

so the big deal has to do with the 10 id that has that 1000-1 combination so the sql doesn't know how to identify where the id should go, how can i do to obtain those 45 i need?

also i figured out that if i do the general query, there are some rows missing, here is how i print it:

SELECT       table_1.id, table_1.column_a, table_1.column_b 
FROM         table_2 --in this case i try switching the columns i return from table 1 or 2
INNER JOIN   table_1 ON table_2.column_a = table_1.column_a AND table_2.column_b = table_1.column_b

the output of the latter example is 2666 rows and should be 2733, what am i doing wrong?


Solution

  • SELECT DISTINCT -- Adding DISTINCT clause for unique pairs of ID and creation_date
    ID, tab1.column_a, tab1.column_b, creation_date  
    FROM [table 1] as tab1
    LEFT JOIN [table 2] as tab2 -- OR INNER JOIN
    ON tab1.column_a = tab2.column_a
    AND tab1.column_b = tab2.column_b
    -- WHERE ID IN ('01', '02') -- Filtering by desired ID