Search code examples
where-clauseinformixisql

isql "select ... where 'a part of a column value' = 'a part of a column value'


I have 2 csv files. In one file I have a phone number with prices and in the second file I have a phone number with the name of its owner.

First file: file1.csv

491732234332;30,99
491723427343;12,59
491732097232;33,31

Second file: file2.csv

01732/234332;Ben Jefferson
01723/427343;Jon Doe
01732/097232;Benjamin Franklin

My problem is, that the phone number columns are formatted differently and I can not find a way to compare them.

Desired output is:

01732/234332;30,99;Ben Jefferson
01723/427343;12,59;Jon Doe
01732/097232;33,31;Benjamin Franklin

My sql statement is

create temp table FILETB1
      (phonenr    char(30),
       price      char(30)
      );

create temp table FILETB2
      (phonenr char(40),
       owner char(60)   
      );

       load from "file1.csv"
       insert into FILETB1;

       load from "file2.csv"
       insert into FILETB2;

       unload to "output.csv"

       select FILETB1.phonenr, FILETB1.price, FILETB2.owner
       from FILETB1, FILETB2
       where FILETB1.phonenr = FILETB2.phonenr

How do I have to modify my where clause to be able to compare both columns?

We are working on linux with IBM INFORMIX-SQL Version 7.50.UC5 which makes finding a working solution not easier since many functions are not supported...

Any help is highly appreciated!


Solution

  • Using just the facilities of ISQL, you can use:

    CREATE TEMP TABLE FILETB1
    (
        phonenr    CHAR(30),
        price      CHAR(30)
    );
    
    CREATE TEMP TABLE FILETB2
    (
        phonenr CHAR(40),
        owner CHAR(60)   
    );
    
    LOAD FROM "file1.csv" DELIMITER ';' INSERT INTO FILETB1;
    LOAD FROM "file2.csv" DELIMITER ';' INSERT INTO FILETB2;
    
    UNLOAD TO "output.csv" DELIMITER ';'
        SELECT FILETB2.phonenr, FILETB1.price, FILETB2.owner
          FROM FILETB1, FILETB2
         WHERE FILETB1.phonenr[3,6] = FILETB2.phonenr[2,5]
           AND FILETB1.phonenr[7,12] = FILETB2.phonenr[7,12];
    

    Testing with DB-Access, I got:

    $ dbaccess stores so-35360310.sql
    
    
    Database selected.
    
    
    Temporary table created.
    
    
    Temporary table created.
    
    
    3 row(s) loaded.
    
    
    3 row(s) loaded.
    
    
    3 row(s) unloaded.
    
    
    Database closed.
    
    $ cat output.csv
    01732/234332;30,99;Ben Jefferson;
    01723/427343;12,59;Jon Doe;
    01732/097232;33,31;Benjamin Franklin;
    $
    

    The key is using the built-in substring [start,end] operator. You compare the two parts of the phone numbers that are comparable. And you select the number from file2.csv (table FILETB2) because that's the format you wanted.

    For the sample data, of course, you could simply use Unix command line tools to do the job, but knowing how to do it inside the DBMS is helpful too.

    You could also use the SUBSTR(col, start, len) function:

    UNLOAD TO "output2.csv" DELIMITER ';'
        SELECT FILETB2.phonenr, FILETB1.price, FILETB2.owner
          FROM FILETB1, FILETB2
         WHERE SUBSTR(FILETB1.phonenr, 3, 3) = SUBSTR(FILETB2.phonenr, 2, 3)
           AND SUBSTR(FILETB1.phonenr, 7, 6) = SUBSTR(FILETB2.phonenr, 7, 6);
    

    This produces the same output from the sample data.

    If ISQL does not recognize the DELIMITER ';' clause to the UNLOAD (or LOAD) pseudo-SQL statements, then you can set the environment variable DBDELIMITER=';' before running the script and remove those clauses from the SQL.