I have a "flat file" with structure as below:
machineCode,Key,Ip_Name_No,Share_Percent,Account_Name,Account_No
"ygh048GT",4767,534293748,"100.00","cderfgdsc Publishing International Ltd","160102040"
"xcd064HW",6380,65424090,"100.00","dascdfrgh snm skion","00090382478"
"000065AN",6402,65424090,"100.00","xcdertn,john sean","00090382478"
.....
The first row contains the column headings. As can be seen, the fields are separated by commas. The requirement is to split each single string into separate fields.
This could be done in Excel (Text to Columns with a comma delimiter) and then uploaded to a DB table, but the Account_Name field can contain commas within the values themselves.
So, I came up with the SQL below. Question is, does this look correct? Also, there must be an easier way to do this; any suggestions?
WITH POS AS (
select
LOCATE_IN_STRING ( DATA , ',' , 2 ) - 1 AS TUNECODE_END ,
LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) - 1 AS WORKKEY_END,
LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) - 1 AS IPNN_END,
LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) - 1 AS PERC_END,
CASE WHEN
SUBSTR ( DATA ,
(
LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) + 1 ) ) + 1),
1) <> '"'
THEN
LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) + 1 ) ) + 1)) - 1
ELSE
LOCATE_IN_STRING ( DATA , ',' ,
(LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING( DATA , ',' , (LOCATE_IN_STRING (DATA, ',' , LOCATE_IN_STRING ( DATA , ',' , 2 ) + 1) + 1) ) + 1) ) + 1 ) ) - 1
END AS ACNAME_END,
RRN(P) ROWN
FROM PLDWRK P
) SELECT
CAST(SUBSTR(DATA, 1, TUNECODE_END) AS CHAR(25)) AS MACHINECODE,
CAST(SUBSTR(DATA, TUNECODE_END + 2, WORKKEY_END - (TUNECODE_END + 1)) AS DEC(12,0)) AS KEY,
CAST(SUBSTR(DATA, WORKKEY_END + 2, IPNN_END - (WORKKEY_END + 1)) AS DEC(12,0)) AS IP_NN,
CAST(SUBSTR(DATA, IPNN_END + 2, PERC_END - (IPNN_END + 1)) AS CHAR(8)) AS PERCENTAGE,
CAST(SUBSTR(DATA, PERC_END + 2, ACNAME_END - (PERC_END + 1)) AS CHAR(100)) AS ACCOUNT_NAME,
CAST(SUBSTR(DATA, ACNAME_END + 2) AS CHAR(30)) AS ACCOUNT_NUMBER
FROM PLDWRK P JOIN POS ON ROWN = RRN(P)
Lots of things are easier than raw SQL...
Why not simply use Copy From Import File (CPYFRMIMPF)? That's what it's designed for.
CPYFRMIMPF FROMSTMF('/inbound/somedata.csv') TOFILE(MYLIB/MYTABLE) MBROPT(*REPLACE) RCDDLM(*CRLF) DTAFMT(*DLM) STRDLM(*DBLQUOTE) RMVCOLNAM(*YES)
You'll have to transfer the stream file data into the IFS (where it really belongs) instead of a DB table.
IBM's Access Client Solutions (ACS) includes data transfer functionality that can understand .CSV files. This can be automated and can in fact run on either a PC or the IBM i itself.
Another great option would be an RPG program. Back in 2008, Scott Klement wrote a CSV parser in RPG. He's since enhanced it to make it easier to use by taking advantage of RPG's DATA-INTO op-code.
Lastly, it's 2023... Node.js, PHP, and Python are all available on the IBM i, and all of them have libraries/packages to handle CSV and write to a DB table.
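To illustrate the last option, here is a minimal Python sketch using only the standard-library csv module, which correctly handles the quoted Account_Name values with embedded commas that break a naive comma split (the sample rows are copied from the question; the actual database write is omitted, but a package such as ibm_db could take it from here):

```python
import csv
import io

# Sample data from the flat file; note the comma inside the
# quoted Account_Name on the last row.
raw = """machineCode,Key,Ip_Name_No,Share_Percent,Account_Name,Account_No
"ygh048GT",4767,534293748,"100.00","cderfgdsc Publishing International Ltd","160102040"
"000065AN",6402,65424090,"100.00","xcdertn,john sean","00090382478"
"""

# DictReader uses the first row as column headings and respects
# double-quoted fields, so embedded commas stay inside one field.
reader = csv.DictReader(io.StringIO(raw))
rows = list(reader)

for row in rows:
    print(row["machineCode"], "|", row["Account_Name"])
```

In a real job you would open the stream file with `open('/inbound/somedata.csv', newline='')` instead of the inline string, then loop over the rows executing a parameterized INSERT.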