I'm currently using LOAD DATA LOCAL INFILE to import bulk financial data for processing.
The values within the file are stored to an accuracy of 30 decimal places, e.g.
125.154821679413246187945612314846
When this imported however, the data is always truncated to 10 decimal places, with trailing zeros, e.g.
125.154821679200000000000000000000
The column is set up as follows :-
USDPayable DECIMAL (33,30)
EDIT:
Table creation script:
CREATE TABLE IF NOT EXISTS dump
(
SaleID INT NOT NULL AUTO_INCREMENT,
Country VARCHAR(8),
Label VARCHAR(20),
Product VARCHAR(5),
URI VARCHAR(20),
UPC VARCHAR(20),
EAN VARCHAR(20),
ISRC VARCHAR(20),
TrackName VARCHAR(28),
ArtistName VARCHAR(64),
ComposerName VARCHAR(64),
AlbumName VARCHAR(54),
Quantity INT(10),
USDPayable decimal(33,30),
PRIMARY KEY (SaleID)
);
Load data script:
LOAD DATA
LOCAL INFILE '<my file>'
INTO TABLE dump
IGNORE 3 LINES
(Country, Label, Product, URI, UPC, EAN, ISRC,
TrackName, ArtistName, ComposerName, AlbumName,
Quantity, USDPayable)
Input data sample :
BE Label1 product code 00cflHmwefweidJA barcode ISRC ......... 1 0.003872402660862401479116078884
US Label2 product code 00cflHmtyfweidJA barcode ISRC ..........1 0.002220695558213356018688393633
BE Label2 product code 00cflHmwefweidJA barcode ISRC ..........2 0.002137613958913373918420510406
NO Label3 product code 00cflHmjkfweidJA barcode ISRC ..........3 0.02264616748080050066133527663
DE Label4 product code 00cflHmwefweidJA barcode ISRC ..........1 0.003018216435957714580945696704
CO Label5 product code 00cflHmzxfweidJA barcode ISRC ..........1 0.0004178407583000146349569881848
CA Label6 product code 00cflHmwefpoidJA barcode ISRC ..........2 0.01385864190292964399955986534
CA Label7 product code 00cflHmwefmnidJA barcode ISRC ..........1 0.003270121556795672746439239972
IS Label7 product code 00cflHmwefweidJA barcode ISRC ..........8 0.05702767311942350853930831032
TR Label7 product code 00cf09poefweidJA barcode ISRC ..........4 0.009839895102632677068730014884
UPDATE
After a while, I decided to bite the bullet - and stream the file line-by-line using PHP ready to process the values before inserting into the table. using fgets(), this value is truncated here also... It's almost as if mysql and php think that the truncated values are the literal values in the file as it's being read. Very confusing
php is a weakly typed language. If it sees something it assumes is a decimal number, it handles it as a float
-- a ieee 754 double precision approximate number -- by default. The same goes for internal numeric processing in MySQL.
ieee 754 double precision isn't precise enough for your royalty computations (pity the poor musicians; you can't buy anything with a millionth of a penny).
So you're right to declare a high-precision decimal type for your table. But you have to trick MySQL into handling your numbers as if they were strings, and not take its favorite ieee 754 shortcut (or hack, we might say).
You could try something like this:
LOAD DATA
LOCAL INFILE 'C:\\yadda\\yadda\\sample.tsv'
INTO TABLE dump
(Country, Label, Product, URI, UPC, ISRC, Quantity, @USDPayable)
SET USDPayable = CAST(@USDPayable AS DECIMAL(33,30) );
This will arrange to process your tiny fractions of dollars as a string on input, then convert it to the decimal format you need using a SET clause.
Notice how the list of columns in parentheses assigns that value from each column to @USDPayable
.
That worked for me. If it doesn't work for you you probably should consider moving to a later version of MySQL.
Notice you'll have to change the list of columns for your real data. You didn't furnish some columns in your sample tsv file.
Be careful with weak data typing with this kind of money data. Please double check that your sums are coming out right. You may need to switch to a strongly typed language.