I have a log file and a Perl script that parses it and loads it into a database. I want to insert all the lines (about eight thousand) in a single commit.
my script:
# Connect to the database.
my $dbh = DBI->connect(
    "DBI:mysql:database=DB;host=IP",
    "hostname", 'password',
    {'RaiseError' => 1, 'AutoCommit' => 0}
);
open (FILE, 'file.log');
while (<FILE>) {
    ($word1, $word2, $word3, $word4, $word5, $word6, $word7, $word8, $word9, $word10, $word11, $word12, $word13, $word14) = split(" ");
    $word13 =~ s/[^\d.]//g;
    if ($word2 eq "Feb") {
        $word2 = "02"
    }
    print "'$word5-$word2-$word3 $word4', $word11, $word13 \n";
    eval {
        # we could use INSERT, but there would be duplicates, and here we have a unique table
        my $sth = $dbh->prepare("INSERT INTO `test_query` (time, cstep, time_in_seconde) VALUES('$word5-$word2-$word3 $word4', $word11, $word13);");
        #print $sth->rows . " rows found.\n";
        #$sth->finish;
        # do inserts, updates, deletes, queries here
        #$sth->execute() or die "execution failed: $dbh->errstr()";
        $sth->execute() or die "execution failed: $dbh->errstr()";
        $dbh->commit();
    };
    ### If something went wrong...
}
$dbh->disconnect();
thanks
For better performance, you want to simplify your code and move as much code as possible out of the loop:

- prepare the statement outside the loop, using bind parameters: the statement is always the same, only the bind parameters change
- commit outside the loop: this will increase performance and also has the advantage of making your process atomic. As all changes occur within the same database transaction, either all lines will be processed (and committed), or, if a failure occurs on any line, no line at all will be committed. While implementing this optimization you need to watch resource usage on your database (this will typically require more space in the UNDO tablespace); if resources are not enough, either increase them or commit every Nth record, with N being as high as possible (a sketch of that variant follows the first code example below)
- avoid printing inside the loop unless you really need it (I commented that line out)
- you are building a connection with the RaiseError attribute enabled, but then you ignore errors that can occur at execute. If this is really what you want, then just disable the RaiseError attribute on the statement handle and remove the eval around execute

Other considerations in terms of coding practices:

- always use strict and use warnings
- use an array to store parsed data instead of a list of scalars: this could make your code faster and will make it more readable
Code:

use strict;
use warnings;
use DBI;

# Connect to the database.
my $dbh = DBI->connect(
    "DBI:mysql:database=DB;host=IP",
    "hostname", 'password',
    {'RaiseError' => 1, 'AutoCommit' => 0}
);

# prepare the insert statement once, outside the loop
my $sth = $dbh->prepare("INSERT INTO `test_query` (time, cstep, time_in_seconde) VALUES(?, ?, ?)");
$sth->{RaiseError} = 0;

open (my $file, '<', 'file.log') or die "could not open: $!";
while (<$file>) {
    my @words = split ' ';    # split on whitespace runs, like the original split(" ")
    $words[12] =~ s/[^\d.]//g;
    if ($words[1] eq "Feb") {
        $words[1] = "02";
    }
    # print "'$words[4]-$words[1]-$words[2] $words[3]', $words[10], $words[12] \n";
    $sth->execute( "$words[4]-$words[1]-$words[2] $words[3]", $words[10], $words[12] );
}
$dbh->commit;
$dbh->disconnect;
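If a single 8000-row transaction turns out to be too heavy for your database, here is a minimal sketch of the "commit every Nth record" variant mentioned above. The batch size of 500 is an arbitrary assumption (tune it to your resources), and the DSN and credentials are the placeholders from your post. Note the trade-off: batches already committed stay in the database even if a later batch fails, so you lose full atomicity.

Code:

use strict;
use warnings;
use DBI;

my $batch_size = 500;    # assumption: pick N as high as your resources allow

my $dbh = DBI->connect("DBI:mysql:database=DB;host=IP", "hostname", 'password',
                       {'RaiseError' => 1, 'AutoCommit' => 0});
my $sth = $dbh->prepare("INSERT INTO `test_query` (time, cstep, time_in_seconde) VALUES(?, ?, ?)");

open (my $file, '<', 'file.log') or die "could not open: $!";
my $count = 0;
eval {
    while (<$file>) {
        my @words = split ' ';
        $words[12] =~ s/[^\d.]//g;
        $words[1] = "02" if $words[1] eq "Feb";
        $sth->execute( "$words[4]-$words[1]-$words[2] $words[3]", $words[10], $words[12] );
        # commit every Nth record to keep the open transaction small
        $dbh->commit if ++$count % $batch_size == 0;
    }
    $dbh->commit;    # commit whatever remains in the last partial batch
};
if ($@) {
    # with RaiseError enabled, a failed execute dies into this handler;
    # only the current (uncommitted) batch is discarded
    warn "insert failed, rolling back current batch: $@";
    $dbh->rollback;
}
$dbh->disconnect;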
The last solution, which would probably perform even faster, is to use the DBI method execute_array to perform bulk database inserts. The ArrayTupleFetch attribute can be used to provide a code reference that DBI will invoke every time it is ready to perform the next INSERT: this code reference should read the next file line and return an array reference of values suitable for the INSERT. When the file is exhausted, the sub should return undef, which indicates to DBI that the bulk process is complete.
Code:

#!/usr/local/bin/perl
use strict;
use warnings;
use DBI;

# open the file
open (my $file, '<', 'file.log') or die "could not open: $!";

# connect to the database
my $dbh = DBI->connect("DBI:mysql:database=DB;host=IP", "hostname", 'password',
                       {'RaiseError' => 1, 'AutoCommit' => 0});

# prepare the INSERT statement
my $sth = $dbh->prepare("INSERT INTO `test_query` (time, cstep, time_in_seconde) VALUES(?, ?, ?)");

# run bulk INSERTs
my $tuples = $sth->execute_array({
    ArrayTupleStatus => \my @tuple_status,
    ArrayTupleFetch  => sub {
        my $line = <$file>;
        return unless defined $line;
        my @words = split ' ', $line;    # split the line we just read, not $_
        # ... do anything you like with the array, then ...
        return [ "$words[4]-$words[1]-$words[2] $words[3]", $words[10], $words[12] ];
    },
});
if ($tuples) {
    print "Successfully inserted $tuples records\n";
} else {
    # do something useful with @tuple_status, which contains the detailed results
}
$dbh->commit;
$dbh->disconnect;
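As a sketch of what "something useful" could be: per the DBI documentation, each entry of @tuple_status is either the row count for that tuple or an array reference of [err, errstr, state] for a failed tuple. So the else branch above (or an eval around execute_array, since RaiseError is enabled) could report which input lines failed, along these lines:

Code:

# inside the else branch: report failed tuples (@tuple_status is in scope there)
for my $i (0 .. $#tuple_status) {
    my $status = $tuple_status[$i];
    # failed tuples are array refs of [err, errstr, state], per the DBI docs
    if (ref $status eq 'ARRAY') {
        my ($err, $errstr, $state) = @$status;
        warn "tuple $i failed: $errstr (error $err, state $state)\n";
    }
}
$dbh->rollback;    # assumption: discard the whole batch on any failure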