Well, this is the weirdest filesystem-related issue I have had in a long time.
I have a script that connects to a remote IMAP server, downloads emails, marks them as read, and strips out the junk so that only .txt and .xml files are kept. If a file is .txt, the script uses Text::Unaccent to remove accents.
This is done with a 1-to-1 mapping from each remote IMAP folder to a local CIFS-mounted folder on this server. The remote IMAP download and the accent handling work just fine.
My problem is: if I download the file, handle the accents, and move it to a CIFS-mounted directory, the file gets truncated (the last 4 to 10K is missing). If I move it to another partition on the same machine, the files are moved sanely (same md5sum, same file size, no changes reported by diff).
The chunk of code that removes the accents and moves the file:
# If the file extension is .txt
if ("$temp_dir/$arquivo" =~ /txt$/i){
    # Read the file line by line into an array
    open (LEITURA, "$temp_dir/$arquivo");
    @manipular = <LEITURA>;
    close LEITURA;
    # Open the same file for writing with another filehandle
    open (ESCRITA, ">", "$temp_dir/$arquivo");
    foreach $manipula_linha (@manipular){
        # Replace & with e and remove accents
        $manipula_linha =~ s/\&/e/g;
        $manipula_linha = unac_string("UTF-8", $manipula_linha);
        print ESCRITA $manipula_linha;
    };
};
# Copy the temp file to its final destination. On cifs this breaks
# move does not work either...
copy "$temp_dir/$arquivo", "$dest_file";
unlink "$temp_dir/$arquivo";
Cifs version:
[root@server mail_downloader]# modinfo cifs
filename: /lib/modules/2.6.18-409.el5.centos.plus/kernel/fs/cifs/cifs.ko
version: 1.60RH
description: VFS to access servers complying with the SNIA CIFS Specification e.g. Samba and Windows
license: GPL
Perl version:
[root@server mail_downloader]# perl --version
This is perl, v5.8.8 built for i386-linux-thread-multi
Unaccent version:
[root@server mail_downloader]# rpm -qa | grep Unaccent
perl-Text-Unaccent-1.08-1.2.el5.rf
Question: Any clues on why Perl's move or copy behaves this way with a CIFS mount point, and how to solve it?
Obviously I can't post the file contents here, because they are EDI-related and contain some financial info.
Also, if I comment out the Perl copy and handle the file myself with cp or mv after the unaccent step is done, the file is moved correctly to the CIFS mount point.
The problem is really obvious: you're not closing the file once you've finished writing to it. When you copy or move it to the other filesystem, you lose the chunk at the end that is still buffered and hasn't been written out to the file yet.
    open (ESCRITA, ">", "$temp_dir/$arquivo");
    foreach $manipula_linha (@manipular){
        # Replace & with e and remove accents
        $manipula_linha =~ s/\&/e/g;
        $manipula_linha = unac_string("UTF-8", $manipula_linha);
        print ESCRITA $manipula_linha;
    };
    # Flush the file: turning on autoflush for ESCRITA writes out anything pending
    my $old_fh = select(ESCRITA);
    $| = 1;
    select($old_fh);
    # Close the handle before the file is moved
    close ESCRITA;
};
move "$temp_dir/$arquivo", "$dest_file";
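If you want to see the effect in isolation, here is a minimal sketch, not your actual script: it writes a small file, checks its on-disk size before and after close, and only then moves it. The paths and variable names ($src, $dest, the size variables) are made up for illustration, and it uses a temporary directory instead of your cifs mount. It also uses a lexical filehandle and checks the return values of open, close and move, which I would suggest in any case.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Hypothetical paths for illustration only; the real script would use
# "$temp_dir/$arquivo" and "$dest_file".
my $dir  = tempdir(CLEANUP => 1);
my $src  = "$dir/example.txt";
my $dest = "$dir/moved.txt";

# Some sample data, small enough to fit in the output buffer.
my @lines    = map { "line $_\n" } 1 .. 100;
my $expected = 0;
$expected += length for @lines;

open(my $out, '>', $src) or die "Cannot open $src: $!";
print {$out} $_ for @lines;

# The handle is still open, so some or all of the data may still be
# sitting in the buffer and not in the file yet.
my $size_before_close = (-s $src) || 0;

# close flushes the buffer; check it, because a failed flush shows up here.
close($out) or die "Cannot close $src: $!";
my $size_after_close = (-s $src) || 0;

# Only move the file once it is complete on disk.
move($src, $dest) or die "Cannot move $src to $dest: $!";
my $size_moved = (-s $dest) || 0;

print "expected: $expected\n";
print "before close: $size_before_close\n";
print "after close: $size_after_close\n";
print "after move: $size_moved\n";
```

The size before close will usually be smaller than the size after close (often zero for a file this small), which is the same truncation you are seeing, just on a smaller scale.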