Linux write into same file with awk and tee: very odd behaviour

I was trying to do something unusual, overwriting the same file, while working on the question "Unexpected new line when writing out in Unix Shell Script", just out of curiosity.

I found that on some attempts I could use tee > to_same_file and it worked, as you can see on the very first attempt below; subsequent attempts produced an empty file. My assumption is that this is related to processing time: on the first attempt it took longer to get to tee, so awk had time to read the file, whereas on the other attempts tee started sooner and the file was emptied before it could be read. I'm just interested to understand why this odd behaviour occurred.

me@desktop:~/$ cp 2.csv 1.csv
me@desktop:~/$ cat 1.csv
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
me@desktop:~/$ awk  -F"," '{ 
     timestamp=$5;  
     gsub(":"," ",timestamp); 
     gsub("-"," ",timestamp);   
     EPOCH=(mktime(timestamp))
     } 
     {
      print $0","EPOCH
      }' 1.csv  2>&1 | tee > 1.csv
me@desktop:~/$ cat 1.csv
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0,1388998800
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0,1388998800
me@desktop:~/$ cp 2.csv 1.csv
me@desktop:~/$ cat 1.csv 
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
me@desktop:~/$ awk  -F"," '{ 
     timestamp=$5;  
     gsub(":"," ",timestamp); 
     gsub("-"," ",timestamp);   
     EPOCH=(mktime(timestamp))
     } 
     {
      print $0","EPOCH
      }' 1.csv  2>&1 | tee > 1.csv
me@desktop:~/$ cat 1.csv 
me@desktop:~/$ cp 2.csv 1.csv
me@desktop:~/$ awk  -F"," '{ 
     timestamp=$5;  
     gsub(":"," ",timestamp); 
     gsub("-"," ",timestamp);   
     EPOCH=(mktime(timestamp))
     } 
     {
      print $0","EPOCH
      }' 1.csv  2>&1 | tee > 1.csv
me@desktop:~/$ cat 1.csv 
me@desktop:~/$ cp 2.csv 1.csv
me@desktop:~/$ cat 1.csv 
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
me@desktop:~/$ awk  -F"," '{ 
     timestamp=$5;  
     gsub(":"," ",timestamp); 
     gsub("-"," ",timestamp);   
     EPOCH=(mktime(timestamp))
     } 
     {
      print $0","EPOCH
      }' 1.csv  2>&1 | tee -a > 1.csv
me@desktop:~/$ cat 1.csv 
me@desktop:~/$

Solution

A small, self-contained test case with the same problem is this:

cat file | tee > file

This pipeline consists of two parts that run in parallel:

cat file tries to open and read from the file.

tee > file tries to truncate the file (the shell performs the truncation while setting up the > redirection, before tee itself runs).
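The truncation comes from the redirection, not from anything the command writes. A minimal sketch, using a hypothetical scratch file created with mktemp, shows that even a command that produces no output empties the target:

```shell
scratch=$(mktemp)                 # hypothetical scratch file for this demo
printf 'some data\n' > "$scratch"

true > "$scratch"                 # true writes nothing; the redirection alone truncates

size=$(wc -c < "$scratch")
size=$((size))                    # strip any padding some wc implementations add
echo "$size"
rm -f "$scratch"
```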

Which result you get depends on timing: if the file is read (fully or partially) before it is truncated, you get all or part of your data; if it is truncated first, you get an empty file.
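One common way to avoid the race is to write the result to a temporary file and rename it over the original only after the command has finished. A minimal sketch, assuming a hypothetical demo.csv in a scratch directory and a stand-in awk program (not the mktime script from the question):

```shell
# Hypothetical demo file; the name and contents are illustrative only.
workdir=$(mktemp -d)
printf 'a, 1\nb, 2\n' > "$workdir/demo.csv"

# Write to a temporary file first, so the input is never truncated while
# awk is still reading it; replace the original only if awk succeeded.
awk '{ print $0 ",x" }' "$workdir/demo.csv" > "$workdir/demo.csv.tmp" &&
    mv "$workdir/demo.csv.tmp" "$workdir/demo.csv"

result=$(cat "$workdir/demo.csv")
echo "$result"
rm -rf "$workdir"
```

Note that tee is given no file argument in the question, so tee > file behaves like cat > file, and tee -a makes no difference: the > redirection truncates the file either way.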