math statistics j

I am trying to do some basic math with files in J


I have a file with numbers separated by spaces (a plain text file, not a CSV file). The following code sums the numbers when they all appear on one continuous line.

+/ 0". 1!:1 < F

(0 ". converts strings to numbers; F is the name of a text file.)

But if each number appears on its own line, this does not work. How do I modify the code to handle that? I would also appreciate some general pointers on J-specific numeric analysis with files.


Solution

  • The first issue to deal with is the line endings. Using freads from the standard library will read your text file and ensure that each line is terminated by LF, no matter whether lines were originally CRLF, CR or just LF terminated. Use fread, with no left argument, if you want to read the bytes exactly as they were in the file.

    Let's use the following space-delimited file (numbers.txt) as an example:

    1 -3.93 17 -7 564
    2 4.27 12 3 234
    3 -1.90 22 5 728
    4 0.00 10 -4 442
    

    You can read the file as follows:

       0 ". 'm' freads 'numbers.txt'
    1 _3.93 17 _7 564
    2  4.27 12  3 234
    3  _1.9 22  5 728
    4     0 10 _4 442
    

    Under the covers the 'm' left argument to freads is doing this:

       0 ". ];._2 freads 'numbers.txt'
    

    The ;._2 says to use the last byte in the file (which is now LF) as the delimiter, remove the delimiters from the result, and apply the verb to its left (] in this case) to each line. The result is a character array with a row for each line and as many columns as the longest line in the file (shorter lines are padded with spaces). 0 ". then attempts to parse the array as numbers; where it can't, it replaces the input with 0 (I often use _999 instead so it is more obvious that something hasn't converted correctly).
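
    As a small self-contained check of the steps just described, the sketch below runs the same expression on a literal string rather than a file. The name txt and its contents are made up for illustration; it stands in for what freads would return.

       txt =: '1 -3.93 17', LF, '2 4.27 12', LF, '3 oops 22', LF   NB. hypothetical file contents, LF-terminated
       $ ];._2 txt          NB. shape of the character matrix: 3 lines, longest line has 10 characters
    3 10
       _999 ". ];._2 txt    NB. the unparseable word 'oops' comes back as _999
    1 _3.93 17
    2  4.27 12
    3  _999 22

    Using a default such as _999 rather than 0 makes it easy to spot bad fields afterwards, for example with _999 e. , applied to the result.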

    If the file is delimited by something other than a space, then you need to parse the fields; otherwise you'll get interesting results:

       0 ". ];._2 freads 'numbers.csv'
    0 24.2712 0 0
    

    If it is a simple file, you could try one of the following:

       0 ". ];._2 ', ' charsub freads 'numbers.csv'  NB. replace delimiter with spaces
    1 _3.93 17 _7 564
    2  4.27 12  3 234
    3  _1.9 22  5 728
    4     0 10 _4 442
    
       0 ". > ','&cut;._2 freads 'numbers.csv'  NB. box fields in each line
    1 _3.93 17 _7 564
    2  4.27 12  3 234
    3  _1.9 22  5 728
    4     0 10 _4 442
    

    If it is more complicated (quoted delimiters, etc.) or if you just find it easier, you can use the tables/csv or tables/dsv addons:

       load 'tables/csv'
       0 ". > readcsv 'numbers.csv'
    1 _3.93 17 _7 564
    2  4.27 12  3 234
    3  _1.9 22  5 728
    4     0 10 _4 442
    
       0 ". > ',' readdsv 'numbers.csv'
    1 _3.93 17 _7 564
    2  4.27 12  3 234
    3  _1.9 22  5 728
    4     0 10 _4 442
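
    To tie this back to the original question, here is a minimal sketch of the arithmetic once the data is a numeric array. It uses literal strings (the hypothetical names txt and col) in place of freads 'numbers.txt' so that it runs without a file. The ravel (,) in the last line is a suggestion for the one-number-per-line case: it flattens the array before summing, whatever its shape.

       txt =: '1 -3.93 17 -7 564', LF, '2 4.27 12 3 234', LF, '3 -1.90 22 5 728', LF, '4 0.00 10 -4 442', LF
       nums =: 0 ". ];._2 txt     NB. 4-by-5 numeric table, as in the examples above
       +/ nums                    NB. column sums
    10 _1.56 61 _3 1968
       (+/ % #) nums              NB. column means
    2.5 _0.39 15.25 _0.75 492
       +/"1 nums                  NB. row sums
       col =: '10', LF, '20', LF, '30', LF   NB. hypothetical file with one number per line
       +/ , 0 ". ];._2 col        NB. flatten, then sum everything
    60

    With a real file, replacing txt or col by freads F (or using 0 ". 'm' freads F directly) should give the same results.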