
How to build a DataModel from multiple input files in Mahout?


I want to build a DataModel from numerous *.csv files. (They all have the same format but contain different data.)
However, I have no idea how to do that.
I cannot find a suitable function in the Mahout documentation (the Mahout API).
Is writing a module that merges the numerous *.csv files into a single *.csv file the only solution?
Please help!


Solution

  • You can combine all of your *.csv files into a single file. For example, two files can be combined as shown below:

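    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;

    public class CombineFiles {
        public static void main(String[] args) throws IOException {
            // try-with-resources closes all three streams, even if an exception is thrown
            try (BufferedReader reader = new BufferedReader(new FileReader("YOUR_SOURCE_1"));
                 BufferedReader reader2 = new BufferedReader(new FileReader("YOUR_SOURCE_2"));
                 BufferedWriter writer = new BufferedWriter(new FileWriter("YOUR_TARGET"))) {
                copyRows(reader, writer);
                copyRows(reader2, writer);
            }
        }

        // Writes the first three fields of every row as one comma-separated line.
        // The source is split on tabs here; change "\\t" to "," for comma-separated input.
        private static void copyRows(BufferedReader reader, BufferedWriter writer) throws IOException {
            String line;
            int x = 0;
            while ((line = reader.readLine()) != null) {
                if (x > 0) { // x == 0 is the first line, which is skipped as a header
                    String[] values = line.split("\\t", -1);
                    writer.write(values[0] + "," + values[1] + "," + values[2] + "\n");
                }
                x++;
            }
        }
    }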
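
The example above handles exactly two files. For "numerous" files, the same idea could be generalized to a whole directory. The following is only a minimal sketch under stated assumptions: the class name `CsvMerger` and the method `merge` are hypothetical names chosen for illustration, every input file is assumed to be comma-separated already with no header row, and blank lines are dropped. It uses only the Java standard library, not any Mahout call.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Mahout): merges every *.csv file in a directory.
final class CsvMerger {

    // Appends the non-blank lines of each *.csv file in inputDir to target
    // and returns the number of lines written.
    // Assumption: all files share one format and have no header row.
    static int merge(Path inputDir, Path target) throws IOException {
        Path normalizedTarget = target.toAbsolutePath().normalize();
        List<Path> sources = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir, "*.csv")) {
            for (Path p : stream) {
                // Skip the output file in case it lives inside the input directory
                if (!p.toAbsolutePath().normalize().equals(normalizedTarget)) {
                    sources.add(p);
                }
            }
        }
        Collections.sort(sources); // process files in a deterministic order

        int written = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Path source : sources) {
                for (String line : Files.readAllLines(source, StandardCharsets.UTF_8)) {
                    if (line.trim().isEmpty()) {
                        continue; // drop blank lines
                    }
                    writer.write(line);
                    writer.newLine();
                    written++;
                }
            }
        }
        return written;
    }
}
```

Calling `CsvMerger.merge(Paths.get("YOUR_SOURCE_DIR"), Paths.get("YOUR_TARGET"))` would produce one combined file, which could then be handed to Mahout's `FileDataModel` through its constructor that takes a `java.io.File`. Reading each file fully with `Files.readAllLines` keeps the sketch short; for very large inputs, streaming line by line as in the example above is probably the safer choice.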