I'm using Spark to import data from text files into CQL tables (on DataStax). I've done this successfully with one file in which all variables were strings. I first created the table using CQL, then in the Spark shell using Scala ran:
val file = sc.textFile("file:///home/pr.txt").map(line => line.split("\\|").map(_.toString));
file.map(line => (line(0), line(1))).saveToCassandra("ks", "ks_pr", Seq("proc_c", "proc_d"));
The rest of the files I want to import contain multiple variable types. I've set up the tables using CQL and specified the appropriate types there, but how do I convert the fields to those types when importing the text file in Spark?
For example, if proc_c is Int and proc_d is Double, you can convert each field inside the map before saving. Note that toInt and toDouble throw a NumberFormatException on malformed input, so this assumes the file is clean:
file.map(line => (line(0).toInt, line(1).toDouble))
  .saveToCassandra("ks", "ks_pr", Seq("proc_c", "proc_d"))
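The conversion step can be sketched outside Spark on a plain Scala collection; the same map function applies unchanged to the RDD. The sample rows below are invented for illustration:

```scala
object ConvertDemo {
  def main(args: Array[String]): Unit = {
    // Stand-in for sc.textFile(...): pipe-delimited rows as strings
    val lines = Seq("1|2.5", "2|3.75")

    // Split each line, then convert column 0 to Int and column 1 to Double,
    // mirroring the (line(0).toInt, line(1).toDouble) step used on the RDD
    val typed = lines
      .map(_.split("\\|"))
      .map(cols => (cols(0).toInt, cols(1).toDouble))

    typed.foreach(println)
  }
}
```

With an RDD you would stop before `foreach` and call saveToCassandra on the mapped result instead.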