I have just started learning Spark and Scala. I have a file test.txt that contains one line: "My name is xyz".
When I create an RDD from it, apply the flatMap method, and print the result, I get:
My
name
is
xyz
But when the same line is passed as a String to flatMap, I get the compiler error "value split is not a member of Char".
val lines = sc.textFile("C:/test.txt")
val result = lines.flatMap(x => x.split(" "))
result.foreach(println)
val name = "My name is xyz"
val res = name.flatMap(x => x.split(" "))
//println(res)
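This uses sc, so lines is an RDD[String] and the work is parallelized by Spark. Each element x is a whole line, i.e. a String, so x.split(" ") compiles: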
val lines = sc.textFile("C:/test.txt")
val result = lines.flatMap(x => x.split(" "))
result.foreach(println)
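This does not involve Spark. It is plain Scala, and name is a single String. The next level down from a String is Char: flatMap on a String iterates over its characters, so x is a Char, and Char has no split method: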
val name = "My name is xyz"
val res = name.flatMap(x => x.split(" "))
println(res)
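To see what flatMap on a String actually hands to the function, here is a minimal sketch in plain Scala (no Spark). The names noSpaces and words are only illustrative and are not from the question:

```scala
val name = "My name is xyz"

// flatMap on a String visits one Char at a time, so the function must accept a Char.
// Here every space is dropped and every other character is kept.
val noSpaces = name.flatMap(c => if (c == ' ') "" else c.toString)

// For a single String there is nothing to flatten: call split directly.
val words = name.split(" ")

println(noSpaces)        // Mynameisxyz
words.foreach(println)   // My, name, is, xyz on separate lines
```

So the String version compiles as soon as the function treats its argument as a Char, and splitting a single String needs no flatMap at all.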
The plain-Scala equivalent of the first snippet is a collection of Strings, for example an Array of String, which approximates the lines read in by sc.textFile. Then each x is a String again and it works, or as they say, Bob's your uncle:
val name = Array("My name is xyz")
val res = name.flatMap(x => x.split(" "))
println(res)
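In the REPL this returns the following (note the commas in res, which show four separate elements; println on an Array prints only its JVM type and hash code, not its contents):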
[Ljava.lang.String;@16947521
name: Array[String] = Array(My name is xyz)
res: Array[String] = Array(My, name, is, xyz)
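If the goal is to print the words themselves rather than the array reference, something along these lines should work. It is a sketch that uses a local Seq as a stand-in for the RDD; lines, result and joined are illustrative names only:

```scala
// A local collection of lines, standing in for what sc.textFile would return.
val lines = Seq("My name is xyz")

// Same shape as the Spark code: each x is a whole line (a String).
val result = lines.flatMap(x => x.split(" "))

// Print one word per line, as in the Spark example.
result.foreach(println)

// Or join the elements into a single readable String.
val joined = result.mkString(", ")
println(joined)   // My, name, is, xyz
```

The same mkString call also works on the Array[String] from the answer above.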