apache-flink flink-streaming

Apache Flink Writing to MapR filesystem


I am running Apache Flink 1.2.0 and using BucketingSink to write data to a Hadoop file system. Writing with the file:/// and hdfs:/// schemes works without any problem (tested in the Hortonworks Sandbox). But when I switch to the maprfs:/// scheme in the MapR Sandbox, the job fails with:

No FileSystem for scheme: maprfs
  Caused by: java.io.IOException: No FileSystem for scheme: maprfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)

I need to understand what configuration I have to add to my Flink application so that it can write to maprfs. In my MapR cluster, core-site.xml and hdfs-site.xml are both empty, so I did not copy them to my $FLINK_CONF_DIR.


Solution

  • You need MapR's hadoop.jar first in your classpath, so that the Hadoop classes Flink loads are the MapR ones that know the maprfs scheme. It's usually sitting somewhere in /opt/mapr/...
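
To check whether the classpath fix took effect, a small diagnostic like the one below can help. It is only a sketch: ClasspathCheck is a hypothetical helper, not part of Flink or MapR, and the class name com.mapr.fs.MapRFileSystem is an assumption about MapR's FileSystem implementation that you should verify against your MapR release. It uses only the JDK and prints which jar (if any) each class would be loaded from.

```java
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical diagnostic helper; not part of Flink, Hadoop, or MapR.
class ClasspathCheck {
    // ASSUMPTION: class name of MapR's Hadoop FileSystem implementation.
    static final String MAPR_FS_CLASS = "com.mapr.fs.MapRFileSystem";

    /** Returns the location a class would be loaded from, or empty if it cannot be loaded. */
    static Optional<String> locate(String className) {
        try {
            // 'false' = do not run static initializers; we only want to find the class.
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return Optional.of("(JDK or unknown location)");
            }
            return Optional.of(src.getLocation().toString());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Class names may be passed as arguments; otherwise check the two relevant ones.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.hadoop.fs.FileSystem", MAPR_FS_CLASS};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name).orElse("NOT FOUND on classpath"));
        }
    }
}
```

Run it with the same classpath your Flink job sees. If org.apache.hadoop.fs.FileSystem resolves to a jar outside /opt/mapr/..., or the MapR class is reported as not found, the MapR jar is probably not first in the classpath yet.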