Tags: java, hadoop, hdfs, pentaho, apache-commons-vfs

Configuring Pentaho's HDFS VFS to pick up hdfs-site.xml


I just started using Pentaho's HDFS VFS and don't know much about the project. I am trying to have my Hadoop config files read from an external location. This seems to work fine for all files except hdfs-site.xml. The failure occurs in the VFS layer when trying to communicate with HDFS through the Pentaho HDFS VFS project. My gut tells me that Pentaho is reading this file through some environment variable or other external pointer, but I can't find it in their source. Everything works fine when I manually place the hdfs-site.xml file in the compiled WAR file, but this will not suffice for me because I need the file in an external location so it can be changed by other processes.

Has anyone dealt with this issue before? Could someone please let me know how to tell Pentaho where to pick this file up from?

Thanks


Solution

  • So I figured out a way to make Pentaho HDFS VFS work: all you need to do is add hdfs-site.xml to your classpath. I did this when starting my jar file:

    java -jar start.jar --lib=/etc/hadoop/conf.cloudera.yarn1

    And /etc/hadoop/conf.cloudera.yarn1 is the directory where hdfs-site.xml resides.
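
    If you want to confirm that the directory really is on the classpath, a quick check like the one below can help. This is only a minimal sketch, not part of the original answer: the HdfsSiteCheck class name is made up, and dfs.nameservices is just an example key; substitute whatever property your hdfs-site.xml actually defines. It asks the classloader where hdfs-site.xml would be resolved from and then reads a property from it through Hadoop's Configuration:

    import java.net.URL;

    import org.apache.hadoop.conf.Configuration;

    public class HdfsSiteCheck {
        public static void main(String[] args) {
            // Ask the classloader where hdfs-site.xml would be resolved from;
            // null means the file is not on the classpath at all.
            URL resource = HdfsSiteCheck.class.getClassLoader()
                    .getResource("hdfs-site.xml");
            System.out.println("hdfs-site.xml found at: " + resource);

            // A plain Configuration only loads core-*.xml by default, so add
            // hdfs-site.xml explicitly and read back a property defined in it.
            Configuration conf = new Configuration();
            conf.addResource("hdfs-site.xml");
            System.out.println("dfs.nameservices = " + conf.get("dfs.nameservices"));
        }
    }

    Run it with the conf directory on the classpath (for example, java -cp /etc/hadoop/conf.cloudera.yarn1:<hadoop jars> HdfsSiteCheck); if the resource URL prints as null, Pentaho's HDFS VFS will not be able to see the file either.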