
Cannot list files in an HDFS directory using new File(...).listFiles


I have full permissions on the folder I am trying to list, but the listing still fails: listFiles returns null.

scala> new File("hdfs://mapdigidev/apps/hive/warehouse/da_ai.db/t_fact_ai_pi_ww").listFiles
res0: Array[java.io.File] = null

Solution

  • You can use the Hadoop libraries to list files in HDFS:

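    import java.net.URI
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    
    // Connect to the HDFS namenode, then list the directory through the Hadoop FileSystem API
    val fs = FileSystem.get(new URI("hdfs://mapdigidev"), new Configuration())
    // listFiles returns a RemoteIterator[LocatedFileStatus]; false = do not recurse
    val files = fs.listFiles(new Path("/apps/hive/warehouse/da_ai.db/t_fact_ai_pi_ww"), false)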
    

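    The original attempt fails because java.io.File only works with the local file system and knows nothing about Hadoop/HDFS. An hdfs:// URL is not a directory it can see, so listFiles returns null.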
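
    Note that fs.listFiles does not return an array but a Hadoop RemoteIterator, which is not a Scala collection, so it has to be drained by hand. A minimal sketch of one way to do that follows; the helper names drain and listFilePaths are made up for illustration and are not part of Hadoop, and the usage lines assume the same cluster URI and path as in the question:

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path, RemoteIterator}

// Hypothetical helper (not a Hadoop API): copy a RemoteIterator into a List.
def drain[T](it: RemoteIterator[T]): List[T] = {
  val buf = List.newBuilder[T]
  while (it.hasNext) buf += it.next()
  buf.result()
}

// Hypothetical helper: paths of the files directly under `dir` (non-recursive).
def listFilePaths(fs: FileSystem, dir: Path): List[Path] =
  drain(fs.listFiles(dir, false)).map(_.getPath)

// Usage, assuming the cluster and path from the question:
// val fs = FileSystem.get(new URI("hdfs://mapdigidev"), new Configuration())
// listFilePaths(fs, new Path("/apps/hive/warehouse/da_ai.db/t_fact_ai_pi_ww")).foreach(println)
```

    listFiles only returns files. If subdirectories should appear in the result as well, fs.listStatus(path) is probably the better fit, since it returns an Array[FileStatus] covering both files and directories.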