
Dynamic path for fetching files between a start date and an end date


I want to build a single path variable that fetches all the data from different directories, depending on the start and end dates entered.

startDate = 2011/05/01
endDate = 2011/05/04

/myfolder/2011/05/01/*.csv
/myfolder/2011/05/02/*.csv
/myfolder/2011/05/03/*.csv
/myfolder/2011/05/04/*.csv

I can do this by reading from the 4 paths separately, but I want one dynamic path.


Solution

  • You can accomplish this using Joda-Time, by generating one path for each day from the start date to the end date (inclusive).

    import org.joda.time.Days
    import org.joda.time.format.DateTimeFormat
    
    // Returns one glob path per day, from start to end inclusive.
    // Both dates must be given as "yyyy/MM/dd".
    def dynamicPath(start: String, end: String): Array[String] = {
      val format = DateTimeFormat.forPattern("yyyy/MM/dd")
      val startDate = format.parseDateTime(start)
      val endDate = format.parseDateTime(end)
    
      // Whole days between the two dates; 0 when start == end.
      val numberOfDays = Days.daysBetween(startDate, endDate).getDays()
    
      // "0 to numberOfDays" is inclusive, so the end date gets a path too.
      (for (d <- 0 to numberOfDays)
        yield s"/myfolder/${startDate.plusDays(d).toString("yyyy/MM/dd")}/*.csv").toArray
    }
    

    And you would call it using:

    val folderPaths = dynamicPath("2011/05/01", "2011/05/04")
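
If you would rather not depend on Joda-Time, the same idea can be sketched with `java.time` from the standard library. The object name `DatePaths`, the helper names `dailyPaths` and `singleGlob`, and the `root` parameter are made up for this illustration. `singleGlob` joins the dates into one string using `{a,b,c}` alternation, which is the closest thing to a literal "one path"; it assumes your file system's glob handling (for example Hadoop-style globs) accepts alternatives that contain slashes, so check that in your environment before relying on it.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.temporal.ChronoUnit

// Hypothetical helper object, not part of any library.
object DatePaths {
  private val fmt = DateTimeFormatter.ofPattern("yyyy/MM/dd")

  // Every date from start to end inclusive, formatted as "yyyy/MM/dd".
  // Returns an empty sequence if end is before start.
  private def dates(start: String, end: String): Seq[String] = {
    val first = LocalDate.parse(start, fmt)
    val last = LocalDate.parse(end, fmt)
    val days = ChronoUnit.DAYS.between(first, last)
    (0L to days).map(d => first.plusDays(d).format(fmt))
  }

  // One glob path per day, e.g. "/myfolder/2011/05/01/*.csv".
  def dailyPaths(start: String, end: String, root: String = "/myfolder"): Seq[String] =
    dates(start, end).map(d => s"$root/$d/*.csv")

  // A single glob using brace alternation (assumes the reader supports it).
  def singleGlob(start: String, end: String, root: String = "/myfolder"): String = {
    val joined = dates(start, end).mkString(",")
    s"$root/{$joined}/*.csv"
  }
}
```

Assuming a `SparkSession` named `spark` (as on Databricks), either result can be handed to the reader: `spark.read.csv(DatePaths.dailyPaths("2011/05/01", "2011/05/04"): _*)` passes the paths as varargs, while `spark.read.csv(DatePaths.singleGlob("2011/05/01", "2011/05/04"))` passes the one combined pattern.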