
Dropbox Files.download does not start when number of files in folder is > 1000


I am cross-posting this from my original question on the Dropbox forums, since it may be useful to have it here for SwiftyDropbox users as well.

I'm having trouble downloading entire folders to a local device via SwiftyDropbox. I call ListFolder and ListFolderContinue (which, from what I observe, return the listing in chunks of roughly 500 files per response) and append the results to a local array.

I then iterate over this array and call files.download for each file. However, when the folder contains more than 1000 files (txt files of about 0.5-1 KB each), the download process does not start.

static func downloadMissingFiles(client: DropboxClient, callingProcess: String) {
      let fileManager = FileManager.default
      let localBaseURL = fileManager.urls(for: .cachesDirectory, in: .userDomainMask)[0].appendingPathComponent("Cloud/Dropbox", isDirectory: true)
      
      // Data will be in the form of
      // key   : "/workouts/workout list 1/peye.mrc"
      // value : "//workouts/workout list 1/peye.mrc=_-_=015ca880b135d01000000020cb26de0"
      for dbFiles in Array(dbFileNameRevDict) {
        let dbFilePathLower = dbFiles.key
        let dbFileNameRev = dbFiles.value
        let fullURL = localBaseURL.appendingPathComponent(dbFileNameRev)
        
        if fileManager.fileExists(atPath: fullURL.path) {
          print("  -> FILE EXISTS dbFileNameRev:\(dbFileNameRev)")
          localFileList.append(dbFileNameRev)
        } else {
          let destination : (URL, HTTPURLResponse) -> URL = { temporaryURL, response in
            return fullURL
          }
          
          client.files.download(path:dbFilePathLower, overwrite: true, destination: destination)
            .response { response, error in
              if let (_, url) = response {
                print("====> DOWNLOADED:\(url.lastPathComponent)")
              } else if let error = error {
               print(error)
            }
            /// This reports the progress of each individual file on its own, so it is not useful for overall progress
            // .progress { progressData in
            //  print(progressData)
            // }
        }
      }
    }
  }

I have tried various methods to download these files. I also tried iterating over the array of files one by one on a serial queue, but that doesn't work either.

This is how I process the ListFolder and ListFolderContinue results, checking the hasMore attribute.

      // https://stackoverflow.com/a/52870045/14414215
      if result.hasMore == true {
        processDBMore(client: client, cursor: result.cursor)
      } else {
        // When there are no more files (as indicated by hasMore == false)
        // start downloading the files
        downloadMissingFiles(client: client, callingProcess: "processDBMore-Finish")
        print("ProcessDBMore - dropboxGroup.leave")
        dropboxGroup.leave()
      }
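
The cursor-following pattern above can be sketched in isolation. This is a minimal, self-contained illustration and not SwiftyDropbox code: `Page`, `fetchPage`, and `listAll` are hypothetical names, and the listing call is simulated locally so the control flow can be run without a Dropbox account. The point it shows is that downloading should only begin once a page reports `hasMore == false`.

```swift
import Foundation

// Hypothetical page shape mirroring the fields used above (entries, cursor, hasMore).
struct Page {
    let entries: [String]
    let cursor: String
    let hasMore: Bool
}

// Hypothetical stand-in for a paged listing call. It serves `all` in chunks of
// `pageSize`; the cursor is simply the next start index encoded as a string.
func fetchPage(all: [String], cursor: String?, pageSize: Int,
               completion: @escaping (Page) -> Void) {
    DispatchQueue.global().async {
        let start = cursor.flatMap { Int($0) } ?? 0
        let end = min(start + pageSize, all.count)
        completion(Page(entries: Array(all[start..<end]),
                        cursor: String(end),
                        hasMore: end < all.count))
    }
}

// Accumulates pages until hasMore is false, then hands the complete list
// to `done` exactly once.
func listAll(all: [String], pageSize: Int, cursor: String? = nil,
             collected: [String] = [], done: @escaping ([String]) -> Void) {
    fetchPage(all: all, cursor: cursor, pageSize: pageSize) { page in
        let soFar = collected + page.entries
        if page.hasMore {
            listAll(all: all, pageSize: pageSize, cursor: page.cursor,
                    collected: soFar, done: done)
        } else {
            done(soFar)
        }
    }
}
```

In the real code, `fetchPage` would correspond to the ListFolder / ListFolderContinue calls and `done` to the point where `downloadMissingFiles` is invoked.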

Solution

  • According to Greg (SwiftyDropbox):

    Each 'client.files.download' call downloads one file by making one HTTPS request to the Dropbox API servers. Additionally, these calls run asynchronously, and will not block until the call completes. That is, calling 'client.files.download' will start the HTTPS request, but will itself return before it's complete and the response is fully received. (It just runs the supplied block once the request is done.) That being the case, in the code you showed here, you're actually starting 1000 connections in a row, at almost the same time, so it's likely exhausting your network connection. You should update your code to only submit one (or a few) of these at a time. You mentioned you tried a serial queue, but that may be running in to the same issue, since the actual requests run asynchronously.

    While searching for other solutions, I came across this post https://stackoverflow.com/a/66227963/14414215, which greatly helped me understand how semaphores work and how using a semaphore (in addition to DispatchGroups) can properly throttle the files.download calls.

       static func downloadMissingFiles(client: DropboxClient, callingProcess: String) {
          let fileManager = FileManager.default
          let localBaseURL = fileManager.urls(for: .cachesDirectory, in: .userDomainMask)[0].appendingPathComponent("Cloud/Dropbox", isDirectory: true)
          let semaphore = DispatchSemaphore(value: 1)  // insert desired concurrent downloads value here.
    
          // Data will be in the form of
          // key   : "/workouts/workout list 1/peye.mrc"
          // value : "//workouts/workout list 1/peye.mrc=_-_=015ca880b135d01000000020cb26de0"
          DispatchQueue.global().async { // Wrap the call within an async block
          for dbFiles in Array(dbFileNameRevDict) {
            semaphore.wait() // Decrement the semaphore counter
            let dbFilePathLower = dbFiles.key
            let dbFileNameRev = dbFiles.value
            let fullURL = localBaseURL.appendingPathComponent(dbFileNameRev)
            
            if fileManager.fileExists(atPath: fullURL.path) {
              print("  -> FILE EXISTS dbFileNameRev:\(dbFileNameRev)")
              localFileList.append(dbFileNameRev)
              semaphore.signal()  // Increment semaphore counter
            } else {
              let destination : (URL, HTTPURLResponse) -> URL = { temporaryURL, response in
                return fullURL
              }
              
              client.files.download(path:dbFilePathLower, overwrite: true, destination: destination)
                .response { response, error in
                  // Release (increment) the semaphore whether the download
                  // succeeded or failed; if it were only signalled on success,
                  // a single failed download would block the loop forever.
                  defer { semaphore.signal() }
                  if let (_, url) = response {
                    print("====> DOWNLOADED:\(url.lastPathComponent)")
                  } else if let error = error {
                    print(error)
                  }
                /// This reports the progress of each individual file on its own, so it is not useful for overall progress
                // .progress { progressData in
                //  print(progressData)
                // }
            }
          }
        }
       }
      }
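
The throttling idea can also be tried out without Dropbox at all. The following is a minimal sketch under stated assumptions: `fakeDownload` and `downloadAll` are hypothetical names, and the download is simulated with a short delay. It shows a semaphore capping the number of in-flight requests, the slot being released on both success and failure, and a DispatchGroup used to learn when every request has finished.

```swift
import Foundation

// Hypothetical stand-in for an asynchronous download call: it returns
// immediately and invokes `completion` later on a background queue.
// Paths containing "bad" fail, to exercise the error path.
func fakeDownload(_ path: String,
                  completion: @escaping (Result<String, Error>) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.01) {
        if path.contains("bad") {
            completion(.failure(NSError(domain: "demo", code: 1)))
        } else {
            completion(.success(path))
        }
    }
}

// Runs every download with at most `maxConcurrent` in flight and blocks until
// all of them have finished. In an app, call this off the main thread so the
// UI is not blocked while waiting.
func downloadAll(_ paths: [String], maxConcurrent: Int)
    -> (succeeded: Int, failed: Int, peak: Int) {
    let semaphore = DispatchSemaphore(value: maxConcurrent)
    let group = DispatchGroup()
    let lock = NSLock()               // protects the counters below
    var succeeded = 0, failed = 0, inFlight = 0, peak = 0

    for path in paths {
        semaphore.wait()              // blocks while maxConcurrent requests are in flight
        group.enter()
        lock.lock()
        inFlight += 1
        peak = max(peak, inFlight)
        lock.unlock()

        fakeDownload(path) { result in
            lock.lock()
            switch result {
            case .success: succeeded += 1
            case .failure: failed += 1
            }
            inFlight -= 1
            lock.unlock()
            semaphore.signal()        // release the slot on success and on failure
            group.leave()
        }
    }
    group.wait()                      // returns once every completion has run
    return (succeeded, failed, peak)
}
```

Starting with `maxConcurrent` at 1 reproduces the strictly serial behaviour of the answer above; raising it to a small number is usually enough to speed things up without opening hundreds of connections at once.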