Getting file contents with Range header returns Partial Content and subsequent request returns no data


We have run into a puzzling issue when fetching file contents through the OneDrive API.

When we request file contents with Range header:

GET /blahblah/foobar.docx HTTP/1.1
Host: qw122q-ch3301.files.1drv.com
Accept: */*
Accept-Encoding: deflate, gzip
Range: bytes=0-77270

OneDrive returns:

HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Content-Length: 18325

We verified in the web interface that the file size on the OneDrive server is correct. OneDrive used to return the full requested range, but since last week it has been returning only part of it. That would be fine if we could fetch the remaining bytes with further API calls.

But when we send another request with Range header:

Range: bytes=18325-77270

OneDrive returns no data:

HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Content-Length: 0

Has anyone else run into this? I can't find anything about it in the OneDrive developer documentation. Any insight would be appreciated.


Solution

  • I have a theory, so I'll take a shot at an answer. Two different issues combine to produce this confusing behavior, so I'll tackle each one separately.

    Reported file size doesn't match content size

    This is an unfortunate quirk of the system that is being tracked with this GitHub issue. Ryan explains in more detail here.

    Range downloads of Word documents do not correctly handle unsatisfiable ranges

    When the requested range starts beyond the actual file size, we should fail with a 416 Requested Range Not Satisfiable, as we do for "normal" files, but that is clearly not happening. The Content-Range of the response shows that something is off:

    FileSize: 15 bytes
    Range Requested: bytes=15-
    Content-Range Response: bytes=15-14/15

    That Content-Range value makes no sense: the last byte position (14) is smaller than the first (15).

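    Together, these two issues would produce the weird behavior you're seeing: the reported file size is larger than the content actually served, so your second request starts past the real end of the content and gets an empty 206 instead of a 416. We're close to resolving the first issue; the second was unknown (at least to me), so I've opened a new GitHub issue to track it.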
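
Until both issues are fixed, a client can guard against this on its own side. The following is only a rough sketch in Python, not anything from the OneDrive SDK: `parse_content_range` and `download_range` are made-up names, and `fetch(first, last)` is a hypothetical callable that you would implement with your own HTTP client so that it sends `Range: bytes=first-last` and returns the status code, the Content-Range header value, and the body. The idea is to treat an empty 206, or a Content-Range whose last position is below its first, as the end of the content.

```python
import re
from typing import Callable, Optional, Tuple

# Accepts the standard "bytes first-last/total" form and also the
# "bytes=first-last/total" spelling quoted in the answer above.
_CONTENT_RANGE = re.compile(r"bytes[ =](\d+)-(\d+)/(\d+|\*)")

# Hypothetical transport: fetch(first, last) -> (status, content_range, body)
Fetch = Callable[[int, int], Tuple[int, Optional[str], bytes]]


def parse_content_range(value: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (first, last, total), or None if the value is malformed
    or internally inconsistent (for example "15-14/15")."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    if first > last or (total is not None and last >= total):
        return None
    return first, last, total


def download_range(fetch: Fetch, start: int, end: int) -> bytes:
    """Request bytes start..end, re-requesting the remainder after each
    short response, and stop when the server has nothing more to give."""
    chunks = []
    pos = start
    while pos <= end:
        status, content_range, body = fetch(pos, end)
        if status == 416:
            break  # correct behavior: range is past the end of the content
        if status != 206:
            raise RuntimeError("unexpected status %d" % status)
        parsed = parse_content_range(content_range or "")
        if parsed is None or not body:
            break  # empty or nonsensical 206: assume end of content
        if parsed[0] != pos:
            raise RuntimeError("server returned a different range than requested")
        chunks.append(body)
        pos += len(body)
    return b"".join(chunks)
```

With this sketch, `parse_content_range("bytes=15-14/15")` returns `None`, so a response like the one shown above ends the loop instead of being retried forever. The caller should still compare the length of the result with what it expected, since the reported file size may not match the served content.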