
How to Upload a File in Chunks to SharePoint (Office 365) via API in Python (using startupload/ continueupload/ finishupload)


I'm trying to upload a file in chunks via the API to SharePoint.

Uploading a 'small' file without chunks works (.../files/add(overwrite=true,url='{file_name}')), so authentication etc. is working fine.

But when I use startupload/ continueupload/ finishupload it does not work and I get this error message/ response (code: -2147024809): "Parameter name: Specified value is not supported for the serverRelativeUrl parameter."

I need to solve this w/o any other framework like Office365-REST-Python-Client because I need to find out the correct API call.

I've checked the references:

https://learn.microsoft.com/en-us/previous-versions/office/developer/sharepoint-rest-reference/dn450841(v=office.15)
https://learn.microsoft.com/en-us/previous-versions/office/sharepoint-server/dn760924(v=office.15)

I've also tried with and without urllib.parse.quote, etc. (I haven't had the chance to test the fileoffset part yet because I don't get a working response.)

Here's my code for "big" files.

def uploadFile(bearer_token: str, tenant_name: str, site_name: str, folder_name: str, file_name: str) -> str:
    with open(file_name, "rb") as file:
        file_size = os.stat(file_name).st_size    
        big_file = file_size > CHUNK_SIZE
        if big_file:
            headers = {
                'Authorization': f'Bearer {bearer_token}',
                'Accept': 'application/json;odata=verbose',
                'Content-Type': 'application/octet-stream;odata=verbose'
                }        
                      
            GUID = uuid.uuid4()
            chunk_count = 1
            chunks_to_be_sent = file_size / CHUNK_SIZE
            continue_upload = True
            
            file_path = f"Shared%20Documents/{folder_name}/{file_name}"
            #file_path = urllib.parse.quote(file_path)
           
            web_host_url = f"{SHAREPOINT_HOST}"#/sites/{site_name}" #'<host web url>'
            # web_host_url = urllib.parse.quote(web_host_url)

            while continue_upload:
                big_file_chunk = file.read(CHUNK_SIZE)
                if not big_file_chunk or chunk_count >= chunks_to_be_sent: # Finish Upload
                    file_offset = (CHUNK_SIZE * (chunk_count - 1))# + 1
                    file_url = f"/sites/{site_name}/_api/web/getfilebyserverrelativeurl('{file_path}')/finishupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
                    continue_upload = False
                elif chunk_count < 2: # Start Upload (First Chunk)
                    file_url = f"/sites/{site_name}/_api/web/getfilebyserverrelativeurl('{file_path}')/startupload(uploadId=guid'{GUID}')"#?@target='{web_host_url}'"
                else: # continue Upload
                    file_offset = (CHUNK_SIZE * chunk_count)# + 1
                    file_url = f"/sites/{site_name}/_api/web/getfilebyserverrelativeurl('{file_path}')/continueupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
             
                url = urllib.parse.quote(file_url)
                conn = http.client.HTTPSConnection(SHAREPOINT_HOST)
                conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                resp = conn.getresponse()
                resp = resp.read()
                #icecream.ic(chunk_count, file_url, resp)
                print(chunk_count, " -- ", resp)
                chunk_count += 1

            
            # resp = conn.getresponse()
            # resp = resp.read()
            return resp.decode("utf-8")

I get this error message/ response: code: -2147024809 Parameter name: Specified value is not supported for the serverRelativeUrl parameter.

1 -- b'{"error":{"code":"-2147024809, System.ArgumentException","message":{"lang":"en-US","value":"serverRelativeUrl\\r\\nParameter name: Specified value is not supported for the serverRelativeUrl parameter."}}}'
2 -- b'{"error":{"code":"-2147024809, System.ArgumentException","message":{"lang":"en-US","value":"serverRelativeUrl\\r\\nParameter name: Specified value is not supported for the serverRelativeUrl parameter."}}}'


Solution

  • Here's my working solution. I think my mistake was applying urllib.parse.quote() to the URL, but I don't know for sure.

    There are two ways:

    1. GetFileByServerRelativePath(decodedurl='{URL}') e.g., url = f"/sites/{site_name}/_api/web/GetFileByServerRelativePath(decodedurl='/sites/{site_name}/{folder_path}/{file_name}')/startupload(uploadId=guid'{GUID}')"

    2. GetFileById('{unique_file_id}') e.g., file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/startupload(uploadId=guid'{GUID}')"

    General:

    • If there is no existing file on the server, you have to create one first; otherwise neither approach will work.
    • Using urllib.parse.quote() with GetFileByServerRelativePath(decodedurl='{URL}') does not work in my solution.
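
To illustrate the quoting point above, here is a small sketch of what urllib.parse.quote() does to such a URL. It only shows the string transformation; whether this is what SharePoint rejected is an assumption, not something the post confirms. The site and file names are the dummy values used elsewhere in this answer.

```python
from urllib.parse import quote

# Dummy endpoint path with an unencoded space in the library name.
path = (
    "/sites/DUMMYSUPPLIER/_api/web/GetFileByServerRelativePath("
    "decodedurl='/sites/DUMMYSUPPLIER/Shared Documents/FILE.EXT')"
)

# By default quote() keeps "/" but escapes quotes, parentheses, "=" and spaces.
encoded_default = quote(path)

# A value that already contains "%20" gets encoded a second time.
double_encoded = quote("Shared%20Documents")  # 'Shared%2520Documents'

# Widening `safe` leaves the OData punctuation intact and only escapes the space.
encoded_safe = quote(path, safe="/()'=,")

print(encoded_default)
print(double_encoded)
print(encoded_safe)
```

If the path is quoted at all, quoting it exactly once (starting from the decoded form) avoids the "%2520" double encoding shown in the second call.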

    Approach:

    1. Check for UniqueId of the file with ".../GetFileByServerRelativePath(decodedurl='/sites/{site_name}/{folder_path}/{file_name}')"
    2. If the file does not exist on the server, create new (empty, in-memory) file, upload it via ".../files/add(overwrite=true,url='{file_name}')"
    3. Generate a GUID -> GUID = uuid.uuid4()
    4. Read file in chunks -> big_file_chunk = file.read(CHUNK_SIZE)
    5. Start with ".../startupload(uploadId=guid'{GUID}')" and save the fileoffset from the response
    6. Continue with ".../continueupload(uploadId=guid'{GUID}',fileOffset={file_offset})" as long as you are not on the last chunk; also save the fileoffset from the response
    7. Finish the last chunk with ".../finishupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
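
Steps 4 to 7 can be sketched without any network access as a plan of which call carries which byte range. This is a minimal sketch under assumptions: plan_chunked_upload is a hypothetical helper (not part of any SharePoint API), and it computes offsets locally instead of reading them from the server responses as the steps above do. It refuses files that fit into a single chunk, since those can go through the plain files/add call.

```python
from typing import Iterator, Tuple


def plan_chunked_upload(file_size: int, chunk_size: int) -> Iterator[Tuple[str, int, int]]:
    """Yield (method, file_offset, length) for each request of a chunked upload."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if file_size <= chunk_size:
        raise ValueError("file fits into one chunk; use a single-request upload instead")

    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        if offset + length >= file_size:
            method = "finishupload"      # last chunk closes the upload session
        elif offset == 0:
            method = "startupload"       # first chunk opens the session (offset is not sent)
        else:
            method = "continueupload"    # everything in between
        yield method, offset, length
        offset += length


for step in plan_chunked_upload(file_size=25, chunk_size=10):
    print(step)
# ('startupload', 0, 10)
# ('continueupload', 10, 10)
# ('finishupload', 20, 5)
```

Each yielded offset is the number of bytes already sent, which is the value the fileOffset parameter expects in continueupload and finishupload.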

    Example URLs:

    host = "COMPANY.sharepoint.com"
    site_name = "DUMMYSUPPLIER"
    folder_path = "Shared%20Documents/TestFolder"
    file_name = "FILE.EXT"
    UniqueId = '3e6be666-8b5b-40fa-89ab-7cf9092a603d'
    guid = '57dd389e-5325-4cc4-95cd-55af2131ae67'
    
    url = "/sites/DUMMYSUPPLIER/_api/web/getfilebyid('3e6be666-8b5b-40fa-89ab-7cf9092a603d')/startupload(uploadId=guid'57dd389e-5325-4cc4-95cd-55af2131ae67')"
    url = f"/sites/DUMMYSUPPLIER/_api/web/GetFileByServerRelativePath(decodedurl='/sites/DUMMYSUPPLIER/Shared%20Documents/TestFolder/FILE.EXT')/startupload(uploadId=guid'57dd389e-5325-4cc4-95cd-55af2131ae67')"
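
The three getfilebyid URLs differ only in the method name and the fileOffset parameter, so they can be built by one small function. This is a sketch; build_upload_url is a hypothetical helper name, and the IDs are the dummy values from the example above.

```python
from typing import Optional


def build_upload_url(site_name: str, unique_file_id: str, upload_id: str,
                     method: str, file_offset: Optional[int] = None) -> str:
    """Build the relative REST URL for one step of a chunked upload."""
    base = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')"
    if method == "startupload":
        # The first call carries no fileOffset parameter.
        return f"{base}/startupload(uploadId=guid'{upload_id}')"
    if method in ("continueupload", "finishupload"):
        if file_offset is None:
            raise ValueError(f"{method} requires file_offset")
        return f"{base}/{method}(uploadId=guid'{upload_id}',fileOffset={file_offset})"
    raise ValueError(f"unknown method: {method}")


start_url = build_upload_url(
    "DUMMYSUPPLIER",
    "3e6be666-8b5b-40fa-89ab-7cf9092a603d",
    "57dd389e-5325-4cc4-95cd-55af2131ae67",
    "startupload",
)
print(start_url)
```

The same upload_id has to be reused for every call that belongs to one upload session.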
    

    Solution with GetFileById('{unique_file_id}'):

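    import http.client
    import io
    import json
    import os
    import urllib.parse
    import uuid

    # Assumed value (10 MB); the original post does not show how CHUNK_SIZE is defined.
    CHUNK_SIZE = 10 * 1024 * 1024


    def _uploadLargeFile(host: str, bearer_token: str, site_name: str, folder_path: str, file_name: str) -> http.client.HTTPResponse: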
        """
        Uploads a file in chunks to the server and overwrites existing files. (Recommended for large files; mandatory for file sizes >= 250MB.)
        """
        headers = {
                'Authorization': f'Bearer {bearer_token}',
                'Accept': 'application/json;odata=verbose',
                'Content-Type': 'application/octet-stream;odata=verbose'
                }                              
    
        unique_file_id = get_file_unique_id(host=host, bearer_token=bearer_token, site_name=site_name, folder_path=folder_path, file_name=file_name)
        if not unique_file_id:
            unique_file_id, resp = create_and_upload_empty_file(host=host, bearer_token=bearer_token, site_name=site_name, folder_path=folder_path, file_name=file_name)
    
        GUID = uuid.uuid4()
        file_size = os.stat(file_name).st_size
        with open(file_name, "rb") as file:
            continue_upload = True
            file_offset = 0 # While uploading in chunks the fileoffset can be calculated as follows: fileoffset = chunk_count * CHUNK_SIZE
            while continue_upload:
                big_file_chunk = file.read(CHUNK_SIZE)
    
                conn = http.client.HTTPSConnection(host=host)
                
                if not big_file_chunk or _is_last_chunk(file_size, file_offset, CHUNK_SIZE): # Finish Upload
                    file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/finishupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
                    url = urllib.parse.quote(file_url)
                    conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                    resp = conn.getresponse()                
                    continue_upload = False
                elif file_offset == 0: # Start Upload
                    file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/startupload(uploadId=guid'{GUID}')"
                    url = urllib.parse.quote(file_url)
                    conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                    resp = conn.getresponse()
                    resp_json = json.load(resp)
                    file_offset = int(resp_json['d']['StartUpload'])
                else: # continue Upload
                    file_url = f"/sites/{site_name}/_api/web/getfilebyid('{unique_file_id}')/continueupload(uploadId=guid'{GUID}',fileOffset={file_offset})"
                    url = urllib.parse.quote(file_url)
                    conn.request("POST", url, big_file_chunk, headers, encode_chunked=True)
                    resp = conn.getresponse()
                    resp_json = json.load(resp)
                    file_offset = int(resp_json['d']['ContinueUpload'])
    
                if continue_upload:
                    percent = (file_offset / file_size) * 100
                    print(f"Uploading file '{file_name}': {percent:.2f}% uploaded ({file_offset}/{file_size})")
                else:
                    print(f"Uploading file '{file_name}': {100.0:.1f}% uploaded ({file_size}/{file_size})")
            return resp
            
            
    def get_file_unique_id(host: str, bearer_token: str, site_name: str, folder_path: str, file_name: str) -> str | None:
        """
        Returns the file's UniqueId if it exists, otherwise None.
        """
    
        headers = {
        'Authorization': f'Bearer {bearer_token}',
        'Accept': 'application/json;odata=verbose',
        'Content-Type': 'application/octet-stream;odata=verbose'
        }
    
        url = f"/sites/{site_name}/_api/Web/GetFileByServerRelativePath(decodedurl='/sites/{site_name}/{folder_path}/{file_name}')"
    
        conn = http.client.HTTPSConnection(host)
        conn.request(method="POST", url=url, body=None, headers=headers)
        resp = conn.getresponse()
        resp_json = json.load(resp)
        
        if 'd' in resp_json:       
            return resp_json['d']['UniqueId']
        else:
            return None
    
            
    def create_and_upload_empty_file(host: str, bearer_token: str, site_name: str, folder_path: str, file_name: str) -> tuple[str | None, http.client.HTTPResponse]:
        """
        Creates an empty file (in-memory) and uploads it to the server. Returns the UniqueId of the uploaded file (or None if the upload failed) together with the response.
        """
        emptyfile = io.BytesIO(b"") # In-Memory file content
    
        headers = {
        'Authorization': f'Bearer {bearer_token}',
        'Accept': 'application/json;odata=verbose',
        'Content-Type': 'application/octet-stream;odata=verbose'
        }
    
        file_url = f"/sites/{site_name}/_api/web/getfolderbyserverrelativeurl('{folder_path}')/files/add(overwrite=true,url='{file_name}')"
        url = urllib.parse.quote(file_url)
        
        conn = http.client.HTTPSConnection(host)
        conn.request("POST", url, emptyfile, headers)
        resp = conn.getresponse()
        resp_json = json.load(resp)    
        if 'd' in resp_json:       
            return (resp_json['d']['UniqueId'], resp)
        else:
            return (None, resp)
    
    
    def _is_last_chunk(file_size, file_offset, chunk_size) -> bool:
        return file_size - file_offset <= chunk_size