
Downloading a PDF file from SharePoint with curl does not work reliably from a Docker container


I run Docker Desktop on Windows 11.

When I use curl (or wget) on my laptop (the host) to download a PDF file from SharePoint, it works almost every time: in about a hundred tests, the file came back corrupted only once. The command is:

curl --location "https://xxx.sharepoint.com/sites/xxx/_api/web/GetFolderByServerRelativeUrl('/sites/xxx/Shared%20Documents/xxx')/Files('file.pdf')/$value" --header "Authorization: Bearer xxx" --output file.pdf

When I run the same curl command inside my Docker container (Debian), SharePoint responds, but the PDF file received in the container is corrupted and smaller than expected, for example 3.5 KB instead of 3.9 KB. Sometimes, however, the container receives more than expected, for example 5.4 KB for a 3.9 KB file. This happens regularly and is really confusing.

Also, as I said, the curl command works fine from the host in the Windows shell. With MobaXterm, however, I see the same problem as in the Docker container: the downloaded file is corrupted, perhaps because of encoding, the file system, or something else I am not aware of.

My Dockerfile:

FROM gradle:8.5-jdk21 AS build
WORKDIR /usr/local/app/myApp
COPY build.gradle.kts settings.gradle.kts ./
COPY . ./
RUN gradle wrapper --gradle-version 8.5
RUN ./gradlew bootJar
FROM openjdk:21-jdk-slim
LABEL maintainer="[email protected]"
EXPOSE 8081
COPY --from=build /usr/local/app/myApp/build/libs/*.jar myApp.jar
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/myApp.jar"]

I tried changing the MTU (in my docker-compose file and on my vEthernet (WSL (Hyper-V firewall)) network interface), but it had no effect. My connection is also reasonably good: around 100 Mbps up and down, with approximately 40 ms of latency.

Next I will look at the TCP packets with Wireshark, but I'm not sure it will help.

Thanks for your help.


Solution

  • Solved:

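    When calling SharePoint with curl from MobaXterm or a Linux container, the '$' in the URL has to be escaped with a backslash. In a Unix-style shell such as bash, $value inside double quotes is expanded as a shell variable; since no such variable is set, it expands to nothing and the URL sent to SharePoint ends with a bare '/'. The Windows command prompt (cmd.exe) does not treat '$' specially, which would explain why the same command worked from the host.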

    Thus, the following command works fine:

    curl --location "https://xxx.sharepoint.com/sites/xxx/_api/web/GetFolderByServerRelativeUrl('/sites/xxx/Shared%20Documents/xxx')/Files('file.pdf')/\$value" --header "Authorization: Bearer xxx" --output file.pdf
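
The effect of the backslash can be checked locally, without contacting SharePoint. The following is a minimal sketch: the host example.invalid and the shortened path are placeholders chosen for illustration, not the real endpoint. It prints the URL that curl would receive with and without the escape.

```shell
#!/bin/sh
# Placeholder URL for illustration only; it is not the real SharePoint tenant.
base="https://example.invalid/_api/web/Files('file.pdf')"

# Empty here; an unset variable expands to nothing as well under default
# shell options, which is the situation in a fresh shell session.
value=""

unescaped="$base/$value"   # the shell substitutes $value before curl ever runs
escaped="$base/\$value"    # the backslash keeps the literal text $value

printf 'unescaped: %s\n' "$unescaped"
printf 'escaped:   %s\n' "$escaped"
```

The first URL ends with a bare '/', the second ends with the literal $value segment that the command above relies on.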

    Regarding the 1% of failures I noticed when running curl from Windows, they were probably caused by an expired token rather than by my network or internet speed.

    Moreover, unless I'm wrong, this SharePoint endpoint needs fixing: it seems to return HTTP status 200 (instead of 4xx or 5xx) even when it does not return the requested file correctly because of a URL problem or some other error.
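
Since the status code reportedly cannot be trusted here, one possible safeguard is to inspect the downloaded file itself. The sketch below assumes that checking for the "%PDF-" signature at the start of the file is enough for this use case. The is_pdf helper is hypothetical (it is not part of curl or SharePoint), and the two sample files are created locally for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the file starts with "%PDF-",
# the signature that opens a PDF file.
is_pdf() {
    [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}

# Local demonstration with two throwaway files; no network access is needed.
tmpdir=$(mktemp -d)
printf '%%PDF-1.7\n' > "$tmpdir/good.pdf"              # starts like a real PDF
printf '<?xml version="1.0"?>\n' > "$tmpdir/bad.pdf"   # some other response body

for f in "$tmpdir/good.pdf" "$tmpdir/bad.pdf"; do
    if is_pdf "$f"; then
        echo "${f##*/}: looks like a PDF"
    else
        echo "${f##*/}: not a PDF, the download should be treated as failed"
    fi
done

rm -rf "$tmpdir"
```

In a real script, the same check could run right after the curl command, for example by calling is_pdf file.pdf and exiting with an error when it fails.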

    Thank you for trying to help.