Tags: nginx, docker, gitlab, large-files, git-lfs

How can I increase the GitLab CE LFS file size limit so as not to get 500 server errors?


I'm using the excellent sameersbn/gitlab image to set up a custom GitLab server for my job.

So I have a ridiculous scenario: I'm using Git LFS to store files in the 10-20 GB range with GitLab CE v8.12.5, but I'm seeing 500 server errors all over the place and my uploads cannot finish.

Question: Does anyone know how I can increase the server-side limits?

Note: this is not a 413 nginx issue; I've set client_max_body_size to 500G, so nginx should be forwarding the uploads to GitLab just fine.

If any more info is required (e.g. log files), I will gladly provide it; just leave a comment.

Update 1:

There seems to be a related GitLab issue describing this same problem.

Update 2:

Other resources which are relevant:

For now my hypothesis is that there is a timeout somewhere in the chain of proxy servers in the Docker container.

git bash: error: RPC failed; result=18, HTTP code = 200 | 1 KiB/s

https://github.com/gitlabhq/gitlabhq/issues/694

So here's something I just noticed: the Docker-mapped device /dev/dm-7 becomes 100% full around the same time that GitLab errors out with a 500.

Now I'm starting to believe that this is not a GitLab problem but a Docker problem, and that GitLab is simply running out of space.
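
A quick way to check this is to watch free space inside the container while an upload is in progress (a sketch; the container name gitlab is an assumption from my setup):

    # refresh every 5 seconds; "gitlab" is the container name in my setup
    watch -n 5 'sudo docker exec gitlab df -h /'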

Thanks for your time, and cheers.


Solution

  • Problem 1

    The first major issue was the following error in ~/log/gitlab-workhorse.log

    error: handleStoreLfsObject: copy body to tempfile: unexpected EOF

    This error had nothing to do with GitLab itself; rather, Docker's latest version shrank the default devicemapper container size from 100 GB per container to 10 GB per container. Consequently, whenever I tried to upload a file larger than 10 GB, GitLab would attempt to create a temporary file the size of the upload (in my case 30 GB) and then blow up with the above error message for lack of space in the Docker container.
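
    To confirm you are hitting the same limit, docker info reports the configured base size when the devicemapper storage driver is in use (the exact field wording may vary with your Docker version):

        sudo docker info | grep -i "base device size"
        # e.g. "Base Device Size: 10.74 GB" before the change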

    I followed this excellent guide on how to increase the size of my container, but it basically boils down to:

        sudo `which docker-compose` down
    

    to stop the running container.

        sudo vim /etc/systemd/system/docker.service
    

    and appending

        --storage-opt dm.basesize=100G
    

    as the default size for new base images. Now, since there currently seems to be an issue with Docker, you have to

       sudo `which docker` rmi gitlab
    

    assuming your image is called gitlab, and

       sudo `which docker-compose` up
    

    to re-pull the image and have it be created with the proper size.

    If this still doesn't work, try sudo systemctl restart docker.service, as this seems to help when Docker doesn't do what you asked it to.
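
    For reference, the relevant part of /etc/systemd/system/docker.service ended up looking roughly like this (a sketch only; the daemon binary and any other flags depend on your Docker version and existing unit file):

        # /etc/systemd/system/docker.service (sketch)
        [Service]
        ExecStart=/usr/bin/dockerd --storage-opt dm.basesize=100G

    Also note that systemd only picks up changes to a unit file after sudo systemctl daemon-reload, so it is worth running that before restarting Docker.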

        sudo docker exec -it gitlab df -h
    

    This should produce something like:

    Filesystem                                                                                        Size  Used Avail Use% Mounted on
    
    /dev/mapper/docker-253:1-927611-353ffe52e1182750efb624c81a3a040d5c054286c6c5b5f709bd587afc92b38f  100G  938M  100G   1% /
    

    I'm not 100% certain that all of these settings were necessary, but in solving Problem 2, below, I ended up having to set these as well in the docker-compose.yml

        - GITLAB_WORKHORSE_TIMEOUT=60m0s
        - UNICORN_TIMEOUT=3600
        - GITLAB_TIMEOUT=3600
    

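    For context, these sit in the environment section of the gitlab service in docker-compose.yml, roughly like this (a sketch; the image tag, service name, and the rest of the environment list will differ in your setup):

        gitlab:
          image: sameersbn/gitlab:8.12.5
          environment:
            # timeouts raised to survive very large LFS uploads
            - GITLAB_WORKHORSE_TIMEOUT=60m0s
            - UNICORN_TIMEOUT=3600
            - GITLAB_TIMEOUT=3600
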
  • Problem 2

    In addition to resizing the container from 10 GB to 100 GB, I had to change the nginx configuration inside the running instance.

    The reason for this was that the file size was so large (30 GB+) and the network speed so slow (10 MB/s) that the upload took longer than nginx's default proxy timeout and failed with a 504 Gateway Timeout. (30 GB at 10 MB/s is roughly 3000 seconds, i.e. close to an hour, while nginx's proxy timeouts default to 60 seconds.)

        sudo docker exec -it gitlab /bin/bash
    

    to get a shell in the gitlab container, then edit /etc/nginx/nginx.conf with vim.tiny:

        http {
        ...
          client_max_body_size 500G;
          proxy_connect_timeout       3600;
          proxy_send_timeout          3600;
          proxy_read_timeout          3600;
          send_timeout                3600;
        ...
        }
    

    Then I restarted nginx. Sadly, service nginx restart did not work and I had to:

        service nginx stop
        service nginx start
    

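    A quick sanity check after editing the config (and before restarting) is nginx's built-in config test, run from inside the container (assuming nginx is on the PATH there):

        sudo docker exec -it gitlab nginx -t
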
    Note: I have a reverse proxy running on this server which catches all HTTP requests, so I'm not certain, but I believe that all the settings I added to the container's nginx config have to be duplicated on the proxy side.
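
    For what it's worth, the proxy-side equivalent would look something like this (a sketch only; the server name and upstream port are assumptions based on the typical sameersbn/gitlab port mapping):

        server {
            listen 80;
            server_name gitlab.example.com;

            client_max_body_size 500G;

            location / {
                proxy_pass http://127.0.0.1:10080;
                proxy_connect_timeout 3600;
                proxy_send_timeout    3600;
                proxy_read_timeout    3600;
                send_timeout          3600;
            }
        }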

    If I've left a step out, or you would like some clarification on how exactly to do a certain part of the procedure, please leave a comment and ask. This was a royal pain and I hope this solution helps someone.