
How to delete GitLab job logs automatically after a certain time?


In my GitLab repository, job logs consume a lot of space, and I have to delete the jobs manually every week using curl in order to remove the associated logs.

I have gone through a couple of articles and concluded that:

  1. We can't delete GitLab jobs or pipelines automatically after some time; this is a manual process only, either from the UI or using curl.

  2. We can delete artifacts using expire_in.

But is there any way to delete or expire pipeline and job logs in GitLab in some automated way, as we do for artifacts?


Solution

  • This is not possible on GitLab.com, as stated in the documentation:

    There isn’t a way to automatically expire old job logs

    Self-hosted GitLab administrators, however, do have the option to simply delete the log files from the filesystem (or remote storage, if configured). This could be set up in a cron job, for example:

    find /var/opt/gitlab/gitlab-rails/shared/artifacts -name "job.log" -mtime +60 -delete
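
Before letting a cron job delete anything, it may help to list what would be removed. The following is a minimal Python sketch of a dry run that roughly mirrors the `find` command above; the function name `find_old_job_logs` is hypothetical, and the age cutoff is computed in seconds, so it can differ from `-mtime +60` by up to a day.

```python
import time
from pathlib import Path
from typing import List, Optional


def find_old_job_logs(root: str, max_age_days: int,
                      now: Optional[float] = None) -> List[Path]:
    """Return job.log files under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    old_logs = []
    for path in Path(root).rglob("job.log"):
        # Only regular files named exactly job.log are considered
        if path.is_file() and path.stat().st_mtime < cutoff:
            old_logs.append(path)
    return sorted(old_logs)
```

Calling `find_old_job_logs("/var/opt/gitlab/gitlab-rails/shared/artifacts", 60)` and printing the result shows the candidates without touching them.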
    

    An alternative (that would work on GitLab.com or self-hosted) would be to set up an automation script (maybe as a scheduled GitLab pipeline?) that uses the GitLab API to locate old pipelines and delete them.

    Pseudo code:

    from datetime import datetime, timedelta, timezone
    from typing import List
    
    def get_all_projects() -> List[Project]:
        # https://docs.gitlab.com/ee/api/projects.html#list-all-projects
        ...
    
    def get_project_pipelines(project: Project) -> List[Pipeline]:
        # https://docs.gitlab.com/ee/api/pipelines.html#list-project-pipelines
        # https://docs.gitlab.com/ee/api/pipelines.html#get-a-single-pipeline
        # (the single-pipeline response includes finished_at)
        ...
    
    # Use an aware UTC datetime so it compares cleanly with API timestamps
    threshold = datetime.now(timezone.utc) - timedelta(days=60)
    for project in get_all_projects():
        for pipeline in get_project_pipelines(project):
            # Skip pipelines that have not finished yet (finished_at is empty)
            if pipeline.finished_at and pipeline.finished_at < threshold:
                print('deleting', pipeline.id, 'from project', project.id)
                pipeline.delete()
    
    

    You could run such a script on a schedule (say, in a scheduled pipeline itself?) to regularly remove your old pipelines.
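
For reference, the pseudo code above could be fleshed out roughly as follows, using only the Python standard library. This is a sketch under several assumptions rather than a tested tool: the token is read from a `GITLAB_TOKEN` environment variable (a name chosen here for illustration), projects are limited to those the token's user is a member of via `membership=true`, pagination follows the `X-Next-Page` response header, and the helper names (`parse_time`, `is_expired`, `request`, `paginate`) are hypothetical. The token's user also needs permission to delete pipelines, which as far as I know means the Owner role on the project.

```python
import json
import os
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone
from typing import Iterator, Optional

API_URL = "https://gitlab.com/api/v4"  # assumption: change for a self-hosted instance
MAX_AGE_DAYS = 60


def parse_time(value: Optional[str]) -> Optional[datetime]:
    """Parse a GitLab timestamp such as '2024-01-31T12:00:00.000Z'."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_expired(pipeline: dict, threshold: datetime) -> bool:
    """True if the pipeline finished before the threshold; unfinished ones are kept."""
    finished = parse_time(pipeline.get("finished_at"))
    return finished is not None and finished < threshold


def request(method: str, path: str, token: str, **params):
    """Call the API; return (decoded JSON or None, value of the X-Next-Page header)."""
    url = f"{API_URL}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(url, method=method, headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
        return (json.loads(body) if body else None), resp.headers.get("X-Next-Page")


def paginate(path: str, token: str, **params) -> Iterator[dict]:
    """Yield every item of a paginated list endpoint."""
    page = "1"
    while page:  # the header is empty on the last page
        items, page = request("GET", path, token, page=page, per_page=100, **params)
        yield from items


def main(token: str) -> None:
    threshold = datetime.now(timezone.utc) - timedelta(days=MAX_AGE_DAYS)
    for project in paginate("/projects", token, membership="true"):
        base = f"/projects/{project['id']}/pipelines"
        # Collect IDs first so that deleting does not shift the pages being read
        ids = [p["id"] for p in list(paginate(base, token))]
        for pipeline_id in ids:
            # The single-pipeline endpoint is queried to obtain finished_at
            details, _ = request("GET", f"{base}/{pipeline_id}", token)
            if is_expired(details, threshold):
                print("deleting", pipeline_id, "from project", project["id"])
                request("DELETE", f"{base}/{pipeline_id}", token)


if __name__ == "__main__":
    gitlab_token = os.environ.get("GITLAB_TOKEN")
    if gitlab_token:
        main(gitlab_token)
    else:
        print("Set GITLAB_TOKEN to run the cleanup.")
```

Fetching every pipeline individually is slow on large projects; it is done here only to stay close to the `finished_at` check in the pseudo code. Commenting out the `DELETE` call on a first run gives a dry run that only prints what would be removed.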