python, amazon-web-services, amazon-s3, gitlab, gitlab-api

GitLab user list API with Python and Amazon S3


I'm trying to pull a complete user list from our private GitLab server instance and store it in an S3 bucket so we can reference it whenever we need to. Eventually I will have some form of Lambda/cfn deleting the file and regenerating it every week to keep it up to date. I'm not very experienced with Python, and this is what I have so far:

import json

import boto3
import urllib3

sess = boto3.Session(profile_name="sso-profile-here")

s3_client = sess.client("s3")
bucket_name = "user-statistics"

http = urllib3.PoolManager()


# No trailing slash: the URL template below already adds "/users/".
baseuri = "https://git.tools.dev.mycompany.net/api/v4"


access_token = "access-token-code"



def get_gitlab_users(access_token=access_token, baseuri=baseuri):
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer {}".format(access_token),
    }
    url = "{}/users/?per_page=100&active=true&without_project_bots=true&next_page=x-next-page".format(
        baseuri
    )
    req = http.request(method="GET", url=url, headers=headers)
    result = json.loads(req.data)
    s3_client.put_object(
        Bucket=bucket_name, Key="get_users_gitlab.json", Body=json.dumps(result)
    )


if __name__ == "__main__":
    get_gitlab_users(access_token=access_token, baseuri=baseuri)

What I would like to do is pull the users from every page, not just the first one, and also store the result in a neater format in the S3 bucket. When I download the file from the bucket, the JSON is all on one line and really hard to read, and I'm not sure how to improve it. Can anyone suggest anything?

Please also ignore the fact that my access token is directly in the code here. It's only for testing at this stage, and I will make sure it's not stored in the code later.

Thanks in advance for any suggestions.


Solution

  • You can try the python-gitlab package instead of calling the API through urllib3 yourself. It handles authentication and pagination for you, so getting the user information should be a lot easier:

    import gitlab
    
    baseuri = "https://git.tools.dev.mycompany.net"
    access_token = "access-token-code"
    
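    gl = gitlab.Gitlab(baseuri, private_token=access_token)
    
    # list() returns only the first page by default; get_all=True fetches
    # every page (on older python-gitlab releases the argument is all=True).
    users = [user.asdict() for user in gl.users.list(get_all=True)]
    users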
    # [{'id': 1,
    #  'username': 'username1',
    #  'name': 'name1',
    #  'state': 'active',
    #  'avatar_url': 'https://avatar.com/1',
    #  'web_url': 'https://git.tools.dev.mycompany.net/username1'},
    # ...]
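
If you would rather stay with urllib3 and boto3, the two things asked about (reading every page and storing readable JSON) could be handled roughly as follows. This is a minimal sketch under a few assumptions, not a tested drop-in: it assumes the instance uses GitLab's default offset pagination, where each response carries an `X-Next-Page` header that is empty on the last page (`next_page` is not a query parameter the users API documents), and it relies on `json.dumps(..., indent=2)` to make the stored file readable. The names `fetch_page`, `get_all_users`, `to_pretty_json`, `upload_users` and the `GITLAB_TOKEN` environment variable are made up for illustration.

```python
import json
import os


def fetch_page(url, headers):
    """GET one page with urllib3 and return (parsed JSON, response headers)."""
    import urllib3  # imported lazily so the paging logic can be exercised offline

    resp = urllib3.PoolManager().request("GET", url, headers=headers)
    if resp.status != 200:
        raise RuntimeError(f"GitLab returned HTTP {resp.status} for {url}")
    return json.loads(resp.data), resp.headers


def get_all_users(base_url, token, fetch=fetch_page, per_page=100):
    """Collect users from every page by following the X-Next-Page header."""
    headers = {"Authorization": f"Bearer {token}"}
    users = []
    page = "1"
    while page:
        url = (
            f"{base_url.rstrip('/')}/users?per_page={per_page}&page={page}"
            "&active=true&without_project_bots=true"
        )
        items, resp_headers = fetch(url, headers)
        users.extend(items)
        # Assumption: offset pagination; the header is empty on the last page.
        page = resp_headers.get("X-Next-Page", "")
    return users


def to_pretty_json(users):
    """Serialize with indentation so the stored file is human-readable."""
    return json.dumps(users, indent=2, sort_keys=True)


def upload_users(users, bucket, key="get_users_gitlab.json", s3_client=None):
    """Write the formatted user list to S3, replacing any previous object."""
    if s3_client is None:
        import boto3  # lazy import; pass your own client to use an SSO profile

        s3_client = boto3.client("s3")
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=to_pretty_json(users).encode("utf-8"),
        ContentType="application/json",
    )


def main():
    # Hypothetical variable name; keeps the token out of the source file.
    token = os.environ["GITLAB_TOKEN"]
    users = get_all_users("https://git.tools.dev.mycompany.net/api/v4", token)
    upload_users(users, "user-statistics")
```

To run it, call `main()` under the usual `if __name__ == "__main__":` guard. If you need the SSO profile from the question, create the client with `boto3.Session(profile_name=...).client("s3")` and pass it as `s3_client`. Since `put_object` overwrites an existing key, a weekly job can simply upload again without deleting the old file first.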