
Get page id of a confluence page from its public url


How can I get a Confluence page_id given a page_url? For example:

If this is the display URL: https://confluence.som.yale.edu/display/SC/Finding+the+Page+ID+of+a+Confluence+Page

I want to get its page_id using the Confluence REST API.



Solution

  • Do you use atlassian-python-api?

    In that case you can parse your URL to get the Confluence space key (SC) and the page title (Finding the Page ID of a Confluence Page), then call confluence.get_page_id(space, title).

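    from getpass import getpass

    from atlassian import Confluence

    page_url = "https://confluence.som.yale.edu/display/SC/Finding+the+Page+ID+of+a+Confluence+Page"

    # ask for the credentials used to log in to Confluence
    user = input("Login: ")
    pwd = getpass("Password: ")

    confluence = Confluence(
            url='https://confluence.som.yale.edu/',
            username=user,
            password=pwd)

    # the last two path segments of a display URL are the space key and the title
    space, title = page_url.split("/")[-2:]
    title = title.replace("+", " ")

    page_id = confluence.get_page_id(space, title)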
    

    Note that when the title contains a special character (+, ü, ä, ...), the page URL already contains the id, like this: https://confluence.som.yale.edu/pages/viewpage.action?pageId=1234567890, so you might want to check for that first.
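
A minimal sketch of that check, using only the standard library. The helper name `split_confluence_url` is hypothetical, and the sketch assumes the URL has one of the two shapes mentioned in this answer (`/display/<space>/<title>` or `/pages/viewpage.action?pageId=<id>`). Other URL shapes raise an error.

```python
from urllib.parse import parse_qs, unquote_plus, urlsplit


def split_confluence_url(url):
    """Return (page_id, space, title); fields not present in the URL are None.

    Hypothetical helper: it only understands the two URL shapes discussed above.
    """
    parts = urlsplit(url)

    # Case 1: the id is already in the query string (?pageId=...).
    page_ids = parse_qs(parts.query).get("pageId")
    if page_ids:
        return page_ids[0], None, None

    # Case 2: /display/<space>/<title>. Split the path first, then decode each
    # segment, so an encoded "/" inside a title cannot shift the segments.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) >= 3 and segments[-3] == "display":
        space = unquote_plus(segments[-2])
        title = unquote_plus(segments[-1])
        return None, space, title

    raise ValueError(f"Unrecognised Confluence URL: {url}")
```

If a page id comes back it can be used directly. Otherwise the space and title can be passed to confluence.get_page_id(space, title) as shown above.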

    EDIT: here is a version of what your function could look like:

    from atlassian import Confluence
    import re
    import urllib.parse

    # regex pattern to match pageId if it is already in the url
    page_id_in_url_pattern = re.compile(r"\?pageId=(\d+)")

    def get_page_id_from_url(confluence, url):
        # unquote the url to decode percent-encoded characters such as '%C3%BC' (ü)
        page_url = urllib.parse.unquote(url)
        space, title = page_url.split("/")[-2:]

        # the url already carries the id: .../pages/viewpage.action?pageId=...
        match = page_id_in_url_pattern.search(title)
        if match:
            return match.group(1)

        # otherwise look the page up by space key and title
        title = title.replace("+", " ")
        return confluence.get_page_id(space, title)
    
    
    
    if __name__ == "__main__":
        from getpass import getpass
        user = input('Login: ')
        pwd = getpass('Password: ')
    
        page_url = "https://confluence.som.yale.edu/display/SC/Finding+the+Page+ID+of+a+Confluence+Page"
    
        confluence = Confluence(
                url='https://confluence.som.yale.edu/',
                username=user,
                password=pwd)
    
        print(get_page_id_from_url(confluence, page_url))
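
Since the question asks about the REST API itself, here is a rough sketch of the same lookup without atlassian-python-api. It assumes a Confluence Server/Data Center style endpoint at `<base>/rest/api/content`, which accepts `spaceKey` and `title` query parameters and returns matches in a `results` list. Cloud sites usually put a `/wiki` prefix before `/rest`. The function names are hypothetical, and HTTP basic auth is an assumption that may not match how your site is configured.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_lookup_url(base_url, space, title):
    """Build the content-search URL for a space key and page title."""
    query = urlencode({"spaceKey": space, "title": title})
    return f"{base_url.rstrip('/')}/rest/api/content?{query}"


def first_page_id(payload):
    """Return the id of the first result in a content-search response, or None."""
    results = payload.get("results", [])
    return results[0]["id"] if results else None


def fetch_page_id(base_url, space, title, user, pwd):
    """Query the server using HTTP basic auth (an assumption; your site may differ)."""
    token = base64.b64encode(f"{user}:{pwd}".encode()).decode()
    request = Request(
        build_lookup_url(base_url, space, title),
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urlopen(request) as response:
        return first_page_id(json.load(response))
```

With the example from the question, the call would be fetch_page_id("https://confluence.som.yale.edu", "SC", "Finding the Page ID of a Confluence Page", user, pwd). It returns None when no page matches that space and title.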