
Get Poetry script endpoints


Poetry has a nice way of running your Python entrypoints using poetry run <entrypoint>. What would be the best way to programmatically get a list of <entrypoints> from the pyproject.toml with Python or Bash?

For example, the pyproject.toml:

[tool.poetry]
name = "random-tool"
version = "1.0"

[tool.poetry.scripts]
tool = "tool.function:main"
other_tool = "other_tool.function:main"
all_the_tools = "other_tool.function:all"

Output being:

entrypoint = [ "tool", "other_tool", "all_the_tools" ]

Solution

  • OP has stated in a comment the desire to collect the endpoints into a structure that can be iterated over at a later point.

    For this answer I'm going to focus on one bash/array idea ...

    The first issue is to parse the desired data out of the pyproject.toml file; my sample file (saved here as poetry.toml):

    $ cat poetry.toml
    [tool.poetry]
    name = "random-tool"
    version = "1.0"
    
    [tool.poetry.scripts]
    tool = "tool.function:main"
    other_tool = "other_tool.function:main"
    all_the_tools = "other_tool.function:all"
    
    [tool.poetry.other_stuff]         # add some gibberish to demonstrate extracting data from middle of file
    a = "a"
    b = "b"
    c = "c"
    

    One sed idea for parsing out the desired data:

    $ sed -En '1,/\[tool.poetry.scripts\]/d;/^$/,$d;s/^([^ ]+) .*$/\1/p' poetry.toml
    tool
    other_tool
    all_the_tools
    

    Where:

    • -En - enable extended regex support (E) and suppress automatic printing of the pattern space (n)
    • 1,/\[tool.poetry.scripts\]/d - deletes everything in the range from line 1 to the line containing the string [tool.poetry.scripts] (the unescaped dots match any character, which is harmless here)
    • /^$/,$d - deletes everything in the range from the first blank line (^$) after that header to the end of the file ($)
    • s/^([^ ]+) .*$/\1/p - capture everything from the start of the line up to, but not including, the first space (([^ ]+)), replace the line with that capture group (\1), then print the result (p)
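
    Since the question also allows Python, here is a rough sketch of the same three steps (skip up to the header, stop at the first blank line, keep the text before the first space). The function name script_names and the inline sample string are made up for this illustration, and the header is compared exactly rather than as a regex, so treat it as an approximation of the sed logic rather than a drop-in replacement:

```python
def script_names(lines, header="[tool.poetry.scripts]"):
    """Collect key names from the section that follows `header`."""
    names = []
    in_section = False
    for line in lines:
        stripped = line.strip()
        if not in_section:
            # Skip everything up to and including the header line.
            in_section = stripped == header
            continue
        if not stripped:
            # The first blank line after the header ends the section.
            break
        # Keep the text before the first space, like the sed capture group.
        names.append(stripped.split(" ", 1)[0])
    return names


# Hypothetical sample mirroring the file shown earlier.
sample = """\
[tool.poetry]
name = "random-tool"
version = "1.0"

[tool.poetry.scripts]
tool = "tool.function:main"
other_tool = "other_tool.function:main"
all_the_tools = "other_tool.function:all"

[tool.poetry.other_stuff]
a = "a"
"""

print(script_names(sample.splitlines()))
```

    Like the sed version, this assumes the section ends with a blank line and that every key is followed by a space.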

    One idea using awk:

    $ awk '
    /\[tool.poetry.scripts\]/ { extract=1        # set "extract" flag
                                next             # go to next record
                              }
    
    # when "extract" flag = 1:
    
    extract                   { if ( NF == 0)    # if no fields (ie, blank line) then
                                   exit          # exit processing
                                print $1         # else print first field
                              }
    ' poetry.toml
    tool
    other_tool
    all_the_tools
    

    Or as a one-liner:

    $ awk '/\[tool.poetry.scripts\]/ { extract=1 ; next } extract { if ( NF == 0) { exit } ; print $1}' poetry.toml
    tool
    other_tool
    all_the_tools
    


    From here there are several ways to get this data loaded into a bash array structure; one idea using mapfile:

    # load sed output into array endpoints[]
    
    $ mapfile -t endpoints < <(sed -En '1,/\[tool.poetry.scripts\]/d;/^$/,$d;s/^([^ ]+) .*$/\1/p' poetry.toml)
    
    # display contents of the endpoints[] array
    
    $ typeset -p endpoints
    declare -a endpoints=([0]="tool" [1]="other_tool" [2]="all_the_tools")
    


    At this point the data has been loaded into the endpoints[] array and can be iterated over at a later point, eg:

    for i in "${!endpoints[@]}"
    do
        echo "endpoints[${i}] = ${endpoints[${i}]}"
    done
    

    Which generates:

    endpoints[0] = tool
    endpoints[1] = other_tool
    endpoints[2] = all_the_tools