
Get Poetry script endpoints

Poetry has a nice way of running your Python entrypoints using poetry run <entrypoint>. What would be the best way to programmatically get a list of <entrypoints> from the pyproject.toml with Python or Bash?

For example, the pyproject.toml:

[tool.poetry]
name = "random-tool"
version = "1.0"

[tool.poetry.scripts]
tool = "tool.function:main"
other_tool = "other_tool.function:main"
all_the_tools = "other_tool.function:all"

Output being:

entrypoint = [ "tool", "other_tool", "all_the_tools" ]

Solution

OP has stated in a comment the desire to collect the endpoints into a structure that can be iterated over at a later point.

For this answer I'm going to focus on one bash/array idea ...

The first issue is to parse the desired data from the pyproject.toml file; my sample file (saved here as poetry.toml, so substitute pyproject.toml in the commands below):

$ cat poetry.toml
[tool.poetry]
name = "random-tool"
version = "1.0"

[tool.poetry.scripts]
tool = "tool.function:main"
other_tool = "other_tool.function:main"
all_the_tools = "other_tool.function:all"

[tool.poetry.other_stuff]         # add some gibberish to demonstrate extracting data from middle of file
a = "a"
b = "b"
c = "c"

One sed idea for parsing out the desired data:

$ sed -En '1,/\[tool.poetry.scripts\]/d;/^$/,$d;s/^([^ ]+) .*$/\1/p' poetry.toml
tool
other_tool
all_the_tools

Where:

-En - enable extended regex support (E) and suppress automatic printing of the pattern space (n)
1,/\[tool.poetry.scripts\]/d - deletes everything in the range from line 1 to the line containing the string [tool.poetry.scripts]
/^$/,$d - deletes everything in the range from the first blank line after that header (^$) to the end of the file ($)
s/^([^ ]+) .*$/\1/p - define the first capture group as start of line up to, but not including, the first space (([^ ]+)), replace the line with that capture group (\1), then print the result (p)

One idea using awk:

$ awk '
/\[tool.poetry.scripts\]/ { extract=1        # set "extract" flag
                            next             # go to next record
                          }

# when "extract" flag = 1:

extract                   { if ( NF == 0)    # if no fields (ie, blank line) then
                               exit          # exit processing
                            print $1         # else print first field
                          }
' poetry.toml
tool
other_tool
all_the_tools

Or as a one-liner:

$ awk '/\[tool.poetry.scripts\]/ { extract=1 ; next } extract { if ( NF == 0) { exit } ; print $1}' poetry.toml
tool
other_tool
all_the_tools
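Both awk commands above assume the scripts table ends at a blank line and that each key is followed by a space. One possible variant, offered as a sketch rather than a full TOML parser, stops at the next [section] header instead and splits on = so that key="value" also works. It assumes simple one-line entries. list_scripts is a hypothetical helper name, and the sample file is recreated (slightly altered to exercise those cases) in a temp file so the snippet runs on its own:

```shell
# Recreate a variation of the sample file in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[tool.poetry]
name = "random-tool"
version = "1.0"

[tool.poetry.scripts]
tool = "tool.function:main"
other_tool="other_tool.function:main"

all_the_tools = "other_tool.function:all"
[tool.poetry.other_stuff]
a = "a"
EOF

# list_scripts (hypothetical helper): print the keys of [tool.poetry.scripts].
list_scripts() {
    awk -F'=' '
        # entering the scripts table (a trailing comment on the header is allowed)
        /^[ \t]*\[tool\.poetry\.scripts\][ \t]*(#.*)?$/ { extract=1; next }

        # any other section header ends the table
        /^[ \t]*\[/ { extract=0 }

        # inside the table: skip blank lines and comments, print the key
        extract && NF > 1 && $1 !~ /^[ \t]*#/ {
            key = $1
            gsub(/^[ \t"]+|[ \t"]+$/, "", key)   # trim blanks and optional quotes
            print key
        }
    ' "$1"
}

scripts_found=$(list_scripts "$sample")
echo "$scripts_found"
rm -f "$sample"
```

For anything beyond flat key = "value" lines (multi-line strings, inline tables, dotted keys) a real TOML parser is the safer choice.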

From here there are several ways to get this data loaded into a bash array structure; one idea using mapfile:

# load sed output into array endpoints[]

$ mapfile -t endpoints < <(sed -En '1,/\[tool.poetry.scripts\]/d;/^$/,$d;s/^([^ ]+) .*$/\1/p' poetry.toml)

# display contents of the endpoints[] array

$ typeset -p endpoints
declare -a endpoints=([0]="tool" [1]="other_tool" [2]="all_the_tools")
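mapfile needs bash 4.0 or newer, so it is not available in older shells such as the bash 3.2 that macOS ships as /bin/bash. One fallback idea is a while/read loop; in this sketch emit_names is a hypothetical stand-in for the sed/awk command so the snippet runs without the sample file:

```shell
#!/usr/bin/env bash
# emit_names (hypothetical): stand-in for the sed/awk output shown above.
emit_names() {
    printf '%s\n' tool other_tool all_the_tools
}

endpoints=()
while IFS= read -r name
do
    [ -n "$name" ] && endpoints+=("$name")    # skip empty lines
done < <(emit_names)

# display contents of the endpoints[] array
typeset -p endpoints
```

Feeding the loop from process substitution (rather than a pipe) keeps it in the current shell, so endpoints[] is still populated after the loop ends.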

At this point the data has been loaded into the endpoints[] array and can be iterated over at a later point, e.g.:

for i in "${!endpoints[@]}"
do
    echo "endpoints[${i}] = ${endpoints[${i}]}"
done

Which generates:

endpoints[0] = tool
endpoints[1] = other_tool
endpoints[2] = all_the_tools
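To tie this back to poetry run <entrypoint> from the question, one idea is to loop over the array values and build the command for each. This sketch is a dry run that only prints the commands; run_all is a hypothetical helper name, and the array is filled by hand here to match the contents loaded above:

```shell
#!/usr/bin/env bash
# Same contents as the endpoints[] array built above.
endpoints=(tool other_tool all_the_tools)

# run_all (hypothetical helper): dry run that prints each command;
# remove "echo" to actually execute the entrypoints.
run_all() {
    local ep
    for ep in "${endpoints[@]}"
    do
        echo poetry run "$ep"
    done
}

run_all
```

Which generates:

poetry run tool
poetry run other_tool
poetry run all_the_tools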