Given a GitHub repository, I need to extract the graph representing its commits, branches, etc., so that I can process it with scripts.
I know that once I have cloned the repository, I can use the log command like this:
git log --graph --abbrev-commit --decorate --date=relative --all
but its output cannot be processed (or at least not easily).
After many fruitless attempts, I found this tool (git-dot), which generates a .dot file representing the graph of the given repository. From there it was easy to work with the graph, since I could import it into NetworkX by reading the .dot file. However, I don't think the tool works very well: I get fewer commits than the number shown on GitHub for the repository, too many cycles, and so on.
My question is whether there are other tools, or an output format of the log command, that would give me a graph I can process with my scripts. I hope you can help me.
git rev-list --all --parents
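will give you the raw data: each output line is a commit hash followed by the hashes of its parents, and you can annotate it however you want. Git ancestry graphs don't have cycles, because a commit's hash is computed over its parents' hashes, so a commit can never point at one of its own descendants.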
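If you would rather skip the .dot step entirely, the same output can be read straight into a script. The following is only a minimal sketch using the Python standard library; the function names parse_rev_list and read_repo_graph are made up for illustration, and it assumes git is on your PATH and that repo_path points at a local clone.

```python
import subprocess


def parse_rev_list(text):
    """Turn `git rev-list --parents` output into {commit: [parent, ...]}.

    Each non-empty line is a commit hash followed by zero or more
    parent hashes, separated by whitespace.
    """
    parents = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:
            parents[fields[0]] = fields[1:]
    return parents


def read_repo_graph(repo_path="."):
    """Run git in repo_path (hypothetical helper) and parse its output."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--all", "--parents"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_rev_list(result.stdout)
```

The resulting dictionary is a plain adjacency list (child to parents), so it should be straightforward to turn into edges for whatever graph library you use.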
Here are the basics of what that tool you found has to be doing:
git rev-list --all --parents \
| awk ' BEGIN{print "strict digraph git {"}
NF==1 {print "\""$1"\";"}
NF>1 { for (n=2; n<=NF; ++n) print "\""$1"\" -> \""$n"\";" }
END{print "}"}' \
| dot -Tpng -otest.png
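Since you mentioned missing commits and cycles, here is a rough sanity check you could run on the parsed data before trusting any tool's output. It is a sketch, not a definitive implementation: summarize_ancestry is a hypothetical name, the input is assumed to be a dict mapping each commit hash to a list of its parent hashes (the shape of the rev-list output above), and graphlib needs Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter


def summarize_ancestry(parents):
    """Count commits, roots and merges; raise CycleError on a cycle.

    `parents` maps each commit hash to the list of its parent hashes.
    """
    # static_order() raises CycleError if the mapping is not a DAG.
    # Parents come out before their children in the resulting order.
    order = list(TopologicalSorter(parents).static_order())
    return {
        "commits": len(parents),
        "roots": sum(1 for p in parents.values() if not p),
        "merges": sum(1 for p in parents.values() if len(p) > 1),
        "order": order,
    }
```

The "commits" figure can be compared with what git rev-list --all --count prints for the same clone; if a tool reports fewer commits or any cycle for data taken from git, the problem is most likely in the tool rather than in the repository.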