"Logbook" for scientific simulations

I'm using C++ to run scientific simulations. Because the number of parameters keeps growing, I have found it necessary to keep a "logbook": a file that stores all the information about a given simulation (not the output, but the parameters that led to that output and the corresponding git commit).

I've searched around, and XML seems like a good option, since it can easily be parsed with Python, Mathematica or other analysis software.

I wonder if anyone agrees with this, or has a better option.

Also, how can I get the current git commit so that I can save it in the logbook?

Solution

In general I agree with you:

XML is widely deployed, and there are tonnes of tools to bring the logbook into shape.
It's flexible: you can add attributes later without breaking old "scripts".
It's file based: one document, one file, and you can use the filesystem to organise logbook "pages".
It's file based and plain text, so tools like find, grep and diff (at a push) can help you in urgent cases.
It's your own solution: you're free to track any information you need, and if you deem it essential to associate sunlight hours with the parameters, do it.

That being said, I should add that the right storage format depends on the typical use case. If you need to find out why the optimiser cannot find any solutions every Monday after a full moon, it will be hard (well, harder) to come up with the necessary XPath/XQuery hackery, because your ad hoc structure follows no fixed schema.

Here are all the downsides I can think of:

It's verbose: XML documents in my area tend to be 20 to 40 GB, whereas the same information could probably be represented in about 500 MB.
It's slow (depending on how you use it): RDBMSs and even NoSQL solutions employ techniques like indexing to make reading faster.
It's flexible, and that's also a downside: if you add two new attributes per day, you will end up with nothing but marked-up free text, which will need thorough polishing before you can import it into structure-focussed systems (SQL, CSV, JSON, ...).
It's your own solution: you have to write it and maintain it.

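As for the second bit: git describe --always HEAD identifies the current commit (it falls back to the abbreviated commit hash when no tag can be used to describe it).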
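
If you want to grab that string from inside the simulation itself, one option is to run the command at start-up and capture its output. The following is only a sketch, assuming a POSIX system (popen/pclose), that git is on the PATH, and that the program is started from inside the repository; the function names capture_output and current_git_commit are invented for this example.

```cpp
#include <stdio.h>   // popen, pclose (POSIX), fgets
#include <stdexcept>
#include <string>

// Run a shell command and return its standard output with trailing
// whitespace removed. Throws if the command cannot be started or
// exits with a non-zero status.
std::string capture_output(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("could not start: " + command);
    }
    std::string result;
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
        result += buffer;
    }
    if (pclose(pipe) != 0) {
        throw std::runtime_error("command failed: " + command);
    }
    // Strip the trailing newline (and any other trailing whitespace).
    while (!result.empty() &&
           (result.back() == '\n' || result.back() == '\r' ||
            result.back() == ' ')) {
        result.pop_back();
    }
    return result;
}

// Hypothetical helper: identify the commit of the current working directory.
std::string current_git_commit() {
    return capture_output("git describe --always HEAD");
}
```

Call current_git_commit() once when the run starts and store the result in the logbook next to the parameters. Note that this records the commit of the directory the program runs in, not necessarily the commit the binary was built from; if those can differ, pass the string in at build time instead (for example as a preprocessor definition from your build system). If you also want to know whether there were uncommitted changes, git describe --always --dirty (without the explicit HEAD) appends a -dirty suffix in that case.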