Saving a rigid-body simulation as an animation

I work in autonomous robotics. I will often simulate a robot without visualization, export position and rotation data to a file at ~30 fps, and then play that file back at a later time. Currently, I save the animation data in a custom-format JSON file and animate using three.js.

I am wondering if there is a better way to export this data?

I am not well versed in animation, but I suspect that I could be exporting to something like COLLADA or glTF and gain the benefits of using a format that many systems are already set up to import.

I have a few questions (some specific and some general):

How do animations usually get exported in these formats? It seems that most of them have something to do with skeletons or morphing, but neither of these concepts appears to apply to my case. (Could I get a pointer to an overview of general animation concepts?)
I don't really need key-framing. Is it reasonable to have key-frames at 30 to 60 fps without any need for interpolation?
Do any standard animation formats save data in a format that doesn't assume some form of interpolation?
Am I missing something? I'm sure my lack of knowledge in the area has hidden something that is obvious to animators.

Solution

You specifically mentioned autonomous robots, and position and rotation in particular. So I assume that the robot itself is the level of granularity that is supposed to be stored here. (This is just to differentiate it from an articulated robot, which is basically a manipulator ("arm") with several rotational or translational joints that may have different angles.)

For this case, here is a very short, high-level description of how this could be stored in glTF^(*):

You would store the robot (or each robot) as one node of a glTF asset. Each of the nodes can contain a translation and a rotation property (given as a 3D vector and a quaternion). These nodes would then simply describe the position and orientation of your robots. You could imagine the robots being "attached" to these nodes. (In fact, you can attach a mesh to these nodes in glTF, which could then be the visual representation of the robot.)
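As a rough illustration of such a node, here is a minimal Python sketch that builds the JSON object for one robot. The helper name `make_robot_node`, the node name `robot_0` and the numeric values are made up for this example; only the property names `translation`, `rotation` and `mesh` come from glTF.

```python
import json

def make_robot_node(name, translation, rotation, mesh_index=None):
    """Build a glTF node as a plain dict (hypothetical helper).

    translation is (x, y, z); rotation is a unit quaternion (x, y, z, w).
    """
    node = {
        "name": name,
        "translation": list(translation),
        "rotation": list(rotation),
    }
    if mesh_index is not None:
        # Optional: index of the mesh that visually represents the robot.
        node["mesh"] = mesh_index
    return node

# One robot at an example position, with the identity rotation.
node = make_robot_node("robot_0", (1.2, 3.4, 4.5), (0.0, 0.0, 0.0, 1.0))
print(json.dumps(node, indent=2))
```

A complete asset would list such nodes in its top-level `nodes` array; that part is omitted here.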

The animation data itself would then be a description of how these properties (translation and rotation) change over time. You can imagine this information as a table that associates a translation and a rotation with each time stamp:

time (s)        0.1   0.2  ...  1.0

translation x   1.2   1.3  ...  2.3
translation y   3.4   3.4  ...  4.3
translation z   4.5   4.6  ...  4.9

rotation x      0.12  0.13 ...  0.42
rotation y      0.32  0.43 ...  0.53
rotation z      0.14  0.13 ...  0.34
rotation w      0.53  0.46 ...  0.45

This information is then stored in binary form and provided by so-called accessor objects.
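To make the "binary form" a bit more concrete, the following Python sketch packs a tiny table of this kind into one byte blob and describes it with buffer views and accessors. The sample values and the helper `pack_rows` are assumptions for illustration, and the layout (times, then translations, then rotations) is just one possible choice.

```python
import math
import struct

# Illustrative samples (made-up values): three time stamps for one robot.
times = [0.1, 0.2, 0.3]
translations = [(1.2, 3.4, 4.5), (1.3, 3.4, 4.6), (1.4, 3.5, 4.7)]
# Unit quaternions (x, y, z, w) for a growing rotation about the z axis.
rotations = [(0.0, 0.0, math.sin(a / 2), math.cos(a / 2)) for a in (0.0, 0.1, 0.2)]

def pack_rows(rows):
    """Flatten rows of numbers and pack them as little-endian 32-bit floats."""
    flat = [component for row in rows for component in row]
    return struct.pack("<%df" % len(flat), *flat)

time_bytes = pack_rows([(t,) for t in times])
translation_bytes = pack_rows(translations)
rotation_bytes = pack_rows(rotations)
blob = time_bytes + translation_bytes + rotation_bytes

# Read the times back so that min/max match the stored float32 values.
stored_times = struct.unpack("<%df" % len(times), time_bytes)

FLOAT = 5126  # glTF componentType code for 32-bit float

# One buffer view per column group, laid out back to back in the blob.
buffer_views = []
offset = 0
for chunk in (time_bytes, translation_bytes, rotation_bytes):
    buffer_views.append({"buffer": 0, "byteOffset": offset, "byteLength": len(chunk)})
    offset += len(chunk)

accessors = [
    # Time stamps: glTF expects min/max on the accessor used as animation input.
    {"bufferView": 0, "componentType": FLOAT, "count": len(times), "type": "SCALAR",
     "min": [min(stored_times)], "max": [max(stored_times)]},
    {"bufferView": 1, "componentType": FLOAT, "count": len(translations), "type": "VEC3"},
    {"bufferView": 2, "componentType": FLOAT, "count": len(rotations), "type": "VEC4"},
]
```

The `blob` would be written to a `.bin` file (or embedded), and `buffer_views` and `accessors` would go into the JSON part of the asset.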

The animation of a glTF asset then basically establishes the connection between this binary animation data and the node properties that it affects: each animation refers to such a "data table", and to the node whose properties will be filled with the new translation and rotation values as time progresses.
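In glTF terms, this connection is expressed with samplers (which pair a time accessor with a value accessor) and channels (which point a sampler at a node property). A minimal sketch in Python, where the helper `make_node_animation` and the accessor and node indices are hypothetical and would have to match your actual asset:

```python
def make_node_animation(node_index, time_accessor, translation_accessor,
                        rotation_accessor, interpolation="LINEAR"):
    """Build a glTF animation dict that drives one node (hypothetical helper)."""
    return {
        "samplers": [
            # input = accessor with time stamps, output = accessor with values
            {"input": time_accessor, "output": translation_accessor,
             "interpolation": interpolation},
            {"input": time_accessor, "output": rotation_accessor,
             "interpolation": interpolation},
        ],
        "channels": [
            # Each channel routes one sampler to one property of the node.
            {"sampler": 0, "target": {"node": node_index, "path": "translation"}},
            {"sampler": 1, "target": {"node": node_index, "path": "rotation"}},
        ],
    }

# Assumed indices: node 0 is the robot; accessors 0, 1, 2 hold the
# time stamps, translations and rotations respectively.
animation = make_node_animation(0, 0, 1, 2)
```

Both samplers share the same time accessor, which fits simulation output where position and rotation are logged at the same time stamps.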

Regarding interpolation:

In your case, where the output is sampled at a high rate from the simulation, basically each frame is a "key frame", and no explicit information about key frames or the interpolation scheme will have to be stored. Just declaring that the animation interpolation should be of the type LINEAR or STEP should be sufficient for this use case.

(The option to declare it as a LINEAR interpolation is mainly relevant for the playback. Imagine you stop your playback exactly after 0.15 seconds: should it then show the state that the robot had at time stamp 0.1, the state at time stamp 0.2, or one that is interpolated linearly? This, however, would mainly apply to a standard viewer, and not necessarily to a custom playback.)
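The difference between the two modes at a time like 0.15 seconds can be sketched as follows. This is an illustrative Python function for translations only, not the code of any particular viewer; the name `sample` and the values are made up. For rotations, glTF specifies spherical linear interpolation rather than the component-wise blend used here.

```python
from bisect import bisect_right

def sample(times, values, t, interpolation="LINEAR"):
    """Return the value at time t from sorted key times and matching values."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t) - 1  # index of the last key at or before t
    if interpolation == "STEP":
        return values[i]  # hold the previous key until the next one
    # LINEAR: blend between the surrounding keys.
    u = (t - times[i]) / (times[i + 1] - times[i])
    return tuple(a + u * (b - a) for a, b in zip(values[i], values[i + 1]))

times = [0.1, 0.2]
translations = [(1.2, 3.4, 4.5), (1.3, 3.4, 4.6)]

held = sample(times, translations, 0.15, "STEP")       # state from time stamp 0.1
blended = sample(times, translations, 0.15, "LINEAR")  # roughly halfway between
```

With samples at 30 to 60 fps, the two results differ very little, which is why either declaration is usually sufficient.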

^(*) A side note: On a conceptual level, the way the information is represented in glTF and COLLADA is similar. Roughly speaking, COLLADA is an interchange format for authoring applications, and glTF is a transmission format that can be transferred and rendered efficiently. So although the answers so far refer to glTF, you should consider COLLADA as well, depending on your priorities, your use cases, and how the "playback" that you mentioned is supposed to be implemented.

Disclaimer: I'm a glTF contributor as well. I also created the glTF tutorial section showing a simple animation and the one that explains some concepts of animations in glTF. You might find them useful, but they obviously build upon some of the concepts that are explained in the earlier sections.