How to apply Data-Oriented Desing when an structure contains various number of elements in inner vector?

I would like to apply Data-Oriented Design (based on e.g. this article) to my simple physics engine. And I'm focused on optimizing the collision testing as it is the most expensive part of it.

I've organized the bounding spheres that may collide with player into single vector:

struct Sphere{ //I don't split sphere into parts, 
    //as I usually access both position and radius in my calculations
    Point3D position;
    float radius;
};
std::vector<BoudingSphere> spheres;

and I test collisions with them inside single function/method. Everything looks clear to me to that point.

The problem is, I also have some more general structures like:

struct Polygon{ //it may e.g. represents the area or be used for more precise tests      
    std::vector<Point2D> points;
};

I guess it won't be a good practise to just create std::vector<Polygon> the same way, as nested vector (points) will take a lot of place in memory (reserving it).

On the other hand, I cannot assume that there are always 2,3,4 or 10 points (it differs a lot, with the maximum of about 20, but it's usually much less).

And I do not want to switch from Polygon general structure to e.g. series of triangles (as it is faster then separated triangles in many calculations).

What should I do then? I want to go with the spirit of Data-Oriented Design and use the memory/cache efficiently with my Polygon.

Do I have to get rid of the inner vector (points)? How if so?

Solution

There can't be a definitive answer to this question as it would require knowledge about the way you access the data in terms of when is it initialized, can it be changed after the initialization stage and many others. However, if your data is relatively stable and you are accessing your polygons in a consistent manner just iterating over all polygons or polygons belonging to one particular object, one approach may be to put the points of your polygons into a separate vector and just have the polygons store the beginning and the end indices to that array.

This way, there are a couple of things that needs to be accessed during traversal. First, the indices stored in the polygons. Second, the points themselves. Both of these accesses are likely to be cache-friendly if the polygons are also laid out in a vector. Similar approach can be applied to polygon sets - just store the polygons in a vector and have a (begin, end) pair in your game objects.