CPU bottleneck; simple way to handle rendering of a 3D scene with many non static objects

I'm currently working on a small 3D game (rendering) engine. However I'm facing to an issue in my renderer. Here is how it currently works :

On one hand, I have a scene graph/ECS used to handle all SceneObjects I have in my game. Each scene object can have one parent and 0 to N children. I can attach components to each SceneObject. Components can be various things, such as renderable, lights or cameras.

On the other hand, I have my renderer which is completely agnostic off my scene hierarchy. In fact, each time I add a "Renderable" component on a SceneObject, my renderer receives a notification and create it's own representation of a renderable (Mesh, BoundingBox etc.). My renderer doesn't care about parent or children of a renderable.

The issue I'm facing occurs when I want to retrieve visible objects before putting them in a RenderQueue. On renderer side, all renderables are store in a flat list, and to perform culling operations I need to loop through this list. Unfortunately, this list contains all active objects of my game and I can be very CPU time consuming.

I would like to find a way to implement an efficient some of partitioning system on renderer side in which I could easily update renderables data (position, bounding box).

Thanks for your help.

Solution

This is more a question about algorithms:

They are many space partitioning algorithms, such as KD-Tree, Oct-tree, VP-tree and many others.

Selecting which algorithm suits you best depends on the object you want to store (are they convex?), their size, how often you need to compute intersections with them, are they static, dynamic in position, shape ?

You'll need to figure this out, and then select the best algorithm for them. Nothing prevents you from using multiple different tree in your engine, in order to deal with different object/object behaviors.

Usually, a rendering engine will use at least 2 trees, one for static information (that would generally be precomputed for optimal storage and ray/intersection performance), and one for dynamic objects.

Also, depending if your object are convex or concave, and the precision for ray intersection, you might want to break your concave objects into convex parts (and intersect the bounding box of those parts only), look at V-HACD as a preprocessing step for your concave objects. This allows to limit the number of ray/object intersection computation to large box instead of many surfaces.