algorithm web-applications data-structures sparse-matrix

Sparse (Pseudo) Infinite Grid Data Structure for Web Game

I'm considering trying to make a game that takes place on an essentially infinite grid.

The grid is very sparse. Certain small regions of relatively high density. Relatively few isolated nonempty cells.
The amount of the grid in use is too large to implement naively but probably smallish by "big data" standards (I'm not trying to map the Internet or anything like that)
This needs to be easy to persist.

Here are the operations I may want to perform (reasonably efficiently) on this grid:

Ask for some small rectangular region of cells and all their contents (a player's current neighborhood)
Set individual cells or blit small regions (the player is making a move)
Ask for the rough shape or outline/silhouette of some larger rectangular regions (a world map or region preview)
Find some regions with approximately a given density (player spawning location)
Approximate shortest path through gaps of at most some small constant empty spaces per hop (it's OK to be a bad approximation often, but not OK to keep heading the wrong direction searching)
Approximate convex hull for a region

Here's the catch: I want to do this in a web app. That is, I would prefer to use existing data storage (perhaps in the form of a relational database) and relatively little external dependency (preferably avoiding the need for a persistent process).

Guys, what advice can you give me on actually implementing this? How would you do this if the web-app restrictions weren't in place? How would you modify that if they were?

Thanks a lot, everyone!

Solution

I think you can do everything using quadtrees, as others have suggested, and maybe a few additional data structures. Here's a bit more detail:

Asking for cell contents, setting cell contents: these are the basic quadtree operations.
Rough shape/outline: Given a rectangle, go down sufficiently many steps within the quadtree that most cells are empty, and make the nonempty subcells at that level black, the others white.
Region with approximately given density: if the density you're looking for is high, then I would maintain a separate index of all objects in your map. Take a random object and check the density around that object in the quadtree. Most objects will be near high density areas, simply because high-density areas have many objects. If the density near the object you picked is not the one you were looking for, pick another one.

If you're looking for low-density, then just pick random locations on the map - given that it's a sparse map, that should typically give you low density spots. Again, if it doesn't work right try again.
Approximate shortest path: if this is a not-too-frequent operation, then create a rough graph of the area "between" the starting point A and end point B, for some suitable definition of between (maybe the square containing the circle with the midpoint of AB as center and 1.5*AB as diameter, except if that diameter is less than a certain minimum, in which case... experiment). Make the same type of grid that you would use for the rough shape / outline, then create (say) a Delaunay triangulation of the black points. Do a shortest path on this graph, then overlay that on the actual map and refine the path to one that makes sense given the actual map. You may have to redo this at a few different levels of refinement - start with a very rough graph, then "zoom in" taking two points that you got from the higher level as start and end point, and iterate.

If you need to do this very frequently, you'll want to maintain this type of graph for the entire map instead of reconstructing it every time. This could be expensive, though.
Approx convex hull: again start from something like the rough shape, then take the convex hull of the black points in that.

I'm not sure if this would be easy to put into a relational database; a file-based storage could work but it would be impractical to have a write operation be concurrent with anything else, which you would probably want if you want to allow this to grow to a reasonable number of players (per world / map, if there are multiple worlds / maps). I think in that case you are probably best off keeping a separate process alive... and even then making this properly respect multithreading is going to be a headache.