Search code examples
phpmysqlarraysdatabase-designtree-structure

Creating an efficient friendlist using PHP


I would like to build a website that has some elements of a social network.

So I have been trying to think of an efficient way to store a friend list (somewhat like Facebook).

And after searching a bit the only suggestion I have come across is making a "table" with two "ids" indicating a friendship.

That might work in small websites but it doesn't seem efficient one bit.

I have a background in Java but I am not proficient enough with PHP.

An idea has crossed my mind which I think could work pretty well, problem is I am not sure how to implement it.

the idea is to have all the "id"s of your friends saved in a tree data structure,each node in that tree resembles one digit from the friend's id.

first starting with 1 node, and then adding more nodes as the user adds friends. (A bit like Lempel–Ziv).

every node will be able to point to 11 other nodes, 0 to 9 and X.

"X" marks the end of the Id.

for example see this tree:

An Example

In this tree the user has 4 friends with the following "id"s:

  • 0
  • 143
  • 1436
  • 15

Update: as it might have been unclear before, the idea is that every user will have a tree in a form of multidimensional array in which the existence of the pointers themselves indicate the friend's "id".

If every user had such a multidimensional array, searching if id "y" is a friend of mine, deleting id "y" from my friend list or adding id "y" to my friend list would all require constant time O(1) without being dependent on the number of users the website might have, only draw back is, taking such a huge array, serializing it and pushing it into each row of the table just doesn't seem right.

-Is this even possible to implement?

-Would using serializing to insert that tree into a table be practical?

-Is there any better way of doing this?

The benefits upon which I chose this is that even with a really large number of ids (millions or billions) the search,add,delete time is linear (depends of the number of digits).

I'd greatly appreciate any help with implementing this or any suggestions for alternative ways to improve or change this method.


Solution

  • I would strongly advise against this.

    • Storage savings are not significant, and may (probably?) be worse. In a real dataset, the actual space-savings afforded to you with this approach are minimal. Computing the average savings is a very difficult problem, but use some real numbers and try a few samples with random IDs. If you have a million users, consider a user with 15 friends. How much data do you save with this approch? You may actually use more space, since tree adjacency models can require significant data.

    • "Rendering" a list of users requires CPU investment.

    • Inserts are non-deterministic and non-trivial. When you add a new user to an existing tree, you will have a variety of methods of inserting them. Assuming you don't choose arbitrarily, it is difficult to compute which approach is the best (and would only be based on heuristics).

    This are the big ones that came to my mind. But generally, I think you are over-thinking this.