Database Indexes

I need to develop a "naive" implementation of database indexes for use in a distributed environment. I know almost nothing about the subject, and I'm a bit pressured by time.

I would love to hear some opinions, examples and algorithms on the subject. I'd like to be able to have a mental representation of what I need to implement.

EDIT: I'm referring to clustered indexes

Solution

There are basically two main types of indexes :

Clustered (ie the data is physically organized, and you re-sort it at each insertion if needed)

Typical use-case : the physical organization is usually the same as the insertion order, so the re-sorting overhead is not a problem. This is for example the case with sequential UID's (the so called "IDENTITY" fields in a database context)

An obvious drawback of clustered indexing is that you can have only one such index on your data.

Naive implementation if the insertion order is exactly the sorting order : use a List.
1. Insertion is O(1) : you just append the new data of the list
2. Access is O(1) if the ID's are sequential (ie array indexes exactly matches UID), O(log) otherwise
Unclustered (ie you keep pointers on the data, as in a Hashtable)

Typical use-case : The clustering is not appropriate because it would induce to big an insertion overhead.

Depending on your needs, you'll probably end up using on those two datastructures

An extensive repository of Index-related information is available here