What is the principle of the differential buffer?

I don't really understand the principle of differential buffers. The main reason for them is to avoid inserting new tuples into the main store, because that would require reorganizing the dictionary and attribute vector. But if we insert a new value into the differential buffer, we still need to reorganize its dictionary and attribute vector. In what way does the differential buffer improve performance?

Solution

You are right: the differential buffer avoids reorganizing the dictionary on inserts/deletes.

With the differential buffer you don't change any compressed data in the main store. Instead you just set a valid flag to 0 (false) to mark outdated records. In the differential buffer itself the data is compressed with an unsorted dictionary. A new value is simply appended to the end of that dictionary, so existing value IDs stay the same and neither the dictionary nor the attribute vector has to be reorganized. The disadvantage of an unsorted dictionary is that range selections are more expensive, because the value IDs no longer follow the sort order of the values.
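To make the difference concrete, here is a minimal Python sketch of a single column, assuming plain lists for the dictionary and attribute vector. The function names (`insert_sorted`, `insert_unsorted`, `range_select_unsorted`) are made up for illustration and are not taken from any real system. A real differential buffer would probably also keep an index on the unsorted dictionary instead of the linear lookup used here.

```python
import bisect

def insert_sorted(dictionary, attribute_vector, value):
    """Main-store style: the dictionary is sorted, value ID = position."""
    pos = bisect.bisect_left(dictionary, value)
    if pos == len(dictionary) or dictionary[pos] != value:
        dictionary.insert(pos, value)
        # All existing value IDs >= pos shift by one, so the whole
        # attribute vector has to be rewritten.
        attribute_vector[:] = [vid + 1 if vid >= pos else vid
                               for vid in attribute_vector]
    attribute_vector.append(pos)

def insert_unsorted(dictionary, attribute_vector, value):
    """Differential-buffer style: new values are appended at the end."""
    try:
        vid = dictionary.index(value)   # linear lookup, kept simple on purpose
    except ValueError:
        vid = len(dictionary)
        dictionary.append(value)        # existing value IDs are untouched
    attribute_vector.append(vid)

def range_select_unsorted(dictionary, attribute_vector, low, high):
    """Range query on an unsorted dictionary: every entry must be checked."""
    matching = {vid for vid, v in enumerate(dictionary) if low <= v <= high}
    return [row for row, vid in enumerate(attribute_vector) if vid in matching]

# Sorted dictionary: inserting "Bob" renumbers "Chris" and "Dave".
main_dict, main_av = ["Anna", "Chris", "Dave"], [0, 2, 1, 0]
insert_sorted(main_dict, main_av, "Bob")
print(main_dict, main_av)    # ['Anna', 'Bob', 'Chris', 'Dave'] [0, 3, 2, 0, 1]

# Unsorted dictionary: every insert is a pure append.
delta_dict, delta_av = [], []
for name in ["Dave", "Anna", "Bob", "Anna"]:
    insert_unsorted(delta_dict, delta_av, name)
print(delta_dict, delta_av)  # ['Dave', 'Anna', 'Bob'] [0, 1, 2, 1]
print(range_select_unsorted(delta_dict, delta_av, "Anna", "Bob"))  # [1, 2, 3]
```

In the sorted case, one new value changes the existing entries of the attribute vector. In the unsorted case nothing that was already stored is touched, but the range query has to compare every dictionary entry instead of looking up two boundary value IDs.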

The buffer has a maximum size, so you have to merge it with the main store periodically.
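A rough sketch of what such a merge could look like for one column, again in Python with plain lists. The names `MAX_DELTA_ROWS`, `needs_merge` and `merge_column` are hypothetical, and the threshold value is arbitrary. The sketch also assumes that rows whose valid flag is 0 are dropped during the merge, which is a simplification: a system that keeps history would retain them.

```python
MAX_DELTA_ROWS = 4  # hypothetical size limit for the differential buffer

def needs_merge(delta_av):
    """Trigger a merge once the buffer reaches its maximum size."""
    return len(delta_av) >= MAX_DELTA_ROWS

def merge_column(main_dict, main_av, main_valid,
                 delta_dict, delta_av, delta_valid):
    """Build a new sorted dictionary and attribute vector from both stores."""
    # Decode the rows that are still valid (assumption: invalid rows are dropped).
    values = [main_dict[vid] for vid, ok in zip(main_av, main_valid) if ok]
    values += [delta_dict[vid] for vid, ok in zip(delta_av, delta_valid) if ok]
    # One reorganization for the whole batch instead of one per insert.
    new_dict = sorted(set(values))
    value_id = {v: i for i, v in enumerate(new_dict)}
    new_av = [value_id[v] for v in values]
    return new_dict, new_av

main_dict, main_av, main_valid = ["Anna", "Chris"], [0, 1, 0], [1, 0, 1]
delta_dict, delta_av, delta_valid = ["Dave", "Bob"], [0, 1], [1, 1]
new_dict, new_av = merge_column(main_dict, main_av, main_valid,
                                delta_dict, delta_av, delta_valid)
print(new_dict, new_av)  # ['Anna', 'Bob', 'Dave'] [0, 0, 2, 1]
```

The expensive reorganization still happens, but only once per merge for all buffered inserts together, not once per inserted tuple.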

For further information check https://www.fbi.h-da.de/fileadmin/personal/u.stoerl/BigData-SoSe16/BigData-SoSe16-4-InMemory.pdf pages 27 to 39.