MarkLogic - How to find the size of a database, the size of an index, and the total number of indexes


We are using MarkLogic 9.0.8.2

We have set up a MarkLogic cluster, ingested around 18M XML documents, and created a few indexes such as Fields and Path Range indexes.

Now, while setting up another environment with the same configuration, indexes, and number of records, I am not able to understand why the total size on the database status page differs from the previous environment.

So I started comparing the database status pages of both clusters, where I can see the size per forest and replica forest.

In this case, I would like to know the size of each:

  • Database
  • Index

I would also like to know the total number of indexes in a given database (without expanding each one through the Admin interface).

An option within the Admin interface or through XQuery will do.


Solution

  • MarkLogic does not break down index sizes separately from the database size. One reason is that the data is stored together with the Universal Index.

    You could approximate the size of the other indexes by creating them one at a time and checking the database size before and after, once the reindexer has finished and the deleted fragments have been merged out. We usually don't find a lot of benefit in trying to determine the exact index sizes, since the benefits they provide typically outweigh the cost of storage.

    It's hard to say exactly why there is a size discrepancy. One common cause is a difference in the number of deleted fragments in each database. Deleted fragments are pieces of data that have been marked for deletion (usually due to an update, delete, or other change). They continue to consume database space until they are merged out. Merging happens automatically by default, or it can be started manually at the forest or database level.
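
    To compare deleted fragments between the two environments, something like the following could be run in Query Console on each cluster. This is only a sketch: "Documents" is a placeholder database name, and it assumes the counts returned by xdmp:forest-counts include active-fragment-count and deleted-fragment-count elements per stand.

    ```xquery
    xquery version "1.0-ml";
    declare namespace fs = "http://marklogic.com/xdmp/status/forest";

    (: "Documents" is a placeholder; substitute your own database name :)
    let $db := xdmp:database("Documents")
    (: collect the counts for every forest attached to the database :)
    let $counts :=
      for $forest-id in xdmp:database-forests($db)
      return xdmp:forest-counts($forest-id)
    return (
      fn:concat("active fragments: ",
        fn:sum($counts//fs:active-fragment-count)),
      fn:concat("deleted fragments: ",
        fn:sum($counts//fs:deleted-fragment-count))
    )
    ```

    If one environment shows far more deleted fragments than the other, that would likely explain much of the size difference until the next merge completes.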

    The database size and the configured indexes can be determined through the Admin UI, Query Console (QConsole), or the MarkLogic REST Management API (RMA) endpoints. QConsole supports a number of languages, but server-side JavaScript and XQuery are the most common. RMA can return results in XML or JSON.

    Database Size:
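
    One possible way to get this from Query Console is to sum the on-disk stand sizes reported by xdmp:forest-status for each forest. This is a minimal sketch, not an official recipe: "Documents" is a placeholder database name, and the disk-size values are, as far as we know, reported in megabytes.

    ```xquery
    xquery version "1.0-ml";
    declare namespace fs = "http://marklogic.com/xdmp/status/forest";

    (: "Documents" is a placeholder; substitute your own database name :)
    let $db := xdmp:database("Documents")
    (: fn:true() asks for replica forests to be included as well :)
    for $forest-id in xdmp:database-forests($db, fn:true())
    let $status := xdmp:forest-status($forest-id)
    return
      fn:concat(
        $status/fs:forest-name, ": ",
        (: add up the on-disk size of every stand in this forest :)
        fn:sum($status/fs:stands/fs:stand/fs:disk-size),
        " MB")
    ```

    Running this on both clusters gives a per-forest listing that can be compared line by line, which is usually easier than reading the two status pages side by side.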

    Configured Indexes:
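
    The Admin API can list the index definitions without clicking through each page. The sketch below counts a few index types for one database; "Documents" is again a placeholder name, and other index types (geospatial, for example) would need their own admin:database-get-* calls.

    ```xquery
    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
      at "/MarkLogic/admin.xqy";

    let $config := admin:get-configuration()
    (: "Documents" is a placeholder; substitute your own database name :)
    let $db := xdmp:database("Documents")
    return (
      fn:concat("fields: ",
        fn:count(admin:database-get-fields($config, $db))),
      fn:concat("element range indexes: ",
        fn:count(admin:database-get-range-element-indexes($config, $db))),
      fn:concat("attribute range indexes: ",
        fn:count(admin:database-get-range-element-attribute-indexes($config, $db))),
      fn:concat("path range indexes: ",
        fn:count(admin:database-get-range-path-indexes($config, $db))),
      fn:concat("field range indexes: ",
        fn:count(admin:database-get-range-field-indexes($config, $db)))
    )
    ```

    Note that this counts index definitions as configured; dropping the fn:count wrapper returns the definitions themselves, which can be diffed between the two environments.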