I have an SE11 table with a column of type STRING
and I'm wondering how this is stored on the underlying DB system (SAP Hana in this case).
I read that only a reference to the LOB is actually saved in a column typed as STRING
and the string itself is saved outside the table. Is this true and does it work the same on Hana? I tried to RTFM but I couldn't find that info.
Is it generally advisable to use CHAR
with a specific length whenever possible?
Disclaimer. Although I work for SAP SE, I am not related to SAP HANA's teams or code. The info below was collected by trial and error in an SAP HANA 2 SP02 (2.00.024.00.1519806017). It is neither dependable, nor legally binding, and may be subject to future change without notice.
Okay, now that's out of the way, let's look at some things:
SAP HANA has a Column Store (= the fancy new thing) and a Row Store (= as known from other relational databases). The two are very different. You should therefore be aware of which store you are dealing with when optimizing your structures.
The ABAP DDIC turns STRING
columns in transparent tables into NCLOB
columns, and CHAR
to NVARCHAR
.
The ABAP DDIC is pretty peculiar with strings: They cannot be used as keys because they exceed the maximum key length of 255 characters. They also prevent the application server from buffering the table, increasing response time for repeated queries. That alone is usually a reason to refrain from STRING
and use CHAR
instead. In summary, one could say it doesn't make sense to add more than one STRING
column to a transparent table.
SAP HANA indeed stores the LOB outside of the table, and the table holds only a reference. The contents are treated similar to files. Their CONTAINER_ID
can be gleaned from the system view "SYS"."M_TABLE_LOB_FILES"
. The related system view "SYS"."M_TABLE_LOB_STATISTICS"
gives you more details on the consumed space.
A (rather) recent blog on hybrid LOBs reveals another interesting fact: "As SAP HANA will not compress LOB columns regardless of whether it resides in disk or in-memory [...]". Meaning the column will consume exactly as much disk space as the content you put in. Which is quite different from the rest of SAP HANA's column store content, which is submitted heavily to compression to optimize storage and access. The resulting conclusion is also interesting to note: "[...] it is essential that any possible compression algorithm logic (e.g. gzip) are applied at the application layer on writing/reading from the database".
Generally - and that's true for all database management systems I know - it's a good idea to prefer variable character types, as they give the system the freedom to optimize the space it actually reserves. As SAP's guidelines discourage using VARCHAR
(= non-Unicode) for anything other than pure ASCII, the sensible default for SAP HANA should therefore be NVARCHAR
(= Unicode-capable)'.