gremlin amazon-neptune

AWS Neptune set cardinality single-valued?


I’m having trouble understanding the cardinality specifications for Gremlin data load formatted columns as outlined here:

https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-format-gremlin.html

Specifically, this statement:

“The cardinality value can be either single or set. The default is assumed to be set, meaning that the column can accept multiple values.”

Which seems to contradict the following cardinality specification:

name:type(set) – the cardinality is set, which is the same as the default, and the content is single-valued.

How can a set cardinality column (which accepts multiple values) be single-valued? There is a multi-valued set cardinality specification as shown below, which aligns with my understanding of what a “set” cardinality is, but a single-valued set just doesn’t seem logical:

name:type(set)[] – the cardinality is set, and the content is multi-valued.


Solution

  • The same vertex and property may appear in multiple rows. If the cardinality for that column is single, the bulk loader will throw an error the second time the property appears (unless the option to allow single-cardinality values to be replaced was specified when the loader was started). If the cardinality is set, each row's value is added to the values already in the set for that property, even when every row supplies only one value.

    The square-bracket notation means that the CSV column holds multiple values for that property on each row where it appears.

    Without the square brackets, only one value is expected per row for that column, but the cardinality is still set unless single is explicitly specified in the type, for example String(single).

    I hope this helps clarify.

    UPDATED:

    Adding an example of a CSV file that loads a set containing multiple integers.

    ~id,~label,list:Int(set)[],flag:String
    T001,test,1;2;3;4;5,hello
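
To make the row-by-row behaviour described above concrete, here is a minimal Python sketch that imitates it in memory. It is not the Neptune bulk loader and does not call any Neptune API: the `parse_header` and `load` functions are hypothetical helpers written for this illustration, values are kept as plain strings (no type conversion), and the option to replace single-cardinality values is not modelled. The sample data reuses the column names from the CSV above but spreads the values over two rows for the same vertex.

```python
import csv
import io


def parse_header(column):
    """Split a header such as 'list:Int(set)[]' into (name, cardinality, is_array).

    Cardinality defaults to 'set' when none is given, as described in the answer.
    """
    name, _, spec = column.partition(":")
    is_array = spec.endswith("[]")
    if is_array:
        spec = spec[:-2]
    cardinality = "set"
    if spec.endswith(")") and "(" in spec:
        cardinality = spec[spec.index("(") + 1:-1]
    return name, cardinality, is_array


def load(csv_text):
    """Toy model of the loading rules; returns {vertex_id: {property: set_of_values}}."""
    vertices = {}
    reader = csv.reader(io.StringIO(csv_text))
    columns = [parse_header(c) for c in next(reader)]
    for row in reader:
        if not row:
            continue  # skip blank lines
        props = vertices.setdefault(row[0], {})  # assumes ~id is the first column
        for (name, cardinality, is_array), cell in zip(columns, row):
            if name.startswith("~") or cell == "":
                continue  # system columns and empty cells carry no property value
            if cardinality == "single":
                # A second value for a single-cardinality property is an error.
                if name in props:
                    raise ValueError(f"{row[0]}: '{name}' already has a value")
                props[name] = {cell}
            else:
                # Set cardinality: [] means several values in one cell,
                # otherwise one value per row; either way they accumulate.
                values = cell.split(";") if is_array else [cell]
                props.setdefault(name, set()).update(values)
    return vertices


SET_CSV = "\n".join([
    "~id,~label,list:Int(set)[],flag:String",
    "T001,test,1;2;3,hello",
    "T001,test,4;5,world",
])

SINGLE_CSV = "\n".join([
    "~id,~label,flag:String(single)",
    "T001,test,hello",
    "T001,test,world",
])

merged = load(SET_CSV)
print(sorted(merged["T001"]["list"]))  # ['1', '2', '3', '4', '5']
print(sorted(merged["T001"]["flag"]))  # ['hello', 'world']

try:
    load(SINGLE_CSV)
except ValueError as err:
    print("single cardinality rejected:", err)
```

In this model the `flag:String` column is single-valued on every row, yet the vertex ends up with two values for `flag` because the column's cardinality is set. That is the sense in which a set column can be "single-valued": the phrase describes the content of one cell, not the property as a whole.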