I have lots of users (150-200 million). Each user has N (30-100) attributes. An attribute can be of type integer, text, or timestamp. The attributes are not known in advance, so I want to add them dynamically, on the fly.
Solution 1 - Add new column by altering the table
CREATE TABLE USER_PROFILE(
UID uuid PRIMARY KEY,
LAST_UPDATE_DATE TIMESTAMP,
CREATION_DATE TIMESTAMP
);
For each new attribute:
ALTER TABLE USER_PROFILE ADD AGE INT;
INSERT INTO USER_PROFILE (UID, LAST_UPDATE_DATE, CREATION_DATE, AGE) VALUES ('01f63e8b-db53-44ef-924e-7a3ccfaeec28', '2021-01-12 07:34:19.121', '2021-01-12 07:34:19.121', 27);
Solution 2 - Fixed schema:
CREATE TABLE USER_PROFILE(
UID uuid,
ATTRIBUTE_NAME TEXT,
ATTRIBUTE_VALUE_TEXT TEXT,
ATTRIBUTE_VALUE_TIMESTAMP TIMESTAMP,
ATTRIBUTE_VALUE_INT INT,
LAST_UPDATE_DATE TIMESTAMP,
CREATION_DATE TIMESTAMP,
PRIMARY KEY (UID, ATTRIBUTE_NAME)
);
For each new attribute:
INSERT INTO USER_PROFILE (UID, ATTRIBUTE_NAME, ATTRIBUTE_VALUE_INT, LAST_UPDATE_DATE, CREATION_DATE) VALUES ('01f63e8b-db53-44ef-924e-7a3ccfaeec28', 'age', 27, '2021-01-12 07:34:19.121', '2021-01-12 07:34:19.121');
Which is the best solution in terms of performance?
I would personally go with the 2nd solution - having a column for each data type that is used, and using the attribute name as the last component of the primary key (see examples in my previous answers on that topic:
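As a minimal sketch of the reads this layout allows, assuming the USER_PROFILE table from Solution 2 and a Cassandra-style store in which the first primary key component (UID) is the partition key and ATTRIBUTE_NAME orders rows within it. The `?` placeholders are bind markers for a prepared statement; they are an illustration choice, not something taken from the question.

```sql
-- Fetch every attribute of one user.
-- Assumption: UID is the partition key, so this reads a single partition.
SELECT ATTRIBUTE_NAME, ATTRIBUTE_VALUE_TEXT, ATTRIBUTE_VALUE_TIMESTAMP, ATTRIBUTE_VALUE_INT
FROM USER_PROFILE
WHERE UID = ?;

-- Fetch one attribute of one user.
-- ATTRIBUTE_NAME is the last primary key component, so the full key is specified.
SELECT ATTRIBUTE_VALUE_INT
FROM USER_PROFILE
WHERE UID = ? AND ATTRIBUTE_NAME = 'age';
```

Only one of the three ATTRIBUTE_VALUE_* columns is expected to be set per row, so the caller has to know (or check) which typed column holds the value for a given attribute name.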
The first solution has the following problems: