So I am a student currently learning about PostgreSQL. I am trying to figure out the way, how to randomly seed data. I have 10M users and 100 stocks.
Currently my tables will look like:
CREATE TABLE user (
user_id INTEGER NOT NULL,
amount_of_stocks [][] array, -- this is just assumption
PRIMARY KEY (user_id)
);
CREATE TABLE stock (
stock_id INTEGER NOT NULL,
amount_per_stock INT,
quantity INT
PRIMARY KEY (stock_id)
);
How would I store 100 different stocks for each user?
Sounds like a classical many-to-many relationship. Should not involve arrays at all. Assuming Postgres 10 or later, use something along these lines:
CREATE TABLE users ( -- "user" is a reserved word!
user_id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, username text UNIQUE NOT NULL -- or similar
);
CREATE TABLE stock (
stock_id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, stock text UNIQUE NOT NULL -- or similar
);
CREATE TABLE user_stock (
user_id int REFERENCES users
, stock_id int REFERENCES stock
, amount int NOT NULL
, PRIMARY KEY (user_id, stock_id)
);
Detailed explanation:
Postgres provides generate_series()
to conveniently generate rows. random()
is instrumental to generate random numbers:
INSERT INTO users(username)
SELECT 'user_' || g
FROM generate_series(1, 10000000) g; -- 10M (!) - try with just 10 first
INSERT INTO stock(stock)
SELECT 'stock_' || g
FROM generate_series(1, 100) g;
Experiment with a small number of users first. 10M users * 100 stocks generates a billion rows. Takes some time and occupies some space.
How would I store 100 different stocks for each user?
INSERT INTO user_stock
(user_id, stock_id, amount)
SELECT u.user_id, s.stock_id, ceil(random() * 1000)::int
FROM users u, stock s; -- cross join
Every user gets 100 different stocks - though everyone gets the same set in this basic example, you did not define more closely. I added a random amount per stock between 1 and 1000.
About the cross join to produce the Cartesian product: