I'm taking a database class and I'd like to have a large sample database to experiment with. My definition of large here is that there's enough data in the database so that if I try a query that's very inefficient, I'll be able to tell by the amount of time it takes to execute. I've googled for this and not found anything that's HSQLDB specific, but maybe I'm using the wrong keywords. Basically I'm hoping to find something that's already set up, with the tables, primary keys, etc. and normalized and all that, so I can try things out on a somewhat realistic database. For HSQLDB I guess that would just be the .script file. Anyway if anybody knows of any resources for this I'd really appreciate it.
You can use the MySQL Sakila sample database schema and data (open source, available on the MySQL web site), but you need to modify the schema definition for HSQLDB. You can delete the view and trigger definitions, which are not necessary for your experiment. The table definitions need some editing. For example, the MySQL definition of the country table:
CREATE TABLE country (
country_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
country VARCHAR(50) NOT NULL,
last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (country_id)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;
Modified for HSQLDB:
CREATE TABLE country (
country_id SMALLINT GENERATED BY DEFAULT AS IDENTITY,
country VARCHAR(50) NOT NULL,
last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (country_id)
)
Some MySQL DDL syntax is supported in the MYS syntax mode of HSQLDB (enabled with SET DATABASE SQL SYNTAX MYS TRUE or the sql.syntax_mys=true connection property). For example, AUTO_INCREMENT is translated to IDENTITY. Other constructs need manual editing. The data is mostly compatible, apart from some binary strings.
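If you would rather not edit every table by hand, a small script can take care of the mechanical part. The following is only a rough sketch in Python (my choice of language, and the function name mysql_to_hsqldb is made up for illustration). It applies the same edits as the country example above using plain regular expressions. It is not a SQL parser, and it will not cover everything in the Sakila schema, so review the output before loading it.

```python
import re


def mysql_to_hsqldb(ddl: str) -> str:
    """Apply the textual edits from the country example to MySQL DDL.

    Hypothetical helper, regex based: it only mirrors the manual edits
    shown in the answer and makes no attempt to parse SQL.
    """
    flags = re.IGNORECASE
    # Drop UNSIGNED, as the modified example does.
    ddl = re.sub(r"\s+UNSIGNED\b", "", ddl, flags=flags)
    # Replace [NOT NULL] AUTO_INCREMENT with an identity column.
    ddl = re.sub(
        r"(?:\s+NOT\s+NULL)?\s+AUTO_INCREMENT\b",
        " GENERATED BY DEFAULT AS IDENTITY",
        ddl,
        flags=flags,
    )
    # Drop the ON UPDATE clause, as the modified example does.
    ddl = re.sub(r"\s+ON\s+UPDATE\s+CURRENT_TIMESTAMP\b", "", ddl, flags=flags)
    # Drop MySQL table options (ENGINE=..., DEFAULT CHARSET=...) that
    # follow the closing parenthesis, up to the statement terminator.
    ddl = re.sub(r"\)\s*ENGINE\s*=[^;]*", ")", ddl, flags=flags)
    return ddl


if __name__ == "__main__":
    mysql_ddl = """CREATE TABLE country (
country_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
country VARCHAR(50) NOT NULL,
last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (country_id)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;"""
    print(mysql_to_hsqldb(mysql_ddl))
```

Views, triggers, and any column types that HSQLDB does not accept would still have to be removed or changed by hand.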
To compare queries, you need to access the database with a tool that reports the query execution time. The HSQLDB DatabaseManager does this when the query output is in Text mode.