I would like to create a sqlite3 database to use with Python. I am using the ArnetMiner dataset, which has "blocks" of data for each entity. A block's format is as follows:
#* --- paperTitle
#@ --- Authors
#year --- Year
#conf --- publication venue
#citation --- citation number (both -1 and 0 mean none)
#index --- index id of this paper
#arnetid --- pid in the ArnetMiner database
#% --- the ids of this paper's references (there are multiple lines, one per reference)
#! --- Abstract
The following is an example:
#*Spatial Data Structures.
#@Hanan Samet,Wei Lee Chang,Jose Fernandez
#year1995
#confModern Database Systems
#citation2743
#index25
#arnetid27
#%165
#%356
#%786754
#%3243
#!An overview is presented of the use of spatial data structures in spatial databases. The focus is on hierarchical data structures, including a number of variants of quadtrees, which sort the data with respect to the space occupied by it. Such techniques are known as spatial indexing methods. Hierarchical data structures are based on the principle of recursive decomposition.
Here is my question:
How do I import this into the sqlite3 tables that I created?
Usually the datasets I use are just tab-separated, so after creating the table I would simply run:
.separator "\t"
.import Data.txt table_name
I created the tables as follows:
CREATE TABLE publications (
PaperTitle varchar(150),
Year int,
Conference varchar(150),
Citations int,
ID int primary key,
arnetId int,
Abstract text
);
CREATE TABLE authors (
ID int primary key,
Name varchar (100)
);
CREATE TABLE authors_publications (
PaperID int,
AuthorID int
);
CREATE TABLE publications_citations (
PaperID int,
CitationID int
);
Basically, I guess I'm asking: is there a quick way to import the dataset into the database tables I created, or would I have to write a Python script and insert one block at a time?
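For context, the script route I'm imagining would look roughly like this. This is only a sketch: it assumes each block has already been parsed into a 7-tuple matching the publications schema, and the database filename `arnetminer.db` is made up for the example.

```python
import sqlite3

# Hypothetical pre-parsed rows: one 7-tuple per block, in column order
# (PaperTitle, Year, Conference, Citations, ID, arnetId, Abstract).
rows = [
    ("Spatial Data Structures.", 1995, "Modern Database Systems",
     2743, 25, 27, "An overview is presented ..."),
]

conn = sqlite3.connect("arnetminer.db")
conn.execute("""CREATE TABLE IF NOT EXISTS publications (
    PaperTitle varchar(150), Year int, Conference varchar(150),
    Citations int, ID int primary key, arnetId int, Abstract text)""")
with conn:  # one transaction for the whole batch, much faster than per-row commits
    conn.executemany("INSERT INTO publications VALUES (?,?,?,?,?,?,?)", rows)
conn.close()
```

Wrapping the `executemany` in a single transaction matters for a dataset this size; committing per row would be orders of magnitude slower.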
In the end, the best way was to parse the data myself, rewrite it into CSV files, and import those directly into my database's tables.
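A minimal sketch of that parser, assuming the dump is in `Data.txt` and blocks are separated by blank lines (an assumption about the file layout). It writes two of the CSVs, for the publications and publications_citations tables; the authors tables would follow the same pattern once each author name is assigned an id.

```python
import csv

def parse_blocks(path):
    """Yield one dict per paper block, keyed by field name."""
    block = {"refs": [], "authors": []}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:  # blank line ends the current block
                if "index" in block:
                    yield block
                block = {"refs": [], "authors": []}
            elif line.startswith("#*"):
                block["title"] = line[2:]
            elif line.startswith("#@"):
                block["authors"] = line[2:].split(",")
            elif line.startswith("#year"):
                block["year"] = line[5:]
            elif line.startswith("#conf"):
                block["conf"] = line[5:]
            elif line.startswith("#citation"):
                block["citations"] = line[9:]
            elif line.startswith("#index"):
                block["index"] = line[6:]
            elif line.startswith("#arnetid"):
                block["arnetid"] = line[8:]
            elif line.startswith("#%"):
                block["refs"].append(line[2:])
            elif line.startswith("#!"):
                block["abstract"] = line[2:]
    if "index" in block:  # last block if the file has no trailing blank line
        yield block

def write_csvs(path):
    """Rewrite the dump as CSVs matching the table schemas."""
    with open("publications.csv", "w", newline="", encoding="utf-8") as pubs, \
         open("publications_citations.csv", "w", newline="", encoding="utf-8") as cites:
        pub_w, cite_w = csv.writer(pubs), csv.writer(cites)
        for b in parse_blocks(path):
            pub_w.writerow([b.get("title"), b.get("year"), b.get("conf"),
                            b.get("citations"), b["index"], b.get("arnetid"),
                            b.get("abstract")])
            for ref in b["refs"]:
                cite_w.writerow([b["index"], ref])
```

After that, `.mode csv` followed by `.import publications.csv publications` in the sqlite3 shell loads each file into its table.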