What is a good tool to automate or semi-automate (i.e. give a good head start on) the process of taking a rectangle of data from a statistics package like SPSS and:
I doubt full automation is possible, but this must be a reasonably common task. We have about a dozen such datasets, some with several hundred variables, that we want to set up in a relational database (Oracle, if that makes any difference). Doing this by hand poses no conceptual difficulty; the cost is simply prohibitive.
I feel there must be such a tool available but I am clearly searching in the wrong places or using the wrong terminology.
(edit - added the R tag because in my own answer to this I am using it as part of the solution)
OK, after further investigation (and thanks for the answer I was given, which was helpful although not quite fully there), I now favour:
as.numeric() or unclass() version, so it is just the numbers, not the labels
sqlSave() from the RODBC package.
Step 2 is facilitated by a little function like this:
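# Build a reference (lookup) table from the levels of a factor.
factorToRef <- function(x, field) {
  tmp <- levels(x)
  # One row per level: an integer ID and the level's label.
  # seq_along() is safe even when the factor has no levels.
  tab <- data.frame(seq_along(tmp), tmp)
  names(tab) <- paste(field, c("_ID", "_NAME"), sep = "")
  tab
}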
This gives results like:
> data(iris)
> factorToRef(iris$Species, "species")
  species_ID species_NAME
1          1       setosa
2          2   versicolor
3          3    virginica
These results are then the basis of a reference table to be saved in the database.
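To show how the pieces might fit together across a whole data frame, here is a rough sketch rather than a finished tool. The helper name splitFactors, the DSN "my_dsn", and the table names are my own placeholders, and I am assuming every categorical variable has already arrived in R as a factor. It builds one reference table per factor column and replaces each factor in the main table with its integer codes.

```r
# Restated from the answer above so that this sketch runs on its own.
factorToRef <- function(x, field) {
  tmp <- levels(x)
  tab <- data.frame(seq_along(tmp), tmp)
  names(tab) <- paste(field, c("_ID", "_NAME"), sep = "")
  tab
}

# Hypothetical helper: split a data frame into a main table holding integer
# codes and a named list with one reference table per factor column.
splitFactors <- function(df) {
  facNames <- names(df)[vapply(df, is.factor, logical(1))]
  main <- df
  refs <- list()
  for (nm in facNames) {
    refs[[nm]] <- factorToRef(df[[nm]], nm)  # lookup table for this factor
    main[[nm]] <- as.integer(df[[nm]])       # keep only the numeric codes
  }
  list(main = main, refs = refs)
}

data(iris)
parts <- splitFactors(iris)
str(parts$main$Species)   # integer codes instead of labels
parts$refs$Species        # the matching reference table

# Saving needs a live ODBC connection, so it is left commented out.
# "my_dsn" and the table names are placeholders, not real objects.
# library(RODBC)
# ch <- odbcConnect("my_dsn")
# sqlSave(ch, parts$main, tablename = "IRIS", rownames = FALSE)
# sqlSave(ch, parts$refs$Species, tablename = "SPECIES_REF", rownames = FALSE)
# odbcClose(ch)
```

The codes in the main table line up with the _ID column of each reference table because both come from the same levels() ordering. Foreign key constraints would still have to be added on the database side.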