Here is a snippet of data I have to analyze:
2014 log[1]: id="1" sys="Sec" sub="fill" severity="inf" dort="30"
2014 log[2]: id="2" sys="Sec" sub="fill" severity="inf" name="dropped"
Excuse me if this is an obvious question, but what kind of data is this, and how can I get it into a csv format to analyze? So far I have tried using R by specifying whitespace as the delimiter, but that didn't get me what I need:
table1<-read.table('data.txt', sep="", fill=TRUE, header = FALSE)
Ideally, "id", "sys", "sub", "severity", "dort"
, and "name"
would be the column headers. Each row contains about the same number of variables, but some rows do not contain variables that other rows do. For those instances, I would like there to be "NULL". Ex: In row two in the data above, under the dort
column I would like it to say "NULL"
Hope this helps!
id="1" sys="Sec" sub="fill" severity="inf" dort="30"
id="2" sys="Sec" sub="fill" severity="inf" name="dropped"
Let's consider that test.txt
has above mentioned data then
library(rjson)
library(plyr)
#prepare json data
txt_file <- readLines("test.txt")
json_file <- lapply(txt_file, function(x) paste0('{"', gsub(',', ',"',gsub("=",'":',gsub('"\\s+', '",', gsub("^\\s+|\\s+$","",x)))), '}'))
json_file <- paste0("[", paste(json_file, collapse=","), "]")
#read json data
json_data <- fromJSON(json_file)
#convert it to dataframe
df <- rbind.fill(lapply(json_data,function(y){as.data.frame(t(y),stringsAsFactors=FALSE)}))
df[df=='NULL'] <- NA
df
Output is:
id sys sub severity dort name
1 1 Sec fill inf 30 NA
2 2 Sec fill inf NA dropped