I’m trying to split my server log into multiple files so I can run some metrics on each one. A cron job appends a string with a timestamp to the log on the first of every month; the string looks like this: ‘Monthly Breakpoint, March 1 2020’. The idea is to break the large server-log file into multiple log files at this delimiter line, then run metrics on each file. I’m trying to write a script that creates these output files, but I’m struggling with it. So far I can read the file, loop through the lines, and find the delimiter, but I’m not sure what the best approach is for a problem like this. Maybe I shouldn't be using R and there's an easier way?
# server log
serverLog <- "server-out.log"
# Process file: readLines() accepts a path and opens/closes the file itself
linn <- readLines(serverLog)
# seq_along() is safe for an empty file, unlike 1:length(linn)
for (i in seq_along(linn)) {
  print(linn[i])
  # TRUE when the line contains the monthly delimiter
  test <- grepl("Monthly", linn[i])
  # print(paste("test: ", test, sep = ""))
  if (test) {
    print("Found Monthly Breakpoint")
  }
}
# Example of the server-out.log file
[0mGET /notifications [36m304 [0m9.439 ms - -[0m
[0mGET /user/status [36m304 [0m2.137 ms - -[0m
[0mGET /user/status [36m304 [0m5.675 ms - -[0m
[0mPOST /user/login [32m200 [0m19.960 ms - 30[0m
[0mGET /user/status [36m304 [0m9.518 ms - -[0m
[0mGET /user/status [32m200 [0m2.364 ms - 16[0m
[0mGET /user/status [36m304 [0m1.396 ms - -[0m
[0mGET /user/status [36m304 [0m1.087 ms - -[0m
[0mPOST /user/login [32m200 [0m300.214 ms - 30[0m
[0mGET /user/status [36m304 [0m4.374 ms - -[0m
[0mGET /localUser [32m200 [0m2.260 ms - 1045[0m
Monthly Breakpoint, March 1 2020
[0mGET /user/status [32m200 [0m5.284 ms - 16[0m
[0mGET /user/status [36m304 [0m2.101 ms - -[0m
[0mGET /users [32m200 [0m2.387 ms - 36[0m
[0mGET /notifications [32m200 [0m30.395 ms - 2624[0m
[0mGET /user/status [36m304 [0m2.172 ms - -[0m
[0mGET /user/status [36m304 [0m1.424 ms - -[0m
[0mGET /user/status [36m304 [0m2.074 ms - -[0m
[0mGET /user/status [36m304 [0m0.920 ms - -[0m
[0mGET /users [36m304 [0m2.471 ms - -[0m
[0mGET /notifications [36m304 [0m8.416 ms - -[0m
[0mGET /user/status [36m304 [0m1.757 ms - -[0m
[0mGET /user/status [36m304 [0m1.114 ms - -[0m
[0mGET /favicon.ico [33m404 [0m2.218 ms - 150[0m
[0mGET /user/status [36m304 [0m2.003 ms - -[0m
[0mPOST /user/login [32m200 [0m175.473 ms - 30[0m
[0mGET /user/status [36m304 [0m3.893 ms - -[0m
csplit -z server-out.min /Monthly/ '{*}'
csplit: illegal option -- z
usage: csplit [-ks] [-f prefix] [-n number] file args ...
This isn't the most elegant answer, but it got me what I needed. I'll try out the other answer; it's a good idea to keep the data in my R environment so I can run all my metrics without reading in unnecessary files. Thanks @Till
#~~~~~~~~~~~~~~~~~~~~~~#
#~~ Parse Server Log ~~#
#~~~~~~~~~~~~~~~~~~~~~~#
# Read file (readLines() opens and closes the path itself)
serverLog <- "server-out.min"
linn <- readLines(serverLog)
num <- 1
# Loop through file; seq_along() also handles an empty file
for (i in seq_along(linn)) {
  # print(linn[i])
  # Current output file (named outFile to avoid masking base::file)
  outFile <- paste("server-log-", num, sep = "")
  # Append the line to the current output file.
  # Re-running the script appends again, so delete old output files first.
  write(linn[i], file = outFile, append = TRUE)
  # Check for monthly delimiter; the delimiter line itself ends the current file
  test <- grepl("Monthly", linn[i])
  if (test) {
    print("Found Monthly Breakpoint")
    num <- num + 1
  }
}
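For keeping everything in the R environment instead of writing files, here is a minimal sketch of one possible vectorized approach. It is not necessarily what the other answer does. The sample lines, `log_lines`, `chunk_id`, `chunks`, and `requests_per_chunk` are all made up for illustration; in practice `log_lines` would come from `readLines("server-out.min")`. As in the loop above, each delimiter line closes its chunk.

```r
# Hypothetical sample lines standing in for readLines("server-out.min")
log_lines <- c(
  "GET /notifications 304 9.439 ms - -",
  "POST /user/login 200 19.960 ms - 30",
  "Monthly Breakpoint, March 1 2020",
  "GET /user/status 200 5.284 ms - 16",
  "GET /users 200 2.387 ms - 36",
  "Monthly Breakpoint, April 1 2020",
  "GET /favicon.ico 404 2.218 ms - 150"
)

# TRUE on each delimiter line
is_break <- grepl("Monthly Breakpoint", log_lines, fixed = TRUE)

# Chunk id per line: counts the delimiters seen *before* the current line,
# so a delimiter stays in the chunk it closes
chunk_id <- cumsum(is_break) - is_break + 1

# Named list of character vectors, one element per chunk
chunks <- split(log_lines, chunk_id)

# Example metric (an assumption about what "metrics" might mean):
# number of request lines per chunk, delimiter lines excluded
requests_per_chunk <- sapply(
  chunks,
  function(x) sum(!grepl("Monthly Breakpoint", x, fixed = TRUE))
)
print(requests_per_chunk)
```

If the separate files are still wanted, each element of `chunks` could be written out with `writeLines()`, which overwrites rather than appends, so re-running would not duplicate lines.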