I have to open a text file which contain data that I want to open as a tibble or dataframe. The data are stored in a tbale with white space as separator (I suppose). The probleme is that whitespace is also used in the column "PI". I have no idea how to open it without doing any complicated script to scan all of it and parse it with regex. Do you have any tips ? I think it was designed to have visualy aligned column when we open it as a text file.
here is a sample of the file.
DAC PI SPROF_DATE_UPDATE DATA_MODE WMO CYCLE PROFILE_DATE LON LAT MPD_RAW MPD_ADJ Z_RAW Z_ADJ
aoml STEPHEN RISER , KENNETH JOHNSON 2021-05-12 18:52:52 R 5903612 038 2012-07-04 01:15:50 8.651 -41.466 -89.313473 NaN 5.439052 NaN
aoml STEPHEN RISER , KENNETH JOHNSON 2020-12-03 02:03:33 R 5903717 138 2014-02-03 18:07:45 -164.533 -69.958 134.866767 NaN 7.884637 NaN
aoml STEPHEN RISER , KENNETH JOHNSON 2021-05-12 22:44:39 R 5903717 139 2014-02-10 20:05:49 -164.302 -70.061 99.848464 NaN 5.803396 NaN
aoml STEPHEN RISER , KENNETH JOHNSON 2021-05-12 22:44:53 R 5903717 140 2014-02-17 22:06:53 -164.146 -70.246 107.535752 NaN 6.260275 NaN
Fixing @danlooo
First: header separator is single space.
Second: sometimes before -
there is just a single space
data = readLines("testdata.txt")
data[1] = gsub(pattern ="\\s+" ,replacement = "\t" ,x = data[1])
modified_data = gsub(pattern ="\\s-" ,replacement = " -" ,x = data)
modified_data = gsub(pattern ="\\s{2,}" ,replacement = "\t" ,x = modified_data)
final_data = paste0(modified_data,collapse = "\n")
write(final_data, "finaldata.txt")
finaldatatable <- read_delim("finaldata.txt",
delim = "\t", escape_double = FALSE,
trim_ws = TRUE)