Currently, I have the following dataframe (the first 30 rows, from dput()):
structure(list(PacketTime = c(0.0636830000000002, 0.0691829999999989,
0.0639040000000008, 0.0636270000000003, 0.0656370000000024, 0.064778000000004,
0.0616950000000003, 0.0666280000000015, 0.0630829999999989, 0.0665130000000005,
0.0621160000000032, 0.0654010000000014, 0.0652889999999928, 0.0640989999999988,
0.0621339999999861, 0.0645319999999998, 0.065757000000005, 0.0624459999999942,
0.061782000000008, 0.0626439999999917, 0.0648419999999987, 0.0664910000000134,
0.0644649999999984, 0.0654030000000034, 0.0657139999999998, 0.0642799999999966,
0.069137000000012, 0.0631520000000023, 0.0634139999999945, 0.0615009999999927
), FrameLen = list(c(304L, 276L, 276L), c(304L, 276L, 276L),
c(304L, 276L, 276L), c(304L, 276L, 276L), c(304L, 276L, 276L
), c(304L, 276L, 276L), c(304L, 276L, 276L), c(304L, 276L,
276L, 276L, 276L), c(304L, 276L, 276L), c(304L, 276L, 276L,
276L, 276L), c(304L, 276L, 276L), c(304L, 276L, 276L), c(304L,
276L, 276L), c(304L, 276L, 276L), c(304L, 276L, 276L), c(304L,
276L, 276L), c(304L, 276L, 276L, 276L, 276L), c(304L, 276L,
276L), c(304L, 276L, 276L), c(304L, 276L, 276L), c(304L,
276L, 276L, 276L, 276L), c(304L, 276L, 276L), c(304L, 276L,
276L), c(304L, 276L, 276L, 276L), c(304L, 276L, 276L, 276L,
276L), c(304L, 276L, 276L), c(304L, 276L, 276L), c(304L,
276L, 276L), c(304L, 276L, 276L), c(304L, 276L, 276L)), IPLen = list(
c(300L, 272L, 272L), c(300L, 272L, 272L), c(300L, 272L, 272L
), c(300L, 272L, 272L), c(300L, 272L, 272L), c(300L, 272L,
272L), c(300L, 272L, 272L), c(300L, 272L, 272L, 272L, 272L
), c(300L, 272L, 272L), c(300L, 272L, 272L, 272L, 272L),
c(300L, 272L, 272L), c(300L, 272L, 272L), c(300L, 272L, 272L
), c(300L, 272L, 272L), c(300L, 272L, 272L), c(300L, 272L,
272L), c(300L, 272L, 272L, 272L, 272L), c(300L, 272L, 272L
), c(300L, 272L, 272L), c(300L, 272L, 272L), c(300L, 272L,
272L, 272L, 272L), c(300L, 272L, 272L), c(300L, 272L, 272L
), c(300L, 272L, 272L, 272L), c(300L, 272L, 272L, 272L, 272L
), c(300L, 272L, 272L), c(300L, 272L, 272L), c(300L, 272L,
272L), c(300L, 272L, 272L), c(300L, 272L, 272L)), Movement = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -30L), class = c("tbl_df",
"tbl", "data.frame"))
From here, I can put the dataframe (in the variable packets) into a matrix for use with the keras package:
packets.m <- as.matrix(packets)
However, when I attempt to pass this into the model (without normalisation) or normalise before passing, I receive the following error:
Error in py_call_impl(callable, dots$args, dots$keywords) : Matrix type cannot be converted to python (only integer, numeric, complex, logical, and character matrixes can be converted
Thus, how can I effectively normalise the two list columns, FrameLen and IPLen, so that I can use them in a deep learning model with the keras package?
EDIT: The full dput() for the packets dataframe can be found here: https://pastebin.com/cXKdSB2y
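It depends on how you want to train on this data. The list columns can be expanded either into long format (one row per list element) or into wide format (one column per position in the list):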
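library(tidyverse)
# df is the packets data frame from the question
# Long format: one row per element of the list columns
df %>%
  unnest(cols = c(FrameLen, IPLen))
# Wide format: one column per position; shorter lists are filled with NA
df %>%
  mutate(position = map(FrameLen, seq_along), id = row_number()) %>%
  unnest(cols = c(FrameLen, IPLen, position)) %>%
  pivot_wider(names_from = position, values_from = c(FrameLen, IPLen))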
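Once the list columns are spread out, the result still has to become a plain numeric matrix before keras will accept it. Below is a minimal base-R sketch of that step, not a definitive recipe. It assumes that zero-padding the shorter lists is acceptable for your model and that dividing each feature group by its maximum is a reasonable normalisation. The small packets stand-in and the helper pad_list_col are hypothetical names made up for the illustration.

```r
# Small hypothetical stand-in for `packets` (same column layout as in the question).
packets <- data.frame(PacketTime = c(0.0637, 0.0692, 0.0666), Movement = c(0, 0, 0))
packets$FrameLen <- list(c(304L, 276L, 276L),
                         c(304L, 276L, 276L, 276L, 276L),
                         c(304L, 276L, 276L, 276L))
packets$IPLen <- list(c(300L, 272L, 272L),
                      c(300L, 272L, 272L, 272L, 272L),
                      c(300L, 272L, 272L, 272L))

# Hypothetical helper: pad every element of a list column to the same
# length (assumption: padding with 0) and return a numeric matrix with
# one row per data frame row.
pad_list_col <- function(col, name, fill = 0) {
  width <- max(lengths(col))
  rows <- lapply(col, function(v) c(as.numeric(v), rep(fill, width - length(v))))
  m <- do.call(rbind, rows)
  colnames(m) <- paste(name, seq_len(width), sep = "_")
  m
}

frame <- pad_list_col(packets$FrameLen, "FrameLen")
ip    <- pad_list_col(packets$IPLen, "IPLen")

# Scale each feature group by its overall maximum (assumption), so real
# values land in (0, 1] and the padding stays at 0.
x <- cbind(PacketTime = packets$PacketTime, frame / max(frame), ip / max(ip))
y <- packets$Movement
```

Here x is an ordinary numeric matrix (one row per packet, eleven columns for this stand-in) and y is the Movement vector, so neither contains list elements any more. Whether zero-padding or some other treatment of the missing positions is appropriate depends on the model you are training.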