In R, we have the function factor()
. I would like to use this function in a parallelized way, with SparkR.
My version of Spark is 1.6.2, and I cannot find an equivalent in the documentation. I thought I could do it with a map, but I am not certain I understand that answer, and there should be an easier way.
So, to put it simply: what is the equivalent of factor()
in SparkR?
There is no direct equivalent. Spark ML encodes every variable, including categorical ones, as double-precision numbers and uses column metadata to distinguish the different types, so there is no dedicated factor type to convert to. For ML algorithms you can use the R formula interface, which encodes string (categorical) columns automatically.
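For example, a minimal sketch against SparkR 1.6 (this assumes a local Spark installation; the `glm` call follows the SparkR 1.6 documentation, where string columns such as `Species` are encoded automatically by the formula, much like `factor()` would do in base R):

```r
library(SparkR)

# Initialize Spark (SparkR 1.6 API)
sc <- sparkR.init()
sqlContext <- sparkRSQL.init(sc)

# Create a Spark DataFrame from a local data.frame; note that
# SparkR replaces "." in column names with "_" (Sepal.Length -> Sepal_Length)
df <- createDataFrame(sqlContext, iris)

# Species is a string column; the formula encodes it as a categorical
# feature automatically, so no explicit factor() call is needed
model <- glm(Sepal_Length ~ Sepal_Width + Species, data = df, family = "gaussian")
summary(model)
```

So rather than converting a column up front, you let the formula (or the ML pipeline) handle the encoding at fit time.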