Is there a way to correctly represent missing values in VW input format -- not to impute with the mean or median, not to set them to 0 or any other constant, but to treat them as really missing, so that SGD and FTRL-Proximal algorithms could exclude these coordinates from the gradient computation for a given example?
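VW expects a sparse feature representation as its input format (see the VW wiki), so missing values are already handled correctly: simply don't list the features whose values are missing. A feature that is absent from an example contributes nothing to that example's prediction, and the gradient of the loss with respect to its weight is zero for that example.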
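As a rough illustration of "don't list the missing features", here is a minimal sketch that turns one record into a line of VW text input. The helper name `to_vw_line`, the sample record, and the choice to treat `None` and NaN as "missing" are assumptions made for this example, not part of VW itself; only the `label | name:value ...` line layout comes from the VW input format.

```python
import math


def to_vw_line(label, features):
    """Format one example as a VW text line, omitting missing features.

    Hypothetical helper: `features` maps feature names to numeric values.
    Names are assumed to contain no spaces, ':' or '|' characters.
    """
    parts = []
    for name, value in features.items():
        # Treat None and NaN as missing: leave the feature out entirely
        # instead of writing 0 or an imputed value.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        parts.append(f"{name}:{value}")
    return f"{label} | " + " ".join(parts)


# Made-up record: `income` and `height` are missing.
row = {"age": 42, "income": None, "height": float("nan"), "score": 0.5}
print(to_vw_line(1, row))  # 1 | age:42 score:0.5
```

The resulting line mentions only `age` and `score`, so the other two features simply do not appear in that example.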