My goal is to run a grid search over the parameter space of various VW models (trying different loss functions, regularization settings, etc.). Since the model could use multiple passes, I would like to use cross-validation. Should I implement my own cross-validation code (perhaps as a bash script), or would I be reinventing the wheel? Any pointers on whether this has been done before, or on the best way to proceed, would be useful. I was thinking of implementing cross-validation in a bash script and using GNU parallel to parallelize the grid search.
You should try the vw-hypersearch Perl script ( https://github.com/JohnLangford/vowpal_wabbit/blob/HEAD/utl/vw-hypersearch ), which can also be found in the utl directory of VW. It can help you tune the VW parameters, but as far as cross-validation goes, you have to implement your own code, feeding the algorithm the data folds you intend to validate on.
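
If you do roll your own, the fold handling is small. Below is a minimal sketch in Python rather than bash, assuming one example per line in the VW input file and that round-robin fold assignment is acceptable (shuffle the file first if its order matters). The helper names (`kfold_splits`, `vw_commands`, `grid`) and the file paths are hypothetical. The flags are standard `vw` command-line options, with `--cache_file` pointing each run at its own cache so that parallel multi-pass runs do not share one.

```python
import itertools


def kfold_splits(lines, k):
    """Yield (train_lines, test_lines) for each of k folds.

    Line i goes to fold i % k, so no randomness is involved.
    """
    for fold in range(k):
        train = [ln for i, ln in enumerate(lines) if i % k != fold]
        test = [ln for i, ln in enumerate(lines) if i % k == fold]
        yield train, test


def vw_commands(train_path, test_path, model_path, loss, l2, passes):
    """Build the train and test command lines for one fold and one setting."""
    train_cmd = [
        "vw", "-d", train_path, "-f", model_path,
        "--loss_function", loss, "--l2", str(l2),
        "--passes", str(passes),
        # Multiple passes need a cache; -k discards any stale one.
        "--cache_file", model_path + ".cache", "-k",
    ]
    # -t evaluates only, without updating the model loaded with -i.
    test_cmd = ["vw", "-t", "-d", test_path, "-i", model_path,
                "-p", test_path + ".pred"]
    return train_cmd, test_cmd


def grid(losses, l2_values):
    """Every (loss, l2) combination to evaluate."""
    return list(itertools.product(losses, l2_values))
```

Each command list can be handed to `subprocess.run`, or joined into lines of a job file for GNU parallel. In both cases, write each fold's train and test lines to separate files first, and average the reported held-out loss over the folds for each parameter setting.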