I'm using Weights & Biases cloud-hosted sweeps with Keras. First I create a new sweep within a W&B project with a config like the following:
description: LSTM Model
method: random
metric:
  goal: maximize
  name: val_accuracy
name: LSTM-Sweep
parameters:
  batch_size:
    distribution: int_uniform
    max: 128
    min: 32
  epochs:
    distribution: constant
    value: 200
  node_size1:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size2:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size3:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size4:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size5:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  num_layers:
    distribution: categorical
    values:
      - 1
      - 2
      - 3
  optimizer:
    distribution: categorical
    values:
      - Adam
      - Adamax
      - Adagrad
  path:
    distribution: constant
    value: "./path/to/data/"
program: sweep.py
project: SLR
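Since method: random is used, the sweep controller draws each parameter independently for every run. A rough standalone emulation of that sampling (the search-space dict mirrors part of the config above; the sampling logic is my own sketch, not W&B's actual implementation):

```python
import random

# Search space mirroring part of the sweep config above
PARAMETERS = {
    "batch_size": {"distribution": "int_uniform", "min": 32, "max": 128},
    "epochs": {"distribution": "constant", "value": 200},
    "num_layers": {"distribution": "categorical", "values": [1, 2, 3]},
    "optimizer": {"distribution": "categorical", "values": ["Adam", "Adamax", "Adagrad"]},
}

def sample_config(parameters, rng=random):
    """Draw one hyperparameter combination, as a random-search controller would."""
    config = {}
    for name, spec in parameters.items():
        if spec["distribution"] == "int_uniform":
            config[name] = rng.randint(spec["min"], spec["max"])  # inclusive bounds
        elif spec["distribution"] == "constant":
            config[name] = spec["value"]
        elif spec["distribution"] == "categorical":
            config[name] = rng.choice(spec["values"])
    return config

print(sample_config(PARAMETERS))
```

Each agent-launched run then receives one such combination via wandb.config.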
My sweep.py file looks something like this:
# imports

init = wandb.init(project="my-project", reinit=True)
config = wandb.config

def main():
    skfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=7)
    cvscores = []
    group_id = wandb.util.generate_id()
    X, y = # load data
    i = 0
    for train, test in skfold.split(X, y):
        i = i + 1
        run = wandb.init(group=group_id, reinit=True, name=group_id + "#" + str(i))
        model = # build model
        model.fit([...], WandBCallback())
        cvscores.append([...])
        wandb.join()

if __name__ == "__main__":
    main()
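For reference, StratifiedKFold in the loop above just yields per-fold (train, test) index pairs with class proportions preserved. A tiny standard-library stand-in illustrating that contract (round-robin assignment per class — my own sketch, not sklearn's exact algorithm):

```python
from collections import defaultdict

def stratified_folds(y, n_splits):
    """Yield (train, test) index pairs like skfold.split(X, y):
    sample indices are dealt round-robin within each class, so every
    fold keeps roughly the original class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)
    for k in range(n_splits):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test

# Two classes, two folds: each test fold gets half of each class
y = [0, 0, 0, 0, 1, 1, 1, 1]
for train, test in stratified_folds(y, 2):
    print(train, test)
```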
I start this with the wandb agent command from within the folder containing sweep.py.
What I experienced with this setup is that the first wandb.init() call initializes a new run. Okay, I could just remove that call. But when wandb.init() is called a second time, it seems to lose track of the sweep it is running in. Online, an empty run is listed in the sweep (created by that first wandb.init() call), while all the other runs are listed in the project, but not in the sweep.
My goal is to have one run for each fold of the k-fold cross-validation. At least I thought this would be the right way of doing it. Is there a different approach to combining sweeps with Keras k-fold cross-validation?
We put together an example of how to accomplish k-fold cross validation:
https://github.com/wandb/examples/tree/master/examples/wandb-sweeps/sweeps-cross-validation
The solution requires some contortions for the wandb library to spawn multiple jobs on behalf of a launched sweep job.
The basic idea is to create a parent sweep_run in the main function and then group the per-fold runs underneath it, so that the sweep sees a single run while the fold runs stay linked together. Example visualizations of the sweep and k-fold grouping can be seen here:
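A minimal standard-library sketch of that parent/child structure (the wandb calls from the linked example appear only as comments; the group id, fold count, and the 0.0 metric value are placeholders, and threads stand in here for the subprocesses the real example spawns):

```python
import queue
import threading

def train_fold(fold_index, group_id, results):
    # In the real sweep script each fold is its own grouped wandb run:
    #   run = wandb.init(group=group_id, name=f"{group_id}#{fold_index}", reinit=True)
    #   ... build and fit the Keras model on this fold's split, log the metric ...
    val_accuracy = 0.0  # placeholder for the fold's validation metric
    results.put((fold_index, val_accuracy))

def main(num_folds=5):
    # In the real sweep script: sweep_run = wandb.init(); group_id = sweep_run.id
    group_id = "example-group"  # placeholder group id
    results = queue.Queue()
    workers = [threading.Thread(target=train_fold, args=(i, group_id, results))
               for i in range(num_folds)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    scores = [results.get()[1] for _ in workers]
    mean_val_accuracy = sum(scores) / len(scores)
    # In the real sweep script the parent reports the aggregate metric the
    # sweep optimizes: sweep_run.log({"val_accuracy": mean_val_accuracy})
    return mean_val_accuracy
```

The design point is that the sweep's metric comes from the single parent run, while the per-fold runs are only grouped for inspection.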