Tags: tensorflow, machine-learning, neural-network, data-science, batchsize

What is the difference between the batch size in the data pipeline and the batch size in model.fit()?


Do these two batch sizes mean the same thing, or do they have different meanings?

BATCH_SIZE=10
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.batch(BATCH_SIZE)

2nd

history = model.fit(train_ds,
  epochs=EPOCHS,                    
  validation_data=create_dataset(X_valid, y_valid_bin),
  max_queue_size=1,
  workers=1,
  batch_size=10,
  use_multiprocessing=False)

I'm running out of RAM. I have about 333,000 training images, 30 GB of RAM, and a 12 GB GPU. What should the batch size be for this setup?

Full Code Here


Solution

  • Dataset (Batch Size)

    In a data pipeline, the batch size only determines how many records the pipeline you defined emits at a time. For a Dataset, it is the number of records handed to the model in one iteration. For example, if you build a data generator with a batch size of 8, the generator yields 8 records on every iteration.
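To make the pipeline side concrete, here is a minimal pure-Python sketch of what batching does to a stream of records. The helper `batch_records` is hypothetical and only stands in for the kind of grouping that `dataset.batch(BATCH_SIZE)` performs; it is not TensorFlow's implementation.

```python
from itertools import islice


def batch_records(records, batch_size):
    """Yield consecutive lists of up to `batch_size` records.

    Hypothetical stand-in for the grouping done by dataset.batch();
    the last batch may be smaller when the data does not divide evenly.
    """
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 20 fake records grouped 8 at a time give batches of 8, 8 and 4.
batches = list(batch_records(range(20), batch_size=8))
batch_sizes = [len(b) for b in batches]
print(batch_sizes)  # [8, 8, 4]
```

Each item the pipeline yields is one batch, so the consumer sees three items here rather than 20 individual records.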

    Model.fit (Batch Size)

    In model.fit, the batch size is the number of records the model processes before it computes a loss and updates its weights. A deep learning model computes a loss on the forward pass and then improves itself through backpropagation. With batch_size=8 in model.fit, 8 records are passed to the model, the loss is computed over those 8 records, and the model updates its weights from that loss.
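The idea of one loss and one weight update per batch can be sketched without any framework. The example below is an illustration only, assuming a one-parameter linear model and a mean squared error loss; the function name `mse_and_gradient` and the toy data are made up for this sketch and say nothing about how Keras implements training.

```python
def mse_and_gradient(weight, batch):
    """Return the mean squared error of y ~ weight * x over one batch,
    together with the derivative of that loss with respect to weight."""
    n = len(batch)
    loss = sum((weight * x - y) ** 2 for x, y in batch) / n
    grad = sum(2 * (weight * x - y) * x for x, y in batch) / n
    return loss, grad


# One batch of 8 (x, y) records generated from y = 3x.
batch = [(float(x), 3.0 * x) for x in range(1, 9)]

weight = 0.0
learning_rate = 0.01

# Forward pass: a single loss value is computed over all 8 records.
loss_before, grad = mse_and_gradient(weight, batch)

# Backward step: one weight update for the whole batch.
weight -= learning_rate * grad

loss_after, _ = mse_and_gradient(weight, batch)
print(loss_before, loss_after)
```

The 8 records contribute to a single averaged loss and a single update, which is what "batch size" means on the training side.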

    Example:

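    Suppose the dataset is batched with 4 and model.fit is also given batch_size=8. Keras does not re-batch a tf.data.Dataset, so the batches the model trains on are the 4-record batches the dataset yields. The Keras documentation says not to specify batch_size when the input is a dataset, generator, or keras.utils.Sequence, because those inputs already produce batches; depending on the version, the extra argument may raise an error. The batch_size argument of model.fit applies only when you pass arrays or tensors directly. In the question's code, dataset.batch(BATCH_SIZE) is what sets the batch size.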

    RAM Issue

    What is your image size? Try reducing the batch size: the number of steps per epoch does not affect memory use, but the batch size does. With a batch size of 10, 10 images have to be held in memory at the same time for processing. If memory still runs out at that size, try a batch size of 4 or 2. That may help.
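A rough back-of-the-envelope calculation shows how batch size and steps per epoch trade off. The image shape below (224x224 RGB stored as float32) is an assumption for illustration, since the question does not state the image size; only the image count of 333,000 comes from the question.

```python
import math

# Assumed values for illustration only; the question does not give the image size.
HEIGHT, WIDTH, CHANNELS = 224, 224, 3   # hypothetical image shape
BYTES_PER_VALUE = 4                     # float32
NUM_IMAGES = 333_000                    # from the question


def batch_bytes(batch_size):
    """Bytes needed to hold one batch of decoded input images."""
    return batch_size * HEIGHT * WIDTH * CHANNELS * BYTES_PER_VALUE


def steps_per_epoch(batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(NUM_IMAGES / batch_size)


for size in (10, 4, 2):
    mib = batch_bytes(size) / 2**20
    print(f"batch_size={size}: {mib:.1f} MiB per batch, "
          f"{steps_per_epoch(size)} steps per epoch")
```

This counts only the input tensors. Activations, model weights, and any caching or prefetching in the pipeline add to the total, so treat the numbers as a lower bound. Smaller batches lower the memory needed per step and raise the number of steps, which is why the step count itself does not drive memory use.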