Tags: tensorflow, tensorflow2.0, tpu

What does 'with strategy.scope():' or 'with tf.distribute.experimental.TPUStrategy(tpu).scope():' do to the creation of a NN?


In the code here: https://www.kaggle.com/ryanholbrook/detecting-the-higgs-boson-with-tpus

Before the model is compiled, it is created with this code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# strategy, UNITS, ACTIVATION, DROPOUT and dense_block() are all
# defined earlier in the notebook.
with strategy.scope():
    # Wide Network
    wide = keras.experimental.LinearModel()

    # Deep Network
    inputs = keras.Input(shape=[28])
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(inputs)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    outputs = layers.Dense(1)(x)
    deep = keras.Model(inputs=inputs, outputs=outputs)
    
    # Wide and Deep Network
    wide_and_deep = keras.experimental.WideDeepModel(
        linear_model=wide,
        dnn_model=deep,
        activation='sigmoid',
    )

I don't understand what with strategy.scope() does here, or whether it affects the model in any way. What does it do exactly?

In the future, how could I figure out what code like this does? What resources should I look into?


Solution

  • Distribution strategies were introduced as part of TF2 to help distribute training across multiple GPUs, multiple machines, or TPUs with minimal code changes. For starters, I'd recommend the guide to distributed training (https://www.tensorflow.org/guide/distributed_training); a minimal setup sketch also follows at the end of this answer.

    Specifically, creating a model under a TPUStrategy scope places the model on the TPU in a replicated manner (the same weights on each core) and keeps the replica weights in sync by adding the appropriate collective communications (all-reducing the gradients). For more information, check the API doc for TPUStrategy (https://www.tensorflow.org/api_docs/python/tf/distribute/TPUStrategy) as well as the intro to TPUs in TF2 colab notebook.
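    As an illustration, here is a minimal sketch of the typical setup (this is not code from the linked notebook; the resolver arguments depend on your environment, and on Kaggle/Colab the TPU address is usually picked up automatically):

        import tensorflow as tf

        # Connect to the TPU and create the strategy.
        tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
        tf.config.experimental_connect_to_cluster(tpu)
        tf.tpu.experimental.initialize_tpu_system(tpu)
        strategy = tf.distribute.experimental.TPUStrategy(tpu)

        # The only change versus single-device training: build (and
        # compile) the model inside the strategy's scope.
        with strategy.scope():
            model = tf.keras.Sequential([
                tf.keras.layers.Dense(32, activation='relu', input_shape=[28]),
                tf.keras.layers.Dense(1, activation='sigmoid'),
            ])
            model.compile(optimizer='adam', loss='binary_crossentropy')

        # model.fit(...) is then called as usual; the strategy replicates
        # each training step across the TPU cores.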
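    To see the replication at work, you can check how many replicas are in sync and what kind of variable gets created inside the scope (again a sketch; the exact wrapper type printed varies by TF version):

        print(strategy.num_replicas_in_sync)  # e.g. 8 on a v3-8 TPU

        with strategy.scope():
            v = tf.Variable(1.0)  # created inside the scope

        print(type(v))  # a distributed variable wrapper, one copy per core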