
Importing pandas before tensorflow makes the script freeze


I recently switched from a Windows machine to a MacBook Pro with an Apple M3 Pro chip (36 GB) running macOS Sonoma (version 14.5) because of a work requirement, and I ran into something very strange. I managed to narrow the problem down to a small sample script.

When I import pandas before tensorflow / keras, the script freezes. With the imports the other way around, it works.

The script:

import numpy as np
import os
import pandas as pd
from tensorflow.keras import layers, models

print("Creating simple model...")
try:
    model = models.Sequential([
        layers.Input(shape=(10,)),
        layers.Dense(64, activation='relu'),
        layers.Dense(1, activation='linear')
    ])
    print("Model created successfully.")
except Exception as e:
    print(f"Error creating model: {e}")

x_train = np.random.rand(100, 10)
y_train = np.random.rand(100, 1)

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
try:
    model.fit(x_train, y_train, epochs=5, batch_size=32)
    print("Model training completed successfully.")
except Exception as e:
    print(f"Error during training: {e}")

This, when run, gives me the following output:

Creating simple model...
2024-05-31 18:04:07.639131: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Pro
2024-05-31 18:04:07.639149: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 36.00 GB
2024-05-31 18:04:07.639154: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 13.50 GB
2024-05-31 18:04:07.639170: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-05-31 18:04:07.639186: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)

The script freezes at this point and has to be terminated. When I swap the order of the imports:

from tensorflow.keras import layers, models
import pandas as pd

I get the following:

Creating simple model...
2024-05-31 18:07:18.879661: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Pro
2024-05-31 18:07:18.879680: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 36.00 GB
2024-05-31 18:07:18.879685: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 13.50 GB
2024-05-31 18:07:18.879705: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-05-31 18:07:18.879717: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Model created successfully.
Epoch 1/5
2024-05-31 18:07:19.269585: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
4/4 ━━━━━━━━━━━━━━━━━━━━ 1s 16ms/step - loss: 0.1177 
Epoch 2/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.1078 
Epoch 3/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0932 
Epoch 4/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.1008 
Epoch 5/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0865 
Model training completed successfully.

Note that I don't even use pandas in the script. For reference, I also imported os without using it anywhere in the script, and that import doesn't affect the behavior.
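
One way to compare import orders in isolation might be to run each variant in a fresh interpreter with a timeout, so a freeze is reported instead of blocking the terminal. The following is only a sketch and not part of the original post: `run_snippet` is a hypothetical helper, and the snippets are stand-ins so that it runs without tensorflow or pandas installed. On the affected machine they would presumably be replaced by the two import orders followed by the model code.

```python
import subprocess
import sys


def run_snippet(code, timeout_s):
    """Run `code` in a fresh interpreter and report 'ok', 'error', or 'hang'."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed once this expires
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "ok" if proc.returncode == 0 else "error"


# Stand-in snippets (assumption): swap these for the real import orders,
# e.g. "import pandas; import tensorflow; ..." and the reverse.
snippets = {
    "finishes": "print('hello')",
    "raises": "raise SystemExit(1)",
    "hangs": "import time; time.sleep(30)",
}

results = {name: run_snippet(code, timeout_s=5) for name, code in snippets.items()}
print(results)
```

Each variant starts from a clean process, so state left over from an earlier import cannot influence the next run.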

Here is the pip list output for my environment:

Package                      Version
---------------------------- -----------
absl-py                      2.1.0
astunparse                   1.6.3
Bottleneck                   1.3.7
cachetools                   5.3.3
certifi                      2024.2.2
charset-normalizer           3.3.2
db-dtypes                    1.2.0
flatbuffers                  24.3.25
gast                         0.5.4
google-api-core              2.19.0
google-auth                  2.29.0
google-cloud-bigquery        3.23.1
google-cloud-core            2.4.1
google-crc32c                1.5.0
google-pasta                 0.2.0
google-resumable-media       2.7.0
googleapis-common-protos     1.63.0
grpcio                       1.64.0
grpcio-status                1.62.2
h5py                         3.11.0
idna                         3.7
importlib_metadata           7.1.0
joblib                       1.4.2
keras                        3.3.3
libclang                     18.1.1
Markdown                     3.6
markdown-it-py               3.0.0
MarkupSafe                   2.1.5
mdurl                        0.1.2
ml-dtypes                    0.3.2
namex                        0.0.8
numexpr                      2.8.7
numpy                        1.26.4
opt-einsum                   3.3.0
optree                       0.11.0
packaging                    24.0
pandas                       2.2.1
pip                          24.0
proto-plus                   1.23.0
protobuf                     4.25.3
pyarrow                      16.1.0
pyasn1                       0.6.0
pyasn1_modules               0.4.0
Pygments                     2.18.0
python-dateutil              2.9.0.post0
pytz                         2024.1
requests                     2.32.3
rich                         13.7.1
rsa                          4.9
scikit-learn                 1.4.2
scipy                        1.11.4
setuptools                   69.5.1
six                          1.16.0
tensorboard                  2.16.2
tensorboard-data-server      0.7.2
tensorflow                   2.16.1
tensorflow-io-gcs-filesystem 0.37.0
tensorflow-macos             2.16.1
tensorflow-metal             1.1.0
termcolor                    2.4.0
threadpoolctl                3.5.0
tqdm                         4.66.4
typing_extensions            4.12.0
tzdata                       2024.1
urllib3                      2.2.1
Werkzeug                     3.0.3
wheel                        0.43.0
wrapt                        1.16.0
zipp                         3.19.0

Suggestion from comments (@Ze'ev Ben-Tsvi)

import numpy as np
import os
import pandas as pd
from tensorflow.keras import layers, models

print("Creating simple model...")

try:
    print("Initializing Sequential model...")
    model = models.Sequential()
    print("Adding input layer...")
    model.add(layers.Input(shape=(10,)))
    print("Adding first Dense layer...")
    model.add(layers.Dense(64, activation='relu'))
    print("Adding output Dense layer...")
    model.add(layers.Dense(1, activation='linear'))
    print("Model created successfully.")
except Exception as e:
    print(f"Error creating model: {e}")

x_train = np.random.rand(100, 10)
y_train = np.random.rand(100, 1)

# Compile the model
try:
    print("Compiling model...")
    model.compile(optimizer='adam', loss='mean_squared_error')
    print("Model compiled successfully.")
except Exception as e:
    print(f"Error during compilation: {e}")

# Train the model
try:
    print("Training model...")
    model.fit(x_train, y_train, epochs=5, batch_size=32)
    print("Model training completed successfully.")
except Exception as e:
    print(f"Error during training: {e}")

The output of this script is:

Initializing Sequential model...
Adding input layer...
Adding first Dense layer...
Adding output Dense layer...
Model created successfully.
Compiling model...
Model compiled successfully.
Training model...
Epoch 1/5

The script gets a little further when written like this. It no longer gets stuck at models.Sequential, but at model.fit.

Swapping the order of import again (tensorflow then pandas) I get:

Creating simple model...
Initializing Sequential model...
Adding input layer...
Adding first Dense layer...
Adding output Dense layer...
Model created successfully.
Compiling model...
Model compiled successfully.
Training model...
Epoch 1/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.4620  
Epoch 2/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 636us/step - loss: 0.3263
Epoch 3/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.2322 
Epoch 4/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 629us/step - loss: 0.1395
Epoch 5/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 690us/step - loss: 0.1251
Model training completed successfully.

The main issue here is that at no point do I get an exception, not even when wrapping each import individually in a try/except block. Either something swallows the errors or none are raised.
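
A hang raises nothing, so try/except has nothing to catch. A possible way to see where the process is stuck is the standard-library `faulthandler` module, which can dump the stack of every thread after a timeout. This is a minimal sketch under assumptions: `run_with_hang_watchdog` and `simulated_hang` are hypothetical names, and the short sleep stands in for the `model.fit(...)` call so the example runs without tensorflow.

```python
import faulthandler
import tempfile
import time


def run_with_hang_watchdog(func, timeout_s, log_file):
    """Call func(); if it is still running after timeout_s seconds,
    write the traceback of every thread to log_file (the call continues)."""
    faulthandler.dump_traceback_later(timeout_s, file=log_file)
    try:
        return func()
    finally:
        # Disarm the watchdog once the call has returned or raised.
        faulthandler.cancel_dump_traceback_later()


def simulated_hang():
    # Stand-in (assumption) for model.fit(...): outlasts the watchdog timeout.
    time.sleep(0.5)
    return "done"


with tempfile.TemporaryFile(mode="w+") as log:
    result = run_with_hang_watchdog(simulated_hang, 0.1, log)
    log.seek(0)
    dump = log.read()

print(dump)
```

In the real script the dump would more likely go to `sys.stderr` with a timeout of several seconds. The Python frames it shows would only narrow down which call is blocked, not explain the cause inside the native libraries.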


Solution

  • Importing pandas AFTER tensorflow seems to fix the issue:

    from tensorflow.keras import layers, models
    import pandas as pd
    

    source: https://evoila.com/blog/debugging-tensorflow-pandas-import-issue-macos-sonoma-apple-m3-pro/
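
Since the workaround depends on import order, a formatter or a later edit could silently reorder the imports again. One way to guard against that might be a small check at startup. The sketch below is an assumption rather than part of the linked solution: `imported_before` is a hypothetical helper, and it relies on `sys.modules` preserving insertion order, which reflects when each import started.

```python
import sys


def imported_before(first, second, modules=None):
    """Return True if `first` appears before `second` in the module table.

    `modules` defaults to sys.modules; any ordered collection of module
    names can be passed instead. Returns False if either name is missing.
    """
    names = list(sys.modules if modules is None else modules)
    if first not in names or second not in names:
        return False
    return names.index(first) < names.index(second)


# Illustration with explicit name lists (no tensorflow or pandas needed).
print(imported_before("tensorflow", "pandas", ["tensorflow", "numpy", "pandas"]))
print(imported_before("tensorflow", "pandas", ["pandas", "tensorflow"]))
```

In the real script, a line such as `assert imported_before("tensorflow", "pandas")` placed right after the imports would fail fast if the order regresses.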