I recently switched from a Windows machine to a MacBook Pro with an Apple M3 Pro (36 GB) running macOS Sonoma (version 14.5) because of a work requirement, and I ran into something very strange. I managed to isolate the root cause in a small sample script.
When I import pandas before tensorflow / keras, the script freezes. With the imports the other way around, it works.
The script:
import numpy as np
import os
import pandas as pd
from tensorflow.keras import layers, models

print("Creating simple model...")
try:
    model = models.Sequential([
        layers.Input(shape=(10,)),
        layers.Dense(64, activation='relu'),
        layers.Dense(1, activation='linear')
    ])
    print("Model created successfully.")
except Exception as e:
    print(f"Error creating model: {e}")

x_train = np.random.rand(100, 10)
y_train = np.random.rand(100, 1)

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
try:
    model.fit(x_train, y_train, epochs=5, batch_size=32)
    print("Model training completed successfully.")
except Exception as e:
    print(f"Error during training: {e}")
This, when run, gives me the following output:
Creating simple model...
2024-05-31 18:04:07.639131: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Pro
2024-05-31 18:04:07.639149: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 36.00 GB
2024-05-31 18:04:07.639154: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 13.50 GB
2024-05-31 18:04:07.639170: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-05-31 18:04:07.639186: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
The script freezes at this point and has to be terminated. When I swap the order of the imports:
from tensorflow.keras import layers, models
import pandas as pd
I get the following:
Creating simple model...
2024-05-31 18:07:18.879661: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Pro
2024-05-31 18:07:18.879680: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 36.00 GB
2024-05-31 18:07:18.879685: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 13.50 GB
2024-05-31 18:07:18.879705: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-05-31 18:07:18.879717: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Model created successfully.
Epoch 1/5
2024-05-31 18:07:19.269585: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
4/4 ━━━━━━━━━━━━━━━━━━━━ 1s 16ms/step - loss: 0.1177
Epoch 2/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.1078
Epoch 3/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0932
Epoch 4/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.1008
Epoch 5/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0865
Model training completed successfully.
Note that I don't even use pandas in the script. For reference, I also imported os without using it anywhere in the script, and that import has no effect on the behavior.
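Because the frozen script has to be terminated by hand, comparing the two import orders is easier when each variant runs in its own child process with a timeout. The following is only a sketch of that idea: the helper name `run_with_timeout`, the shortened one-layer model body, and the 120-second limit are assumptions made for illustration, not part of the original script.

```python
import subprocess
import sys

# Shared body of the test script; the variants differ only in import order.
BODY = """
model = models.Sequential([layers.Input(shape=(10,)), layers.Dense(1)])
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(np.random.rand(100, 10), np.random.rand(100, 1), epochs=1, verbose=0)
"""

VARIANTS = {
    "pandas first": (
        "import numpy as np\n"
        "import pandas as pd\n"
        "from tensorflow.keras import layers, models\n" + BODY
    ),
    "tensorflow first": (
        "import numpy as np\n"
        "from tensorflow.keras import layers, models\n"
        "import pandas as pd\n" + BODY
    ),
}


def run_with_timeout(code, timeout=120.0):
    """Run `code` in a fresh interpreter and report how it ended.

    Returns "completed", "failed" (non-zero exit code) or "timed out".
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timed out"
    return "completed" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    for label, code in VARIANTS.items():
        print(f"{label}: {run_with_timeout(code)}")
```

A "timed out" result for one variant and "completed" for the other would reproduce the behavior described above without leaving a hung process behind.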
Here is the pip list output for my environment:
Package Version
---------------------------- -----------
absl-py 2.1.0
astunparse 1.6.3
Bottleneck 1.3.7
cachetools 5.3.3
certifi 2024.2.2
charset-normalizer 3.3.2
db-dtypes 1.2.0
flatbuffers 24.3.25
gast 0.5.4
google-api-core 2.19.0
google-auth 2.29.0
google-cloud-bigquery 3.23.1
google-cloud-core 2.4.1
google-crc32c 1.5.0
google-pasta 0.2.0
google-resumable-media 2.7.0
googleapis-common-protos 1.63.0
grpcio 1.64.0
grpcio-status 1.62.2
h5py 3.11.0
idna 3.7
importlib_metadata 7.1.0
joblib 1.4.2
keras 3.3.3
libclang 18.1.1
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.5
mdurl 0.1.2
ml-dtypes 0.3.2
namex 0.0.8
numexpr 2.8.7
numpy 1.26.4
opt-einsum 3.3.0
optree 0.11.0
packaging 24.0
pandas 2.2.1
pip 24.0
proto-plus 1.23.0
protobuf 4.25.3
pyarrow 16.1.0
pyasn1 0.6.0
pyasn1_modules 0.4.0
Pygments 2.18.0
python-dateutil 2.9.0.post0
pytz 2024.1
requests 2.32.3
rich 13.7.1
rsa 4.9
scikit-learn 1.4.2
scipy 1.11.4
setuptools 69.5.1
six 1.16.0
tensorboard 2.16.2
tensorboard-data-server 0.7.2
tensorflow 2.16.1
tensorflow-io-gcs-filesystem 0.37.0
tensorflow-macos 2.16.1
tensorflow-metal 1.1.0
termcolor 2.4.0
threadpoolctl 3.5.0
tqdm 4.66.4
typing_extensions 4.12.0
tzdata 2024.1
urllib3 2.2.1
Werkzeug 3.0.3
wheel 0.43.0
wrapt 1.16.0
zipp 3.19.0
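The list above contains both tensorflow and tensorflow-macos at 2.16.1, next to tensorflow-metal 1.1.0. Whether that combination matters here is not established. When comparing environments, a short report of just the relevant packages can be easier to read than the full list. This is a minimal sketch using the standard library; the helper name `installed_versions` and the choice of packages are assumptions for illustration.

```python
from importlib import metadata

# Packages that plausibly take part in the import-order problem (an assumption).
PACKAGES = [
    "tensorflow",
    "tensorflow-macos",
    "tensorflow-metal",
    "keras",
    "pandas",
    "numpy",
    "pyarrow",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:20} {version or 'not installed'}")
```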
Suggestion from comments (@Ze'ev Ben-Tsvi)
import numpy as np
import os
import pandas as pd
from tensorflow.keras import layers, models

print("Creating simple model...")
try:
    print("Initializing Sequential model...")
    model = models.Sequential()
    print("Adding input layer...")
    model.add(layers.Input(shape=(10,)))
    print("Adding first Dense layer...")
    model.add(layers.Dense(64, activation='relu'))
    print("Adding output Dense layer...")
    model.add(layers.Dense(1, activation='linear'))
    print("Model created successfully.")
except Exception as e:
    print(f"Error creating model: {e}")

x_train = np.random.rand(100, 10)
y_train = np.random.rand(100, 1)

# Compile the model
try:
    print("Compiling model...")
    model.compile(optimizer='adam', loss='mean_squared_error')
    print("Model compiled successfully.")
except Exception as e:
    print(f"Error during compilation: {e}")

# Train the model
try:
    print("Training model...")
    model.fit(x_train, y_train, epochs=5, batch_size=32)
    print("Model training completed successfully.")
except Exception as e:
    print(f"Error during training: {e}")
The output of this script is:
Initializing Sequential model...
Adding input layer...
Adding first Dense layer...
Adding output Dense layer...
Model created successfully.
Compiling model...
Model compiled successfully.
Training model...
Epoch 1/5
Written like this, the script gets a little further: it no longer gets stuck at models.Sequential but at model.fit.
Swapping the order of the imports again (tensorflow, then pandas), I get:
Creating simple model...
Initializing Sequential model...
Adding input layer...
Adding first Dense layer...
Adding output Dense layer...
Model created successfully.
Compiling model...
Model compiled successfully.
Training model...
Epoch 1/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.4620
Epoch 2/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 636us/step - loss: 0.3263
Epoch 3/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.2322
Epoch 4/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 629us/step - loss: 0.1395
Epoch 5/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 690us/step - loss: 0.1251
Model training completed successfully.
The main issue here is that at no point do I get an exception, not even when wrapping each import individually in a try/except block. Either something swallows the errors or none are raised.
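A freeze that raises nothing cannot be caught with try/except, but the standard library's faulthandler module can print where each thread is stuck after a timeout. The sketch below is one possible way to use it; the function names and the 30-second default are assumptions. It only shows Python-level frames, so if the hang is inside native code, the output will stop at the Python call that entered it.

```python
import faulthandler
import sys


def arm_hang_watchdog(seconds=30.0, file=None):
    """Dump all thread tracebacks to `file` every `seconds` until disarmed.

    `file` must have a real file descriptor; it defaults to sys.stderr.
    """
    if file is None:
        file = sys.stderr
    faulthandler.dump_traceback_later(seconds, repeat=True, file=file, exit=False)


def disarm_hang_watchdog():
    """Cancel the pending traceback dump once the risky section has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `arm_hang_watchdog()` at the very top of the script, before the imports, and `disarm_hang_watchdog()` after `model.fit` returns would, if the freeze occurs, print the stack of the stuck call to the terminal instead of leaving the process silent.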
Importing pandas AFTER tensorflow seems to fix the issue:
from tensorflow.keras import layers, models
import pandas as pd
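In the frozen run, the last log line is the creation of the Metal PluggableDevice, which TensorFlow registers as a GPU. One further experiment, offered as an untested assumption rather than a known fix, is to hide the GPU and see whether the pandas-first order still freezes on CPU only. The helper name `force_cpu_only` is hypothetical; `tf.config.set_visible_devices` and `tf.config.get_visible_devices` are existing TensorFlow calls.

```python
def force_cpu_only():
    """Hide all GPU devices from TensorFlow.

    Returns the list of GPUs still visible (expected to be empty),
    or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Must run before any op or model touches the device; TensorFlow raises
    # RuntimeError if the devices have already been initialized.
    tf.config.set_visible_devices([], "GPU")
    return tf.config.get_visible_devices("GPU")
```

Calling `force_cpu_only()` directly after the imports in the pandas-first script would show whether the freeze depends on the Metal device. If the script then completes, that points toward the plugin; if it still hangs, the cause lies elsewhere.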
source: https://evoila.com/blog/debugging-tensorflow-pandas-import-issue-macos-sonoma-apple-m3-pro/