I want to use the AutoML library AutoGluon on a Paperspace IPU instance or a Kaggle TPU instance because of their specifications (large RAM, large disk space, and fast training). On the IPU instance, when I fit the AutoGluon predictor, the library detects the available IPU but does not use it. How can I make AutoGluon use the IPU? I have not been able to try the TPU yet because importing the AutoGluon library fails on that instance. As for GPUs, AutoGluon can use an available GPU from what I have tried, but I don't want to use one for performance reasons.
Predictor fit output with IPU instance example:
predictor.fit(
    train_data=train_data,
    hyperparameters={
        'model.hf_text.checkpoint_name': 'xlm-roberta-base'
    }
)
Output:
Global seed set to 123
/usr/local/lib/python3.8/dist-packages/autogluon/multimodal/utils/environment.py:96: UserWarning: Only CPU is detected in the instance. This may result in slow speed for MultiModalPredictor. Consider using an instance with GPU support.
warnings.warn(
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: True, using: 0 IPUs
HPU available: False, using: 0 HPUs
/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py:1777: UserWarning: IPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='ipu', devices=4)`.
rank_zero_warn(
| Name | Type | Params
-------------------------------------------------------------------
0 | model | HFAutoModelForTextPrediction | 278 M
1 | validation_metric | Accuracy | 0
2 | loss_func | CrossEntropyLoss | 0
-------------------------------------------------------------------
278 M Trainable params
0 Non-trainable params
278 M Total params
1,112.190 Total estimated model params size (MB)
Importing the AutoGluon library on the TPU instance:
import os
import numpy as np
import warnings
import pandas as pd
from IPython.display import display, Image
import json
# Auto Exploratory Data Analysis
from pandas_profiling import ProfileReport
# AutoML
from autogluon.core.utils.loaders import load_zip
from autogluon.multimodal import MultiModalPredictor
from autogluon.multimodal.data.infer_types import infer_column_types
from autogluon.tabular import TabularPredictor
from autogluon.features.generators import AutoMLPipelineFeatureGenerator
from autogluon.tabular import FeatureMetadata
pd.set_option('display.max_columns', None)
np.random.seed(123)
Output:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /opt/conda/lib/python3.7/site-packages/transformers/utils/import_utils.py:1063 in _get_module │
│ │
│ 1060 │ │
│ 1061 │ def _get_module(self, module_name: str): │
│ 1062 │ │ try: │
│ ❱ 1063 │ │ │ return importlib.import_module("." + module_name, self.__name__) │
│ 1064 │ │ except Exception as e: │
│ 1065 │ │ │ raise RuntimeError( │
│ 1066 │ │ │ │ f"Failed to import {self.__name__}.{module_name} because of the followin │
│ │
│ /opt/conda/lib/python3.7/importlib/__init__.py:127 in import_module │
│ │
│ 124 │ │ │ if character != '.': │
│ 125 │ │ │ │ break │
│ 126 │ │ │ level += 1 │
│ ❱ 127 │ return _bootstrap._gcd_import(name[level:], package, level) │
│ 128 │
│ 129 │
│ 130 _RELOADING = {} │
│ <frozen importlib._bootstrap>:1006 in _gcd_import │
│ <frozen importlib._bootstrap>:983 in _find_and_load │
│ <frozen importlib._bootstrap>:967 in _find_and_load_unlocked │
│ <frozen importlib._bootstrap>:677 in _load_unlocked │
│ <frozen importlib._bootstrap_external>:728 in exec_module │
│ <frozen importlib._bootstrap>:219 in _call_with_frames_removed │
│ │
│ /opt/conda/lib/python3.7/site-packages/transformers/modeling_tf_utils.py:39 in <module> │
│ │
│ 36 from tensorflow.python.keras.saving import hdf5_format │
│ 37 │
│ 38 from huggingface_hub import Repository, list_repo_files │
│ ❱ 39 from keras.saving.hdf5_format import save_attributes_to_hdf5_group │
│ 40 from transformers.utils.hub import convert_file_size_to_int, get_checkpoint_shard_files │
│ 41 │
│ 42 from . import DataCollatorWithPadding, DefaultDataCollator │
│ │
│ /opt/conda/lib/python3.7/site-packages/keras/__init__.py:21 in <module> │
│ │
│ 18 [keras.io](https://keras.io). │
│ 19 """ │
│ 20 from keras import distribute │
│ ❱ 21 from keras import models │
│ 22 from keras.engine.input_layer import Input │
│ 23 from keras.engine.sequential import Sequential │
│ 24 from keras.engine.training import Model │
│ │
│ /opt/conda/lib/python3.7/site-packages/keras/models/__init__.py:18 in <module> │
│ │
│ 15 """Keras models API.""" │
│ 16 │
│ 17 │
│ ❱ 18 from keras.engine.functional import Functional │
│ 19 from keras.engine.sequential import Sequential │
│ 20 from keras.engine.training import Model │
│ 21 │
│ │
│ /opt/conda/lib/python3.7/site-packages/keras/engine/functional.py:26 in <module> │
│ │
│ 23 │
│ 24 import tensorflow.compat.v2 as tf │
│ 25 │
│ ❱ 26 from keras import backend │
│ 27 from keras.dtensor import layout_map as layout_map_lib │
│ 28 from keras.engine import base_layer │
│ 29 from keras.engine import base_layer_utils │
│ │
│ /opt/conda/lib/python3.7/site-packages/keras/backend.py:32 in <module> │
│ │
│ 29 import numpy as np │
│ 30 import tensorflow.compat.v2 as tf │
│ 31 │
│ ❱ 32 from keras import backend_config │
│ 33 from keras.distribute import distribute_coordinator_utils as dc │
│ 34 from keras.engine import keras_tensor │
│ 35 from keras.utils import control_flow_util │
│ │
│ /opt/conda/lib/python3.7/site-packages/keras/backend_config.py:33 in <module> │
│ │
│ 30 │
│ 31 │
│ 32 @keras_export("keras.backend.epsilon") │
│ ❱ 33 @tf.__internal__.dispatch.add_dispatch_support │
│ 34 def epsilon(): │
│ 35 │ """Returns the value of the fuzz factor used in numeric expressions. │
│ 36 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: module 'tensorflow.compat.v2.__internal__' has no attribute 'dispatch'
The above exception was the direct cause of the following exception:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /tmp/ipykernel_248/1088848273.py:13 in <module> │
│ │
│ [Errno 2] No such file or directory: '/tmp/ipykernel_248/1088848273.py' │
│ │
│ /opt/conda/lib/python3.7/site-packages/autogluon/multimodal/__init__.py:6 in <module> │
│ │
│ 3 except ImportError: │
│ 4 │ pass │
│ 5 │
│ ❱ 6 from . import constants, data, models, optimization, predictor, utils │
│ 7 from .predictor import AutoMMPredictor, MultiModalPredictor │
│ 8 from .utils import download │
│ 9 │
│ │
│ /opt/conda/lib/python3.7/site-packages/autogluon/multimodal/optimization/__init__.py:1 in │
│ <module> │
│ │
│ ❱ 1 from . import lit_module, utils │
│ 2 │
│ │
│ /opt/conda/lib/python3.7/site-packages/autogluon/multimodal/optimization/lit_module.py:14 in │
│ <module> │
│ │
│ 11 │
│ 12 from ..constants import AUTOMM, LM_TARGET, LOGITS, T_FEW, TEMPLATE_LOGITS, WEIGHT │
│ 13 from ..data.mixup import MixupModule, multimodel_mixup │
│ ❱ 14 from .utils import apply_layerwise_lr_decay, apply_single_lr, apply_two_stages_lr, get_l │
│ 15 │
│ 16 logger = logging.getLogger(AUTOMM) │
│ 17 │
│ │
│ /opt/conda/lib/python3.7/site-packages/autogluon/multimodal/optimization/utils.py:57 in <module> │
│ │
│ 54 │ ROOT_MEAN_SQUARED_ERROR, │
│ 55 │ SPEARMANR, │
│ 56 ) │
│ ❱ 57 from ..utils import MeanAveragePrecision │
│ 58 from .losses import MultiNegativesSoftmaxLoss, SoftTargetCrossEntropy │
│ 59 from .lr_scheduler import ( │
│ 60 │ get_cosine_schedule_with_warmup, │
│ │
│ /opt/conda/lib/python3.7/site-packages/autogluon/multimodal/utils/__init__.py:39 in <module> │
│ │
│ 36 from .log import LogFilter, apply_log_filter, make_exp_dir │
│ 37 from .map import MeanAveragePrecision │
│ 38 from .matcher import compute_semantic_similarity, convert_data_for_ranking, create_siame │
│ ❱ 39 from .metric import compute_ranking_score, compute_score, get_minmax_mode, infer_metrics │
│ 40 from .misc import logits_to_prob, shopee_dataset, tensor_to_ndarray │
│ 41 from .mmcv import CollateMMCV, send_datacontainers_to_device, unpack_datacontainers │
│ 42 from .model import create_fusion_model, create_model, list_timm_models, modify_duplicate │
│ │
│ /opt/conda/lib/python3.7/site-packages/autogluon/multimodal/utils/metric.py:7 in <module> │
│ │
│ 4 import warnings │
│ 5 from typing import Dict, List, Optional, Tuple, Union │
│ 6 │
│ ❱ 7 import evaluate │
│ 8 import numpy as np │
│ 9 from sklearn.metrics import f1_score │
│ 10 │
│ │
│ /opt/conda/lib/python3.7/site-packages/evaluate/__init__.py:29 in <module> │
│ │
│ 26 │
│ 27 del version │
│ 28 │
│ ❱ 29 from .evaluator import ( │
│ 30 │ Evaluator, │
│ 31 │ ImageClassificationEvaluator, │
│ 32 │ QuestionAnsweringEvaluator, │
│ │
│ /opt/conda/lib/python3.7/site-packages/evaluate/evaluator/__init__.py:29 in <module> │
│ │
│ 26 │
│ 27 from .base import Evaluator │
│ 28 from .image_classification import ImageClassificationEvaluator │
│ ❱ 29 from .question_answering import QuestionAnsweringEvaluator │
│ 30 from .text_classification import TextClassificationEvaluator │
│ 31 from .token_classification import TokenClassificationEvaluator │
│ 32 │
│ │
│ /opt/conda/lib/python3.7/site-packages/evaluate/evaluator/question_answering.py:22 in <module> │
│ │
│ 19 │
│ 20 │
│ 21 try: │
│ ❱ 22 │ from transformers import Pipeline, PreTrainedModel, PreTrainedTokenizer, TFPreTraine │
│ 23 │ │
│ 24 │ TRANSFORMERS_AVAILABLE = True │
│ 25 except ImportError: │
│ <frozen importlib._bootstrap>:1032 in _handle_fromlist │
│ │
│ /opt/conda/lib/python3.7/site-packages/transformers/utils/import_utils.py:1053 in __getattr__ │
│ │
│ 1050 │ │ if name in self._modules: │
│ 1051 │ │ │ value = self._get_module(name) │
│ 1052 │ │ elif name in self._class_to_module.keys(): │
│ ❱ 1053 │ │ │ module = self._get_module(self._class_to_module[name]) │
│ 1054 │ │ │ value = getattr(module, name) │
│ 1055 │ │ else: │
│ 1056 │ │ │ raise AttributeError(f"module {self.__name__} has no attribute {name}") │
│ │
│ /opt/conda/lib/python3.7/site-packages/transformers/utils/import_utils.py:1068 in _get_module │
│ │
│ 1065 │ │ │ raise RuntimeError( │
│ 1066 │ │ │ │ f"Failed to import {self.__name__}.{module_name} because of the followin │
│ 1067 │ │ │ │ f" traceback):\n{e}" │
│ ❱ 1068 │ │ │ ) from e │
│ 1069 │ │
│ 1070 │ def __reduce__(self): │
│ 1071 │ │ return (self.__class__, (self._name, self.__file__, self._import_structure)) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Failed to import transformers.modeling_tf_utils because of the following error (look up to see its
traceback):
module 'tensorflow.compat.v2.__internal__' has no attribute 'dispatch'
As far as I can tell, the AutoGluon library does not currently support IPUs. The Poplar SDK supports PyTorch and PyTorch Lightning, which AutoGluon is built on, so the library could in principle be supported. I'd be really interested to hear more about what you want to use AutoGluon for!
In the meantime, there are many IPU resources available, including:
I'm sorry that what you want isn't supported, but I hope that the above is at least helpful.
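For the TPU error, it looks like the environment in your instance comes with TensorFlow pre-installed, although AutoGluon itself does not appear to use TensorFlow. The traceback shows transformers failing while importing its TensorFlow utilities (transformers.modeling_tf_utils), because the installed keras expects a tf.__internal__.dispatch attribute that the installed TensorFlow does not provide. To train on TPUs, you might need to look into using PyTorch with PyTorch/XLA: https://github.com/pytorch/xla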
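One way to check whether the TPU import failure comes from mismatched packages is to print the installed tensorflow and keras versions and compare them. This is only a rough diagnostic sketch, not a fix. The helper names (installed_version, same_minor_release, report) are made up for illustration, and the rule that tensorflow and keras should share the same major.minor version is an assumption that holds for many TensorFlow 2.x releases but not necessarily all of them. It uses importlib.metadata, which needs Python 3.8 or newer, while the Kaggle image in the traceback runs Python 3.7, so it may need the importlib_metadata backport there. TPU images may also install TensorFlow under a different distribution name, in which case the lookup returns None.

```python
from importlib import metadata  # standard library from Python 3.8 onwards
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_minor_release(version_a: str, version_b: str) -> bool:
    """Compare only the major.minor part of two version strings."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


def report() -> None:
    """Print the tensorflow and keras versions and flag an apparent mismatch."""
    tf_version = installed_version("tensorflow")
    keras_version = installed_version("keras")
    print("tensorflow:", tf_version)
    print("keras:", keras_version)
    # Assumption: the two packages are expected to share major.minor.
    if tf_version and keras_version and not same_minor_release(tf_version, keras_version):
        print("tensorflow and keras appear to come from different releases; "
              "this may explain the import error")


if __name__ == "__main__":
    report()
```

If the two versions differ, aligning them (or removing the unused TensorFlow/keras packages, if nothing else in the notebook needs them) is a reasonable thing to try before importing AutoGluon again.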
With thanks,
Callum