machine-learning keras deep-learning automl auto-keras

Explanation Needed for Autokeras's AutoModel and GraphAutoModel


I understand what the AutoKeras ImageClassifier does (https://autokeras.com/image_classifier/):

clf = ImageClassifier(verbose=True, augment=False)
clf.fit(x_train, y_train, time_limit=12 * 60 * 60)
clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
y = clf.evaluate(x_test, y_test)

But I am unable to understand what the AutoModel class (https://autokeras.com/auto_model/) does, or how it differs from ImageClassifier:

autokeras.auto_model.AutoModel(
    inputs,
    outputs,
    name="auto_model",
    max_trials=100,
    directory=None,
    objective="val_loss",
    tuner="greedy",
    seed=None)

The documentation for the inputs and outputs arguments says:

  • inputs: A list of or a HyperNode instance. The input node(s) of the AutoModel.
  • outputs: A list of or a HyperHead instance. The output head(s) of the AutoModel.

What is a HyperNode instance?

Similarly, what is the GraphAutoModel class (https://autokeras.com/graph_auto_model/)?

autokeras.auto_model.GraphAutoModel(
    inputs,
    outputs,
    name="graph_auto_model",
    max_trials=100,
    directory=None,
    objective="val_loss",
    tuner="greedy",
    seed=None)

The documentation reads:

A HyperModel defined by a graph of HyperBlocks. GraphAutoModel is a subclass of HyperModel. Besides the HyperModel properties, it also has a tuner to tune the HyperModel. The user can use it in a similar way to a Keras model since it also has fit() and predict() methods.

What are HyperBlocks? If ImageClassifier already does hyperparameter tuning automatically, what is the use of GraphAutoModel?

Links to any documents or resources for a better understanding of AutoModel and GraphAutoModel would be appreciated.


Solution

  • Having worked with AutoKeras recently, I can share what little I know.

    1. Task API: For a classical task such as image classification or regression, or text classification or regression, you can use the simplest APIs that AutoKeras provides, called the Task API: ImageClassifier, ImageRegressor, TextClassifier, TextRegressor, ... In this case you have one input (image, text, tabular data, ...) and one output (classification or regression).

    2. AutoModel: When your task requires a multi-input or multi-output architecture, however, you cannot use the Task API directly. This is where AutoModel comes into play, with the I/O API. You can check the example in the documentation, which has two inputs (image and structured data) and two outputs (classification and regression).

    3. GraphAutoModel: GraphAutoModel works like the Keras functional API. It assembles different blocks (convolutions, LSTM, GRU, ...) into a model, then searches for the best hyperparameters given the architecture you provided. Suppose, for instance, that I want to do binary classification with time series as input data. First, let's generate a toy dataset:

    import numpy as np
    import autokeras as ak
    
    x = np.random.randn(100, 7, 3)
    y = np.random.choice([0, 1], size=100, p=[0.5, 0.5])
    

    Here x is a time series dataset of 100 samples; each sample is a sequence of length 7 with a feature dimension of 3. The corresponding target variable y is binary (0 or 1). With GraphAutoModel, I can specify the architecture I want using what are called HyperBlocks. There are many blocks: Conv, RNN, Dense, ... (the documentation has the full list). In my case I want to use RNN blocks to create the model, because I have time series data:

    # Chain the blocks functional-API style: each block is called on the previous node.
    input_layer = ak.Input()
    rnn_layer = ak.RNNBlock(layer_type="lstm")(input_layer)
    dense_layer = ak.DenseBlock()(rnn_layer)
    output_layer = ak.ClassificationHead(num_classes=2)(dense_layer)
    
    # Try at most 2 candidate models, each trained for up to 2 epochs.
    automodel = ak.GraphAutoModel(input_layer, output_layer, max_trials=2, seed=123)
    automodel.fit(x, y, validation_split=0.2, epochs=2, batch_size=32)
    

    (If you are not familiar with the above style of defining a model, check the Keras functional API documentation.)

    So in this example I have more flexibility in creating the skeleton of the architecture I want: an LSTM block followed by a Dense block, followed by a classification head. However, I did not specify any hyperparameters (number of LSTM layers, number of dense layers, size of LSTM layers, size of dense layers, activation functions, dropout, batch normalization, ...). AutoKeras tunes these hyperparameters automatically, based on the architecture (skeleton) I provided.
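To make the "fixed skeleton, tuned hyperparameters" idea concrete, here is a minimal sketch in plain Python. It is not how AutoKeras is implemented. The search space, the `toy_val_loss` formula and the helper names are all hypothetical, invented for illustration, and it uses plain random search rather than the "greedy" tuner shown in the signatures above. A real tuner would build and train a Keras model for each trial and compare the actual validation loss.

```python
import random

# Hypothetical search space for the LSTM -> Dense -> head skeleton.
# These names and values are assumptions, not AutoKeras's real search space.
SEARCH_SPACE = {
    "lstm_layers": [1, 2, 3],
    "lstm_units": [16, 32, 64],
    "dense_layers": [1, 2],
    "dense_units": [16, 32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
}


def sample_config(rng):
    """Pick one value per hyperparameter; the skeleton itself never changes."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_val_loss(config):
    """Stand-in for 'build, train and validate'. Invented formula, not a real loss."""
    size = (config["lstm_layers"] * config["lstm_units"]
            + config["dense_layers"] * config["dense_units"])
    return abs(size - 100) / 100 + config["dropout"]


def search(max_trials, seed):
    """Random search: evaluate max_trials sampled configs and keep the best one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_trials):
        config = sample_config(rng)
        trials.append((toy_val_loss(config), config))
    # Lower is better, mirroring objective="val_loss".
    return min(trials, key=lambda trial: trial[0])


best_loss, best_config = search(max_trials=5, seed=123)
print(best_loss, best_config)
```

As in the GraphAutoModel call, `max_trials` bounds how many candidates are tried and `seed` makes the search repeatable. The user fixes only the order of the blocks, and the search fills in everything else.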