What is the best practice for mapping a TF2 Keras model's SignatureDef to the TF Serving classify/predict/regress API in a TFX pipeline?


We are building an automated TFX pipeline on Airflow and have based our model on the Keras tutorial. We save the Keras model as follows:

model.save(fn_args.serving_model_dir, save_format='tf',
           signatures=signatures)

That signatures dict is:

signatures = {
    'serving_default':
        _get_serve_tf_examples_fn(model, tf_transform_output).get_concrete_function(
            tf.TensorSpec(
                shape=[None],
                dtype=tf.string,
                name='examples'
            )
        ),
    'prediction':
        get_request_url_fn(model, tf_transform_output).get_concrete_function(
            tf.TensorSpec(
                shape=[None],
                dtype=tf.string,
                name='prediction_examples'
            )
        ),
}
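As a sanity check (not part of the tutorial), the export can be reloaded to confirm that both signatures were attached; this assumes fn_args.serving_model_dir points at the export directory:

import tensorflow as tf

# Hypothetical check: reload the SavedModel and list its serving signatures.
loaded = tf.saved_model.load(fn_args.serving_model_dir)
print(list(loaded.signatures.keys()))  # expect ['serving_default', 'prediction']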

_get_serve_tf_examples_fn exists to provide the TFX Evaluator component with additional tensors, ones not used by the model itself, for model-evaluation purposes. It is as in the Keras TFX tutorial above:

def _get_serve_tf_examples_fn(model, tf_transform_output):
    model.tft_layer = tf_transform_output.transform_features_layer()

    @tf.function
    def serve_tf_examples_fn(serialized_tf_examples):
        feature_spec = tf_transform_output.raw_feature_spec()
        feature_spec.pop(_LABEL_KEY)

        parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
        transformed_features = model.tft_layer(parsed_features)
        transformed_features.pop(_transformed_name(_LABEL_KEY))
        return model(transformed_features)

    return serve_tf_examples_fn

The above model 'interface' accepts serialized tf.Example protos, as required by the TFX Evaluator component (TFMA).

For TF Serving, however, we want to send one raw string, just a URL, to the TF Serving predict REST API and get the predicted score back. Currently get_request_url_fn is:

def get_request_url_fn(model, tf_transform_output):
    model.tft_layer = tf_transform_output.transform_features_layer()

    @tf.function
    def serve_request_url_fn(request_url):
        feature_spec = tf_transform_output.raw_feature_spec()
        # Model requires just one of the features made available to other TFX components
        # Throw away the rest and leave just 'request_url'
        feature_spec = {'request_url': feature_spec['request_url']}

        parsed_features = tf.io.parse_example(request_url, feature_spec)
        transformed_features = model.tft_layer(parsed_features)
        transformed_features.pop(_transformed_name(_LABEL_KEY))
        return model(transformed_features)

    return serve_request_url_fn

This approach still requires the input to be a serialized tf.Example, though, which imposes considerable overhead on the client: it has to import TensorFlow just to build the request. That code does work, though:

import base64
import json

import requests
import tensorflow as tf

# 'server' is the TF Serving host.
url = f'http://{server}:8501/v1/models/wrcv3:predict'
headers = {"content-type": "application/json"}

url_request = b'index'
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "request_url":
                tf.train.Feature(bytes_list=tf.train.BytesList(value=[url_request]))
        }
    )
)
print(example)

data = {
    "signature_name": "prediction",
    "instances": [
        {
            "prediction_examples": {"b64": base64.b64encode(example.SerializeToString()).decode('utf-8')}
        }
    ]
}
data = json.dumps(data)
print(data)
json_response = requests.post(url, data=data, headers=headers)
print(json_response.content)

Returning as a result:

features {
  feature {
    key: "request_url"
    value {
      bytes_list {
        value: "index"
      }
    }
  }
}

{"signature_name": "prediction", "instances": [{"prediction_examples": {"b64": "ChoKGAoLcmVxdWVzdF91cmwSCQoHCgVpbmRleA=="}}]}
b'{\n    "predictions": [[0.897708654]\n    ]\n}'

When we submit a base64-encoded raw string in lieu of the tf.Example, it fails, as expected:

url = f'http://{server}:8501/v1/models/wrcv3:predict'
headers = {"content-type": "application/json"}
url_request = b'index.html'

data = {
    "signature_name": "prediction",
    "instances": [
        {
            "prediction_examples": {"b64": base64.b64encode(url_request).decode('utf-8')}
        }
    ]
}
data = json.dumps(data)
print(data)
json_response = requests.post(url, data=data, headers=headers)
print(json_response.content)

returning:

{"signature_name": "prediction", "instances": [{"prediction_examples": {"b64": "aW5kZXguaHRtbA=="}}]}
b'{ "error": "Could not parse example input, value: \\\'index.html\\\'\\n\\t [[{{node ParseExample/ParseExampleV2}}]]" }'
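For reference, the failure is reproducible locally. A minimal sketch, assuming request_url comes out of the raw feature spec as a VarLenFeature (a hypothetical stand-in for tf_transform_output.raw_feature_spec()):

import tensorflow as tf

# parse_example expects serialized tf.Example protos, so a raw string
# fails the same way the server-side ParseExampleV2 node does above.
spec = {'request_url': tf.io.VarLenFeature(tf.string)}  # assumed spec
try:
    tf.io.parse_example(tf.constant(['index.html']), spec)
except tf.errors.InvalidArgumentError as e:
    print(e.message)  # Could not parse example input, value: 'index.html'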

The question is: what should the SignatureDef look like so that it accepts raw strings, if not like get_request_url_fn? Surely the client should not have to load TensorFlow just to make a request?

The TFX website itself documents the three protobufs for classify/predict/regress here, but it is not intuitive (to me) how to use these three protobufs to do the mapping we need.
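For completeness, here is roughly what the Predict protobuf from those docs looks like when driven over gRPC; a sketch assuming the tensorflow-serving-api package, the default gRPC port 8500, and the model/signature names above. Note that it still pulls TensorFlow into the client for tf.make_tensor_proto, which is exactly the overhead we want to avoid:

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel(f'{server}:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# The current 'prediction' signature still parses tf.Example, so we feed it
# the serialized example built earlier.
request = predict_pb2.PredictRequest()
request.model_spec.name = 'wrcv3'
request.model_spec.signature_name = 'prediction'
request.inputs['prediction_examples'].CopyFrom(
    tf.make_tensor_proto([example.SerializeToString()], dtype=tf.string))

response = stub.Predict(request, timeout=10.0)
print(response.outputs)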

Deep gratitude in advance.


Solution

  • According to your code, the input to serve_request_url_fn is a dense tensor, but the input your transform graph expects is probably a sparse tensor.

    tf.io.parse_example knows how to deserialise a tf.Example into a sparse tensor, but if you want to send a plain tensor without serialising it, you have to build that sparse tensor manually and drop the tf.io.parse_example call.

    For example:

    def get_request_url_fn(model, tf_transform_output):
        model.tft_layer = tf_transform_output.transform_features_layer()

        @tf.function(input_signature=[
            tf.TensorSpec(shape=[None], dtype=tf.string, name='prediction_examples')
        ])
        def serve_request_url_fn(request_url):
            # Build the SparseTensor the transform graph expects directly,
            # instead of letting tf.io.parse_example produce it.
            request_url_sp_tensor = tf.sparse.SparseTensor(
                indices=[[0, 0]],
                values=request_url,
                dense_shape=(1, 1)
            )
            parsed_features = {
                'request_url': request_url_sp_tensor,
            }
            transformed_features = model.tft_layer(parsed_features)
            transformed_features.pop(_transformed_name(_LABEL_KEY))
            return model(transformed_features)

        return serve_request_url_fn
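    With that signature exported under 'prediction', the client call collapses to plain JSON: no TensorFlow import and, for ordinary text, no base64 either. A minimal sketch against the same 'wrcv3' model (since the signature has a single named input, instances can hold the raw strings directly):

    import json
    import requests

    url = f'http://{server}:8501/v1/models/wrcv3:predict'
    data = json.dumps({
        "signature_name": "prediction",
        "instances": ["index.html"],  # raw string, no tf.Example wrapping
    })
    json_response = requests.post(url, data=data,
                                  headers={"content-type": "application/json"})
    print(json_response.json())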