TFF: Custom input spec with custom data set - TypeError: object of type 'TensorSpec' has no len()

1: problem: I need to use a custom data set in a TFF simulation. I have built on the tff/python/research/compression example "run_experiment.py". The error:

  File "B:\tools and software\Anaconda\envs\bookProjects\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-47998fd56829>", line 1, in <module>
    runfile('B:/projects/openProjects/githubprojects/BotnetTrafficAnalysisFederaedLearning/anomaly-detection/train_v04.py', args=['--experiment_name=temp', '--client_batch_size=20', '--client_optimizer=sgd', '--client_learning_rate=0.2', '--server_optimizer=sgd', '--server_learning_rate=1.0', '--total_rounds=200', '--rounds_per_eval=1', '--rounds_per_checkpoint=50', '--rounds_per_profile=0', '--root_output_dir=B:/projects/openProjects/githubprojects/BotnetTrafficAnalysisFederaedLearning/anomaly-detection/logs/fed_out/'], wdir='B:/projects/openProjects/githubprojects/BotnetTrafficAnalysisFederaedLearning/anomaly-detection')
  File "B:\tools and software\PyCharm 2020.1\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "B:\tools and software\PyCharm 2020.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "B:/projects/openProjects/githubprojects/BotnetTrafficAnalysisFederaedLearning/anomaly-detection/train_v04.py", line 292, in <module>
    app.run(main)
  File "B:\tools and software\Anaconda\envs\bookProjects\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "B:\tools and software\Anaconda\envs\bookProjects\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "B:/projects/openProjects/githubprojects/BotnetTrafficAnalysisFederaedLearning/anomaly-detection/train_v04.py", line 285, in main
    train_main()
  File "B:/projects/openProjects/githubprojects/BotnetTrafficAnalysisFederaedLearning/anomaly-detection/train_v04.py", line 244, in train_main
    input_spec=input_spec),
  File "B:/projects/openProjects/githubprojects/BotnetTrafficAnalysisFederaedLearning/anomaly-detection/train_v04.py", line 193, in model_builder
    metrics=[tf.keras.metrics.Accuracy()]
  File "B:\tools and software\Anaconda\envs\bookProjects\lib\site-packages\tensorflow_federated\python\learning\keras_utils.py", line 125, in from_keras_model
    if len(input_spec) != 2:
TypeError: object of type 'TensorSpec' has no len()

highlighting: TypeError: object of type 'TensorSpec' has no len()

2: have tried: I have looked at the answer to "TensorFlow Federated: How can I write an Input Spec for a model with more than one input", which describes what is needed to produce a custom input spec. I might be misunderstanding what an input spec is.

If I don't need to do this, and there is a better way, please tell me.

3: source:

    df = get_train_data(sysarg)
    x_train, x_opt, x_test = np.split(df.sample(frac=1,
                                                random_state=17),
                                      [int(1 / 3 * len(df)), int(2 / 3 * len(df))])

    x_train, x_opt, x_test = create_scalar(x_opt, x_test, x_train)
    input_spec = tf.nest.map_structure(tf.TensorSpec.from_tensor, tf.convert_to_tensor(x_train))

Solution

TFF's models declare a slightly different input specification than you may be expecting: they generally expect both the x and the y values (i.e., data and labels). It is unfortunate that you're hitting that TypeError, as the ValueError TFF would otherwise raise is probably more helpful in this case. Inlining the operative parts of its message here:

The top-level structure in `input_spec` must contain exactly two elements,
as it must specify type information for both inputs to and predictions from the model.
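
To see why the TypeError shows up instead of that message, here is a minimal sketch in plain Python. `SpecWithoutLen` and `check_input_spec` are hypothetical stand-ins, not TFF code; the only part taken from the traceback is the `len(input_spec) != 2` comparison. Calling `len()` on an object that defines no `__len__` fails before the comparison can run, so the more descriptive error is never reached.

```python
class SpecWithoutLen:
    """Hypothetical stand-in for a single spec object that defines no __len__."""


def check_input_spec(input_spec):
    # Same comparison as the line shown in the traceback. len() raises
    # TypeError on an object without __len__, before the check itself runs.
    if len(input_spec) != 2:
        raise ValueError(
            "The top-level structure in `input_spec` must contain "
            "exactly two elements")
    return True


def outcome(input_spec):
    # Report which exception (if any) the check produces.
    try:
        check_input_spec(input_spec)
        return "ok"
    except (TypeError, ValueError) as err:
        return type(err).__name__


single = SpecWithoutLen()
print(outcome(single))            # TypeError
print(outcome([single]))          # ValueError
print(outcome([single, single]))  # ok
```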

The TL;DR for your particular example: if you have access to the labels as well (y_train below), simply change your input_spec definition to:

    input_spec = tf.nest.map_structure(
        tf.TensorSpec.from_tensor,
        [tf.convert_to_tensor(x_train), tf.convert_to_tensor(y_train)])
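
One thing to watch for: `tf.TensorSpec.from_tensor` on the full array records the total number of rows as the leading dimension, which may not match the batches a client dataset actually yields. A possible alternative, sketched below under assumptions, is to build a batched `tf.data.Dataset` of (features, labels) pairs and read the spec from its `element_spec` property. The array shapes, dtypes, and random data here are made up for illustration; the batch size of 20 comes from the `--client_batch_size=20` flag in the question.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 100 rows, 4 features, one integer label per row.
x_train = np.random.rand(100, 4).astype(np.float32)
y_train = np.random.randint(0, 2, size=(100, 1)).astype(np.int32)

# Pair each feature row with its label, then batch the way client data
# would be batched.
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(20)

# element_spec is a 2-tuple of TensorSpec objects (features, labels),
# each with an unspecified (None) batch dimension.
input_spec = dataset.element_spec
print(len(input_spec))
print(input_spec)
```

Because the spec is read from the dataset itself, it stays consistent with whatever preprocessing and batching are applied before the data reaches the model.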