machine-learning dataset label multilabel-classification

Label Studio: How to use data fields in labelling interface

I imported thousands of images into Label Studio. Each image is described by a JSON task like this:

[
  {
    "data": {
      "image": "cometa_32742CRO000005502746_1500824468_2.png"
    },
    "predictions": [
      {
        "result": [
          {
            "value": {
              "text": ["OLRIONI MAURO"]
            },
            "id": "fe83f7ed-2325-41a1-bc1c-2d46eeec899f",
            "from_name": "question",
            "to_name": "image",
            "type": "textarea"
          }
        ]
      }
    ]
  }
]

I need to read the predictions->result->value->text value and use it in the labelling interface, like this:

<TextArea name="question" value=""> predictions->result->value->text </TextArea>

How can I do that?

Solution

I opened an issue in the repo. Thanks to smorface for the answer:

I suggest using variables in the data rather than predictions to get the behavior you describe.

For example:

[
  {
    "data": {
      "image": "/data/upload/47/fa12d6c8-dialogue-analysis.png",
      "text": "mario rossi"
    }
  }
]

Then use a labeling config like this:

<View>
  <Choices name="handwritten" toName="image" choice="single-radio" showInLine="true">
    <Choice value="Maiuscolo" selected="true" hotkey="q"/>
    <Choice value="Minuscolo" hotkey="w"/>
  </Choices>
  <TextArea name="question" toName="image" value="$text" editable="true"/>
  <Image name="image" value="$image"/>
</View>
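
If the tasks already exist in the predictions format shown in the question, the text can be copied into data before importing. The following is only a minimal sketch, not part of the original answer: the function names are hypothetical, and it assumes each task has at most one relevant textarea result whose from_name is "question".

```python
import json


def first_textarea_text(task, from_name):
    """Return the first textarea prediction text for `from_name`, or None."""
    for prediction in task.get("predictions", []):
        for result in prediction.get("result", []):
            if result.get("type") == "textarea" and result.get("from_name") == from_name:
                texts = result.get("value", {}).get("text", [])
                if texts:
                    return texts[0]
    return None


def move_prediction_text_to_data(tasks, from_name="question", key="text"):
    """Build new tasks whose data carries the predicted text under `key`.

    Predictions are dropped from the output; the input list is not modified.
    """
    converted = []
    for task in tasks:
        data = dict(task.get("data", {}))
        text = first_textarea_text(task, from_name)
        if text is not None:
            data[key] = text
        converted.append({"data": data})
    return converted


if __name__ == "__main__":
    sample = [
        {
            "data": {"image": "cometa_32742CRO000005502746_1500824468_2.png"},
            "predictions": [
                {
                    "result": [
                        {
                            "value": {"text": ["OLRIONI MAURO"]},
                            "from_name": "question",
                            "to_name": "image",
                            "type": "textarea",
                        }
                    ]
                }
            ],
        }
    ]
    print(json.dumps(move_prediction_text_to_data(sample), indent=2))
```

The resulting list can be saved as a JSON file and imported in place of the original tasks, so that $text resolves in the labeling config.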

NOTE: If you try to update the labeling config while some tasks contain only the image field and no text field, you won't be able to save your changes. Let me know if this is an issue for you and I can share my workaround.
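
One possible way to avoid that situation (not necessarily the workaround mentioned above) is to make sure every task has a text key before import. The sketch below uses a hypothetical helper name and assumes that an empty string is enough for Label Studio to treat the variable as present, which has not been verified here.

```python
def fill_missing_keys(tasks, required=("text",), default=""):
    """Return copies of the tasks where every required data key exists."""
    filled = []
    for task in tasks:
        data = dict(task.get("data", {}))
        for key in required:
            # Keep existing values; only add the key when it is absent.
            data.setdefault(key, default)
        filled.append({**task, "data": data})
    return filled


if __name__ == "__main__":
    tasks = [
        {"data": {"image": "a.png"}},
        {"data": {"image": "b.png", "text": "mario rossi"}},
    ]
    for task in fill_missing_keys(tasks):
        print(task["data"])
```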

More about variables here: https://labelstud.io/tags/index.html#Variables