Simple Training

This example shows the general procedure for creating a dataset, training an AutoTransformer on it, and then making predictions.
We will train an AutoTransformer for single-label classification, i.e. it will learn to assign exactly one label from a fixed set of known labels to each sample.
[ ]:
from autotransformers import AutoTransformer, DatasetLoader

Below, we have prepared a minimal dataset example. AutoTransformers’ datasets are dicts, with the following keys:

  • meta: Meta-information about the dataset (name, origin, version, …).

  • config: Configuration of the data. This defines what input data to expect, what tasks the AutoTransformer has, and other information about the problem you are trying to solve.

  • train: A list of samples used for training.

  • eval (optional): A list of samples used for evaluating training progress, e.g. for early stopping. If this is omitted, a part of the training samples is automatically split off.

  • test (optional): A list of samples used for one test run at the end, using data never used for training.
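
If you would rather control the eval split yourself than rely on the automatic one, you can carve it out of your samples before building the dataset dict. The following is a minimal sketch using only the standard library. The helper name split_train_eval, the 20% fraction and the fixed seed are illustrative assumptions, not part of AutoTransformers, and this is not necessarily how the library performs its own split.

```python
import random


def split_train_eval(samples, eval_fraction=0.2, seed=0):
    """Shuffle samples reproducibly and return (train, eval) lists.

    Hypothetical helper for illustration; not part of AutoTransformers.
    """
    if not 0.0 < eval_fraction < 1.0:
        raise ValueError("eval_fraction must be strictly between 0 and 1")
    shuffled = list(samples)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one sample for evaluation.
    n_eval = max(1, round(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


# Placeholder samples; in practice these are your real dataset samples.
samples = [f"sample {i}" for i in range(10)]
train_samples, eval_samples = split_train_eval(samples, eval_fraction=0.2)
print(len(train_samples), len(eval_samples))
```

The two returned lists can then be stored under the train and eval keys of the dataset.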

Important: Training datasets usually require hundreds or thousands of samples to achieve good performance on real tasks. The example below, with only four training samples, merely illustrates the format. An AutoTransformer trained on it is unlikely to make accurate predictions.

[ ]:
# The text snippets in this dataset are from "googleplay", a public dataset of app reviews on Google's Play Store.
dataset = {
    "meta": {
        "name": "example_singlelabel",
        "version": "1.0.0",
        "created_with": "wizard"
    },
    "config": [
        {
            "domain": "text",
            "type": "IText"
        },
        {
            "task_id": "task1",
            "classes": ["positive", "neutral", "negative"],
            "type": "TSingleClassification"
        }
    ],
    "train": [
        [
            {"value": "None of the notifications work. other people in the forums teport similar problems bht no fix. the app is nice but it isnt nearly as functional without notifications"},
            {"value": "negative"},
        ],
        [
            {"value": "It's great"},
            {"value": "positive"},
        ],
        [
            {"value": "Not allowing me to delete my account"},
            {"value": "negative"},
        ],
        [
            {"value": "So impressed that I bought premium on very first day"},
            {"value": "positive"},
        ],
    ],
    "test": [
        [
            {"value": "Can't set more than 7 tasks without paying an absurdly expensive weekly subscription"},
            {"value": "negative"},
        ]
    ],
}

Each sample is a pair of an input value and a target value (the label). To automatically create a skeleton dataset file for your task, you can use the at wizard command-line tool.
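
When your raw data is a list of (text, label) tuples, the sample pairs can be generated rather than typed by hand. The sketch below assumes the sample layout shown above. The helper to_samples and the label check are illustrative additions, not library functions.

```python
from collections import Counter

CLASSES = ["positive", "neutral", "negative"]


def to_samples(pairs, classes):
    """Convert (text, label) tuples into [input, target] sample pairs.

    Hypothetical helper for illustration; not part of AutoTransformers.
    """
    samples = []
    for text, label in pairs:
        # Catch labels that are missing from the task's class list early.
        if label not in classes:
            raise ValueError(f"unknown label {label!r}; expected one of {classes}")
        samples.append([{"value": text}, {"value": label}])
    return samples


raw = [
    ("It's great", "positive"),
    ("Not allowing me to delete my account", "negative"),
]
train = to_samples(raw, CLASSES)

# Count samples per label to spot missing or underrepresented classes.
label_counts = Counter(sample[1]["value"] for sample in train)
print(train[0])
print(label_counts)
```

Checking labels against the class list up front surfaces typos such as "postive" before any training time is spent.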

Now we create a DatasetLoader with our dataset, which manages data handling while training an AutoTransformer.
You can instantiate a DatasetLoader from a path to a JSON file containing a dataset, or directly from a Python dictionary (as we do here).
[ ]:
dl = DatasetLoader(dataset)

# Or create a DatasetLoader from a file
# dl = DatasetLoader("path/to/my-dataset.json")
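
For the file-based route, a dataset dict can be written to disk with Python's standard json module. This is a sketch under the assumption that the JSON file simply mirrors the dict structure shown above. The shortened example dict and the temporary directory are placeholders.

```python
import json
import tempfile
from pathlib import Path

# Shortened stand-in for the dataset dict defined earlier.
example = {
    "meta": {"name": "example_singlelabel", "version": "1.0.0"},
    "config": [
        {"domain": "text", "type": "IText"},
        {
            "task_id": "task1",
            "classes": ["positive", "neutral", "negative"],
            "type": "TSingleClassification",
        },
    ],
    "train": [[{"value": "It's great"}, {"value": "positive"}]],
}

# Write to a temporary directory; replace with your own path in practice.
path = Path(tempfile.mkdtemp()) / "my-dataset.json"
with path.open("w", encoding="utf-8") as f:
    json.dump(example, f, ensure_ascii=False, indent=2)

# Read the file back to confirm that it round-trips unchanged.
with path.open(encoding="utf-8") as f:
    reloaded = json.load(f)
print(reloaded == example)
```

The resulting path could then be passed to DatasetLoader as in the commented-out line above.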

Now we can create an AutoTransformer. The config can be used to change various settings, such as training time, learning rate or experiment tracking. Here, we only set a very short training time.

[ ]:
# In this example, we only train for one epoch to finish fast.
# In reality, you want to set this to a higher value for better results.
config = [
    ("engine/stop_condition/type", "MaxEpochs"),
    ("engine/stop_condition/value", 1),
]
at = AutoTransformer(config)

We initialize the AutoTransformer with our dataset loader (so it knows what data to expect), the base model (e.g. BERT or RoBERTa), and optionally a path to save checkpoints to. Then we are ready to train: we simply call at.train() with the dataset loader as an argument.

[ ]:
at.init(dataset_loader=dl, model_name_or_path="Roberta", path=".models/example01")
at.train(dl)

Try out the result!

[ ]:
res = at(["This app is amazing!", "While I do like the ease of use, I think it's too expensive."])

# Formatting the result for nicer output
# (named inp/out to avoid shadowing the built-in input())
[(inp.value, out.value) for inp, out in res]