Multi-Single label text classification
Note: This example builds upon multi-label classification. If you are unfamiliar with it, refer to the previous examples.
Multi-single-label classification is a type of hierarchical classification: first, a sample is classified with multi-label classification, yielding a number of classes. Then, for each of those classes, a single prediction is made in a second dimension.
We showcase this with an aspect-based sentiment analysis (ABSA) example using hotel reviews. As in a multi-label dataset, each sample can have multiple aspects, but each aspect additionally has a sentiment. For example, “I liked the breakfast, although the staff doesn’t even bother to greet you when you enter” has the aspect “Food” with positive sentiment and the aspect “Staff” with negative sentiment. The label would thus be `[["Food", "positive"], ["Staff", "negative"]]`.
[ ]:
from autotransformers import AutoTransformer, DatasetLoader
Below is an example dataset for multi-single-label tasks. In addition to the “classes”, we now have to configure a list of “inner classes” for the inner classification task. Note the None inner class, which is used when a sample is not labeled with the corresponding outer class. In multi-single classification, the samples’ values are lists of pairs: each pair specifies an outer class and its corresponding inner class.
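To make the role of the None inner class concrete, the following minimal sketch expands a sample's list of pairs into one inner class per outer class. The helper `expand_pairs` is hypothetical and not part of autotransformers. It only illustrates the label semantics described here, using the class names from the example dataset.

```python
# Illustrative only: `expand_pairs` is a hypothetical helper, not an autotransformers API.
CLASSES = ["Room", "Staff", "Cleanliness"]
NONE_INNER_CLASS = "None"


def expand_pairs(pairs, classes=CLASSES, none_inner_class=NONE_INNER_CLASS):
    """Map every outer class to its inner class; unlabeled outer classes get the none class."""
    labeled = {outer: inner for outer, inner in pairs}
    return {outer: labeled.get(outer, none_inner_class) for outer in classes}


expanded = expand_pairs([["Staff", "positive"], ["Cleanliness", "positive"]])
print(expanded)
# {'Room': 'None', 'Staff': 'positive', 'Cleanliness': 'positive'}
```

Seen this way, every sample carries exactly one inner label per outer class, which is why the task is called "multi-single".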
[ ]:
dataset = {
"meta": {
"name": "example_multisingle_label",
"version": "1.0.0",
"created_with": "wizard"
},
"config": [
{
"domain": "text",
"type": "IText"
},
{
"task_id": "task1",
"classes": ["Room", "Staff", "Cleanliness"],
"inner_classes": ["None", "positive", "neutral", "negative"],
"none_inner_class": "None",
"type": "TMultiSingleClassification"
}
],
"train": [
[
{"value": "the room was very spacious."},
{"value": [["Room", "positive"]]},
],
[
{"value": "Everything was spotless, bed lines were indeed heavenly and the bed was super comfortable with wonderful pillows."},
{"value": [["Cleanliness", "positive"]]},
],
[
{"value": "In the room there is a tiny bathroom with shower and seperate toilet."},
{"value": [["Room", "neutral"]]},
],
[
{"value": "Our agent had booked the wrong room type and the hotel room we were given was poor."},
{"value": [["Room", "negative"]]},
],
],
"eval": [
[
{"value": "Really clean right by a canal staff really friendly and helpful."},
{"value": [["Staff", "positive"], ["Cleanliness", "positive"]]},
],
],
"test": [
[
{"value": "The concierge was also very helpful in organising some train tickets"},
{"value": [["Staff", "positive"]]},
]
]
}
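Because labels are free-form strings, a misspelled class name is easy to miss. The sketch below checks every pair against the configured outer and inner classes before training. The function `find_invalid_pairs` is a hypothetical helper written for this example, not a library function, and it assumes the dataset contains a single `TMultiSingleClassification` task laid out as shown above.

```python
# Illustrative only: `find_invalid_pairs` is a hypothetical helper, not an autotransformers API.
def find_invalid_pairs(dataset):
    """Return (split, sample_index, pair) for every pair that uses an unknown class."""
    task = next(
        c for c in dataset["config"] if c.get("type") == "TMultiSingleClassification"
    )
    outer, inner = set(task["classes"]), set(task["inner_classes"])
    problems = []
    for split in ("train", "eval", "test"):
        # Each sample is a two-element list: [input, label].
        for i, (_text, label) in enumerate(dataset.get(split, [])):
            for pair in label["value"]:
                if len(pair) != 2 or pair[0] not in outer or pair[1] not in inner:
                    problems.append((split, i, pair))
    return problems


# A tiny stand-in dataset with one deliberate typo ("Staf").
demo = {
    "config": [
        {"domain": "text", "type": "IText"},
        {
            "task_id": "task1",
            "classes": ["Room", "Staff"],
            "inner_classes": ["None", "positive", "negative"],
            "none_inner_class": "None",
            "type": "TMultiSingleClassification",
        },
    ],
    "train": [
        [{"value": "the room was very spacious."}, {"value": [["Room", "positive"]]}],
        [{"value": "rude staff"}, {"value": [["Staf", "negative"]]}],
    ],
}
problems = find_invalid_pairs(demo)
print(problems)
# [('train', 1, ['Staf', 'negative'])]
```

Calling `find_invalid_pairs(dataset)` on the example dataset should return an empty list, since all of its pairs use configured classes.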
As before, we simply create a DatasetLoader from the dataset and start training.
[ ]:
dl = DatasetLoader(dataset)
# Or create a DatasetLoader from a file
# dl = DatasetLoader("path/to/my-dataset.json")
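If you prefer the file-based variant, the dataset dict first has to be written to disk. The sketch below uses only the standard library and assumes that the JSON file simply mirrors the dict structure shown above. The stub dataset and the temporary path are placeholders for this example.

```python
import json
import os
import tempfile

# Tiny stand-in; in the notebook you would pass the full `dataset` dict instead.
dataset_stub = {"meta": {"name": "example_multisingle_label", "version": "1.0.0"}}

# Write the dataset to a JSON file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "my-dataset.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(dataset_stub, f, ensure_ascii=False, indent=2)

# Read it back to confirm that the round trip preserves the structure.
with open(path, encoding="utf-8") as f:
    reloaded = json.load(f)

# dl = DatasetLoader(path)  # as in the cell above; requires autotransformers
```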
[ ]:
# In this example, we only train for one epoch to finish fast.
# In reality, you want to set this to a higher value for better results.
config = [
("engine/stop_condition/type", "MaxEpochs"),
("engine/stop_condition/value", 1),
]
at = AutoTransformer(config)
at.init(dataset_loader=dl, path=".models/example03")
at.train(dl)
Finally, we check the result on a new review:
[ ]:
at("I liked the breakfast, although the staff doesn't even bother to greet you when you go anywhere.")