09 - Annotation with Active Learning

Use an annotation tool that benefits from active learning to enforce a robust annotation process and balanced annotations.

Rob van Zoest
Founder @ innerdoc.com | NLP Expert-Engineer-Enthusiast | Writes about how to get value from textual data | Lives in the Netherlands | Loves to travel around the globe | Dutchman | rob@innerdoc.com

10 Sep 2020 • 1 min read

A training dataset for Named Entity Recognition with 2000 annotations is of limited use if 100 of them tag ‘Barack Obama’ as a Person, because the model learns little from examples it already gets right. You only want to annotate the sentences where the model is least sure about its prediction.

With active learning, the model chooses which sentences are selected for annotation. The other sentences are skipped, because the model is already more certain about its predictions for them.
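The selection step can be sketched in a few lines of Python. This is a minimal illustration of uncertainty sampling, not the implementation of any particular annotation tool. The function names, the toy confidence values and the `budget` parameter are all assumptions made for the example; in practice the confidence would come from your own NER model.

```python
from typing import Callable, List


def uncertainty(confidence: float) -> float:
    """Return 1.0 when the model is undecided (0.5) and 0.0 when it is certain (0 or 1)."""
    return 1.0 - abs(confidence - 0.5) * 2.0


def select_for_annotation(
    sentences: List[str],
    predict_confidence: Callable[[str], float],
    budget: int,
) -> List[str]:
    """Pick the `budget` sentences the model is least sure about."""
    scored = [(uncertainty(predict_confidence(s)), s) for s in sentences]
    # Most uncertain first; everything past the budget is skipped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:budget]]


# Hypothetical model confidences for "this sentence contains a Person".
toy_confidences = {
    "Barack Obama visited Berlin.": 0.99,
    "Apple fell from the tree.": 0.55,
    "Jordan scored again.": 0.48,
    "Paris is lovely in spring.": 0.90,
}

to_annotate = select_for_annotation(
    list(toy_confidences), toy_confidences.get, budget=2
)
print(to_annotate)
```

With these toy values, the ‘Barack Obama’ sentence is skipped because the model is already confident about it, while the two ambiguous sentences are sent to the annotator.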

The makers of spaCy built the annotation tool Prodi.gy, which is powered by active learning (video below).

This article is part of the project Periodic Table of NLP Tasks. Click to read more about the making of the Periodic Table and the project to systemize NLP tasks.

