Automatically generated NE tagged corpora for English and Hungarian
Simon, Eszter and Nemeskey, Dávid Márk (2012) Automatically generated NE tagged corpora for English and Hungarian. In: Proceedings of the 4th Named Entity Workshop (NEWS), 2012-07-08 - 2012-07-14, Jeju, South Korea.
Text: engNERwiki.pdf (103kB)
Abstract
Supervised Named Entity Recognizers require large amounts of annotated text. Since manual annotation is a highly costly procedure, reducing the annotation cost is essential. We present a fully automatic method to build NE annotated corpora from Wikipedia. In contrast to recent work, we apply a new method, which maps the DBpedia classes into CoNLL NE types. Since our method is mainly language-independent, we used it to generate corpora for English and Hungarian. The corpora are freely available.
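The class-to-type mapping mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the mapping table, the simplified class hierarchy, and the function name `conll_type` are all hypothetical, chosen only to show how a DBpedia class might be resolved to a CoNLL NE type by walking up to a mapped ancestor.

```python
# Hypothetical mapping from top-level DBpedia classes to CoNLL NE types.
# The paper's actual mapping table is not reproduced here.
DBPEDIA_TO_CONLL = {
    "Person": "PER",
    "Organisation": "ORG",
    "Place": "LOC",
}

# Hypothetical, simplified fragment of a class hierarchy (child -> parent).
PARENT = {
    "Scientist": "Person",
    "Company": "Organisation",
    "City": "Place",
}


def conll_type(dbpedia_class):
    """Return the CoNLL type of the nearest mapped ancestor, or None."""
    seen = set()  # guards against accidental cycles in the hierarchy
    cls = dbpedia_class
    while cls is not None and cls not in seen:
        if cls in DBPEDIA_TO_CONLL:
            return DBPEDIA_TO_CONLL[cls]
        seen.add(cls)
        cls = PARENT.get(cls)  # None once the top of the hierarchy is reached
    return None


if __name__ == "__main__":
    for name in ("Scientist", "City", "Album"):
        print(name, conll_type(name))
```

In this sketch, returning `None` for unmapped classes is itself an assumption; a real pipeline would need an explicit policy for such cases.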
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | Q Science > QA Mathematics and Computer Science > QA75 Electronic computers. Computer science |
Divisions: | ?? R104a ?? |
Depositing User: | EPrints Admin |
Date Deposited: | 18 Feb 2013 14:01 |
Last Modified: | 05 Feb 2014 12:28 |
URI: | https://eprints.sztaki.hu/id/eprint/6893 |