Conference Paper (published)
Details
Citation
Galke L, Mai F, Schelten A, Brunsch D & Scherp A (2017) Using Titles vs. Full-text as source for automated semantic document annotation. In: Proceedings of the Knowledge Capture Conference, K-CAP 2017. Knowledge Capture Conference 2017, Austin, TX, USA, 04.12.2017-06.12.2017. New York: ACM, Article 20. https://doi.org/10.1145/3148011.3148039
Abstract
We conduct the first systematic comparison of automated semantic annotation based either on the full-text or only on the title metadata of documents. Apart from the prominent text classification baselines kNN and SVM, we also compare recent techniques of Learning to Rank and neural networks and revisit the traditional methods logistic regression, Rocchio, and Naive Bayes. On three of our four datasets, classification using only titles reaches over 90% of the quality obtained when using the full-text.
Keywords
Multi-label classification; document analysis; semantic annotation
Journal
Proceedings of the Knowledge Capture Conference, K-CAP 2017
Status | Published |
---|---|
Funders | European Commission |
Publication date | 31/12/2017 |
URL | http://hdl.handle.net/1893/28018 |
Publisher | ACM |
Place of publication | New York |
ISBN | 9781450355537 |
Conference | Knowledge Capture Conference 2017 |
Conference location | Austin, TX, USA |
Dates | 04/12/2017 – 06/12/2017 |