Virtual paper at DH2017: Distinguishing Newspaper Genres. Exploring Automated Classification of Journalism’s Modes of Expression

Introduction

Former researcher-in-residence Dr. Frank Harbers will present a virtual paper at DH2017 on the research he and Juliette Lonij carried out during his time at the KB Lab, which resulted in the Genre Classifier.

Abstract

This paper examines the opportunities, approaches and issues involved in automatically classifying historical newspaper articles from the Netherlands by ‘genre’, understood as an expression of the historically and culturally determined conception of journalism. Ultimately, it outlines a concrete machine learning approach that applies linear and non-linear classifiers to predict the genre of a newspaper article. As part of this, the paper discusses the different tools we tried and the problems we encountered in the process. Specifically, it reflects on how the rule-based approach to determining genre in manual content analysis relates to training an automatic classifier with machine learning techniques.