bayfere.blogg.se

Italian magazines and newspapers





Collections of newspapers in digital form are a rich source of information for researchers in a number of disciplines in the Humanities and Social Sciences. They are especially valuable for synchronic as well as diachronic studies, ranging from history, media and communication studies to lexicography, for which newspapers are a rich source of neologisms and other lexicographic phenomena.

The CLARIN infrastructure gives access to 34 newspaper corpora, 7 of which are multilingual and 27 monolingual. The available corpora contain newspaper articles in 11 languages, including Arabic, Czech, Finnish, French, German, Greek, Italian, Norwegian, Polish and Swedish. Almost a third of the newspaper corpora are historical, with the oldest articles dating from the 18th century. The majority of them are richly tagged and available under public licences.

We first provide overviews of the corpora that are already part of the CLARIN infrastructure and then list those that have not yet been integrated. For comments, changes to the existing content or inclusion of new corpora, send us an email. This website was last updated on 7 September 2021.

Newspaper corpora in the CLARIN infrastructure

Monolingual corpora

This corpus contains articles from the Arabic newspaper An-Nahar from 1995 to 2000.
Annotation: tokenised, lemmatised, PoS-tagged
The corpus is available for download from the ELRA catalogue.

This corpus contains articles from 11 Czech newspapers from 1989 to 2004.
The corpus is available for download from the Czech repository LINDAT.
Annotation: tokenised, lemmatised, MSD-tagged
SYN2013PUB: corpus of written Czech newspapers
Licence: Czech National Corpus (Shuffled Corpus Data)
This corpus contains articles from Czech newspapers from 2005 to 2009.

This corpus contains articles from the Finnish newspaper Karjalan Sanomat from 2012 to 2014.
The corpus is available through the concordancer Korp.

Corpus journalistique issu de l'Est Républicain
This corpus contains recorded readings of articles from the French newspaper Le Monde.
This corpus contains articles from the French newspaper l'Est Républicain from 1999 to 2003.
The corpus is available for download from Ortolang.

Annotation: tokenised, MSD-tagged, lemmatised, syntactic constituency, named entities
Tübingen Treebank of Written German / Newspaper Corpus
This corpus contains articles from the German newspaper Die Tageszeitung.
Annotation: tokenised, PoS-tagged, parsed, lemmatised
The corpus is available through a dedicated concordancer with an institutional account.
This corpus contains articles from the German newspaper Frankfurter Rundschau.
The corpus is available for download from a dedicated webpage.





