The Faroese part of the Leipzig Corpora Collection contains texts from news, web and media sources. The mixed corpus has 300,000 sentences and a wordlist of around 240,000 distinct word forms, ordered by frequency.
You can read more about the Leipzig Corpora Collection here: https://wortschatz.uni-leipzig.de/en/download
Release: 2012
Contact: wort@informatik.uni-leipzig.de




