Tools

Korpusomat relies on two high-level natural language processing libraries, spaCy and Stanza, together with the language-specific models built by their creators.

It also employs the following tools and resources:
  • Universal Dependencies.

  • Marciniak, M., Mykowiecka, A., & Rychlik, P. (2016). TermoPL - a Flexible Tool for Terminology Extraction. LREC.

  • Brouwer, M., Brugman, H., & Kemps-Snijders, M. (2017). MTAS: A Solr/Lucene based multi tier annotation search solution. Selected papers from the CLARIN Annual Conference 2016. Linköping Electronic Conference Proceedings 136: 19–37.