Studies in Linguistics and Linguistic Data Science (SLLDS)

Our series of linguistic publications presents work done at the lab and makes it available to the public. You can find all editions here:

Our Resources

GerEO: German experiencer-object verbs

GerEO is a set of syntactic and semantic annotations on German sentences containing an experiencer-object (EO) verb. EO verbs are psychological predicates whose Experiencer argument is mapped onto the object. The literature claims that they are syntactically special.

PrepSensNZZ

PrepSensNZZ is a collection of over 19,000 sentences containing ambiguous prepositions. The sentences have been automatically annotated for part of speech and syntactic dependency structure (following the TiGer guidelines). In addition, the heads of the NPs embedded by the prepositions have been annotated for morphological structure and lexical information.

Bochum English Countability Lexicon (BECL)

The BECL comprises valuable data, which we gladly share with other researchers. The project’s website describes the project and offers the BECL for download:

PUNKT in the NLTK package

Based on Kiss, Tibor & Jan Strunk (2006) Unsupervised Multilingual Sentence Boundary Detection. Computational Linguistics. 485-525. (see here), PUNKT has been implemented for the NLTK project and integrated into the NLTK package. On the project’s website you can find an introduction to the PUNKT sentence tokenizer and the package’s source code.
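
As a rough illustration of how the tokenizer can be called from Python, here is a minimal sketch. It assumes that NLTK is installed (for example via `pip install nltk`) and uses the `PunktSentenceTokenizer` class from `nltk.tokenize.punkt`. The sample text and the helper name `split_sentences` are made up for this example. The tokenizer is used here without any training, so it has learned no abbreviations; for real data you would normally train it on a larger corpus or load a pretrained model.

```python
# Minimal sketch: sentence splitting with NLTK's Punkt implementation.
# Assumption: NLTK is installed. No pretrained model is downloaded here;
# the tokenizer is created with default (untrained) parameters.
from nltk.tokenize.punkt import PunktSentenceTokenizer


def split_sentences(text):
    """Split text into sentences with an untrained Punkt tokenizer.

    `split_sentences` is a hypothetical helper for this example,
    not part of NLTK.
    """
    tokenizer = PunktSentenceTokenizer()
    return tokenizer.tokenize(text)


# Hypothetical sample text without abbreviations.
sample = "The lab shares its resources. Other researchers can download them."
sentences = split_sentences(sample)

for sentence in sentences:
    print(sentence)
```

Passing raw training text to the constructor, as in `PunktSentenceTokenizer(train_text)`, lets the tokenizer learn abbreviations and other statistics from that text in an unsupervised way, which is the approach described in the paper.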