Carolin Odebrecht

July 13th 2021, 3:30 - 5:00 pm UTC

Meeting ID: 121 141 0020
Meeting password: UWmYN82nmg6


Challenges for Data Curation and Selection. Starting Infrastructure Community for Computational Literary Studies

The overall aim of CLS INFRA, an Integrating Activities for Starting Communities (IASC) project, is to create unified and easy access to the best European and national infrastructures for the CLS community, which has not yet been fully supported in benefiting from existing infrastructures and data resources. The project will therefore consolidate, integrate and further develop institutional, national and regional efforts to build shared and sustainable access to high-quality data, tools and knowledge in the field of literary studies in general and Computational Literary Studies (CLS) in particular. In my talk, I focus on the challenges of data curation and selection and discuss the following questions: What are literary corpora or how do we need to define them? Which criteria for data selection and curation are needed? How can we develop a data landscape review which enables findability and provides (intellectual) access to CLS resources in order to foster their re-usability?

Dr. Carolin Odebrecht is responsible for the Research Data Management of the Faculty of Language, Literature and Humanities at Humboldt-Universität zu Berlin. She has been involved in the setup of the LAUDATIO Repository and the RIDGES Herbology project. She is the leader of the working group on Scholarly Resources and ELTeC in the COST Action CA 16204: Distant Reading for European Literary History and she is involved with the recently started Horizon 2020 project Computational Literary Studies Infrastructure. Her research interests include interdisciplinary research in Corpus and Computational Linguistics, Computational Literary Studies and the Digital Humanities; non-standard varieties in historical written text; as well as metadata modelling and sustainability for data reuse.