This document provides the details needed to access the research collection "DepreSym: A Depression Symptom Annotated Corpus".
Any scientific publication derived from the use of this collection should explicitly refer to the following publication:
Pérez, A., Fernández-Pichel, M., Parapar, J., & Losada, D. E. (2023). DepreSym: A Depression Symptom Annotated Corpus and the Role of LLMs as Assessors of Psychological Markers. arXiv preprint arXiv:2308.10758.
The DepreSym collection is available for research purposes under proper user agreements.
Each line of the relevance judgments follows this format:
SYMPTOM_ID 0 sentence-id RELEVANCE
SYMPTOM_ID: The identifier of the corresponding symptom (from 1 to 21).
0: A constant placeholder column (always 0, as shown in the format line above).
sentence-id: The identifier of the respective Reddit sentence.
RELEVANCE: The relevance of the sentence (0 or 1).
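As an illustration of the format above, the following is a minimal Python sketch of a reader for these lines. It assumes the four fields are separated by whitespace, which is not stated in this document. The function name parse_judgments and the sentence identifiers in the usage example are hypothetical and do not come from the collection.

```python
from collections import defaultdict


def parse_judgments(lines):
    """Parse lines of the form: SYMPTOM_ID 0 sentence-id RELEVANCE.

    Assumption: fields are whitespace-separated.
    Returns two dicts keyed by symptom id (int):
    all judged sentence ids, and the subset judged relevant.
    """
    judged = defaultdict(set)
    relevant = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 4:
            raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
        symptom_field, _placeholder, sentence_id, relevance_field = parts
        symptom = int(symptom_field)
        relevance = int(relevance_field)
        if not 1 <= symptom <= 21:
            raise ValueError(f"symptom id out of range 1..21: {symptom}")
        if relevance not in (0, 1):
            raise ValueError(f"relevance must be 0 or 1: {relevance}")
        judged[symptom].add(sentence_id)
        if relevance == 1:
            relevant[symptom].add(sentence_id)
    return dict(judged), dict(relevant)


# Usage with made-up sentence identifiers (not real collection ids).
sample = ["1 0 s_001 1", "1 0 s_002 0", "2 0 s_003 1"]
judged, relevant = parse_judgments(sample)
```

Keeping the judged and relevant sets separate makes it possible to tell a sentence annotated as non-relevant apart from one that was never in the pool.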
DepreSym is a resource that derives from Task 1 of the eRisk 2023 Lab. This is a novel task that consists of identifying sentences that
are indicative of the presence of clinical symptoms in the individuals who wrote these sentences. We follow the BDI-II inventory, a well-studied
clinical questionnaire, which covers 21 standard symptoms of depression.
The sentences come from a large corpus of posts written by multiple Reddit users.
The users' posts were segmented into sentences and a TREC-style collection was created (3,807,115 sentences from 3,107 unique users).
All extracted sentences were public, and Reddit's terms and conditions allow the use of its contents for research purposes.
This resource comes from a shared-data ranking task introduced in the CLEF 2023 eRisk Lab.
To create this new resource, three expert assessors annotated a pool of sentences associated with each of the 21 BDI-II symptoms.
The candidate sentences were obtained using top-k pooling from the rankings of estimated relevant sentences
contributed by the participants in the 2023 eRisk task.
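The top-k pooling step can be sketched in Python as follows. This is a generic illustration of the technique, not the code used to build the collection. The function name top_k_pool, the value of k, and the sentence identifiers are hypothetical, since this document does not state the pool depth.

```python
def top_k_pool(rankings, k):
    """Return the union of the top-k items of each ranking.

    rankings: list of ranked lists of sentence ids (best first),
              one list per participating run, for a single symptom.
    k: pool depth (hypothetical; not specified in this document).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    pool = set()
    for ranking in rankings:
        pool.update(ranking[:k])  # keep only the k highest-ranked ids
    return pool


# Usage with made-up runs for one symptom.
run_a = ["s_1", "s_2", "s_3"]
run_b = ["s_2", "s_4", "s_5"]
candidates = top_k_pool([run_a, run_b], k=2)
```

In this sketch the pool is built once per symptom, and the resulting candidate set is what would be handed to the assessors for annotation.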
This collection can only be used for research purposes. To obtain the textual collection that corresponds to the eRisk 2023 Task 1 sentences, please fill in the following user agreement and send it to david.losada@usc.es.