What is the ACQDIV [ˈækdɪv] Database?
The ACQDIV Database brings together 17 corpora of first language acquisition, representing 15 maximally diverse languages, in a formally and semantically standardized format. It contains video and audio recordings, transcribed speech, and linguistic annotations from these corpora. The database is created and maintained by the TTF DataScience of the NCCR and the UZH ACQDIV Lab, led by Prof. Sabine Stoll.
To learn more about the corpus’ linguistic design, its structure, and its technical realization, please read the corpus manual.
What for?
Which languages?
Corpus | Language | Family | Public |
---|---|---|---|
Allen Inuktitut Corpus | Inuktitut | Eskimo-Aleut | No |
Chintang Language Corpus | Chintang | Sino-Tibetan | No |
Corpus of the Chisasibi Language Acquisition Study | Cree | Algic | Yes |
Demuth Sesotho Corpus | Sesotho | Atlantic-Congo | Yes |
Dëne Sųłıné Language Acquisition Study | Dene | Athabaskan-Eyak-Tlingit | No |
English Manchester Corpus | English | Indo-European | Yes |
MPI-EVA Manchester Corpus | English | Indo-European | Yes |
Hellwig Qaqet Corpus | Qaqet | Baining | No |
Koç University Longitudinal Language Development Database | Turkish | Turkic | No |
MiiPro Japanese Corpus | Japanese | Japonic | Yes |
Miyata Japanese Corpus | Japanese | Japonic | Yes |
MPI-EVA Jakarta Child Language Database | Indonesian | Austronesian | Yes |
Sarvasy Nungon Corpus | Nungon | Nuclear Trans New Guinea | Yes |
Pfeiler Yucatec Child Language Corpus | Yucatec | Mayan | No |
Stoll Russian Corpus | Russian | Indo-European | No |
The Ku Waru Child Language Socialization Study (KWCLSS) | Ku Waru | Nuclear Trans New Guinea | Yes |
Tuatschin Corpus | Tuatschin | Indo-European | No |
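
As a quick cross-check of the table above, the following minimal sketch tallies corpora, languages, families, and public availability. It works only on the rows copied from this page, not on the database itself; the names `TABLE`, `parse_table`, and `summarize` are illustrative and not part of any ACQDIV tooling.

```python
# Rows copied verbatim from the table on this page (header and rule omitted).
TABLE = """\
Allen Inuktitut Corpus | Inuktitut | Eskimo-Aleut | No |
Chintang Language Corpus | Chintang | Sino-Tibetan | No |
Corpus of the Chisasibi Language Acquisition Study | Cree | Algic | Yes |
Demuth Sesotho Corpus | Sesotho | Atlantic-Congo | Yes |
Dëne Sųłıné Language Acquisition Study | Dene | Athabaskan-Eyak-Tlingit | No |
English Manchester Corpus | English | Indo-European | Yes |
MPI-EVA Manchester Corpus | English | Indo-European | Yes |
Hellwig Qaqet Corpus | Qaqet | Baining | No |
Koç University Longitudinal Language Development Database | Turkish | Turkic | No |
MiiPro Japanese Corpus | Japanese | Japonic | Yes |
Miyata Japanese Corpus | Japanese | Japonic | Yes |
MPI-EVA Jakarta Child Language Database | Indonesian | Austronesian | Yes |
Sarvasy Nungon Corpus | Nungon | Nuclear Trans New Guinea | Yes |
Pfeiler Yucatec Child Language Corpus | Yucatec | Mayan | No |
Stoll Russian Corpus | Russian | Indo-European | No |
The Ku Waru Child Language Socialization Study (KWCLSS) | Ku Waru | Nuclear Trans New Guinea | Yes |
Tuatschin Corpus | Tuatschin | Indo-European | No |
"""


def parse_table(text):
    """Split each pipe-delimited row into (corpus, language, family, public)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The trailing pipe yields an empty last cell, so keep the first four.
        cells = [cell.strip() for cell in line.split("|")][:4]
        rows.append(tuple(cells))
    return rows


def summarize(rows):
    """Count corpora, distinct languages and families, and public corpora."""
    return {
        "corpora": len(rows),
        "languages": len({language for _, language, _, _ in rows}),
        "families": len({family for _, _, family, _ in rows}),
        "public": sum(1 for *_, public in rows if public == "Yes"),
    }


summary = summarize(parse_table(TABLE))
print(summary)
```

The counts agree with the description at the top of the page: 17 corpora covering 15 languages, of which 9 corpora are marked as public.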
How to get access?
The ACQDIV database is not fully public because it contains sensitive data from unpublished subcorpora. The sub-database containing the public corpora can be downloaded from Zenodo.
Researchers may be granted access to the full ACQDIV database upon request to the PI. The textual database can also be accessed via API services.