
Corpora & Resources




Corpus of Spontaneous Spoken Italian (Adult and Child)

Stammerjohann Corpus (1965) and LABLITA Corpus for Diachronic Comparison
The first corpus of spoken Italian, compiled in 1965, and a comparable corpus from the 2000s for diachronic comparison. Distributed under a Creative Commons licence.

Integrated Reference Corpora for Spoken Romance Languages (IST-2000-26-228)

Corpus Oral Didáctico Anotado Lingüísticamente

Corpora Didattici Italiani di Confronto
2-billion-word Italian web corpus with a modern search interface

Information Structure Database

LABLITA Keyword Extractor
Extraction tool for single-term and multi-term keywords from multilingual (English, French, German, Italian, Spanish) texts

Website Crawling and Text Processing Infrastructure developed inside Project

Metadata of our corpora are freely available on this web site.

Access conditions

The metadata collection for all corpora is freely available on this web site.

The LABLITA Corpus is available under the following conditions:

  1. within the C-ORAL-ROM corpus
    • A significant portion of the LABLITA Corpus of Adult Spoken Italian (roughly 36 hours, amounting to 300,000 transcribed words) is distributed by ELDA (European Language Resources Distribution Agency, Paris) within the C-ORAL-ROM collection of Romance corpora, under a licence agreement.
    • The C-ORAL-ROM corpus is also distributed in encrypted form for personal use by J. Benjamins Publishing Company on one DVD, together with E. Cresti & M. Moneglia (eds.) C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages.
    • A collection of short samples of the C-ORAL-ROM corpus is freely available in a DEMO version on this web site.
  2. Longitudinal samples of the LABLITA Collections of Longitudinal Corpora of Early Acquisition of Italian are freely available on this web site.
  3. A sample of the Stammerjohann Corpus will be published.
  4. Larger selections of the LABLITA corpus can be accessed by private or public bodies for research and development within the framework of specific projects. The terms of access are established through a licence agreement with the Italian Department of the University of Florence, which will limit the use of the corpora to the purpose of the project itself.
    Users can select their preferred sample using the metadata on this web site and then contact E. Cresti. Costs and conditions vary according to the sample and the purpose of the project.
Created by admin
Last modified 19 November 2013, 14:57
