C-ORAL-ROM variation parameters

Variation parameters of spoken language in C-ORAL-ROM:

Text structure:

  1. dialogical structure.
    Text distinction among:
    • monologues
    • dialogues
    • conversations
  2. domain of use. Text distinction among:
    • family
    • private life
    • public life

  3. media productions

Speakers are marked for:

  • Sex and Age
  • Education
  • Occupation

Once each variation parameter is well represented in a corpus, that corpus can be considered to represent language variation.
Once each parameter is represented in every corpus with the same consistency, the multilingual corpus can be said to be comparable with respect to those parameters.

Finally, comparability among the Romance corpora does not concern single texts, as it does in parallel corpora and in corpora based on restricted semantic domains, but each corpus as a whole.


In the C-ORAL-ROM perspective, corpora with the same sociolinguistic weight can be compared with respect to:

  • communicative acts

  • lexicon

  • syntax

  • intonation

The C-ORAL-ROM corpora are the basis for a better knowledge of the variability of spoken language structures in the four Romance languages, from both a quantitative and a qualitative perspective.

For instance, mean length of utterance, number of words per second, number of utterances per minute, quotation frequency, the frequency of particular syntactic configurations, lexical density, the frequency of lexical types, and the main proportions among different communicative acts all seem to vary in accordance with the domain of use of spoken language.
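As an illustration only, three of the quantitative measures named above could be computed from time-aligned utterances roughly as follows. This is a minimal sketch, not the C-ORAL-ROM tooling: the `Utterance` record, the `speech_measures` function, whitespace tokenization, and the choice to measure duration from the first start time to the last end time are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical record for one time-aligned utterance (not a C-ORAL-ROM format)."""
    text: str     # orthographic transcription
    start: float  # start time in seconds
    end: float    # end time in seconds


def speech_measures(utterances):
    """Return mean length of utterance, words per second and utterances per minute."""
    if not utterances:
        raise ValueError("at least one utterance is required")

    # Assumption: a word is any whitespace-separated token.
    n_words = sum(len(u.text.split()) for u in utterances)

    # Assumption: duration runs from the earliest start to the latest end,
    # so pauses between utterances are included.
    duration = max(u.end for u in utterances) - min(u.start for u in utterances)
    if duration <= 0:
        raise ValueError("total duration must be positive")

    return {
        "mean_length_of_utterance": n_words / len(utterances),
        "words_per_second": n_words / duration,
        "utterances_per_minute": len(utterances) * 60 / duration,
    }


# Invented sample data, for illustration only.
sample = [
    Utterance("ciao come stai", 0.0, 1.5),
    Utterance("bene grazie", 2.0, 3.0),
    Utterance("e tu", 3.5, 5.0),
]
measures = speech_measures(sample)
```

Computing the same dictionary separately for texts from each domain of use would give the kind of figures that can then be compared across domains and across the four corpora.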

The exploitation of the C-ORAL-ROM corpora will allow multilingual speech technology to be modelled on the peculiar properties of the Romance languages in their spoken variety.