Écho des études romanes 2009, 5(1):7-25 | DOI: 10.32725/eer.2009.002
Approccio quantitativo alla produttività morfologica: alcuni sviluppi recenti (in Italian)
- Università Carlo di Praga
Quantitative approach to morphological productivity: some recent developments
This article presents the quantitative approach to morphological productivity, based mainly on Baayen's work. The discussion starts from the widely accepted distinction between a qualitative and a quantitative approach. It is argued that there are two main quantitative approaches: one based on dictionaries, the other on large text corpora. While dictionary-based investigation is limited to measures based on type frequency (V), corpus-based research requires a second variable, token frequency (N). The main idea behind the relation between the two is that type frequency (V) can be viewed as a function of token frequency (N): as N increases with corpus size, V increases as well. This relation gives rise to the definition of the vocabulary growth curve (BAAYEN, 1992; 2008). Two additional measures are also presented. The rate at which the vocabulary grows can be captured by the ratio of hapax legomena (V1), the types that occur exactly once, to the overall number of tokens of the formations with a given affix. The notion of vocabulary growth rate (P = V1 / N) (BAAYEN, 1992) is thus introduced. Finally, a third statistical tool for modelling the relation between V, N and V1, put forward by BARONI, EVERT, 2006, is presented: the frequency spectrum, which gives the number of types (V) as a function of the frequency class (m) to which every type is assigned according to its token frequency. Some problems typical of this quantitative approach are also discussed, namely the difficult relation between hapax legomena and neologisms, and the role of the number of tokens in the assessment of productivity. As for the latter, it is shown that, since the measure P depends directly on corpus size, it cannot be used to compare corpora of different sizes (cf. BAAYEN, 1992: 117).
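As an illustration of the measures named in the abstract, the following minimal Python sketch computes N, V, V1, P = V1 / N, the frequency spectrum and the vocabulary growth curve for a list of tokens. It is not taken from the article: the function names and the toy word list are assumptions made for the example, and the tokens are assumed to have been filtered beforehand to formations with a single affix.

```python
from collections import Counter


def productivity_measures(tokens):
    """Compute N, V, V1, P and the frequency spectrum for a token list.

    `tokens` is assumed to contain only formations with one given affix,
    in corpus order. All names here are illustrative, not from the article.
    """
    type_freqs = Counter(tokens)            # type -> token frequency
    n_tokens = sum(type_freqs.values())     # N: number of tokens
    n_types = len(type_freqs)               # V: number of types
    spectrum = Counter(type_freqs.values())  # m -> number of types occurring m times
    hapaxes = spectrum.get(1, 0)            # V1: types occurring exactly once
    growth_rate = hapaxes / n_tokens if n_tokens else 0.0  # P = V1 / N
    return {
        "N": n_tokens,
        "V": n_types,
        "V1": hapaxes,
        "P": growth_rate,
        "spectrum": dict(spectrum),
    }


def growth_curve(tokens):
    """Return the vocabulary growth curve as a list of (N, V) pairs."""
    seen = set()
    curve = []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        curve.append((n, len(seen)))
    return curve


# Made-up toy sample of Italian nouns in -mento (not real corpus data).
toy_tokens = [
    "pagamento", "movimento", "pagamento", "cambiamento",
    "movimento", "pagamento", "spostamento",
]

if __name__ == "__main__":
    print(productivity_measures(toy_tokens))
    print(growth_curve(toy_tokens))
```

On a real corpus the same counts would be taken over all tokens of the affix under study. Because N is the denominator of P, the value obtained depends on how large the sample is.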
To overcome this dependence on corpus size, two main techniques are presented: binomial interpolation and extrapolation. In particular, three modes of extrapolation are introduced. The article concludes with a brief survey of the research and of individual studies conducted within this framework.
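Binomial interpolation can be sketched as follows. Given the frequency spectrum observed at the full sample size N, the standard binomial formulas estimate the expected number of types and of hapax legomena at a smaller sample size n, under the assumption that the smaller sample is drawn at random from the larger one. The function name and the toy spectrum are assumptions made for this example; the article's own computations may differ (the zipfR package cited in the references provides tools for this in R).

```python
def interpolate(spectrum, n):
    """Binomial interpolation of E[V(n)] and E[V1(n)] for n <= N.

    `spectrum` maps a frequency class m to V_m, the number of types that
    occur exactly m times in the full sample. Assumes random sampling of
    tokens, so each token is kept with probability n / N.
    """
    full_size = sum(m * v_m for m, v_m in spectrum.items())  # N
    if full_size == 0:
        raise ValueError("spectrum must contain at least one token")
    if not 0 <= n <= full_size:
        raise ValueError("n must lie between 0 and N")
    p = n / full_size
    # A type with m tokens is absent from the subsample with prob. (1 - p)^m.
    expected_types = sum(
        v_m * (1.0 - (1.0 - p) ** m) for m, v_m in spectrum.items()
    )
    # It is a hapax in the subsample with prob. m * p * (1 - p)^(m - 1).
    expected_hapaxes = sum(
        v_m * m * p * (1.0 - p) ** (m - 1) for m, v_m in spectrum.items()
    )
    return expected_types, expected_hapaxes


# Made-up spectrum: two hapaxes, one type with 2 tokens, one with 3 (N = 7).
toy_spectrum = {1: 2, 2: 1, 3: 1}

if __name__ == "__main__":
    for size in (2, 4, 7):
        exp_v, exp_v1 = interpolate(toy_spectrum, size)
        print(size, round(exp_v, 3), round(exp_v1, 3), round(exp_v1 / size, 3))
```

Interpolating the larger corpus down to the size of the smaller one gives values of V, V1 and P that refer to the same number of tokens, which is what makes them comparable. Extrapolation beyond N requires a fitted statistical model and is not covered by this sketch.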
Keywords: morphological productivity; quantitative approach; hapax legomenon; neologism; binomial interpolation; extrapolation
Published: June 11, 2009
References
- BAAYEN, Harald R. (1992), Quantitative aspects of morphological productivity, in BOOIJ, G., MARLE, J. VAN (eds.), Yearbook of Morphology 1991, Dordrecht, Kluwer, pp. 109-149.
- BAAYEN, Harald R. (2001), Word frequency distributions, Dordrecht, Kluwer.
- BAAYEN, Harald R. (2008), Analyzing Linguistic Data. A Practical Introduction to Statistics Using R, Cambridge, Cambridge University Press.
- BAAYEN, Harald R. (2009), Corpus linguistics in morphology: Morphological productivity, in LÜDELING, Anke, KYTÖ, Merja (eds.), Corpus Linguistics. An International Handbook, Berlin, Mouton de Gruyter, vol. 2, article 41, pp. 899-919.
- BAAYEN, Harald, RENOUF, Antoinette (1996), Chronicling the Times. Productive Lexical Innovations in an English Newspaper, Language, 72, pp. 69-96.
- BARONI, Marco (2007), I sensi di ri-. Un'indagine preliminare, in MASCHI, R., PENELLO, N., RIZZOLATTI, P. (eds.), Miscellanea di studi linguistici offerti a Laura Vanelli, Udine, Forum, pp. 163-171.
- BARONI, Marco (2009), Distributions in text, in LÜDELING, Anke, KYTÖ, Merja (eds.), Corpus Linguistics. An International Handbook, Berlin, Mouton de Gruyter, vol. 2, article 37, pp. 803-822.
- BARONI, Marco, EVERT, Stefan (2006), The zipfR package for lexical statistics: A tutorial introduction. Available at: http://zipfr.r-forge.r-project.org/.
- BAUER, Laurie (2001), Morphological Productivity, Cambridge, Cambridge University Press.
- CARSTAIRS-MCCARTHY, Andrew (1992), Current Morphology, London and New York, Routledge.
- CORBIN, Danielle (1987), Morphologie dérivationnelle et structuration du lexique, 2 vols., Tübingen, Niemeyer.
- DAL, Georgette (2003), Productivité morphologique : définitions et notions connexes, Langue française, 140, pp. 3-23.
- DAL, Georgette, FRADIN, B., GRABAR, N., LIGNON, S., NAMER, F., PLANCQ, C., YVON, F., ZWEIGENBAUM, P. (2007), Linguistic prerequisites to the calculation of morphological productivity and first results. Paper presented at the Journées ATALA, Paris, November 10, 2007.
- DISC - Dizionario Italiano Sabatini-Coletti Compact versione 1.1. Milano, Giunti, 1997.
- EVERT, Stefan (2004), A simple LNRE model for random character sequences, Proceedings of JADT 2004, pp. 411-422.
- EVERT, Stefan, BARONI, Marco (2006a), Testing the extrapolation quality of word frequency models, Proceedings of Corpus Linguistics 2005. Available at: http://www.corpus.bham.ac.uk/PCLC/.
- EVERT, Stefan, BARONI, Marco (2006b), The zipfR library: Words and other rare events in R. Paper presented at useR! 2006: The Second R User Conference, Vienna, Austria.
- EVERT, Stefan, BARONI, Marco (2007), zipfR: Word frequency distributions in R, in Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Posters and Demonstrations Session, Prague, Czech Republic, pp. 29-32.
- GAETA, Livio, RICCA, Davide (2002), Corpora testuali e produttività morfologica: i nomi d'azione in due annate della Stampa, in BAUER, R., GOEBL, H. (eds.), Parallela IX. Testo - variazione - informatica. Text - Variation - Informatik, Wilhelmsfeld, Gottfried Egert Verlag, pp. 223-249.
- GAETA, Livio, RICCA, Davide (2003), Frequency and productivity in Italian derivation: A comparison between corpus-based and lexicographical data, Italian Journal of Linguistics / Rivista di Linguistica, 15, 1, pp. 63-98.
- GAETA, Livio, RICCA, Davide (2006), Productivity in Italian word formation: A variable-corpus approach, Linguistics, 44, 1, pp. 57-89.
- LÜDELING, Anke, EVERT, Stefan (2005), The Emergence of Non-Medical -itis. Corpus Evidence and Qualitative Analysis, in KEPSER, S., REIS, M. (eds.), Linguistic evidence. Empirical, Theoretical, and Computational Perspectives, Berlin, Mouton de Gruyter, pp. 315-333.
- PLAG, Ingo (1999), Morphological Productivity. Structural Constraints in English Derivation, Berlin, Mouton de Gruyter.
- PLAG, Ingo (2006), Productivity, in AARTS, B., MCMAHON, A. (eds.), The Handbook of English Linguistics, Oxford, Blackwell, pp. 537-557.
- RADIMSKÝ, Jan (2006), Les composés italiens actuels, Paris, Cellule de Recherche en Linguistique.
- RICCA, Davide (2005), Al limite tra sintassi e morfologia: i composti aggettivali V-N nell'italiano contemporaneo, in GROSSMANN, M., THORNTON, A. M. (eds.), La formazione delle parole, Atti del XXXVII congresso della Società di Linguistica Italiana, Roma, Bulzoni, pp. 465-486.
- RICCA, Davide (2008), VN compounds in Italian: Data from corpora and theoretical issues. Paper presented at the CompoNet Congress on Compounding, Bologna, June 6-7, 2008.
- SCALISE, Sergio (1994), Morfologia, Bologna, il Mulino.
- ŠTICHAUER, Pavel (2007), Tvoření slov v současné italštině, Praha, Karolinum.
- ŠTICHAUER, Pavel (2009), Morphological productivity in diachrony: the case of the deverbal nouns in -mento, -zione and -gione in Old Italian from the 13th to the 16th century, in MONTERMINI, F., BOYÉ, G., TSENG, J. (eds.), Selected Proceedings of the 6th Décembrettes, Somerville, MA, Cascadilla Proceedings Project, pp. 138-147. Available at: http://www.lingref.com/cpp/decemb/6/abstract2241.html
- ŠTICHAUER, Pavel (in press), La produttività morfologica in diacronia: i suffissi -mento, -zione e -gione in italiano antico dal Duecento al Cinquecento, Praha, Karolinum.
This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.