
Preliminary Survey of European Portuguese Frozen Sentences
Jorge Baptista 1,2, Anabela Correia 1, Graça Fernandes 1
1 Univ. Algarve, Portugal
2 L2F, INESC-ID Lisboa, Portugal
1st Iberian Workshop on Contrastive Grammar
Univ. Algarve, Faro, Portugal, November 28-29th, 2005

Structure of the Presentation
- Collecting Frozen Sentences
- Defining Frozen Sentences
- Classification
- Format of Dictionary
- Syntactic Properties
- Classification Problems
- Concluding Remarks

Collecting Frozen Sentences

Many frozen sentences, especially the most common or most obviously idiomatic ones, have already been collected in both general and specialized dictionaries of idioms. In these dictionaries, frozen sentences are usually not distinguished from other types of multiword expressions, such as compound nouns, adverbs, prepositions and conjunctions, proverbs, etc.

To build this Lexicon-Grammar of frozen sentences of European Portuguese, several sources were used, including general and specialized dictionaries; these were supplemented with information retrieved from newspapers, magazines, the web, etc., and with our knowledge as native speakers of Portuguese.

[1] Basically, Mello (1986), Santos (1990), Simões (1993), Moreira (1996), and Neves (2000). The electronic dictionary of frozen sentences of Brazilian Portuguese (Vale, 2001) was also consulted, but many of those sentences either do not exist in European Portuguese or present substantial syntactic and lexical differences, so that a detailed comparative study is in order.

Many sentences were checked against corpora of different kinds, using the Linguateca resources (www.linguateca.pt) and web browsers.

Defining Frozen Sentences

- no easy definition
- heterogeneous expressions

Frozen sentences are elementary sentences in which the main verb and at least one of its argument noun phrases are distributionally constrained, and the global meaning of the expression usually cannot be computed from the individual meanings of its component elements when they are used independently (M. Gross 1982, 1989, 1996; G. Gross 1996; Baptista et al. 2003; Ranchhod 2003).

For that reason, the whole expression must be taken as a complex, multiword lexical unit:

(1) O João fechou-se em copas (lit: John closed himself in hearts; John isolated himself)

The verb-object combination (fechar-copas) is frozen. One cannot replace copas (hearts) with another noun of the same lexical paradigm:

* O João fechou-se em (espadas + paus + ouros) (lit: John closed himself in spades + clubs + diamonds)

Classification of Frozen Sentences

The formal framework of M. Gross (1982, 1989, 1996) was adopted to classify frozen sentences. The classification is based on sentence structure: the number and type of noun phrases (NP) attached to the main verb, their frozen (C) or free (N) nature, and the syntactic properties of the construction. Table 1 (next) shows some formal classes, their internal structure, an illustrative example, and the approximate number of sentences collected so far.

See Leclère (2002) for an updated overview of the current status of the French Lexicon-Grammar.

Table 1. Classification of frozen sentences (extract)

Class  Structure                  Example                                            Size
C1     N0 V C1                    O Pedro matou a galinha dos ovos de ouro            800
CAN    N0 V (C de N)1 = C1 a N2   O Pedro arrefeceu os ânimos (de Maria = à Maria)    200
CDN    N0 V (C de N)1             O Pedro queria a cabeça da Maria                    100
CP1    N0 V Prep C1               O Pedro bateu com a porta                           900
CPN    N0 V Prep (C de N)1        O Pedro foi aos cornos do João                      100
C1PN   N0 V C1 Prep N2            O Pedro arrastou a asa à Maria                      400
CNP2   N0 V N1 Prep C2            O Pedro tirou o relógio do prego                    350
C1P2   N0 V C1 Prep C2            O Pedro deitou mãos à obra                          400
CPP    N0 V Prep C1 Prep C2       O Pedro foi de cavalo para burro                    200
CPPN   N0 V C1 Prep C2 Prep C3    O Pedro deitou o bebé fora com a água do banho       50
Total                                                                               3,500

Frozen sentences with sentential subjects (C0Q, C5) or objects (C6), or with frozen subject noun phrases (C0), were not considered in this paper.

N and C stand for free and frozen noun phrases, respectively; N0 is the subject; N1, N2 and N3 are the first, second and third complements; V is the verb and Prep a preposition.
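As a quick consistency check on Table 1, the class sizes can be summed and compared with the reported total. The snippet below is only a sketch: the dictionary name `CLASS_SIZES` and the helper functions are hypothetical, and the figures are the approximate counts given in the table.

```python
# Approximate class sizes as reported in Table 1 (names are illustrative only).
CLASS_SIZES = {
    "C1": 800, "CAN": 200, "CDN": 100, "CP1": 900, "CPN": 100,
    "C1PN": 400, "CNP2": 350, "C1P2": 400, "CPP": 200, "CPPN": 50,
}


def total_size(sizes):
    """Sum the number of sentences over all classes."""
    return sum(sizes.values())


def share(sizes, cls):
    """Fraction of the whole lexicon that belongs to one class."""
    return sizes[cls] / total_size(sizes)


print(total_size(CLASS_SIZES))                  # 3500
print(round(100 * share(CLASS_SIZES, "CP1"), 1))  # 25.7
```

The sum matches the total of 3,500 given in the table; CP1, the largest class, accounts for roughly a quarter of the sentences collected so far.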

Classification of Frozen Sentences

Compared with the figures reported for other languages, namely French (over 20,000; M. Gross 1996), Spanish (3,500; Mogorrón-Huerta 2002), Modern Greek (4,500; Fotopoulou 1993) and Brazilian Portuguese (3,500; Vale 2001), it is clear that our own lists are still far from complete. They should be extended, probably using other, corpus-based methods for lexical acquisition (McKeown & Radev 2000; Matsumoto 2003).

Format of Dictionary

The Lexicon-Grammar may be viewed as an electronic dictionary of frozen sentences. It is composed of several matrices, one per formal class. In these matrices, each line is a frozen sentence, and the columns contain the lexical elements of the sentence and their syntactic (distributional and transformational) properties. The set of matrices constitutes the lexicon-grammar of frozen sentences.
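The matrix format described above could be modelled along the following lines. This is a minimal sketch, not the authors' actual file format: the class names `FrozenEntry` and `ClassMatrix` and the `select` method are hypothetical, the verb lemmas are read off the example sentences, and the property values for the three sample lines are taken from the extract of class CPN on the assumption that the four signs in each line correspond to its last four property columns.

```python
from dataclasses import dataclass, field


@dataclass
class FrozenEntry:
    """One line of a class matrix: lexical columns plus binary properties."""
    lexical: dict      # lexical elements, e.g. {"V": "ir", "Prep": "a", ...}
    properties: dict   # syntactic properties, True for "+" and False for "-"
    example: str


@dataclass
class ClassMatrix:
    """One matrix per formal class; each entry is one frozen sentence."""
    name: str
    entries: list = field(default_factory=list)

    def select(self, prop, value=True):
        """Return the entries whose property `prop` has the given value."""
        return [e for e in self.entries if e.properties.get(prop) == value]


cpn = ClassMatrix("CPN", [
    FrozenEntry({"V": "ir", "Prep": "a", "Det": "as", "C": "trombas"},
                {"N1 =: Nhum": True, "de N = a N": True, "de N = Poss": False},
                "O Pedro foi às trombas do João"),
    FrozenEntry({"V": "ir", "Prep": "em", "Det": "a", "C": "cantiga"},
                {"N1 =: Nhum": True, "de N = a N": False, "de N = Poss": True},
                "O Pedro foi na cantiga da Maria"),
    FrozenEntry({"V": "viver", "Prep": "em", "Det": "a", "C": "sombra"},
                {"N1 =: Nhum": True, "de N = a N": False, "de N = Poss": True},
                "O Pedro vive na sombra da Maria"),
])

# The lexicon-grammar is then simply the set of matrices, indexed by class.
lexicon_grammar = {cpn.name: cpn}

for entry in lexicon_grammar["CPN"].select("de N = a N"):
    print(entry.example)
```

Keeping lexical columns and property columns in separate dictionaries mirrors the distinction the slide draws between the lexical elements of a sentence and its syntactic properties; other layouts (for instance one CSV file per class) would serve equally well.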

Table 2. Class CPN (extract)

Columns of the matrix: N0 =: Nhum, N0 =: N-hum, V, Vse, NegObrig, Prep, Det, C, N1 =: Nhum, N1 =: N-hum, de N = a N, de N = Poss, Example. Only the columns shown below are recoverable in this extract.

Prep  Det  C            N1 =: Nhum  N1 =: N-hum  de N = a N  de N = Poss  Example
com   a    raça         +           -            +           +            O Pedro acabou com a raça da Maria
a     os   pés          +           -            +           +            O Pedro atirou-se aos pés da Maria
a     os   calcanhares  +           -            +           +            O Pedro não chega aos calcanhares da Maria
em    a    casaca       +           -            +           -            O Pedro cortava na casaca da Maria
a     as   trombas      +           -            +           -            O Pedro foi às trombas do João
em    a    cantiga      +           -            -           +            O Pedro foi na cantiga da Maria
a     a    cara         +           -            +           -            A Maria foi à cara do Pedro
em    a    deixa        +           -            +           +            O Pedro pegou na deixa da Maria
em    a    cara         +           -            +           +            O Pedro riu na cara da Maria
em    a    cara         +           -            +           +            O Pedro riu-se na cara da Maria
de    o    pelo         +           -            +           -            O salário sai-lhe do pelo
a     a    cabeça       +           -            +           +            A fama subiu à cabeça do Pedro
em    a    sombra       +           +            -           +            O Pedro vive na sombra da Maria

Syntactic Properties

- distributional constraints on free NPs (e.g. ±Nhum)
- intrinsically reflexive constructions (Vse): the reflexive pronoun can neither be zeroed nor replaced by a free NP (not coreferent with the subject)

(2) O Pedro atirou-se aos pés da Maria (lit: Peter threw himself to the feet of Mary; Peter humbled himself before Mary)
*O Pedro atirou (E + o João) aos pés da Maria (lit: Peter threw E/John to the feet of Mary)

Syntactic Properties (continued)

- obligatory negation (NegObrig):

(3) O Pedro não chega aos calcanhares da Maria (lit: Peter does not get to the heels of Mary; Peter is not a match for Mary)
?*O Pedro chega aos calcanhares da Maria (lit: Peter gets to the heels of Mary)

Syntactic Properties (continued)

- dative NP restructuring (Leclère 1995):

(4a) O Pedro foi às trombas do João = ao João (lit: Peter went to_the snouts of/to John; Peter beat John)
(4b) O Pedro foi-lhe (= ao João) às trombas.

but in some sentences with a similar syntactic structure, the reduction to a dative Pro is not possible:

(5) O Pedro foi na cantiga do João / *ao João / *-lhe (lit: Peter went in_the song of John; Peter was persuaded by John's ill-intended words)

Syntactic Properties (continued)

- reduction of the free NP to an oblique pronoun and reduction of de N to a possessive pronoun:

(5) O Pedro foi na cantiga do João
[Pro_Obl] = O Pedro foi na cantiga dele
[Pro_Pos] = O Pedro foi na sua cantiga

but in some cases, the reduction to a possessive is blocked:

(4) O Pedro foi às trombas do João
[Pro_Obl] = ? O Pedro foi às trombas dele
[Pro_Pos] = *? O Pedro foi às suas trombas.
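The way such properties license or block surface variants can be sketched as a small generator. Everything in the block is an illustrative assumption rather than part of the dictionary itself: the function names `contract` and `variants`, the contraction and possessive tables (limited to the forms needed here), and the enclitic placement of -lhe, which is only adequate for simple affirmative main clauses like the examples above.

```python
# Preposition + article contractions needed for the examples (not exhaustive).
CONTRACTIONS = {
    ("a", "o"): "ao", ("a", "a"): "à", ("a", "os"): "aos", ("a", "as"): "às",
    ("de", "o"): "do", ("de", "a"): "da", ("de", "os"): "dos", ("de", "as"): "das",
    ("em", "o"): "no", ("em", "a"): "na", ("em", "os"): "nos", ("em", "as"): "nas",
}
# Possessive form agreeing with the frozen noun, keyed by that noun's article.
POSSESSIVES = {"o": "seu", "a": "sua", "os": "seus", "as": "suas"}


def contract(prep, det):
    """Fuse preposition and article (a + as -> às); fall back to 'prep det'."""
    return CONTRACTIONS.get((prep, det), f"{prep} {det}")


def variants(subject, verb, prep, det, frozen_noun, free_det, free_noun,
             *, dative, possessive):
    """List the base sentence plus the variants licensed by the two flags."""
    frozen_pp = f"{contract(prep, det)} {frozen_noun}"
    forms = [f"{subject} {verb} {frozen_pp} {contract('de', free_det)} {free_noun}"]
    if dative:  # de N = a N: dative restructuring and dative clitic
        forms.append(f"{subject} {verb} {frozen_pp} {contract('a', free_det)} {free_noun}")
        forms.append(f"{subject} {verb}-lhe {frozen_pp}")
    if possessive:  # de N = Poss: reduction of de N to a possessive
        forms.append(f"{subject} {verb} {contract(prep, det)} {POSSESSIVES[det]} {frozen_noun}")
    return forms


trombas = variants("O Pedro", "foi", "a", "as", "trombas", "o", "João",
                   dative=True, possessive=False)
cantiga = variants("O Pedro", "foi", "em", "a", "cantiga", "o", "João",
                   dative=False, possessive=True)
print(trombas)
print(cantiga)
```

With the flags set as in examples (4) and (5), the generator yields the dative forms for ir às trombas but not the possessive one, and the reverse for ir na cantiga. The oblique-pronoun forms (dele) are left out because they are marked only as doubtful, not as a binary property.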

Syntactic Properties (continued)

Other relations: Conversion-like transformation (G. Gross 1989; Baptista 1997)

(6) O Pedro foi às trombas ao João (CP1) (lit: Peter went to the snouts to John; Peter beat John)
= (7) O João apanhou nas trombas do Pedro (CPP) (lit: John got on the snouts from Peter; John was beaten by Peter)

- permutation of the free NPs around the frozen elements;
- replacement of V ir (active) by apanhar (passive);
- change of Prep1

- variants of V similar to those of the support-verb constructions (Vsup Npred) that enter the Conversion transformation (G. Gross 1989; described for European Portuguese by Baptista 1997, among others)

(8) O João levou/comeu nas trombas do Pedro (lit: John took/ate on the snouts from Peter; John was beaten by Peter)

Free vs frozen sentences (Modes of freezing)

In many frozen sentences, V shows the same syntactic structure as in free sentences:

N0 desaparecer de Nloc (to disappear from)
(9) O João desapareceu do mapa (lit: John disappeared from the map; John went away/escaped)

Many sentences, however, show structures with frozen complements unrelated to the basic structure(s) of their free constructions:

(10) Esta aldeia não vem no mapa (lit: This village does not come on the map; it is not very important)

NB: the verb vir (to come) does not accept an em Nloc complement.

Sentences with frozen prepositional complements (CP1, CPN, CPP), in particular, show frozen complements of very diverse syntactic and semantic nature. Often, these have an adverbial-like status:

– locative:
(11) O João veio a terreiro (lit: John came to yard; John went public)

– causative:
(12) O João pagou pela língua (lit: John paid for the tongue; John was punished for saying something)

Classification problems

Complex sentences

– frozen modifiers
(13) O João voltou à vaca fria (CP1) (lit: John returned to the cold cow; John returned to a difficult subject/problem)
*O João voltou à vaca que (era/estava) fria
*/?* A vaca (era/estava) fria

The LG formalism (tables) integrates these frozen modifiers, but other types of complexity can hardly be represented in this fashion.

Classification problems (continued)

– frozen subordinate clauses: relative clauses
O João comeu o pão que o Diabo amassou
/ O João comeu um pão # o Diabo amassou esse pão

Classification problems (continued)

– frozen adverbs
(14) O João nasceu com o rabo virado para a lua (CP1) (lit: John was born with the bottom turned to the moon; John was always very fortunate in everything in his life)
/ O João nasceu # o rabo do João estava virado para a lua

constrained co-reference (João – rabo)

Classification problems (continued)

– constrained co-reference:
(15) A fama subiu à cabeça do Pedro (lit: The fame went up to the head of Peter)
*A fama do João subiu à cabeça do Pedro (lit: The fame of John went up to the head of Peter)

– pseudo-transitive predicative constructions:

(16) O João/A Maria não se deu por achado/a (CP1) (lit: John/Mary did not give him-/herself by found_ms/fs; he/she didn't stop him-/herself from doing something)
*/# O João/A Maria (era + estava) achado/a (John/Mary was found_ms/fs)

(17) O João deu o trabalho/a tarefa por terminado/a (John considered the work_ms/task_fs as finished_ms/fs)
O João considerou o trabalho/a tarefa terminado/a (John considered the work_ms/task_fs as finished_ms/fs)
O trabalho/A tarefa estava terminado/a (The work_ms/task_fs was finished_ms/fs)

Coordinated PPs and/or NPs

(18) O João agradou a gregos e a troianos (CPP) (lit: John pleased to Greeks and Trojans; John pleased everybody)
º O João agradou a gregos
º O João agradou a troianos

In this case, Prep2 can be zeroed:
O João agradou a gregos e troianos

but this is not always the case:

(19) O João e a Maria brincaram aos papás e às mamãs (CPP) (lit: John and Mary wanted to play daddy and mummy; wanted to have sex)
*?O João e a Maria brincaram aos papás e mamãs

Concluding remarks and perspectives

- same methodology and formal criteria as in M. Gross (1982, 1989) and other LG teams
- data comparable to that of other languages
- improve lexical coverage
- attentive to these shoehorn solutions, especially complex frozen sentences
- improve classification towards more homogeneous classes
- experiments on corpora, both for lexical acquisition and for semi-automatically adding information regarding variants

References

Araújo-Vale, Oto, Expressões Cristalizadas do Português do Brasil: Uma Proposta de Tipologia (Ph.D. Thesis). Araraquara (Brazil): UNESP.
Chacoto, Lucília, Estudo e Formalização das Propriedades Léxico-Sintácticas das Expressões Fixas Proverbiais (M.A. Thesis). Lisbon: FLUL.
Fotopoulou, Aggeliki, Une classification des phrases à compléments figés en grec moderne (Ph.D. Thesis). Paris: Univ. Paris 8.
Gaatone, David, A quoi sert la notion d'« expression figée » ?, in Buvet, P.-A., D. le Pesant, M. Mathieu-Colas (eds.), Lexique, Syntaxe et Sémantique, BULAG (hors série), Besançon: Centre Lucien Tesnière/PUFC, pp.
Gross, Gaston, Degré de figement des noms composés. Langages 90. Paris: Larousse, pp.
Gross, Gaston, Les Expressions Figées en Français. Paris: Ophrys.
Gross, Maurice, Une classification des phrases figées du français. Revue Québécoise de Linguistique. Montréal: UQAM, p.
Gross, Maurice, Les nominalisations d'expressions figées. Langue Française 69. Paris: Larousse, pp.
Gross, Maurice, Les limites de la phrase figée. Langages 90. Paris: Larousse, pp.
Gross, Maurice, Les expressions figées : une description des expressions françaises et ses conséquences théoriques. Rapport Technique 8. Paris: LADL-Univ. Paris 7 / CERIL.
Gross, Maurice, Lexicon-Grammar, in K. Brown and J. Miller (eds.), Concise Encyclopedia of Syntactic Theories. Cambridge: Pergamon, pp.
Jurafsky, Daniel and James H. Martin, 2000, Speech and Language Processing. New Jersey: Prentice Hall.
Leclère, Christian, Sur une restructuration dative. Language Research. Seoul: LRI-Seoul National Univ., pp.
Leclère, Christian, Organization of the Lexicon-Grammar of French Verbs. Linguisticae Investigationes 25-1. Amsterdam: John Benjamins Pub. Co., pp.
McKeown, Kathleen R. and Dragomir Radev, 2000, Collocations, in Dale, R., H. Moisl and H. Somers (eds.), Handbook of Natural Language Processing. New York: Marcel Dekker Inc., pp.
Mejri, Salah, Le figement lexical. Description linguistique et structuration sémantique. La Manouba (Tunis): Pub. Fac. Lettres.
Mel'čuk, I., La phraséologie et son rôle dans l'enseignement / apprentissage d'une langue étrangère. ELA, Didier Érudition, pp.
Mello, Fernando R., Nova Recolha de Provérbios Portugueses e Outros Lugares-Comuns (2nd ed.). Lisbon: Ed. Afrodite.
Mogorrón-Huerta, Pedro, La expresividad en las locuciones verbales españolas y francesas. Alicante: Pub. Univ. Alicante.
Moreira, António, Provérbios Portugueses. Lisbon: Ed. Notícias.
Matsumoto, Yuji, Lexical Knowledge Acquisition, in Mitkov, R. (ed.), The Oxford Handbook of Computational Linguistics. Oxford: OUP, pp.
Neves, Orlando, Dicionário de Expressões Correntes (2nd ed.). Lisbon: Ed. Notícias.
Ranchhod, Elisabete, Cristina Mota, Jorge Baptista, A Computational Lexicon for Automatic Text Parsing, Proceedings of SIGLEX99: ACL/NScF, pp.
Ranchhod, Elisabete M., O lugar das expressões fixas na gramática do Português, in Castro, I. and I. Duarte (eds.), Razão e Emoção, vol. II. Lisbon: INCM, pp.
Santos, António, Novos Dicionários de Expressões Idiomáticas. Lisbon: João Sá da Costa.
Silberztein, Max, Dictionnaires électroniques et analyse automatique de textes : le système Intex. Paris: Masson.
Silberztein, Max, Intex Manual.
Simões, Guilherme A., Dicionário de Expressões Populares Portuguesas. Lisbon: D. Quixote.

Acknowledgement: Fundação Calouste Gulbenkian.

