Sunday, 2 November 2014

Gaulish Stress

This is a re-posting of a comment on Languagehat's blog. In a thread that somehow drifted from register differences in Japanese to initial mutations and stress in Celtic (one of the many charms of discussions on that blog), I posted some half-remembered information from Pierre-Yves Lambert, "La Langue Gauloise", Éditions Errance, Paris, 2003, and promised to check and post what I found. So here are (1) the relevant paragraphs in French (from p. 48), (2) an English translation (my own, so please point out any mistakes), and (3) a few comments.


(1) L’accent en gaulois
Nous n’avons pas beaucoup d’indices sur l’accent gaulois; quelques formes du latin de Gaule ont un comportement spécial. On a depuis longtemps relevé les deux traitements que présentent les noms des cités gauloises, un accent antépénultième donne Rennes, Bourges, et l’accent pénultième donne Redon, Berry.
Bitúriges > Bourges
Bituríges > Berry
Ainsi Nemausus donne (accent pénultième) Nemours, mais avec accent sur l’antépénultième, Nîmes; Condate donne Condes ou Condé, Arelate donne Arles ou Arlet.
Autres exemples de formes accentuées sur l’antépénultième: Caturiges > Chorges, Cambo-ritum (« le gué courbé ») Chambord, Eburovices Evreux, Durocasses Dreux, Bodiocasses Bayeux…    
En fait, les formes avec accent antépénultième ne sont pas celles qui posent problème : on en avait aussi en latin et même en latin tardif (ex. : hóminem > homme). Le problème est de savoir pourquoi certains de ces mots sont devenus accentués sur la pénultième, avec allongement de la voyelle pénultième (Cóndate > Condáte > Condāte > Condé). Il n’est pas sûr que le phénomène remonte vraiment au gaulois : cela peut être dû à des disparités socio-linguistiques dans la société gallo-romaine.

(2) Stress in Gaulish
We don’t have many clues about Gaulish stress; some forms of the Latin of Gaul behave in a special way. The two treatments shown by the names of Gaulish cities have long been noted: antepenultimate stress gives Rennes, Bourges, and penultimate stress gives Redon, Berry.
Bitúriges > Bourges
Bituríges > Berry
Thus Nemausus gives (penultimate stress) Nemours, but with stress on the antepenultimate, Nîmes; Condate gives Condes or Condé, Arelate gives Arles or Arlet.
Other examples of forms stressed on the antepenultimate: Caturiges > Chorges, Cambo-ritum (“the curved ford”) Chambord, Eburovices Evreux, Durocasses Dreux, Bodiocasses Bayeux…    
Actually, the forms with antepenultimate stress are not the ones that pose a problem: such forms also existed in Latin and even in Late Latin (e.g. hóminem > homme). The problem is to know why some of these words came to be stressed on the penultimate, with lengthening of the penultimate vowel (Cóndate > Condáte > Condāte > Condé). It is not certain that the phenomenon really goes back to Gaulish: it may be due to socio-linguistic disparities in Gallo-Roman society.

(3) Comments:
While it is certainly true that Latin knew stress on the antepenultimate, this was only the case for words where the penultimate syllable was light, i.e. contained a short vowel and was not closed by a consonant. As vowel length was not normally indicated in Latin writing, we partially have to rely on the stress indicated by the modern forms of the names or on etymology to establish Gaulish vowel length; relying on the modern stress can then become a circular argument. But there can be little doubt that names like Nemausus or the names in –casses ought to have penultimate stress in accordance with Latin rules, so the antepenultimate stress indicated by some of the modern French names needs to be explained. I’ve also generally seen the “i” in –riges described as long; again, that would demand penultimate stress according to Latin rules. Of course, as the author states, the indicated stress may not be the Gaulish stress but may be due to some difference between Gallo-Roman and Standard Latin; that we’ll probably never know.
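The Latin rule referred to here can be written down in a few lines. What follows is only a minimal sketch: the function names, the syllable divisions, and the decision to pass the weight of the penultimate in by hand are my own assumptions for illustration, not anything taken from Lambert. The rule is the usual one for Classical Latin: words of one or two syllables are stressed on the first syllable; longer words are stressed on the penultimate if it is heavy (long vowel, diphthong, or closed by a consonant), otherwise on the antepenultimate.

```python
def latin_stress_index(syllables, penult_is_heavy):
    """Return the index of the stressed syllable under the Latin penultimate rule.

    `syllables` is a list of syllable strings (the divisions are supplied by the
    caller and are an assumption of this sketch). `penult_is_heavy` says whether
    the second-to-last syllable has a long vowel, a diphthong, or a closing
    consonant; it is ignored for words of one or two syllables.
    """
    n = len(syllables)
    if n == 0:
        raise ValueError("a word needs at least one syllable")
    if n <= 2:
        return 0  # short words: stress on the first syllable
    # heavy penultimate attracts the stress, otherwise it falls one syllable earlier
    return n - 2 if penult_is_heavy else n - 3


def mark_stress(syllables, penult_is_heavy):
    """Return the word with the stressed syllable in capitals, for display."""
    stressed = latin_stress_index(syllables, penult_is_heavy)
    return "-".join(
        syl.upper() if i == stressed else syl for i, syl in enumerate(syllables)
    )


if __name__ == "__main__":
    # Light penultimate: antepenultimate stress, as in hóminem.
    print(mark_stress(["ho", "mi", "nem"], penult_is_heavy=False))
    # Diphthong or closed syllable in the penultimate: the rule predicts penultimate stress.
    print(mark_stress(["Ne", "mau", "sus"], penult_is_heavy=True))
    print(mark_stress(["Du", "ro", "cas", "ses"], penult_is_heavy=True))
    # -riges: the prediction depends entirely on the assumed length of the i.
    print(mark_stress(["Bi", "tu", "ri", "ges"], penult_is_heavy=True))
    print(mark_stress(["Bi", "tu", "ri", "ges"], penult_is_heavy=False))
```

Run on Bituriges with a long and with a short i, the sketch reproduces the two stress patterns Lambert gives for Berry and Bourges. For Nemausus and the names in –casses it can only ever return penultimate stress, which is exactly why forms like Nîmes, Dreux and Bayeux need an explanation.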
I originally posted in response to the statement that Celtic had word-initial stress. That opinion has a good pedigree but is supported by surprisingly little evidence: the only Celtic branch that clearly has initial stress is Goidelic. Brythonic had penultimate stress, and for Continental Celtic we only have the clues for Gaulish quoted above, which point to antepenultimate stress. Forms like Némausus and Cóndate could also indicate word-initial stress, but most of the forms with more than three syllables rule that out. The only exception is Arelate, where forms like Arles actually indicate not antepenultimate but initial (or ante-antepenultimate) stress, something that Lambert seems not to notice. So perhaps Gaulish didn’t have a fixed stress, or prefixed nouns (prefix are-) behaved differently?

Maltese Souvenir

This summer, I had a few days' vacation on Malta and bought a few books as souvenirs. One of them was “Maltese and Other Languages. A Linguistic History of Malta” by Joseph M. Brincat (Midsea Books, Malta, 2011). As the title says, the book explores what languages were spoken on Malta throughout its history, and the influences these languages had on Maltese. By the author’s own admission, it is not a historical grammar of Maltese – it mostly concentrates on the influence that other languages (mainly Sicilian varieties of Italian, Standard Italian, and English) had on the lexicon, only rarely touching on syntactic issues. This is not supposed to be a review – I’ll just quote a few facts that were new to me as an amateur interested in languages, and that may be interesting to the readers of this blog as well:

  1. We do not really know what languages were spoken by the general populace of Malta before it became Arabic-speaking in the Middle Ages. Malta is geographically close to Sicily, has frequently been influenced from Sicily in historical times (even its Arabic settlement may have come from Sicily, not from Northern Africa), and during many periods – but not through all archeological periods – shows ties to Sicily in its material culture. It can therefore be assumed that often the same languages were spoken on Malta as on Sicily, but that only replaces an unknown by a not-very-well-known.
  2. As inscriptions have been found on Malta, we know that Punic, Greek, and Latin were known at least to some people there, but we do not know whether these were the languages of the general populace or only of a small elite. It is not even certain that Malta became fully Romanized during the Roman Empire, as there are references to Maltese as “barbari” from early Christian times, implying that they did not speak Latin or Greek.
  3. The available sources seem to indicate that Malta was uninhabited or only sparsely populated by a few bee-keepers etc. during the 7th – 8th  centuries, the period of Islamic conquest, when constant pirate raids threatened the inhabitants of small islands. Therefore, it’s doubtful whether the current inhabitants of Malta have any genetic (not to speak of linguistic) continuity with the pre-Arabic inhabitants of the island.
  4. Nevertheless, such continuity is part of some national myths. One myth is Malta as a “Christian nation baptized by St. Paul” (while actually modern Maltese most likely are the descendants of Arabic-speaking Muslim settlers who were converted after the Norman conquest). The other myth is one that Malta shares with Lebanon, another nation of Arabic-speaking Christians: the myth that Maltese (Lebanese) is not Arabic, but a descendant of Phoenician. Linguistically, these myths have no basis, but they are easily explained as an attempt to find an alternative Semitic ancestry instead of Arabic, an ancestry that is not “tainted” by its association with Islam.
  5. After the Norman conquest, Malta came under strong Italian influence, first from Sicilian (11th – 16th century) and then from Standard Italian (from the 16th century). The Sicilian influence is still visible in the form in which Italian words and even modern Latin-based internationalisms are borrowed into Maltese – the suffixes show Sicilian forms and, like Sicilian, Maltese accepts only /a/, /i/, /u/ as final vowels in such loans.
  6. During the 19th and early 20th century, there was a three-way linguistic battle between Italian, English, and Maltese for dominance on Malta. The local elites favored Italian, while the British administration tried to promote English, especially after the unification of Italy, when they feared irredentist currents on Malta, and in the 1920s and 30s, when the supporters of Italian were suspected of Fascist sympathies. Maltese initially was promoted only by a few intrepid writers, until the British administrators discovered its promotion as a weapon in the fight against the Italian-speaking native Maltese elites.
  7. As a result of these influences, Maltese is curiously similar to English in that it has an inherited grammar and basic vocabulary (Arabic in the case of Maltese, Germanic in the case of English), but most of its culture words are loaned (from Italian and English in the case of Maltese, from French and Latin in the case of English). The Arabic character of the basic lexicon of Maltese is shown by a Swadesh list Brincat publishes (p. 398), where he marks only four words as being of Italian origin (he writes “seven words”, but lists only four): persuna “person”, muntanja “mountain”, tond “round”, and qarn “horn” – and of these four, qarn surely is Arabic as well and not from Italian corno. I’ll discuss this Swadesh list in a separate post.
  8. The book discusses the fate of Arabic on Sicily (where it left a sizeable number of loans in the dialect) and on Pantelleria, where the populace seemingly switched from heavily Italian-influenced Arabic (as spoken on Malta) to a heavily Arabic-influenced Italian a couple of centuries ago. One could speculate that the same might have happened on Malta if it had returned to the Kingdom of Sicily after Napoleon expelled the Maltese order, instead of being occupied by Britain.
  9. Compared to Maghreb Arabic, Maltese exhibits some “Eastern” traits and some traits (like a development of /a/ to /i:/) that it shares with (extinct) Andalusian Arabic. Brincat seems to suggest that this is due to changes in Maghreb Arabic that were caused by a second inflow of Bedouins in the Middle Ages, i.e., that Maltese has retained traits that were typical for Western Arabic (including Andalusian and Sicilian) before that inflow.
There’s much more – detailed discussions of various parts of the Maltese lexicon, the role of the Maltese order in the linguistic history of the island, the use of various languages in publications in the 19th and 20th century, results of various socio-linguistic studies done by the author (whose main research interests clearly are Italian dialectology and Maltese socio-linguistics) and his students. I enjoyed reading it very much. 

Saturday, 27 September 2014

Beech Reading

NIL 2-4 treats the PIE etymon *bhah2g-ó- "beech". They mention that some scholars reconstruct long /a:/ and that some (not always the same) scholars link it to the previously discussed *bhag-. In general I don't see any reasonable link between a tree name and a root meaning "share" etc. But there is a possible connection for the Germanic cognates meaning "book, letter".
NIL also mentions (FN 2 on p. 3) that there are doubts about the relationship between the words meaning "book, letter" and the continuants of this root meaning "beech". The main formal problem is that the words meaning "letter / rune" seem to go back to a root noun that is actually attested in Old English, while the beech words are eh2 stems (or n-stems derived from them). So we have derived morphology for what is supposed to be the original meaning ("beech") and a root noun for what is supposed to be the derived meaning ("book, letter").
Elmar Seebold, addressing this issue in his "Etymologie: eine Einführung am Beispiel der deutschen Sprache" (München: Beck, 1981; quoted hereafter as "Et."), also mentions that the writing on beech tablets that is supposed to be behind the change of meaning from "beech" to "letter" is actually not attested, neither archaeologically nor in written sources; Germanic runes are attested only on bark, stone, and various household objects (Et., pp. 290-291). It is also clear that the original meaning was "letter", not "book": the oldest attestations mean "letter" in the singular and "document, book" in the plural, an obvious calque from Latin (littera - litterae) and Greek (gramma - grammata) (Et. p. 290). Seebold argues that the meaning "letter" is derived from the compound Norse bókstafr, Old Saxon bōkstaf, OHG. buohstap, whose second member means "staff". The writing of runes on staffs is widely attested (here Seebold is undermining his previous argument somewhat, as these staffs could of course have been made of beech wood, but the written source he quotes actually mentions ash wood).
He then adduces a parallel from Welsh, coelbren "sign-wood, lot-wood", composed of coel "sign, omen" and bren "wood", designating a piece of wood covered with signs and used for casting lots, a custom also attested for the Germanic peoples. He takes this parallel as an indication that the first element of bókstafr etc. originally meant "sign, omen, lot", and links it with our old acquaintance, the root *bhag-, reconstructing a Proto-Germanic root noun *bōk-s "lot, portion" (Et. pp. 291-291). That would mean that the word family of "book" is not related to "beech", and that the purported writing on beech tablets is only a folk etymology. (A shorter version of these arguments can also be found in Kluge(-Seebold), "Etymologisches Wörterbuch der deutschen Sprache", s.v. Buch).
In general, I find this argumentation attractive. A formal problem is that the proposed root noun is attested only in Vedic, as the second part of compounds, with an active meaning "enjoying" (NIL p. 1), but not as an independent word with a resultative meaning "allotment, lot", and that it would be the only continuation of *bhag- in Germanic. The same problems would also arise if, as I proposed, we eliminate the root *bhag- and take its purported continuations as derived from the root *bheg- "break"; that root is not continued in Germanic either (at least according to LIV p. 66/67 and NIL p. 6; the forms with nasal infix mentioned in IEW p. 115 look onomatopoetic, for which reason Pokorny himself states that they don't belong to *bheg-). There is also no root noun formed from *bheg- attested in NIL (p. 6); if we eliminate *bhag-, we would have at least the Vedic root noun mentioned above, but still with the same problems. On the other hand, an isolated root noun from a root that otherwise doesn't have any cognates in Germanic is a perfect candidate for the kind of folk etymology discussed.

Saturday, 29 March 2014

PIE *bhag- and Armenian bak


This is a follow-up on my thoughts on PIE *bhag-. I’ve come across an article by Hrach Martirosyan (“The place of Armenian in the Indo-European Language family: the relationship with Greek and Indo-Iranian”, Journal of Language Relationship, No. 10 / August 2013, p. 85 - 138), where he adduces Armenian bak “courtyard; sheep pen; sun or moon halo” (missing in NIL) as a cognate of Indo-Iranian *bha:gá-: Sanskrit bha:gá- m. “prosperity, good fortune, property, personified distribution”, Old Avestan ba:ga- “part”, the descendants of which took on the meaning “landed property, fief, garden” (p. 99, §5.1.3). Martirosyan admits the possibility that this is not a cognate, but an old loan from Iranian. He names one argument for it being a loan, namely the fact that the Armenian word is an a-stem, while the Indo-Iranian correspondences are o-stems; incorporation as an a-stem seems to be the expected outcome for an Iranian *ba:ga-. As another argument for a loan I would also count the fact that there seem to be no other formations from a root *bhag- in Armenian. On the other hand, it would have to be an old loan from before the Armenian consonant shift, but Martirosyan admits that there are other such loans.
If this is not a loan, but a cognate, it would require a proto-form *ba:g-a:-, which could be explained as a Vrddhi-formation from *bhag- or point to a PIE *bheH2g-eH2- (Martirosyan’s reconstruction). Therefore, accepting bak as a cognate would in any case require us to posit a PIE root *bhag- or *bheH2g- separate from *bheg- “break” (continuants of the latter root are well-attested in Armenian).

Saturday, 28 December 2013

Thoughts on PIE *bhag-


In my haul of presents this year there was a copy of NIL, so I embarked on reading it root-by-root. The first one is *bhag (NIL 1-2), and looking at the evidence for nominal derivations listed, I got a few ideas, which I’ll share below.

1)    The root has abundant nominal derivation only in two families, Indo-Iranian and Greek. These are the same families where, according to LIV 65, verbs formed from said root are attested. Interestingly, there are no matching derivations shared by both Indo-Iranian and Greek, except the o-stem *bhago- (m.): Sanskrit bhaga- “wealth”, Iranian baga- “god, allotment”, Greek phagos “eater” (originally only found as last element of compounds).

2)    Outside these families, the only attested formation is the above-mentioned o-stem *bhago- (m.), found in Slavic bogъ “god” and the adjective compounds nebog- (and ubog-, not mentioned in NIL) meaning “poor”, and in Tocharian B pa:ke, A pa:k “share”. Slavic also has a secondary derivation bogat- from *bhago-, formed with the productive suffix *-eH2to-. On the surface, therefore, we have three branches (Indo-Iranian, Slavic, Tocharian) showing a meaning “share, allotment, wealth”, and one branch (Greek) showing a meaning “eat”. Both NIL and LIV, following IEW and the communis opinio, take the meaning “share” to be the basic one and the Greek meaning to be a later development.

3)    According to footnote 1 in LIV, the Tocharian cognates are the main reason for positing *bhag, not **bheg with a schwa secundum, as the source for Greek ephagon ("ate" - suppletive aorist to esthio: "eat"). But as per footnote 8 in NIL, at least Adams in his Dictionary of Tocharian B classifies pa:ke as an Iranian loanword due to it having a plural in -nt-. Now, as NIL states in footnote 6, it is widely assumed that Slavic bogъ borrowed the meaning “god” from Iranian. But it is also possible that the word itself with all its meanings is a loan from Iranian; after all, both meanings “god” and “wealth, allotment” are present in Iranian as well. The sound laws of Slavic don’t allow us to decide between loaned and inherited. But the fact that there are no old verbal formations based on bog- in Slavic and the absence of any cognates in Baltic, together with the same dual semantics as in Indo-Iranian, speak, in my opinion, for bog- being a loan, not a cognate, in Slavic.

4)    In that case, the Tocharian forms could not be used as evidence for the existence of  /a/ as the root vowel. And instead of a three-to-one preponderance for the meaning “share”, we would have two different meanings in two different branches, as the Tocharian and Slavic correspondences to the Indo-Iranian formation, being due to loaning, not inheritance, should not be taken into account for reconstructing the original meaning.

5)    If, accordingly, there is no need to reconstruct a root containing /a/, it is possible to trace both the Greek and the Indo-Iranian words back to the root *bheg “break” (LIV 66 / IEW 114-115 / NIL 6). The development “break” to “share out” in Indo-Iranian is straightforward. In the verbal system of Indo-Iranian, we would have a neat case where the meaning “break” became associated with the nasal present which, as in Baltic, was spread also to the non-present stems (at least in Vedic), while the non-nasal forms took on the meaning “share”. In Greek, the meaning changed from “break” to “eat”, either via the idea of sharing food or via the idea of cutting / chewing it; in any case, the assumed development is no more tortuous than the assumed development “share” > “eat”. In Greek, the family of phag- would seem to be the sole continuant of *bheg.

6)    In summary, it is possible to eliminate the root *bhag “share” from the reconstruction of Indo-European, if one assumes that the Slavic and Tocharian cognates are actually loans from Iranian and that the Indo-Iranian and Greek cognates actually continue *bheg “break”.

I can’t say whether anything of the above is truly original, as I don’t have the means and the time to chase up even the references mentioned in NIL and LIV in order to see whether these thoughts have been discussed before. But as neither NIL nor LIV even mentions such a possibility, I’d appreciate it if readers could tell me whether this has been addressed before and point out any flaws in my reasoning.

Friday, 4 October 2013

Pullum on the world roles of English and Chinese

Over at Language Log, Geoffrey Pullum observes that English is currently the world's lingua franca (obviously correct) and argues that Chinese will not become the world's lingua franca, "Not in fifty years, and perhaps not ever." His reasons?
First, there is no such thing as the Chinese language: Chinese is a language family, and there are far fewer people who are fluent in the politically dominant member, Mandarin, than the Chinese authorities would like you to think. Second, the Chinese languages share a writing system that is simply not fit for purpose: taking years to learn, and incredibly hard to adapt to many purposes, it is holding China's progress back by many decades. And third, nowhere in the world is there a country outside China where Chinese is used by non-Chinese to communicate with other non-Chinese.

Yeah, right. The existence of language varieties that are not mutually intelligible (Scots, anyone?) and a crazy orthography sure have prevented the rise of English. Yes, it won't happen in the next fifty years, but Pullum's stance (although, of course, not his reasoning) looks a bit like that of an 18th-century Frenchman regarding the possibility that the language of that rising merchant power from the neighbouring island would ever be able to challenge the dominance of French as the lingua franca of the civilised world. I doubt that orthography, writing systems, or the existence of non-standard varieties play any role in determining whether a language attains the status of lingua franca - it's all about the political, commercial, and cultural influence of its speakers. It's quite possible that China will never reach the degree of political, commercial, and cultural influence that today's English-speaking nations (first and foremost the U.S.) have, but that (and not the writing system or whether Putonghua can crowd out the other Sinitic languages) will determine the status of Chinese in the future.

Monday, 3 June 2013

Longest Word in German Abolished

Maybe I ought to have added a few exclamation marks and used a bigger font. Maybe I ought to have added a few titillating pics. After all, I just used a misleading headline in order to draw attention to this post. The rest of the post will try to be accurate, I promise.
Der Spiegel reports that the longest word in German that is actually in use has been "retired". What actually happened (and, as with this post, you learn that when you read the article) is that lawmakers in the state of Mecklenburg-Vorpommern abrogated the Rinderkennzeichnungs- und Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz ("Law on the transfer of responsibilities for supervision of the marking of cows and the labelling of beef"). The second part of this (Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz) was on record as the longest German word actually in current use. German has the ability to form theoretically endless compounds, which are also pronounced and written as one word. So it's easy to make up crazily long compounds to illustrate this, but the ones in actual use are normally quite short; even the longer ones normally contain only three to four words. Longer compounds are mostly found in legal, scientific, and technical terms. The longest one that's sufficiently frequent to be recorded in the Duden (the standard and standard-setting dictionary for German) is Kraftfahrzeug-Haftpflichtversicherung ("third-party motor insurance"). It combines a three-element compound, Kraftfahrzeug ("motor vehicle"), that nobody uses outside of technicalese and legalese (in everyday language, you'd say Auto or Wagen "car" or Fahrzeug "vehicle", or shorten it to Kfz), with a four-element compound, Haftpflichtversicherung "third-party insurance", that in everyday language is often shortened to Haftpflicht (which, by itself, normally means "third-party liability").
Now, the article states that the law in question had been the longest compound in actual use since the Grundstücksverkehrsgenehmigungszuständigkeitsübertragungsverordnung ("Ordinance on the transfer of responsibilities for the approval of real-estate transactions") was abrogated in 2007 (in the article, length is measured by the number of letters). That reasoning is a bit curious - after all, a law can be referred to even after it is abrogated. The word can still be found in various collections of legal acts. And there's a meta-discussion out there about long German compounds where it's bandied about (going by the first four pages of a Google search I did, most attestations of the word are in discussions about long words and compounds, not in legal contexts). So the G-word is still being used, and the same reasoning is valid for Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz. To argue that a word is not used anymore because the thing it describes is not used anymore is a curious mix-up between signifier and signified. That notwithstanding, the article is worth a read.
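For anyone who wants to check the ranking by letter count, here is a minimal sketch. The segmentation of each compound into its parts is my own and is only there to make the spelling easy to verify; the helper names are made up for this example. Hyphens are not counted as letters, and the text is normalized so that an umlaut counts as one letter regardless of how it is encoded.

```python
import unicodedata


def letter_count(word):
    """Count letters only, treating each umlaut as a single letter."""
    # NFC normalization merges a base letter and a combining diaeresis into one character.
    normalized = unicodedata.normalize("NFC", word)
    return sum(1 for ch in normalized if ch.isalpha())


def compound(*parts):
    """Join the parts into one written word, as German orthography does."""
    return "".join(parts)


# Segmentations below are an assumption made for readability, not taken from the article.
R_WORD = compound("Rindfleisch", "etikettierungs", "überwachungs",
                  "aufgaben", "übertragungs", "gesetz")
G_WORD = compound("Grundstücks", "verkehrs", "genehmigungs",
                  "zuständigkeits", "übertragungs", "verordnung")
DUDEN_WORD = "Kraftfahrzeug-Haftpflichtversicherung"

if __name__ == "__main__":
    # Print the three words from longest to shortest.
    for word in sorted([R_WORD, G_WORD, DUDEN_WORD], key=letter_count, reverse=True):
        print(letter_count(word), word)
```

By this count the G-word is indeed the longer of the two laws, which fits the article's claim that the R-word only took over the title in 2007.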