Encyclopedia of Language and Linguistics

March 30, 2017 | Author: Anaclara Castro | Category: N/A
Volume 7 Lebanon Meinong

Lebanon: Language Situation
Editorial Team
© 2006 Elsevier Ltd. All rights reserved.

Recorded history in Lebanon goes back almost 5000 years to the times of the Phoenician seafaring traders. Arabic was introduced into Lebanon with the Arab conquest in 634–636 A.D. Since the Middle Ages, Lebanon has acquired a reputation for comparatively tolerant politics and has become home to a number of smaller religious and ethnic groups of the area. After the end of the Ottoman Empire, Lebanon was under French mandate from 1920 until it gained independence in 1943. From 1975 until 1991, the country suffered from a civil war, which also involved the neighboring countries, Israel and Syria, and the Palestine Liberation Organization. Since the end of the civil war, the political system has addressed the representation of the interests of different ethnic and religious groups – including Sunni and Shia Muslims, Druze, and various Christian sects – in the country.

The official language of Lebanon is Arabic, which is also the native language of the large majority of its 3.7 million inhabitants. As in other countries of the Arab world, there are two types of Arabic in Lebanon in a diglossic relation: Modern Standard Arabic, which is used in formal contexts, in the media, and in writing, and Levantine Arabic (South Levantine Spoken Arabic) dialects, which are used for spoken and everyday communication. The adult literacy rate in 1990 was 80.3%. In addition to Arabic, about 234 600 Lebanese (or 6% of the population) speak Armenian, for which an active printing and publishing industry exists. Aramaic (Assyrian and Chaldean Neo-Aramaic) is only used as a religious language in Lebanon, although it is still found as a spoken language in neighboring Syria. Because of Lebanon’s long tradition as a center for international trading and commerce, French and, more recently, English enjoy a high status in Lebanon and are comparatively widely known. Contact with foreign languages is reinforced through large Lebanese communities living abroad, especially in France, the United States, and South America.

See also: Arabic; Armenian; Syria: Language Situation.

Legal Genres
V K Bhatia, City University of Hong Kong, Hong Kong
© 2006 Elsevier Ltd. All rights reserved.

“Cognitio verborum prior est, cognitio rerum potior est,” said Erasmus. If the knowledge of law is more ‘potent,’ the knowledge of language is ‘prior.’ Language undoubtedly plays an important role in the construction, interpretation, negotiation, and implementation of legal justice. It is through a variety of legal genres that an attempt is made to create and maintain a model world of rights and obligations, permissions and prohibitions. In principle, this model world is designed to be consistent with the vision that individual states or nations have of the society they wish to create; however, in practice, it is often constrained by the sociopolitical realities of individual national cultures. In order to regulate the real world of human behavior whenever it is viewed as inconsistent with the model world, these rules and regulations are judiciously interpreted and applied through a system of courts to negotiate and invariably enforce desired behavior. The so-called model world is thus created by imposing rights and obligations, permissions and prohibitions through legislation, and this, in most Western democratic systems, is seen as the will of the elected representatives of the people in the parliament. However, Bhatia (1993: 102) points out:

As legal draftsmen are well aware of the age-old human capacity to wriggle out of obligations and to stretch rights to unexpected limits, in order to guard against such eventualities, they attempt to define their model world of obligations and rights, permissions and prohibitions as precisely, clearly and unambiguously as linguistic resources permit. Another factor that further complicates their task is the fact that they deal with a universe of human behavior, which is unrestricted, in the sense that it is impossible to predict exactly what may happen within it. Nevertheless, they attempt to refer to every conceivable contingency within their model world and this gives their writing its second key characteristic of being all-inclusive.

This view of law and legal justice gives legislation the status of primary legal genre, which is essentially written “with mathematical precision, the object (though not always attained) being, in effect, to provide a complete answer to virtually every question that can arise” (Sir Charles Davis, quoted in Renton, 1975). It is hardly surprising that this legal genre has its own unique integrity and identity, often characterized by its use of a number of lexico-grammatical and discoursal resources that are rarely used in any other disciplinary or professional genre. As documented in Bhatia (1982, 1993), this genre is characterized by the use of a complex array of qualifications, often strategically positioned at syntactic points where they are unlikely to attract any ambiguous or unintended interpretation. The Renton committee report (1975) also emphasizes this aspect of legislative genre:

Ordinary language relies upon the good offices of the reader to fill in omissions and give the sense intended to words or expressions capable of more than one meaning. It can afford to do this. In legal writing, on the other hand, not least in statutory writing, a primary objective is certainty of legal effect . . . Parliament seeks to leave as little as possible to inference and to use words which are capable of one meaning only.

The use of qualifying expressions is thus a pervasive phenomenon in legislation. They are more central to the rhetorical structure of legislative sentences and thus form the basis of the underlying ‘cognitive structuring’ (Bhatia, 1982). They seem to provide the essential flesh to the main proposition in the legislative provision, without which it will be like a mere skeleton of relatively modest legal significance. Qualifications thus form an important part of the linguistic repertoire of the legal draftsman. This may also be due to the fact that qualifications, particularly (complex) prepositional phrases and various subordinate clauses, are syntactically highly mobile, and the legal draftsman tends to take full advantage of their mobility to insert them at various syntactic positions to achieve the desired level of precision and unambiguity.

Precision, especially as a function of adequate specification of legal intentions, is achieved through the use of a variety of legal qualifications (Bhatia, 1982), some of which are used to describe the case(s) to which a particular legislative provision applies. Others are used to impose conditions on the application and implications of the provision, in particular when it comes to the resolution of potential or real conflicts between different legislative acts (see Bhatia, 1982 for details). Qualifications thus form the basis of the underlying cognitive structuring in legislative sentences. They seem to provide the basis for legal specification as emphasized by Caldwell, an experienced parliamentary counsel in the United Kingdom:

. . . if you extract the bare bones . . . what you end up with is a proposition which is so untrue because qualifications actually negative it all . . . it’s so far from the truth . . . it’s like saying all red-headed people are to be executed on Monday, but when you actually read all the qualifications, you find that only one per cent of them are . . . (qtd. in Bhatia, 1982)

To illustrate this aspect of legislative provision, let me consider the following example:

If in the course of winding up a company it appears that any person who has taken part in the formation or promotion of the company, or any past or present officer or liquidator or receiver of the company, has misapplied or retained or become liable or accountable for any money or property of the company, or been guilty of any misfeasance or breach of duty in relation to the company which is actionable at the suit of the company, the court may, on the application of the Official Receiver, or of the liquidator, or of any creditor or contributor, examine into the conduct of the promoter, officer, liquidator or receiver, and compel him to repay or restore the money or property or any part thereof respectively with interest at such rates the court thinks just, or to contribute such sum to the assets of the company by way of compensation in respect of the misapplication, retainer, misfeasance, or breach of trust as the court thinks just. (Hong Kong SAR, 1997: Companies Ordinance (Section 276))

If we ignore the qualifications in this provision, we find the court being assigned unlimited powers to “examine into the conduct of the promoter, officer, liquidator or receiver, and compel him to repay or restore the money or property or any part thereof . . . to the assets of the company.” However, the moment we take into account all the qualifications that have been positioned at various syntactic points, we notice that the court’s powers are not so widely applicable. Firstly, the court can take action only when a specific number of conditions are met within the context of a case description, that is, the winding up of a company; and then there are numerous conditions on its application, such as the involvement and participation of specific descriptions of people in the formation or promotion of the company; also, their actions must be disadvantageous to the company. If all these conditions are met, then the court can exercise the power to examine into the conduct of these persons and compel them to compensate the company. What seemed an unlimited power of the court without the qualifications turns out to be a rather modest authority to examine the conduct of a few officials of the company under specific conditions. Such is the role and function of qualifications in the legislative genre.

Precision is also achieved through the use of nominalized expressions, which are often formed by converting verbal expressions into nominalizations, as underlined in the following example from the Wills Act of the Republic of Singapore:

No obliteration, interlineation or other alteration made in any will after the execution thereof shall be valid or have effect except so far as the words or effect of the will before such alteration shall not be apparent, unless such alteration shall be executed in like manner as hereinbefore is required for the execution of the will; but the will, with such alteration as part thereof, shall be deemed to be duly executed if the signature of the testator and the subscription of the witnesses be made in the margin or on some other part of the will opposite or near to such alteration or at the foot or end of or opposite to a memorandum referring to such alteration and written at the end or some other part of the will. (Government of Singapore, 1970: Section 16 (underlining added))

As each rule of law is essentially encoded in a single-sentence legislative provision, it often contains repetitions of actions and concepts. For the sake of avoiding any potential ambiguity of reference, nominalizations are seen as a useful syntactic resource to facilitate precision in such rhetorical contexts. Precision and clarity are also achieved through the use of “complex prepositional phrases” (Quirk et al., 1982: 302), such as in pursuance of, in accordance with, by virtue of, for the purpose of, and several other combinations of preposition-noun-preposition (P-N-P). Legal draftsmen are particularly suspicious of simple prepositions, as they find them potentially ambiguous in meaning, and hence often go for complex prepositions, many of which are rarely used in any other variety of professional discourse.

In addition to precision and unambiguity, legislative genre is notorious for being all-inclusive as well. This quality of all-inclusiveness is often achieved through the use of what is referred to as binomials and multinomials, as illustrated in the following example from the U.K. Litter Act of 1958:

If any person throws down, drops or otherwise deposits in, into or from any place in the open air to which the public are entitled or permitted to have access without payment, and leaves, any thing whatsoever in such circumstances as to cause, contribute to, or tend to lead to, the defacement by litter of any place in the open air, then, unless that depositing and leaving was authorised by law or was done with the consent of the owner, occupier or other person or authority having the control of the place in or into which that thing was deposited, he shall be guilty of an offence and be liable on summary conviction to a fine not exceeding ten pounds; and for the purpose of this subsection any covered place open to the air on at least one side and available for public use shall be treated as being a place in the open air. (Litter Act, 1958 (underlining added))

This short paragraph is interesting in two rather different ways. Through the use of multinomials (Gustafsson, 1984) such as ‘throws down, drops or otherwise deposits,’ ‘in, into or from,’ ‘entitled or permitted,’ and ‘to cause, contribute to, or tend to lead,’ the draftsman makes it possible for anyone to be guilty of littering at least in 72 different ways. The use of multinomials brings not only precision but also all-inclusiveness in the specification of legal scope. The other device that makes such a construction interesting is the use of all-inclusive expressions of a different kind, such as ‘any thing whatsoever,’ ‘otherwise deposits,’ and ‘any covered place.’ Although these expressions are often viewed as indicators of vagueness and hence lacking in precision in a traditional sense, they bring all-inclusiveness of a different kind to the provision. The legislative genre thus appears to be extremely versatile, using not only precision and all-inclusiveness, but also vagueness and indeterminacy for the required specification of legal intentions.

Legislative genre is also unique in yet another way, that is, it has a typical cognitive structuring that can be characterized by the following rhetorical structure:

If X, then Y shall do Z.

where X is the description of the case to which the provision applies; Y is the legal subject who is given the power or authority to act, or is prohibited from acting in one way or the other; and Z is the legal action that is either permitted, authorized, or prohibited. This can be illustrated by the following example.

If a police officer fails to follow the proper procedures stated in this section 1 above, (X) then the person affected by this failure (Y) shall apply to the court for the protection of his interests. (Z)

Legislative genre thus has a 2-part rhetorical structure consisting of a case description and the legislative provisionary action, which includes both the legal subject and the legal action. This form of rhetorical structuring linking case description to specific legislative action also makes an interesting link with the other set of legal genres such as judgments and cases, which are viewed as applications of legislative intentions to interpret and change, wherever necessary, the real world in line with the ideal world as envisioned through legislative procedures. These secondary genres are also seen as illustrations of professional reasoning and use of appropriate authorities either in favor of or against a particular judicial decision. Often these decisions are used as precedents in common-law jurisdictions and hence acquire the status of legal authorities, just like the legislative provisions. As Bhatia (1993: 175) points out:

Legal cases and legislation are complementary to each other. If cases, on the one hand, attempt to interpret legal provisions in terms of the facts of the world, legislative provisions, on the other hand, are attempts to account for the unlimited facts of the world in terms of legal relations.
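The rhetorical structure ‘If X, then Y shall do Z’ discussed in this section can also be pictured as a simple record with three slots. The following Python sketch is only an illustration of that idea and is not part of the analysis in Bhatia (1982, 1993); the class name Provision and the methods render and two_part are hypothetical names introduced here, and the example values are taken from the police officer provision quoted in this section.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Provision:
    """Hypothetical record for one 'If X, then Y shall do Z' provision."""

    case_description: str  # X: the case to which the provision applies
    legal_subject: str     # Y: who is empowered, obliged, or prohibited
    legal_action: str      # Z: what the legal subject shall do

    def render(self) -> str:
        # Reassemble the three slots into the single-sentence form.
        return (
            f"If {self.case_description}, "
            f"then {self.legal_subject} shall {self.legal_action}."
        )

    def two_part(self) -> Tuple[str, str]:
        # The 2-part view: case description versus provisionary action,
        # where the action combines the legal subject and the legal action.
        return (
            self.case_description,
            f"{self.legal_subject} shall {self.legal_action}",
        )


example = Provision(
    case_description=(
        "a police officer fails to follow the proper procedures "
        "stated in this section 1 above"
    ),
    legal_subject="the person affected by this failure",
    legal_action="apply to the court for the protection of his interests",
)

print(example.render())
print(example.two_part())
```

Keeping X apart from the combination of Y and Z mirrors the 2-part structure of case description and provisionary action; the qualifications that real provisions insert at various syntactic points are deliberately left out of this sketch.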

In order to fully understand the nature and function of argumentation in legal cases, it is important to understand the relationship between different forms of legal discourse, such as the legal judgment and legislative provision, on the one hand, and the facts and events of the real world and the legal intentions and sociopolitical values and constraints as part of the model world, on the other. The two sets of relationships are mediated through legal reasoning and precedence within a particular legal framework, without which it is impossible to construe application and interpretation of legislation, including rules, statutes, regulations, and ordinances. There is also an interesting parallel between legislation and judgments as instances of legal genres. Legislation consists of two main rhetorical moves, that is, case description and the legal action; similarly, a law case typically begins with the establishing of facts and ends with a judgment that invariably indicates the kind of legal action imposed by the decision of the court. Often there is also an attempt to derive what in legal language is called ratio decidendi, which is a kind of formulation of legal principle that becomes a precedent for subsequent cases of similar description. These unprecedented parallels between the two genres are not purely coincidental. The judgment is the result of a logical argument based on evidence presented in the court and argued and interpreted in the light of rules and regulations as well as precedents from earlier judgments. This can be represented as shown in Figure 1. Judgments and cases are thus the written records of negotiation of justice, which can be viewed as attempts to enforce legislative intentions to bring the real world closer to the model world. A law case is essentially an abridged version of a judgment of a particular judge or a bench based on the actual negotiation of justice in a court of law. 
Judgments are public documents and are taken as records of court proceedings intended for use as precedents for subsequent judgments, especially in common-law jurisdictions. Cases are used in academic settings to demonstrate the nature and logic of judicial reasoning in the negotiation of justice. As indicated in Bhatia (1993: 118):

Legal cases are used in the law classroom, the lawyer’s office and in the courtroom as well. They are essential tools used in the law classroom to train students in the skills of legal reasoning, argumentation and decision-making. Cases represent the complexity of relationship between the facts of the world outside, on the one hand, and the model world of rights and obligations, permissions and prohibitions, on the other.

Figure 1 Intertextual relations in legal genres.

Language is used precisely, clearly, and unambiguously in legislation as well as in judgments; however, in legislation every effort is made to make it all-inclusive as well, whereas in judgments an attempt is made to interpret all-inclusiveness in the light of the specific facts of the case. The relationship between the two genres thus is one of construction of principles and application of such principles. The following brief example of a case cited in Bhatia (1993: 120) will illustrate this relationship:

Roles v. Nathan (1963) 1 W.L.R. 1117
Two chimney-sweeps were killed by carbon monoxide emitting from the ventilation system of the boiler on which they were working. They had chosen to ignore a prior warning of this danger. The Court of Appeal held that the defendants were not liable, for, in the words of Lord Denning M. R., “When a householder calls in a specialist to deal with a defective installation on his premises, he can reasonably expect the specialist to appreciate and guard against the dangers arising from the defect.” (Bradbury, 1984: 107)

As argued in Bhatia (1993), rhetorical structure for legal cases often includes four moves: case identification, case description (establishing facts), argument, and judgment (which may often include ratio decidendi) (see Table 1). It is interesting to note that legal cases in their most abbreviated forms will essentially include the case description and the legal action applicable to such a description of legally material facts. This essential 2-part rhetorical structure is remarkably similar to the one in the legislative genre. Even when analyzing ratio decidendi, we still find a similar 2-part rhetorical structure (see Table 2).

Table 1
Case identification: Roles v. Nathan (1963) 1 W.L.R. 1117
Case description: Two chimney-sweeps were killed by carbon monoxide emitting from the ventilation system of the boiler on which they were working. They had chosen to ignore a prior warning of this danger.
Legal action (ratio decidendi): The Court of Appeal held that the defendants were not liable, for, in the words of Lord Denning M. R., “When a householder calls in a specialist to deal with a defective installation on his premises, he can reasonably expect the specialist to appreciate and guard against the dangers arising from the defect.”

There is another parallel genre related to written judgment, and that is the negotiation of justice in the courtroom, mediated through the spoken mode. Widely familiar as courtroom interaction, this represents a legal genre embedded in a formalized professional legal setting and is employed to negotiate and maintain social relations through the questioning of witnesses. Although the main participants are the counsel, the witnesses, the jury, and the judge, the major part of interaction involves the examination and cross-examination of witnesses by the two counsels to bring to light the facts of the case, that is, to identify and establish the legally material facts of the case, which are crucial to the description of the case in question. The outcome of questioning as part of the direct and cross-examination thus is as much a function of the contributions made by these two participants, that is the counsels and the witnesses, the verbal strategies employed by them, and the degree of credibility established by them as it is of the supposed facts of the case. Facts therefore are not simply an objective phenomenon in this setting. They are constructed as a result of the questioning. However, it is interesting to note that the courtroom questioning strategies are primarily employed to win cases, not to help the jury and the judge to discover and establish facts. Counsels thus do not present facts as they might be, but they do so as they want the court to see them. This brings in the role of language and control, over not only what the counsel says but also over what the witnesses might say.

All of this interaction takes place in a very formal organizational context, where forms of behavior, turn taking, participant roles, questioning and responding strategies, and even the content of questions and responses are all tightly controlled by the conventions established and enforced by the courts. The conventions are meant to ensure that courtroom examination of witnesses is carried out in formal language strictly enforcing not only turn taking but also the type of speaking that is allowed. The counsels, for instance, are only allowed to ask questions that are specific, answerable, and designed to elicit the evidence or statements of facts related to the case; and the witnesses, for their part, are required to answer these questions appropriately and truthfully, often with a ‘yes’ or ‘no.’ The counsel who is not engaged in questioning at a particular time is allowed to make objections where he feels that inappropriate evidence is being offered or rules being violated, but the judge decides whether the objections are valid or not. By definition there are two participants, the prosecution v. the defense, and in the end, one wins and the other loses. Although the jury and the judge often make the decision, they cannot actively participate in the court proceedings. Just like any other 2-party interaction, the courtroom examination of witnesses gives counsels more liberty to talk, interrupt, control resumption of talk, and introduce new topics. In addition, they even are allowed to a large extent to control the responses of the witnesses. Witnesses tend to acknowledge this kind of control by copying the question in their answers. If counsels are not able to control the witnesses’ responses, they tend to destroy the witnesses’ credibility by showing contempt for them. However, clever witnesses can sometimes make attempts to minimize the effects of such attempts. All this can be perceived as a game being played almost entirely by the counsels, who are the primary players.
Table 2
Case description: When a householder calls in a specialist to deal with a defective installation on his premises,
Legal action: he can reasonably expect the specialist to appreciate and guard against the dangers arising from the defect.

Witnesses know relatively little about the procedures of the court, including their own contributions, especially as to how these are interpreted, what effect their contributions will have on the jury or the judge, or on the outcome of the trial (Bhatia, 1997). This is well illustrated by the following example of an account of courtroom interaction from Allen and Guy (1989), provided by Worthington through personal communication in 1984:

An off-duty policeman in a store had shot and killed an intruder. Investigation had shown a set of burglar tools at the back of the store. The prosecutor was trying to show that there was no ground for presuming criminal intent, and that this was cold-blooded murder. The victim’s wife was testifying for the prosecution. Here she is being cross-examined by the defense.
Defense Lawyer: Could you tell the court and the jury what your husband’s occupation was?
Wife: He was a burglar.
This supported the defense’s contention of criminal intent, and secured acquittal for the policeman. (Worthington, 1984, personal communication)

If only the wife had been slightly more familiar with the conventions of the courtroom examination, the task of the defense lawyer would not have been that easy. Another illustration of unfamiliarity with legal conventions and ignorance of legal knowledge on the part of witnesses in courtroom examination comes from the following supposedly made-up interaction (source unknown) between the witness (W), counsel (C), and judge (J):

W: That dirty double-crosser John deceived me.
C: Your honor, I object.
J: Objection sustained. Now try to tell the court exactly what happened.
W: He double-crossed me, the dirty lying rat.
C: Your honor, I object.
J: Objection sustained. Will the witness try to stick to the fact?
W: But I am telling you the fact, your honor. He did double-cross me.

The nature and degree of control in the use of language in this derived unequal interaction in the courtroom is also confirmed in the advice that Wellman (1997) seems to have for specialists in cross-examination: In all your cross-examination, never lose control of the witness; confine his answers to the exact questions you ask. He will try to dodge direct answers, or if forced to answer directly, will attempt a qualification or explanation which will rob his answer of the benefit to you. (Wellman, 1997)

Conclusions

Genres are products of disciplinary cultures and as such they are constructed, interpreted, and exploited to achieve disciplinary objectives. From the foregoing discussion of the nature and function of legal genres, it is obvious that the most central concern in legal writing is the specification of legal intentions in a clear, precise, unambiguous, and all-inclusive manner, and that the legislative genre is the most central and dominant form of such expression. The other legal genres, such as judgments and cases in the written mode, and courtroom interaction (direct and cross-examination) in the spoken mode, may be viewed as derived genres, the main purpose of which is to negotiate, document, and report judicial processes and decisions, which are based on appropriate interpretations of legislative intentions. Since these derived genres, whether in spoken or written mode, are often based on interpretations of explicit and precise legislative intentions, they are likewise characterized by precise, clear, and unambiguous expressions of certainty in legal effect. This is also signaled in a complex and often dynamic interplay of ‘intertextuality’ and ‘interdiscursivity’ (Bhatia, 2004) in the construction of these derived genres. There are two more sets of legal genres that form the complete ‘system’ of legal genres, the target genres and the enabling academic genres (Bazerman, 1994). The target genres are once again based on the interdiscursive formations of legislative and judicial expressions, and therefore share the same concerns of clarity, precision, unambiguity, and all-inclusiveness, leading to certainty of legal effect. These include a range of professional genres such as property conveyance documents, contracts, and agreements, including insurance documents, court case documents, and affidavits of various kinds. These are both the products as well as the instruments of legal practice. We also find a range of academic legal genres that are used to train legal professionals within the academy as well as at the interface between the academy and the professional practice, which include textbooks, legal problems, moots, examination essays, legal memoranda, critical essays, problem-solving essays, pleadings, and so forth. To sum up, I have proposed legislative provisions as the primary legal genres, which form the basis and essence of all legal conceptualizations and practices.
I then proposed a set of secondary genres, such as judgments and cases, which are the written reports of judicial processes and negotiations of legal justice; their counterparts in spoken form are courtroom interactions, which include several genres, as well as direct and cross-examinations in the courts of law. The second category of derived genres I have termed target genres, which again share a number of linguistic as well as legal characteristics with the other legal genres. Finally, I identified a set of enabling genres far richer in intertextuality and interdiscursivity than most of the primary or secondary (derived as well as target) genres. These are largely confined to the academic settings used for training and educational purposes. The main purpose of these pedagogic genres is to explain and interpret the model world of rights and obligations, permissions and prohibitions in the context of the real world that we live in, to train legal professionals to be able to participate in legal practice, and to handle various primary as well as secondary genres. The complete system of legal genres can be summed up as shown in Figure 2.

Figure 2 Genre systems in law.

Acknowledgment

This paper is partly based on research conducted with the support of RGC CERG project No. CityU 1108/99H.

See also: Law and Language: Overview; Language of Legal Texts.

Bibliography

Allen D E & Guy R F (1989). ‘Non-routine conversation in operation crisis.’ In Coleman H (ed.) Working with language: a multidisciplinary consideration of language use in work contexts. Berlin: Mouton de Gruyter. 45.
Bazerman C (1994). ‘Systems of genres and the enactment of social intentions.’ In Freedman A & Medway P (eds.) Genre and the new rhetoric. London: Taylor & Francis. 79–101.
Bhatia V K (1982). An investigation into formal and functional characteristics of qualifications in legislative writing and its application to English for academic legal purposes. Ph.D. diss., University of Aston in Birmingham.
Bhatia V K (1993). Analysing genre: language use in professional settings. London: Longman.
Bhatia V K (1997). ‘Power and politics of genre.’ World Englishes 16(3), 359–372.
Bhatia V K (2004). Worlds of written discourse: a genre-based view. London: Continuum.
Bradbury P L (1984). Cases and statutes on tort. London: Sweet and Maxwell.
Government of Hong Kong SAR (1997). Bankruptcy ordinance. Hong Kong: HKSAR.
Government of Singapore (1970). The wills act. Singapore: Government of Singapore.
Gustafsson M (1984). ‘The syntactic features of binomial expressions in legal English.’ Text 4(1–3), 123–141.
Quirk R, Greenbaum S, Leech G & Svartvik J (1982). A grammar of contemporary English. London: Longman.
Renton D (1975). The preparation of legislation: report of the committee appointed by the Lord President of the council. London: HMSO.
The litter act (1958). London: HMSO.
Wellman F L (1997). The art of cross-examination. New York: Touchstone.


Legal Language: History H E S Mattila, University of Lapland, Rovaniemi, Helsinki, Finland © 2006 Elsevier Ltd. All rights reserved.

This article makes visible the history of legal language through examples taken from various European tongues, some of them major, some minor. A few references are also made to non-European tongues. Because legal terminology always reflects the conceptual system of the law, it is impossible to avoid entirely dealing with the history of legal systems. The following presentation concentrates on modern history (to the present), but some examples date back to the period of archaic law and to Roman and Medieval times. In order to treat logically the complex and heterogeneous materials in the field, the article is divided into three parts: (1) the rivalry between major legal languages, (2) legal language and linguistic interaction, and (3) the intelligibility of legal language. Many interesting aspects (e.g., the transition from oral language to written language in law) are found elsewhere (see Language of Legal Texts).

Rivalry between Legal Languages

Language is power, and legal language, used by state administration and by law courts, is an instrument of power par excellence. This is why all strong rulers have aimed at spreading their own tongue into internal and international legal use. The Romans succeeded very well in spreading Latin, and the dominance of legal Latin continued in Europe throughout the Middle Ages, mostly due to the administrative needs of the Catholic Church. After the displacement of Latin in legal activities at the beginning of modern times, a struggle between modern legal languages began.

Legal Latin and the Vernaculars

The Medieval Dominance of Latin

The great common denominator of all European legal cultures is Roman law and its language, Latin, used in law irrespective of the language of the people. After the collapse of the Western Empire, this was first visible in the former eastern parts of the Roman Empire. The early Byzantine Emperors used Latin in their administration, and the great codification of Roman law of Emperor Justinian, later called Corpus juris civilis, was essentially edited in Latin. In the course of time, the Byzantine administration and law became entirely Greek even though Latin loan words remained in legal use.

It is easy to understand that Latin was quite quickly given up in the Byzantine Empire because it had to compete with an older cultural language, Greek, that had a status comparable, if not superior, to Latin. In contrast, in the regions of the defunct Western Empire there were, in the beginning, no written languages based on the vernaculars. Even much later, after the creation of new written languages, the authority and the use value of Latin were still overwhelming. Latin was the original language of Roman law, researched in every university and directly or indirectly applied by the courts of many countries. Simultaneously, Latin was needed in administrative matters crossing linguistic borders. Latin was particularly the language of the central administration of the Catholic Church, used in its legal order, canon law, which was largely based on Roman law. It is worth noticing that Latin was also one of the old languages of the common law of England, in spite of the native origin of this law. But the dominance of Latin was not absolute in the Middle Ages. On the coasts of the Baltic Sea, for instance, it had to compete with Low German in legal activities.

Switch over to National Languages

Generally speaking, the switch from Latin to the national languages of the various countries was very slow in the field of law. This switch was especially slow among academic jurists. In many European countries, the 17th and the 18th centuries were still part of the golden age of legal literature in Latin, and important legal works in Latin were published even later. This was in part due to the rules of the universities. In the field of international legal affairs, Latin was the language of negotiations and treaties until the 17th century, when French started to replace it. This replacement was due to the strengthening of France as a great power in relation to the Holy German Empire and to the Pope, both having usurped Latin as their language.
There was much resistance (by the German Empire and other countries) to the use of French in diplomacy and international law: Giving up Latin meant at the same time giving up the principle of linguistic neutrality and equality in the relations between states. In the field of internal legal matters, the records and decisions of law courts, administrative authorities, and notaries were often drawn up in Latin in the Middle Ages, and, in some countries, they still were at the beginning of the modern era. This also applied to statutes. Differences between individual countries were, it is true, considerable. In some countries, the vernaculars were used in addition to or instead of Latin. In the royal law courts of France, for instance, French had already become the language of the proceedings at the end of the Middle Ages. In central Europe, but also in other regions, particularly in Italy, the position of Latin as the language of legal activities was strong until the 18th and 19th centuries. This was due to Humanism, to the reception of Roman law, and to academic traditions. If canon law is not taken into account, Latin was used in practical legal activities for the longest time (until the middle of the 19th century) in certain parts of the Austrian Empire: in Croatia, Hungary, and southern Poland (Galicia). In Hungary and Poland, it was a safeguard against the Germanization of these regions and, in Croatia, paradoxically, against the Magyarization of the country. Today, Latin is still the authentic language of the legal system of the Catholic Church, canon law. The jurisdictions of this Church used only Latin until 1917; the vernaculars were then introduced into ecclesiastic legal matters. Even though the Codex juris canonici (1983) now in force also exists in the vernaculars, only the Latin text of this codification is legally binding.

Rivalry between Vernaculars (Modern Legal Languages)

Rivalry within State Boundaries

After the displacement of Latin, the rulers of the new national states aimed at stabilizing their superiority in relation to local power centers by different means, among others by the use of a single language. France is a good example of this. French, originally a Romance dialect of Île-de-France, was made the official language of the kingdom by a number of royal ordinances at the end of the 15th century and the beginning of the 16th century. In addition to eliminating Latin, these ordinances, among other things, put an end to the use of Occitan (Languedocien) in the courts of law of the Midi. In addition to language legislation, the founding of new high courts had a key role in turning the recently conquered territories of the kingdom French. Modern examples are numerous, for example, the use of legal Russian in the Soviet Union. Sometimes there have been – and still are – cases of bitter rivalry between two variants of the same language in legal activities. Greece, in particular, is worth mentioning. The official language, and consequently also the legal language, of this country was, from the 19th century until the fall of the military junta (1974), Katharevusa, a variant of Modern Greek that conserves important elements of the classical language. Attempts (also in law) to replace this variant with Demotic Greek, which is closer to today’s spoken language, were made from the beginning of the 20th century. They succeeded only in the 1980s after a period of approximately 10 years of bitter language struggle. In Norway, a similar competition between two language variants (one, Bokmål, based on Danish, a cognate language; the other, Nynorsk, based on West Norwegian dialects) has been going on for over 100 years.

International Rivalry

In the modern era, European legal languages began to spread to other continents due to colonial expansion. This was particularly the case in both Americas, where Spanish, Portuguese, French, and English were used in legal activities. Changes in the possession of colonies, for example, in Quebec, caused different sorts of rivalry. An important part of the struggle between great legal languages has always taken place in the field of international relations: diplomacy and treaties between states. In this field, the rivalry between French and English over the past few centuries has been of global importance. After having gained the position of the leading great power in the 17th century, France spread its language into many fields, particularly into international relations. After Latin, the French language dominated this field for a long time. This dominance started to break down during the second half of the 18th century and more clearly at the beginning of the 19th century, after the defeat of Napoleon Bonaparte. However, the position of French as an international language of diplomacy and treaties did not become really threatened until much later, in the 20th century, due to the explosive increase in the use of English in this field. The 20th century may be characterized in international legal activities as a period of transition from monolingual French dominance to the bilingual use of French and English and more recently to monolingual English dominance.
Originally, French was the leading working language of the European Communities (now the European Union); today both French and English (and, less frequently, German) are used as working languages in the institutions and organs of the European Union. The Court of Justice excepted (French is the sole language of internal working of the court), some of these bodies use mostly English, and this tendency is strengthening.

Legal Language and Linguistic Interaction

Interaction between Legal Language and Common Parlance

The Two-Way Traffic between Legal and Common Vocabulary

After social or political changes in different times, new vernaculars have been brought into public use. It has then been necessary to create legal languages based on these vernaculars, for example, the creation of the Romance legal languages in the Middle Ages. Later examples are numerous. In Finland, Swedish was the sole official language of the country until the second half of the 19th century. When Finnish also gained official status in Finland at the end of that century, legal Finnish had to be created: a great number of common parlance words were given exact legal definitions, which made possible their technical use in law. Even more recently, democratization processes have resulted in the creation of new legal languages. The best example, perhaps, is the Republic of South Africa where there now are 11 official languages, most of them African (Zulu, Xhosa, Sesotho (Sotho), etc.) used in administrative and judicial matters. A great deal of time has lately been spent on developing administrative and legal terminology for these African languages. As well, the creation of a new legal language has often been an important factor in establishing the written language of the country, in general. This is clearly shown by the development of written French at the end of the Middle Ages. A high court, Parlement de Paris, originating from the King’s Council, started its activities as a judicial organ in the middle of the 13th century. It worked in French from the very beginning, constantly creating new words for its purposes. Gradually, a great number of these words were transferred to the written language of other fields and also to common parlance. Some orthographic practices of French are also derived from the old judicial language. In this way, the language of law has played an important cultural role in many countries.

Conservatism and Radicalism in the History of Legal Language

Having been established, legal language has normally been conservative: Its development has been slower than that of the written language in general. Therefore, the legal language, especially legal terminology, sometimes is almost a language museum.
This is clearly demonstrated by legal English. The situation in Greece (mentioned previously), on the other hand, shows that the language used in legal activities may be brought closer to the common parlance if the distance between written and spoken language grows too much. Occasionally conservatism, the typical feature of the language of law, has changed into a radical reformation of legal terminology. This has happened in connection with social crises and revolutions. For example, during the French Revolution, the old legal terminology expressing feudalism was replaced by terms expressing the values of the bourgeoisie. Another good example of this kind of radical change in terminology occurred during the first years of the Soviet state, when the terminology of legal Russian was entirely reformed because the Bolshevik leadership felt that it was not in harmony with the new spirit of the time. For instance, the term prestuplenie ‘crime’ was replaced by the expression sotsialno opasnoe deistvie ‘socially dangerous activity’. However, this change was only temporary. Later, legal Russian again became more conservative and numerous prerevolutionary terms were restored, and, during the last decades of the Soviet régime, Russian legal linguists emphasized that the language of law must be stable and that no linguistic experiments should be allowed in statutory texts.

Interaction between Legal Languages

The Transfer of Legal Expressions from Latin to Modern Languages

During the long transition period from legal Latin to modern legal languages, much of the Latin legal vocabulary was borrowed by these languages. In the Romance languages and in English, Latin words were often adopted with minor orthographic changes. In other languages (German, the Scandinavian languages, the Slavic languages, etc.), loan translations were more common. Everywhere, many loaned senses of Latin words were taken into the modern languages. In addition to these borrowings, a great number of direct Latin citations continued to be used in modern legal languages. During the first centuries of the transition, these citations were so common that legal texts often were a hybrid language in which Latin and the modern tongue alternated. This is visible even today. According to one author, Latin terms and maxims belong to the ‘‘beloved folklore’’ of the lawyers.

Borrowings between Modern Legal Languages

Latin has not been the sole source of legal loan words. The French language has also been particularly important during several epochs. Because French was, from the 13th century until the end of the 14th century (and also partly into the 15th century and even until the 18th century), one of the languages of law in England, a huge number of French words were transferred into the English legal vocabulary (most being ultimately derived from Latin). Similarly, after the rise of France to the position of the dominant great power in Europe in the 17th century, several legal languages adopted many of the French terms from public and private international law. Many of these were direct citations: lettres de créance, renvoi, ordre public, and so on. In very recent history, a new wave of French lexical influence is perceivable in the activities of the European Union. For instance, the term acquis communautaire has been adopted in all the languages of the member states, either in the form of a direct citation, of a loan word, or of a neologism based on the French expression. The end of the 19th century was the golden age of German legal science, having great influence all over Europe, even in North America. Due to this influence, a number of German citations and loan translations related to the activity of different schools of legal science, among other things, spread into international use: Begriffsjurisprudenz, Pandektenrecht, translations of Rechtsgeschäft, and so on. In the 19th and 20th centuries, English became the most important language of international commerce and commercial law, also used by non-Anglophone contract parties. Much of the English terminology for commercial law was borrowed by other languages. During the past few decades, the same phenomenon can be seen in other modern fields of law, due to the influence of American legal institutions in all countries.

Borrowings between European and Non-European Legal Languages

Borrowings between European and non-European legal languages had already appeared in antiquity. With the spread of Christianity, loan words from Hebrew were transferred into European legal languages. Later on, some words of Arabic origin were adopted into these languages, for example, avarie and douane in French (through Italian). In the opposite direction, the influence of Greco-Roman legal thinking on the formation of early Islamic legal concepts, although disputed, is probable. In the modern era, European legal languages have strongly influenced a great number of non-European legal languages. The tongues of the colonial powers (Dutch, English, French, Portuguese, and Spanish) were regularly used in the administration of their possessions in the Americas, Africa, and Asia and in the legal proceedings there.
When these possessions became independent states and their legal languages were created and developed, the terms expressing European legal concepts were direct citations, words of foreign origin, or loan translations from European languages. In Indonesia, for instance, numerous Malay (Indonesian) law terms were created on the basis of legal Dutch: kasasi ‘cassation, annulment’, eksekusi ‘enforcement’, and so on. Lexical borrowing from non-European legal languages into European legal languages has happened less frequently in recent history. However, loan words are frequent in cases in which European legal languages are used in non-European contexts, for example, in legal texts in which Islamic law is described or applied in English or French. Indian treatises of Islamic law are teeming with terms such as hiba, khula, and mutwalli. The same can be said about North African treatises in French.

Borrowing of Style Features between Legal Languages

Many stylistic features in European legal languages are derived from the style of Medieval Latin documents, often drawn up by the administration (the judiciary included) of the Catholic Church. Indeed, the importance of papal curia in the development of coherent administrative Latin in the Middle Ages was crucial. Its language strongly influenced secular chanceries, courts, notaries, and other organs all over Europe. The language of the administration and the judiciary of the Catholic Church was highly developed. It was technically far superior to the language of secular organs. Therefore, it became a paragon to be imitated by these organs. Some of the stylistic features of Medieval legal language have persisted for a very long time, for example, the use of long and complicated sentences. Language specialists have to fight against this stylistic feature even today. In certain respects, the styles of the European legal cultures have grown in different directions, due to divergences in legal history. It is particularly striking that there are, in various countries, large differences in the drawing up of legal documents. The best example is the decisions of high jurisdictions: the style of these decisions (structure, references to other texts, length of sentences, formulation of votes, etc.) is strictly bound by the tradition of the particular country.

The Intelligibility of Legal Language

The Tradition of Obscurity

Throughout history, legal language has been difficult to understand. This is self-evident when a language totally foreign to the population has been used, and it is also clear when legal language has been spiced with a great number of words of foreign origin. But, in addition, lawyers have traditionally used complex expressions and sentences in their documents due to the rituality and technical refinement of the law.

Use of Foreign Languages and Terms of Foreign Origin

At all times, languages not known by the local population have been used in law and administration because of the interests of the central power, which was frequently multiethnic. Simultaneously, the use of such languages has been supported by the growth of legal professionalism. The Medieval lawyers had a corporative interest in acquiring and holding the monopoly on knowledge in legal matters. This was guaranteed by their use of languages incomprehensible to ordinary people.

As already pointed out, the most important cipher throughout Western legal history has been Latin. In the Middle Ages, legal proceedings were often carried on in Latin even when the accused was ignorant of this language, and even later, in early modern times, court decisions were sometimes written in Latin. Hence, the application of the law was incomprehensible to ordinary people – and, due to this, impressive and frightening. In addition to Latin, other foreign tongues were used as well. In Medieval England, law French was one of the legal languages, and, to take a more recent example, the judiciary of Finland used only Swedish until the second half of the 19th century. Even after the transition to using the language of ordinary people in administrative and judiciary matters (and even today, this has not happened everywhere), the intelligibility of legal language has been – and still is – hindered by legal terms derived from foreign languages, especially from those that were used in earlier periods (Latin, law French, etc.).

Rituality and Technical Refinement

The maintenance of the authority of the state has always been one of the major functions of the law. For this purpose, the state has throughout history emphasized the sacredness of the law, and often connected it with magical elements. Ritual language has been used (e.g., in oaths) to strengthen this effect. Even today, this is clearly discernible in legal English: repetitions and binary expressions are typical of this language, making legal texts complicated and often difficult to understand. In more developed societies, the complexity of legal language was increased by the birth of the legal profession and the growth of the technicality of the law, connected with the demands of exactitude and legal safety. The development of the Latin notary institution, first in Italy and then in other countries, at the beginning of the 2nd millennium had this kind of effect.
The legal documents drawn up (in Latin) by notaries were more technical and complicated than earlier documents. In England, common law grew as case law. The development of this kind of law required subtle distinctions between different cases, that is, between their facts and circumstances. The conceptual system of common law therefore became very complicated. This system, again, required a complex terminology, and within this framework each term had to be interpreted in a narrow way. Simultaneously, the completeness of common law, as finely built case law, presupposed that detailed situation combinations were taken into account in all legal texts. As a consequence, common-law statutes and contracts have always been – and still are – very complicated and verbose.

The Growth of the Concern for the Quality of Legal Language

All the phenomena mentioned so far – the strengthening of the authority of the law by using language rituals, the complexity of the law due to its growing technicality, and the use of terminology of foreign origin – have made legal language difficult to understand even when it has coincided with the tongue of the people. Therefore, demands for simplifying legal language have been made throughout the ages. However, the first serious measures to create plain legal language were only taken during the Enlightenment (18th century). This took place mostly in the German language area. The famous Prussian codification of 1794, Allgemeines Landrecht für die preussischen Staaten, is a major monument of the plain language planning of the epoch. This codification aimed at stylistic clarity. It also avoided words of foreign origin: Such words are much less frequent in the Allgemeines Landrecht than in earlier Prussian statutes. The concern for the quality of legal language in the Enlightenment was down-to-earth but prejudiced. In Hungary (then a part of the Austrian Empire), for instance, an intelligibility test was in use: Statutory drafts were shown to an ordinary subject, unhesitatingly called buta ember ‘stupid man’, in order to see how much of them he could understand. These Enlightenment ideas became important in other parts of Europe (e.g., Russia) as well. The ideology of the Enlightenment died in Europe at the beginning of the 19th century. This was also visible in the language of law. Germany is an illustrative example. It is true that in this country more and more words of foreign origin were eliminated from the legal language, partly due to the ideology of German nationalism. However, statutes were no longer written for ordinary citizens but, rather, for legal professionals. The great codification of the end of the 19th century, Bürgerliches Gesetzbuch (1900), still in force, is in appearance written in clear German (it includes very few words of foreign origin), but its terminology gives expression to a very complicated conceptual system. Furthermore, the legal rules in this codification are often formulated in an abstract way that makes them hard to understand. After the Enlightenment, legal language seems to have become obscure everywhere (although in different ways). It is therefore understandable that a new concern for the quality of this language has grown up at the present time.

See also: Language of Legal Texts; Language Politics; Latin; Law and Language: Overview; Legal Translation; Pragmatics: Linguistic Imperialism.


Bibliography

Brunot F (1966–1979). Histoire de la langue française des origines à nos jours I–XIII (rev. edn.). Paris: Armand Colin.
Eckert J (ed.) (1991). Sprache – Recht – Geschichte. Heidelberg: C. F. Müller Juristischer Verlag.
Fiorelli P (1994). ‘La lingua del diritto e dell’amministrazione.’ In Serianni L & Trifone P (eds.) Storia della lingua italiana. Torino: Giulio Einaudi Editore. 553–597.
Gandasegui J (1998). ‘Historia del lenguaje judicial.’ In Bayo Delgado J (ed.) Lenguaje judicial. Madrid: Consejo general del poder judicial. 143–248.
Gémar J-C (1995). Traduire ou l’art d’interpréter. Langue, droit et société. Eléments de jurilinguistique 2: Application. Sainte-Foy, Canada: Presses de l’Université du Québec.
Hattenhauer H (1987). Zur Geschichte der deutschen Rechts- und Gesetzessprache. Hamburg: Joachim Jungius-Gesellschaft der Wissenschaften.
Hiltunen R (1990). Chapters on legal English: aspects past and present of the language of the law. Helsinki: Suomalaisen tiedeakatemian toimituksia.
Mattila H (ed.) (2002). The development of legal language. Helsinki: Talentum Media.
Mellinkoff D (1963). The language of the law. Boston and Toronto: Little, Brown and Co.
Pigolkin A S (ed.) (1990). Iazyk zakona. Moscow: Iuridicheskaia literatura.
Polenz P von (1991/1994/1999). Deutsche Sprachgeschichte vom Spätmittelalter bis zur Gegenwart (3 vols). Berlin: Walter de Gruyter.
Schoek R (1973). ‘Neo-Latin legal literature.’ In Ijsewijn J and Kessler E (eds.) Acta conventus Neo-Latini Lovaniensis. München: Wilhelm Fink Verlag. 577–588.
Tiersma P (1999). Legal language. Chicago and London: University of Chicago Press.
Waquet F (1998). Le latin ou l’empire d’un signe. XVIe–XXe siècle. Paris: Albin Michel.
Zilliacus H (1935). Zum Kampf der Weltsprachen im oströmischen Reich. Helsinki: University of Helsinki.

Legal Pragmatics
B Kryk-Kastovsky, Adam Mickiewicz University, Poznan, Poland
© 2006 Elsevier Ltd. All rights reserved.

A discussion of the intersection of two crucial domains of human activity, language and law, requires an interdisciplinary approach, which is the underlying assumption of this article and one of the main tenets of pragmatics, the study of language in use. The language of the law is not only notorious for its lexical and syntactic complexity, which has given rise to criticism (e.g., Mellinkoff, 1963; Danet, 1980, 1985; Lakoff, 1990), but it also has certain pragmatic peculiarities. The present study on legal pragmatics will provide a pragmaticist’s account of legal issues rather than a lawyer’s view of the linguistic problems he or she faces. Since the direction of fit is from language to law and not from law to language, the assumption is that the aim of legal pragmatics is to construct an interface between the two domains by searching for pragmatic peculiarities in the language of the law. Thus, this article will investigate the instantiations of selected pragmatic concepts in the language of the law. The starting point of my analysis is a set of observations on the topic made earlier by other scholars.

1. The turn-taking system used in court is similar to that of other institutional settings (e.g., classrooms or chaired meetings) in that it is more rigid and less flexible than the one operating in everyday face-to-face conversation (Levinson, 1983: 301).
2. The institutionalized character of the court is reflected in formulaic, if slightly archaic and stilted language (Lakoff, 1990: 94).
3. According to Danet (1985: 276), legal discourse is concerned with “the nature, functions and consequences of language use in negotiation of social order.” Despite its formulaic character, legal language employs a wide range of registers. It can represent various styles, ranging from frozen through formal and consultative to casual.
4. Verbal interaction in court exemplifies various questioning strategies, which lend themselves to a pragmalinguistic analysis, since “courtroom discourse is unilateral in that barristers enjoy a one-sided topic control of discourse” (Luchjenbroers, 1997: 477). This control is a sign of power, a sociopragmatic concept discussed in the section ‘Power vs. Solidarity’ below.
5. Kurzon (1995) emphasized the role of silence in trial proceedings, and his more recent analysis (Kurzon, 2001) addressed the politeness of the judges. He claims that in formal language politeness may be


taken for granted and is therefore automatically present (it may in fact be presupposed).

Pragmatic Concepts in the Language of the Law

The main issue addressed here is the extent to which the language of the law lends itself to a pragmatic analysis, e.g., whether typical pragmatic notions can be found in this type of discourse. My hypothesis is that the language of the law shares most of the pragmatic properties of colloquial language. These are presupposition, deixis, implicature, speech acts, and power vs. solidarity.

Presupposition

The notion comes in different guises, such as semantic, pragmatic, and lexical presupposition, but the most widely recognized distinction runs along the semantics-pragmatics line. Thus, semantic presupposition is a relation between two propositions, such that the presupposing proposition can be denied, whereas the presupposed one is immune to negation, i.e., it is always true. Among the so-called presupposition triggers are a variety of lexical items and syntactic constructions; Levinson (1983: 181ff), for example, lists more than 30 such items. Take factive verbs like ‘regret,’ which presuppose the truth of their complements, e.g., in

(1) I (don’t) regret that it’s raining.

Here the truth of the complement sentence (i.e., that it is raining) cannot be denied, regardless of whether the verb in the main clause is negated or not. In contrast, pragmatic presupposition is a relation between two utterances whose truth/factuality is taken for granted in a given context due to the mutual knowledge of the speaker and the addressee(s). For instance, if I ask a secretary in my department ‘Is the professor in?’ the question would normally presuppose that there is only one professor in our immediate context, i.e., an instance of existential presupposition related to definite descriptions. However, in this case, both I and the secretary know that only one of the many professors in the department, viz. the boss, is referred to with the definite NP ‘the professor,’ rather than with ‘Prof. Brown.’ As can be expected, questions with presuppositions permeate interrogations. Recall the proverbial:

(2) When are you going to stop beating your wife?

Commenting on such questions, Shuy (1998: 15) showed that they are particularly useful in the case of suspects whose guilt is uncertain. It turns out that questions with presuppositions intimidate the suspects, who might think that the interrogators already know facts that they in fact do not know. In her analysis of the role of questions in Supreme Court trials, Luchjenbroers (1997: 482), following Danet (1980: 521ff), defined questions in terms of the degree of factuality of the potential answers, ranging from open-ended questions of ‘high’ fact value, through wh-questions, to restrictive ‘yes/no’ questions of ‘low’ fact value. Predictably, counsel have the least control over witness replies with open-ended questions and maximal control with ‘yes/no’ questions. The latter are also called leading questions, where the interrogators provide the facts of a testimony and the witnesses either confirm or deny them. The counsel have control over the witness, since they not only know the answer that they expect to hear, but also gear their questions accordingly. Thus, the answer to the counsel’s leading question is presupposed, as in the following examples of the most frequent forms which leading questions can take: declaratives, accusatory ‘yes/no’ questions, and alternative questions, cf. (3a), (3b), and (3c), respectively (notice the use of ‘some’ which, in contradistinction to ‘any,’ obviously presupposes the truth of the utterance):

(3a) You had some alcohol?
(3b) Did you have some alcohol?
(3c) Did you or didn’t you have some alcohol?
(Luchjenbroers, 1997: 482).

Deixis

Another pragmatic concept of relevance here is deixis, which can be defined as the expression of spatiotemporal relations in language by means of ‘indexicals’ (pronouns and adverbs that indicate three transient notions: the participant roles, the place, and the time of the utterance, labeled person, place, time deixis, respectively), as in the classical example:

(4) I am here now.

This pragmatic spatiotemporal domain is relevant to an analysis of legal language, especially to court trial discourse, because there consecutive instances of deictic anchoring reflect the spatiotemporal relations between the actual event, its spoken testimony by the witness/defendant, its written records, and possibly its later reception by potential readers. Since one of the purposes of a court examination is to determine the identity of various persons involved in a case, and the exact location and time of the event, all three major deictic categories are realized in what is called “factuality of persons, place and time” (Kryk-Kastovsky, 2002: 254ff).

The category of person is grammaticalized in natural language by means of personal pronouns ‘I,’ ‘you,’ ‘he/she’ referring to the speaker, the addressee, and the third party, respectively. Consequently, personal pronouns used in court trial discourse should reflect the interpersonal relationships between different participants of the trial. Stygall (1994: 180) presented an insightful proposal that personal pronouns should be redistributed in two categories, placing the referent either in the trial world or in the abstract world of the legal universe. Thus, ‘you’ is ambiguous between two uses: it is deictic when it refers to the addressee(s) and nondeictic in its generic use. Singular pronouns like ‘I,’ ‘he,’ ‘she’ are obviously deictic, whereas ‘it’ can either be deictic (when it points to an entity), or nondeictic, i.e., when used as dummy subject. On the one hand, this observation contradicts the traditional view that while ‘I’ and ‘you’ are unambiguous in everyday discourse, the third person (‘he’/‘she’/‘it’) stands for the entities outside the discourse situation, and thus is often called ‘a nonperson.’ On the other hand, Stygall’s stand correctly reflects one of the possible approaches to court trial discourse, i.e., that its structure consists of textual layers. In his study of old court trial records, Koch suggested three textual layers in which the dialogue going on in court can be embedded: legal framework of questioning, report of a statement, and a dialogue rendered as a quotation (Koch, 1999: 406ff).

In addition to the major deictic categories, scholars have also distinguished discourse, emotional, and social deixis. These can be labeled ‘marginal deictic uses,’ since as opposed to person, place, and time deixis, they are not obligatory occurrences determined by the very nature of language, but are totally dependent on the speaker’s choices. Thus, discourse deixis points to a (preceding or following) portion of the text/discourse, emotional deixis employs demonstratives to mark a personal attitude towards a given entity, and social deixis concerns the use of forms of address as markers of social distance, cf. (5a), (5b), and (5c), respectively (court trial transcripts from Shuy, 1998, emphasis mine):

(5a) Q: What stuff did you tell him that’s true?
A: That I ain’t did that, that I knew her for a long time.
(5b) A: Then when I got to school, this boy told me about what happened.
(5c) I have known Ms. Lockhart for about 2 years. Sometime I call her Ms. Lillie.
(Shuy, 1998: 164ff).

As we have seen, deixis is not only one of the major properties of spoken language (more than 90% of human utterances contain deictics), but for obvious reasons, it is even more pervasive in the language of the law.

Implicature

The Gricean idea of ‘what is meant but not said’ is undeniably relevant to the language of the law, where actual meanings have to be inferred from examinations, witness depositions, and other forms of judicial discourse. As Mey (1993: 99) stated, implicature is a regularity that cannot be captured in a simple syntactic or semantic rule, but by some conversational principle. One of the best-known examples of implicature comes from Levinson (1983: 97):

(6) A: Can you tell me the time?
B: Well, the milkman has come.

where A has to infer the answer (e.g., that the milkman always comes around 6 A.M., hence the time has to be about 6 A.M.). In the context of a court trial the process of inferencing and the resulting notion of implicature might constitute useful tools both for the interrogators in asking questions and for the interrogated in providing or evading answers. Atkinson and Drew (1979) analyzed court trial discourse in conversation-analytic terms and showed that, in contradistinction to conversation, which is more loosely organized (the turns are not pre-allocated), court examination involves question-and-answer sequences. Since the turns in court trial discourse are pre-allocated in one direction only (i.e., from the interrogators to the interrogated), any pauses in the examination of witnesses or defendants have an inferentially implicative character, i.e., after the witnesses and defendants provide their answers, the next turn reverts to counsel, who self-select. Inferences also work on a more elusive, psychological side of the court trial discourse, since any utterance can be a basis for moral inferences made about the speaker by the hearer(s) (Atkinson and Drew, 1979: 68). Compare these observations to the role of perlocutions in everyday discourse as opposed to the perlocutions that occur in court trials and often carry moral and social obligations (Kryk-Kastovsky, 2002: 236ff). Inferencing can also be employed to discover what Harris (1994, emphasis original) called ideological propositions in courtroom interaction, such as:

(7) This court is a reasonable court – trying to do justice to both sides in some disputes?

As Harris (1994: 161ff) rightly pointed out, the concepts of a court based on reason and exercising justice are ideological assumptions that underlie the judicial system by definition and are therefore often quoted by public figures as the basic principles securing the credibility of the legal system and political democracy in general. Therefore, by challenging the justness of the court, the statement starts an ideological dispute and thus implicates an oppositional view of reality. Finally, as demonstrated by Shuy (2004: 11ff), the counsel can use inferencing analysis (which involves the working out of implicatures) in order to solve ambiguities occurring in depositions or recorded data.

Speech Acts

Speech acts are utterances whereby, in saying something, the speaker performs certain acts, which are called performatives, as opposed to constatives, i.e., mere statements. The distinction was introduced by Austin (1962) and elaborated on by Searle (1969), who distinguished between five classes of performative utterances: representatives, directives, commissives, expressives, and declarations. In his later work, Searle (1975) emphasized the crucial role that performatives (especially declarations) play in legal language, whereas Danet gives priority to directives since “Within the facultative-regulative functions of law they are the most prominent in legislation that imposes obligations” (Danet, 1980: 458). Speech acts are among the pragmatic concepts that most frequently occur in legal language, since according to Hencher, “[s]peech act theory and the law are made of much the same stuff. Pragmatic concepts such as authority, verifiability, and obligation are basic to both” (Hencher, 1980: 254). The crucial role of speech acts in the language of the law has also been shown in Jori (1994), Maley (1994), and Stygall (1994).

Legal discourse is, by definition, permeated with performatives used by the speakers to perform legal acts. This has been demonstrated by Danet, who claims that representatives, which commit the speaker to the truth of a proposition, can express a strong or a weak commitment, e.g., actions of testifying or swearing vs. asserting or claiming, respectively. Since according to Austin law is a set of commands, even more important to the legal language are directives, future-oriented speech acts intended to change the current state of affairs by making someone perform some action, e.g., subpoenas, jury instructions, or appeals. Commissives, as the name suggests, commit the speaker to a future action, which includes any kind of contract, whether a business contract, a marriage, or a will. Expressives are supposed to cover cases of convicted persons who are asked, before the sentence is announced, whether they have anything (personal) to say. This moment is when they have the last opportunity to apologize, excuse themselves, and deplore their crime. Finally, declarations produce a fit between the words and the world, a change that comes about because of the speaker’s utterance, as in the classic

(8) I declare the meeting closed.
(Danet, 1980: 458ff)

It must be noted, however, that the position of formulaic expressions in legal language goes far beyond single speech acts, like swearing in a witness, requesting information, or warning the suspect of his/her constitutional rights, as in the case of Miranda Rights:

1. The suspect has a right to remain silent and he need not answer any questions;
2. If he does answer questions, his answers can be used as evidence against him.
3. He has a right to consult with a lawyer and if he cannot afford hiring a lawyer, one will be provided for him without cost. (Shuy, 1998: 52)

These speech acts are also part of a larger speech event. Such an event can be analyzed not only in linguistic terms, but also in terms of nonverbal context (e.g., spatial positioning and alignment in the courtroom) that can provide information concerning the social organization of discourse (Philips, 1986). Moreover, the physical characteristics of the courtroom emanate the atmosphere of power exercised by the interrogators (Lakoff, 1990: 91ff; Maley, 1994: 32ff). And it is the notion of power that this article will now address.

Power vs. Solidarity

Power in the Courtroom

The distinction between power and solidarity, introduced by Brown and Gilman (1960), was later taken up by various theories of politeness, especially by Brown and Levinson (1987). The interplay between power and solidarity is apparent in the context of the courtroom, where the interrogators exercise their power on the interrogated. The asymmetrical relation was particularly acute in the past, e.g., in early modern England, when judges could use their power by verbally abusing the witnesses and the defendants. For instance, the infamous Judge Jeffreys addressed the interrogated with invectives like ‘blockhead’ or ‘vile wretch’ (Kryk-Kastovsky, 2002: 253). The change in the politeness system to the effect that both social inferiors and social superiors receive the polite forms of address has also had repercussions in the judicial jargon, so that nowadays the distance between the interrogators and the interrogated is much more subtle. While the obnoxious behavior of judges is no longer possible nowadays, some more innocuous ways of indicating the distance can be detected. On the extralinguistic side, it is the spatial organization of the courtroom that creates distance between the interrogators and the interrogated both due to the superior position of the judge (Lakoff, 1990: 87ff) and the role of the counsel who “reduce the witness to a function of a puppet” (Luchjenbroers, 1997: 480). On the linguistic side, the distinction between power and solidarity can be exemplified by the use of the appropriate forms of address and by the presence vs. absence of the legal jargon in the speech of the insiders and outsiders of the courtroom, respectively.

Moreover, the power contrast can also be detected in the verbal behavior of men and women. As noticed by O’Barr (1982), who follows Lakoff (1975), women’s speech is much more tentative and less convincing due to the use of hedges, (super)polite forms, tag questions, speaking in italics, empty adjectives, hypercorrect grammar and pronunciation, lack of a sense of humor, etc. On the basis of these characteristics, O’Barr makes the distinction between powerful vs. powerless speech and shows that although women use powerless speech more often than men, the two categories are not necessarily to be assigned to the two sexes, but rather they reflect the social position of the speaker (the less powerful the speaker’s social position, the less powerful his/her language). The transcripts quoted by the author convincingly demonstrate that powerful language is much more effective in courtroom discourse than powerless language. Consider the following quotation from O’Barr (1982: 65ff), where the emphasized words and expressions are clear instances of powerless language (emphasis mine):

(9) Q: State whether or not, Mrs. A, you were acquainted with or knew the late Mrs. X.
A: Quite well.
Q: What was the nature of your acquaintance with her?
A: Well, we were, uh, very close friends. Uh, she was even sort of like a mother to me.

As stated above, one of the manifestations of power and solidarity are forms of address, and this is what our discussion will now turn to.

Forms of Address

Legal language as a conservative register has preserved many of the old address forms. Even in countries like the United States, where addressing others with first names is quite common, the language used in court is characterized by a high level of formality. Thus, the judge can be addressed as ‘Your Honor,’ ‘The court,’ or much less formally as ‘Judge Smith’ (although here some restrictions apply). In addressing the judge directly, the traditional formula ‘May it please the court’ is used. Lawyers are addressed and referred to as ‘Counsel,’ and members of the jury are addressed with the names of their social roles, e.g., ‘Juror Number One’ (depending on the assigned seat). Interestingly, the same formal register holds true for the interrogated, who are referred to as ‘the witness’ and ‘the defendant’ (Lakoff, 1990: 93). Although analogous situations hold in other languages with regard to referring to the participants of a court trial discourse, crosslinguistic variation applies to the ways a suspect/defendant/convict is referred to in the media. While in English, he is invariably referred to as ‘Mr Brown,’ in German the first and second name, without the honorific ‘Herr,’ are employed (‘Harald Braun’), and in Polish merely the second name, again without the honorific, is used (‘Kowalski’). It might be insightful to look into the (historical and sociocultural) reasons why loss of honorifics is the case in the two languages that have much more elaborate address systems than English. After all, German and Polish employ the T/V distinction (grammaticalized as du/Sie and ty/Pan-i, respectively), so that the use of honorifics is subject to strict sociopragmatic principles.

The Structure of Courtroom Discourse

In the following section, this article will examine to what extent courtroom discourse can be considered an approximation to everyday speech. As shown above, legal language contains all the major (socio)pragmatic categories present in everyday communication. Therefore, it might be reasonable to assume that the parallel also holds on the discourse-analytic level; in other words, that the organization of courtroom discourse does not markedly differ from everyday discourse.

Guidelines for Courtroom Discourse Participants

Intuitively, one of the major characteristics of court trial discourse should be its effectiveness in achieving discoursal goals. Such effective courtroom tactics can be found in trial practice manuals addressed to what O’Barr (1982) calls “legal tacticians.” Just to quote a few selected speech characteristics recommended by the authors of such manuals:

1. Overly talkative witnesses are not persuasive.
2. Narrative answers are more persuasive than fragmented ones.
3. Exaggeration weakens a witness’s testimony, etc.

As can be expected, lawyers are also given advice as to effective verbal behavior (O’Barr, 1982: 32ff):

1. Make effective use of variations in question format to get the most favorable responses for your client. Interestingly, O’Barr notes that lawyers are often warned against asking questions to which they do not know the answers (i.e., they should resort to leading questions, cf. the section Presupposition above).
2. Another recommendation pertinent to this analysis is to vary the styles of questioning depending on the different kinds of witnesses (men vs. women, the elderly, children, etc.), a differentiation one might dub sociopragmatic, cf. the remarks on powerful vs. powerless speech in the section Power vs. Solidarity above.
3. Finally, a piece of advice that reminds a (text) linguist of a cohesion device, i.e., repetition is useful for emphasis, but it should be used with care.

With these guidelines in mind, let us now have a look at the main differences between two types of discourse of interest here, court examination and everyday conversation.

Court Examination vs. Conversation

At first glance, court trial discourse looks analogous to any other discourse. However, Atkinson and Drew (1979) warn the non-initiates against hasty generalizations by showing to what extent court examinations can be compared to everyday interaction. First, if we consider the structural characteristics of conversation outlined by Sacks et al. (1974), i.e., turn-taking, turn allocation, transition-relevance places (TRPs), repairs, etc., predictably, everyday conversation is much more loosely organized than court trial discourse, so that in a conversation the turns are (relatively) spontaneous and the number of participants may vary. Second, while in court the distribution of power relations is obvious, in everyday conversation any power relations between its participants are not demonstrated overtly due to the different social setting of this type of discourse, which would imply that the Politeness Principle is observed. Third, although questions and answers often occur in a conversation, they are not a norm, whereas the verbal exchange in a court examination consists solely of question-and-answer pairs (Atkinson and Drew, 1979: 61ff). Moreover, in court examinations turn order is fixed and so is the type of turn contributed by each speaker. As Stygall rightly pointed out, turns and TRPs are controlled by the interrogators, whereas repairs and self-repairs, instigated by the questioning process, come from the interrogated (1994: 117ff). Indeed, the relation between the representatives of the two social roles is asymmetric in that only the interrogators have the right to ask questions, and the turns of the interrogated necessarily constitute (preferably adequate) answers to these questions. In other words, the turns in court examination, unlike in a conversation, are pre-allocated, or to use the terminology of Sacks et al. (1974), only one party (the interrogators) self-selects.

A unique property that differentiates court trial discourse from everyday conversation is what is called ‘legal metacomments’ (Kryk-Kastovsky, 2002: 248ff). These comprise the interrogators’ comments on whatever is going on in court and are an instantiation of the metacommunicative function of language. Legal metacomments easily lend themselves to a pragmatic analysis due to their peculiar characteristics, in a way comparable to discourse deixis in everyday conversation. Thus, legal metacomments either refer (anaphorically) to a portion of previous discourse, or (exophorically) to a situation outside the actual discourse situation, cf. (10) and (11), respectively. In (10), the metalinguistic use of the verb ‘say’ takes the utterance outside the actual situation-of-utterance and gives it a special discourse function (i.e., that of a leading question), whereas in (11) the question is commented on by the judge as leading the witness, i.e., its discourse function is also changed from a simple ‘yes/no’ question to a leading question:

(10) So you say you actually ran away on Friday night? (Johnson, 2004: 105)
(11) Q: All right. Now the car the man was driving, was it a brown, a brown 1972 Chevrolet Nova?
OPPOSITION LAWYER: Objection.
JUDGE: Sustained, leading the witness.
(O’Barr, 1982: 142)

The Language of the Law – At the Crossroads of Sociopragmatics, Discourse Analysis, and Intercultural Communication

It follows from the discussion above that (legal) pragmatics does not suffice to explain the intricate interface of language and the law. One of the reasons is that the relation between the two disciplines cannot be reduced to a simple combination of a few pragmatic concepts that occur in courtroom discourse; it calls for a multidisciplinary approach that goes beyond pragmatics proper. First, the relations between the participants of court trial discourse call for a sociopragmatic explanation in terms of power and solidarity and the role which these concepts play in polite behavior, in the use of forms of address, etc. Second, an analysis of the structure and organization of courtroom discourse should be conducted within the conversation-analytic and/or discourse-analytic frameworks to provide a comparison between the verbal exchanges in the courtroom and everyday conversation. Finally, on a more global level, a crosslinguistic and intercultural investigation might be in order to juxtapose the intricacies of the use of legal language in various cultural-linguistic communities. Not only would such an investigation reveal the differences between the Anglo-Saxon and European judicial systems, but also the various organizational peculiarities that result from these differences.

Since in Great Britain and the United States the trial procedure is adversarial, whereas in most other countries it is inquisitorial, the following consequences ensue. In adversarial systems, the jury (consisting of lay persons) plays a crucial role in the trial proceedings. While the judge presides, the jury is presented with the evidence from both the prosecution and the defense, and decides which side sounds more convincing. The inquisitorial trial procedure is much more rigid in that professional judges listen to the evidence presented by a state-appointed attorney and deliver a verdict. This style puts the defendant at a disadvantage, especially if his/her side is not fully represented. The markedly different roles played by professionals and laymen in the two legal systems have serious repercussions for the entire organization of the trial procedures and the chances of the defendants (Lakoff, 1990: 86). Finally, it might also be useful to look at the organization of court trial discourse in countries outside Europe and North America and study different legal traditions, e.g., the court trial procedures in China, deeply rooted in Chinese culture based on Confucianism, cf. e.g. Gao (2003).

See also: Interviewing and Examining Vulnerable Witnesses; Law and Language: Overview; Law on Language; Legal Genres.

Bibliography

Atkinson J M & Drew P (1979). Order in court. London: Macmillan.
Austin J L (1962). How to do things with words. Oxford: Clarendon Press.
Brown R & Gilman A (1960). ‘The pronouns of power and solidarity.’ In Sebeok T (ed.) Style in language. Cambridge, MA: MIT Press. 253–276.
Brown P & Levinson S (1987). Politeness: some universals in language usage. Cambridge: Cambridge University Press.
Cotterill J (ed.) (2004). Language in the legal process. Basingstoke: Palgrave Macmillan.
Danet B (1980). ‘Language in the legal process.’ Law and Society Review 14, 445–464.
Danet B (1985). ‘Legal discourse.’ In van Dijk T A (ed.) Handbook of discourse analysis 1: Disciplines of discourse. New York: Academic Press. 273–291.
Gao H (2003). ‘Declarative questions in Chinese criminal court examinations.’ Paper delivered at 8th International Pragmatics Conference in Toronto, 2003.
Gibbons J (ed.) (1994). Language and the law. London: Longman.
Harris S (1994). ‘Ideological exchanges in British magistrate courts.’ In Gibbons (ed.). 156–170.
Hencher M (1980). ‘Speech acts and the law.’ In Shuy R & Shnukal A (eds.) Language use and the uses of language. Georgetown: Georgetown University Press. 245–256.
Johnson A (2004). ‘So . . . ? Pragmatic implications of So-prefaced questions in formal police interviews.’ In Cotterill J (ed.). 91–110.
Jori M (1994). ‘Legal performatives.’ In Asher R E (ed.) The encyclopedia of language and linguistics. Oxford: Pergamon Press. 2092–2097.
Koch P (1999). ‘Court records and cartoons: reflections of spontaneous dialogue in early romance texts.’ In Jucker A, Fritz G & Lebsanft F (eds.) Historical dialogue analysis. Amsterdam: Benjamins. 399–429.
Kryk-Kastovsky B (2002). Synchronic and diachronic investigations in pragmatics. Poznań: Motivex.
Kurzon D (1995). ‘The right of silence.’ Journal of Pragmatics 24, 55–69.
Kurzon D (2001). ‘The politeness of judges: American and British judicial behavior.’ Journal of Pragmatics 33, 61–85.
Lakoff R (1975). Language and woman’s place. New York: Harper and Row.
Lakoff R (1990). Talking power: the politics of language. New York: Basic Books.
Levinson S (1983). Pragmatics. Cambridge: Cambridge University Press.
Luchjenbroers J (1997). ‘“In your own words . . .” Questions and answers in a Supreme Court trial.’ Journal of Pragmatics 27, 477–503.
Maley Y (1994). ‘The language of the law.’ In Gibbons (ed.). 11–50.
Mellinkoff D (1963). The language of the law. Boston: Little, Brown and Company.
Mey J (1993). Pragmatics: an introduction. Oxford: Blackwell.
O’Barr W E (1982). Linguistic evidence: language, power, and strategy in the courtroom. New York: Academic Press.
Philips S U (1986). ‘Some functions of spatial positioning and alignment in the organization of courtroom discourse.’ In Fisher S & Dundas Todd A (eds.) Discourse and institutional authority: medicine, education and law. Norwood, NJ: Ablex. 223–233.
Sacks H, Schegloff E & Jefferson G (1974). ‘A simplest systematics for the organization of turn-taking in conversation.’ Language 50, 696–735.
Searle J (1969). Speech acts: an essay in the philosophy of language. Cambridge: Cambridge University Press.
Searle J (1975). ‘Indirect speech acts.’ In Cole P & Morgan J (eds.) Syntax and semantics 3. New York: Academic Press. 59–82.
Shuy R W (1998). Language of confession, interrogation, and deception. Thousand Oaks: Sage.
Shuy R W (2004). ‘To testify or not to testify?’ In Cotterill J (ed.). 3–18.
Stygall G (1994). Trial language: differential discourse processing and discursive formation. Amsterdam: Benjamins.

Legal Semiotics
P C Shon, Indiana State University, Terre Haute, IN, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

As the principal agents of social control, the police are the most visible, accessible, and available representatives of the criminal law (Black, 1980). They, by virtue of their uniform, badge, gun, and other occupational accouterments, display palpable coercive power (Niederhoffer, 1967). And as the primary gatekeepers of the justice system, they determine which cases – persons – enter the criminal justice system; to that end they exercise a considerable amount of discretion (Black, 2005). Prior scholars have maintained that the decision to apply the law in police–citizen encounters is axiomatically linked to the demeanor of the citizens involved (Lundmann, 1994). In other words, the likelihood of receiving tickets, being arrested, and having other official – and unofficial – sanctions applied has been demonstrated to be contingent upon citizens’ attitude and comportment toward the police.

Police–citizen encounters are, by definition, asymmetrical since the police possess the brute capacity to threaten, harm, and coerce the clients of their service into compliance; at the same time, the encounters are tinged with a veneer of sociality in that the bulk of police work involves a fundamentally interactive process. That is, the essence of police work entails explaining, requesting, directing, and counseling citizens, victims, witnesses, and suspects to get them to do something (Mastrofski et al., 2000). Yet, despite the communicative nature of police–citizen encounters, researchers have not been able to agree on the constitutive elements of disrespectful behavior; this highlights one of the major limitations of existing police scholarship: there has yet to be a principled and rigorous empirical expatiation of the communicative – intersubjective – nature of police work.

How a police officer communicates with citizens and colleagues has been regarded as the paragon of police professionalism (Muir, 1977). In a very obvious way, language is the vehicle through which knowledge, intention, and meaning are transmitted; furthermore, that the police use language to persuade and dissuade citizens from certain courses of action points to its assumed, yet empirically unexplored, rhetorical function. From this standpoint, aside from the obvious capacity to exercise coercion, a theoretically warranted question it generates is how that awesome power is embodied in a much more mundane form of social action. This article, rather than assuming prima facie the capacity of the police to threaten, hurt, and kill the very purveyors of that power, demonstrates how that awesome power is interactionally, structurally, and socially enacted, exercised, and contested in the semiotic details of police–citizen interactions. How is the interaction between the police and the public made meaningful during the interaction? How do the contours of police ideology, authority, and identity extend to the realm of communicative interaction, and how might they be enacted and sustained during such moments? Does the coercive power of the police manifest itself even in something as prosaic as talk and other signaling systems? This article examines how the encounter between the police and the public is semiotically organized.

Instantiating the Interaction: The Semiotic Summons

As other police scholars have already noted, the work that the police do begins in two ways: citizen-initiated or police-initiated (Wilson, 1968). The talk between the police and citizens also begins in a similar way: either citizens contact the police or the police initiate their contacts with the citizens. Citizens contact the police to request help for troubles and problems they witness or experience (Whalen and Zimmerman, 1990; Zimmerman, 1992), and they do so through emergency call numbers, such as 911 in the United States. The request for assistance is then symbolically encoded into organizationally relevant codes and assigned to the police (Manning, 1988). When the talk between the police and citizens is police-initiated, more salient differences – and

controversies – emerge. Self-generated police activities such as suspicious person stops, traffic stops, and vehicle searches intrude into the lives of citizens: they hinder free movement and contradict principles of democracy (Skolnick and Fyfe, 1993). Moreover, these types of activities usually involve members of minority groups; hence, they are fraught with controversy (Radelet, 1977; Spitzer, 1999). A good example of a police-initiated encounter is the traffic or vehicle stop – for driving through a red light, running a stop sign, speeding, driving with a broken taillight, etc. In traffic stops, the police initiate the encounter, thus hailing another to interact, by activating their sirens, horns, and flashing lights. Those nonverbal signs constitute a visual and aural – semiotic – summons. To appreciate the pragmatic force of such a summons, it must be noted that police–citizen encounters are different from other mundane interactions in a rudimentary way. First, police–citizen encounters are bureaucratic and institutional occasions for talk; hence, they differ from ordinary conversations that unfold over a dinner table or over coffee in that the structures of talk are preallocated as a function of institutionality (Atkinson and Drew, 1979; Drew and Heritage, 1992). Yet, despite this difference, institutional encounters draw their communicative intelligibility from a mundanely recurring and locally organized site of sociality (Schegloff, 1999). Research has shown that in one of the most banal and recurring sites of social interaction – opening sequences in 500 ordinary telephone calls – talk is interactionally ordered and sequentially organized (Schegloff, 1968). That is, telephone openings are marked by several core sequences: the ringing of the telephone constitutes a summons, beckoning the called party to answer, thus establishing a means of communication, and forming the summons/answer sequence. 
Next, the identification/recognition sequence is accomplished through the greeting sequence. And it is after the how-are-you sequence that the reason for the call is introduced. That burden – introducing the first topic – canonically belongs to the caller (Schegloff, 1968). According to Whalen and Zimmerman (1987), the opening sequences in emergency call centers – whose business is related to the context of law enforcement – are structurally and sequentially different from openings in ordinary conversations and other institutional discourses. Calls to emergency services, such as fire, police, and paramedics, are socially organized into the following core components: (1) prebeginning, (2) opening/identification/acknowledgment sequence, (3) request, (4) interrogative series, (5) response, (6) close (Whalen and Zimmerman, 1987; Whalen et al., 1988; Zimmerman, 1992).

Prebeginnings refer to the – unobservable – physical act of picking up a phone and dialing the other party’s number (911), “thereby summoning another to interact” (Whalen and Zimmerman, 1987: 180). In such institutional contexts, the callers generally remain anonymous; consequently, the task of self-identification falls to the institution, and it is categorical. A key point of difference between openings in ordinary telephone conversations and emergency calls for assistance then involves the reduction of opening sequences: “Reduction plays an important role in achieving an institutionally constrained focus to the talk, for it routinely locates the first topic slot to the callers in their first turn, which is the second turn of the call” (Whalen and Zimmerman, 1987: 175). In ordinary talk, the caller or the called party may introduce the topic through preemptive moves; however, Whalen and Zimmerman (1987) and Zimmerman (1992) find that the callers’ first turn is the environment where topics are initiated in calls to emergency centers. Whalen and Zimmerman note that when citizens call emergency services, the call begins with a categorical identification by the dispatcher (e.g., Mid-City Emergency, nine-one-one emergency). Calls to emergency centers have been chosen as a point of departure because they are genealogically related to the law enforcement context of this paper (for other institutional discourses see Clayman and Heritage, 2002). First, the structural organization of institutional communication is derived from the structures found in ordinary talk; that is, institutional talk is modified and adapted from ordinary talk to meet the contextual and situational exigencies of talk at work (see Drew and Heritage, 1992). 
This difference is exemplified in calls to emergency centers; scholars such as Don Zimmerman, Marilyn Whalen, and Jack Whalen have demonstrated how calls to emergency centers are different from ordinary conversations (e.g., telephone calls); furthermore, their work parallels the research that semioticians have done on the police communication systems (Manning, 1988). The missing component of this body of research is what happens after the police are dispatched to a call, when they encounter the citizens – how do the police go about settling and managing their assignments? The identities in calls to emergency centers are organized such that the call-taker (dispatcher) chooses identification-oriented recognition over recognition-oriented response to the telephone summons. When citizens call emergency services for exigent problems or pressing troubles they may have experienced or witnessed, the format of talk is constrained in such a way that the business of the


institution is built into the structure of the talk itself (i.e., the institution speaks first). Thus, following the first sequence, citizens make a request for service (police, fire, paramedics), state the reason for the call (e.g., somebody just vandalized my car); or the request and categorical self-identification component of the call is collapsed into the opening first turn (e.g., nine-one-one, what is your emergency?). Just as callers to emergency centers initiate contact for assistance, the police initiate contact with motorists concerning a particular problematic relevancy – what the traffic stop is about. If activation of sirens, flashing lights, and horns constitutes an identification-oriented summons – identifying the police qua police by virtue of their flashing lights – then the answering of that summons is the act of pulling over. Motorists, however, do not respond to the summons in equal ways; they do so in ways that shape their moral identity and the contours of the interaction. Let me provide some concrete examples from my research. Patrol officers who are assigned to traffic enforcement (i.e., speeding infractions) ‘set up’ their radars to catch the speeding drivers at streets where traffic offenses are known to occur with regularity. Once motorists are ‘clocked’ on the radar, the officer must start from inertia and frantically catch up to the speeding car before ‘lighting em’ up’ (activating the sirens). And it is during this ‘preinteraction’ phase, between inertia and activation of sirens, and prior to any verbal contact, that motorists’ demeanor and moral character – and outcome – are conveyed to the police. For example, some motorists do not pull over right away. In fact, they make a series of turns at the closest intersection in an attempt to ‘lose’ the pursuing officer. Some succeed. 
Some motorists stop in the middle of the road, without pulling over to the shoulder, putting themselves, passing motorists, and the officer in danger. Some motorists speed up and attempt to run and instantiate a pursuit. Some motorists do not speed, but they do not pull over either. They lazily cruise to their homes, pull into the garage, park the car, and walk toward their backdoor, erroneously believing that they are immune from the summons book once inside the confines of their castle. Thus, through the different ways that motorists acknowledge and give recognition to the police as the police, prior to any conversational interaction, they convey to officers their attitude toward the police, as a ‘governable’ or an ‘asshole’ (van Maanen, 1978). If this prebeginning determines, to a significant degree, the legal identities of motorists and colors the moral contours of the interaction, then a similar process also marks the lenient outcomes of these

encounters. The research for this project was conducted in five different law enforcement agencies in two states, all with a distinct orientation toward the community, organizational values and norms, and size; and as part of the research, ride-alongs were supplemented with audio recordings of 50 traffic stops in those geographically different agencies. In the data, the police cited the motorists in 58% (29) of the cases, and released the drivers with a mere warning in 40% (20) of the cases; one driver was arrested. Some officers turned off their tape recorders when they decided they were going to release the speeding motorists (they thought the researcher would find such stops ‘uninteresting’). Consequently, detailed notes had to be written at a later time. Consider the following field note of such an encounter:

(Southern City Police Traffic Stop #16)
As Officer H and I were sitting in the squad car, a white Crown Victoria was clocked going over 50 miles per hour in a 30 miles per hour zone. The officer immediately turned on his lights and went after the car. As he began to close in on the speeding vehicle, the officer said to the researcher, “That looks a squad car doesn’t it? I mean look at all the antennas it’s got growing out of the car.” When the speeding car came to a stop, the driver immediately got out; and as he was doing so, a sidearm and a (Southern) state police trooper’s badge were visible on his belt. Moreover, it turned out that the state trooper was an acquaintance of the host officer. They briefly talked about family and work, and the state trooper mimicked an apologetic motorist in a sarcastic way: “I’m sorry for speeding officer.” Then Officer H said good bye and let the speeding trooper go.

In this traffic stop, a city cop in a marked squad car pulls over a speeding, unmarked state police vehicle. Yet, despite the expansive authority of the state police over city police, the trooper obeys the summons to interact by pulling over. That is, the state trooper, by pulling over, acknowledges and gives recognition to the categorical and absolute authority of the city police as police, thus sustaining the legitimacy and authority of the police. But notice that the seeds of clemency are recognized before the speeding motorist even completes the act of pulling over. Officer H recognizes the speeding car as a potential ‘kinsman’ in blue. (‘‘That looks a squad car doesn’t it? I mean look at all the antennas it’s got growing out of the car.’’) Police officers do not, generally, as a matter of professional courtesy, ticket other police (this courtesy is extended to doctors, prosecutors, judges, and clergy). Thus, the speeding state trooper’s act of obeisance to the local authority is mutually ratified and acknowledged as such in the semiotic moments of the traffic stop. Moreover, the trooper steps out of his car, and in the process, reveals


his sidearm and badge, thus signifying his identity as the police. It turns out that the speeding motorist is the beneficiary of leniency in two ways, as a kinsman in blue and a personal acquaintance. Even when a cop pulls over another cop, traffic stops begin with a prebeginning where the officer activates the squad’s lights to initiate the interaction. In the excerpt above, the interaction that begins as an identification-oriented sequence is transformed into a recognition-oriented one after the host cop recognizes the law violator in a categorical (fellow cop) and social way (acquaintance). The point worth reiterating is that even other police officers follow the interactional order of traffic stops, and the accusatorial force of the semiotic summons, thus becoming collaborators in the production of their legal and moral identities in the prebeginnings of the institutional encounter (Shon, 2005). The semiotic nature of police–citizen encounters, and the summons, is further supported when the opening sequences of the conversations between the police and the public are examined. For instance, Chicago police officers are instructed in the training academy to pull the vehicle over, walk up to the vehicle, and initiate traffic stops with a self-introduction such as “Hi, how you doing tonight. My name is Officer X of the Chicago Police Department. May I see your driver’s license and proof of insurance please?” Once motorists provide the officer with necessary documents, officers are taught to ask, “Do you know why I stopped you?” The rationale, according to the instructors at the training academy, is that this question “opens communication between citizens and the police.” As one patrol officer related to me, “Some people have a good reason for breaking traffic laws.” What is noteworthy is that police officers in actual stops or on mass-mediated programs about police work (e.g., COPS) rarely make such formal and categorical self-introductions. 
COPS is a reality television program about police work. The show employs no actors; the participants are all real police officers who go about responding to calls for service. And as the first of its kind to blend reality programming and policing, in a compressed format and distinct series of vignettes, it purports to provide viewers with an intimate peek into the mundane and fantastic aspects of routine police work.

The Institutionality of Presence and Absence

It is possible to attribute the absence of formal and categorical self-introductions in police work to the size of a department and its locale. However, rather than explaining its presence and absence exogenously, there is a sociosemiotic way to account for such a phenomenon. That is, the police rely on the visual and symbolic accouterments of the uniform and the patrol car to convey their institutional affiliation. In other words, they are oriented toward culturally normative recognitional identification rather than identification-oriented recognition. As one patrol officer snidely remarked to me, “What, they can’t see the flashing lights and my badge and my gun and tell who I am? I’m the police.” The semiotic summons initializes the encounter, which is then followed by the bureaucratic request to see a driver’s license in the officer’s first turn. Not all patrol officers, however, ‘close’ the introduction slot. This observation can be further strengthened from the following traffic stop data from COPS:

(COPS data: “You got an attitude”)
3 PO1: Turn it off
4 D: Huh?
5 PO1: TURN IT OFF
6 Can I see yo hands ( ) who’s car?
7 D: Mine
8 PO1: Step on out
9 Hands on the car
10 You have a license or ID?
11 D: Yea I got ID
12 PO1: Slide back back back of the car right here
13 D: (mumble) Hey uhh
14 PO2: Hey what’s up? That’s what up ah-ight? first of all you don’t have
15 a driver’s license so you’re not supposed to be driving a car
16 that’s the first thing. I don’t care whether you bought the car or
17 not the law says for you to operate this car you have to
18 have a valid driver’s license. Now you have an attitude
19 you got a taillight out

This episode of COPS takes place in Philadelphia. Two patrol officers make a traffic stop after witnessing a car driving at high speeds; in addition, the car does not have visible license plates. After they activate the sirens and pull the car over, both officers approach the car and open the encounter in line 3. There is a lot of action taking place in this encounter. It should be noted, however, that these Philadelphia officers, much like Chicago officers, do not make formal self-introductions. They do not say, ‘‘Good evening. My name is Officer X of Philadelphia Police Department. May I see your driver’s license and registration please?’’ Contrast the way the big city traffic stop is opened with a police department in a much smaller city:

(Southern City Police Traffic Stop #9)
PO: Hello. Can I see your driver’s license please? Good evening ma’am, my name is officer X with the Southern City Police Department. The reason I’m stopping you is you are doing 42 in a 30. Is there any medical emergency you’re in a hurry for?
D: (shakes her head)

In this traffic stop the officer does begin the interaction in a way that is characteristic of bureaucratic institutions: he introduces himself (my name is officer X) and categorically identifies his institutional affiliation (with the Southern City Police Department). This officer from the Southern City Police Department was assigned to the Traffic Division: his entire shift was spent making traffic stops for speeding. Other beat officers during the author’s ride-alongs did not open the encounter (self-identification) in this way, although some did. There may be other reasons why this particular officer begins the encounter the way he does (small city police, traffic specialist, politeness); but the key point is that it is markedly different from the ones shown thus far. That is because the summons/answer sequence and the identification/recognition sequence found in ordinary talk are reduced to one in traffic stops. The absence of formal and categorical identification of an officer’s institutional affiliation is the norm from which other deviations occur – formal identification slots have to be opened from a number of structural possibilities. Because the police are marked by their uniforms, squad cars with flashing lights, and other occupational accouterments, verbal self-introduction and categorical identification are not necessary: they are semiotically conveyed. The semiotic summons already identifies the ‘caller’ as the police. Thus, in a traffic stop, the summons/answer sequence and identification/recognition sequence – first semiotic turn – are compressed into a visual and aural communicative sequence, thus further reducing the opening sequences noted in prior work.
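The progressive reduction of opening sequences described above can be pictured as a shrinking list of steps that precede the first topic or request. The Python sketch below is only an illustration under stated assumptions: the class name `Opening`, the step labels, and the idea of counting steps are introduced here for exposition and are not part of the analyses by Schegloff or by Whalen and Zimmerman.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opening:
    """Hypothetical container: a setting and the ordered steps that
    come before the first topic or request is introduced."""
    setting: str
    steps: tuple


# Step lists are simplified paraphrases of the article's account (assumption).
OPENINGS = (
    Opening(
        "ordinary telephone call",
        ("summons/answer", "identification/recognition", "how-are-you"),
    ),
    Opening(
        "emergency call",
        ("summons/answer", "categorical identification by the dispatcher"),
    ),
    Opening(
        "traffic stop",
        # Lights and sirens both summon the motorist and identify the police.
        ("semiotic summons (summons/answer + identification/recognition)",),
    ),
)


def steps_before_business(opening: Opening) -> int:
    """Count the opening steps that precede the first topic or request."""
    return len(opening.steps)


for opening in OPENINGS:
    count = steps_before_business(opening)
    print(f"{opening.setting}: {count} step(s) before the first topic")
```

Read this way, each institutional setting drops or merges a step found in ordinary talk. The counts are only as reliable as the simplified step lists assumed here.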

Conclusion

Police–citizen encounters are asymmetrical in an obvious way: as representatives of the criminal law, the police are anointed by the state with the capacity to exercise coercive (lethal) force over citizens to bring about compliance with their directives. To that end, they are shrouded in symbolism, given the necessary tools, and project vestiges of brute power. It would be presumptuous, however, to believe that the asymmetry between the police and the public rests solely on the possession of this awesome power. The police do have the authority to stop citizens and motorists, interrogate them, and impede their movements, but they also have the power to control the structural patterns of communication. As shown here, the police bring that symbolic and bureaucratic power to life by semiotically summoning citizens to interact. In the summons/answer sequence Schegloff formulated, the failure to answer a summons incurs only a morally accountable sanction. That is, if someone or something (telephone ring) calls and beckons interaction in a social encounter, it can be refused at a moral expense. However, when the police summon citizens to interact and that summons is ignored, the accountability citizens face is not only moral but also legal. Citizens can be sternly commanded to interact; the police can physically grab citizens and force them to interact, and should they physically resist, the police can use pain compliance holds and strikes to make them interact. As Skolnick and Fyfe (1993) posit, however, the police need not use those types of methods; their mere presence – uniform, identity – sometimes deters crime and forces citizens to change their behaviors. This article has reviewed the summons/answer sequence in social and institutional encounters. A key finding in prior research is that ordinary summons sequences are reduced to meet the institutional and structural exigencies and constraints that bureaucracies face. This article has shown how the summons sequence is further reduced and adapted to meet the situational relevancies of police work. That the police routinely rely on the uniform to convey their categorical self-introductions alludes to another way their presence is semiotically marked (Shon, 1998). As shown here, the semiotic summons is the first step in giving that awesome power of the police a concrete face and an audible voice in the opening moments of traffic stops (Shon, 2003a, 2003b, 2003c).
There is tremendous diversity in the way motorists react to the wail of sirens and the flash of lights; and precisely because the presence of the police evokes a wide range of responses, from hostility and fear to anger and rage, the police have been conceptualized as ‘Rorschach-in-uniform.’ That motorists are being legally and semiotically summoned already carries the pragmatic sting of an accusation. That is to say, they are directly accused – with sirens, flashing lights, and horns – regarding some problematic relevancy, without a ratified knowledge of what that might be. Of course, this is not to say that motorists who are pulled over generally have no idea as to why they are being stopped: the manner in which the accusation is made is direct, but the accusation itself is indirect because it has yet to be announced, and ratified, in the opening moments.


Niederhoffer (1967: 1) observed long ago that the policeman is ‘‘clothed in a mantle of symbolism that stimulates fantasy and projection.’’ The police are often baffled by the attitudes and behaviors citizens show them; they do not understand why citizens ‘‘stiffen with compulsive rage or anxiety at the sight of a patrol car’’ or, for that matter, the uniform. For instance, in an informal gathering with police officers, one patrolman related how he had pulled over a motorist to warn her about low tire pressure, only to be greeted with severe indignation. Or, as Baker (1985) notes, a highway patrolman offers a ride to a hitchhiker on Christmas Eve, only to be violently cursed at and attacked. Why would such seemingly friendly gestures engender such hostile responses? It is tempting at this point to offer a reductionist and simple psychological explanation for motorists’ apparently bizarre behavior. Thus, we could say that motorists and citizens who act out toward the police are not the models of mental health, that they are paranoid, crazy, antisocial, and have a deep-seated hatred of the police. We could even offer a historical explanation and claim that some motorists react belligerently toward the police because they have been systematically harassed and mistreated by the police. Or a contextual explanation could be conjectured: the motorist is having a bad day, and is acting out toward the cop. It is tempting to stop here; but if the phenomenon is examined sociosemiotically, as the act of being summoned, pulled over, covertly interrogated, and told of the problem as part of the opening sequences of a traffic stop, then the analysis can be salvaged from inward-looking psychology and grounded in the observable, demonstrable, and empirical actions that subjects engage in.
The summons instantiates the bureaucratic power of the police through the wail of sirens and flashing lights, thus beckoning motorists to interact, but why motorists have been summoned to interact is not yet made relevant until a later turn. Thus, motorists are signaled as to the indeterminate nature of the problem without a ratified knowledge of the problematic relevancy: they are indirectly accused in a direct manner: ‘‘As you are driving down the highway, suddenly there are flashing lights in the rearview mirror and the whoop of a siren in your ears. A small dose of adrenaline surges into your blood stream. Your heart beats faster; your palms sweat. You feel guilty whether you’ve consciously done something or not’’ (Baker, 1985: 246).

Thus when motorists and citizens react belligerently to officers’ apparently friendly and concerned overtures – telling citizens about a tire dangerously low on air; offering a ride to a hitchhiker on Christmas Eve – their reaction seems to point the finger at the squad car, flashing lights, officer, and his or her uniform. If a sociosemiotic explanation of traffic stops is adopted, a citizen’s violent reaction is a reaction to the direct and blatant manner of accusation – the summons – not the uniform.

See also: Anthropological Linguistics: Overview; Applied Forensic Linguistics; Gesture: Sociocultural Analysis; Interactionism; Legal Semiotics; Police Questioning.

Bibliography

Atkinson J M & Drew P (1979). Order in court: the organization of verbal interaction in judicial settings. London: Macmillan.
Baker M (1985). COPS: their lives in their own words. New York: Pocket Books.
Black D (1980). The manners and customs of the police. New York: Academic Press.
Black D (2005). ‘Legal relativity.’ In Clark D S (ed.) Encyclopedia of law and society: American and global perspectives. Thousand Oaks, CA: Sage Publications.
Clayman S & Heritage J (2002). The news interview: journalists and public figures on the air. New York: Cambridge University Press.
Drew P & Heritage J (1992). ‘Introduction.’ In Drew P & Heritage J (eds.) Talk at work: interaction in institutional settings. Cambridge: Cambridge University Press. 3–65.
Lundman R J (1994). ‘Demeanor or crime? The midwest city police-citizen encounters study.’ Criminology 32(4), 631–656.
Manning P K (1988). Symbolic communication: signifying calls and the police response. Cambridge, MA: MIT Press.
Mastrofski S D, Snipes J B, Parks R B & Maxwell C D (2000). ‘The helping hand of the law: police control of citizens on request.’ Criminology 38(2), 307–342.
Muir W K (1977). Police: street corner politicians. Chicago: University of Chicago Press.
Niederhoffer A (1967). Behind the shield: the police in urban society. Garden City, NY: Anchor Books.
Radelet M (1977). The police and the community. Encino, CA: Glencoe Press.
Schegloff E A (1968). ‘Sequencing in conversational openings.’ American Anthropologist 70, 1075–1095.
Schegloff E A (1999). ‘What next? Language and social interaction study at the century’s turn.’ Research on language and social interaction 32(1&2), 141–148.
Shon P C H (1998). ‘‘‘Now you got a dead baby on your hands’’: discursive tyranny in cop talk.’ International Journal for the Semiotics of Law 11(33), 275–301.
Shon P C H (2003a). ‘Rorschach-in-action: some further observations on the semiotic summons in police–citizen encounters.’ International Journal for the Semiotics of Law 16, 101–112.
Shon P C H (2003b). The social organization of mass-mediated and actual police-citizen encounters. Ph.D. diss. University of Illinois at Chicago.
Shon P C H (2003c). ‘Bringing the spoken words back in: conversationalizing (postmodernizing) police–citizen encounter research.’ Critical Criminology: An International Journal 11(2), 151–172.
Shon P C H (2005). ‘I’d grab the S-O-B by his hair and yank him out the window: the fraternal order of warnings and threats in police-citizen encounters.’ Discourse & Society.
Skolnick J H & Fyfe J J (1993). Above the law: police and the excessive use of force. New York: The Free Press.
Spitzer E (1999). The New York City Police Department’s ‘‘stop & frisk’’ practices: a report to the People of the State of New York from the Office of the Attorney General. New York: New York State Attorney General.
Van Maanen J (1978). ‘The asshole.’ In Van Maanen J & Manning P K (eds.) Policing: a view from the street. Santa Monica, CA: Goodyear Publishing. 221–237.
Whalen M R & Zimmerman D H (1987). ‘Sequential and institutional contexts in calls for help.’ Social Psychology Quarterly 50, 172–185.
Whalen M R & Zimmerman D H (1990). ‘Describing trouble: practical epistemology in citizen calls to the police.’ Language in Society 19, 465–492.
Whalen J, Zimmerman D H & Whalen M R (1988). ‘When words fail: a single case analysis.’ Social Problems 35, 335–360.
Wilson J Q (1968). Varieties of police behavior: the management of law and order in eight communities. Cambridge, MA: Harvard University Press.
Zimmerman D H (1992). ‘The interactional organization of calls for emergency assistance.’ In Drew P & Heritage J (eds.) Talk at work: interaction in institutional settings. Cambridge, UK: Cambridge University Press. 418–469.

Legal Translation
S Šarčević, University of Rijeka, Rijeka, Croatia
© 2006 Elsevier Ltd. All rights reserved.

Although translations of legal documents are among the oldest and most important in the world, legal translation has long been neglected in translation studies. Thanks to recent impulses from comparative law and legal linguistics, a growing number of scholars and practitioners have been attracted to this interdisciplinary field, making legal translation one of the most vital areas of contemporary translation. In our age of globalization, translation plays a major role as a means of communication in national, supranational, and international law. The volume of legal translation in international organizations has increased significantly as a result of the modern trend of adopting treaties, conventions, and model laws in more than one language. International relations, international trade, and even international dispute resolution are more dependent on translation than ever before, as is the European Union, whose legislation is currently drafted in 20 official languages, all texts being equally authentic. At the national level, the number of plurilingual countries and regions that translate their national laws and administer justice in two or more official languages is on the rise, as is the number of countries that translate their legislation and other legal documents for information purposes.

Why Is Legal Translation Special?

In keeping with Sager’s definition of specific-purpose texts, a legal text can be regarded as a communicative occurrence between specialists intended to serve a particular function (Sager et al., 1980: 210) (see Languages for Specific Purposes). Although it is their function that distinguishes legal texts from other specific-purpose texts, linguists initially failed to recognize that sources of law, such as codes, statutes and regulations, and treaties and conventions, are primarily prescriptive, as are contracts. Such texts are normative instruments that prescribe commands and prohibitions, grant permissions and powers, or create obligations and rights (Cornu, 2000: 268) (see Definition/Rules in Legal Language). Judicial decisions and documents used to carry on judicial and administrative proceedings, such as actions, pleadings, briefs, appeals, requests, etc., are primarily descriptive but usually contain prescriptive parts as well (see Language of Legal Texts). These two groups of texts constitute the main bulk of legal translation.

While lawyers agree that legal translation is a dual operation consisting of both ‘‘legal and interlingual transfer,’’ in their opinion, the main operation is legal in nature (Sacco, 1990: 34). Legal texts require a special type of translation chiefly because the translations also produce legal effects and many have the force of law just like the original(s). Above all, skilled translators must possess considerable legal and language competence ‘‘to understand not only what the words mean and what a sentence means, but also what legal effect it is supposed to have, and how to achieve that legal effect in the other language’’ (Schroth, 1986: 56). Unlike other areas of specialized translation, the success of a legal translation is measured by its interpretation and application in practice, especially by courts of law. Accuracy is thus essential, and as much attention must be paid to the content as to the intention and all possible interpretations and misinterpretations of the text. Translators must understand the source text but not interpret it in the legal sense. In particular, they must avoid value judgments, taking care to convey what is said in the source text, not what they believe it ought to say.

Unlike texts in the exact sciences, legal texts do not have a single agreed meaning independent of local context but usually derive their meaning from a particular national legal system. This system is referred to as the source legal system, while the target legal system is the system to which the recipients of the target text belong. According to lawyers, the difficulty of a translation depends primarily on the degree to which the source and target legal systems are related and secondarily on the similarity of the source and target languages (cf. Berteloot, 1999: 103). In this sense, the difficulty of a translation increases when the legal systems belong to families with considerably different legal traditions and their languages are not related. Sometimes the source and target legal systems are identical, which greatly facilitates the translation process, mainly because both texts derive their meaning from the same conceptual system. This is the case in plurilingual states with one legal system, such as Switzerland, Belgium, and Finland, as opposed to plurilingual states and regions with different legal systems or a mixed legal system (e.g., Canada, India, Sri Lanka, Israel, South Africa and recently China, in Hong Kong and Macau). Although European law is regarded as an independent legal system, ‘Euroterms’ and concepts are in a state of flux, causing confusion for translators who must distinguish between European and national institutions and concepts (Schübel-Pfister, 2004: 115).
Despite increased efforts to unify law at the international level, the number of standardized concepts in international law is still small.

Conceptual Incongruity of Legal Systems

Lawyers often describe legal translation as a form of transposition juridique (Tallon, 1995: 342); by no means, however, is it a mere mechanical process of transcoding terminology from one legal system into another. On the contrary, the conceptual incongruity of legal systems poses the greatest challenge to legal translators, making it extremely difficult and sometimes impossible to find an adequate equivalent in the target legal system for a given term in the source legal system (see Law and Language: Overview). As in Hjelmslev’s analysis of terminological incongruity in ordinary languages, it can be shown that the boundaries between the meanings of legal concepts are incongruent even in closely related legal systems, such as the continental civil law systems. For example, the concept of décision in French law corresponds to two more specific concepts in German law (Entscheidung and Beschluss) and three in Dutch law (Beschikking, Besluit, Beslissing). Even etymological equivalents that constitute the common core of all legal systems, such as dettes and debts or contrat and contract, are not identical at the conceptual level. Moreover, within the same language, the same term may designate different concepts in different legal systems or, conversely, the same concept may be designated by different terms in different jurisdictions. In addition, all legal systems contain a number of ‘untranslatable’ system-bound terms designating concepts peculiar to their own legal reality. Not surprisingly, the majority of studies on legal translation deal with terminological problems. Early studies by lawyers focused on using methods of comparative law (especially the functional approach) to identify and compare the corresponding concepts of different legal systems. Hence, the terms designating such concepts are called functional equivalents (Šarčević, 1989: 278). Using functional equivalents is considered the ‘‘ideal method of translation’’ (Weston, 1991: 23). Frequently, however, they are only partially equivalent, or no functional equivalent exists in the target legal system, thus forcing translators to resort to linguistic means.
As a result of recent interaction between lawyers and linguists, conceptual analysis is now being used to measure the degree of equivalence of the corresponding concepts of different legal systems; legal criteria have been established to determine the adequacy of functional equivalents; methods of lexical expansion and contraction are used to compensate for the incongruity of functional equivalents; and a catalogue of linguistic equivalents has been proposed to translate system-bound terms (literal equivalents, descriptive paraphrases, borrowings, naturalizations, and other neologisms).

Attempts to Provide a Systematic Approach to Legal Translation

The popularity of legal translation has been accompanied by a sharp increase in the number of scholarly publications by both lawyers and linguists, most of which focus on specific terminological issues, languages, legal systems, or text types. Only recently have there been attempts to establish general principles of legal translation. Aware of the inadequacies of general translation theories, specialists in legal linguistics have attempted to analyze the phases of the translation process (Gémar, 1995) and propose a theoretical framework for legal translation based on both legal and linguistic criteria (Šarčević, 2000). Recent emphasis on the textual aspects of legal translation (syntactical, pragmatic, and stylistic) has shown that the quality of translations can be improved without compromising reliability. For centuries literal translation was the only acceptable translation procedure for all legal texts, especially legislation (see Approaches to Translation, Linguistic). For the purpose of preserving the letter of the law, translators adhered to the principle of fidelity to the source text by reconstructing the form and substance of the original text as closely as possible. In fact, the dominance of literal translation was not seriously challenged until the 20th century, when dissident translators began to demand equal language rights for lesser-used official languages. Accused of heresy by his critics, Rossel defended his liberal translation of the German text of the Swiss Civil Code into French (1907) by insisting that the francophone population of Switzerland had the right to have their code written in ‘natural’ French. In the 1960s, demands for equal treatment of the French language at the federal level in Canada triggered the so-called silent revolution in Quebec, emancipating translators by encouraging them to convey the sense of the original in a new text written in the genius of the target language. In the late 1970s, the Canadians went a step further by introducing new methods of bilingual drafting, the most common of which is co-drafting, a form of simultaneous text production that goes beyond translation.
While translators of Canadian federal legislation are encouraged to strive not so much for verbal and grammatical parallelism as for linguistic purity within the confines of legal equivalence (Beaupré, 1986: 179), the extent of the translator’s freedom to produce a new text in the genius of the target language is controversial in other jurisdictions and for certain text types, mainly out of fear that the translator will scuttle the intent of the original. For example, translators of Swiss legislation are advised not to alter the length of sentences to avoid imposing their own interpretation. Similarly, translators of instruments of international law are instructed to give priority to formal concordance and to refrain from clarifying any ambiguities. Moreover, numerous legal instruments are institutional texts and as such are subject to a varying degree of standardization. Strict observance of standard form and formulae is required particularly in translations of documents of the European Union. From the above, it follows that text type does play a role in determining translation strategy; however, there is no given strategy for translating a specific text type, such as legislation or judgments, even when the skopos (purpose) of the translation is the same (see Functional Approaches to Translation: Skopos Theory). Unlike other areas of translation, the skopos of a legal translation is often predetermined by its status, i.e., whether it is non-authoritative or authoritative. Whereas non-authoritative translations are intended for information purposes only, authoritative translations are legally binding instruments. In both cases, however, the selection of a translation strategy depends primarily on legal factors of the specific communication situation: which source and target legal systems are involved, how many, which drafting techniques and rules of interpretation are commonly used in those systems, etc.

Redefining the Goal of Legal Translation

The future of legal translation is bright, and there are valid arguments to elevate its status to an independent discipline in translation studies. A symbolic step in this direction could be taken by redefining the goal of legal translation. In the tradition of specific-purpose translation, it has long been accepted that the primary task of the legal translator is to achieve semantic equivalence (see Translational Equivalence). While this aim is in keeping with the presumption that a translation has the same meaning as the original(s), lawyers are the first to admit that legal translation is at best approximation, thus reducing the presumption of equal meaning to a fiction. Whereas it is not always possible to achieve semantic equivalence, translators are expected to produce texts that lead to the same legal effects in practice. Moreover, these are presumed to be the intended legal effects. Emphasizing the pragmatic aspects of legal translation (see Legal Pragmatics), it can be said that the translator’s task is to put language into action to achieve the intended legal effects, i.e., to achieve legal equivalence. Although legal equivalence has far-reaching implications, it can essentially be regarded as a synthesis of content, intent, and legal effect, with the main emphasis on the last of these. In a receiver-oriented approach to legal translation, the test of legal equivalence is how the text is interpreted and applied by the courts. For example, the authoritative translations of a given instrument may be deemed legally equivalent if they are interpreted and applied uniformly by the courts. In this sense, the ultimate goal of legal translation is to produce a text that will promote the uniform interpretation and application of the single instrument, as it is called. Achieving this goal is not easy; however, it is hoped that greater interaction between lawyers and linguists will lead to new and better methods of overcoming the legal and linguistic barriers of legal translation, making it a more effective means of communication in our global world.

See also: Approaches to Translation, Linguistic; Definition/Rules in Legal Language; Functional Approaches to Translation: Skopos Theory; Language of Legal Texts; Languages for Specific Purposes; Law and Language: Overview; Legal Pragmatics; Translational Equivalence.

Bibliography

Alcaraz E & Hughes B (2002). Legal translation explained. Manchester: St. Jerome.
Arntz R (2001). Fachbezogene Mehrsprachigkeit in Recht und Technik. Hildesheim: Georg Olms.
ASTTI/ETI (2000). La traduction juridique, Histoire, théorie(s) et pratique/Legal translation, History, theory(ies) and practice. Bern/Genève: ASTTI/ETI.
Beaupré M (1986). Interpreting bilingual legislation. Toronto: Carswell.
Beaupré M (1987). ‘La traduction juridique.’ (General report in English). Les Cahiers de Droit 28, 735–745.
Berteloot P (1999). ‘Der Rahmen juristischer Übersetzungen.’ In de Groot G-R & Schulze R (eds.) Recht und Übersetzen. Baden-Baden: Nomos. 101–113.
Bocquet Cl (1994). Pour une méthode de traduction juridique. Prilly: CB Service.
Cornu G (2000). Linguistique juridique (2nd edn.). Paris: Montchrestien.
Cosmai D (2003). Tradurre per l’Unione europea. Milano: Hoepli.
De Groot G-R (1999). ‘Das Übersetzen juristischer Terminologie.’ In de Groot G-R & Schulze R (eds.) Recht und Übersetzen. Baden-Baden: Nomos. 11–47.
(1979). Meta (Journal des traducteurs, Montréal) 24(1), 1–220. In Gémar J-Cl (ed.) Special Edition: La traduction juridique/Legal Translation.
(2002). Meta (Journal des traducteurs, Montréal) 47(2), 1–293. In Schwab W (ed.) Special Edition: Traduction et terminologie juridique/Legal translation and terminology.
Galdia M (2003). ‘Rechtsvergleichendes Übersetzen.’ The European Legal Forum 3, 1–5.
Gémar J-Cl (ed.) (1982). Langage du droit et traduction. Montréal: Linguatech et Conseil de la langue française.
Gémar J-Cl (1995). Traduire ou l’art d’interpréter, Langue, droit et société: éléments de jurilinguistique. Tome 2: Application – Traduire le texte juridique. Sainte-Foy: Presses de l’Université du Québec.
Mattila H (2002). Vertaileva oikeuslingvistiikka. Helsinki: Kauppakaari.
Mayoral Asensio R (2003). Translating official documents. Manchester: St. Jerome.
Morris M (ed.) (1995). Translation and the law. Amsterdam: Benjamins.
Sacco R (1990). Introduzione al diritto comparato (4th edn.). Torino: G. Giappichelli.
Sager J, Dungworth D & McDonald P (1980). English special languages: principles and practice in science and technology. Wiesbaden: Brandstetter.
Sandrini P (ed.) (1999). Übersetzen von Rechtstexten. Tübingen: Gunter Narr.
Šarčević S (1989). ‘Conceptual dictionaries for translation in the field of law.’ International Journal of Lexicography 2, 277–293.
Šarčević S (2000). New approach to legal translation (2nd edn.). The Hague: Kluwer Law International.
Šarčević S (ed.) (2001). Legal translation: preparation for accession to the European Union. Rijeka (Croatia): Faculty of Law, University of Rijeka.
Schroth P W (1986). ‘Legal translation.’ American Journal of Comparative Law 34, 47–65.
Schübel-Pfister I (2004). Sprache und Gemeinschaftsrecht. Berlin: Duncker und Humblot.
Tallon D (1995). ‘Français juridique et science du droit: quelques observations.’ In Snow G & Vanderlinden J (eds.) Français juridique et science du droit. Bruxelles: Bruylant. 339–349.
Weisflog W (1996). Rechtsvergleichung und juristische Übersetzung. Zürich: Schulthess.
Weston M (1991). An English reader’s guide to the French legal system. Oxford: Berg.
Weyers G (1999). ‘Das Übersetzen von Rechtstexten: eine Herausforderung an die Übersetzungswissenschaft.’ In de Groot G-R & Schulze R (eds.) Recht und Übersetzen. Baden-Baden: Nomos. 151–174.
Zaccaria G (ed.) (2000). Übersetzung in Recht/Translation in law. Münster: LIT Verlag.

Legal–Professional Language in Jury Trial C Heffer, Nottingham Trent University, Nottingham, UK ! 2006 Elsevier Ltd. All rights reserved.

The study of legal–professional language in jury trial is concerned with how trial lawyers and judges use language in the context of legal cases tried by jury.

While some code switching occurs between legal English and Standard English, brought on by frequent changes in topic and audience, it is often difficult to distinguish these linguistic varieties within the context of jury trial. Legal professionals often produce their own hybrid variety of language, a ‘legal-lay discourse’ (Heffer, 2005), which is neither truly legal in its lexis and syntax, nor typical of everyday

Legal–Professional Language in Jury Trial 29

will lead to new and better methods of overcoming the legal and linguistic barriers of legal translation, making it a more effective means of communication in our global world.

See also: Approaches to Translation, Linguistic; Definition/Rules in Legal Language; Functional Approaches to Translation: Skopos Theory; Language of Legal Texts; Languages for Specific Purposes; Law and Language: Overview; Legal Pragmatics; Translational Equivalence.

Bibliography

Alcaraz E & Hughes B (2002). Legal translation explained. Manchester: St. Jerome.
Arntz R (2001). Fachbezogene Mehrsprachigkeit in Recht und Technik. Hildesheim: Georg Olms.
ASTTI/ETI (2000). La traduction juridique, Histoire, théorie(s) et pratique/Legal translation, History, theory(ies) and practice. Bern/Genève: ASTTI/ETI.
Beaupré M (1986). Interpreting bilingual legislation. Toronto: Carswell.
Beaupré M (1987). ‘La traduction juridique.’ (General report in English). Les Cahiers de Droit 28, 735–745.
Berteloot P (1999). ‘Der Rahmen juristischer Übersetzungen.’ In de Groot G-R & Schulze R (eds.) Recht und Übersetzen. Baden-Baden: Nomos. 101–113.
Bocquet Cl (1994). Pour une méthode de traduction juridique. Prilly: CB Service.
Cornu G (2000). Linguistique juridique (2nd edn.). Paris: Montchrestien.
Cosmai D (2003). Tradurre per l’Unione europea. Milano: Hoepli.
De Groot G-R (1999). ‘Das Übersetzen juristischer Terminologie.’ In de Groot G-R & Schulze R (eds.) Recht und Übersetzen. Baden-Baden: Nomos. 11–47.
(1979). Meta (Journal des traducteurs, Montréal) 24(1), 1–220. In Gémar J-Cl (ed.) Special Edition: La traduction juridique/Legal Translation.
(2002). Meta (Journal des traducteurs, Montréal) 47(2), 1–293. In Schwab W (ed.) Special Edition: Traduction et terminologie juridique/Legal translation and terminology.
Galdia M (2003). ‘Rechtsvergleichendes Übersetzen.’ The European Legal Forum 3, 1–5.
Gémar J-Cl (ed.) (1982). Langage du droit et traduction. Montréal: Linguatech et Conseil de la langue française.
Gémar J-Cl (1995). Traduire ou l’art d’interpréter, Langue, droit et société: éléments de jurilinguistique. Tome 2: Application – Traduire le texte juridique. Sainte-Foy: Presses de l’Université du Québec.
Mattila H (2002). Vertaileva oikeuslingvistiikka. Helsinki: Kauppakaari.
Mayoral Asensio R (2003). Translating official documents. Manchester: St. Jerome.
Morris M (ed.) (1995). Translation and the law. Amsterdam: Benjamins.
Sacco R (1990). Introduzione al diritto comparato (4th edn.). Torino: G. Giappichelli.
Sager J, Dungworth D & McDonald P (1980). English special languages: principles and practice in science and technology. Wiesbaden: Brandstetter.
Sandrini P (ed.) (1999). Übersetzen von Rechtstexten. Tübingen: Gunter Narr.
Šarčević S (1989). ‘Conceptual dictionaries for translation in the field of law.’ International Journal of Lexicography 2, 277–293.
Šarčević S (2000). New approach to legal translation (2nd edn.). The Hague: Kluwer Law International.
Šarčević S (ed.) (2001). Legal translation: preparation for accession to the European Union. Rijeka (Croatia): Faculty of Law, University of Rijeka.
Schroth P W (1986). ‘Legal translation.’ American Journal of Comparative Law 34, 47–65.
Schübel-Pfister I (2004). Sprache und Gemeinschaftsrecht. Berlin: Duncker und Humblot.
Tallon D (1995). ‘Français juridique et science du droit: quelques observations.’ In Snow G & Vanderlinden J (eds.) Français juridique et science du droit. Bruxelles: Bruylant. 339–349.
Weisflog W (1996). Rechtsvergleichung und juristische Übersetzung. Zürich: Schulthess.
Weston M (1991). An English reader’s guide to the French legal system. Oxford: Berg.
Weyers G (1999). ‘Das Übersetzen von Rechtstexten: eine Herausforderung an die Übersetzungswissenschaft.’ In de Groot G-R & Schulze R (eds.) Recht und Übersetzen. Baden-Baden: Nomos. 151–174.
Zaccaria G (ed.) (2000). Übersetzung in Recht/Translation in law. Münster: LIT Verlag.

Legal–Professional Language in Jury Trial
C Heffer, Nottingham Trent University, Nottingham, UK
© 2006 Elsevier Ltd. All rights reserved.

The study of legal–professional language in jury trial is concerned with how trial lawyers and judges use language in the context of legal cases tried by jury.

While some code switching occurs between legal English and Standard English, brought on by frequent changes in topic and audience, it is often difficult to distinguish these linguistic varieties within the context of jury trial. Legal professionals often produce their own hybrid variety of language, a ‘legal-lay discourse’ (Heffer, 2005), which is neither truly legal in its lexis and syntax, nor typical of everyday English. This discourse can partly be explained in terms of strategic tensions resulting from the need to persuade a lay jury to act in certain ways, but from within a highly constrained legal framework. Crucially, then, it is the complexity of the jury trial context which determines the complex nature of language use observable in court. This entry will sketch out some of the key features of that complex context and consider how they influence the nature of the language used. It will also attempt to show how variation in context across legal jurisdictions can result in quite distinct linguistic preferences.

The Narrative Context and Opening Statements

At the heart of all cases tried by jury is a story of wrongdoing. That story needs to be presented by the prosecution or plaintiff to the jurors and will then be deconstructed by the defense, who might also present alternative stories. In addition to this ‘natural’ narrative context are lawyers’ understandings of the process of jury decision-making. Lawyers have long understood the power of a good story, and this folk theory has recently found some support from psychological research into jury decision-making. Narrative coherence, for example, has been claimed to be more crucial to jurors than evidential weight (Bennett and Feldman, 1981), while a narrative form of presentation has been shown to enhance recall of story elements (Mandler, 1984). Both folk theory and science have fed into the advice of advocacy manuals (e.g., Evans, 1993), which now enthusiastically encourage narrative forms of presentation, and there is some empirical evidence that lawyers, particularly prosecution lawyers, do indeed follow narrative principles. In terms of the macroorganization of their cases, counsel will often try to ensure that their first witness is the one who can tell the story as completely as possible (e.g., a good eyewitness) and then subsequent witnesses will provide support for various elements of that story (e.g., forensic evidence). Counsel will often structure their examination-in-chief of the key ‘storytelling’ witnesses following a schema of natural narrative, such as ‘setting-episode,’ in which the scene is set and then the episode is recounted chronologically. Finally, lawyers are concerned not only with, as Jackson (1995) puts it, the ‘story in the trial’ (the story of wrongdoing) but also the ‘story of the trial’ (the events of the trial itself, including the impression made by witnesses and counsel).
The most crucial point of the trial for narrative presentation is the opening statement, for this will be the jury’s initial exposure to the prosecution’s story (and, in the U.S., that of the defense). Opening statements in most jurisdictions are supposed to be informative rather than persuasive, to provide an outline of the story and the legal issues rather than argue the case. They are potentially highly influential, though, since they construct a cognitive filter through which jurors will then view the subsequent evidence (Moore, 1989). The opening outline provides the core or master narrative while ‘satellite narratives fill out, elaborate, and extend the narrative through the information gathered during the examination of witnesses’ (Snedaker, 1991: 134). The following is an abridged and anonymized version of the outline summary from a dangerous driving case, annotated to show how the structure conforms to a schema of natural narrative (Labov, 1972):

Orientation
[When?] At about 2:00 in the morning of November 16 last [Who?] four off-duty policemen [What?] were driving along in a white Toyota Corona [Where?] from the Ludley direction to the Bedley direction.

Complicating Action (What happened then?)
As they came up to [the Busby] bends, the car behind them came up close and (. . .) went out into the oncoming lane, over the double white lines and overtook.

Evaluation (So what?)
Those bends are a number of sharp and, in some cases, blind bends, and they have got double white lines in the middle of the road which, as you know, means you must not overtake there.

Resolution (What finally happened?)
The car continued (. . .) until it disappeared out of sight round the next blind left-hand bend.
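For readers who want to work with such annotations computationally, the annotated outline can be represented as a small data structure. The following is only a minimal sketch: the names `Segment`, `SCHEMA_ORDER`, and `follows_schema` are hypothetical, and the strict ordering check is an assumption made for illustration (the source itself notes that evaluation can also occur throughout a narrative).

```python
from dataclasses import dataclass

# Section labels in the order used by the annotated outline (after Labov, 1972).
# Treating this order as strict is a simplification made for this sketch.
SCHEMA_ORDER = ["Orientation", "Complicating Action", "Evaluation", "Resolution"]


@dataclass
class Segment:
    section: str  # schema label from the annotation
    prompt: str   # guiding question(s) from the annotation
    excerpt: str  # short verbatim excerpt from the outline


outline = [
    Segment("Orientation", "When? Who? What? Where?",
            "four off-duty policemen"),
    Segment("Complicating Action", "What happened then?",
            "the car behind them came up close"),
    Segment("Evaluation", "So what?",
            "you must not overtake there"),
    Segment("Resolution", "What finally happened?",
            "it disappeared out of sight round the next blind left-hand bend"),
]


def follows_schema(segments):
    """Return True if section labels occur in non-decreasing schema order."""
    positions = [SCHEMA_ORDER.index(seg.section) for seg in segments]
    return positions == sorted(positions)


print(follows_schema(outline))
```

A representation of this kind would make it possible to compare how closely different opening statements follow the schema, although any such comparison depends on how the segments are annotated in the first place.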

Despite conforming to a natural narrative schema, there are a number of features in the complete narrative which mark it out as distinctive to trial narrative. In the first place, there is an apparent overspecification of orientational information. While the Orientation sentence above would be perfectly adequate for a newspaper narrative, here it is both preceded and followed with circumstantial detail such as the precise designation of the road, the names of all the passengers in the car, and a detailed description of the road. A second feature is the frequent blending of orientation and evaluation:

As they were driving along – it is a single carriage road, a country lane; an ‘A’ road, but in the countryside with hedgerows either side, no streetlights – they become aware of a car coming up behind them fast.

Here the barrister is not so much providing descriptive background as suspending the narration to stress the evaluative point he is making: that it was dangerous to overtake. This suggests a third feature of trial narrative: a distinctive use of evaluation. In Labov’s model of narrative, evaluation functions as both a primary and secondary structure. As a primary structure, Evaluation in trial narrative explicitly indicates the probative value of a piece of evidence, the ‘Point’ (Harris, 2001). This is indicated in several sections of the outline and then summed up in an evaluative Coda:

The prosecution would say that is an obvious piece of dangerous driving.

As a secondary structure, evaluation appears throughout the narrative and tends to function in a more holistic fashion, gradually building up an impression that will influence the jury as much affectively as cognitively, as in the following lexical chain suggesting danger:

no streetlights . . . fast . . . sharp . . . blind . . . close . . . in the middle of the night . . . obviously . . . cannot see

At this level of impression creation, the strategic choice of words can help prime the jury into viewing the events and participants in a given way, as Cotterill demonstrates in a detailed corpus linguistic analysis of lexical choices made in the opening statements of the O. J. Simpson criminal trial (2003: 68–83).

Narrative, then, is clearly central to case construction. Yet to a nonlegal observer, the evidential phase of a run-of-the-mill trial can appear quite unlike narrative. Although advocacy manuals insist that the aim in witness examination is for “the story to be told naturally, spontaneously and conversationally [thus] enhancing trust in the witness” (Stone, 1995: 95), in practice, extended witness narratives of any length are extremely rare. Much of the explanation for this discrepancy between the ‘deep’ narrative structure of cases and the superficial nonnarrativeness of trial genres can be found in the legal context of jury trial.

The Legal Context and Submissions

Language in jury trial is profoundly affected by both the law itself and by the mindset and customs of legal professionals. The substantive law both restricts the types of stories of wrongdoing that can be told in court and determines which elements of those narratives are legally relevant. The ‘matters in issue’ for legal professionals often differ from those considered relevant by lay participants, so that the narratives of lay litigants in small claims courts are often construed as legally inadequate (O’Barr and Conley, 1982). The narratives constructed by lawyers in court must fit the categories determined by the law and, in a criminal case, the legal elements of the indictment. This involves the lawyer controlling quite closely the direction a witness narrative takes and making sure that the ‘materially relevant’ evidential Points are being made. The narration is also constrained by rules of evidence. For example, lawyers must make sure the witness does not speculate or relate what others have said (hearsay).

There is also, though, a deeper level at which the legal context influences the language of legal professionals. As with most institutional discourse, lawyers and judges are heavily influenced by their professional training. Law students are trained to ‘think like lawyers’, and this is often different from lay ways of thinking (Mertz, 1996). Cases are viewed, for example, in terms of legal rules and principles and are effectively transformed from complex narratives of human vicissitudes to exemplars of decontextualized case categories. Such legal reasoning is demonstrated in advocates’ legal submissions to judges. In the English context, these are delivered orally in open court but in the absence of the jury. In the following extract from a submission, the barrister is attempting to persuade the judge of the legal correctness of dismissing one of the counts against the defendant, suggesting effectively that that particular legal story should not be told in court:

To deal with the other matter, it is a question of dangerous driving in Count 1 of the indictment. Your Honor, I make a submission of no case in respect of that, and I do it for this reason: (. . .) one has, in dangerous driving cases, really two different types of circumstance in which dangerous driving can arise. One presents an objective test; one a subjective test. Of course, the offense is driving which falls far below the standard of the ordinary, prudent, and competent motorist. So, one has the ‘far below’ aspect. The second point is that the driving must be dangerous. Danger is defined as there being a danger that there would be (and the Statute uses the words ‘would be’) injury to the person, or substantial damage to property . . .

While researchers have tended to focus on advocates’ attempts to influence the jury, counsel are equally concerned to convince the judge. More than 40% of English jury trials end in an acquittal ordered or directed by the judge, so this legal line of persuasion is potentially fruitful. It is also quite unlike that used on the jury. Here, the events of the trial are relevant only as an instantiation of a class of similar events which have been abstracted from time and place (‘in dangerous driving cases’). Rather than a sequence of causally related actions, the propositions are presented through existential processes in the timeless present (‘it is,’ ‘one has’). Hypotheses are formulated and then tested (‘one presents an objective test; one a subjective test’), and there is considerable reliance on definition (‘Danger is defined as there being . . .’) and on the precise wording of authoritative written texts (‘the Statute uses the words “would be”’). In producing such discourse, the advocate is demonstrating his skills of legal argumentation to his professional superior, and although the submission in this case fails, he is praised by the judge for its construction: “very clearly made, if I may say so.”

The Adversarial Context and Objection Sequences

While the deep structure of the trial might be determined by narrative and the law, the surface structure is more akin to classical debate. In a criminal trial, the prosecution sets forth its Proposition motion (that the defendant is guilty of certain crimes) in the Indictment. They then introduce the Argument For conviction in the Opening statement. This argument is developed through their own witnesses. Then the Defense presents the Argument Against conviction through its witnesses. Finally, Prosecution and Defense produce Perorations for and against the motion in their Closing Arguments. In addition to this overall adversarial structure, trial procedure allows for stories presented in court to be challenged at three other levels. First, the ‘tellability’ of the story itself can be challenged through legal submissions, as in the submission for no case above. Second, the satellite narratives of the witnesses can be challenged through cross-examination. Finally, opposing counsel can object at any point to the examiner’s line of questioning.

While the highly adversarial nature of jury trial discourse is indisputable, the extent to which it is manifested before the jury varies considerably across jurisdictions. This is particularly evident in the handling of objections in U.S. vs. English courts. In U.S. courts, objections are both frequent and formulaic. Although admittedly an extreme case, the O. J. Simpson criminal trial contained more than 16 000 objection sequences (Cotterill, 2003: 95). Perhaps due to this frequency of use, a sublanguage has developed for performing objections, as illustrated in this example from the O. J. trial (2003: 95):

Q: And Miss Mazzola was under the impression that you had left the Rockingham scene earlier than 5:20, wasn’t she?
Mr. Goldberg: Calls for speculation.
The Court: Overruled.

Opposing counsel often initiate the objection with the legal performative ‘Objection.’ This is followed by an obligatory indication of the evidentiary cause for the objection, generally taken from a finite set of legal categories. Finally, the judge allows or disallows the objection using the terms ‘Sustained’ or ‘Overruled.’

In contrast, objection sequences are quite rare in English jury trials. Judges can and do intervene directly to object to or redirect a line of questioning. Furthermore, objections are perceived by barristers to be disfavored both by judges (for interrupting the flow of the examination) and jurors (for concealing information from them). When objections are made, then, they tend to be much less formulaic and more deferential:

Mr. Spurs: If she said to us she hit the car, can you explain why that might be?
Mr. Rider: No, your Honor. With the greatest of respect, this lady cannot explain why anybody else said something.
The Recorder:* That must be right.
(* A part-time judge)
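The formulaic U.S. objection sequence described above (an optional performative ‘Objection,’ an obligatory evidentiary ground, and a ruling of ‘Sustained’ or ‘Overruled’) lends itself to a simple pattern check on transcript turns. The sketch below is illustrative only: the function `parse_objection` and the `ObjectionSequence` type are hypothetical names, and the assumption that each turn is written as ‘Speaker: utterance’ is a convention adopted here, not a property of any particular transcript corpus.

```python
import re
from typing import NamedTuple, Optional

RULINGS = {"Sustained", "Overruled"}  # the two ruling terms named in the text


class ObjectionSequence(NamedTuple):
    counsel: str        # who objected
    performative: bool  # whether the turn opened with 'Objection'
    ground: str         # the evidentiary ground given
    ruling: str         # the judge's ruling


def parse_objection(counsel_turn: str, court_turn: str) -> Optional[ObjectionSequence]:
    """Return the parsed sequence, or None if the turns do not fit the formula."""
    speaker, _, utterance = counsel_turn.partition(":")
    court_speaker, _, ruling = court_turn.partition(":")
    ruling = ruling.strip().rstrip(".")
    if court_speaker.strip() != "The Court" or ruling not in RULINGS:
        return None
    utterance = utterance.strip()
    match = re.match(r"Objection[.,]?\s*", utterance)
    ground = utterance[match.end():] if match else utterance
    ground = ground.strip().rstrip(".")
    if not ground:
        return None  # the ground is described as obligatory
    return ObjectionSequence(speaker.strip(), match is not None, ground, ruling)


us_example = parse_objection("Mr. Goldberg: Calls for speculation.",
                             "The Court: Overruled.")
english_example = parse_objection(
    "Mr. Rider: No, your Honor. With the greatest of respect, this lady "
    "cannot explain why anybody else said something.",
    "The Recorder: That must be right.")
print(us_example)
print(english_example)
```

Under these assumptions the O. J. Simpson exchange fits the formula while the English exchange does not, which mirrors the contrast drawn in the text; a real transcript study would of course need a far more tolerant treatment of speaker labels and wording.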

The contrasting form and frequency of objection sequences is indicative of a general difference in approach to adversariality in the two jurisdictions, which might be characterized as ‘cooperative’ and ‘aggressive.’ In the cooperative mode, English barristers are confined to their tables when they speak, dress in the same manner, and tend to resolve disagreements privately rather than make public objections. Jurors are imposed by the court rather than selected adversarially, and the defense does not present a counter opening statement. Judges take a more active role in ensuring that proceedings are civil and flow smoothly and then assist the jury by summarizing the evidence before deliberation. Research with mock jurors suggests that the more cooperative English trial procedures allow better recall of trial evidence and lead to a perception that the trial was more civil and fairer than American trials (Collett and Kovera, 2003). On the other hand, there is a greater risk of both nonverbal and verbal influence from the more powerful judge.

The Persuasive Context and Closing Arguments

Clearly, the central goal of trial lawyers is to persuade the jury and judge of their case. This persuasion process, then, occurs throughout the trial, either overtly or covertly. In opening statements, counsel persuade in part through their construction of narrative. In witness examination, they exploit the asymmetrical speaking rights and clear power differential to put across their case to the jury fairly covertly. This is achieved both through the strategic use of questions and through a wide range of other pragmatic strategies designed to enhance or diminish the credibility of the witness (‘person targeted’) or influence the perception of the targeted events themselves (‘idea targeted’) (Gibbons, 2003: 112). These include, among others, the strategic use of silence and interruption, presupposition in the question, reformulation of the witness’s answer in a more negative light, and the use of address forms and personal pronouns to create solidarity or distance.

The closing argument, though, is the phase of the trial where the persuasion process is most overt and most adversarial. Although we are lacking clear empirical evidence of its effect on the jury, trial lawyers tend to attribute great importance to the speech and to see it as their main performance event in the trial (Walter, 1988). Here advocates recount the story of the trial as well as the story in the trial. As elsewhere in the trial, persuasion in closing arguments can occur at all linguistic levels, from subtle phonetic features, through lexical and grammatical choice, to higher levels of discourse organization. Equally important in the persuasion process is the integration of the verbal and the visual. Advocacy manuals stress the importance of visual evidence during witness examination, in the form of realia, photographs, maps, and diagrams which illustrate and bring to life the witness’s account. In one of the closing arguments of the O. J. Simpson criminal trial, the prosecutor famously constructed a jigsaw puzzle which made up a picture of the ‘murderer.’ Even without sophisticated visual aids, though, advocates can integrate the verbal and visual through metaphor. Defense lawyers often talk of the picture or impression created by the evidence. Cotterill (2003: 208–217) shows how the prosecution and defense in the O. J. trial both use the ‘jigsaw puzzle’ metaphor to argue their respective cases. While the jigsaw is visually effective, it does have the weakness that there are always missing pieces of evidence. Consequently, Marcia Clark attempts to persuade the jury that those pieces are peripheral to the central picture (Cotterill, 2003: 216–217):

In order to get the picture, to know what a jigsaw puzzle is depicting, if you’re missing a couple of pieces of the sky, you still have the picture. You know, for example, it shows a house and, you know, and a dog and a kid in the yard and that sort of thing, you can see the picture. You miss a couple of pieces of the sky sometimes, you do lose those pieces, no big deal. You’ve got the picture . . . you’ve got all the necessary pieces of the puzzle . . .

On the other hand, the defense note that ‘the prosecution took a photograph or picture of O. J. Simpson first, then they took the pieces apart’ (Cotterill, 2003: 218), essentially accusing the prosecution of finding the evidence to fit the picture. The jury’s impression of counsel, protagonists in the story of the trial, can be an important part of the persuasion process. Consequently, impression management is an important rhetorical strategy, as Hobbs (2003) shows in an analysis of a U.S. prosecutor’s rebuttal argument (one delivered after the defense closing in some U.S. jurisdictions). However, it is wrong to think that trial lawyers focus solely on the narrative aspects of the trial in their closings. They also attempt to bridge the gap between storytelling and the legal categories to which the jury will have to fit the evidence. Indeed, there is some empirical evidence to suggest that ‘legal-expository’ closing arguments, in which legal elements are outlined along with the evidence that supports or fails to support those elements, might be more effective for both plaintiffs and defendants in civil jury trials than narratively organized closings (Spiecker and Worthington, 2003).

The Pedagogic Context and Summing-up

This last point suggests a context to jury trial which is generally overlooked: its pedagogic aspect. When jurors come to their task, they might have little understanding of the law, trial procedure, or the forensic sciences. If they are to perform their task well, they will have to learn a great deal in a short time. While examining counsel are expected to get across technical concepts as clearly as possible, judges appear to lack the same obligation to communicate legal concepts and procedures effectively. Jury instructions are infamous for their complex and often archaic legal language which, in many U.S. jurisdictions, has to be read out verbatim. Even where judges are permitted some linguistic discretion, they will not always accommodate the lay participants. In a detailed study of the taking of guilty pleas, Philips (1998) distinguished two types of judges: ‘procedure-oriented’ and ‘record-oriented.’ While the former will tailor the plea-taking procedure to the perceived due process needs of the particular defendant, the latter are concerned to standardize that procedure so that they will not be overturned at appeal. The result is that defendants are less likely to understand their rights. Similarly, Heffer (2002) found that English judges, given a model of the burden and standard of proof instructions in relatively plain English, will either accommodate even more to lay language or will revert back to a more legal wording. The same phenomenon can be observed at an institutional level: higher tribunals in many U.S. states appear to focus more on issues of standardization, while the English Court of Appeal is more apt to allow judges to tailor the jury instruction process to the needs of the lay jury.

The tension between legal standardization and effective communication can also be seen in the different approaches to the summing-up of evidence. Most U.S. state jurisdictions do not allow judges to review or comment on the evidence. This ensures consistency and eliminates the danger of judicial bias influencing the jury’s decision, one of the main criticisms of the English summing-up (Robertshaw, 1998). On the other hand, jurors are given no help in the difficult task of applying the law to the evidence. In discussing a ‘strong’ Tasmanian rape case that led to a hung jury, Henning notes that:

. . . in the foreign and weighty circumstances of a criminal trial, it can be easy for lay people with little experience of trials to misjudge or misestimate the significance of particular matters, to lose sight of the wood for the trees (Henning, 1999: 212–213).

He thus stresses the need to guide the jury to ‘a proper evaluation of the evidence,’ using organizational and rhetorical techniques appropriate to such guidance. Currently, when judges do make explicit comments on the evidence, they tend to hedge and disclaim to such an extent that it is often difficult to extract the evaluative point. The language of judicial comments is arguably unique, particularly in the way that judges disclaim their comments through the forms ‘you may think’ and ‘it is a matter for you’ (Heffer, 2005) and raises the question as to how such a unique form of discourse is received by jurors. However, in considering the possible effect of the linguistic strategies adopted by judges, it is impossible to separate that language from the nature of the evidence and the case. Where a case hinges on one officer’s uncorroborated identification of the defendant, a comment on reliability, however hedged, is likely to be influential:

I am certainly not going to suggest to you that because Constable Bowles is an experienced policeman that he is any more reliable than you or I or anybody else necessarily in identifying people. That would be a big mistake. What I do suggest is this. That one should take into account the fact that this was not a chance meeting between people who did not expect to meet each other. He was wholly interested in that car. That is, as I say, the only reason why he was there. So, it is a matter for you, but you may think that that is something that may cause his identification to be the more reliable. However, as I say, that is entirely a matter for you to assess.

Here the explicit comment on reliability is preceded by a number of clear evaluations supporting that comment: ‘experienced policeman,’ ‘not a chance meeting,’ ‘wholly interested,’ ‘the only reason.’ It is doubtful, then, whether the heavy disclamation in the last two sentences makes the comment any less influential. There is a fine line, then, between guiding a jury to a reasoned evaluation of the evidence and influencing them unfairly. Yet to pretend there is no responsibility for guidance is to ignore the real difficulty of the jury’s task.

The Moral Context and Sentencing

While judges are constrained in the nature of their comments during the evidential phase and the summing-up, they are quite free to express moral judgment and epistemic certainty in the act of sentencing. In Early Modern trials, it was not uncommon for the prosecutor to produce extremely negative characterizations of the defendant, as in this outburst from the Lord Chief Justice in the trial of Lady Lisle (Helsinki Corpus of Historical Texts):

LCJ: Thou art a strange, prevaricating, shuffling, sniveling, lying rascal.

Analyzed according to the systemic–functional framework of JUDGEMENT, a semantic resource ‘for construing moral evaluations of behavior’ (Martin, 2000), this one comment reveals a full range of negative evaluations of social esteem and sanction:

LCJ: Thou art a strange [– normality], prevaricating [– reliability], shuffling [– reliability/capacity], sniveling [– capacity], lying [– veracity] rascal [– propriety].
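An annotation of this kind can be tallied mechanically once it is written in a consistent bracket notation. The following is a minimal sketch rather than an established tool: the bracket format with an ASCII hyphen in place of the en dash, the function name `tally`, and in particular the assignment of categories to ‘social esteem’ and ‘social sanction’ are assumptions made for illustration and are not spelled out in the passage above.

```python
import re
from collections import Counter

# The annotated comment from the text, with ASCII '-' marking negative polarity.
ANNOTATED = ("Thou art a strange [- normality], prevaricating [- reliability], "
             "shuffling [- reliability/capacity], sniveling [- capacity], "
             "lying [- veracity] rascal [- propriety].")

# Assumed grouping of categories (an illustration, not stated in the text).
ESTEEM = {"normality", "capacity", "reliability"}
SANCTION = {"veracity", "propriety"}

# word, polarity sign, and one or more slash-separated categories
TAG = re.compile(r"(\w+) \[([+-]) ([\w/]+)\]")


def tally(annotated):
    """Count (polarity, category) pairs; a double tag counts once per category."""
    counts = Counter()
    for _word, polarity, categories in TAG.findall(annotated):
        for category in categories.split("/"):
            counts[(polarity, category)] += 1
    return counts


counts = tally(ANNOTATED)
esteem_total = sum(n for (_, cat), n in counts.items() if cat in ESTEEM)
sanction_total = sum(n for (_, cat), n in counts.items() if cat in SANCTION)
print(counts)
print(esteem_total, sanction_total)
```

Counting a doubly tagged item such as ‘shuffling’ once under each of its categories is a design choice of this sketch; an analyst might equally weight it as half a token in each.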

However, evidentiary rules now prevent cross-examiners from being so blunt, with the result that legal-professional judgment of others’ behavior is construed lexically only rarely during most stages of the trial (Heffer, 2005). The nearest equivalent to the above that can be found in a modern corpus of cross-examinations is a negative suggestion that the defendant is lying again:

Q. It is a lie that you are just telling this jury today, is it not, another lie?

On the other hand, it is still common in the act of sentencing for the judge to spell out the full extent of a defendant’s moral impropriety:

Your record as far as sexual offenses are concerned is quite appalling. Indeed, in this case on the last occasion when you appeared in front of me, I said, and I still think and believe, that your conduct was verging on the satanic. In relation to counts 8, 9, and 10, the rape, buggery, and false imprisonment, you behaved quite atrociously. You are an evil man and you are a wicked man.

Such moral outbursts do in fact serve a legal purpose because severity of punishment is linked to the “relative moral wickedness of different offenders” (Hart, 1962: 37). It is the lexical inscription of extreme moral wickedness (quite appalling [sexual offenses record]; satanic [conduct]; [behaving] quite atrociously; evil [man]; wicked [man]) that justifies institutionally and socially the life sentence handed down to the convicted. There is a danger, though, in expressing moral or epistemic certainty in this way. Solan notes the tendency of appeals court judges, mindful of the severe consequences to the parties that their decisions will have, “to issue pronouncements as though there were no alternative” (1993: 13). Just as such judges will make out that their decisions are based on inviolable legal principle, so some lower court judges will tend to suggest, once the verdict is in, that the evidence presented before the court was incontrovertible. In sentencing six suspected IRA terrorists for the murder of 21 people in two pub bombings in Birmingham, United Kingdom, in 1974, the judge pronounced: “You stand convicted on each of twenty-one counts, on the clearest and most overwhelming evidence I have ever heard.” Given that the convictions of the Birmingham Six were finally quashed in 1991 after it had been proven that the evidence against them had been very weak indeed, one has to ask whether the evidence in the original trial really appeared so ‘overwhelming’ or whether the judge felt compelled to justify an essentially unsafe verdict.

Conclusion

While the bulk of research on the language of jury trial has focused on the interactional dynamics of witness examination and the legal wording of jury instructions, this entry has attempted to show that the field encompasses much more than this, and that there are rich veins open for future research. The complex context of jury trial makes it a fascinating site for linguistic analysis, but research in this area also has clear applications for both improving legal professional communication and reforming the jury trial system.

See also: Jury Instructions; Witness Examination.

Bibliography

Bennett W L & Feldman M S (1981). Reconstructing reality in the courtroom. London: Tavistock.
Collett M & Kovera M (2003). ‘The effects of British and American trial procedures on the quality of juror decision-making.’ Law and Human Behavior 27(4), 403–422.
Cotterill J (2003). Language and power in court: a linguistic analysis of the O. J. Simpson trial. Basingstoke: Palgrave.
Evans K (1993). The golden rules of advocacy. London: Blackstone Press.
Gibbons J (2003). Forensic linguistics: an introduction to language in the justice system. Malden, Mass/Oxford: Blackwell.
Harris S (2001). ‘Fragmented narratives and multiple tellers: witness and defendant accounts in trials.’ Discourse Studies 3(1), 53–74.
Hart H L A (1962). Law, liberty and morality. Oxford: Oxford University Press.
Heffer C (2002). ‘“If you were standing in Marks and Spencers”: narrativisation and comprehension in the English summing-up.’ In Cotterill J (ed.) Language in the legal process. Houndmills: Palgrave. 228–245.
Heffer C (2005). The language of jury trial: a corpus-aided linguistic analysis of legal-lay discourse. Houndmills: Palgrave.
Henning T (1999). ‘Judicial summation: the trial judge’s version of the facts or the chimera of neutrality.’ International Journal for the Semiotics of Law 12, 171–213.
Hobbs P (2003). ‘“Is that what we’re here about?”: a lawyer’s use of impression management in a closing argument at trial.’ Discourse and Society 14(3), 273–290.
Jackson B S (1995). Making sense in law. Liverpool: Deborah Charles Publications.
Labov W (1972). Language in the inner city. Philadelphia: University of Pennsylvania Press.
Mandler J M (1984). Stories, scripts and scenes: aspects of schema theory. Hillsdale, NJ: Lawrence Erlbaum.
Martin J R (2000). ‘Beyond exchange: appraisal systems in English.’ In Hunston S & Thompson G (eds.) Evaluation in text: authorial stance and the construction of discourse. Oxford: Oxford University Press. 142–175.
Mertz E (1996). ‘Recontextualization as socialization: text and pragmatics in the law school classroom.’ In Silverstein M & Urban G (eds.) Natural histories of discourse. Chicago: University of Chicago Press. 229–249.
Moore A (1989). ‘Trial by schema: cognitive filters in the courtroom.’ UCLA Law Review 37, 273–340.
O’Barr W M & Conley J M (1982). ‘Litigant satisfaction versus legal adequacy in small claims court narratives.’ In Levi J N & Walker A G (eds.) Language in the judicial process. New York: Plenum Press. 97–131.
Philips S (1998). Ideology in the language of judges. Oxford: Oxford University Press.
Robertshaw P (1998). Summary justice. London: Cassell.
Snedaker K (1991). ‘Storytelling in opening statements: framing the argumentation of the trial.’ In Papke D (ed.) Narrative and the legal discourse. Liverpool: Deborah Charles. 132–157.
Solan L M (1993). The language of judges. Chicago: University of Chicago Press.
Spiecker S & Worthington D (2003). ‘The influence of opening statement/closing argument organizational strategy on juror verdict and damage awards.’ Law and Human Behavior 27(4), 437–456.
Stone M (1995). Cross-examination in criminal trials. London: Butterworths.
Walter B (1988). The jury summation as speech genre. Amsterdam: John Benjamins.

Lehiste, Ilse (b. 1922)
L Shockey, University of Reading, Reading, UK
© 2006 Elsevier Ltd. All rights reserved.

Ilse Lehiste was born on January 31, 1922 in Tallinn, Estonia. The Baltic countries suffered heavily in World War II, undergoing multiple occupations: accordingly, her first postgraduate degree was undertaken in Germany. She received her Doctorate of Philosophy at the University of Hamburg in 1948, having specialized in English and Old Norse philology and writing on Old Norse sources in the works of William Morris. She lectured at that institution until 1949, when she emigrated to the USA. Having conducted a large part of her professional life in Estonian, Russian, and German, the young Ilse then launched into lecturing in English: she became an associate professor of Germanic Philology at Kansas Wesleyan in 1950. In 1951, she moved to Michigan, where she worked as Associate Professor of Languages at Detroit Institute of Technology. Six years later, she made a smaller move to Ann Arbor, where she became a research associate and graduate student in the Communication Sciences Laboratory of the University of Michigan. She received her Ph.D. in Linguistics in 1959. While at Michigan, she worked with Gordon Peterson, who was head of the laboratory, and several of her early publications were done in collaboration with him. Her history of experimental investigation of suprasegmental aspects of spoken language dates from this time, as revealed by publications in the early 1960s on the acoustic-phonetic realization of juncture, stress, accent, intonation, and quantity. These works set the tone of much subsequent phonetic research by herself and others.

In 1963, Lehiste was appointed as Associate Professor at the Ohio State University in Columbus, becoming a full professor two years later. Through her success in obtaining research grants, she improved the Ohio State phonetics laboratory and in the years following supervised the research of many graduate students, a number of whom are, in the early 1990s, active in phonetic teaching and research in the USA and elsewhere. Though research was (and is) her activity of choice, she did not ignore teaching, specializing in courses in acoustic and articulatory phonetics and historical linguistics. Two of her books, Readings in Acoustic Phonetics (1967) and Principles and Methods for Historical Linguistics (1979, with Robert Jeffers), reflect these interests.

Her work in acoustic phonetics, while wide-ranging, has centered around investigating the salience of suprasegmental cues in spoken language, especially temporal ones. (It is probably no coincidence that she is a native speaker of a language with three degrees of phonemic length in both consonants and vowels and with related phonotactic restrictions.) Several experiments have dealt with changes in temporal structure when a linguistic unit is used in higher-order constructions, for example what happens temporally to a stem morph when it is embedded in longer and longer words (stick, sticky, stickier); what are the temporal cues for word boundaries (why choose versus white shoes); and what are the phonetic cues for disambiguation of grammatical structure (lighthouse keeper versus light housekeeper). The interrelation of timing and fundamental frequency as cues has also been a consistent theme, as typified by her experiments with Serbo–Croatian short and long tones. Her search for temporal units of speech production has provided fuel for the lively debate on the role of syllables, feet, and morae in phonological theory. Despite her interest in speech production, the majority of her investigations have been in speech and language perception. Her emphasis on the acoustic aspects of speech is based on the theory that the salient articulatory properties of spoken language can be recovered from the acoustic signal.

Her work in historical linguistics is a natural development of her early research on Germanic philology and of the practical observation of languages in contact in her native country. Several of her publications deal with areal phenomena in both written and spoken language. Between 1963 and her retirement in 1987, Ilse Lehiste published four books (arguably the most influential of which, Suprasegmentals, appeared in 1970), five other monographs, about 100 articles,
and over 60 reviews. The majority of these were in experimental phonetics, but her work is also strongly represented in historical linguistics, Slavic and Finno-Ugric linguistics, and literary analysis and criticism. Retirement provided her with an opportunity to do yet more research, and she was in 1992 the recipient of a grant to investigate the phonetic realization of metrical structure in orally produced poetry. Though she would like to cover all the languages with which she is familiar, she limited herself (for this project) to English, Estonian, Swedish (as spoken both in Sweden and Finland), Finnish, Latvian, Lithuanian, Icelandic, Faroese, Serbo–Croatian, and Hungarian.

She is a member of over 20 learned societies and was president of the Linguistic Society of America in 1980. Her many honors include Honorary Doctorates from the University of Essex in 1977, from Lund University in 1982, and from Tartu University in 1989. She is a Fellow of the American Academy of Arts and Sciences. Ilse Lehiste has, throughout her career, collected data in order to contribute to linguistic theory. Her interests lie not only in how speech is produced and perceived, but in what facts about speech can tell about the cognitive representation of language. Her attitude may be best deduced from a quotation from Suprasegmentals: ‘A phonologist ignores phonetics at his own peril.’

In 1994, she won the Alumnae Athena Award from the University of Michigan; in 1998, she was elected as Foreign Member of the Finnish Academy of Science and Letters; and in 1999, she was awarded an Honorary Doctor of Humane Letters from Ohio State University. She was awarded the Order of the White Star, 3rd Class, by the Republic of Estonia in 2001; in 2002 she won the Medal for Scientific Achievement from the International Speech Communication Association; and in 2003 she won the Kay Elemetrics Award from the International Society of Phonetic Sciences. She gave the keynote address at the 2002 International Conference on Spoken Language Processing in Denver, Colorado; an invited paper at the 2004 International Conference: Speech Prosody, in Nara City, Japan (on ‘Prosody in speech and singing’); and an invited paper at the 2004 International Symposium on Tonal Aspects of Languages in Beijing (‘Bisyllabicity and tone’).

See also: Experimental and Instrumental Phonetics: History; Phonetics, Acoustic; Phonetics, Articulatory; Prosodic Aspects of Speech and Language.

Bibliography

Channon R & Shockey L (eds.) (1987). In honor of Ilse Lehiste. Dordrecht: Foris.
Jeffers R J & Lehiste I (eds.) (1979). Principles and methods for historical linguistics. Cambridge, MA: MIT Press.
Joseph B D, De Stefano J, Jacobs N & Lehiste I (eds.) (2003). When languages collide: perspectives on language conflict, language competition, and language coexistence. Columbus: The Ohio State University Press.
Lehiste I (ed.) (1967). Readings in acoustic phonetics. Cambridge, MA: MIT Press.
Lehiste I (1970). Suprasegmentals. Cambridge, MA: MIT Press.
Lehiste I (2000/2001). Keel kirjanduses. Tartu: Ilmamaa.
Lehiste I & Ross J (eds.) (1997). Estonian prosody: papers from a symposium. Tallinn: Institute of Estonian Language.
Lehiste I, Aasmae N, Meister E, Pajusalu K, Teras P & Viitso T-R (2003). Erzya prosody. Helsinki: Société Finno-Ougrienne.
Ross J & Lehiste I (2001). The temporal structure of Estonian runic songs. Berlin, New York: Mouton de Gruyter.

Lehmann, Winfred Philipp (b. 1916)
E S Firchow, University of Minnesota, Minneapolis, MN, USA
© 2006 Elsevier Ltd. All rights reserved.

Winfred Philipp Lehmann was born on June 23, 1916, near Surprise, Nebraska, the son of a Lutheran minister. He spent his early childhood in a German-speaking community until 1919, when his family moved to an English-speaking parish. He attended Northwestern Academy High School in Wisconsin from 1928 to 1932, when he entered Northwestern College in Watertown, Wisconsin, where he studied liberal arts and comparative philology (B.A. 1936). Graduate studies in Germanic linguistics followed from 1937 to 1941 at the University of Wisconsin (Madison). There he studied with Einar Haugen, Roe-Merrill Heffner, Alexander Hohlfeld, Alfred Senn, and William Freeman Twaddell. He received his M.A. from the university in 1938 and his Ph.D. in 1941. In 1940, he married Ruth Preston Lehmann (née Miller), a professor of English at the University of Texas who specialized in Old English and Old Irish. She died in 2000.

From 1942 to 1946, Lehmann served in the American Army Signal Corps. In 1943, he became instructor and officer-in-charge at the Japanese Language School in Arlington, Virginia. In 1946, he joined the German faculty at Washington University in St Louis, leaving in 1949 to take up an associate professorship in Germanic languages at the University of Texas (Austin). In 1952, Lehmann was promoted to full professor, and from 1966 until his retirement in 1986 he also served as professor of linguistics. An able administrator, he chaired the department of Germanic languages from 1957 to 1964, and in 1961 became the director of the Linguistics Research Center. He was appointed Ashbel Smith Professor of Germanic Languages from 1963 to 1983, chaired the linguistics department from 1966 to 1972, and was Temple Centennial Professor in the Humanities from 1983 to 1986. He was well known for his good teaching, and his visiting professorships included the University of Marburg, Germany (1964); the University of Illinois, Urbana (Collitz Professor, 1968); and the State University of New York, Oswego (1976).

Lehmann has held numerous offices and received many honors. He was director of the Georgetown English Language Program in Ankara, Turkey (1955–1956); was named corresponding fellow of the Institute for German Language in Mannheim (1969); and in 1975 was made a member of the Royal Danish Academy of Sciences. In 1974, he served as chair of the linguistic delegation, and in 1981 as cochair of the Commission on Humanities and Social Sciences, to the People’s Republic of China. From 1974 to 1978, Lehmann acted as chairman of the board of trustees for the Center for Applied Linguistics in Washington, DC. From 1979 to 1986, he acted as a member of the Advisory Board of the Guggenheim Memorial Foundation, and from 1972 to 1986 as secretary of the board of directors of the American Council of Learned Societies. He has held numerous fellowships, including a Fulbright (1950–1951) to Norway and a Guggenheim (1972–1973). He is a fellow of the American Association for the Advancement of Science (since 1963); received the Brothers-Grimm-Prize (Marburg 1974); holds honorary doctorates from the State University of New York, Binghamton (1985), and from the University of Wisconsin (1995); and in 1987 was awarded the Order of Merit (Commander’s Cross) of the Federal Republic of Germany and the Pro bene meritis Award at the University of Texas. Lehmann is a member of numerous scholarly societies, among them the Linguistic Society of America (president, 1973), Modern Language Association (president, 1987), South-Central Modern Language Association (president, 1982), Association for Computational Linguistics (president, 1964), and Linguistic Association of the Southwest (president, 1975).

His research interests center around Indo-European linguistics, particularly reconstruction, as well as the phonology and syntax of Proto-Indo-European. Originally trained in structural linguistics, Lehmann has broadened his interests to include general linguistics and the history and development of American linguistics. In the historical Germanic field, he traced the development of the Germanic verse form and investigated Gothic etymology (with H.-J. J. Hewitt, 1986) and Middle High German vocabulary (especially Walther von der Vogelweide, with R.-M. Heffner, 1940). Together with his wife Ruth Lehmann, he authored an introduction to Old Irish and to Biblical Hebrew (with E. Raizen and H.-J. J. Hewitt, 1999). Lehmann has also done a great deal to further the teaching of German in the English-speaking world, including contributing to a widely used textbook for beginning German (with H. Rehder and G. Schulz-Behrend, 1962). He has published in English, German, and Russian.

Lehmann’s numerous publications include Historical linguistics (1962), Descriptive linguistics (1972), Theoretical bases of Indo-European linguistics (1993), Pre-Indo-European (2002), as well as many edited collections, essays, and reviews in scholarly journals. He was honored by Festschriften in 1977, 1992, and 1999. His biographies appear in Who’s Who in the World, Who’s Who in America, Who’s Who in Science and Engineering, Who’s Who in American Education, Who’s Who in the South and Southwest, Wer ist Was in der Sprachwissenschaft, and Internationales Germanistenlexikon (1800–1950).

See also: Haugen, Einar (1906–1994); Logic and Language: Philosophical Aspects; Origin of Language Debate.

Bibliography

Lehmann W P (1952). Proto-Indo-European phonology. Austin, TX: University of Texas Press and Linguistic Society of America.
Lehmann W P (1953). The alliteration of Old Saxon poetry. Oslo: Aschehoug.
Lehmann W P (1956). The development of Germanic verse form. New York: Gordian Press.
Lehmann W P (1962). Historical linguistics: an introduction. London; New York: Routledge [3rd edition, 1992].
Lehmann W P (1972). Descriptive linguistics: an introduction. New York: Random House [2nd edition, 1976].
Lehmann W P (1974). Proto-Indo-European syntax. Austin, TX: University of Texas Press.
Lehmann W P (1977). Studies in descriptive and historical linguistics: festschrift for Winfred P. Lehmann. Edited by Paul J. Hopper; with the collaboration of Harriet G. Penensick and Jerome Bunnag. Amsterdam: Benjamins.
Lehmann W P (1981). Linguistische Theorien der Moderne. Bern, Las Vegas: P. Lang.
Lehmann W P (1983). Language: an introduction. New York: Random House.
Lehmann W P (1986). A Gothic etymological dictionary. Leiden: E. J. Brill.
Lehmann W P (1993). Theoretical bases of Indo-European linguistics. London, New York: Routledge.
Lehmann W P (2002). Pre-Indo-European. Washington, DC: Institute for the Study of Man.
Lehmann W P & Stachowitz R (1973). Development of German-English machine translation system. Griffiss Air Force Base, NY: Rome Air Development Center, Air Force Systems Command.

Leibniz, Gottfried Wilhelm (1646–1716)
M Piot, Université Stendhal-Grenoble 3, Grenoble, France
© 2006 Elsevier Ltd. All rights reserved.

Gottfried Wilhelm Leibniz (Figure 1) was a German philosopher (the designer of one of the great systems of philosophy), mathematician (co-inventor, independently of Isaac Newton, of differential and integral calculus), logician, scientist, diplomat, librarian, lawyer, historian, and linguist. He is often described as the last universal genius: a thinker whose range extended to all that was known in his day. He was born in Leipzig, Germany, on July 1, 1646, of Slavic descent, the son of Frederick Leibniz, a jurisconsult and professor of moral philosophy at the University of Leipzig, and of Catherine Schmuck, the daughter of a doctor and professor of law. In 1600, the emperor Rudolf conferred on his great-uncle, a fighting captain in Hungary, a title of nobility and the arms Leibniz bore. His father died before he was 6 years old, and his mother took care of his education. By the time he was 12, he had taught himself to read Latin easily, and had begun Greek with the great library he had inherited from his father. Intellectually precocious, he entered the University of Leipzig at the age of 15 as a law student, obtaining his baccalaureate in 1663. Before he was 20, he had mastered the ordinary textbooks on mathematics, philosophy, theology, and law.

Figure 1 Gottfried Wilhelm Leibniz.

In 1666, the University of Leipzig declined to confer the degree of doctor of law upon him, because of his youth, and he went instead to the University of Altdorf (near Nuremberg), where his dissertation gained him not only the doctorate, but the offer of a professorship. An essay written on the study of law was dedicated to the Elector of Mainz, and led to his appointment by the elector, from which he was subsequently promoted to the diplomatic service. In the latter capacity, Leibniz drew up a scheme proposing to offer German cooperation. In 1672, Leibniz went to Paris on the invitation of the French government to explain the details of the scheme, but nothing came of it. In Paris, where he remained until 1676, he met the French philosophers Malebranche and Arnauld and studied Pascal’s mathematical works. He also met the astronomer and mathematician Huygens, and their conversation led Leibniz to study geometry, which he described as opening a new world to him, although he had already written tracts on various minor points in mathematics (the most important being a paper on combinations written in 1668) and built a working model of an arithmetical calculating machine (the basis of his election to the Royal Society in 1673). This was the first mechanical calculator capable of multiplication and division (improving on Pascal’s first calculating machine, which was capable of addition and subtraction). He also developed the modern form of the binary numeral system used in the digital computer.

Paul J. Hopper; with the collaboration of Harriet G. Penensick and Jerome Bunnag. Amsterdam: Benjamins. Lehmann W P (1981). Linguistische Theorien der Moderne. Bern, Las Vegas: P. Lang. Lehmann W P (1983). Language: an introduction. New York: Random House. Lehmann W P (1986). A Gothic etymological dictionary. Leiden: E. J. Brill.

Lehmann W P (1993). Theoretical bases of Indo-European linguistics. London, New York: Routledge. Lehmann W P (2002). Pre-Indo-European. Washington, DC: Institute for the Study of Man. Lehmann W P & Stachowitz R (1973). Development of German-English machine translation system. Griffiss Air Force Base, NY: Rome Air Development Center, Air Force Systems Command.

Leibniz, Gottfried Wilhelm (1646–1716)
M Piot, Université Stendhal-Grenoble 3, Grenoble, France © 2006 Elsevier Ltd. All rights reserved.

Gottfried Wilhelm Leibniz (Figure 1) was a German philosopher (the designer of one of the great systems of philosophy), mathematician (co-inventor, independently of Isaac Newton, of differential and integral calculus), logician, scientist, diplomat, librarian, lawyer, historian, and linguist. He is often described as the last universal genius: a thinker whose range extended to all that was known in his day.

Figure 1 Gottfried Wilhelm Leibniz.

He was born in Leipzig, Germany, on June 1, 1646, of Slavic descent, the son of Frederick Leibniz, a jurisconsult and professor of moral philosophy at the University of Leipzig, and of Catherine Schmuck, the daughter of a doctor and professor of law. In 1600, the emperor Rudolf conferred on his great-uncle, a fighting captain in Hungary, a title of nobility and the arms Leibniz bore. His father died before he was 6 years old, and his mother took care of his education. By the time he was 12, he had taught himself to read Latin easily and had begun Greek, with the help of the large library he had inherited from his father. Intellectually precocious, he entered the University of Leipzig at the age of 15 as a law student, obtaining his baccalaureate in 1663. Before he was 20, he had mastered the ordinary textbooks on mathematics, philosophy, theology, and law. In 1666, the University of Leipzig declined to confer the degree of doctor of law upon him because of his youth, and he went instead to the University of Altdorf (near Nuremberg), where his dissertation gained him not only the doctorate but also the offer of a professorship.

An essay he wrote on the study of law, dedicated to the Elector of Mainz, led to an appointment by the elector, from which he was subsequently promoted to the diplomatic service. In the latter capacity, Leibniz drew up a scheme proposing to offer German cooperation. In 1672, Leibniz went to Paris on the invitation of the French government to explain the details of the scheme, but nothing came of it. In Paris (until 1676), he met the French philosophers Malebranche and Arnauld and studied Pascal’s mathematical works. He also met the astronomer and mathematician Huygens, and their conversations led Leibniz to study geometry, which he described as opening a new world to him. He had already written tracts on various minor points in mathematics, the most important being a paper on combinations written in 1668, and had built a working model of an arithmetical calculating machine (the basis of his election to the Royal Society in 1673). This was the first mechanical calculator capable of multiplication and division (improving on Pascal’s first calculating machine, which was capable of addition and subtraction). He also developed the modern form of the binary numeral system used in the digital computer.

Along with Newton, Leibniz is credited with inventing infinitesimal calculus in the 1670s (a mathematical foundation of modern science and engineering) and is given particular credit for his development of the integral and the product rule. The original authorship was long disputed: at the end of the 18th century, the prevalent opinion, with the exception of the French philosopher Fontenelle, was against Leibniz. Today, on the basis of his many published and unpublished papers, the majority of writers think it more likely that the two inventions were independent. He introduced several notations used in calculus to this day, for instance the integral sign, representing an elongated S from the Latin word summa, and the d used for differentials, from the Latin word differentia. Leibniz is also credited with the term ‘function’ (1694), which he used to describe a quantity related to a curve, such as a curve’s slope or a specific point of that curve.

In 1673, the Elector of Mainz died, and Leibniz entered the service of the Brunswick family. In 1676 he again visited London, and then moved to Hannover, where, until his death, he occupied the well-paid post of librarian in the ducal library. His memoranda on the various political, historical, and theological questions concerning the Hanoverian family and dynasty over 40 years (1673–1713) form a valuable contribution to the history of that time. He was fascinated by the application of technology to the solution of both practical and theoretical problems, an approach he applied in his work on the hydraulic press, windmills, lamps, submarines, clocks, carriages, water pumps, and so on. As a physicist, he made advances in mechanics, specifically the theory of momentum. He also made contributions to linguistics, history, esthetics, and political theory.
He dedicated much of his work to the systematic organization of existing knowledge across the disciplines: he worked to establish major libraries and organize scientific societies (in Berlin and Vienna). In 1700, the German Academy of Sciences of Berlin was created on his advice, and he drew up the first body of statutes for it; he was also named foreign member of the French Academy of Sciences in Paris. In 1714, on George I’s accession to the throne of England, Leibniz was thrown aside; he was forbidden to come to England, and the last 2 years of his life were spent in neglect. He died on November 14, 1716, in Hannover. His contributions on a vast array of subjects are found scattered in obscure journals, in hundreds of correspondences (with the leading intellectual and political figures of his era), and in a huge collection of unpublished manuscripts, the majority of which are preserved in the Lower Saxony State Library in Hannover.

Leibniz occupies at least as large a place in the history of philosophy as he does in the history of mathematics. Throughout his life, he hoped that his work on philosophy (as well as his work as a diplomat) would form the basis of a theology capable of reuniting the Church, divided since the Reformation in the 16th century. Similarly, he was willing to engage with, and borrow ideas from, the materialists as well as the Cartesians, the Aristotelians as well as the most modern scientists. Leibniz is known among philosophers for his wide range of thought on fundamental philosophical ideas and principles, including truth, necessary and contingent truths, possible worlds, the principle of sufficient reason (i.e., that nothing occurs without a reason), the principle of pre-established harmony (i.e., that God constructed the universe in such a way that corresponding mental and physical events occur simultaneously), and the principle of noncontradiction (i.e., that any proposition from which a contradiction can be derived is false). Leibniz’s view that ours is the best of all possible worlds has been vigorously debated. Voltaire, after showing some attraction to it in Zadig, parodied Leibniz’s solution in his other famous novel, Candide.

His philosophical contribution to metaphysics is based on the Monadologia, which introduces ‘monads’ as ‘substantial forms of being,’ which are akin to spiritual atoms: eternal, indecomposable, individual, following their own laws, not interacting (‘windowless’) but each reflecting the whole universe in pre-established harmony. The notion of a monad solves the problem of the interaction of mind and matter that arises in Descartes’ system, as well as the problem of individuation in Spinoza’s system, which represents individual creatures as mere accidental modifications of the one and only substance.
During this period in philosophy, innate ideas tended to be opposed to the thorough-going empiricism of Locke. Like Descartes before him, and for many of the same reasons, Leibniz found it necessary to posit the existence of innate ideas. Leibniz’s most extensive discussion of innate ideas, not surprisingly, is in the Nouveaux essais sur l’entendement humain (1705), the most important source for Leibniz’s philosophy of language, a polemical writing directed against Locke’s An essay concerning human understanding. Now, at the metaphysical level, since monads have no ‘windows,’ it must be the case that all ideas are innate. That is to say, an idea in every monad/soul is just another property of that monad, which happens according to an entirely internal explanation represented by the complete concept. But at the phenomenal level, it is certainly the case that many ideas are represented as arriving through the senses. In general, at least any relation in space or time will appear in this way. For Leibniz, the location of an object is not a property of an independent space, but a property of the located object itself, and also of every other object relative to it. It is also worth pointing out that Leibniz (and after him Kant) continued a long tradition of philosophizing about space and time from the point of view of space, as if the two were always in a strict analogy. Space and time are not in themselves real; they are not substances but are ideal. Space and time are just ways (metaphysically illegitimate ways) of perceiving certain virtual relations between substances. They are ‘phenomena.’

On comparative philology and linguistics, Leibniz’s inquiries into the origin and the development of languages were connected with his interest in the history of the peoples and nations of Europe. He held that an original language could not be found among the languages that were known (the only exception being Chinese), proposed etymological principles for use in historical linguistics, and criticized what he considered to be etymologically unsound reasoning.

Leibniz had a lifelong interest in and pursuit of the idea that the principles of reasoning could be reduced to a formal symbolic system, an algebra or calculus of thought, in which controversy would be settled by calculations. In Leibniz’s time, several proposals for the construction of an artificial ‘universal language’ existed. Two goals to be achieved by putting such a language into practice were first to overcome language barriers, and second to have a language that was more efficient and easier to learn than existing ones.
Furthermore, these artificial languages were meant to incorporate an accurate representation of knowledge, so that learning the language would entail acquiring knowledge of the world of nature. Before putting forward his project for a Characteristica Universalis, Leibniz had studied both Dalgarno’s and Wilkins’s work very carefully. Although he rightly emphasized that the language he envisaged differed fundamentally from the languages constructed by his English precursors, he made use of their work in executing his own plans. He took the Aristotelian categories as a starting point just as Dalgarno and Wilkins had done, but he proposed a thorough revision of this theory, giving more prominence to combinatorial principles than to classificatory ones. Like the other proposals, its starting point was to be an analysis of concepts, backing up the nominal definitions of the signs of the language. His innovation was that the concepts would be coded as numbers. Reasoning could then be reduced to numerical calculation, the Calculus Ratiocinator. Leibniz’s rational grammar project, which was aimed at explicating the semantics of natural language expressions so as to determine their logical structure, deserves to be further explored.

Some scholars now suggest that Leibniz should be regarded as one of the first thinkers to envision something like the idea of artificial intelligence. It is clear that Leibniz, like contemporary cognitive scientists, saw an intimate connection between the form and content of language and the operations of the mind. He believed that a suitably constructed language would perfectly mirror the processes of intelligible human reasoning. This view led him to formulate a plan for a universal language, an artificial language composed of symbols, which would stand for concepts or ideas, and logical rules for their valid manipulation. Leibniz’s writings about this project (which, it should be noted, he never got the chance to actualize) reveal significant insights into his understanding of the nature of human reasoning. This understanding, it turns out, is not that different from contemporary conceptions of the mind, as many of his discussions bear considerable relevance to discussions in the cognitive sciences.
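The idea of coding concepts as numbers so that reasoning becomes calculation can be made concrete with a small sketch. It assumes one scheme found in Leibniz’s logical drafts, in which primitive concepts receive prime numbers, a composite concept is the product of its components, and ‘every S is P’ holds when the number of P divides the number of S. The names PRIMITIVES, compose, and contains are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical "characteristic numbers": each primitive concept gets a prime.
PRIMITIVES = {"animal": 2, "rational": 3}


def compose(*numbers):
    """Code a composite concept as the product of its components' numbers."""
    return prod(numbers)


def contains(subject, predicate):
    """'Every subject is predicate' holds if predicate's number divides subject's."""
    return subject % predicate == 0


# 'man' defined as 'rational animal': 2 * 3 = 6
man = compose(PRIMITIVES["animal"], PRIMITIVES["rational"])

print(contains(man, PRIMITIVES["animal"]))  # every man is an animal
print(contains(PRIMITIVES["animal"], man))  # not every animal is a man
```

Because prime factorization is unique, the analysis of a composite concept into its primitives can be recovered from its number alone, which is what lets a question of inclusion be settled by division.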

Bibliography
Dascal M (1987). Leibniz: Language, signs, and thought. Philadelphia: John Benjamins.
Ishiguro H (1990). Leibniz’s philosophy of logic and language (2nd edn.). Cambridge: Cambridge University Press.
Pratt V (1987). Thinking machines: the evolution of artificial intelligence. Oxford: Basil Blackwell.

Lejeune, Michel (1907–2000)
P-Y Lambert, The Sorbonne, Paris, France © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 4, pp. 2132–2133 © 1994, Elsevier Ltd.

Michel Lejeune, born in Paris in 1907, was educated in Classics in the École Normale Supérieure, and soon joined the seminars of Joseph-Jean-Baptiste-Marie Vendryes and other comparativists in the École Pratique des Hautes Études. Received as agrégé de grammaire, he later became a lecturer in Greek and Latin philology at Poitiers University, and then a professor at Bordeaux University (from 1937 on), where he was later chosen as the dean of the faculty of arts. Elected as directeur d’études in the École Pratique des Hautes Études (in Indo-European Comparative Grammar), he came back to Paris. He acted as the director for Human Sciences in the C.N.R.S. (French Research Foundation) from 1955 to 1963. He entered the Académie des Inscriptions et Belles-Lettres in 1963.

Leader of the French comparativists, he left a large and important body of work (some 20 books and 300 articles). His scientific interests ranged from ancient Greek and Mycenean to Celtiberian, from Etruscan to Phrygian, with a clear preference for Italic and ancient Celtic dialects. Attracted by epigraphical difficulties, he was led to edit texts in minor Indo-European dialects that had previously been discarded as undecipherable or unusable for comparative purposes. His work brought to light some Indo-European dialects that had previously been misunderstood, either in their phonetic composition or in their dialectal affinities, such as Lepontic and Venetic.

His first publications dealt with ancient Greek dialects: as a thesis, he wrote a linguistic analysis of Delphic manumission texts. His teaching of Greek led him to publish a small handbook on Greek accentuation, and then an important phonetic history of Greek, with both comparative and dialectal analysis (1947; a revised edition, incorporating Mycenean sources, was published in 1972). Associated with the discoveries of Ventris and Chadwick from the start, he made a substantial contribution to the progress of Mycenean studies, with more than 50 articles (reprinted in Mémoires de philologie mycénienne, 3 vols, 1958–1973). This type of study, associating epigraphy and linguistics, appealed very much to his analytical mind: facts had to be carefully and rationally identified, through methodical classification and comparison, but new facts were also available that could invalidate established theories. He applied the same careful method to other epigraphies: Celtiberian, on which he published a monograph in 1956, Lepontic and Cisalpine Gaulish (1971), Venetic (1974), Gallo-Greek (1985), etc.

A linguistic problem he frequently dealt with was anthroponymy: a precise classification of names can yield important information on family structures. For the sake of clarity he invented such terms as ‘idionyms’ (for simple individual names) and ‘gamonyms’ (the reference to the spouse, in Venetic). His conception of epigraphical study included a thorough understanding of the script, as well as a clear analysis of the linguistic data. All his epigraphical studies provide an important contribution to the history of writing. Reused alphabets, and particularly the use of the Greek alphabet for non-Greek languages, were a constant interest in his scientific research: these offered him the opportunity of a new phonological analysis, searching for the analysis made by the inventors of these alphabets. He published important epigraphic corpora for three of these languages: Phrygian, South Oscan (Osco-Greek), and South Gaulish (Gallo-Greek).

See also: Greek, Ancient.

Bibliography
Lejeune M (1958–1973). Mémoires de philologie mycénienne, vol. 1. Paris: C. N. R. S., 1958; Vol. 2. Rome: Ateneo, 1971; Vol. 3. Rome: 1973.
Lejeune M (1972). Phonétique historique du mycénien et du grec ancien. Paris: Klincksieck.
Lejeune M (1974). Manuel de la langue vénète. Heidelberg: C. Winter.
Lejeune M (1985). Textes gallo-grecs (Recueil des Inscriptions Gauloises, vol. 1). Paris: C. N. R. S.

Lenneberg, Eric H. (1921–1975)
J-M Fortis, CNRS, Paris, France © 2006 Elsevier Ltd. All rights reserved.

Eric Heinz Lenneberg (Figure 1) was born in 1921 in Düsseldorf, Germany. His father was a physician. In 1933, when Hitler came to power, he moved to Brazil, where he stayed until 1945, before settling in the United States. He graduated from Harvard in 1955 (in psychology and language) before turning to neurology during his postgraduate years as a fellow of the Russell Sage Foundation. His career can be divided into three periods. From 1953 to 1960–1962, his interest was directed toward the linguistic categorization of color and its cognitive consequences. From 1962 to 1965, he turned to the study of language acquisition. After this date, he widened the scope of his investigations and embarked on the more ambitious project of describing the biological determinants of language. Most of Lenneberg’s ideas are compiled in the groundbreaking book he published in 1967, his Biological foundations of language. His academic career was spent at Cornell University. He died in New York in 1975.

Figure 1 Eric Heinz Lenneberg (1921–1975).

One of Lenneberg’s initial motivations was the desire to test empirically the Sapir-Whorf hypothesis of a correspondence (or even a causal dependence) between linguistic structures and cognitive representations. In collaboration with Roger Brown, he chose a domain that he believed to be amenable to experimentation: colors. A number of experiments on color categorization were conducted, showing that recognition of a color was positively related to an index called codability, which in fact measured the degree to which subjects agreed in naming a given color. At the time, his nominalist inclinations were evident in the fact that he neglected the influence of perceptual discriminability, thus attributing to codability the bulk of the facilitation effect. He was also preoccupied with intercultural comparison, and, in collaboration with the anthropologist John Milton Roberts, he extended his research on colors to Zuni classification (the Zunis are an Indian tribe on the New Mexico-Arizona border). As with English-speaking subjects, recognition was found to be correlated with linguistic discriminations. The stability of colors perceived as focal across subjects and cultures (which was also observed in Lenneberg’s time by Eleanor Heider-Rosch) led him, among other arguments, to adopt a perspective which he described as Neo-Kantian. In this perspective, concepts (including linguistic concepts) reflect the structure that orders “impinging physical stimuli in a predetermined and species-specific way” (Lenneberg, 1962: 105). This move was in line with the nativist ideas on language that he was to propose shortly after. Lenneberg’s experiments, with their emphasis on terms referring to perceptual qualities, set the agenda for much subsequent work on the linguistic relativity of cognitive representations (Lucy, 1992).

His work at the Children’s Hospital Medical Center in Boston gave Lenneberg the opportunity to explore another facet of the relation between biology and language, namely the relation of language development to cerebral maturation. His observations led him to the conclusion that the compelling force that drives language acquisition is internal and biological; a minimal exposure is needed, of course, and this exposure must occur before a critical age, between 2 and the early teens. This conclusion, for which he argues at length in his main opus of 1967, is widely known today as the Critical Period Hypothesis.
Lenneberg’s arguments are based on the pattern of recovery from aphasic disturbances or sudden deafness in children. They also rest on the fact that hearing children born to deaf parents do not even suffer a delay in the course of language development, provided they have been minimally exposed to language before this critical age. Particularly striking for Lenneberg was the fact that the maturation of the brain reaches an asymptote concomitantly with this critical age. However, he was cautious not to postulate a direct causal relationship between the maturation of the brain and the course of language development. He merely suggested that the maturation of the brain would set limits on what can be learned.

As for the cerebral localization of speech functions, Lenneberg’s stance seems to be that of a skeptic with holistic proclivities. Obviously, he was much impressed by the observations of Penfield and Roberts, who were initiating the technique of electrostimulation to map the language areas of the brain. Penfield and Roberts had shown the point-like, disseminated character of the areas where electrical stimulation interferes with speech. Such results led Lenneberg to reject any strict localization of language, beyond a left–right asymmetry and an anterior–posterior polarization, respectively, of motor and sensory aspects of language. In his 1967 book, Lenneberg also reiterates Lashley’s arguments for the need of central regulatory mechanisms involved in the coordination of complex movements. In particular, Lenneberg shows that sequential association of nervous signals can in no way account for the production of speech (which parallels Chomsky’s contemporary attack on sequential models of grammar). Although speech production is centrally regulated, Lenneberg also claims that it must adjust to a fixed rhythmic pattern providing temporal slots for the sequencing of phonemes. This last point appears quite important: in the eyes of Lenneberg, the specificity of language resides to a certain extent in its timing mechanisms.

Lenneberg’s ideas are often considered to be supportive of the Chomskyan language organ, or language acquisition device. It is unclear whether Lenneberg regarded language as a very specialized skill with a specific substrate. But it can be said that language was for him a species-specific capacity, whose development was internally driven, given a minimal amount of exposure. In this respect, he does offer support for the language organ hypothesis.

See also: Aphasia Syndromes; Chomsky, Noam (b. 1928); Whorf, Benjamin Lee (1897–1941).

Bibliography
Brown R W & Lenneberg E H (1954). ‘A study in language and cognition.’ Journal of Abnormal and Social Psychology 49, 454–462.
Lenneberg E H (1962). ‘The relationship of language to the formation of concepts.’ Synthese 14, 103–109.
Lenneberg E H (1967). Biological foundations of language. New York: John Wiley & Sons.
Lenneberg E H & Roberts J M (1956). ‘The language of experience: a study in methodology.’ International Journal of American Linguistics 22(2), Memoir 13.
Lucy J A (1992). Language diversity and thought. Cambridge: Cambridge University Press.

Leont’ev, Aleksei Alekseevich (1936–2004)
O Thomason, University of Georgia, Athens, GA, USA © 2006 Elsevier Ltd. All rights reserved.

Aleksei Alekseevich Leont’ev (also spelled Leontiev and Leontyev) was born on January 14, 1936, in Moscow, and died in Moscow on August 12, 2004. He received a Ph.D. in linguistics in 1968 and in psychology in 1975. He was a winner of the Lomonosov Prize, a member of the Russian Academy of Education (1992), an honorary president of the L. S. Vygotsky Psycholinguistic Society (from 1991), a professor in the psychology department at M. V. Lomonosov Moscow State University (from 1998), an author and a supervisor of the educational program ‘Shkola 2100’ (from 1996), and a member of many councils and editorial boards of international and Russian journals.

After graduating from M. V. Lomonosov Moscow State University with a B.A. in German language (1958), Leont’ev worked in the Linguistic Institute of the Academy of Science USSR, where he conducted research in linguistics (his dissertations were entitled ‘General linguistic views of I. A. Baudouin de Courtenay’ and ‘The theoretical problems of the speech act psycholinguistic modeling’) and later in psychology (‘The psychology of speech’). Leont’ev’s areas of interest were general psychology, the psychology of speech, psycholinguistics, methods of teaching native and foreign languages, general linguistics, the history of linguistics, ethnography, and poetics.

Leont’ev was the son of the famous Russian psychologist A. N. Leont’ev, who was a student of L. S. Vygotsky. Leont’ev continued his father’s research in psychology and became one of the founders of the Russian psychological school, which was based on the theoretical works of L. S. Vygotsky, A. N. Leont’ev, A. R. Lurija, and the linguistic tradition developed by L. V. Shcherba. Leont’ev was opposed to Behaviorist theory and considered speech to be not a passive succession of vocal reactions, but a dynamic and purposeful vocal activity. He stressed the active nature of speech processes and pointed out that speech was strongly conditioned by social factors.


He researched different problems in communication, and for the first time in Russian psychology presented a monographic work (‘Psixologiia obshcheniia’) devoted to the psychological theory of speech (1974). He also worked on problems of speech influence in commercials and newspapers in the former USSR. Leont’ev was interested in associative experiments and published one of the first dictionaries of associative norms in a language (1977).

Leont’ev was concerned with problems of language planning, education, and teaching Russian as a foreign language. He was one of the founders and a leader of the group that created the interregional public organization that helped develop and introduce a new educational program, ‘Shkola 2100,’ in Russia and parts of the former Soviet Union in 1999. This program is primarily concerned with the work of kindergartens and middle schools and is expected to run for 10 years (until 2010).

Leont’ev is the author of some 600 publications, which have both theoretical and practical value. His works have been translated into 30 different languages. His research was notable for using information from different fields of the humanities and applying this knowledge to other areas (for example, his work ‘Rech v kriminalistike i sudebnoi psixologii’ (Leont’ev et al., 1977)). Leont’ev gave the first detailed description of the Papuan languages and Tok Pisin in Russian linguistics. He was also the first publisher of many works written by I. A. Baudouin de Courtenay, E. D. Polivanov, L. P. Jakubinsky, L. S. Vygotsky, and A. N. Leont’ev.

See also: Psycholinguistics: Overview.

Bibliography

Iusupova A (2001). Aleksei Alekseevich Leont’ev: ‘Luchshe byt’ umnym troechnikom, chem glupym piatiorochnikom!’ http://flogiston.ru/interview/leontiev.
Leont’ev A A (1963). Vozniknoveniie i pervonachal’noiie razvitiie iazyka. Moskva: AN SSSR.
Leont’ev A A (1969). Iazyk, rech, rechevaia deiatel’nost’. Moskva: Prosveshcheniie.
Leont’ev A A (1974). Papuasskiie iazyki. Moskva: Nauka.
Leont’ev A A (1974). Psixologiia obshcheniia. Tartu.
Leont’ev A A (1977). ‘Ispol’zovaniie testov pri obuchenii russkomu iazyku inostrantsev (psixologicheskiie osnovy i nekotorye vyvody).’ In Leont’ev A A & Zarubina N D (eds.) Psixolingvistika i obucheniie russkomu iazyku nerusskix. Moskva: Russkii iazyk. 56–68.
Leont’ev A A (ed.) (1977). Slovar’ assotsiativnyx norm russkogo iazyka. Moskva: Moskovskii universitet.
Leont’ev A A (1981). Psychology and the language learning process. Moscow: Pushkin Institute.
Leont’ev A A (1998). Kul’tury i iazyki narodov Rossii, stran SNG i Baltii: Uchebno-spravochnoie posobiie. Moskva: Moskovskii psixologo-sotsial’nyi institut.
Leont’ev A A, Shaxnarovich A M & Batov V I (1977). Rech v kriminalistike i sudebnoi psixologii. Moskva: Nauka.
(2004). Biografiia Alekseia Alekseevicha Leont’eva. http://www.school2100.ru.
(2004). Spisok nauchnyx trudov A. A. Leont’eva. http://www.school2100.ru.
(2004). Vazhneishiie teoreticheskiie obobshcheniia (o naukax, sozdannyx A. A. Leont’evym). http://www.school2100.ru.

Lepsius, Carl Richard (1810–1884)
S Pugach, The Ohio State University, Lima, OH, USA
© 2006 Elsevier Ltd. All rights reserved.

Karl Richard Lepsius, also known as Richard Lepsius, was born on December 23, 1810, in Naumburg, Thuringia, and died on July 10, 1884, in Berlin. His parents were Carl Peter Lepsius and Friederike Gläser. Lepsius was to distinguish himself as both a philologist and an archaeologist, and in that sense he followed in the footsteps of his father, a financial procurer who had founded the Thuringia-Saxon Archaeological Society. Lepsius demonstrated linguistic talent as early as Gymnasium, where he wrote a class theme on the significance of Sanskrit to comparative philology. He went to university in Leipzig, Göttingen, and Berlin, where he continued to study philology and archaeology, subjects that he considered so closely related that he had difficulty choosing which to pursue. In Göttingen Lepsius became acquainted with philologist and folklorist Jacob Grimm (see Grimm, Jacob Ludwig Carl (1785–1863)) and his brother Wilhelm, while in Berlin he worked with comparative philologist Franz Bopp (see Bopp, Franz (1791–1867)) and theologian Friedrich Schleiermacher. From Berlin Lepsius proceeded to Paris, where he studied the works of Jean-François Champollion, the first scholar to decipher the Rosetta stone (see Champollion, Jean-Francois (1790–1832)). Lepsius researched hieroglyphics, too, and in 1837 demonstrated that the letters of its pictographic alphabet were polysyllabic. While Lepsius was an expert philologist, he was also an historian and Egyptologist, and he became professor of Egyptology in Berlin in 1842. From Berlin Lepsius led an archaeological expedition to Nubia and


Egypt that was funded by the Prussian state. Much of Lepsius’s expedition was devoted to transcribing and interpreting Egyptian inscriptions. He wrote up and published the results of the expedition in the 12-volume Denkmäler aus Ägypten und Äthiopien (Lepsius, 1849–1856). Lepsius also brought myriad objects back for Berlin’s Archaeological Museum, where he became curator and director in 1855. At the request of the Church Missionary Society in London, Lepsius compiled a standard phonetic alphabet for linguistic transcription in 1855 (Lepsius, 1855, 1863). He transcribed 120 languages using the system he had developed, which became a fundamental tool for missionaries throughout Africa and beyond. Scholars such as Bantu expert Carl Meinhof applied his system to their work, as did missionaries such as Johannes Raum when he transcribed the East African Chagga language (Meinhof, 1899, 1910; Raum, 1909) (see Meinhof, Carl Friedrich Michael (1857–1944)). In 1864, Rhenish Missionary Society director Friedrich Fabri commented that his Society was consulting with Lepsius about the transcription of Southwest African languages (Fabri, letter to the British and Foreign Bible Society, Archives of the British and Foreign Bible Society, July 1864); and in 1867, J. L. Krapf went further, saying that most missionaries, Protestant and Catholic alike, were using Lepsius’s orthography in their translations (Krapf, letter to the British and Foreign Bible Society, Archives of the British and Foreign Bible Society, April 1867). Later linguists and ethnologists deemed the system impractical, and it was ultimately superseded by the orthography that Diedrich Westermann and Ida Ward developed in the 1930s (Westermann and Ward, 1933; Mackert, 1999) (see Westermann, Diedrich Hermann (1875–1956)).

Late in his career, Lepsius published a book on Nubian grammar, which included an introduction about the people and languages of Africa (Lepsius, 1880). In this grammar, Lepsius attempted to classify the people and languages of the continent, and argued that the languages of Nubia were formed from a mixture of Bantu and Hamitic languages. Unlike Carl Meinhof, who stressed the primacy of light-skinned, culturally superior ‘Hamites’ from Europe or central Asia in the shaping of African languages, Lepsius declared that the oldest languages in Africa were first spoken by peoples living south of the equator. Like Meinhof, however, Lepsius believed that grammatical characteristics were more reliable for purposes of classification than were lexical ones, as grammatical features tended to remain more constant (Jungraithmayr, 1983). Lepsius maintained missionary connections throughout his life, and was a devout Protestant who, shortly before his death in 1884, translated the Gospel of Saint Mark into Nubian. He had originally translated this Gospel for his Nubian grammar, in which it was included (Lepsius, 1880).

See also: Africa as a Linguistic Area; Bopp, Franz (1791–1867); Champollion, Jean-Francois (1790–1832); Grimm, Jacob Ludwig Carl (1785–1863); Krapf, Johann Ludwig (1810–1881); Meinhof, Carl Friedrich Michael (1857–1944); Sanskrit; Westermann, Diedrich Hermann (1875–1956).

Bibliography

British and Foreign Bible Society Archive, Cambridge University Library, correspondence of J. L. Krapf, correspondence of F. Fabri.
Ebers G (1887). Richard Lepsius: a biography. New York: William S. Gottsberger.
Jungraithmayr H (1983). ‘Lepsius, Karl Richard.’ In Jungraithmayr H & Möhlig W J G (eds.) Lexikon der Afrikanistik: Afrikanische Sprachen und ihre Erforschung. Berlin: Dietrich Reimer Verlag. 144–145.
Lepsius K R (1849–1856). Denkmäler aus Ägypten und Äthiopien. Berlin: Nicolaische Buchhandlung.
Lepsius K R (1855, 1863). Standard alphabet for reducing unwritten languages and foreign graphic systems to a uniform orthography in European letters. London: Seeleys.
Lepsius K R (1880). Nubische Grammatik, mit einer Einleitung über die Völker und Sprachen Afrikas. Berlin: Hertz.
Mackert M (1999). ‘Franz Boas’ early northwest coast alphabet.’ Historiographia Linguistica: International Journal for the History of the Language Sciences/Revue Internationale pour l’Histoire des Sciences du Langage/Internationale Zeitschrift für die Geschichte der Sprachwissenschaften 26(3), 273–294.
Meinhof C F M (1899, 1910). Grundriss einer Lautlehre der Bantusprachen nebst Anleitung zur Aufnahme von Bantusprachen (2nd edn.). Berlin: Dietrich Reimer (Ernst Vohsen).
Raum J (1909). Versuch einer Grammatik der Dschaggasprache. Berlin: Dietrich Reimer.
Westermann D & Ward I C (1933). Practical phonetics for students of African languages. London: Milford.


Leskien, August (1840–1916)
K R Jankowsky, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.

August Leskien was born on July 8, 1840 in Kiel, a city in the northern part of Germany. He completed high school in 1860 and began university studies in his home town with the classical scholar Georg Curtius (see Curtius, Georg (1820–1885)). When Curtius left for Leipzig in 1862, Leskien went along. Apart from classical philology, he specialized in comparative philology. While still a student, he traveled to eastern European countries, where he became fascinated with southern Slavic languages. During semester breaks, he acquainted himself with Lower Sorbian by regularly visiting farmers in Spreewald in the Brandenburg province. In 1864, he obtained his doctorate and also passed the Staatsexamen, thus qualifying to teach high school. For two years he taught Latin and Greek at the Thomas-Schule in Leipzig until, in 1866, he went to Jena to continue his academic studies under August Schleicher (see Schleicher, August (1821–1868)), with special emphasis on Baltic and Slavic languages. He received his final degree, venia legendi or habilitation, the formal initiation into the profession of a university teacher, in the field of comparative philology from the University of Göttingen in 1867 with the thesis Futur und Aorist bei Homer. He taught at Göttingen for two years, then accepted an appointment as associate professor (Extra-Ordinarius) in Jena, where he succeeded Schleicher. A year later, in 1870, he transferred to Leipzig as Extra-Ordinarius for Slavic philology. He became full professor (Ordinarius) there in 1876. During the following years, he declined numerous invitations from other German universities and preferred to stay in Leipzig, where he died on September 20, 1916.

Leskien occupied a special place in the Neogrammarian movement, which emerged around 1875. Directly or indirectly, he exerted a decisive influence on all principal participants in that he initiated the formation of the new philological methodology. When he published his Declination im Slavisch-Litauischen und Germanischen in 1876 (Leskien, 1876a), he formally laid down the ground rules for what had already been practiced for quite some time by a large number of his friends and students: all historical language changes must either be reduced to the exceptionless operation of sound laws or be identified as the results of analogy. Leskien spoke for all Neogrammarians when he asserted that discarding the regularity principle would amount to admitting that language changes cannot be scientifically investigated at all.

Like all other Neogrammarians, Leskien was above all an extraordinarily productive scholar. In his time he was the undisputed leader in his field of specialization, the Baltic–Slavic languages, and remained so for several generations, especially in the field of phonology. His work on syntax, amply documented as truly significant by his academic lectures held during the last years of his life, did not reach the stage of completion. Leskien devoted most of his scholarly activities to the linguistic investigation of languages, but he was also deeply interested in all cultural aspects of language. His numerous publications on folk songs, fairy tales, and related literary topics, mostly dealing with Baltic and Slavic languages, were widely accepted as equal in quality to his linguistic works. Leskien was quick in recognizing and acknowledging great talent in his own country as well as elsewhere. His speedy translation of William Dwight Whitney’s (see Whitney, William Dwight (1827–1894)) Life and growth of language into German, only one year after its English original had appeared in 1875, is just one of many possible illustrations (Leskien, 1876b). Leskien’s widely acknowledged achievements left their imprint not only on Baltic and Slavic studies, but also on historical and comparative linguistics in general. His works remain stimulating and, in some respects, indispensable.

See also: Balto-Slavic Languages; Curtius, Georg (1820–1885); Neogrammarians; Schleicher, August (1821–1868); Whitney, William Dwight (1827–1894).

Bibliography

Leskien A (1867). Futur und Aorist bei Homer. Doctoral Dissertation, Göttingen.
Leskien A (1868). ‘Zur neuesten Geschichte der slavischen Sprachforschung.’ In Kuhn A & Schleicher A (eds.) Beiträge zur vergleichenden Sprachforschung auf dem Gebiete der arischen, celtischen und slavischen Sprachen 5, 403–444.
Leskien A (1871). Handbuch der altbulgarischen (altkirchenslavischen) Sprache. Grammatik, Texte, Glossar. Weimar: Böhlau.
Leskien A (1876a). Die Declination im Slavisch-Litauischen und Germanischen. Leipzig: S. Hirzel.
Leskien A (trans.) (1876b). Leben und Wachstum der Sprache (William Dwight Whitney). Leipzig: Brockhaus.
Leskien A (1885, 1893). ‘Untersuchungen über Quantität und Betonung in den slavischen Sprachen.’ Abhandlungen der Königlich-Sächsischen Gesellschaft der Wissenschaften, Philol.-hist. Klasse 10, 69–220; 13, 527–610.
Leskien A (1909). Grammatik der altbulgarischen (altkirchen-slavischen) Sprache. Heidelberg: Winter.
Leskien A (1913). Grammatik der serbo-kroatischen Sprache. Heidelberg: Winter.
Pohl H D (1985). ‘August Leskien.’ In Neue Deutsche Biographie, vol. 14. Berlin: Duncker & Humblot. 329–330.
Streitberg W (1914). ‘August Leskien.’ Indogermanisches Jahrbuch [für das Jahr 1913] 1, 216–266.
Streitberg W (1919). ‘August Leskien.’ Indogermanisches Jahrbuch 7, 138–143. Repr. in Sebeok T A (ed.) (1966). Portraits of linguists, vol. 1. Bloomington, IN: Indiana University Press.
Streitberg G (1932). ‘August Leskiens Schriften (1866–1916).’ Zeitschrift für slavische Philologie 9, 1–10.

Lesotho: Language Situation
N C Kula, Rijks Universiteit Leiden, Leiden, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Lesotho, located in southern Africa, is completely surrounded by South Africa. The country covers an area of 30 355 km² and has a population of 1.8 million, based on 2003 United Nations estimates. Sesotho (or Southern Sotho) is the major language spoken in Lesotho; according to the Ethnologue (Grimes, 2003), the number of speakers is estimated to be 1.49 million people, which is approximately 85% of the population. This makes up about 37.5% of the total Sesotho-speaking population of southern Africa. Sesotho-speaking peoples have been in southern Africa since around 1400, after moving to this area from central parts of the continent. The Basotho nation emerged from the merger of a number of small Southern Sotho-speaking clans that King Moshoeshoe moved into the highlands of central South Africa at the beginning of the 19th century. Sesotho was the language of the Basotho people even before they settled in the region that is the current Lesotho, because the fleeing Nguni people that merged into the Basotho society were required to learn Sesotho. Sesotho was also the only official language of Lesotho up until its annexation by the British in 1868, at which time English was given official status. With independence in 1966, both Sesotho and English were made official languages by legislation. Today, Sesotho, with the majority of speakers, remains the national and official first language, with English as the official second language.

Despite being the official second language, English is quite prominent as the language of government, commerce, education, and the judicial system. In education, Sesotho is used as a medium of instruction from the ages of 6 to 9 years, after which it is taught as a subject in secondary school and may also be studied at the national university. Sesotho is also one of the earliest written African languages, with the first texts by French missionaries dating back to 1833. As a result, it currently boasts an extensive literature. In addition to Sesotho and English, a few Nguni languages, particularly Zulu and Xhosa, are also spoken. This is not surprising, since apart from a large Southern Sotho-speaking area to the west and north of Lesotho, there are Zulu- and Xhosa-speaking peoples of South Africa to the east and south, respectively. Afrikaans is also spoken to some extent, in addition to various Indian and European languages.

Sesotho is classified in the Niger-Congo Bantu language phylum and is part of Guthrie’s (1948) Zone S languages. Sesotho has some affinities with Tswana and Northern Sotho, which are spoken further north of Lesotho, although it must be noted that despite what the nomenclature may suggest, Northern Sotho (also called Sepedi) is linguistically closer to Tswana than it is to Sesotho (Southern Sotho).

See also: Bantu Languages; Niger-Congo Languages; South Africa: Language Situation; Southern Bantu Languages.

Bibliography

Grimes M (2003). Ethnologue: languages of the world (14th edn.). Dallas: Summer Institute of Linguistics (http://www.ethnologue.com).
Guthrie M (1948). The classification of Bantu languages. London: International African Institute.


Levels of Adequacy, Observational, Descriptive, Explanatory
M Green, University of Sussex, Brighton, UK
© 2006 Elsevier Ltd. All rights reserved.

Background Assumptions

Chomsky’s earliest work (1957, 1959) stated the logical problem of language acquisition, or the poverty of the stimulus argument: the idea that given finite linguistic input, the speaker acquires the potential for infinite linguistic output. Furthermore, imperfect input arising from performance errors does not result in imperfect knowledge. According to Chomsky, it follows that the principles of grammar are underdetermined by the input, or primary linguistic data (PLD), and therefore the speaker must bring some innate linguistic knowledge or competence to the process of language acquisition. The idea that linguistic knowledge arises from “drawing out what is innate in the mind” (Chomsky, 1965: 51) is known by philosophers as the rationalist view, and contrasts with the empiricist view, which holds that linguistic knowledge is constructed on the basis of experience and is independent of any specialized cognitive system. The empiricist view is evident, for example, in the usage-based model of language acquisition favoured in cognitive linguistics (Langacker, 2000).

The model of the initial state of the innate language faculty posited by Chomsky is known as Universal Grammar (UG) and assumes a modular model of mind (see Chomsky, 1986: 13, 150; Fodor, 1983, 2000), according to which the language faculty represents an encapsulated system of specialized knowledge that equips the child for the acquisition of language. In developing this mentalist theory of language, Chomsky asserted that the only revealing object of linguistic study, given the objective of characterizing competence, is the system of linguistic knowledge in the mind of the idealized individual speaker. This system of internalized linguistic knowledge is known as I-language (Chomsky, 1986: 19–56) and generates the expressions uttered by the speaker. The theory of this system is referred to as the grammar, hence the term ‘generative grammar.’ From this perspective, the externalized language of the speech community (E-language) is merely epiphenomenal, in the sense that it arises as the output of individual I-languages. It is I-language that underlies the native speaker’s intuitions concerning grammaticality.

Central to this model is the grammatical transformation or displacement operation, which links two positions in a structure: the (thematic) position in which an expression is interpreted, and the position to which that expression has ‘moved’ in order to satisfy other grammatical requirements. For example, in (1), the fronted question word who is interpreted as the object of the verb saw; its thematic position is indicated by the underscore.

(1) Who did Lily say George saw ___ yesterday?

Levels of Adequacy

Since Chomsky’s early work (see, for example, 1965: 24–37), levels of adequacy have played a central role in the design of the generative model of language, which aims to characterize the linguistic knowledge or competence that underlies the individual speaker’s I-language. According to Chomsky, an adequate theory of language must meet three criteria. The weakest of these criteria is observational adequacy, which is likely to be (at least partially) met by any grammar of any language. However, descriptive adequacy and explanatory adequacy are more stringent criteria that must be met by ‘a genuine theory of human language’ (Chomsky, 2000: 7). The three levels of adequacy can be summarized as follows.

Observational adequacy: The grammar of a particular language specifies which sentences are and are not well formed in that language. A grammar that is observationally adequate might contain statements such as ‘This sentence is (un)grammatical in English’ or ‘These strings of words are sentences in English; these strings of words are not’. For example, an observationally adequate grammar of English would distinguish between (2a) and (2b), grammatical sentences, and (2c), an ungrammatical string.

(2a) Lily loves him
(2b) Lily loves herself
(2c) *Lily loves himself

Descriptive adequacy: The grammar of a particular language specifies which sentences are and are not well formed in that language and accounts for the speaker’s intuitions concerning grammaticality by accurately modeling the tacit knowledge that underlies the (un)grammaticality of those strings and by describing their properties in terms of principles of language. A grammar that attains descriptive adequacy might contain statements such as ‘This sentence is grammatical in English, and contains elements ABC, which stand in structural configurations PQR, and are governed by principles XYZ’. For example, a descriptively adequate grammar of English would account for the (un)grammaticality of the examples in (2) in terms of principles governing the distribution of pronouns like him, himself, and herself across the language as a whole. In Binding Theory (Chomsky, 1986: 164–184), these principles are stated in terms of coreference and locality. Lily can be coreferential with herself, since both expressions are third person, singular, and feminine; Lily is therefore a potential binder for herself, but not for himself, which explains why (2b) is grammatical while (2c) is not. Furthermore, reflexive pronouns like herself need to be bound locally: compare (2b) with (3a), where brackets indicate an embedded clause. In contrast, pronouns like him do not tolerate local binding. This explains why (2a) is grammatical (Lily cannot bind him), and why (3b) is only grammatical if her is not coreferential with Lily.

(3a) *Lily said [herself was hungry]
(3b) Lily loves her

Explanatory adequacy: The grammar provides a descriptively adequate account of each individual language and explains how knowledge of language is acquired via the interaction of the input (PLD) and the innate endowment (UG). A grammar that attains explanatory adequacy might contain statements such as ‘This sentence is grammatical in English, and contains elements ABC, which stand in structural configurations PQR, and are governed by principles XYZ. These principles are represented as follows in UG and activated by exposure to PLD of the following kind...’. For example, an explanatorily adequate model of grammar would represent the binding principles described earlier in terms that are sufficiently general to account for the distribution of pronouns in all human languages, and sufficiently simple and economical to plausibly account for the acquisition of this knowledge on the part of the child, given exposure to everyday spoken language. In Binding Theory, the statement of principles capturing the distribution of both types of pronoun in (2) and (3) in terms of a single local binding domain represents an attempt to meet these objectives: while reflexive pronouns like herself must be bound within this domain, pronouns like him must not be bound within this domain.
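The single-domain statement of the binding principles can be made concrete with a small sketch. The Python below is only an illustration under strong simplifying assumptions, not an implementation of Binding Theory: the names `Nominal`, `can_corefer`, and `well_formed` are hypothetical, person and number are held fixed, structural conditions such as c-command are ignored, and the local domain is reduced to a single true/false flag supplied by hand for each example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Nominal:
    form: str
    gender: str  # "f" or "m"; person and number are held fixed (third singular)
    kind: str    # "name", "reflexive", or "pronoun"


def can_corefer(antecedent: Nominal, dependent: Nominal, local: bool) -> bool:
    """Can `antecedent` bind `dependent`? `local` means both share one binding domain."""
    if antecedent.gender != dependent.gender:
        return False      # feature mismatch: not a potential binder
    if dependent.kind == "reflexive":
        return local      # reflexives must be bound within the domain
    if dependent.kind == "pronoun":
        return not local  # pronouns must not be bound within the domain
    return False          # names are not treated as bindable in this sketch


def well_formed(dependent: Nominal, candidates: List[Tuple[Nominal, bool]]) -> bool:
    """A reflexive needs a licit binder; a pronoun may refer outside the sentence."""
    if dependent.kind != "reflexive":
        return True
    return any(can_corefer(a, dependent, local) for a, local in candidates)


LILY = Nominal("Lily", "f", "name")
HIM = Nominal("him", "m", "pronoun")
HER = Nominal("her", "f", "pronoun")
HERSELF = Nominal("herself", "f", "reflexive")
HIMSELF = Nominal("himself", "m", "reflexive")

# Each entry: label, dependent element, and whether Lily is in its local domain.
EXAMPLES = [
    ("(2a) Lily loves him", HIM, True),
    ("(2b) Lily loves herself", HERSELF, True),
    ("(2c) Lily loves himself", HIMSELF, True),
    ("(3a) Lily said [herself was hungry]", HERSELF, False),
    ("(3b) Lily loves her", HER, True),
]

for label, dependent, local in EXAMPLES:
    ok = well_formed(dependent, [(LILY, local)])
    bound = can_corefer(LILY, dependent, local)
    print(f"{label}: well formed = {ok}, Lily may bind it = {bound}")
```

In this sketch the same `local` flag drives both clauses of the rule, mirroring the idea that one binding domain covers reflexives and pronouns alike. Separating `well_formed` from `can_corefer` reflects the two kinds of speaker intuition at stake: whether a string is a possible sentence, and which interpretations a possible sentence allows.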

Grammar of a Language versus a Theory of Language

As the previous section indicates, it is important to distinguish between a grammar of a particular language and a theory of human language (see Radford, 1988: 28–30). While the latter may be described as the ultimate objective of modern linguistic theory, it cannot be achieved in isolation from the former. In other words, a theory of human language that attains explanatory adequacy must contain the basis for a descriptively adequate account of each particular language. It has often been observed that there is a tension between the two objectives of descriptive and explanatory adequacy (see Chomsky, 1986: 51–52, 2000: 7, 2002: 31). While descriptive adequacy requires a large set of complex, intricate, and detailed statements in order to account for particular languages, explanatory adequacy requires a finite set of universal statements that can represent Universal Grammar in terms that are psychologically plausible. In particular, the goal of explanatory adequacy requires a small set of statements that are maximally general (in order to apply cross-linguistically) and maximally simple and economical (in order to account for the rapid and effortless acquisition of language, as well as for its creativity and the intuitions of its speakers). For example, as explained earlier, the principles of binding theory must account not only for speaker intuitions concerning grammaticality (‘example [2c] is not a possible sentence in English’) but also for speaker intuitions concerning possible interpretations of grammatical sentences (‘in example [3b], Lily and her do not refer to the same person’).

Role in the Development of the Model

We can trace a number of key developments in the Chomskyan model that represent attempts to resolve this tension between the objectives of descriptive and explanatory adequacy. For example, the move away from sets of constructions and construction-specific transformational rules to the single generalized transformational rule Move-alpha (Chomsky, 1980) represented a move toward explanatory adequacy. The development of the Principles and Parameters approach (Chomsky, 1981) also brought generative grammar closer to resolving the tension between the requirements of descriptive and explanatory adequacy, by positing the existence of a universal set of principles that account for Universal Grammar, together with a finite set of options or parameters that account for cross-linguistic variation.

Recent Developments

More recently, the development of the Minimalist Program (Chomsky, 1995) also represented a move toward the objective of explanatory adequacy, by reducing the grammar to two basic operations: ‘Merge’ (structure building) and ‘Move’ (displacement). Indeed, Chomsky argued that the Principles and Parameters approach was so successful in resolving the tension between the objectives of descriptive and explanatory adequacy that the Minimalist research agenda could go a step further. “In principle, then, we can seek a level of explanation deeper than explanatory adequacy, asking not only what the properties of language are, but why they are that way” (Chomsky, 2001: 2). In pursuing this question, the Minimalist Program seeks to reexamine the principles of Universal Grammar from the perspective of questions about the design of the language system itself, rather than focusing primarily upon the design of the theory of language.

See also: Chomsky, Noam (b. 1928); Generative Grammar; Linguistics as a Science; Minimalism; Principles and Parameters Framework of Generative Grammar.

Bibliography

Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1959). ‘Review of B. F. Skinner’s Verbal behaviour 1957.’ Language 35, 26–58.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky N (1980). Rules and representations. Oxford: Blackwell.
Chomsky N (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky N (1986). Knowledge of language: its nature, origin and use. New York: Praeger.
Chomsky N (1995). The Minimalist program. Cambridge, MA: MIT Press.
Chomsky N (2000). New horizons in the study of language and mind. Cambridge: CUP.
Chomsky N (2001). ‘Beyond explanatory adequacy.’ In MIT Occasional Papers in Linguistics 20. MIT.
Chomsky N (2002). Belletti A & Rizzi L (eds.) On nature and language. Cambridge: CUP.
Fodor J A (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fodor J A (2000). ‘Precis of The modularity of mind.’ In Cummins R & Dellarosa Cummins D (eds.) Minds, brains and computers: the foundations of cognitive science. Oxford: Blackwell. 493–499.
Langacker R (2000). ‘A dynamic usage-based model.’ In Barlow M & Kemmer S (eds.) Usage-based models of language. Stanford: CSLI Publications. 1–63.
Pinker S (1994). The language instinct. Harmondsworth: Penguin.
Radford A (1988). Transformational grammar: a first course. Cambridge: CUP.
Radford A (2004). Minimalist syntax: exploring the structure of English. Cambridge: CUP.
Skinner B F (1957). Verbal behavior. New York: Appleton-Century-Crofts.

Lévi-Strauss, Claude (b. 1908)
A T Campbell
© 2006 Elsevier Ltd. All rights reserved.
This article is reproduced from the previous edition, volume 4, pp. 2135–2136, © 1994, Elsevier Ltd.

Having graduated in law from the Paris Law Faculty and in philosophy from the Sorbonne in 1931, Lévi-Strauss visited Brazil in 1934. During a short spell teaching at the University of São Paulo he took the opportunity to do some fieldwork amongst Brazilian Indians. He returned to Brazil for another field trip in 1938–1939. He left France after the German occupation and went to New York, where he taught at the New School for Social Research and met Roman Jakobson (see Jakobson, Roman (1896–1982)), whose work on phonology was to be one of his main intellectual inspirations. After the War he returned to Paris and held a post at the Musée de l’Homme, and in 1950 became directeur d’études at the École Pratique des Hautes Études. In 1959 the Chair of Social Anthropology was created for him at the Collège de France. In 1973 he became a member of the Académie Française. (He continues to publish at the time of writing, 1991.)

Lévi-Strauss is credited with creating the intellectual movement called ‘structuralism.’ The massive, technical, Les Structures élémentaires de la parenté was published in 1949 but perhaps nothing would have come of structuralism had Lévi-Strauss not published Tristes tropiques in 1955. This was a personal account of his travels, written in a readable, but sophisticated style, and it became a best-seller. That set the stage for the publication of Anthropologie structurale (Structural anthropology) in 1958 (a collection of essays written in the 1940s and 1950s). The popularity of the previous work guaranteed intense interest in this one. A new intellectual fashion had arrived. Structuralism replaced ‘existentialism,’ and Lévi-Strauss challenged Sartre’s dominance as the star of Parisian intellectual life. Lévi-Strauss declared that his three ‘intellectual mistresses’ were geology, psychoanalysis, and Marxism, the common idea being that in each of these enquiries what appears to be the case on the surface is determined by hidden (unconscious) structures, laws, or determinants. Among his other sources of inspiration were cybernetics, music, and above all linguistics. The original structuralist charter is set out in Chapter 2 of Structural anthropology, originally

published as an article in 1945 in one of the first issues of Word. Structural linguistics, through Trubetzkoy’s phonology (see Trubetskoy, Nikolai Sergeievich, Prince (1890–1938)), was set to play a ‘renovating role’ in the social sciences. Edwin Ardener has pointed out that the specialized terminology of phonemic analysis was a red herring. It was the Saussurian principles (see also Saussure, Ferdinand (-Mongin) de (1857–1913)) lying behind phonology that Lévi-Strauss required: a few simple distinctions (like langue/parole, syntagmatic/paradigmatic), related notions such as ‘system’ and ‘value,’ and above all, the idea that the building blocks of human logic and human thinking consist of binary oppositions. The guiding idea behind Lévi-Strauss’s work was to find fundamental structures behind the bewildering disparateness of phenomena. Elementary structures was an attempt to show that a huge array of kinship systems could be seen to be based on two simple structural forms: restricted exchange and generalized exchange. From Totemism onward the search is more explicitly for fundamental structures ‘of the mind,’ the idea being that by an examination of the structure of various ‘objects of thought’ (classifications, myths, designs) one can discover something of the structure of the mind that created them. Just as phonology had shown that language can be reduced to those ‘distinctive features’ expressed as binary distinctions (tense/lax, grave/acute), so the logic of primitive classification and ‘mythologic’ can be reduced to endless structures of binary oppositions (wet/dry, honey/tobacco, raw/cooked). While Lévi-Strauss emphasizes his profound concern with linguistics, Chomsky’s generative grammar passed him by. (‘Transformation,’ a key term in structuralism, is taken from D’Arcy Wentworth Thompson’s On growth and form, 1917, a classic in zoology, and has nothing to do with transformational grammar.)

Ironically, a short passage in Chomsky’s Language and mind provides an illuminating, terse, ‘no nonsense’ critique of Lévi-Strauss’s use of linguistic models. (The work on classification, says Chomsky, reduces to the conclusion “that humans classify, if they perform any mental acts at all.”) Linguistics has long left Lévi-Strauss behind. Structuralism, as a fashion, also looks rather dated, but because of the engrossing nature of the material Lévi-Strauss deals with (myths, masks, exotic names and habits) his writing is still found intriguing, and he is still revered as one of the leading intellectuels of France.

See also: Anthropological Linguistics: Overview; Jakobson, Roman (1896–1982); Linguistic Anthropology; Saussure, Ferdinand (-Mongin) de (1857–1913); Structuralism in Anthropology; Trubetskoy, Nikolai Sergeievich, Prince (1890–1938).

Bibliography

Ardener E (ed.) (1971). Social anthropology and language. London: Tavistock.
Hayes E N & Hayes T (eds.) (1970). Claude Lévi-Strauss: the anthropologist as hero. Cambridge, MA: MIT Press.
Leach E R (1970). Lévi-Strauss. London: Fontana.
Lévi-Strauss C (1969). The elementary structures of kinship (2nd edn.). London: Eyre & Spottiswoode. [1st edn., 1949.]
Lévi-Strauss C (1973). Tristes tropiques. New York: Atheneum. [1st edn., 1955.]
Lévi-Strauss C (1963). Structural anthropology. New York: Basic Books Inc. [1st edn., 1958.]
Lévi-Strauss C (1963). Totemism. Boston, MA: Beacon Press. [1st edn., 1962.]
Lévi-Strauss C (1972). The savage mind (2nd edn.). London: Weidenfeld & Nicolson. [1st edn., 1962.]
Lévi-Strauss C (1970–81). Mythologiques; introduction to a science of mythology. English transl., Weightman J & Weightman D (4 vols.). vol. 1: The raw and the cooked, 1970 (1st edn., 1964); vol. 2: From honey to ashes, 1973 (1st edn., 1967); vol. 3: The origin of table manners, 1978 (1st edn., 1968); vol. 4: The naked man, 1981 (1st edn., 1971). London: Cape.
Lévi-Strauss C (1977). Structural anthropology II. London: Cape. [1st edn., 1973.]
Pace D (1983). Claude Lévi-Strauss: the bearer of ashes. London: Routledge & Kegan Paul.

Levita, Elijah (1469–1549)
H Y Sheynin, Gratz College, Melrose Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Elijah Levita (1469–1549) is known in Jewish scholarship as Eliyyāhū Bāḥūr, an outstanding Hebrew and Aramaic grammarian and lexicographer, Rabbinic scholar, and Yiddish writer. A native of Neustadt, near Nuremberg, Germany, he lived mostly in Italy (Padua, Venice, and Rome) and died in Venice. Levita’s main achievement was the instruction of Christian scholars in Jewish linguistic teachings. Some of his pupils were his patrons, in whose homes he resided for a long time. Among his disciples were

scholars who became the leading Hebrew linguists of the 16th century, such as Sebastian Münster, Paulus Fagius, Jean de Campen (Campensis), Andreas Maes, Guillaume Postel, Johann Albrecht Widmanstetter (Widmanstadius), Georges de Selves, and Cardinal Egidio di Viterbo, with whom Levita stayed for 13 years. Levita, who wrote in Hebrew, published many Hebrew grammar works, Hebrew and Aramaic dictionaries, and masoretic treatises. The most important, in chronological order, are his edition of Moses Qimḥi’s short grammar Mahălakh Šĕvīlē ha-Da’at (‘Course of paths to the knowledge’) with his own commentary (Pesaro, 1508), and his notes (‘Nimmūqīm’) on David Qimḥi’s grammar Sēfer ha Mikhlōl (‘Book of perfection’) and dictionary Sēfer ha Šŏrāšīm (‘Book of roots’), published only partially in his editions of David Qimḥi’s works. Two of his own grammatical treatises were first printed in Rome in 1518–1519 and almost simultaneously translated into Latin by his pupil Sebastian Münster. These were Sēfer ha-Harkāvāh (‘The book of construction,’ Rome, 1518) and Sēfer ha Baḥūr (‘The book of the chosen,’ Rome, 1518), while the third one, Lu’aḥ bě-diqdūq ha-Pe’ālīm wehabinyānīm (‘The tables of the grammar of the verbs and the formations’), remained unpublished. The first of these books analyzed foreign and compound words in the Bible, and the morphology of the noun and the verb; the second was a more general book on Hebrew that was retitled, starting with its second edition (Isny, 1542), as Diqdūq Eliyyāhu ha-Lēwī (‘Eliyyāhū ha-Lēwī’s grammar’). His minor work, Pirqē Eliyyāhū (‘Elijah’s chapters,’ Pesaro, 1520), dealt with Hebrew consonants and vowels; the second edition (Venice, 1546) added material on noun templates (mishqālim) and on the auxiliary letters used in the construction of the noun (which roughly coincide with affixes).

Levita devoted at least two of his works to masoretic research: Māsōret ha-Māsōret (‘The tradition of the Māsōrāh,’ Venice, 1538), in which he explained the symbols and the terminology of the ancient transmission of the biblical text, and Ṭūv Ṭa’am (‘The correctness of the accent’), in which he attempted to formulate rules for the use of accents in the Bible. Levita went beyond the study of biblical Hebrew, an unprecedented move compared to previous linguistic tradition. His lexicographic work Tishbī (‘The Tishbite’) listed words selected from the Talmud and other rabbinic sources; his Měturgěmān (‘The interpreter,’ Isny, 1541) analyzed the Aramaic lexicon of the Targūmīm (Aramaic versions of the Hebrew Bible). The latter is still instrumental for Aramaic research. His Sēfer ha-zikhrōnōt (‘Book of the mentions’) was reportedly the first concordance to the Bible (still in manuscript). Levita’s pioneering work on Old Western Yiddish, or Judeo-German, Shěmōt děvārīm (‘Names of things,’ Isny, 1542), was the first Yiddish–Hebrew dictionary. Levita is largely responsible for the development of the linguistic study of Hebrew in Europe, falling in line with David Qimḥi’s teachings, which Levita adopted as a model for his research.

See also: Aramaic and Syriac; Hebrew, Biblical and Jewish; Yiddish.

Lévy-Bruhl, Lucien (1857–1939)
S O’Neill, University of Oklahoma, Norman, OK, USA
© 2006 Elsevier Ltd. All rights reserved.

Lucien Lévy-Bruhl was born in Paris, France, on April 10, 1857. There he attended the École normale supérieure, receiving a broad education in philosophy, music, natural science, and clinical psychology, before graduating in 1879. After earning a doctorate in philosophy at the University of Paris, in 1896 Lévy-Bruhl accepted a professorship at the Sorbonne’s Department of Modern Philosophy, where he later served as chair, starting in 1904. A chance encounter with Chinese philosophy early in his career first prompted him to explore non-Western modes of thought, beginning an intellectual adventure that he continued until his death in 1939. Lévy-Bruhl saw the emerging literature on non-Western societies – though virtually unknown to previous generations of philosophers – as a promising testing ground for exploring the scope of human thought. Based on his extensive readings of the many ethnographic reports available at the time, Lévy-Bruhl began to build a second career as an anthropologist, later helping to found the Institute of Ethnology at the Sorbonne in 1925, with noted anthropologists Paul Rivet and Marcel Mauss. Yet after a falling-out with the eminent sociologist Émile Durkheim, Lévy-Bruhl left the Sorbonne in 1927, never to return. For the next decade, he continued his research on non-Western thought processes while delivering a series of popular though controversial lectures at Harvard, Berkeley, Columbia, and Johns Hopkins, to name just a few of the great universities where he presented his ideas. Lévy-Bruhl died in Paris on March 13, 1939, heartbroken about the coming war.

Lévy-Bruhl is best known for his bold speculations on cognition in preliterate societies, something he initially characterized as “primitive mentality” or even “prelogical thought,” before uncovering similar trends in Western societies. As a student of Greek philosophy, Lévy-Bruhl was struck by what he characterized as the non-Aristotelian nature of the thought processes found in such widespread phenomena as magical practices and totemism, or the belief in unseen spiritual forces. For none of these phenomena could be directly verified with the senses, nor were they generally amenable to logical proof or contradiction. When a magical spell failed to produce its result, another unseen force was blamed. When a family claimed descent from an animal or a plant, the premise was accepted through the force of tradition and emotional attachment, not empiricism or reason. Lévy-Bruhl’s awareness that all humans apply logical principles to some endeavors, such as economics, only brought the absence of so-called rational thought into greater relief, where it lay dormant.

Bent on illustrating the lesser development of non-Western societies, Lévy-Bruhl quickly uncovered evidence of ‘primitive mentality’ in the sphere of language, which he reported in his first major work, How natives think, first published in 1910. Most notably, Lévy-Bruhl stumbled on reports that numerals are limited in some languages, often with numbers going no higher than two, three, or four. Yet, to his credit, Lévy-Bruhl was quick to note that the absence of higher numerals does not preclude the possibility of counting, which he regarded as a human universal, available to all through addition, multiplication, or tallies, even where the set of numbers is small. He simply argued that numbers remained undeveloped where there is little use for counting, with language following the mental habits, though not the potential ability, of the speakers. Upon discovering that some languages have special grammatical markers for just two, three, or four objects, Lévy-Bruhl claimed these languages failed to develop the far more abstract category of the plural that is common in the West. Instead, he claimed those languages cling to the more ‘concrete’ concepts of the dual, trial, or quadral, where an exact number must be specified. He made a similar argument about classifier languages, arguing that a concrete shape must always be stated, at the expense of developing a more ‘abstract’ concept of number, apart from the

Medan M (1971). ‘Levita, Elijah.’ EJ 11, 132–135.
Weil G (1963). Élie Lévita, humaniste et massorète (1469–1549). Leiden: Brill.
Weil G (1964). Initiation à la Massorah: l’introduction au Sefer Zikhronot d’Élie Lévita. Leiden: Brill.
Weinberg W (1969/1970). ‘Elijah Levita.’ Jewish Book Annual 27, 106–110.
Willi T (1974). ‘Christliche Hebräisten der Renaissance und Reformation.’ Judaica 30, 78–85, 100–125.
Yalon H (1962–1964). ‘Mah beyn R. Shĕlōmōh Almōlī lĕ-R. Eliyyāhū Bāḥūr: Kĕlāl yĕsōdī bi-nĕṭiyyōt ha-shēm wĕha-pō‘al.’ Lĕshōnēnū 27–28, 225–229.

Lévy-Bruhl, Lucien (1857–1939) S O’Neill, University of Oklahoma, Norman, OK, USA © 2006 Elsevier Ltd. All rights reserved.

Lucien Lévy-Bruhl was born in Paris, France, on April 10, 1857. There he attended the École normale supérieure, receiving a broad education in philosophy, music, natural science, and clinical psychology, before graduating in 1879. After earning a doctorate in philosophy at the University of Paris, in 1896 Lévy-Bruhl accepted a professorship at the Sorbonne’s Department of Modern Philosophy, where he later served as chair, starting in 1904. A chance encounter with Chinese philosophy early in his career first prompted him to explore non-Western modes of thought, beginning an intellectual adventure that he continued until his death in 1939. Lévy-Bruhl saw the emerging literature on non-Western societies – though virtually unknown to previous generations of philosophers – as a promising testing ground for exploring the scope of human thought. Based on his extensive readings of the many ethnographic reports available at the time, Lévy-Bruhl began to build a second career as an anthropologist, later helping to found the Institute of Ethnology at the Sorbonne in 1925, with noted anthropologists Paul Rivet and Marcel Mauss. Yet after a falling-out with the eminent sociologist Émile Durkheim, Lévy-Bruhl left the Sorbonne in 1927, never to return. For the next decade, he continued his research on non-Western thought processes while delivering a series of popular though controversial lectures at Harvard, Berkeley, Columbia, and Johns Hopkins, to name just a few of the great universities where he presented his ideas. Lévy-Bruhl died in Paris on March 13, 1939, heartbroken about the coming war.

Lévy-Bruhl is best known for his bold speculations on cognition in preliterate societies, something he initially characterized as “primitive mentality” or even “prelogical thought,” before uncovering similar trends in Western societies. As a student of Greek philosophy, Lévy-Bruhl was struck by what he characterized as the non-Aristotelian nature of the thought processes found in such widespread phenomena as magical practices and totemism, or the belief in unseen spiritual forces. For none of these phenomena could be directly verified with the senses, nor were they generally amenable to logical proof or contradiction. When a magical spell failed to produce its result, another unseen force was blamed. When a family claimed descent from an animal or a plant, the premise was accepted through the force of tradition and emotional attachment, not empiricism or reason. Lévy-Bruhl’s awareness that all humans apply logical principles to some endeavors, such as economics, only brought the absence of so-called rational thought into greater relief, where it lay dormant.

Bent on illustrating the lesser development of non-Western societies, Lévy-Bruhl quickly uncovered evidence of ‘primitive mentality’ in the sphere of language, which he reported in his first major work, How natives think, first published in 1910. Most notably, Lévy-Bruhl stumbled on reports that numerals are limited in some languages, often with numbers going no higher than two, three, or four. Yet, to his credit, Lévy-Bruhl was quick to note that the absence of higher numerals does not preclude the possibility of counting, which he regarded as a human universal, available to all through addition, multiplication, or tallies, even where the set of numbers is small. He simply argued that numbers remained undeveloped where there is little use for counting, with language following the mental habits, though not the potential ability, of the speakers. Upon discovering that some languages have special grammatical markers for just two, three, or four objects, Lévy-Bruhl claimed these languages failed to develop the far more abstract category of the plural that is common in the West. Instead, he claimed those languages cling to the more ‘concrete’ concepts of the dual, trial, or quadral, where an exact number must be specified. He made a similar argument about classifier languages, arguing that a concrete shape must always be stated, at the expense of developing a more ‘abstract’ concept of number, apart from the


actual objects being counted. Of course, his Boasian contemporaries in the United States argued that ranking languages is a futile endeavor, like comparing apples and oranges, because all languages are complex in some areas and less so in others. At the other extreme, Benjamin Lee Whorf took cases like classifiers or the grammatical dual as evidence of great cognitive sophistication on the linguistic plane.

Surveying the literature on the languages of North America, Lévy-Bruhl encountered several detailed accounts of polysynthetic tongues, which he described as “pictorial” because of the high number of grammatical affixes that must modify the stem. Again, rather than portraying these languages as sophisticated, he charged the speakers of polysynthetic languages with requiring the painstaking specification of so many ‘concrete’ categories, indicating, for instance, the shape of an object as it moves through space, its precise directional bearing, or its placement at an exact moment in the near or distant future. At the same time, Lévy-Bruhl noted that many languages have a profusion of words for individual species, often far exceeding the common terms available in European tongues, while lacking a detailed series of generic categories for general classes of plants and animals. Again, instead of crediting indigenous languages with sophistication in some areas, Lévy-Bruhl selectively focused on the area in which a language appeared to be deficient from the perspective of English or French. By the same token, English, French, or German could also be regarded as equally deficient, with a paucity of grammatical affixes or precise names for the many animal species found in remote parts of the world!

Late in his life, Lévy-Bruhl eventually answered his critics, acknowledging that similar processes were at work in Western societies, a revelation that came only with the posthumous publication of his private notebooks in 1949. For even such a cherished doctrine as the Holy Trinity to some extent defies rational thought, with its assertion that one God is simultaneously present in three species as the Father, the Son, and the Holy Spirit. Like the concept of the totem, the proposition is beyond empirical proof and must be accepted on faith, emotion, and tradition, or not at all. Similarly, magical thinking was prevalent until recently, and though it is mostly gone, luck and prayer survive. No longer able to maintain the sharp distinction between Western and non-Western societies, Lévy-Bruhl recognized two streams of thought found in all societies, one ‘rational’ and another one ‘prelogical’ or ‘mystical.’ By collapsing the two-way split between the Western and non-Western worlds, he paved the way for his intellectual successor, Claude Lévi-Strauss, who continued to explore the scope of human thought, though with emphasis on purely universal processes, manifest in various ways across a range of societies.

See also: Historiography of Linguistics; Linguistic Anthropology; Primitive Languages; Psycholinguistics: Overview; Variation in Native Languages of North America.

Bibliography
Cazeneuve J (1972). Lucien Lévy-Bruhl. New York: Harper.
Lévy-Bruhl L (1899). History of modern philosophy in France. Chicago: Open Court Publishing.
Lévy-Bruhl L (1903). The philosophy of Auguste Comte, de Beaumont Klein K M (trans.). London: Sonnenschein.
Lévy-Bruhl L (1923) [1922]. Primitive mentality, Clare L A (trans.). New York: Macmillan.
Lévy-Bruhl L (1926) [1910]. How natives think, Clare L A (trans.). London: George Allen & Unwin.
Lévy-Bruhl L (1928) [1927]. The ‘soul’ of the primitive, Clare L A (trans.). London: George Allen & Unwin.
Lévy-Bruhl L (1935) [1931]. Primitives and the supernatural, Clare L A (trans.). New York: E. P. Dutton.
Lévy-Bruhl L (1975) [1949]. The notebooks on primitive mentality, Rivière P (trans.). Oxford: Basil Blackwell & Mott.
Lévy-Bruhl L (1983) [1935]. Primitive mythology: the mythic world of Australian and Papuan natives, Elliot B (trans.). St. Lucia: University of Queensland Press.

Lewis, Henry (1889–1968) B F Roberts, Ceredigion, UK © 2006 Elsevier Ltd. All rights reserved.

As one of the first generation of professors of Welsh at the University of Wales, Henry Lewis, like his colleagues, was called upon to prepare editions of and commentaries on a wide range of texts and to write numerous lexicographical notes, but from the early years of his career he made the study of Welsh syntax and comparative grammar the cornerstones of his own pioneering research. He became the foremost Welsh comparativist and grammarian of the first half of the 20th century.


Born on August 21, 1889 in Ynystawe in the Swansea Valley in Glamorganshire, south Wales, Henry Lewis was educated at his local county school at Ystalyfera, where he received a sound classical education and was fortunate in also having a scholarly and inspiring Welsh teacher who attracted him to Welsh studies. In 1907 he entered the University College of South Wales at Cardiff, where he studied under Thomas Powel, taking his degree in Welsh in 1910. Lewis gained his research degree in 1913 for work on the Welsh translations of Geoffrey of Monmouth’s Historia Regum Britanniae, and he also spent a period at Jesus College, Oxford, with Sir John Rhys, professor of Celtic; though he did not take a degree, one can assume that Rhys would have broadened his outlook toward the wider field of Celtic studies. Following a period of school teaching and war service, Lewis was appointed assistant lecturer in Welsh at Cardiff, where he remained from 1918 to 1921. The university college of Swansea, another constituent college of the University of Wales, was opened in 1920; the following year Henry Lewis took up his post as the first professor of Welsh. He was to remain there until his retirement in 1954. He died on January 14, 1968. At Swansea, Lewis had the responsibility of establishing a university department from scratch and teaching a full initial degree program with the help of one, sometimes two, lecturers. He was also actively involved in the broader Welsh cultural scene as a member of official bodies and institutions and as a popular extramural teacher, educationalist, and convinced Congregationalist. These activities and the different facets of his teaching are reflected in the range of his publications, from Early Modern Welsh translations and renaissance prose, to editions of 18th and 19th century minor classics and school and church texts. One of Lewis’s main contributions, however, was his editions of medieval prose and poetry. 
His studies of cywyddwyr, poets of the 15th and 16th centuries, especially in a joint work, Cywyddau Iolo Goch ac eraill (The cywyddau of Iolo Goch and others, 1925, 1937) helped to establish modern scholarly editorial conventions for this body of verse. More innovative was his work on the poets of the Welsh princes of the 12th and 13th centuries; his Hen gerddi crefyddol (Old religious verse, 1931) was the first modern edition of any of the poems of these poets, and his annotations helped to lay the foundations for later studies of metrics, style, and lexicon. In 1942 Lewis returned to the subject of his early research. Brut Dingestow is an edition of a single manuscript translation of Geoffrey of Monmouth.

His analysis of some of the grammatical features of this 13th century text, the culmination of work begun in other editions of medieval prose texts, Chwedleu seith doethon Rufein (The seven sages of Rome, 1924) and Delw y byd, Imago mundi (with Paul Diverres, 1928), is an extended, though never intended as a comprehensive, grammar of Middle Welsh. His Elfen Ladin yn yr iaith Gymraeg (The Latin element in Welsh, 1943) remains a useful handbook for plotting the phonological changes from British to Welsh, while his deep knowledge of British Celtic languages enabled him to make a significant contribution in the examination of their phonology and morphology in the revised English edition of Holger Pedersen’s Vergleichende Grammatik der Keltischen Sprachen (A concise comparative Celtic grammar, 1937; Russian translation in 1954, with Lewis’s further corrections and a supplement in 1961). Lewis had begun to publish grammars of Middle Breton and Middle Cornish in 1922 and 1923 (revised editions in 1935, 1946; new edition with J. R. F. Piette in 1966; German editions in 1990), and there is no doubt that it was he who paved the way for these languages to become an integral part of Celtic linguistics in Wales and beyond.

Lewis’s other major, and probably more important, achievement in Welsh scholarship was to make syntactical analysis central to Welsh linguistics. His work on Old Welsh syntax, prepositions, the syntax of the verb and of relative clauses, and especially his description and proposed history of the syntax of the unmarked noun-initial sentence, set out in his 1942 British Academy Lecture, ‘The sentence in Welsh,’ have been very influential (though it has to be said that this thesis has been criticized in recent years). Lewis was an inspiring teacher, some of whose students have become leading Welsh grammarians; he was a fluent communicator, and his Datblygiad yr iaith Gymraeg (The development of the Welsh language, 1931, revised in 1946, German edition in 1989), written for non-academic readers, is still a scholarly, readable account of language development.

See also: Breton; Celtic; Cornish; Pedersen, Holger (1867–1953).

Bibliography
Bachellery E (1968–1969). ‘Henry Lewis (1890 [sic]–1968).’ Études Celtiques XII, 276–281.
Evans D E (1970). ‘Rhestr o gyhoeddiadau y diweddar Athro Henry Lewis (1889–1968)’ (A bibliography of Henry Lewis’s published work). Journal of the Welsh Bibliographical Society 10, 144–152.


Lewis H (ed.) (1924). Chwedleu Seith Doethon Rufein. (2nd edn., Cardiff, 1958). Cardiff: University of Wales Press.
Lewis H (1931). Datblygiad yr iaith Gymraeg. (The development of the Welsh language) (rev. edn., Cardiff, 1946; tr. by Wolfgang Meid, Die Kymrische Sprache, Innsbruck: Innsbrucker Beiträge zur Sprachwissenschaft, 1989). Cardiff: University of Wales Press.
Lewis H (ed.) (1942). Brut Dingestow. Cardiff: University of Wales Press.
Lewis H (1942). ‘The sentence in Welsh.’ Sir John Rhys Memorial Lecture. Proceedings of the British Academy xxviii, 259–280.
Lewis H (1946). Llawlyfr Cernyweg Canol (Handbook of Middle Cornish) (prev. edn., Aberdâr, 1923; tr. by Stefan Zimmer, Handbuch des Mittelkornischen, Innsbruck: Innsbrucker Beiträge zur Sprachwissenschaft, 1990). Cardiff: University of Wales Press.
Lewis H (1990). Handbuch des Mittelbretonischen, tr. by Wolfgang Meid. Innsbruck: Innsbrucker Beiträge zur Sprachwissenschaft.
Lewis H & Diverres P (eds.) (1928). Delw y Byt (Imago Mundi). Cardiff: University of Wales Press.
Lewis H & Pedersen H (1937). A concise comparative Celtic grammar (with corrections and a supplement by Henry Lewis, Göttingen, 1961; tr. by Smirnov A A, edited by Jarceva V N, Kratkaja sravnitel’naja grammatika kel’tskich jazykov, Moscow, 1954). Göttingen.
Lewis H & Piette J R F (1966). Llawlyfr Llydaweg Canol (Handbook of Middle Breton), rev. edn. (prev. edns., Aberdâr, 1922, new edn., Cardiff, 1935). Cardiff: University of Wales Press.
Lloyd D M (2001). ‘Henry Lewis.’ In Jones E D & Roberts B F (eds.) The dictionary of Welsh biography 1941–1970. London: Honourable Society of Cymmrodorion. 162–163.
Mac Cana P (1973). ‘On Celtic word-order and the Welsh “abnormal” sentence.’ Ériu XXIV, 90–120.
Mac Cana P (1991). ‘Further notes on constituent order in Welsh.’ In Fife J & Poppe E (eds.) Studies in Brythonic word order, current issues in linguistic theory, vol. 83. 45–80.
Thomas B B (1971). ‘Henry Lewis: Gwerinwr, Cymro’ (Henry Lewis, man of the people, Welshman). National Library of Wales Journal XVII, 121–135.

Lexeme–Morpheme Based Morphology R Beard, Bucknell University, Lewisburg, PA, USA © 2006 Elsevier Ltd. All rights reserved.

Introduction

Lexeme–Morpheme Based Morphology is a complete set of lexeme-based morphological theories and hypotheses including the following:

1. The Separation Hypothesis, that lexical and inflectional derivation are distinct from affixation (phonological realization);
2. The Universal Grammatical Function Theory, whereby the functions of inflectional and lexical derivation are one and the same;
3. The Base Rule Hypothesis, that universal functions must originate in a base component if we are to explain both lexical and syntactic (inflectional) derivation (Botha, 1988);
4. Stephen Anderson’s General Theory of Affixation (Anderson, 1992), which, as extended recently by Anderson, predicts the placement of all affixes and clitics;
5. The Defective Adjective Hypothesis, which claims that adpositions are adjectival pronouns in a class with case endings and hence grammatical morphemes rather than lexemes;
6. A morphological performance theory, which includes: a theory of lexical stock expansion processes; a theory of normal speech errors; a theory of pathological speech errors (morphological agrammatism).

The LMBM lexicon is exclusively the domain of lexemes, which are defined specifically as noun, verb, and adjective stems and the lexical categories that define them (number, gender, transitivity, and so on). All other meaningful material must belong to the closed set of closed categories of grammar and is handled by ‘morphology’ in the general sense (derivation plus affixation). LMBM, then, distinguishes itself from other lexeme-based theories in that it maintains a pristine distinction between lexemes and grammatical morphemes and consequently predicts this distinction at every level of language and speech.

Lexemes and Morphemes

Lexeme–Morpheme Based Morphology (LMBM) is a variant of what Aronoff (1994) refers to as a ‘lexeme-base’ morphological theory. Lexeme-base morphology assumes that only the lexeme is a true linguistic sign, where ‘lexeme’ is defined exclusively and explicitly as any and all noun, verb, and adjective stems. The effects of lexical and inflectional derivation on the lexeme do not affect its status as a sign at all. These processes, it follows, must involve elements other than linguistic signs. Lexical (derivational) and inflectional morphemes hence must represent different types of linguistic elements. The fundamental claim distinguishing LMBM from other morphological frameworks is its rigid distinction of lexemes and (grammatical) morphemes; hence its name, ‘Lexeme–Morpheme Based Morphology.’ Beard (1988, 1995) argues that grammatical morphemes differ from lexemes in the following ways:

1. lexemes belong to open classes; morphemes belong to closed classes.
2. lexemes do not allow zero or empty forms; morphemes do.
3. lexemes have extragrammatical referents; morphemes have grammatical functions.
4. lexemes may undergo lexical derivation; morphemes may not.
5. lexemes are not paradigmatic; morphemes are.

The Separation Hypothesis

If lexemes are noun, verb, and adjective stems, and if they are the only objects stored in the lexicon, it is natural to ask what determines or generates grammatical morphemes. Grammatical morphemes are the output of purely phonological operations independent of the semantic (grammatical) operations they mark (‘realize’). This conclusion is the Separation Hypothesis. The Separation Hypothesis splits all derivation, lexical and inflectional alike, into three processes: lexical (L-) derivation, inflectional (I-) derivation, and morphological spelling. Derivation involves operations on abstract lexical and inflectional category functions such as [+Plural, -Singular], [+Past, -Present], [+1st], and the like. Spelling is the purely phonological realization of the morphological categories of any base lexeme that has undergone such derivation. Its function is to distinguish stems that have undergone derivation from those that have not. If the derivation is inflectional, the marker may be attached to the lexical stem or assigned independently to a structural position in syntax in ways which syntax alone cannot predict. For example, [+Definite] singular nouns in Swedish are suffixed, but [-Definite] singular nouns are marked by the same morpheme in a free position, e.g., en katt ‘a cat’ vs. katt-en ‘the cat.’ Hence, the LMBM lexicon contains neither bound nor free grammatical morphemes – both are the responsibility of morphological spelling.
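The split between derivation and spelling can be pictured with a toy model of the Swedish example. The sketch below is only an illustration under stated assumptions: the names `Lexeme`, `derive`, and `spell` are hypothetical, features are plain strings, and the spelling rule covers only the singular en-noun pattern quoted above. It is not a formalization taken from the LMBM literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    """A stored lexeme: a stem plus abstract grammatical features (hypothetical model)."""
    stem: str
    features: frozenset = frozenset()


def derive(lexeme: Lexeme, feature: str) -> Lexeme:
    """Derivation: add an abstract feature; the stem's phonology is left untouched."""
    return Lexeme(lexeme.stem, lexeme.features | {feature})


def spell(lexeme: Lexeme) -> str:
    """Spelling: realize features phonologically, as a suffix or as a free form."""
    if "+Definite" in lexeme.features:
        return f"{lexeme.stem}-en"   # bound realization: katt-en
    if "-Definite" in lexeme.features:
        return f"en {lexeme.stem}"   # free realization of the same marker: en katt
    return lexeme.stem               # no derivation, so nothing to mark


katt = Lexeme("katt")
print(spell(derive(katt, "+Definite")))  # katt-en
print(spell(derive(katt, "-Definite")))  # en katt
```

In this sketch the marker en never appears in the stored lexeme; it is introduced only by `spell`, which mirrors the claim that the lexicon contains neither bound nor free grammatical morphemes.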

‘Morphemes’

What, then, are ‘morphemes’ under this hypothesis? Bound morphemes include affixes and other modifications of the phonological representation of a lexeme, such as reduplication and (Semitic) revoweling, the spelling out of articles, auxiliaries, adpositions – any expression associated with a grammatical category or relation rather than a semantic one. Under LMBM, L-derivation takes place in the lexicon. Here, lexemes with new lexical meanings are formed from underived lexemes. Inflectional derivation, following the arguments of Matthews (1972), Anderson (1992), and Aronoff (1994), takes place in syntax. Both these derivational types are phonologically realized by an autonomous morphological spelling component located after all syntactic rules but before any phonological rules operate. Unlike free morphemes, then, bound morphemes are phonological modifications (only) of the phonological representations of lexemes, which mark the fact that the lexeme has undergone lexical or inflectional derivation (the Empty Morpheme Entailment). Neither type of morpheme has any meaning, and thus morphemes attribute no meaning to the stems to which they attach. All meaning is accounted for by the derivation rules. This immediately explains empty and zero morphemes. Empty morphemes represent affixation without derivation; zero morphemes are derivations without affixation. Morphological asymmetry, the many-to-one and one-to-many mappings of morphological forms to function, is also explained by Separation. Free morphemes, on the other hand, may require syntactic positions, since they are ostensibly subject to movement and also often belong to paradigms. I have in mind now such morphemes as free adpositions, auxiliaries, conjunctions, and pronouns. All but conjunctions are themselves subject to inflection. Pronouns such as which and who, as well as free auxiliaries, seem to move from one syntactic position to another in some languages. 
However, even the positions of auxiliaries and pronouns usually conform to Anderson’s General Theory of Affixation, which LMBM also assumes. Moreover, movement may simply be the phonological realization of grammatical morphemes in positions other than that in which the morphological features determining such realization appear (Beard, 1995).

Lexical Derivation

Derivation rules operate in the lexicon (L-derivation) and in syntax (I-derivation). The illusion of a direct relation between the sound and meaning of affixes is the result of two facts. First, derivation rules and

realization rules operate on the same lexeme during the generation of a derived word. This means that affixation will correspond to the modified features in the stem. (The original features of the stem will also be modified, neutralized, so that they will not trigger affixation.) The order of affixes generally, but not always, follows that of the features in the base. The assumption of sign base morphology (Saussure, 1916) results from the fact that the default procedure for the MS (spelling) component is to realize features in the order in which it encounters them. However, feature order is often irrelevant, as in the case of affixes, which can precede or follow some other affix, and cumulative exponents, which realize two or more features simultaneously. Beard (1988, 1995) describes four types of L-derivation: transpositions, functional derivation, feature switches, and expressive derivation. Transposition simply changes the lexical class (category) of a lexeme, e.g., N to V, A to N. Functional derivations add a semantic(ally interpretable) category function, such as Subject, Object, Locus, Manner, to the featural inventory of the base lexeme. Lexical switches modify the value of inherent lexical features such as Gender, Number, and Paradigm Class. Finally, expressive derivations, such as Diminutive and Augmentative, reflect the speaker’s attitude in ways still not clearly understood. The Subjective (Agentive) nominalization, e.g., ‘bake : baker,’ is a familiar L-derivation. It invokes three of the four L-derivation types, a functional derivation illustrated in Figures 1–3, a transposition, whose output is as seen in Figure 3, and a featural switch, whose output is illustrated in Figure 4.

Figure 1 is a simplified illustration of the underlying form of an NP with a relative clause, the origin of all nominalizations under LMBM. The problem is that for all the structure, only one phonologically realizable lexeme has been inserted. The lexicon has two options: remove the structure and nominalize the verb, or fill in the structure with lexical categories it knows and controls (Gender, Number, and Noun Class). If the lexicon chooses the former tack, the superfluous structure marked in Figure 1 is eliminated (Figure 2). The result is a single V incorrectly positioned under an NP. When this occurs, the lexicon’s job is to transpose the illegal category to the legal one, i.e., provide the verb with nominal category functions (Figure 3). Figure 3 also shows the effects of the feature switches, which sets the functional features added by the transpositions. The output of this series of processes is then submitted to morphological spelling (the MS-module), similar to ‘Spellout’ in related models. Of course, the lexical description of the verb bake, {BAKE}, is preserved in the derivation and will continue to determine government and binding properties. This set of assumptions accounts for the following facts. Languages with L-derivation have exactly two ways of expressing the same notion using identical lexemes: one lexical (bak-er), the other syntactic (someone who bakes). The same functions (Subject, Object, Instrument, Location) found among Case relations also determine functional L-derivations. Although there are two types of derivations, lexical and inflectional, only one spelling component realizes the results of both.

Figure 1 A functional derivation.

Figure 2 Input to a transposition.

Figure 3 Output of a transposition.

Now that we have followed a lexical derivation, let us examine an inflectional one operating on the same underlying form as in Figure 1.

Figure 4 Output of a featural switch.

Inflectional Derivation

If the lexicon decides not to operate on a structure such as that in Figure 1 above and does not collapse all that structure into a single word, it must license all the nodes that pass through it for syntax; that is, it must assure that all NPs are headed by acceptable Ns, all VPs by Vs, all APs by As. This is apparently the role of pronominals. Pronominals are also tightly constrained by LMBM: categorically, they may contain only those lexical categories determined by the lexicon and the inflectional categories found in syntax. No other type of semantic or grammatical category may be found in a pronominal (proadjective, pronoun, proverb). That is, for example, pronouns may be completely described in terms of Number, Gender, Lexical Category, etc. However, the processes of transposition and feature switching provide precisely the categories demanded by the empty nodes in Figure 1, and these categories are necessary and sufficient conditions on the semantic interpretations of pronominals. If no L-derivation applies to the example in Figure 1, it will be provided with the lexical categories of Gender, Noun Class, and Number from the lexicon. (Animacy is derivable from Natural Gender; see Beard, 1995.) If no L-derivation applies to the example in Figure 1, these pure lexical categories, along with the inflectional categories of the Subject NP, will be raised to C by wh-movement, and the output of (1) from syntax will be, roughly, the structure illustrated in Figure 4. Syntactic structures like the example in Figure 4, as well as lexically derived structures as in Figure 3, are then subjected to the phonological realization rules of the morphological spelling or MS component. LMBM postulates only one integrated MS component for both lexical and inflectional derivations. Pronouns such as one, he, she, it, being grammatical morphemes, will be phonologically realized by the same MS component that inserts affixes.

Morphological Spelling (Realization)

Morphological spelling (MS) comprises the phonological realization rules of morphology. These are rules that determine the modification of lexical stems, if any, conditioned by the presence of derivational features. The MS module can read any features of a lexeme but can modify only the phonological (P) representation. It must contain a short-term memory component that can hold several features before writing to the P-representation of the lexeme. This accounts for the fact that a single affix can represent several grammatical features. MS memory will have to collect the features [+Active], [+Indicative], [+1st], and [+Singular] before it can attach the /o/ to the Latin stem am- in order to generate amo ‘I love.’ This same capacity allows the MS component to reorder affixes, such as in Turkish gelir-ler-se and gelir-se-ler, both of which mean ‘if they come.’ Realizations of the pronoun one, i.e., someone, anyone, the one(s), result from accumulations of lexical features like [±Masculine], [±Plural] with syntactic features like [±Specific] and [±Definite] inserted into empty nodes that are not allowed by English grammar. The MS-module interprets both the lexical and inflectional features of the ultimate nodes and realizes them as pronouns, i.e., the one/someone who bake-s. Lexeme representations obviously comprise a phonological, a grammatical, and a semantic representation. The MS-module must be able to read all three representations but can modify only P, given its purely phonological nature. In fact, this is the function of morphology, which is the interface between the lexicon and syntax, on the one hand, and the phonological or sign output, on the other.
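The short-term memory behaviour described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not part of LMBM itself: the table `EXPONENTS`, the function `spell`, and the representation of features as strings are all hypothetical, and the only linguistic fact encoded is the Latin example from the text. The sketch collects every feature before it touches the phonological form, and it changes nothing but that form.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical exponent table: each entry pairs a bundle of derivational
# features with a purely phonological suffix. Only the Latin /o/ entry is
# taken from the text; the table format itself is an assumption.
EXPONENTS: Tuple[Tuple[FrozenSet[str], str], ...] = (
    (frozenset({"+Active", "+Indicative", "+1st", "+Singular"}), "o"),
)


def spell(stem: str, features: Iterable[str]) -> str:
    """Return the phonological form of `stem` after morphological spelling."""
    memory = set()
    for feature in features:
        memory.add(feature)  # hold features in memory before writing anything
    for bundle, suffix in EXPONENTS:
        if bundle <= memory:  # the whole bundle must have been collected
            return stem + suffix  # one affix realizes several features at once
    return stem  # no bundle matched: the stem is left unmodified
```

Because the features are held as an unordered set, the order in which they arrive does not affect the result, which mirrors the claim that feature order is often irrelevant to realization.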

Conclusion

This brief introduction to LMBM has outlined the basic assumptions about (1) the lexicon, (2) lexical derivation rules, (3) inflectional derivation rules, and (4) morphological realization. The full theory is available in Beard (1995). Beard (1981, 1987) provides a general theory of morphological performance that distinguishes most types of extragrammatical derivational processes from those determined by grammar.

See also: Affixation; A-Morphous Morphology; Morphology: Overview.

Bibliography

Anderson S R (1992). A-morphous morphology. Cambridge: Cambridge University Press.
Aronoff M (1994). Morphology by itself: stems and inflectional classes. Cambridge, MA: MIT Press.
Beard R (1981). North-Holland Linguistic Series 44: The Indo–European lexicon: a full synchronic theory. Amsterdam: North-Holland.
Beard R (1987). ‘Lexical stock expansion.’ In Gussmann E (ed.) Rules and the lexicon: studies in word formation. Lublin: Catholic University Press. 24–41.
Beard R (1988). ‘On the separation of derivation from morphology: toward a lexeme–morpheme based morphology.’ Quaderni di semantica 9, 3–59.
Beard R (1995). Lexeme–morpheme base morphology. Albany: SUNY Press.
Botha R (1980). Stellenbosch Papers in Linguistics 5: Word-based morphology and synthetic compounding. Stellenbosch: University of Stellenbosch.
Halle M & Marantz A (1993). ‘Distributed morphology and the pieces of inflection.’ In Hale K & Keyser S J (eds.) The view from building 20: essays in linguistics in honor of Sylvain Bromberger. Cambridge, MA: MIT Press. 111–176.
Matthews P (1972). Inflectional morphology. Cambridge: Cambridge University Press.
Saussure F de (1916). Cours de linguistique générale. Paris: Payot.
Szymanek B (1988). Categories and categorization in morphology. Lublin: Catholic University Press.
Volpe M (2002). ‘Locatum and location verbs in lexeme–morpheme base morphology.’ Lingua 112, 103–119.

Lexical Acquisition

D McCarthy, University of Sussex, Brighton, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Lexical acquisition is the production or augmentation of a lexicon for a natural language processing (NLP) system. The resultant lexicon is a resource like a computerized dictionary or thesaurus but in a format for machines rather than people. The entries in a lexicon are lexemes, and the information acquired for these includes the forms, meanings, collocations, and associated statistics. Acquisition is vital for NLP because the performance of any system that processes text or speech is dependent on its knowledge of the vocabulary of the language being processed. As well as an extensive grasp of the vocabulary, a system needs a means to cope when it encounters a word that it has not seen before. Because of the requirements of an application, there may be a case for limiting a system to the words appropriate within a specific domain. However, even in a specific domain, everyday words will be used and there will be additional domain-specific terminology. It is possible, and sometimes appropriate, to build a system that can run using only a small vocabulary. However, it is nevertheless necessary to acquire appropriate forms and meanings of the lexemes for the given domain. The lexical information required is not explicitly listed in any available resource, and getting humans to provide it is extremely costly. For this reason lexical acquisition is frequently referred to as the bottleneck for NLP. In order to produce lexicons for deployable NLP systems, lexical acquisition must be automated.

There are significant differences between the requirements of a lexicon intended for a computer system and the contents of a dictionary or thesaurus written for humans. Machine-readable dictionaries and thesauruses (MRDs and MRTs) have been designed for online use by humans. If the information contained were sufficient, then one could permit an NLP system to use them directly, or after some simple reformatting. However, while it was hoped that machine-readable resources produced by humans for humans would be a viable means of populating lexicons, it is now widely recognized that this method will not work. The resources are prone to omissions, errors, and inconsistencies. Furthermore, the lexicographers producing these resources rely on the fact that users have an adequate grasp of the language, knowledge of the world, and basic intelligence to understand the definitions supplied and fill in any gaps. For example, a human user can look up unknown words in definitions and determine the meaning of these words given the context in the definition. A computer is not predisposed to determining the meaning from definitions, e.g., given the definition of pipette as a slender tube for measuring or transferring

small quantities of liquid, a computer may need to look up the meaning of tube, and would then be faced with the ambiguity that tube can mean metro as well as hollow object. In addition to the errors, omissions and problems of format, NLP systems require information that is simply not present in a dictionary because humans do not need it. A prime example of this deficiency is frequency information. Frequency information permits systems to concentrate effort on the more likely analyses or utterances. This evidence is important since a great many interpretations are possible for most utterances, and most information can be said in more than one way. The majority of NLP systems rely on statistical processing that requires probability estimates obtained from frequency data. Not only do lexicons require frequency data, but they need to be tailored to the domain of the application. Man-made resources are typically general purpose. For these reasons, most acquisition is now performed automatically from corpora, comprising large collections of real life text or speech samples. The corpus is chosen with regard to the intended domain of the application. Man-made resources are still often used in hybrid systems that collect the frequency data from the corpora but with recourse to the entities provided in the MRDs. There are also cases where acquisition is performed semiautomatically, with human lexicographers guiding the acquisition process or correcting the output.
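As a minimal sketch of how frequency data from a corpus becomes the probability estimates mentioned above, the following computes relative frequencies from a token list. The function name `relative_frequencies` is hypothetical. Plain relative frequency is only the simplest possible estimator and is assumed here for illustration; practical systems usually adjust such estimates for unseen or rare items.

```python
from collections import Counter
from typing import Dict, List


def relative_frequencies(tokens: List[str]) -> Dict[str, float]:
    """Estimate the probability of each word form as count / corpus size."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}  # an empty corpus supports no estimates
    return {word: count / total for word, count in counts.items()}
```

Running the same function over a corpus drawn from the application's domain, rather than a general-purpose one, is what tailors the estimates to that domain.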

Resources

Machine-Readable Dictionaries

Dictionaries contain entries for words that are broken down into the various meanings, or senses, of the word. These meanings are supplied with definitions and part-of-speech information, i.e., whether this use of the word is a noun, verb, adjective, or adverb. Other information is also sometimes available, for example the various morphological forms of the word, its phonological realization and information on syntactic behavior. Some dictionaries also include a subject code or domain label, for example, the Longman Dictionary of Contemporary English (LDOCE) (Proctor, 1978), or the Oxford Dictionary of English (Soanes and Stevenson, 2003). LDOCE also provides information on the semantic class of the arguments of verbs (for the subject, direct object, and indirect object grammatical slots).

Machine-Readable Thesauruses

Entries in thesauruses provide words with similar meanings. The most widely used MRT is WordNet (Fellbaum, 1998), not just because of the wealth of information, particularly semantic, but also because of its free availability. WordNet is an online thesaurus, organized by semantic relations rather than alphabetically. Words are classified by their part-of-speech (noun, verb, adjective, and adverb). They are then subdivided into small classes called ‘synsets’ where members are near synonyms of each other. These synsets are then linked together by semantic relationships such as hyponymy (nouns and verbs), meronymy (nouns), antonymy (adjectives), and entailment (verbs). Figure 1 illustrates some of these relationships. Similar resources are available in other languages; for example, a number of WordNets are being distributed, notably for European languages. There are also dictionaries in other languages organized by semantic relationships, for example, EDR (NICT, 2002).

Corpora

Corpora range in the amount of linguistic annotation provided. Some occur in raw form, as archives of text or spoken language. Some are ‘balanced’ with a mix of texts selected from a range of genres and domains, for example, the British National Corpus (Leech, 1992). This corpus also includes part-of-speech annotations for every word form. Other annotations are sometimes available, for example the Penn Treebank II (Marcus, 1995) includes syntactic analyses that have been hand-corrected from the output of an automatic parser. SemCor (Landes et al., 1998) is a 220 000-word corpus that has been manually annotated with WordNet sense tags. Acquisition from labeled data (supervised acquisition) is often more accurate than acquisition from raw text (unsupervised acquisition). However, annotations are costly to create and are not always available for a particular language or text type. Unsupervised systems are useful when there is no labeled data to learn from. The lack of annotation for corpora is compensated for in part by the size of the corpora available.

Multilingual Resources

As well as monolingual dictionaries and corpora, there are also some multilingual resources available. These resources are potentially extremely useful since the redundancies and different forms in one language can help resolve ambiguities in another. CELEX (Baayen et al., 1995) contains databases for English, Dutch, and German that include syntactic, morphological, phonological, and orthographic information.

Figure 1 Some semantic relationships in WordNet.

Other multilingual dictionaries and thesauruses often have links between words with the same meaning in different languages. Multilingual corpora include both parallel corpora, where the data in one language has been translated into another, and comparable corpora where essentially the same content, such as news events, has been collected from different sources.

Automatic Techniques

There are a wide variety of techniques used for lexical acquisition, and the appropriate solution will depend on the information sought and the resources available. The simplest approach is to look up information directly in an existing MRD. For many types of information this approach is not possible because the information is not present or complete. Furthermore, it is quite possible that a usage of an existing word will not be listed in an MRD. Alternatively, a system can be built to learn the information from data, referred to as training data, usually obtained from corpora. An extremely simple method is to apply pattern-matching techniques to the training data. This method has yielded success for some types of information, notably ‘is-a’ semantic relations between nouns. So, for example, using the pattern ‘X is a type of Y’ → X is-a Y and the text A dog is a type of animal, a system can be made to infer dog is-a animal.

The majority of approaches use some form of statistics. There are straightforward applications of textbook statistics, e.g., the chi-square or t-test, to frequency counts of word co-occurrences. A significant problem for all statistical approaches is that the frequency distributions of linguistic phenomena are skewed so that a minority of forms or meanings takes up the majority of instances in a corpus. This inequity means that systems need to cope with the problem of data sparseness, and that for a great many forms it is hard to get good frequency estimates. There are more complicated methods that incorporate statistical estimates. There are techniques that use concepts from information theory. For example, Lin (1998) proposed a measure of similarity that takes into account the shared contexts of words.

Machine-learning techniques, which may also involve a statistical component, induce lexical information from examples. The examples can be labeled (supervised training) or unlabeled (unsupervised). In some supervised approaches, often referred to as memory-based learning, stored examples can be simply compared to a new candidate, and the closest fitting neighbor is used to obtain a label for the new candidate. Alternatively, instead of simply storing the examples, a system can generalize from the examples and find a decision tree that best partitions the examples according to the labels. Unsupervised learning is frequently performed by clustering entities according to features, e.g., words according to the contexts in which they occur. For example, the verbs in Table 1 might be clustered into three groups according to the frequency of their direct objects.
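The pattern-matching method described earlier in this section, using the pattern ‘X is a type of Y’, can be sketched with a single regular expression. The names `IS_A_PATTERN` and `extract_is_a` are hypothetical, and the expression is a deliberately naive assumption: it handles only single-word nouns and exactly this one wording, whereas a practical system would need many patterns and some syntactic analysis.

```python
import re
from typing import List, Tuple

# One hand-written pattern for 'X is a type of Y'. An optional leading
# article ('a' or 'an') is skipped so that only the noun X is captured.
IS_A_PATTERN = re.compile(
    r"\b(?:an?\s+)?(\w+)\s+is\s+a\s+type\s+of\s+(\w+)",
    re.IGNORECASE,
)


def extract_is_a(text: str) -> List[Tuple[str, str]]:
    """Return (X, Y) pairs, lowercased, for every match of the pattern."""
    return [(x.lower(), y.lower()) for x, y in IS_A_PATTERN.findall(text)]
```

Applied to the sentence from the text, A dog is a type of animal, the function yields the single pair (dog, animal), which corresponds to the inference dog is-a animal.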

Table 1 Clustering verbs by the frequency of their direct objects

The Entries and Acquired Information

The entries stored in a lexicon are referred to as lexemes. They are listed with lexical information depending on the needs of the application that the system is being produced for. For example, a spelling checker may only require orthographic information to render the lexeme into word forms that can occur in text. Additional lexical information is often needed by applications, including pronunciation; part-of-speech; morphology; syntax, argument structure, and preferences; semantics; pragmatics; and multiwords.

Pronunciation

The phonetic realization of a word is needed by systems that recognize or produce speech rather than text. Word-specific pronunciation can be learned automatically from examples. This possibility has been demonstrated for Dutch (Daelemans and Durieux, 2000). In this memory-based approach, the examples to teach the system were constructed by aligning the letters of a word and its context to the phonemes. Pronunciations of new words are acquired by comparing the input letter sequence to stored examples. The approach could be applied to a language other than Dutch, provided that a grapheme-phoneme alignment is possible for that language. In other work, stress has also been acquired automatically using memory-based learning techniques (Daelemans et al., 1994).

Part-of-Speech

The part-of-speech of a given word is important for morphological, syntactic, and semantic analysis. Indeed, since a given word may belong to different parts-of-speech, frequency information is often required in order to arrive at the most likely part-of-speech for a word in a given sentence. This information is typically obtained using a labeled corpus, but the possible parts-of-speech for a word can also be induced using frequency information of other words in the context (Finch and Chater, 1994). While acquisition of part-of-speech is required for all words, further information on morphology, syntax, and semantics is usually acquired for nouns, verbs, adjectives, and adverbs only. These are open-class words that change meaning and behavior depending on the domain and with time. The set of open-class words increases, whereas closed-class words are a restricted set for which most of the information can be obtained in one go and does not change radically.

Morphology

The word forms that a lexeme can take are important for NLP systems. The variety of forms is due to inflections (to indicate number, gender, tense, etc., for example, eat versus ate) and derivations (from a different part-of-speech, for example, central → centralize). While word forms are explicit in corpus data, acquisition is not straightforward because of both unseen and ambiguous forms. It is quite common for a given form to serve several functions (syncretism). For example, one cannot distinguish between the past tense and past participle forms of many English verbs, e.g., I jumped versus I have jumped. There are also cases where word forms of the same lexeme bear no morphological relationship, e.g., go and went. Morphologically related words can be found using a combination of string edit measures, which measure the changes required to turn one string into another, and word co-occurrence statistics designed to capture semantic similarity. Neural networks, a method useful for detecting patterns in data, have been used for learning roots and suffixes from input data. Furthermore, the stems, suffixes, and families (or signatures) of suffixes of a language can all be learned from corpora, for example, using methods from information theory that strive for the best compromise between a compact model and a good explanation of the data (Goldsmith, 2001).
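The string edit measures mentioned above count the changes needed to turn one string into another. A minimal sketch follows, assuming the common Levenshtein formulation with unit costs for insertion, deletion, and substitution; the text does not say which edit measure the cited systems use, so this choice and the function name `edit_distance` are assumptions.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance: fewest single-character edits from source to target."""
    # previous[j] holds the distance between the source prefix processed
    # so far and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i source characters into an empty target
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(
                min(
                    previous[j] + 1,         # delete s_char
                    current[j - 1] + 1,      # insert t_char
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]
```

On the examples in the text, jump and jumped are two edits apart, while go and went share no characters and are four edits apart. That gap is why an edit measure alone cannot relate suppletive forms and needs to be combined with the co-occurrence statistics the paragraph describes.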

Table 2 Selectional preferences for the direct object slot of the verb start

Semantic class    Preference score    Example direct objects
time period       0.1                 week, day
communication     0.08                speech, yelling
activity          0.2                 construction, war, migration
entity            0.14                car, meal

Syntax, Argument Structure, and Preferences

The syntactic structures that a word can occur in are important for understanding utterances and ensuring grammatical output. There is a good deal of work on the argument structure of verbs since these play a key role relating the phrases within sentences and verbs have the most complex syntactic constructions. The correct arguments of a verb are more easily identified in a sentence when the parser knows the types of construction (subcategorization frames) possible for a verb. For example, information that give can take both a single object and a double object construction allows both the following interpretations:

(1) ‘She gave (the dog) (bones).’ i.e., the dog gets the bones
    ‘She gave (the dog bones).’ i.e., some unidentified entity gets dog bones

Subcategorization information is acquired with rudimentary syntactic analysis (without subcategorization frame information) and statistical filters that reduce the impact of noise in the analysis. Subcategorization acquisition has included basic sets of frames (Brent, 1993), as well as more detailed sets (Briscoe and Carroll, 1997) and languages other than English. As well as argument structure, the semantic classes of arguments are acquired. For example, the direct objects of the verb eat are typically types of food. These are referred to as selectional restrictions, implying hard and fast constraints, or preferences. Preferences are usually acquired as a set of semantic classes, each with a score representing the strength of the preference for that class, as illustrated in Table 2. Many systems have followed the approach of Resnik (1998) and used corpus counts collected over WordNet. However, semantic classes induced from corpus data (see below) can also be used for representation. Given both subcategorization frames and selectional preferences, alternative ways of expressing the verb’s underlying arguments can be identified. For example, the alternation:

(2) ‘The boy broke the window.’ ↔ ‘The window broke.’

can be observed by detecting the subcategorization frames for the verb break and observing the similarity between the preferences for words that occupy the object slot in the first variant with those in the subject slot in the alternate form. These are known as diathesis alternations and are important because they are a link between the syntax and semantics of the verb. Verbs sharing diathesis patterns typically also share semantic characteristics (Levin, 1993). As well as detecting diathesis patterns directly, researchers have demonstrated that syntactic and semantic features can be used to cluster verbs into classes relevant to these alternations. Merlo and Stevenson (2001) have a method where manually selected and linguistically motivated features are used for classifying verbs into three classes relating to a selection of alternations using a supervised decision tree machine learning algorithm. Subsequent work has shown that a multilingual approach can facilitate acquisition using features available in another language to help make distinctions in the target language.

Semantics

The meanings of words are required for some applications. For example, given the query Who employs Mr. Smith? and the sentence Mr. Smith works for Hydro-Systems, lexical knowledge of the relationship between employ and work can be used to ensure an appropriate answer. Words can have more than one meaning, and this multiplicity may also need to be taken into account. For example, a question such as Who is on the board? should not elicit responses concerning planks of wood. One problem with including word meanings in lexicons is that the set of possible meanings is vast. When defining the meanings of a word it is possible to arrive at an extremely fine-grained classification with many different senses that are related, though subtly different in some way. For example, the verb break has 59 senses in WordNet version 2.0. Having a fine-grained sense inventory is not always beneficial for NLP, since distinguishing these meanings in text is not easy even for human annotators. There is typically a large skew in the frequency distribution of senses, so that one or two senses of a word are much more predominant than the others. This skew is particularly marked when dealing with


language in a given domain. As a consequence, much acquisition of other information, such as subcategorization, has assumed a predominant sense for a word and not differentiated information by sense. Rather than use a man-made resource for lexical meaning, many have exploited the fact that semantically similar words tend to occur in similar contexts. These are distributional approaches that attempt to find similar words given input data such as that in Table 1. The data can be clustered to give distinct classes, e.g., (begin, commence), (eat, consume, devour), and (construct, build). Or, instead of explicitly creating classes, a target word can simply be listed with a specified number of ‘nearest neighbor’ words as an acquired thesaurus, e.g., eat: consume devour, etc. These lists of nearest neighbors are produced to avoid losing information during the clustering process. The neighbors are associated with the distributional similarity scores used for ordering them. Nearest neighbors or induced classes can be used by an NLP system to provide evidence when there is not enough data for a related word and to cope with the fact that things can be said in different ways, depending on subtle nuances. As well as finding related words, different senses of the same word can be found from the data by either (i) clustering the nearest neighbors of a word (Pantel and Lin, 2002) according to the contexts that these neighbors occur in or (ii) clustering the contexts of individual tokens of the word (Schütze, 1998). Another approach to determining the meanings of words from data is to use parallel corpora and define senses where a word in one language is realized by more than one word in another language (Resnik and Yarowsky, 2000). This technique can be applied to more than two languages, and to languages that are more distantly related, since the ambiguities are more likely to differ.
The sense inventory produced by this strategy will be relevant for translation between those languages. This approach contrasts with using a predefined sense inventory such as WordNet, where the senses may be too fine-grained and never reflect ambiguities that actually give rise to problems for an application such as machine translation. As an alternative to identifying senses in text, there are also approaches that discover patterns of use within a generative lexicon framework (Pustejovsky, 1995) rather than assuming distinct senses. Pustejovsky and Hanks (2004) propose a semi-automatic method using selectional preference patterns identified by hand. Current work on acquisition of lexical semantics focuses on relationships between linguistic units.
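The nearest-neighbor thesaurus described above can be sketched in a few lines. This is a minimal illustration, not any cited system: the co-occurrence counts in CONTEXTS are invented toy data standing in for corpus counts, the function names are hypothetical, and cosine similarity is used as one simple choice of distributional similarity measure.

```python
from collections import Counter
from math import sqrt

# Toy data (assumption): verb -> counts of the nouns seen as its direct object.
CONTEXTS = {
    "eat": Counter({"bread": 4, "apple": 3, "soup": 1}),
    "consume": Counter({"bread": 2, "apple": 2, "fuel": 1}),
    "devour": Counter({"apple": 1, "bread": 1, "book": 1}),
    "build": Counter({"house": 5, "bridge": 2}),
    "construct": Counter({"house": 3, "bridge": 3, "argument": 1}),
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_neighbors(target, contexts, k=2):
    """Return the k words most similar to target, each with its similarity score."""
    scored = [
        (other, cosine(contexts[target], contexts[other]))
        for other in contexts
        if other != target
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for word, score in nearest_neighbors("eat", CONTEXTS):
        print(f"eat: {word} ({score:.2f})")
```

Keeping the scores alongside the neighbors mirrors the point made in the text: a ranked list preserves information that a hard clustering into classes would discard.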

Model-theoretic semantics, where linguistic units are related to entities in the world, has not received much attention, though recently there has been work on acquisition of lexical semantics from non-linguistic data (Barzilay et al., 2003).

Pragmatics

One way of conveying the user’s intent in an utterance is the choice of words. This choice can reflect stylistic preferences such as formality and attitude, for example, knowing whether to use demand, stipulate, request, or ask in a given utterance. Such information is particularly important for generation, but could also be useful for a system trying to detect sentiment. It is not obvious how to identify the nuances of meaning of near-synonyms from raw data. Work has been done, however, on extracting this sort of information from MRDs (Inkpen and Hirst, 2001). Work on sentiment detection identifies words that are subjective, and those that indicate a positive or negative sentiment, by partitioning good and bad reviews. Though sentiment detection is not focused on contrasting the meanings of semantically related words, e.g., brave versus foolhardy, it might benefit from such information.

Multiwords

Lexemes can have more than one word stem, for example, post office. These are referred to as ‘multiwords’ and are phrases where the meaning is not compositional, that is, the meaning of the phrase is not simply a sum of the meaning of component words. Multiwords are important for interpretation and for production of natural sounding text. There are a variety of multiword types including idioms, specific constructions such as phrasal verbs and collocations, i.e., words which occur together by convention (Sag et al., 2002). Acquisition of multiwords is typically done using a combination of a statistical association measure and a linguistic filter (Smadja, 1993). Another approach is to search for potential candidates with syntactic analysis and statistics, and then look for signs of noncompositionality. This approach can be done by finding a lack of combinations of component words with substitutions from the same semantic class in a corpus. For example, we find red herring in corpus data but not usually yellow herring. Both WordNet and acquired thesauruses can be used to define the semantic classes. Non-compositionality has also been detected using distributional approaches to demonstrate that the semantics of the multiword has very different characteristics to that anticipated, given the component words.
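The combination of a statistical association measure with a linguistic filter can be illustrated with a small sketch. Everything here is an assumption made for illustration: the tagged toy corpus, the coarse tag names, the choice of pointwise mutual information (PMI) as the association measure, and the function name. It is not a reimplementation of Xtract.

```python
from collections import Counter
from math import log2

# Toy pre-tagged corpus (assumption): word/TAG tokens with coarse POS tags.
TEXT = (
    "the/DET red/ADJ herring/NOUN fooled/VERB the/DET post/NOUN office/NOUN ./PUNCT "
    "the/DET post/NOUN office/NOUN sold/VERB a/DET red/ADJ herring/NOUN ./PUNCT "
    "the/DET office/NOUN sold/VERB the/DET red/ADJ car/NOUN ./PUNCT"
)
TAGGED = [tuple(token.rsplit("/", 1)) for token in TEXT.split()]

# Linguistic filter (assumption): only these tag sequences may form candidates.
ALLOWED_PATTERNS = {("ADJ", "NOUN"), ("NOUN", "NOUN")}


def multiword_candidates(tagged, patterns=ALLOWED_PATTERNS, min_count=2):
    """Rank adjacent word pairs that pass the tag filter by PMI."""
    unigrams = Counter(word for word, _ in tagged)
    bigrams = Counter(zip(tagged, tagged[1:]))
    n_uni = len(tagged)
    n_bi = max(len(tagged) - 1, 1)
    ranked = []
    for ((w1, t1), (w2, t2)), count in bigrams.items():
        # Discard rare pairs (PMI overrates them) and pairs failing the filter.
        if count < min_count or (t1, t2) not in patterns:
            continue
        pmi = log2((count / n_bi) / ((unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)))
        ranked.append(((w1, w2), pmi))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for pair, score in multiword_candidates(TAGGED):
        print(" ".join(pair), round(score, 2))
```

The frequency threshold matters because PMI alone tends to favor pairs seen only once; in this toy corpus it is what separates red herring and post office from red car. A test for non-compositionality, such as the substitution check described above, would be a further step applied to the surviving candidates.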


Updating the Lexicon

Most lexical acquisition systems have been designed to extract lexical knowledge from resources in advance and subsequently deploy the lexicon within an NLP system. However, due to data sparseness, the creative way in which language is used, and the very nature of open-class words, there will inevitably be linguistic forms and meanings that have not been acquired beforehand. A method is required for handling new phenomena as they occur. One updating task is the detection of words that are not already in the lexicon. This process must filter misspellings and non-real words, for example codes, from lists of candidates. Genuine words can be identified by examining the letter combinations and ensuring that these would produce regular sounding words for the given language. This process is performed using frequency data collected over character sequences from a given corpus (Park, 2002). When obtaining estimates for unseen word forms, for example, morphological forms, it might seem appropriate to take an average over forms observed in a training corpus. However, Baayen and Sproat (1996) demonstrate that care should be taken, since frequent forms behave differently from infrequent forms. They recommend using estimates from forms that have occurred only once, rather than an average over all forms in the training corpus. As well as finding new words, there are also methods aimed at learning lexical information on the fly. Part-of-speech information is obtained using morphological information. Morphological information can also be used to identify the basic semantic category of a word, alongside other information, such as collocations. A chief motivation for thesaurus acquisition and clustering is that these will help with prediction of unseen behavior for members within a given class. There are also systems that place new word forms into a predefined inventory, such as WordNet.
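The idea of judging candidate words by their letter combinations can be sketched with a character bigram model. This is a simplified stand-in, not the entropy-based method of Park (2002): the training list, the add-one smoothing, and the function names are all assumptions made for the example.

```python
from collections import Counter
from math import log

# Toy training vocabulary (assumption); a real system would use corpus word forms.
TRAIN_WORDS = [
    "break", "breaking", "window", "broken", "speak", "speaking",
    "making", "winter", "token", "bring", "king", "wind",
]


def train_char_bigrams(words):
    """Count character bigrams, with ^ and $ marking word start and end."""
    pair_counts = Counter()
    first_counts = Counter()
    alphabet = set()
    for word in words:
        padded = "^" + word.lower() + "$"
        alphabet.update(padded)
        for a, b in zip(padded, padded[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return pair_counts, first_counts, len(alphabet)


def wordlikeness(word, model):
    """Average log-probability per character transition; higher is more word-like."""
    pair_counts, first_counts, alphabet_size = model
    padded = "^" + word.lower() + "$"
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        # Add-one smoothing; the extra 1 reserves mass for unseen characters.
        prob = (pair_counts[(a, b)] + 1) / (first_counts[a] + alphabet_size + 1)
        total += log(prob)
    return total / (len(padded) - 1)


if __name__ == "__main__":
    model = train_char_bigrams(TRAIN_WORDS)
    for candidate in ["braking", "xq7zk"]:
        print(candidate, round(wordlikeness(candidate, model), 2))
```

A threshold on this score, tuned on held-out data, could then separate plausible new words from codes and other non-words; where to set it is left open here.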

Evaluation

The output of lexical acquisition systems must be evaluated. There are two main ways in which this evaluation can be done. Firstly, the performance of the lexicon on a task can be measured. A task-based evaluation is appropriate when there is a specific task that the lexicon is being designed for; however, a ready-made ‘plug and play’ application is not always available. Acquired lexical information has often been evaluated on sub-tasks; for instance, selectional preferences have been evaluated on their performance at determining whether a given prepositional phrase should be attached to the preceding verb or noun phrase. For example, given the sentence The boy hit the man with a stick, the preference of hit with a stick is contrasted with the preference for man with a stick.

The second method is evaluation of the quality and coverage of the lexical entries. Evaluating the entries is often done by consulting a man-made resource referred to as a ‘gold standard.’ The obvious problem with using such a resource, or even a combination of several resources, is that there will be rare forms in the gold standard that are simply not attested in the corpus data used for training. Likewise, the very need for lexical acquisition means that there will be legitimate information that is omitted from a gold standard. For this reason, evaluation is also performed by examining a manually annotated sample of the text input to the acquisition process, to verify that what is acquired does indeed reflect what is attested in the corpus, and to check the adequacy of any acquired frequency estimates. Human experts can also perform a manual verification of the acquired information. This approach is used to identify the errors in the output rather than to prove that all information available in the input was successfully extracted. There are further problems when there is no widely agreed upon classification, as is the case when acquiring semantic classes, or no precise definition of the phenomena being acquired, for example, in the case of multiword acquisition. These problems show up in a lack of agreement between human judges. For example, if asked to determine the multiwords in:

(3) ‘They ate the chocolate cake in the wendy house.’

human judges might unanimously decide that wendy house is a multiword, but will potentially have more trouble with chocolate cake since the interpretation is more standard.
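Comparison against a gold standard, as described above, is commonly quantified with precision and recall over lexical entries. The text does not name these measures, so their use here is an assumption, as are the frame labels, the toy lexicons, and the function name in this sketch.

```python
def evaluate_lexicon(acquired, gold):
    """Compare acquired entries with a gold standard.

    Both arguments map a word to the set of frames (or other labels) listed
    for it. Returns (precision, recall, f1) over all word-frame pairs.
    """
    words = set(acquired) | set(gold)
    correct = sum(len(acquired.get(w, set()) & gold.get(w, set())) for w in words)
    n_acquired = sum(len(frames) for frames in acquired.values())
    n_gold = sum(len(frames) for frames in gold.values())
    precision = correct / n_acquired if n_acquired else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical frame labels: NP = transitive, INTRANS = intransitive, PP = with PP.
    acquired = {"break": {"NP", "INTRANS"}, "eat": {"NP", "PP"}}
    gold = {"break": {"NP", "INTRANS"}, "eat": {"NP", "INTRANS"}}
    print(evaluate_lexicon(acquired, gold))
```

The caveats in the text apply directly to these figures: a frame missing from the gold standard lowers precision even if it is legitimate, and a gold-standard frame unattested in the training corpus lowers recall even though no system could have acquired it.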

Future Directions

Currently, acquisition focuses on coverage and accuracy of extracted information. There are issues of organization and representation that received attention when lexicons were manually constructed. Expressive representations with inheritance of features capture generalizations and avoid unnecessary repetition while handling exceptions. Acquisition from corpora currently focuses on acquiring one type of information as a flat list, perhaps ordered by frequency, alphabetically or arbitrarily. While issues of representation have been seen as important in the semi-automatic construction of some lexicons from machine readable resources (Copestake, 1992), there


has been little done on acquiring structured information from corpora. One reason that lexical acquisition from corpora has not used more complex representations is the widespread use of statistics and the absence of a formal model for storing probability estimates in an inheritance framework. If such a framework were developed, acquisition of information into more complex representations might help in identifying cross-lingual generalizations and establishing correlations between different types of linguistic information, for example, linking a semantic class with syntactic behavior. With the increasing variety of information required, lexical acquisition continues to play a crucial role within NLP. Its importance is emphasized by many large-scale projects focusing on specific aspects of lexical acquisition, e.g., the Multiword Expression Project and MEANING. The process of acquisition has been greatly enhanced by vast quantities of data available in corpora, along with statistical techniques to derive the information from the data and the increasing computational power available. In recent years, new resources have proved useful, e.g., comparable corpora in different languages and the World Wide Web. In the future, further resources could be exploited. Speech or image processing might provide the prosody and visual cues that humans use when acquiring vocabulary. Dialogue systems might provide feedback when acquisition has gone wrong.

See also: Antonymy and Incompatibility; Computational Lexicons and Dictionaries; Corpora; Hyponymy and Hyperonymy; Lexeme-Morpheme-Based Morphology; Lexical Semantics: Overview; Lexicography: Overview; Lexicon: Structure; Machine Translation: Overview; Meronymy; Natural Language Processing: Overview; Syntactic Features and Feature Structures; WordNet(s).

Bibliography Baayen R H, Piepenbrock R & Gulikers L (1995). The CELEX Lexical Database (Release 2) [CD-ROM]. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania. Baayen H & Sproat R (1996). ‘Estimating lexical priors for low-frequency morphologically ambiguous forms.’ Computational Linguistics 22(2), 155–166. Barzilay R, Reiter E & Siskind J (eds.) (2003). Proceedings of the HLT-NAACL03 workshop on learning word meaning from non-linguistic data. Canada: Edmonton. Brent M (1993). ‘From grammar to lexicon: unsupervised learning of lexical syntax.’ Computational Linguistics 19(2), 243–262. Briscoe E & Carroll J (1997). ‘Automatic extraction of subcategorization from corpora.’ In Proceedings of the

Fifth Applied Natural Language Processing Conference. 356–363. Copestake A (1992). ‘The ACQUILEX LKB: Representation Issues in the Semi-automatic Acquisition of Large Lexicons. (ACQUILEX WP NO. 36).’ In Proceedings of the Third Conference on Applied Natural Language Processing. Italy: Trento. Daelemans W & Durieux G (2000). ‘Inductive Lexica in Lexicon Development for Speech and Language Processing.’ In van Eynde F & Gibbon D (eds.) Lexicon development for speech and language processing. Dordrecht: Kluwer Academic Publishers. 115–139. Daelemans W, Gillis S & Durieux G (1994). ‘The acquisition of stress, a data-oriented approach.’ Computational Linguistics 20(3), 421–451. Fellbaum C (ed.) (1998). WordNet an electronic lexical database. Cambridge, Massachusetts: MIT Press. Finch S & Chater N (1994). ‘Learning syntactic categories: a statistical approach.’ In Oaksford M & Brown G D A (eds.) Neurodynamics and psychology. San Diego, California: Academic Press. Goldsmith J (2001). ‘Unsupervised learning of the morphology of a natural language.’ Computational Linguistics 27(2), 153–198. Inkpen D & Hirst G (2001). ‘Building a lexical knowledgebase of near-synonym differences.’ In Proceedings of the Workshop on WordNet and other Lexical Resources, Second Meeting of the North American Chapter of the Association for Computational Linguistics. Pittsburgh, PA. June. Landes S, Leacock C & Tengi R (1998). ‘Building semantic concordances.’ In Fellbaum C (ed.) WordNet: An electronic lexical database. 199–216. Leech G (1992). ‘100 million words of English: the British National Corpus.’ Language Research 28(1), 1–13. Levin B (1993). English verb classes and alternations: a preliminary investigation. Chicago/London: University of Chicago Press. Lin D (1998). ‘An information-theoretic definition of similarity.’ In Proceedings of International Conference on Machine Learning. Madison, Wisconsin, July. Marcus M (1995). 
‘The Penn Treebank: a revised corpus design for extracting predicate-argument structure.’ In Proceedings of the ARPA Human Language Technology Workshop. Princeton, NJ. Merlo P & Stevenson S (2001). ‘Automatic verb classification based on statistical distributions of argument structure.’ Computational Linguistics 27(3), 373–408. NiCT (2002). ‘EDR Electron Dictionary version 2.0, technical guide.’ National Institute of Information and Communications Technology, Tokyo, Japan. Pantel P & Lin D (2002). ‘Discovering Word Senses from Text.’ In Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Canada: Edmonton. 613–619. Park Y (2002). ‘Identification of probable real words: an entropy-based approach.’ In Proceedings of the Workshop on Unsupervised Lexical Acquisition, 40th meeting of the Association for Computational Linguistics. Philadelphia, PA.

Procter P (1978). Longman dictionary of contemporary English. Longman Group Ltd. Harlow: UK. Pustejovsky J (1995). The Generative Lexicon. Cambridge, MA: MIT Press. Pustejovsky J, Hanks P & Rumshisky A (2004). ‘Automated induction of sense in context.’ In Proceedings of the 20th International Conference of Computational Linguistics, COLING-2004 2, 924–930. Resnik P (1998). ‘WordNet and class-based probabilities.’ In Fellbaum C (ed.) WordNet: An electronic lexical database. 239–264. Resnik P & Yarowsky D (2000). ‘Distinguishing systems and distinguishing senses: new evaluation methods for word sense disambiguation.’ Natural Language Engineering 5(2), 113–133. Sag I, Baldwin T, Bond F et al. (2002). ‘Multiword expressions: a pain in the neck for NLP.’ In Proceedings of the Third International Conference on Intelligent Text

Processing and Computational Linguistics (CICLING 2002). Mexico City: Mexico. 1–15. Schütze H (1998). ‘Automatic word sense discrimination.’ Computational Linguistics 24(1), 97–123. Smadja F (1993). ‘Retrieving collocations from text: Xtract.’ Computational Linguistics (Special Issue on Using Large Corpora) 19(1), 143–177. Soanes C & Stevenson A (eds.) (2003). The Oxford Dictionary of English. Oxford University Press.

Relevant Websites http://mwe.stanford.edu – The Multiword Expression Project. http://www.lsi.upc.es/nrigau/meaning/meaning.html – The MEANING project.

Lexical Conceptual Structure
J S Jun, Hangkuk University of Foreign Studies, Seoul, Korea
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The lexical conceptual structure (LCS) or simply the conceptual structure (CS) is an autonomous level of grammar in conceptual semantics (Jackendoff, 1983, 1990, 1997, 2002), in which the semantic interpretation of a linguistic expression is explicitly represented. Jackendoff’s (1983) original conception is to posit a level of mental representation in which thought is couched (cf. the language of thought in Fodor, 1975). CS is a relay station between language and peripheral systems such as vision, hearing, smell, taste, kinesthesia, etc. Without this level, we would have difficulty in describing what we see and hear.

There are two ways to view CS in formalizing a linguistic theory. One is to view CS as a nonlinguistic system that serves as an interface between meaning and nonlinguistic modalities. Then, we need another level of representation for meaning (cf. Chomsky’s [1981, 1995] LF); and CS is related to the linguistic meaning by pragmatics as shown in Figure 1. This is the view of Katz and Fodor (1963), Jackendoff (1972), Katz (1980), and Bierwisch and Schreuder (1992). The alternative conception is to view CS as the semantic structure. The linguistic meaning as well as nonlinguistic information compatible with sensory and motor inputs is directly represented in CS. CS is related to other linguistic levels such as syntax and phonology by correspondence rules, and therefore CS is part of the lexical information (hence called LCS) as shown in Figure 2. This is the current view of conceptual semantics.

One argument that supports the latter view comes from generic judgment sentences. In the standard view of linguistic meaning, judgments of superordination, subordination, synonymy, entailment, etc., are linguistic. We judge that ‘bird’ and ‘chicken’ make a superordinate-subordinate pair; that in some dialects ‘cellar’ and ‘basement’ are synonymous; and that ‘Max is a chicken’ entails ‘Max is a bird.’ Linguistic judgments of this sort are formalized in theories such as meaning postulates (Fodor, 1975) and semantic networks (Collins and Quillian, 1969). Jackendoff (1983) points out one problem in formalizing these judgments from a purely linguistic perspective: judgments of superordination and subordination, for instance, are directly related to judgments of generic categorization sentences such as ‘A chicken is a bird.’ The judgment about generic categorization is, however, not entirely linguistic or semantic, in that it behaves creatively enough to include ambiguous cases such as (1) below.

(1a) A piano is a percussion instrument.
(1b) An australopithecine was a human.
(1c) Washoe (the chimp)’s sign system is a language.
(1d) An abortion is a murder.
(Jackendoff, 1983: 102)

We make generic categorization judgments about (1) not on the basis of meaning postulates or semantic networks but on the basis of our factual, often



Figure 1 CS as a nonlinguistic system (adapted from Jackendoff R (1983). Semantics and cognition. Cambridge, MA: MIT Press, 20, with permission).

Figure 2 CS as part of the linguistic system (adapted from Jackendoff R (1983). Semantics and cognition. Cambridge, MA: MIT Press, 21, with permission).

political, world knowledge. For instance, our judgment about (1d) is influenced by our political position, religion, and knowledge about biology. This is analogous to Labov’s (1973) dubious ‘cup-bowl’ judgment, which obviously resorts to nonlinguistic encyclopedic knowledge as well as the linguistic type system. CS is, by definition, the level that represents encyclopedic knowledge as part of our thought. Hence, we should refer to CS to make generic categorization judgments about (1). Jackendoff’s (1983) puzzle can be summarized as follows. We make judgments of semantic properties such as superordination and subordination at the level of semantic structure. We make generic categorization judgments at the level of CS, as shown by (1). If the semantic structure were separated from CS, we would fail to capture the obvious generalization between the superordinate-subordinate judgment and the generic categorization judgment. If, by contrast, CS were the semantic structure, we would have no trouble in accounting for the intuitive identity between the two judgments. Therefore, CS is the semantic structure. For more arguments to support the view that CS is the semantic structure, see Jackendoff (1983: Ch. 6).

Overview of Conceptual Semantics

Autonomy of Semantics

A central assumption in conceptual semantics is the autonomy of semantics. In Chomsky’s view of language, syntax makes an autonomous level of grammar, whereas phonology and semantics merely serve as interpretive components (PF and LF). Jackendoff (1997) criticizes this view as syntactocentric, and


provides convincing arguments to support his thesis that phonology and semantics as well as syntax make autonomous levels of grammar. We find numerous pieces of evidence for the autonomy of semantics in the literature of both psycholinguistics and theoretical linguistics. Zurif and Blumstein’s (1978) pioneering work shows that Wernicke’s area is the center of semantic knowledge in the brain, in contrast with Zurif, Caramazza and Myerson’s (1972) previous finding that Broca’s area is the center of syntactic knowledge. Swinney’s (1979) classical work on lexical semantic priming shows that lexical semantics is independent of grammatical contexts like the movement chain in a sentence. Piñango, Zurif, and Jackendoff (1999) report more workload for the online processing of aspectual coercion sentences (e.g., John jumped for two hours) than for the processing of syntactically equivalent noncoerced sentences (e.g., John jumped from the stage). Semantic categories are not in one-to-one correspondence with syntactic categories. For instance, all physical object concepts correspond to nouns, but not all nouns express physical object concepts; e.g., earthquake and concert express event concepts. All verbs express event/state concepts, but not all event/state concepts are expressed by verbs; e.g., earthquake and concert are nouns. Contrary to Chomsky’s (1981) theta criterion, we have plenty of data that show a mismatch between syntactic functions and thematic roles. For instance, the semantic interpretation of buy necessarily encodes both the transfer of money from the buyer to the seller and the transfer of the purchased entity from the seller to the buyer. Among the three semantic arguments, i.e., the buyer, the seller, and the purchased object, only the buyer and the purchased entity are syntactic arguments (e.g., John bought the book). The seller is syntactically expressed as an adjunct (e.g., John bought the book from Jill).
Moreover, the buyer plays the source role of money and the target role of the purchased entity simultaneously; the seller plays the source role of the purchased entity and the target role of money simultaneously. In short, the buyer and the seller have multiple theta roles even though each of them corresponds to one and only one syntactic entity. A simple semantic distinction often corresponds to many syntactic devices. For instance, telicity is expressed by such various syntactic devices as choice of verb (2a), choice of preposition (2b), choice of adverbial (2c), choice of determiner in the subject NP (2d) and in the object NP (2e), and choice of prepositional object (2f) (Jackendoff, 1997: 35).

(2a) John destroyed the cart (in/*for an hour). → Telic
     John pushed the cart (for/*in an hour). → Atelic
(2b) John ran to the station (in/*for an hour). → Telic
     John ran toward the station (for/*in an hour). → Atelic
(2c) The light flashed once (in/*for an hour). → Telic
     The light flashed constantly (for/*in an hour). → Atelic
(2d) Four people died (in/*for two days). → Telic
     People died (for/*in two days). → Atelic
(2e) John ate lots of peanuts (in/*for an hour). → Telic
     John ate peanuts (for/*in an hour). → Atelic
(2f) John crashed into three walls (in/*for an hour). → Telic
     John crashed into walls (for/*in an hour). → Atelic

To sum up, the mapping between syntax and semantics is not one-to-one; rather, it is one-to-many, many-to-one, or at best many-to-many. The mapping problem is not easy to explain in the syntactocentric architecture of language. The overall difficulty in treating semantics merely as an interpretive component of grammar, along with a similar difficulty in treating phonology as an interpretive component (cf. Jackendoff, 1997: Ch. 2), leads Jackendoff to propose a tripartite architecture of language, in which phonology, syntax, and semantics are all independent levels of grammar licensed by phonological formation rules, syntactic formation rules, and conceptual/semantic formation rules respectively, and interfaced by correspondence rules between each pair of modules, as shown in Figure 3.

Lexical Conceptual Structure

Conceptual semantics assumes striking similarities between the organization of CS and the structural organization of syntax. As syntax makes use of syntactic categories, namely syntactic parts of speech like nouns, adjectives, prepositions, verbs, etc., semantics makes use of semantic categories or semantic parts of speech such as Thing, Property, Place, Path, Event, State, etc. As syntactic categories are motivated by each category member’s behavioral properties in syntax, semantic or ontological categories are motivated by each category member’s behavioral properties in meaning. Syntactic categories are combined by syntactic phrase-structure rules into larger syntactic expressions; likewise, semantic categories are combined by semantic phrase-structure rules into larger semantic expressions. The syntactic representation is structurally organized, so we can define dominance or government relations among syntactic constituents; likewise, the semantic representation is structurally organized, so we can define grammatically significant hierarchical relations among semantic constituents. Various syntactic phrase-structure rules can be generalized into a rule schema called X-bar syntax (Jackendoff, 1977); likewise, various semantic


Figure 3 The tripartite parallel architecture (reproduced from Jackendoff R (2002). Foundations of language: brain, meaning, grammar, evolution. Oxford: Oxford University Press).

phrase-structure rules can be generalized into a rule schema called X-bar semantics (Jackendoff, 1987b).

Ontological Categories

Ontological categories are first motivated by our cognitive layouts. To mention some from the vast psychology literature, Piaget’s developmental theory of object permanence shows that infants must recognize objects as a whole, and develop a sense of permanent existence of the objects in question when they are not visible to the infants. Researchers in language acquisition have identified many innate constraints on language learning, such as the reference principle, object bias, the whole object principle, shape bias, and so on (cf. Berko Gleason, 1997). For instance, by the reference principle, children rely on the assumption that words refer to objects, actions, and attributes in the environment. Wertheimer’s (1912) classical experiment on apparent movement reveals that humans are equipped with an innate tendency to perceive the change of location as movement from one position to the other; the apparent movement experiment lends support to the expansion of the event category into function-argument structures like [Event GO ([Thing ], [Path ])]. Ontological categories also have numerous linguistic motivations. Pragmatic anaphora (exophora) provides one such motivation. In order to understand the sentence in (3), the hearer might have to pick out the referent of that among several entities in the visual field. If the hearer did not have object concepts to organize the visible entities, (s)he could not pick out the proper referent of the pragmatic anaphora that.
(3) I bought that last night.

The object concept involved in the semantic interpretation of (3) motivates the ontological category Thing.

The category Thing proves useful in interpreting many other grammatical structures. It provides the basis of interpreting the Wh-variable in (4a); it supports the notion of identity in the same construction in (4b); and it supports the notion of quantification as shown in (4c).
(4a) What did you buy last night?
(4b) John bought the same thing as Jill.
(4c) John bought something/everything that Jack bought.

Likewise, we find different sorts of pragmatic anaphora that motivate ontological categories like Place (5a), Direction (5b), Action (5c), Event (5d), Manner (5e), and Amount (5f).
(5a) Your book was here/there.
(5b) They went there yesterday.
(5c) Can he do this/that?
(5d) It happened this morning.
(5e) Bill shuffled a deck of cards this way.
(5f) The man I met yesterday was this tall.

These ontological categories provide innate bases for interpreting Wh-variables, the identity construction, and the quantification, as shown in (6)–(8).
(6a) Where was my book?
(6b) Where did they go yesterday?
(6c) What can he do?
(6d) What happened this morning?
(6e) How did Bill shuffle a deck of cards?
(6f) How tall was the man you met yesterday?
(7a) John put the book on the same place as Bill.
(7b) John went the same way as Bill.
(7c) John did the same thing as Bill.
(7d) The same thing happened yesterday as happened this morning.
(7e) John shuffled a deck of cards the same way as Bill.
(7f) John is as tall as the man I met yesterday.
(8a) John put the book at some place that Bill put it.
(8b) John went somewhere that Bill went.
(8c) John did something Bill did.
(8d) Something that happened this morning will happen again.
(8e) John will shuffle cards in some way that Bill did.
(8f) (no parallel for amounts)

For more about justifying ontological categories, see Jackendoff (1983: Ch. 3).

Conceptual Formation Rules

Basic ontological categories are expanded into more complex expressions using function-argument structural descriptions. (9) shows such expansions of some ontological categories.
(9a) EVENT → [Event GO (THING, PATH)]
(9b) EVENT → [Event STAY (THING, PLACE)]
(9c) EVENT → [Event CAUSE (THING or EVENT, EVENT)]
(9d) EVENT → [Event INCH (STATE)]
(9e) STATE → [State BE (THING, PLACE)]
(9f) PLACE → [Place PLACE-FUNCTION (THING)]
(9g) PATH → [Path PATH-FUNCTION (THING)]

The function-argument expansion is exactly parallel with rewriting rules in syntax (e.g., S → NP VP; NP → Det (AP)* N; VP → V NP PP), and hence can be regarded as semantic phrase-structure rules. The semantic phrase-structure rules in (9) allow recursion, just as syntactic phrase-structure rules do: an Event category can be embedded in another Event category, as shown in (9c). We can also define hierarchical relations among conceptual categories in terms of the depth of embedding, as we define syntactic dominance or government in terms of the depth of embedding in syntactic structures. The depth of embedding in CS plays a significant role in explaining various grammatical phenomena such as subject selection, case, binding, control, etc. See Culicover and Jackendoff (2005) for more about these issues. Place functions in (9f) may include IN, ON, TOP-OF, BOTTOM-OF, etc. Path functions in (9g) may include TO, FROM, TOWARD, VIA, etc.
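As a rough illustration, the rules in (9) can be read as a small rewriting grammar and checked mechanically. The Python sketch below is not part of conceptual semantics itself: the tuple encoding of CS, the `RULES` table, and the helper names `category_of` and `well_formed` are assumptions made only for this example. The table follows (9) literally, so a Path function takes only a Thing here.

```python
# A conceptual constituent is either a bare Thing (a string such as "JOHN")
# or a tuple (function, arg1, arg2, ...). This encoding is an assumption
# made for illustration only.

PLACE_FUNCTIONS = {"IN", "ON", "TOP-OF", "BOTTOM-OF"}
PATH_FUNCTIONS = {"TO", "FROM", "TOWARD", "VIA"}

# function -> (resulting category, allowed categories for each argument)
RULES = {
    "GO": ("Event", [{"Thing"}, {"Path"}]),               # (9a)
    "STAY": ("Event", [{"Thing"}, {"Place"}]),            # (9b)
    "CAUSE": ("Event", [{"Thing", "Event"}, {"Event"}]),  # (9c)
    "INCH": ("Event", [{"State"}]),                       # (9d)
    "BE": ("State", [{"Thing"}, {"Place"}]),              # (9e)
}
RULES.update({f: ("Place", [{"Thing"}]) for f in PLACE_FUNCTIONS})  # (9f)
RULES.update({f: ("Path", [{"Thing"}]) for f in PATH_FUNCTIONS})    # (9g)


def category_of(cs):
    """Return the ontological category of a constituent."""
    if isinstance(cs, str):
        return "Thing"
    return RULES[cs[0]][0]


def well_formed(cs):
    """Check a constituent against the rules in (9), recursively."""
    if isinstance(cs, str):
        return True
    function, *args = cs
    if function not in RULES:
        return False
    _, expected = RULES[function]
    if len(args) != len(expected):
        return False
    return all(
        well_formed(arg) and category_of(arg) in allowed
        for arg, allowed in zip(args, expected)
    )


# An Event embedded in another Event, as licensed by (9c).
print(well_formed(("CAUSE", "JOHN", ("GO", "BALL", ("TO", "BILL")))))  # True
# GO requires a Path as its second argument, so a bare Thing is rejected.
print(well_formed(("GO", "BALL", "BILL")))  # False
```

The recursion in `well_formed` mirrors the recursion in the rules: checking a CAUSE event requires checking the event embedded inside it.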

Conceptual semantics is a parsimonious theory, in that it makes use of only a handful of functions as conceptual primitives. All functions should be motivated on strict empirical grounds. This is exactly parallel with using only a handful of syntactic categories motivated on strict empirical grounds. Syntactic phrase-structure rules do not refer to an unlimited number of syntactic categories. Syntactic categories such as noun, adjective, preposition, verb, etc. are syntactic primitives, and they are motivated by each category member’s behavioral properties in syntax. Likewise, semantic phrase-structure rules refer to a restricted set of semantic or conceptual primitives that are empirically motivated by general properties of meaning. Functions such as GO, BE, and STAY are empirically motivated in various semantic fields. They are the bases for interpreting spatial sentences in (10).
(10a) GO: The train traveled from Boston to Chicago.
(10b) BE: The statue stands on Cambridge Common.
(10c) STAY: John remained in China.

These functions also support the interpretation of possession sentences in (11).
(11a) GO: John gave the book to Bill.
(11b) BE: John had no money.
(11c) STAY: The library kept several volumes of the Korean medieval literature.

Interpreting ascription sentences also requires GO, BE, and STAY, as shown in (12).
(12a) GO: The light turned from yellow to red.
(12b) BE: The stew seemed distasteful.
(12c) STAY: The aluminum stayed hard.

One interesting consequence of having GO, BE, and STAY in both spatial and nonspatial semantic fields is that we can explain how we use the same verb for different semantic fields.
(13a) The professor turned into a driveway. (Spatial)
(13b) The professor turned into a pumpkin. (Ascription)
(14a) The bus goes to Paris. (Spatial)
(14b) The inheritance went to Bill. (Possession)
(15a) John is in China. (Spatial)
(15b) John is a doctor. (Ascription)
(16a) John kept the CD in his pocket. (Spatial)
(16b) John kept the CD. (Possession)
(17a) The professor remained in the driveway. (Spatial)
(17b) The professor remained a pumpkin. (Ascription)
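The pattern in (13)–(17) can be pictured as a lexicon that stores one field-neutral function per verb, with the semantic field supplied only when a sentence is interpreted. The following Python sketch is a toy rendering of that idea; the `LEXICON` table, the `Reading` class, and the `reading` helper are hypothetical names introduced here, and the verb-to-function assignments simply follow the examples in the text.

```python
from dataclasses import dataclass

# One entry per verb: only the field-neutral function is stored.
# The assignments follow examples (13)-(17); they are not a full analysis.
LEXICON = {
    "turn": "GO",
    "go": "GO",
    "be": "BE",
    "keep": "STAY",
    "remain": "STAY",
}

FIELDS = ("Spatial", "Possession", "Ascription")


@dataclass(frozen=True)
class Reading:
    function: str  # GO, BE, or STAY
    field: str     # semantic field in which the function is interpreted

    def __str__(self):
        return f"{self.function}[{self.field}]"


def reading(verb, field):
    """Combine the single lexical entry of a verb with a semantic field."""
    if field not in FIELDS:
        raise ValueError(f"unknown semantic field: {field}")
    return Reading(LEXICON[verb], field)


print(reading("turn", "Spatial"))     # GO[Spatial], as in (13a)
print(reading("turn", "Ascription"))  # GO[Ascription], as in (13b)
```

Both readings of turn come from the same lexical entry; only the field differs, so no second entry is needed.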


In (13), the verb turn is used in both spatial and ascription sentences with the GO meaning. How do we use the same verb for two different semantic fields? Do we have to assume two different lexical entries for turn? Conceptual semantics explains this puzzle at no extra cost. We do not need two different lexical entries for turn to explain the spatial and ascription meanings. We just posit the event function GO in the lexical semantic description, or LCS, of turn in (13). Both the spatial and the ascription meanings follow from the LCS of turn, since the function GO is in principle motivated by both spatial and ascription sentences. We can provide similar accounts for all the data in (14)–(17). For a general overview of conceptual semantics, see Jackendoff (1983, 1987a, 1990, 2002).

X-bar Semantics

Generative linguists in the 1950s and 1960s succeeded in showing the systematic nature of language with a handful of syntactic phrase-structure rules. But they were not sure how the phrase-structure rules got into language learners’ minds within a relatively short period of time; it was a learnability problem. X-bar syntax (Chomsky, 1970; Jackendoff, 1977) opened a door to solving the puzzle. Children do not have to be born with dozens of syntactic categories; children are born with one syntactic category, namely, category X. Children do not have to learn dozens of totally unrelated syntactic phrase-structure rules separately; all seemingly different syntactic phrase-structure rules share a fundamental pattern, namely, X-bar syntax. Jackendoff (1987b, 1990), who was a central figure in developing X-bar syntax in the 1970s, has completed his X-bar theory by proposing X-bar semantics. We have so far observed that CS is exactly parallel with the syntactic structure. Conceptual categories are structurally organized into CS by virtue of semantic phrase-structure rules, as syntactic categories are structurally organized into syntactic structure by virtue of syntactic phrase-structure rules. (18) is the basic formation of X-bar syntax.
(18a) XP → Spec X’
(18b) X’ → X Comp
(18c) X → [±N, ±V]

Since CS parallels syntactic structure in all these respects, all semantic phrase-structure rules can be generalized into X-bar semantics along the same lines as X-bar syntax, as shown in (19).

(19)
$$[\text{Entity}] \rightarrow \begin{bmatrix} \text{Event/Thing/Place/}\ldots \\ \text{Token/Type} \\ F(\langle \text{Entity}_1, \langle \text{Entity}_2, \langle \text{Entity}_3 \rangle\rangle\rangle) \end{bmatrix}$$

(19) not only provides the function-argument structural generalization for all the semantic phrase-structure rules but also shows how major syntactic constituents correspond to major conceptual categories. That is, the linking between syntax and semantics can be formalized as (20) and (21).

(20) XP corresponds to [Entity]

(21)
$$\begin{bmatrix} X^0 \\ \langle \text{YP} \, \langle \text{ZP} \rangle\rangle \end{bmatrix} \ \text{corresponds to} \ \begin{bmatrix} \text{Entity} \\ F(E_1, \langle E_2, \langle E_3 \rangle\rangle) \end{bmatrix}$$

where YP corresponds to E2, ZP corresponds to E3, and the subject (if there is one) corresponds to E1. To sum up, the obvious similarity between (18) and (19) enables us to account for the tedious linking problem without any extra cost.

General Constraints on Semantic Theories

Jackendoff (1983) suggests six general requirements that any semantic theory should fulfill: expressiveness, compositionality, universality, semantic properties, the grammatical constraint, and the cognitive constraint.

First, a semantic theory must be observationally adequate; it must be expressive enough to describe most, if not all, semantic distinctions in a natural language. Conceptual semantics has expressive power, in that most semantic distinctions in a natural language can be represented by CS with a handful of conceptual categories plus conceptual formation rules. Better still, the expressive power has improved since the original conception of the theory. For instance, Jackendoff (1990: Ch. 7) introduced the action tier into the theory to represent the actor/patient relation aside from motion and location. In (22a), John is the source of the ball and the actor of the throwing event simultaneously; the ball is a moving object, the theme, and an affected entity, the patient, simultaneously. It is quite common for one syntactic entity to bear double theta roles, contra Chomsky’s (1981) theta criterion; conceptual semantics captures this by representing the motion/location event in the thematic tier (22b), and the actor/patient relation in the action tier (22c).
(22a) John     threw   the ball.
      Source           Goal
      Actor            Patient
(22b) [Event CAUSE ([JOHN], [Event GO([BALL], [Path TO([ . . . ])])])]
(22c) [AFF([JOHN], [BALL])]

The action tier not only explains the fine semantic distinction in language but also plays a central role in such grammatical phenomena as linking and case. Besides the action tier, Jackendoff (1991) introduced


an elaborate feature system into CS to account for the semantics of parts and boundaries; Csuri (1996) introduced the referential tier into CS that describes the definiteness of expressions; Jackendoff (2002) introduced the lambda extraction and the topic/focus tier into CS. All these and many other innovations make the theory expressive enough to account for a significant portion of natural language semantics.

The second constraint on a semantic theory is compositionality: an adequate semantic theory must show how the meanings of parts are composed into the meaning of a larger expression. Conceptual semantics is compositional, in that it shows how combinatorial rules of grammar compose the meanings of ontological categories into the CS of a larger expression.

The third requirement is universality: an adequate semantic theory must provide cross-linguistically relevant semantic descriptions. Conceptual semantics is not a theory of meaning for any particular language. It is a universal theory of meaning; numerous cross-linguistic studies have been conducted with the conceptual semantic formalism. See Jun (2003), for instance, for a review of many conceptual semantic studies on argument linking and case in languages such as Korean, Japanese, Hindi, Urdu, English, Old English, French, etc.

The fourth requirement is semantic properties: an adequate semantic theory should be able to explain many semantic properties of language like synonymy, anomaly, presupposition, and so on. That is, any semantic theory must explicate the valid inference of expressions. CS provides a direct solution to this problem in many ways. The type/token distinction is directly expressed in CS, and explains most semantic distinctions made by the semantic type system. By decomposing verbs such as kill into [CAUSE ([THING], [NOT-ALIVE ([THING])])], conceptual semantics explains how John killed Bill entails Bill is dead. For more about semantic properties, see Jackendoff (1983, 1990, 2002).
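The entailment from John killed Bill to Bill is dead can be made concrete with a small sketch. The Python code below assumes a tuple encoding of the decomposition and one inference rule, namely that CAUSE(x, E) licenses E; that rule, the `kill` constructor, and the `entailments` helper are assumptions introduced for this illustration rather than a formulation taken from the works cited.

```python
def kill(agent, patient):
    """Decomposition of 'kill' as [CAUSE ([THING], [NOT-ALIVE ([THING])])]."""
    return ("CAUSE", agent, ("NOT-ALIVE", patient))


def entailments(cs):
    """Collect the situations that follow from cs.

    Assumed inference rule (for illustration only): CAUSE(x, E) entails E.
    The rule is applied repeatedly, so nested causation is unpacked too.
    """
    result = []
    while isinstance(cs, tuple) and cs[0] == "CAUSE":
        cs = cs[2]          # the caused situation E
        result.append(cs)
    return result


john_killed_bill = kill("JOHN", "BILL")
bill_is_dead = ("NOT-ALIVE", "BILL")

print(bill_is_dead in entailments(john_killed_bill))  # True
```

Because the result state is an explicit constituent of the decomposed verb meaning, the entailment falls out of the structure instead of being stipulated separately for each verb.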
The fifth requirement is the grammatical constraint: if other things were equal, a semantic theory that explains otherwise arbitrary generalizations about the lexicon and the syntax would be highly preferable. Conceptual semantics is a theory of meaning that shows how a handful of conceptual primitives organize the vast domain of lexical semantics. Conceptual semantics also explains how semantic entities are mapped onto syntactic entities in a principled manner. For instance, the linking principle in conceptual semantics states that the least embedded argument in the CS is mapped onto the least embedded syntactic argument, namely the subject. In (22b & c), [JOHN] is the least embedded argument

in both the action and thematic tiers; this explains why [JOHN] instead of [BALL] is mapped onto the subject of (22a). Jun (2003) is a conceptual semantic work on case; Culicover and Jackendoff (2005) offer conceptual semantic treatments of binding, control, and many other syntax-related phenomena. In short, conceptual semantics is an interface theory between syntax and semantics. The theory has a desirable consequence for the learnability problem, too. Language learners cannot acquire language solely by syntax or solely by semantics. As Levin (1993) demonstrates, a number of syntactic regularities are predicted by semantic properties of predicates. Conceptual semantics makes a number of predictions about syntax in terms of CS. Chomsky’s explanatory adequacy is a requirement for the learnability problem; conceptual semantics is thus a theory that aims to achieve the highest goal of a linguistic theory. The final requirement on a semantic theory is the cognitive constraint: a semantic theory should address interface problems between language and other peripheral systems like vision, hearing, smell, taste, kinesthesia, etc. Conceptual semantics fulfills this requirement, as CS is by definition a level of mental representation at which both linguistic and nonlinguistic modalities converge. Jackendoff (1987c) focuses on the interface problem, and shows, for instance, how the visual representation is formally compatible with the linguistic representation based on Marr’s (1982) theory of visual perception.
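The linking principle mentioned under the grammatical constraint (the least embedded CS argument maps onto the subject) can be sketched as a depth computation over (22b) and (22c). The tuple encoding and the helper names `argument_depths` and `subject_of` are assumptions made for this example; in particular, resolving ties in favor of the leftmost argument is a simplification adopted here so that the actor outranks the patient in the action tier.

```python
def argument_depths(cs, depth=0):
    """Return (thing, depth) pairs for all Thing arguments, left to right."""
    if isinstance(cs, str):
        return [(cs, depth)]
    pairs = []
    for arg in cs[1:]:               # cs[0] is the function name
        pairs.extend(argument_depths(arg, depth + 1))
    return pairs


def subject_of(cs):
    """Pick the least embedded argument as the subject.

    min() returns the first minimal element, so a tie goes to the
    leftmost argument (an assumption of this sketch).
    """
    return min(argument_depths(cs), key=lambda pair: pair[1])[0]


# (22b): the argument of TO is left unspecified in the text, so TO has none here.
thematic_tier = ("CAUSE", "JOHN", ("GO", "BALL", ("TO",)))
# (22c)
action_tier = ("AFF", "JOHN", "BALL")

print(argument_depths(thematic_tier))  # [('JOHN', 1), ('BALL', 2)]
print(subject_of(thematic_tier))       # JOHN
print(subject_of(action_tier))         # JOHN
```

On both tiers the procedure selects [JOHN], matching the subject of (22a).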

Comparison with Other Works Bierwisch and Schreuder’s (B&S; 1992) work is another influential theory that makes explicit use of the term conceptual structure. Conceptual semantics shares two important assumptions with B&S, but there are crucial distinctions between the two theories. First, B&S also assume a separate level of conceptual structure. Their conception of CS is similar to Jackendoff’s conception of CS in that CS is a representational system of message structure where non-linguistic factual/encyclopedic information is expressed. B&S, however, assume that CS strictly belongs to a nonlinguistic modality, and that the linguistic meaning is represented in another level called semantic form (SF). As a result, SF, but not CS, is the object of lexical semantics, and hence LCS does not make much sense in this theory. In the first section of this article, we discussed two possible views of CS; B&S take the former view of CS, whereas Jackendoff advocates the latter view. Second, SF in B&S’s theory is compositional as CS in conceptual semantics. B&S’s lexical decomposition relies on two sorts of elements: constants such as DO,


MOVE, FIN, LOC, etc., and variables such as x, y, z. Constants and variables are composed into a larger expression in terms of formal logic. (23a) illustrates B&S’s SF for enter; (23b) is the CS for the same word in Jackendoff’s theory. (23a) [y DO [MOVE y] : FIN [y LOC IN x]] (23b) [Event GO ([Thing ], [Path TO ([Place IN ([Thing ])])])]

One reason B&S maintain a purely nonlinguistic CS as well as a separate SF is that factual or encyclopedic knowledge does not seem to make much grammatical contribution to language. To B&S, there is a clear boundary where the semantic and the encyclopedic diverge. Pustejovsky’s (1995) generative lexicon (GL) theory is interesting in this regard. GL also assumes lexical decomposition. Pustejovsky’s lexical decomposition makes use of factual or encyclopedic knowledge in a rigorous formalism called the qualia structure. The qualia structure of book, for instance, expresses such factual knowledge as the origin of book as write(x, y) in the Agentive quale, where x is a writer (i.e., human(x)), and y is a book (i.e., book(y)). The qualia structure also expresses the use of the word in the Telic quale; hence, the lexical semantic structure for book includes such factual knowledge as read(w, y), where w is a reader (i.e., human(w)), and y is a book. The factual or encyclopedic knowledge is not only expressed in formal linguistic representations but also plays a crucial role in explaining a significant portion of linguistic phenomena. We interpret (24) as either Chomsky began writing a book or Chomsky began reading a book. Pustejovsky suggests generative devices like type coercion and co-composition to explain the two readings of (24) in a formal theory; i.e., writing or reading is part of the qualia structure of book, and, hence, the two readings of (24) are predicted by formal principles of lexical semantics. (24) Chomsky began a book.
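The two readings of (24) can be illustrated with a toy lexicon in which the noun carries the events that begin can pick up. The Python sketch below covers only the two qualia roles discussed in the text (Agentive and Telic); the `Qualia` class, the `LEXICON` table, and the `coerce_begin` helper are hypothetical names, and the code imitates the effect of type coercion without implementing Pustejovsky's formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qualia:
    agentive: str  # how the object comes into being, e.g. write(x, y)
    telic: str     # what the object is used for, e.g. read(w, y)


# Toy lexicon: only the entry discussed in the text.
LEXICON = {"book": Qualia(agentive="write", telic="read")}


def coerce_begin(subject, noun):
    """Readings of '<subject> began a <noun>' recovered from the noun's qualia."""
    qualia = LEXICON[noun]
    return [
        f"{subject} began to {event} a {noun}"
        for event in (qualia.agentive, qualia.telic)
    ]


for paraphrase in coerce_begin("Chomsky", "book"):
    print(paraphrase)
# Chomsky began to write a book
# Chomsky began to read a book
```

The event readings are supplied by the lexical entry of book, not by a separate entry for begin, which is the point of treating such factual knowledge as part of the lexical representation.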

It is far beyond the scope of this article to discuss the GL theory in detail. But the success of the GL theory for a vast range of empirical data shows that the boundary between semantic and encyclopedic or between linguistic and nonlinguistic is not so clear as B&S assume in their distinction between CS and SF.

Suggested Readings For a quick overview of conceptual semantics with one paper, see Jackendoff (1987a). For foundational issues of conceptual semantics, see Jackendoff (1983).

For an overview of language and other cognitive capacities from a broad perspective, see Jackendoff (1987c). Jackendoff (1990) offers a comprehensive picture of conceptual semantics. Jackendoff (1997) is a bit technical, but it is important to set up the parallel architecture of language. For syntactic issues of conceptual semantics, see Jun (2003) and Culicover and Jackendoff (2005). See also: Agrammatism II: Linguistic Approaches; Anaphora, Cataphora, Exophora, Logophoricity; Anatomical Asymmetries versus Variability of Language Areas of the Brain; Constants and Variables; Lexicon, Generative; Formal Models and Language Acquisition; Hyponymy and Hyperonymy; Meaning Postulates; Semantic Primitives; X-Bar Theory.

Bibliography

Berko Gleason J (ed.) (1997). The development of language. Boston: Allyn and Bacon.
Bierwisch M & Schreuder R (1992). ‘From concepts to lexical items.’ Cognition 42, 23–60.
Chomsky N (1970). ‘Remarks on nominalization.’ In Jacobs R A & Rosenbaum P S (eds.) Readings in English Transformational Grammar. Waltham: Ginn and Company. 184–221.
Chomsky N (1981). Lectures on government and binding: the Pisa lectures. Dordrecht: Foris.
Chomsky N (1995). The minimalist program. Cambridge, MA: MIT Press.
Collins A & Quillian M (1969). ‘Retrieval time from semantic memory.’ Journal of Verbal Learning and Verbal Behavior 9, 240–247.
Csuri P (1996). ‘Generalized dependencies: description, reference, and anaphora.’ Ph.D. diss., Brandeis University.
Culicover P & Jackendoff R (2005). Simpler syntax. Oxford: Oxford University Press.
Fodor J A (1975). The language of thought. Cambridge, MA: Harvard University Press.
Jackendoff R (1972). Semantic interpretation in generative grammar. Cambridge, MA: MIT Press.
Jackendoff R (1977). X-bar syntax: a study of phrase structure. Cambridge, MA: MIT Press.
Jackendoff R (1983). Semantics and cognition. Cambridge, MA: MIT Press.
Jackendoff R (1987a). ‘The status of thematic relations in linguistic theory.’ Linguistic Inquiry 18, 369–411.
Jackendoff R (1987b). ‘X-bar semantics.’ In Pustejovsky J (ed.) Semantics and the lexicon. Dordrecht: Kluwer Academic Publishers. 15–26.
Jackendoff R (1987c). Consciousness and the computational mind. Cambridge, MA: MIT Press.
Jackendoff R (1990). Semantic structures. Cambridge, MA: MIT Press.
Jackendoff R (1991). ‘Parts and boundaries.’ Cognition 41, 9–45.
Jackendoff R (1997). The architecture of the language faculty. Cambridge, MA: MIT Press.
Jackendoff R (2002). Foundations of language: brain, meaning, grammar, evolution. Oxford: Oxford University Press.
Jun J S (2003). ‘Syntactic and semantic bases of case assignment: a study of verbal nouns, light verbs, and dative.’ Ph.D. diss., Brandeis University.
Katz J J (1980). ‘Chomsky on meaning.’ Language 56(1), 1–41.
Katz J J & Fodor J A (1963). ‘The structure of a semantic theory.’ Language 39(2), 170–210.
Labov W (1973). ‘The boundaries of words and their meanings.’ In Bailey C-J N & Shuy R W (eds.) New ways of analyzing variation in English, vol. 1. Washington, DC: Georgetown University Press.
Levin B (1993). English verb classes and alternations. Chicago: University of Chicago Press.
Marr D (1982). Vision. San Francisco: W. H. Freeman.
Piñango M M, Zurif E & Jackendoff R (1999). ‘Real-time processing implications of aspectual coercion at the syntax-semantics interface.’ Journal of Psycholinguistic Research 28(4), 395–414.
Pustejovsky J (1995). The generative lexicon. Cambridge, MA: MIT Press.
Swinney D (1979). ‘Lexical access during sentence comprehension: (re)consideration of context effects.’ Journal of Verbal Learning and Verbal Behavior 18, 645–659.
Wertheimer M (1912). ‘Experimentelle Studien über das Sehen von Bewegung.’ Zeitschrift für Psychologie 61, 161–265.
Zurif E & Blumstein S (1978). ‘Language and the brain.’ In Halle M, Bresnan J & Miller G A (eds.) Linguistic theory and psychological reality. Cambridge, MA: MIT Press. 229–245.
Zurif E, Caramazza A & Myerson R (1972). ‘Grammatical judgments of agrammatic aphasics.’ Neuropsychologia 10, 405–417.

Lexical Conditions
P A M Seuren, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

In many schools of linguistics it is assumed that each sentence S in a natural language has a so-called semantic analysis (SA), a syntactic structure representing S in such a way that the meaning of S can be read off its SA in a regular way. The SA of a sentence S is distinct from its surface structure (SS), which corresponds directly with the way S is to be pronounced. Each language has a set of rules, its grammar G, defining the relationship between the SAs and the SSs of its sentences. The SA of a sentence S is often also called its logical form, because the SA exhibits not only the predicate-argument structure of S and its embedded clauses if S has any, but also the logically correct position of tense, quantifiers, negation, modalities, and other possible operators – besides all the meaningful lexical items of the corresponding SS. SAs are thus analytical as regards their structure, not as regards their lexical items. The lexical items of SSs are in place in SAs: in principle, SAs provide an analysis that goes as far as the lexical items and stops there. SAs do not specify lexical meanings. Lexical meanings are normally specified in dictionaries, but dictionaries do so from an SS point of view. However, linguistic theories assuming an SA-level of representation for sentences require that lexical

meanings be specified at SA-level. The difference is that, at SA-level, lexical items are allowed to occur only in predicate positions. A surface sentence like (1a) is represented at SA-level as (1b), written as the linear formula (1c) and read intuitively as (1d):
(1a) The farmer was not working on the land.
(1b) [tree diagram not reproduced]
(1c) S[V[not] S[V[past] S[V[on] S[V[be] S[V[work] NP[the y S[V[farmer] NP[y]]]]] NP[the x S[V[land] NP[x]]]]]]
(1d) It is not so that in the past on the land the farmer was working.

The items not, past, on, be–ing, work, farmer, and land are all labeled ‘V’, which makes them predicates in (1b). In (1a), however, farmer is a noun, the past tense is incorporated into the finite verb form was, not is usually considered an adverb, working is a present participle in the paradigm of the verb work, on is a preposition, and land is again a noun.
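The linear formula in (1c) is regular enough to be read by a few lines of code, which makes it easy to confirm that every lexical item sits under a V node. The Python sketch below is only an illustration: the tokenizer pattern, the nested-tuple tree format, and the function names `parse` and `predicates` are choices made for this example and are not part of the SA formalism.

```python
import re

# A token is a label directly followed by '[', a closing ']', or a bare word.
TOKEN = re.compile(r"[A-Za-z]+\[|\]|[^\s\[\]]+")

SA = ("S[V[not] S[V[past] S[V[on] S[V[be] S[V[work] "
      "NP[the y S[V[farmer] NP[y]]]]] NP[the x S[V[land] NP[x]]]]]]")


def parse(text):
    """Parse 'Label[...]' notation into nested (label, children) tuples."""
    tokens = TOKEN.findall(text)
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos][:-1]        # drop the trailing '['
        pos += 1
        children = []
        while tokens[pos] != "]":
            if tokens[pos].endswith("["):
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                        # consume the closing ']'
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("trailing material after the top-level constituent")
    return tree


def predicates(tree):
    """Collect the lexical items labeled V, in left-to-right order."""
    label, children = tree
    if label == "V":
        return [" ".join(children)]
    found = []
    for child in children:
        if isinstance(child, tuple):
            found.extend(predicates(child))
    return found


print(predicates(parse(SA)))
# ['not', 'past', 'on', 'be', 'work', 'farmer', 'land']
```

The output lists the same seven items that the text identifies as predicates, with be standing in for be–ing as it does in (1c).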



Because predicates express properties, the question is what property the predicates at issue assign to what kind of objects. Not assigns the property of being false to the proposition in its scope. (Finnish and cognate languages use verbs for the negation: ‘John nots working’ for ‘John does not work.’) Past places its proposition in a given past time. On says that the farmer’s being at work is on the land. (Some American Indian languages say ‘the farmer’s working on-s the land,’ with on as a verb.) Be–ing stretches the farmer’s working out over a period of time. Farmer and land assign the property of being a farmer, or land, to the values of their variables. Thus, despite differences in surface categories, all lexical words can be regarded as predicates at SA level. Analyzing all lexical meanings as predicate meanings has the advantage of a uniform format of lexical specification for all lexical items. The format is that of a definition of satisfaction conditions or lexical conditions. The lexical conditions of an n-ary predicate Pn define the property assigned by Pn. They are the conditions that must be fulfilled by any object (or n-tuple of objects) o for o to deserve Pn, in the sense that when Pn is applied to o, a true proposition results. Thus, for example, the conditions that determine whether a sentence like This animal is a dog is true are the lexical conditions associated with the predicate dog, applied to whatever object is referred to by the definite term this animal. Only if that object fulfills the conditions that are necessary for doghood is the sentence true. Generally, the extension [[Pn]] of the predicate Pn is the set of n-tuples of world objects o that fulfill the conditions set for Pn. Or:
(1) [[Pn]] = {⟨o1, …, on⟩ | . . . (lexical conditions) . . . }
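The format in (1) can be mirrored directly in code: a predicate is a function that tests its lexical conditions, and its extension is the set of tuples over a domain that pass the test. The Python sketch below uses an invented three-object world; the `WORLD` table, its attributes, and the functions `dog`, `same_species`, and `extension` are assumptions introduced for illustration, and the conditions are deliberately simplistic.

```python
from itertools import product

# A toy domain of world objects; the attributes are invented for illustration.
WORLD = {
    "fido": {"species": "dog"},
    "felix": {"species": "cat"},
    "rex": {"species": "dog"},
}


def dog(o):
    """Lexical condition of the unary predicate 'dog' (deliberately simplistic)."""
    return WORLD[o]["species"] == "dog"


def same_species(o1, o2):
    """Lexical condition of a binary predicate, to show the n-tuple case."""
    return WORLD[o1]["species"] == WORLD[o2]["species"]


def extension(predicate, arity, domain):
    """[[P]]: the set of n-tuples over the domain satisfying P's conditions."""
    return {objs for objs in product(domain, repeat=arity) if predicate(*objs)}


print(sorted(extension(dog, 1, WORLD)))       # [('fido',), ('rex',)]
print(len(extension(same_species, 2, WORLD)))  # 5

# 'This animal is a dog' is true only if the referent fulfills the conditions.
this_animal = "felix"
print(dog(this_animal))  # False
```

A sentence such as This animal is a dog then comes out true exactly when the referent of this animal belongs to the extension of dog.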

It is important to note that the lexical conditions thus specified do not, generally, exhaust the meaning of a predicate, even though lexical conditions can be formulated with great subtlety. Meanings often have vague boundaries, which makes the formulation of lexical conditions difficult. Words are often polysemous in that they have different but related meanings, such as the word chest, which applies either to a box meant for storage or to the part of a human body that is enclosed by the ribs. Polysemy often leads to homonymy or near homonymy (again with vague boundaries), as in the case of table (piece of furniture, slab of stone with symbols on it, or well-ordered list of data) or leaf (of a tree or of a book). Moreover, there is often dynamic filtering in word meanings, as in The office is on fire versus The office has a day off. In the former, the term the office denotes a building, in the latter a group of employees. The difference is caused by the nature of the predicate: be on fire requires a combustible object, whereas have a day off requires humans under a statute imposing duties, but how to integrate such possible referential differences into the format shown in (1) is unknown (and largely undiscussed in the literature). Then there is object dependency, as with verbs of cutting: one cuts the grass, one’s hair or nails, one’s finger, and the meat (though cutting one’s finger is very different from cutting the meat); one trims the hedge and the dog, and sometimes one’s hair also; one tailors a suit (German: schneiden); one gelds a horse (French: couper), etc. It is such phenomena that make it hard to use the format shown in (1) for the practical purposes of dictionaries.

In one respect, the format of (1) can be refined. Presuppositions are naturally accounted for by making a distinction between two kinds of lexical conditions, preconditions and update conditions (see Presupposition). Presuppositions are derivable from the preconditions of SA-predicates (see Fillmore, 1971; Seuren, 1985: 266–313). Consider the predicate be divorced. For someone to be divorced, they must have been married first. Or the predicate be back: for someone to be back, they must have been away first. The conditions of having been married first or having been away first are the preconditions of these predicates. The condition that the marriage has been dissolved, or that the person in question is no longer away, is the update condition. When a precondition is not fulfilled, the sentence suffers from presupposition failure, a condition that, according to some (in particular Strawson, 1950), leads to a lack of truth value and, according to others (Blau, 1978; Seuren, 1985), to a third truth value, strong or ‘radical’ falsity. If an update condition is not fulfilled, the sentence is simply, or minimally, false. In presupposition theory, the lexical conditions of a predicate $P^n$ can thus be presented in the following general format:

\[ [\![P^n]\!] = \{\, \langle o_1, \ldots, o_n \rangle : \ldots\ \text{(preconditions)}\ \ldots \mid \ldots\ \text{(update conditions)}\ \ldots \,\} \tag{2} \]
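The division of labor between the two kinds of conditions in (2) can be sketched as a small evaluation procedure. This is only an illustration: it adopts the three-valued option (radical versus minimal falsity) rather than the truth-value gap, and the toy records in `PEOPLE`, the names `Truth` and `evaluate`, and the encoding of be back are assumptions introduced here.

```python
from enum import Enum


class Truth(Enum):
    TRUE = "true"
    MINIMALLY_FALSE = "minimally false"   # an update condition fails
    RADICALLY_FALSE = "radically false"   # a precondition fails


def evaluate(preconditions, update_conditions, o):
    """Check preconditions first; only if they hold do update conditions decide."""
    if not all(cond(o) for cond in preconditions):
        return Truth.RADICALLY_FALSE      # presupposition failure
    if not all(cond(o) for cond in update_conditions):
        return Truth.MINIMALLY_FALSE
    return Truth.TRUE


# Hypothetical facts about three individuals (illustrative data only).
PEOPLE = {
    "ann": {"was_away": True, "is_away": False},
    "bob": {"was_away": True, "is_away": True},
    "cy": {"was_away": False, "is_away": False},
}

# 'be back': precondition = was away first; update condition = no longer away.
BE_BACK_PRE = [lambda o: PEOPLE[o]["was_away"]]
BE_BACK_UPD = [lambda o: not PEOPLE[o]["is_away"]]
```

On this toy data, Ann is back (true), Bob is back is minimally false, and Cy is back is radically false because Cy was never away.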

This format is exemplified by the following specification for be divorced:

\[ [\![\textit{be divorced}_1]\!] = \{\, o : o \text{ was married} \mid o\text{'s marriage has been legally dissolved} \,\} \tag{3} \]

Or: ‘the extension of the predicate be divorced is the set of entities o such that o (precondition) was married, and (update condition) o’s marriage has been legally dissolved’. See also: Discourse Domain; Discourse Semantics; Multivalued Logics; Presupposition.


Bibliography
Blau U (1978). Die dreiwertige Logik der Sprache. Ihre Syntax, Semantik und Anwendung in der Sprachanalyse. Berlin: De Gruyter.
Fillmore C J (1971). ‘Types of lexical information.’ In Steinberg D & Jakobovits L (eds.) Semantics. An interdisciplinary reader in philosophy, linguistics and psychology. Cambridge: Cambridge University Press. 370–392.
Seuren P A M (1985). Discourse semantics. Oxford: Blackwell.
Strawson P F (1950). ‘On referring.’ Mind 59, 320–344.

Lexical Fields
P Lutzeier, University of Surrey, Surrey, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Lexical fields have an immediate intuitive appeal. A reference to one or two examples, such as the field of verbs of motion walk, run, skip, . . . or the field of adjectives of emotion happy, angry, disappointed, . . ., is normally enough to give the feeling that one knows what is being talked about. A widespread curiosity about words also helps, not least in the context of being a parent and trying to collect the child’s first words. At the same time, one cannot help noticing that textbooks very rarely go beyond the mere mention of lexical fields in the form of a few examples, plus perhaps some critical remarks about the apparent lack of rigor around the concept. In other words, the intuitive strength of the concept may go together with some theoretical vagueness. Nonetheless, what remains at this stage is the widespread appeal of the concept and its undoubtedly successful application in several disciplines:
• Lexicology, Semantics and Cognitive Linguistics. Lexical fields are a useful tool for holistic approaches to lexical meaning, structures of the vocabulary and mental lexicon, as well as issues around categorization.
• Lexicography. The codification of the vocabulary of a language can be done in several different formats, and the organization of entries around lexical fields is one of them and leads to specialized dictionaries.
• Psycholinguistics. Lexical fields are employed in connection with word memory tests and explorations of language acquisition and language loss.
• Anthropology. Lexical fields are a useful tool in fieldwork on the language and culture of societies. This remains a major area in the context of globalization and ‘Global’ English, and the concern about endangered languages.
• Medical Neuroscience and Clinical Linguistics. Lexical fields are used for the investigation of different forms of aphasia.

In addition to the term ‘lexical field,’ there are other terms in use, such as ‘word field’ and ‘semantic field’; but we shall confine ourselves to lexical field, which provides greater flexibility, because, in contrast to word field, it implies that the relevant groupings involve lexical elements and these are not necessarily confined to words. At least in theory, idioms can be contemplated as possible members of such groupings. We also prefer lexical to semantic because the relevant groupings are parts of the lexicon, and their elements will consist of a form level as well as of a content level.

Background
Lexical fields contribute to structuring the lexicon and to exploration of lexical meaning. Although the lexical meaning of any member of the lexicon must be seen as a holistic entity, this does not preclude its conception as something internally structured. This structure must make provision for phenomena such as monosemy and polysemy; and, for each individual sense, phenomena such as prototypicality, stereotypes, and family resemblances need to be incorporated. In addition, the outer boundaries of the lexical meaning/senses of any member of the lexicon will be established by finding its unique position in the content plane of the lexicon. This happens in contradistinction to other similar lexical meanings along the paradigmatic dimension and in connection with other different, but compatible lexical meanings along the syntagmatic dimension. The paradigmatic dimension is mainly captured by membership in the same lexical fields and by means of sense relations, but also partly by associations. The syntagmatic dimension is mainly captured by collocations, but also partly by associations. Whichever structure one adopts for the lexical meaning, it cannot be a static one. One has to take into account the dynamics of the lexicon, which expresses itself at the synchronic level in the form of variation and at the diachronic level in the form of change. Such a fascinating complexity of the lexicon of any natural language allows for many different ways of ordering the lexicon, and any particular way of doing so can only capture certain aspects of such a complex system of systems. In a sense, lexical fields themselves have to find their own unique position in the vast field of all possible orderings of the lexicon, and will, in any case, only be able to catch one particular type of ordering. Therefore, stretching and extending the concept beyond its established conception will have to be looked at carefully and may not be the right way forward, especially when we have other concepts such as word families and frames.

The Concept of Lexical Field
Any concept of lexical fields will try to capture the following basic ideas and principles:
• Fields have a position somewhere between the individual lexical element and the whole lexicon, i.e., fields build relevant parts of the lexicon and make a contribution to the structuring of the lexicon.
• Fields and individual words have in common that they are part of the lexicon. Fields and the lexicon have in common that they are constituted from words.
• Fields are (higher level) signs and therefore comprise a form level as well as a content level.
• Each element of the field receives its position in contradistinction and interconnection with other elements of the field. In other words, fields help to establish the senses of individual elements and therefore have to be seen as part of a semasiological approach.
• Each lexical field deals with a particular conceptual domain and therefore can be seen as part of an onomasiological approach.

With regard to the form level of a lexical field, lexical fields are particular paradigmatic groupings within the lexicon, i.e., their elements belong in each case to the same part of speech. Although examples of lexical fields can be found for any part of speech, the most useful ones in terms of size and structures are found for nouns, verbs, and adjectives. Apart from possible problems with classification in terms of parts of speech, this does not preclude idioms as special lexical elements from being members of lexical fields. Because there are other groupings such as word families and frames that allow links between members from different parts of speech, there is no problem in keeping lexical fields to elements of one and the same part of speech.

The fact that we see lexical fields as ‘particular’ paradigmatic groupings has to do with the way that the form level is linked to the content level. If it is true that each lexical field is meant to deal with a particular conceptual domain, then all members of a lexical field will have to relate to this particular domain and therefore will all be similar in terms of their senses. The domain will normally be labeled by a specific term such as ‘body part,’ ‘fruit,’ ‘transfer,’ ‘temperature,’ etc., and this allows us to talk of ‘the lexical field of body part nouns,’ ‘the lexical field of nouns of fruit,’ ‘the lexical field of verbs of transfer,’ ‘the lexical field of adjectives of temperature,’ etc. The term that captures the domain provides the semantic framework for the lexical field, and each member of this specific field must have a sense that is compatible with the meaning of the domain term, i.e., a sense identical to or more specific than the meaning of the term. In this way a lexical field also constitutes a special onomasiological grouping, namely one in which all members belong to the same part of speech. This may well mean that the field does not have a member whose sense is identical with the meaning of the domain term. For instance, in the field of English adjectives of temperature, we do not have an adjective that would match the meaning of the noun ‘temperature.’

The membership of a lexical field structures the given domain, and one needs some tools to describe the particular structure. The resulting description should guarantee the unique position in some kind of semantic space for each member of the lexical field. It is wise to apply a combination of dimensions and sense relations. The idea is to attach a finite number of dimensions to the field. Each dimension is meant to provide a partition of the whole set.
The resulting sets are therefore disjoint with each other and the name of each set normally reflects a necessary part of the relevant sense of the element. In other words, a paraphrase of the relevant sense of the element would normally involve the partition name. It has to be stressed that the idea of such a name is similar, but not identical, to traditional features of a componential analysis and that we are talking metalinguistically of a partition of the set of elements, not of anything out there in reality. For instance, take the trivial paradigm P = {Venus, morning star, evening star}. All three nouns refer to the same entity: Venus. In line with Frege’s thoughts we could establish a dimension ‘Time of occurrence in the sky.’ This dimension would divide the paradigm into three sets: S1 = {morning star}, S2 = {evening star}, S3 = {Venus} with the corresponding names N1 = ‘the brightest star in the morning sky,’ N2 = ‘the brightest star in the evening sky,’ N3 = ‘neutral.’ Furthermore, there is no claim to the effect that the sum of all names of all sets of which a particular element is a member would constitute its sense in respect to the given domain. Also, the fact that paraphrases may vary accounts for the dynamic element of any description, and therefore one cannot necessarily expect a unique set of dimensions for the description of the lexical field. This does not constitute a weakness of the notion but reflects an important phenomenon of natural language in general. Guiding principles for the choice amongst possible candidates are: (1) those dimensions are preferable that result in sets with several members, and (2) those dimensions are preferable that result in cross-classification.

The familiar tool of sense relations can act as a complement to the tool of dimensions. As far as lexical fields are concerned, the hyponymy-relation and the incompatibility-relation tend to be the most useful ones. The analysis of the field is successful and complete when the net of links by means of these sense relations, always relative to the given domain, plus the relevant names constitute a unique position for each member of the field. An obvious exception is the case of synonyms with regard to the given domain. The interplay between dimensions and sense relations in forming the semantic space helps in several ways. It frees us from the often difficult task of expressing the unique position by means of an explicit link to necessary parts of the sense, whereas sense relations can often make a valuable contribution by means of more implicit links. The reliance on sense relations also allows an elegant solution to the supposedly thorny problem of completeness for lexical fields.
How can you decide whether a particular element should be a member of the field or not? There is an obvious answer in terms of the sense relations: completeness is achieved once one has closure in terms of the sense relations relative to the given domain. Earlier on, we stressed that the senses of a lexical element will be defined through paradigmatic and syntagmatic links. In other words, it must be accepted that lexical fields as paradigmatic groupings can in most cases only make a partial contribution to the senses of their members. This again is an acknowledgment of the richness of the lexicon of any natural language rather than the admission of a weakness in the concept of lexical field.
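The requirement that each dimension partitions the field, and that the resulting names give every member its own position, can be checked mechanically. The sketch below does this for the Venus paradigm discussed above; it covers dimensions only and leaves out sense relations, and the function names `is_partition`, `positions`, and `has_unique_positions` are hypothetical helpers introduced for illustration.

```python
def is_partition(field, cells):
    """True if the named cells are non-empty, pairwise disjoint, and cover the field."""
    members = [m for cell in cells.values() for m in cell]
    return (
        all(cells.values())                    # no empty cell
        and len(members) == len(set(members))  # pairwise disjoint
        and set(members) == set(field)         # covers the whole field
    )


def positions(field, dimensions):
    """Map each member to the tuple of cell names it receives, one per dimension."""
    return {
        m: tuple(
            next((name for name, cell in cells.items() if m in cell), None)
            for cells in dimensions.values()
        )
        for m in field
    }


def has_unique_positions(field, dimensions):
    """True if no two members share the same combination of cell names."""
    return len(set(positions(field, dimensions).values())) == len(field)


# The paradigm P and the single dimension from the text.
P = {"Venus", "morning star", "evening star"}
DIMENSIONS = {
    "time of occurrence in the sky": {
        "the brightest star in the morning sky": {"morning star"},
        "the brightest star in the evening sky": {"evening star"},
        "neutral": {"Venus"},
    }
}
```

With a single dimension, every cell here has exactly one member, so the positions are trivially unique. In a larger field, several dimensions that cross-classify the members would be needed before `has_unique_positions` succeeds, and synonyms would remain indistinguishable.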

Relevance of Lexical Fields
In the 1930s, Jost Trier, as the true founder of lexical field theory, evoked for lexical fields the idea of a mosaic within the content plane. This idea was discredited in the 1960s and 1970s because of its unfortunate link with Aristotelian/structuralist ideas of categories. As soon as this link was shown not to be a necessary one and lexical fields were taken to be groupings of the kind described here, the old idea of a mosaic could gain new strength again. In this context, lexical field theory can make the legitimate claim to be a forerunner of Cognitive Linguistics. Many syntactic and semantic theories and concepts come and go. Against this pattern, lexical field theory and the concept of lexical field have proven to be remarkably stable. What is urgently needed is the will to engage in more practical descriptions of vast lexical fields across several languages, because such information would provide useful data for typological comparisons across different languages and cultures. Lexical fields are also witness to the fact that natural languages exist and change as a result of dynamic (field) forces. It goes without saying that any linguistic theory worthy of its name must take account of such changes and therefore cannot afford to ignore the concept of lexical field.

See also: Antonymy and Incompatibility; Cognitive Semantics; Componential Analysis; Hyponymy and Hyperonymy; Language Education: Vocabulary; Lexical Conceptual Structure; Lexical Semantics: Overview; Lexicology; Onomasiology and Lexical Variation; Structuralism; Trier, Jost (1894–1970).

Bibliography
Coseriu E (1973). Einführung in die strukturelle Betrachtung des Wortschatzes (2nd edn.). Tübingen: Gunter Narr.
Geckeler H (1971). Strukturelle Semantik und Wortfeldtheorie. Munich: Fink Verlag.
Geckeler H (2002). ‘Anfänge und Ausbau des Wortfeldgedankens.’ In Cruse D A, Hundsnurscher F, Job M et al. (eds.) Lexicology: an international handbook on the nature and structure of words and vocabularies, vol. 1. Berlin: Walter de Gruyter. 713–728.
Gloning T (2002). ‘Ausprägungen der Wortfeldtheorie.’ In Cruse D A, Hundsnurscher F, Job M et al. (eds.) Lexicology: an international handbook on the nature and structure of words and vocabularies, vol. 1. Berlin: Walter de Gruyter. 728–737.
Jones W J (1990). German kinship terms (750–1500): documentation and analysis. Berlin: Walter de Gruyter.
Lehrer A (1974). Semantic fields and lexical structure. Amsterdam: North Holland.
Lehrer A (1985). ‘The influence of semantic fields on semantic change.’ In Fisiak J (ed.) Historical semantics/historical word-formation. Berlin: Mouton. 283–296.
Lutzeier P R (1981). Wort und Feld. Wortsemantische Fragestellungen mit besonderer Berücksichtigung des Wortfeldbegriffes. Tübingen: Max Niemeyer Verlag.
Lutzeier P R (ed.) (1993). Studien zur Wortfeldtheorie/Studies in lexical field theory. Tübingen: Max Niemeyer Verlag.
Lutzeier P R (1995). Lexikologie. Ein Arbeitsbuch. Tübingen: Stauffenburg Verlag.
Lutzeier P R (2005). ‘Die Wortfeldtheorie unter dem Einfluss des Strukturalismus.’ In Auroux S, Koerner E F, Niederehe H-J et al. (eds.) History of the language sciences: an international handbook on the evolution of the study of language from the beginnings to the present, vol. 3. Berlin: Walter de Gruyter.
Lyons J (1968). Introduction to theoretical linguistics. Cambridge: Cambridge University Press.
Pottier B (1963). Recherches sur l’analyse sémantique en linguistique et en traduction mécanique. Nancy: Université de Nancy.
Schmidt L (ed.) (1973). Wortfeldforschung. Zur Geschichte und Theorie des sprachlichen Feldes. Darmstadt: Wissenschaftliche Buchgesellschaft.
Seiffert L (1968). Wortfeldtheorie und Strukturalismus. Studien zum Sprachgebrauch Freidanks. Stuttgart: Kohlhammer Verlag.
Trier J (1973). Der deutsche Wortschatz im Sinnbezirk des Verstandes. Die Geschichte eines sprachlichen Feldes. Band 1 (Von den Anfängen bis zum Beginn des 13. Jahrhunderts) (2nd edn.). Heidelberg: Carl Winter.

Lexical Functional Grammar
M Dalrymple, Oxford University, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

LFG’s Syntactic Structures
Lexical Functional Grammar (LFG) is a theory of the structure of language and how different aspects of linguistic structure are related. As the name implies, the theory is lexical: the lexicon is richly structured, with lexical relations rather than transformations or operations on phrase structure trees as a means of capturing linguistic generalizations. It is also functional: grammatical functions such as subject and object are primitives of the theory, not defined in terms of phrase structure configuration or semantic roles.

LFG assumes that two syntactic levels are important in the analysis of linguistic structure. F(unctional)-structure represents abstract grammatical functions such as subject and object as well as abstract features such as tense and case. Another level, c(onstituent)-structure, represents the concrete phrasal expression of these relations, governed by language-particular constraints on word order and phrase structure. This duality of syntactic representation is motivated by the different natures of these two structures both within and across languages. Languages vary greatly in word order and phrasal structure, and the theory of constituent structure allows for this variation within certain universally defined parameters. In contrast, all languages share the same functional vocabulary. According to LFG’s theory of functional structure, the abstract syntactic structure of every language is organized in terms of subject, object, and other grammatical functions, most of which are familiar from traditional grammatical work.

Regularities in the relation between c-structure and f-structure are captured by functions relating parts of one structure to parts of the other. For example, the subject phrase in the c-structure tree is related to the subject f-structure by means of a function that relates nodes of the c-structure tree to parts of the f-structure for a sentence. Relations among c-structure, f-structure, and other linguistic levels have also been explored and defined in terms of functional mappings from subparts of one structure to the corresponding subparts of other structures.

The overall formal structure and basic linguistic assumptions of the theory have changed very little since its development in the late 1970s by Joan Bresnan, a linguist trained at the Massachusetts Institute of Technology, and Ronald M. Kaplan, a psycholinguist and computational linguist trained at Harvard University. Bresnan (1982) is a collection of influential early papers in LFG; recent works providing an overview or introduction to LFG include Dalrymple et al. (1995), Bresnan (2001), Dalrymple (2001), Falk (2001), and Kroeger (2004).
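The idea of a function from c-structure nodes to parts of the f-structure can be illustrated with a small lookup table. This is a rough sketch only: the node labels, the attribute names `PRED` and `SUBJ`, the value strings, and the name `phi` are assumptions introduced here and are not taken from the article.

```python
# f-structures modeled as nested dictionaries (attribute names are assumptions).
subject_f = {"PRED": "David"}
sentence_f = {"PRED": "sleep", "SUBJ": subject_f}

# A mapping from c-structure node labels to f-structures. Several nodes may
# map to the same f-structure; the subject phrase maps to the SUBJ value.
phi = {
    "IP": sentence_f,
    "I'": sentence_f,
    "VP": sentence_f,
    "NP": subject_f,
}


def f_structure_of(node_label):
    """Return the f-structure that the given c-structure node corresponds to."""
    return phi[node_label]
```

In this toy encoding, the f-structure of the subject phrase is the very object stored under `SUBJ` in the sentence's f-structure, which is one way to picture how a part of one structure is related to a part of the other.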

Constituent Structure
Languages vary greatly in the basic phrasal expression of even simple sentences. Basic word order can be verb-initial (Malagasy), verb-final (Japanese), or verb-medial (English). Word order correlates with grammatical function in some languages, such as English, in which the subject and other arguments appear in particular phrase structure positions. In other languages, word order is more free, and grammatical functions are identified by case marking or agreement rather than phrasal configuration; in many languages, there is no specific phrasal position where the subject or object must always appear. Requirements for phrasal groupings also differ across languages. In English, for example, a noun and any adjectives that modify it must appear together and form a phrasal unit. In many other languages, including Latin, this is not necessary and a noun can be separated from its modifying adjectives in the sentence. LFG’s constituent structure represents word order and phrasal constituenthood.

Constituent Structure Representation

Like many other linguistic theories, LFG represents word order and phrasal groupings by means of phrase structure trees, also called constituent structures (see Constituent Structure) or c-structures. The c-structure for an English sentence such as David is sleeping is: (1) David is sleeping.

C-structure trees contain two sorts of categories. Categories such as N (noun) and V (verb), familiar from traditional grammatical analysis, are called lexical categories. Most LFG analyses assume at least the lexical categories N (noun), A (adjective), V (verb), Adv (adverb), and P (preposition), although more or fewer categories may be relevant for a particular language. Most languages also make use of a set of functional categories, including I (for Inflection), C (for Complementizer), and D (for Determiner). Functional categories play an organizing role in the syntax and are either associated with closed-class categories such as complementizers or are filled with subtypes of particular lexical categories.

Constituent structure is organized according to X-bar theory (see X-Bar Theory), which assumes that phrases are internally headed and therefore endocentric; a phrase and its head have the same category but a different bar level. For example, the basic lexical category N is the head of the single-bar-level category N′ (‘N-bar’), which in turn is the head of the two-bar-level category N″ (‘N-double-bar’). Similarly, the basic functional category I is the head of I′, which heads I″. Many LFG analyses assume that N″ and I″ are maximal phrases, meaning that there is no category N‴ or I‴ for the double-bar-level category to head. Thus, as maximal phrases, the categories N″ and I″ are usually written as NP (noun phrase) and IP (the category assigned to a sentence such as David is sleeping). Nonprojecting categories are also assumed (Toivonen, 2003); these are lexical categories that are not heads of phrases but appear on their own, adjoined to heads. For example, verbal particles (words corresponding to the particle up in a sentence such as I woke up the baby) in some Germanic languages are nonprojecting words, typically prepositions, adjoined to the verb.

Not all phrases are endocentric. LFG assumes a single exocentric, nonheaded category, the category S, which does not obey the constraints of X-bar theory. Not all languages make use of this phrase; it plays no role in the syntax of English, for example. In languages that make use of this phrase, it behaves as a maximal phrase, but it has no c-structure head and it can dominate phrases of any category or bar level.

A phrase can dominate other constituents in addition to its head. LFG does not require phrase structure trees to be binary branching, and so there can be more than two daughters of any node in a c-structure tree. The nonhead daughter of a maximal phrase is called its specifier, and the nonhead sisters of a lexical category are its complements. This is shown schematically in (2).

As shown in (1), a verbal category, often an auxiliary, appears in I. The complement of I is either VP, as in (1), or, in languages that make use of it, the exocentric category S. Not all languages make use of the functional category C and the phrases it heads, C′ and CP. When a language makes use of this category, complementizers or verbs can appear in C, and the complement of C is IP or S. The functional category D, filled by a determiner, is also often assumed; the complement of D is NP. In the following, we do not assume DP, which means that the category of a phrase such as the boy is NP. However, there is no general agreement on the status of such phrases in LFG. According to some analyses, the boy is a DP rather than an NP in at least some languages.


Unlike many theories, LFG assumes that daughters of all phrasal categories are optional. In particular, the head of a maximal phrase need not appear. In many languages, for example, tensed verbs appear in I (King, 1995; Sells, 2001). A Swedish sentence such as (3), with a tensed verb and no nontensed verbs, has a VP that does not contain a V.
(3) Anna såg boken
Anna saw book.DEF
‘Anna saw the book.’

The right-hand side of an LFG phrase structure rule is a regular expression, allowing for disjunction, optionality, and arbitrary repetition of a node or sequence of nodes. The V and NP daughters in the rule in (6) are optional, and the Kleene star (*) annotation on the PP indicates that a sequence of zero or more PP constituents may appear.
(6) V′ → (V) (NP) PP*
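Because the right-hand side of a rule is a regular expression over category labels, checking whether a sequence of daughters is licensed can be sketched with an ordinary regular-expression engine. The following is only an illustration under stated assumptions: the encoding of each daughter as a space-terminated token and the name `licensed_by_rule_6` are invented for this example and are not part of LFG.

```python
import re

# Rule (6): V' -> (V) (NP) PP*
# Assumed encoding: each daughter category is written as a token followed by a space.
V_BAR_RHS = re.compile(r"(V )?(NP )?(PP )*")


def licensed_by_rule_6(daughters):
    """Return True if the sequence of daughter categories matches (V) (NP) PP*."""
    encoded = "".join(category + " " for category in daughters)
    return V_BAR_RHS.fullmatch(encoded) is not None


# Optional daughters may be absent, and PP may repeat any number of times.
examples = [
    ["V", "NP"],
    [],
    ["V", "PP", "PP"],
    ["NP", "V"],   # wrong order: not licensed
]
results = [licensed_by_rule_6(seq) for seq in examples]
```

The empty sequence is accepted because every daughter in (6) is optional, which mirrors the claim that a V′ need not contain a V.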

Functional Structure

Syntactic analyses in traditional grammatical descriptions are stated in terms of abstract syntactic functions such as subject, object, and complement. These functions are represented at LFG’s functional structure. F-structure represents abstract grammatical functions such as subject and object, as well as features such as tense, case, person, and number.

Nonhead daughters are also only optionally present. In Japanese and other so-called ‘prodrop’ languages, a verb can appear with no overt arguments. If no overt arguments of a verb are present, the c-structure tree contains only the verb:
(4) koware-ta
break-PAST
‘[it/something] broke.’

Grammatical Functions and Their Representation

In a sentence such as David devoured a sandwich, David is the subject and a sandwich is the object. This information is represented by an attribute-value structure, the f-structure, in which the value of the SUBJ feature is the f-structure for the subject and the value of the OBJ feature is the f-structure for the object. (7) David devoured a sandwich.

C-structure does not contain subparts of words or unpronounced features, nor does it contain null pronominals in prodrop languages such as Japanese. Rather, it reflects the structure and grouping of the full syntactic units – the words and phrases – in the sentence.

Phrase Structure Rules

LFG draws a strong distinction between the formal objects of the theory – constituent structure trees and functional structures – and the constraints or descriptions involving those objects. C-structure trees are constrained by phrase structure rules, which license local tree configurations. The phrase structure rule in (5a) licenses the c-structure in (5b):

For clarity, many of the features and values in this f-structure have been omitted, a practice often followed in LFG presentations. The full f-structure contains tense, aspect, person, number, and other functional features. Every content word in a sentence contributes a value for the feature PRED. These values are called semantic forms. In the functional structure, semantic forms are surrounded by single quotes: the semantic form contributed by the word David is ‘DAVID.’ An important property of semantic forms is that they are uniquely instantiated for each instance of their use, reflecting the unique semantic contribution of each word within the sentence. This is occasionally


indicated by associating a unique numerical identifier with each instance of a semantic form, as in (8): (8) David devoured a sandwich.

In (8), the particular occurrence of the semantic form for the word David as it is used in this sentence is represented as ‘DAVID_42.’ Another use of David will be associated with a different unique identifier, perhaps ‘DAVID_73.’ Representing semantic forms with explicit numerical identifiers clearly shows that each word makes a unique contribution to the f-structure. However, the identifiers also add unnecessary clutter to the f-structure and, therefore, are usually not displayed. A verb or other predicate generally requires a particular set of arguments: for example, the verb devoured requires a subject (SUBJ) and an object (OBJ). These arguments are said to be governed by the predicate; equivalently, the predicate is said to subcategorize for its arguments. The semantic form contributed by a verb or other predicate contains information about the arguments it governs. As shown in (8), the governed arguments appear in angled brackets: ‘DEVOUR⟨SUBJ,OBJ⟩.’ The LFG requirements of completeness and coherence ensure that all and only the grammatical functions governed by a predicate are found in the structure of a grammatically acceptable sentence. For example, the unacceptability of example (9) shows that the verb devoured cannot appear without an OBJ:
(9) *David devoured.

This sentence violates the principle of completeness, according to which every grammatical function governed by a predicate must be filled. Here, the OBJ is not present, and the sentence is incomplete. Furthermore, devour cannot appear with other functions than the grammatical functions SUBJ and OBJ that it governs. Example (10) shows that it cannot appear with a sentential complement in addition to its object:
(10) *David devoured a sandwich that it was raining.

This sentence violates the principle of coherence, according to which only the grammatical functions that are governed by a predicate can appear. Because the sentence contains a grammatical function that the verb devour does not govern, it is incoherent.

Table 1 Governable grammatical functions
SUBJ     Subject
OBJ      Object
COMP     Sentential or closed (nonpredicative) infinitival complement
XCOMP    An open (predicative) complement, often infinitival, whose SUBJ function is externally controlled
OBJ_θ    A family of secondary OBJ functions associated with a particular, language-specific set of thematic roles; in English, only OBJ_THEME is allowed, while other languages allow more than one thematically restricted secondary object
OBL_θ    A family of thematically restricted oblique functions such as OBL_GOAL or OBL_AGENT, often corresponding to adpositional phrases at c-structure

The grammatical functions that a predicate can govern are called governable grammatical functions. The inventory of universally available governable grammatical functions is given in Table 1. Languages differ as to which of these functions are relevant, but in many languages, including English, all of these functions are used. Not all phrases fill argument positions of a predicate. Modifying adjunct phrases are not required by a predicate and hence are not governable. In (11), the phrase yesterday bears the nongovernable grammatical function ADJ(unct):
(11) David devoured a sandwich yesterday.
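The completeness and coherence requirements amount to a comparison between the set of functions a predicate governs and the governable functions actually present in its f-structure. The sketch below illustrates that comparison for examples (7) and (9) to (11). It is a simplification: f-structures are modeled as Python dictionaries, the θ-indexed families are represented by a few illustrative member names, and the PRED values other than that of devour are placeholders.

```python
# Governable functions from Table 1; the theta-indexed families are represented
# by illustrative member names (an assumption of this sketch).
GOVERNABLE = {"SUBJ", "OBJ", "COMP", "XCOMP", "OBJ_THEME", "OBL_GOAL", "OBL_AGENT"}


def completeness_and_coherence(governed, fstructure):
    """Return (missing, ungoverned) sets of grammatical functions.

    missing    -- governed functions that are not filled (completeness violations)
    ungoverned -- governable functions present but not governed (coherence violations)
    """
    present = {attr for attr in fstructure if attr in GOVERNABLE}
    return governed - present, present - governed


DEVOUR = {"SUBJ", "OBJ"}  # the functions governed by 'DEVOUR<SUBJ,OBJ>'

ex7 = {"PRED": "DEVOUR<SUBJ,OBJ>",
       "SUBJ": {"PRED": "DAVID"},
       "OBJ": {"PRED": "SANDWICH"}}
ex9 = {"PRED": "DEVOUR<SUBJ,OBJ>", "SUBJ": {"PRED": "DAVID"}}  # *David devoured.
ex10 = {**ex7, "COMP": {"PRED": "RAIN"}}                       # *... that it was raining.
ex11 = {**ex7, "ADJ": [{"PRED": "YESTERDAY"}]}                 # adjuncts are not governable

report = {name: completeness_and_coherence(DEVOUR, fs)
          for name, fs in [("7", ex7), ("9", ex9), ("10", ex10), ("11", ex11)]}
```

Only (9) yields a nonempty set of missing functions and only (10) a nonempty set of ungoverned ones; the adjunct in (11) is ignored because ADJ is not a governable function.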

There are two nongovernable grammatical functions. The function ADJ is the grammatical function of modifiers such as in the park, with a hammer, and yesterday. The function XADJ is the grammatical function of open predicative adjuncts whose subject is externally controlled; as with the governable grammatical function XCOMP, the X in the name of the function indicates that it is an open function whose SUBJ is supplied externally. The phrase filling the XADJ role is in boldface in (12). (12a) Having opened the window, David took a deep breath. (12b) David ate the celery naked. (12c) David ate the celery raw.

In (12a) and (12b), the open adjunct XADJ is controlled by the subject of the main clause: It is David who opened the window, and it is David who is naked. In (12c), the XADJ is controlled by the object: It is the celery that is raw. Unlike governable grammatical functions, more than one adjunct function can appear in a sentence: (13) David devoured a sandwich at noon yesterday.


Because the ADJ function can be multiply filled, its value is a set of f-structures:
(14) David devoured a sandwich at noon yesterday.

The same is true of XADJ; more than one XADJ phrase can appear in a single sentence:
(15) Having opened the window, David ate the celery naked.

Hence, the value of the XADJ feature is also a set of f-structures. The f-structures that have been presented so far have included only a subset of their functional features. In fact, it is common in LFG literature to display only those features that are relevant to the analysis under discussion because a full representation is often too unwieldy. A full f-structure for these sentences contains at least the features and values listed in Table 2 and probably other language-specific features and values as well. The values given in this chart are the ones that are most often assumed, but some authors have argued for a different representation of the values of some features. For example, Dalrymple and Kaplan (2000) argue for a set-based representation of the PERS and GEND features to allow for an account of feature resolution in coordination and of the CASE feature to allow for case indeterminacy. Some studies assume a PCASE feature whose value specifies the grammatical function of its phrase. In more recent work, Nordlinger (1998) provided a theory of constructive case, according to which a case-marked phrase places constraints on its f-structure environment that determine its grammatical function in the sentence. This treatment supplants the traditional treatment of obliques in terms of the PCASE feature.

Table 2 f-Structure features
                      Feature     Value
Person                PERS        1, 2, 3
Gender                GEND        MASC, FEM, . . .
Number                NUM         SG, DUAL, PL, . . .
Case                  CASE        NOM, ACC, . . .
Surface form          FORM        Surface word form
Verb form             VFORM       PASTPART, PRESPART, . . .
Complementizer form   COMPFORM    Surface form of complementizer: THAT, WHETHER, . . .
Tense                 TENSE       PRES, PAST, . . .
Aspect                ASPECT      F-structure representing complex description of sentential aspect; sometimes abbreviated, e.g., PRES.IMPERFECT
Pronoun type          PRONTYPE    REL, WH, PERS, . . .

Functional Descriptions

As with c-structures, we draw a sharp distinction between f-structures and their descriptions. The set of f-structure constraints associated with the analysis of some sentence is called a functional description or f-description. To refer to the value of a feature, say, TENSE, in some f-structure, we use an expression like the following: (16) (f TENSE)

This expression refers to the value of the TENSE feature in the f-structure f. If we want to specify the value of that feature, we use an expression such as:
(17) (f TENSE) = PAST

This defining equation specifies that the feature TENSE in the f-structure f has the value PAST. We can also specify that a feature has a particular f-structure as its value. The expression in (18) specifies that the value of the SUBJ feature in f is the f-structure g:
(18) (f SUBJ) = g

Some features take as their value a set of functional structures. For example, because any number of adjuncts can appear in a sentence, the value of the feature ADJ is a set. We can specify that an f-structure h is a member of the ADJ set with the following constraint, using the set-membership symbol ∈:
(19) h ∈ (f ADJ)

The constraints discussed so far are called defining constraints because they define the required properties of a functional structure. An abbreviated f-description for a sentence such as David sneezed is:
(20) (f PRED) = ‘SNEEZE⟨SUBJ⟩’
(f TENSE) = PAST
(f SUBJ) = g
(g PRED) = ‘DAVID’

This f-description holds of the following f-structure, where the f-structures are annotated with the names used in the f-description (20):

(21) David sneezed

Notice, however, that the f-description also holds of the f-structure in (22), which also contains all the attributes and values that are mentioned in the f-description in (20):

However, the f-structure in (22) is not the minimal or smallest solution to the f-description in (20) because it contains additional attributes and values that do not appear in the f-description. We require the f-structure solution for a particular f-description to be the minimal solution to the f-description: no additional attributes or values that are not mentioned in the f-description are included. Thus, the correct solution to the f-description in (20) is the f-structure in (21), not the larger one in (22). Formally, the solution to an f-description is the most general f-structure that satisfies the f-description, which subsumes all other (larger) f-structures that satisfy the f-description. In addition to the defining constraints just described, LFG also allows elements of the f-description to check the properties of the minimal solution to the defining equations. The expression in (23) is a constraining equation, distinguished from a defining equation by the c subscript on the equals sign in the expression:
(23) (f SUBJ NUM) =_c SG
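The idea of a minimal solution can be made concrete with a small sketch that builds f-structures from the defining equations in (20) and adds nothing else. This is an illustration under assumptions, not an LFG solver: f-structures are modeled as Python dictionaries, each equation is a hypothetical `(name, attribute, value)` triple, and the `larger` structure with its extra NUM feature is an invented stand-in for a non-minimal structure of the kind shown in (22).

```python
def minimal_solution(equations):
    """Build the smallest f-structures satisfying a list of defining equations.

    Each equation is (name, attribute, value). A value of the form ("fs", n)
    refers to the f-structure named n. Conflicting values raise an error.
    """
    structures = {}
    for name, attr, value in equations:
        fs = structures.setdefault(name, {})
        if isinstance(value, tuple) and value[0] == "fs":
            value = structures.setdefault(value[1], {})
        if attr in fs and fs[attr] != value:
            raise ValueError(f"clash on ({name} {attr})")
        fs[attr] = value
    return structures


def holds_of(equations, structures):
    """Check whether every defining equation is true of the given f-structures."""
    for name, attr, value in equations:
        fs = structures.get(name)
        if fs is None or attr not in fs:
            return False
        if isinstance(value, tuple) and value[0] == "fs":
            if fs[attr] is not structures.get(value[1]):
                return False
        elif fs[attr] != value:
            return False
    return True


# The f-description in (20) for "David sneezed".
DESCRIPTION_20 = [
    ("f", "PRED", "SNEEZE<SUBJ>"),
    ("f", "TENSE", "PAST"),
    ("f", "SUBJ", ("fs", "g")),
    ("g", "PRED", "DAVID"),
]

solution = minimal_solution(DESCRIPTION_20)

# A larger structure (extra NUM feature, invented here) also satisfies (20),
# but it is not the minimal solution.
g_larger = {"PRED": "DAVID", "NUM": "SG"}
larger = {"f": {"PRED": "SNEEZE<SUBJ>", "TENSE": "PAST", "SUBJ": g_larger},
          "g": g_larger}
```

Both `solution` and `larger` make `holds_of` true for the description, but only `solution` contains exactly the attributes and values the equations mention.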

When this expression appears, the f-structure f that is the minimal solution to the defining equations must contain the feature SUBJ whose value has a feature NUM with value SG. The constraining equation in (23) does not hold of the f-structure in (21) because in that f-structure the value of the NUM feature has been left unspecified and the SUBJ of f does not have a NUM feature with value SG. In contrast, the functional description in (24a) for the sentence David sneezes has a well-formed solution, the f-structure in (24b):

(24a) (f PRED) = ‘SNEEZE⟨SUBJ⟩’
(f TENSE) = PRES
(f SUBJ) = g
(g PRED) = ‘DAVID’
(g NUM) = SG
(f SUBJ NUM) =_c SG

(24b)

Here, the value SG for the NUM feature for g is specified in the second-to-last line of the functional description. Thus, the f-structure in (24b) satisfies the defining constraints given in the first five lines of (24a). Moreover, it satisfies the constraining equation given in the last line of (24a). We can also place other requirements on the minimal solution to the defining equations in some f-description. The expression in (25a) requires f not to have the value PRESENT for the feature TENSE, which can happen if f has no TENSE feature or if f has a TENSE feature with some value other than PRESENT. When it appears in a functional description, the expression in (25b) is an existential constraint, requiring f to contain the feature TENSE, but not requiring any particular value for this feature. We can also use a negative existential constraint to require an f-structure not to contain a feature, as in (25c), which requires f not to contain the feature TENSE with any value whatsoever.
(25a) Negative equation: (f TENSE) ≠ PRESENT
(25b) Existential constraint: (f TENSE)
(25c) Negative existential constraint: ¬(f TENSE)
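Constraining equations and the constraints in (25) differ from defining equations in that they only inspect an f-structure that has already been built. A minimal sketch of the four kinds of check follows, assuming f-structures are nested Python dictionaries and a path such as (f SUBJ NUM) is a list of attribute names; the function names are invented for this illustration.

```python
def path_value(fs, path):
    """Follow a list of attributes through nested f-structures.

    Returns None if some attribute on the path is missing.
    """
    for attr in path:
        if not isinstance(fs, dict) or attr not in fs:
            return None
        fs = fs[attr]
    return fs


def constraining(fs, path, value):          # (f ...) =_c value
    return path_value(fs, path) == value


def negative(fs, path, value):              # (f ...) != value
    return path_value(fs, path) != value


def existential(fs, path):                  # (f ...)
    return path_value(fs, path) is not None


def negative_existential(fs, path):         # not (f ...)
    return path_value(fs, path) is None


# Minimal solutions for (20), "David sneezed", and (24a), "David sneezes".
f21 = {"PRED": "SNEEZE<SUBJ>", "TENSE": "PAST", "SUBJ": {"PRED": "DAVID"}}
f24 = {"PRED": "SNEEZE<SUBJ>", "TENSE": "PRES",
       "SUBJ": {"PRED": "DAVID", "NUM": "SG"}}

checks = {
    "(23) on (21)": constraining(f21, ["SUBJ", "NUM"], "SG"),
    "(23) on (24b)": constraining(f24, ["SUBJ", "NUM"], "SG"),
    "(25a) on (21)": negative(f21, ["TENSE"], "PRESENT"),
    "(25b) on (21)": existential(f21, ["TENSE"]),
    "(25c) on (21)": negative_existential(f21, ["TENSE"]),
}
```

The constraining equation fails for (21) because nothing has defined a NUM value for the subject, and succeeds for (24b) because the defining equation (g NUM) = SG has supplied one.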

Functional descriptions can also be stated in terms of the Boolean operations of conjunction, disjunction, and negation. In the f-descriptions just given, we implicitly assume that the constraints in the f-description are interpreted conjunctively; if an f-description contains more than one requirement, each requirement must hold. LFG also allows disjunctions and negations of sets of requirements. For example, a verb like sneeze contributes the following f-description: (26) sneeze

Disjunction is indicated by curly brackets, with the alternatives separated by a vertical bar |. Negation for a set of requirements is represented by prefixing ¬, and the scope of negation is indicated by curly brackets.


This lexical entry allows two possibilities. The first is for the base form of the verb, in which the value of the VFORM feature is BASE. For the second possibility, the value of the feature TENSE is PRES for present tense, and a third-person singular subject is disallowed by negating the possibility for the PERS feature to have value 3 when the NUM feature has value SG.
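The Boolean structure of this entry can be sketched as a test on a finished f-structure. This is a simplification, since in LFG the disjuncts contain defining equations rather than mere checks. It also assumes that the PERS and NUM features in question belong to the verb's SUBJ, as the phrase "third-person singular subject" suggests, and the function name is invented for the example.

```python
def sneeze_entry_allows(f):
    """Disjunction of the two options described for the form 'sneeze'.

    Option 1: VFORM is BASE.
    Option 2: TENSE is PRES, and it is not the case that the subject
              is both third person and singular.
    """
    subj = f.get("SUBJ", {})
    base_form = f.get("VFORM") == "BASE"
    third_singular = subj.get("PERS") == 3 and subj.get("NUM") == "SG"
    present_non_3sg = f.get("TENSE") == "PRES" and not third_singular
    return base_form or present_non_3sg


they_sneeze = {"TENSE": "PRES", "SUBJ": {"PERS": 3, "NUM": "PL"}}
david_sneeze = {"TENSE": "PRES", "SUBJ": {"PERS": 3, "NUM": "SG"}}   # *David sneeze
to_sneeze = {"VFORM": "BASE"}
```

The negation scopes over the conjunction of the two feature values, so a third-person plural or a first-person singular subject is still compatible with the present tense option.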

The Constituent Structure–Functional Structure Relation

There are clear crosslinguistic regularities relating constituent structure positions to grammatical functions. In particular, phrases and their heads are required to correspond to the same f-structure, and specifier and complement positions are associated with particular grammatical functions. Such generalizations constrain the relation between c-structure positions and the f-structure positions they are associated with.

Structural Correspondences

To express these generalizations formally, relating nodes in the c-structure tree and the f-structures they correspond to, we can define a function called φ (phi) that relates nodes of the c-structure tree to parts of the f-structure for a sentence. In (27), the φ function from the NP node to the f-structure it corresponds to is represented by an arrow and labeled φ.
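A correspondence function of this kind can be pictured as a mapping from tree nodes to f-structures. The sketch below does so for David sneezed, under assumptions: the node labels are illustrative rather than taken from (27), f-structures are Python dictionaries, and `nodes_for` is a helper invented here to list the nodes that share an f-structure.

```python
# f-structures for "David sneezed" (see the f-description in (20)).
g = {"PRED": "DAVID"}
f = {"PRED": "SNEEZE<SUBJ>", "TENSE": "PAST", "SUBJ": g}

# phi: c-structure node label -> f-structure.
# The node inventory is an assumption made for this sketch.
phi = {"IP": f, "VP": f, "V": f, "NP": g, "N": g}


def nodes_for(fstructure):
    """Return the c-structure nodes that phi maps to the given f-structure."""
    return sorted(node for node, fs in phi.items() if fs is fstructure)


subject_nodes = nodes_for(g)
clause_nodes = nodes_for(f)
```

Several nodes are mapped to the same dictionary object, so information contributed at any one of them is visible from all the others.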

(27) David sneezed

Each node of the c-structure tree corresponds to some part of the f-structure. As shown in (28), more than one c-structure node can correspond to the same f-structure (the φ function is many to one):

Further, there can be f-structures that have no corresponding c-structure node (the φ function is into). Example (29) shows the c-structure and f-structure for a sentence of Japanese, a prodrop language in which the verb optionally specifies functional information about its subject. When there is no overt subject phrase in the sentence, the information specified by the verb supplies the SUBJ value for the sentence. In (29), because there is no overt subject, all of the information about the subject comes from specifications on the verb, and there is no c-structure node corresponding to the SUBJ f-structure.
(29) koware-ta
break-PAST
‘[it/something] broke’

Constituent Structure–Functional Structure Correspondences

The φ function is important in stating universally valid relations between c-structure positions and the functional roles associated with them. For example, a phrase and its head always correspond to the same f-structure. Furthermore, the complement of a functional category is an f-structure cohead; the functional head and its complement correspond to the same f-structure. This is shown in (30), where the functional category IP, its heads I′ and I, and its complement VP map to the same f-structure.
(30) David is yawning.

The specifier position of the functional categories IP and CP is filled by a phrase bearing a grammaticized discourse function: SUBJ, TOPIC, or FOCUS. Within these limits, languages can differ as to the particular grammaticized discourse function allowed in each of these positions. In English, as we have seen, the specifier position of IP is filled by the SUBJ.
(31) David yawned.


In Finnish, the specifier of IP is associated with the TOPIC function, and the specifier of CP is associated with FOCUS. (32) Mikolta Anna sai kukkia. Mikko Anna got flowers. ‘From Mikko, Anna got flowers.’

the f-structure of the immediately dominating node: ↑
the f-structure of the current c-structure node: ↓

We can use these symbols to annotate the V′ phrase structure rule with f-structure correspondence constraints.

This annotated rule licenses the configuration in (35). In the c-structure, the V′ node dominates the V node, as the phrase structure rules require. The V′ and V nodes correspond to the same f-structure, as the annotations on the V node require.

When an f-structure contains a FOCUS or TOPIC function, the Extended Coherence Condition requires it to be integrated into the f-structure by either anaphorically or functionally binding another f-structure in the sentence. Here, the FOCUS also bears the OBL_SOURCE function, and the TOPIC is also the SUBJ; these relations involve functional binding because the same f-structure fills both functions. In a sentence such as Bill, I like him, the f-structure for Bill anaphorically binds the f-structure for him; the two phrases Bill and him are syntactically independent and each phrase has its own f-structure, but the anaphoric relation between the two satisfies the Extended Coherence Condition. The complements of a lexical category bear nondiscourse grammatical functions, that is, any grammatical function other than SUBJ, FOCUS, or TOPIC. In (33), the complements of V are associated with the grammatical functions OBJ and OBJ_THEME.
(33) David gave Chris a book.

In the rule shown in (36), the V and the V′ node correspond to the same f-structure, as specified by the ↑ = ↓ annotation on the V node. The annotation on the NP node requires the f-structure ↓ corresponding to the NP to be the value of the OBJ feature in the f-structure ↑ for the mother node.
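The effect of the two annotations just described can be sketched in a few lines. The code assumes f-structures are Python dictionaries and encodes the annotation ↑ = ↓ as the string `"head"` and an annotation of the form (↑ GF) = ↓ as the name of the grammatical function; both encodings, and the function name `annotate`, are conventions invented for this illustration.

```python
def annotate(mother_fs, daughters):
    """Compute the f-structure of each daughter from its annotation.

    daughters is a list of (category, annotation) pairs, where annotation is
    "head" for  up = down,  or a function name GF for  (up GF) = down.
    """
    phi = {}
    for category, annotation in daughters:
        if annotation == "head":
            # up = down: the daughter shares the mother's f-structure.
            phi[category] = mother_fs
        else:
            # (up GF) = down: the daughter's f-structure is the value of GF
            # in the mother's f-structure.
            phi[category] = mother_fs.setdefault(annotation, {})
    return phi


# V' -> V NP, with  up = down  on V and  (up OBJ) = down  on NP.
v_bar = {}
phi = annotate(v_bar, [("V", "head"), ("NP", "OBJ")])

# Information contributed below NP ends up inside the mother's OBJ.
phi["NP"]["PRED"] = "SANDWICH"
```

After the last line, `v_bar` contains an OBJ whose PRED is the value added through the NP node, because the NP's f-structure and the mother's OBJ value are the same object.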

The rule in (36) licenses the following configuration:

We can use the same formal vocabulary in the specifications of lexical entries. The lexical entry for the verb sneezed is shown in (38). It specifies that the c-structure category of sneezed is V, and also specifies constraints on the f-structure ↑ of the preterminal V node that dominates the terminal node sneezed:
(38) sneezed V

This lexical entry licenses the c-structure–f-structure configuration in (39).

Constraining the Constituent Structure–Functional Structure Relation

In describing the relation between c-structure and f-structure, we use the following symbols for the f-structure corresponding to the current node in a phrase structure rule and the f-structure of its mother node:

Syntax and Semantics

Several recent research strands in LFG have explored the relation of constituent and functional structure


to other linguistic levels. Among these are the theory of the relation between argument structure and syntax, and the ‘glue’ approach to the interface between syntax and semantics.

Mapping Theory and Argument Linking

Mapping theory explores correlations between the semantic roles of the arguments of a verb and their syntactic functions. If a language assigns the syntactic function SUBJ to the agent argument of an active verb such as kick, for example, it invariably assigns SUBJ to the agent argument of semantically similar verbs such as hit. Early formulations of the rules of mapping theory proposed rules relating specific thematic roles to specific grammatical functions, for example, that the thematic role of AGENT is always realized as SUBJ. Later work proposed more general rules relating thematic roles to classes of grammatical functions rather than specific functions. It is most often assumed that grammatical functions are crossclassified with the features ±R and ±O. Several versions of mapping theory have been proposed (Bresnan and Kanerva, 1989; Bresnan and Zaenen, 1990; Bresnan, 2001); in the following, we describe the theory of Bresnan and Zaenen (1990). The feature ±R distinguishes unrestricted (−R) grammatical functions from restricted (+R) functions. The grammatical functions SUBJ and OBJ are classified as unrestricted, meaning that they can be filled by an argument bearing any thematic role. These contrast with restricted grammatical functions such as obliques or thematically restricted objects, which must be filled by arguments with particular thematic roles; for example, the OBL_SOURCE function must be filled by an argument bearing the thematic role SOURCE and the thematically restricted object function OBJ_THEME is filled by a THEME argument. The feature ±O distinguishes objective (+O) grammatical functions from nonobjective (−O) functions. The unrestricted OBJ function and the restricted OBJ_θ functions are objective, whereas the SUBJ and the oblique functions are nonobjective. These features crossclassify the grammatical functions as in Table 3. These features are used to state rules of intrinsic classification of particular thematic roles.
Such rules constrain the relation between thematic roles and the classes of grammatical functions that these features delineate. For example, arguments bearing the AGENT role are classified as intrinsically nonobjective (−O), either SUBJ or OBL_AGENT. Arguments bearing the THEME role are disjunctively classified, either as intrinsically unrestricted (−R), bearing the SUBJ or OBJ function, or as intrinsically objective (+O), filling the OBJ or OBJ_THEME function. In addition to these intrinsic classifications, default mapping rules classify the arguments of a predicate according to their relative position on the thematic hierarchy (Bresnan and Kanerva, 1989):
(40) AGENT > BENEFACTIVE > RECIPIENT/EXPERIENCER > INSTRUMENT > THEME/PATIENT > LOCATIVE

One of the default mapping rules requires the argument of a predicate that is highest on the thematic hierarchy to be classified as unrestricted (−R). For example, if a verb requires an AGENT argument and a PATIENT argument, the AGENT argument thematically outranks the PATIENT argument and, thus, the AGENT argument is classified as unrestricted. For a predicate with an AGENT and a PATIENT argument, such as kick, this has the result in (41) (Bresnan and Kanerva, 1989).

For simplicity, we consider only the intrinsically unrestricted classification of the PATIENT argument, leaving aside the option of considering the PATIENT an intrinsically objective function. The AGENT argument is classified as intrinsically nonobjective. The default rules add the unrestricted classification to the thematically highest argument, the AGENT. Because the AGENT is classified as [−O, −R], it is the SUBJ. The unrestricted classification of the PATIENT argument allows it to bear either the SUBJ or the OBJ role, but because the AGENT is assigned the SUBJ role, the PATIENT must be realized as OBJ. Thus, the argument classification rules, together with well-formedness conditions such as the Subject Condition requiring each verbal predicate to have a subject, constrain the mapping between argument roles and grammatical functions.
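The reasoning for kick can be traced step by step in a short sketch. It is a rough approximation under several assumptions, not the mapping theory of Bresnan and Zaenen (1990) in full: only the intrinsically unrestricted option is used for THEME and PATIENT, each function is used at most once, and when several functions remain compatible the code simply takes the first in the order SUBJ, OBJ, OBL_θ, OBJ_θ. All names are invented for the example.

```python
# Feature classification of grammatical functions (Table 3).
FUNCTIONS = {
    "SUBJ": {"R": "-", "O": "-"},
    "OBJ": {"R": "-", "O": "+"},
    "OBL_THETA": {"R": "+", "O": "-"},
    "OBJ_THETA": {"R": "+", "O": "+"},
}

# Position on the thematic hierarchy in (40); lower numbers rank higher.
RANK = {"AGENT": 0, "BENEFACTIVE": 1, "RECIPIENT": 2, "EXPERIENCER": 2,
        "INSTRUMENT": 3, "THEME": 4, "PATIENT": 4, "LOCATIVE": 5}

# Intrinsic classifications (only the -R option for THEME/PATIENT is modeled).
INTRINSIC = {"AGENT": {"O": "-"}, "THEME": {"R": "-"}, "PATIENT": {"R": "-"}}


def map_arguments(roles):
    """Assign a grammatical function to each thematic role of a predicate."""
    ranked = sorted(roles, key=lambda role: RANK[role])
    features = {role: dict(INTRINSIC.get(role, {})) for role in ranked}
    # Default rule: the thematically highest argument is unrestricted (-R).
    features[ranked[0]].setdefault("R", "-")
    assignment = {}
    for role in ranked:
        compatible = [fn for fn, feats in FUNCTIONS.items()
                      if fn not in assignment.values()
                      and all(feats[k] == v for k, v in features[role].items())]
        if not compatible:
            raise ValueError(f"no grammatical function available for {role}")
        assignment[role] = compatible[0]
    return assignment


kick = map_arguments(["AGENT", "PATIENT"])
```

For kick, the AGENT ends up as [−O, −R] and can only be SUBJ; the PATIENT is compatible with SUBJ and OBJ, and since SUBJ is already taken it is realized as OBJ.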

Table 3 Classification of grammatical functions
        −R      +R
−O      SUBJ    OBL_θ
+O      OBJ     OBJ_θ

Glue: The Syntax–Semantics Interface

LFG assumes that the syntactic level that is primarily involved in semantic composition is the functional structure. That is, functional relations such as SUBJ


and OBJ rather than c-structure tree configurations are primarily responsible for determining how the meanings of the parts of a sentence combine to produce the full meaning of the sentence. The dominant theory of the syntax–semantics interface in LFG is called the glue approach (Dalrymple, 1999, 2001), a theory of how syntax guides the process of semantic composition. The glue approach assumes that each part of the f-structure corresponds to a semantic resource associated with a meaning and that the meaning of an f-structure is obtained by assembling the meanings of its parts according to a set of instructions specifying how the semantic resources can combine. These assembly instructions are provided as a set of logical premises in the ‘glue language’ of linear logic, and the derivation of a meaning for a sentence corresponds to a logical deduction. The deduction is performed on the basis of logical premises contributed by the words in the sentence (and possibly by syntactic constructions). Linear logic, a resource-based logic, is used to state requirements on how the meanings of the parts of a sentence can be combined to form the meaning of the sentence as a whole. Linear logic is different from classical logic in that it does not admit rules that allow for premises to be discarded or used more than once in a deduction. Premises in a linear logic deduction are, then, resources that must be accounted for in the course of a deduction; this nicely models the semantic contribution of the words in a sentence, which must contribute exactly once to the meaning of the sentence and may not be ignored or used more than once. A sentence such as David knocked twice cannot mean simply David knocked; the meaning of twice cannot be ignored. It also cannot mean the same thing as David knocked twice twice; the meaning of a word in a sentence cannot be used multiple times in forming the meaning of the sentence. 
The syntactic structures for the sentence David yawned, together with the desired semantic result, are displayed in (42). (42) David yawned.

The semantic structure for the sentence is related to its f-structure by the correspondence function σ, represented as a dotted line. This result is obtained on the basis of the following lexical information, associated with the verb yawned:
(43) λX.yawn(X) : (↑ SUBJ)_σ ⊸ ↑_σ

This formula is called a meaning constructor. It pairs the meaning for yawned, the one-place predicate λX.yawn(X), with the linear logic formula (↑ SUBJ)_σ ⊸ ↑_σ. In this formula, the connective ⊸ is the linear implication symbol of linear logic. This symbol expresses a meaning similar to if . . . then, in this case, stating that if a semantic resource (↑ SUBJ)_σ representing the meaning of the subject is available, then a semantic resource ↑_σ representing the meaning of the sentence can be produced. Unlike the implication operator of classical logic, the linear implication operator ⊸ carries with it a requirement for consumption and production of semantic resources; the formula (↑ SUBJ)_σ ⊸ ↑_σ indicates that if a semantic resource (↑ SUBJ)_σ is found, it is consumed and the semantic resource ↑_σ is produced. We also assume that a name such as David contributes a semantic resource, its semantic structure. In an example like David yawned, this resource is consumed by the verb yawned, which requires a resource for its SUBJ to produce a resource for the sentence. This accords with the intuition that the verb in a sentence must obtain a meaning for its arguments in order for a meaning for the sentence to be available. The f-structure for the sentence David yawned, together with the instantiated meaning constructors contributed by David and yawned, is given in (44).

The left-hand side of the meaning constructor labeled [David] is the proper noun meaning David, and the left-hand side of the meaning constructor labeled [yawn] is the meaning of the intransitive verb yawned, the one-place predicate λX.yawn(X). We must also provide rules for how the right-hand (glue) side of each of the meaning constructors in (44) relates to the left-hand (meaning) side in a meaning deduction. For simple, nonimplicational meaning constructors such as [David] in (44), the meaning on the left-hand side is the meaning of the semantic structure on the right-hand side. For meaning constructors that contain the linear implication operator ⊸, such as [yawn], modus ponens on the glue side corresponds to function application on the meaning side:

(45) X : fσ
     P : fσ ⊸ gσ
     ------------------
     P(X) : gσ

With these correspondences between linear logic formulas and meanings, we perform the following series of reasoning steps:

(46) David : dσ
         The meaning David is associated with the SUBJ semantic structure dσ.

     λX.yawn(X) : dσ ⊸ yσ
         On the glue side, if we find a semantic resource for the SUBJ dσ, we consume that resource and produce a semantic resource for the full sentence yσ. On the meaning side, we apply the function λX.yawn(X) to the meaning associated with dσ.

     yawn(David) : yσ
         We have produced a semantic structure for the full sentence yσ, associated with the meaning yawn(David).

By using the function application rule and the meaning constructors for David and yawned, we deduce the meaning yawn(David) for the sentence David yawned, as desired. Glue analyses of quantification, intensional verbs, modification, coordination, and other phenomena have been explored (Dalrymple, 1999). A particular challenge for the glue approach is found in cases in which there are apparently too many or too few meaning resources to produce the correct meaning for a sentence; such cases are explored within the glue framework by Asudeh (2004).
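The resource-sensitive deduction for David yawned can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not an implementation of the glue approach: the class names Atom, Implies, and Premise and the function derive are hypothetical, implications are restricted to atomic antecedents and consequents, and the greedy search is not a complete linear logic prover. It shows modus ponens on the glue side paired with function application on the meaning side, and it rejects any set of premises that cannot all be used exactly once.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Atom:
    """An atomic semantic resource, e.g. the semantic structure of SUBJ."""
    name: str


@dataclass(frozen=True)
class Implies:
    """Linear implication: consume `antecedent`, produce `consequent`."""
    antecedent: Atom
    consequent: Atom


@dataclass(eq=False)  # identity comparison: each premise is a distinct resource
class Premise:
    meaning: Any                 # left-hand (meaning) side
    glue: Union[Atom, Implies]   # right-hand (glue) side


def derive(premises, goal):
    """Return the meaning paired with `goal`, using every premise exactly once."""
    pool = list(premises)
    progress = True
    while progress:
        progress = False
        for fn in pool:
            if not isinstance(fn.glue, Implies):
                continue
            arg = next((p for p in pool
                        if p is not fn and p.glue == fn.glue.antecedent), None)
            if arg is None:
                continue
            # Modus ponens on the glue side, function application on the meaning side.
            pool.remove(fn)
            pool.remove(arg)
            pool.append(Premise(fn.meaning(arg.meaning), fn.glue.consequent))
            progress = True
            break
    if len(pool) == 1 and pool[0].glue == goal:
        return pool[0].meaning
    raise ValueError("no derivation consumes every premise exactly once")


# Hypothetical labels for the resources written d-sigma and y-sigma in (46).
d, y = Atom("d_sigma"), Atom("y_sigma")
david = Premise("David", d)
yawned = Premise(lambda x: f"yawn({x})", Implies(d, y))

result = derive([david, yawned], y)
```

Calling derive with only the verb's premise, or with an extra unused copy of the subject premise, raises ValueError. This mirrors the requirement that a word's contribution can be neither ignored nor used more than once.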

Preferences and Parsing

From its inception, work on LFG has been informed by computational and psycholinguistic concerns. Recent research has combined LFG’s syntactic assumptions with an optimality-theoretic approach in an exploration of OT-LFG (see Pragmatics: Optimality Theory; Optimality-Theoretic Lexical-Functional Grammar). Other work combines LFG with Data-Oriented Parsing, a new view of language processing and acquisition. There have also been significant developments in parsing and generating with LFG grammars and grammars in related formalisms.

Data-Oriented Parsing and Lexical Functional Grammar

The framework of Data-Oriented Parsing (DOP), developed primarily by Rens Bod and his colleagues, represents a new view of the productivity of language and how it can be acquired on the basis of a finite amount of data. DOP views language acquisition as the analysis of a pool of linguistic structures that are presented to the language learner. The learner breaks up these structures into all of their component pieces, from the largest pieces to the smallest units, and new utterances are assembled from these pieces. The likelihood of assigning a particular analysis to a new sentence depends on the frequency of occurrence of its component parts, both large and small, in the original pool of structures. LFG-DOP (Bod and Kaplan, 1998) specializes the general DOP theory to LFG assumptions about linguistic structures and the relations between them. LFG-DOP assumes that the body of linguistic evidence that a language learner is presented with consists of well-formed c-structure–f-structure pairs. On this view, language acquisition consists in determining the relevant component parts of these structures and then combining these parts to produce new c-structure–f-structure pairs for novel sentences.

Parsing

Several breakthroughs have been made in the parsing of large computational LFG grammars. Maxwell and Kaplan (1991) examined the problem of processing disjunctive specifications of constraints, which are computationally very difficult to process. In the worst case, processing disjunctive constraints is exponentially difficult. However, this worst-case scenario assumes that every disjunctive constraint can interact significantly with every other constraint. In practice, such interactions are found only very rarely. An ambiguity in the syntactic properties of the SUBJ of a sentence rarely correlates with ambiguities in the OBJ or other arguments. This insight is the basis of Maxwell and Kaplan’s algorithm, which works by turning a set of disjunctively specified constraints into a set of contexted, conjunctively specified constraints, in which the context of a constraint indicates where the constraint is relevant. Solving these contexted constraints turns out to be very efficient for linguistically motivated sets of constraints, in which only local interactions among disjunctions tend to occur.

Maxwell and Kaplan (1993, 1996) explored the issue of c-structure processing and its relation to solving f-structural constraints. It has long been known that constituent structure parsing – determining the phrase structure trees for a given sentence – is very fast in comparison to solving the equations that determine the f-structure for the sentence. For this reason, an important task in designing algorithms for linguistic processing of different kinds of structures such as the c-structure and the f-structure is to optimize the interactions between these computationally very different tasks. Previous research often assumed that the most efficient approach would be to interleave the construction of the phrase structure tree with the solution of f-structure constraints. Maxwell and Kaplan (1993) explored and compared a number of different methods for combining phrase structure processing with constraint solving; they showed that in certain situations, interleaving the two processes can actually give very bad results. Subsequently, Maxwell and Kaplan (1996) showed that if phrase structure parsing and f-structural constraint solving are combined in the right way, parsing can be very fast. In fact, if the grammar that results from combining phrase structure and functional constraints happens to be context-free equivalent, the algorithm for computing the c-structure and f-structure operates in cubic time, the same as for pure phrase structure parsing.

Generation

Generation is the inverse of parsing. Whereas the parsing problem is to determine the c-structure and f-structure that correspond to a particular sentence, work on generation in LFG assumes that the generation task is to determine which sentences correspond to a specified f-structure, given a particular grammar. Based on these assumptions, several interesting theoretical results have been attained. Of particular importance is the work of Kaplan and Wedekind (2000), who showed that if we are given an LFG grammar and an acyclic f-structure (that is, an f-structure that does not contain a reference to another f-structure that contains it), the set of strings that corresponds to that f-structure according to the grammar is a context-free language. Kaplan and Wedekind also provided a method for constructing the context-free grammar for that set of strings by a process of specialization of the full grammar that we are given. This result leads to a new way of thinking about generation; opens the way to new, more efficient generation algorithms; and clarifies a number of formal and mathematical issues relating to LFG parsing and generation.

Wedekind and Kaplan (1996) explored issues in ambiguity-preserving generation, in which a set of f-structures rather than a single f-structure is considered, and the sentences of interest are those that correspond to all of the f-structures under consideration. The potential practical advantages of ambiguity-preserving generation are clear. Consider, for example, a scenario involving translation from English to German. We first parse the input English sentence, producing several f-structures if the English sentence is ambiguous. For instance, the English sentence Hans saw the man with the telescope is ambiguous: It means either that the man had the telescope or that Hans used the telescope to see the man. The best translation for this sentence would be a German sentence that is ambiguous in exactly the same way as the English sentence, if such a German sentence exists. In the case at hand, we would like to produce the German sentence Hans sah den Mann mit dem Fernrohr, which has exactly the same two meanings as the English input. To do this, we map the English f-structures for the input sentence to the set of corresponding German f-structures; our goal is then to generate the German sentence Hans sah den Mann mit dem Fernrohr, which corresponds to each of these f-structures.

This approach is linguistically appealing, but mathematically potentially problematic. Wedekind and Kaplan (1996) showed that determining whether there is a single sentence that corresponds to each member of a set of f-structures is in general undecidable for an arbitrary (possibly linguistically unreasonable) LFG grammar. This means that there are grammars that can be written within the formal parameters of LFG, even though these grammars may not encode the properties of any actual or potential human language, and, for these grammars, there are sets of f-structures for which it is impossible to determine whether there is any sentence that corresponds to those f-structures. This result is important in understanding the formal limits of ambiguity-preserving generation.
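The goal of ambiguity-preserving generation can be pictured with a toy example. The sketch below assumes, purely for illustration, that the sentences generated from each f-structure are already available as small finite sets; the dictionary generations, its keys, and the German alternatives other than Hans sah den Mann mit dem Fernrohr are hypothetical. It does not address the general problem, which the text notes is undecidable for arbitrary LFG grammars; it only shows that the sentences of interest are those common to every f-structure under consideration.

```python
def ambiguity_preserving(generations, f_structures):
    """Return the sentences that correspond to every listed f-structure."""
    sets = [generations[f] for f in f_structures]
    return set.intersection(*sets) if sets else set()


# Hypothetical finite generation results for the two German f-structures.
generations = {
    "man_has_telescope": {
        "Hans sah den Mann mit dem Fernrohr",
        "Hans sah den Mann, der das Fernrohr hatte",
    },
    "hans_uses_telescope": {
        "Hans sah den Mann mit dem Fernrohr",
        "Mit dem Fernrohr sah Hans den Mann",
    },
}

shared = ambiguity_preserving(
    generations, ["man_has_telescope", "hans_uses_telescope"]
)
```

Under these assumed sets, only the sentence that is ambiguous in the same way as the English input survives the intersection.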
See also: Constituent Structure; Declarative Models of Syntax; Grammatical Relations and Arc-Pair Grammar; Optimality-Theoretic Lexical-Functional Grammar; Pragmatics: Optimality Theory; Syntactic Features and Feature Structures; Unification, Classical and Default; X-Bar Theory.

Bibliography

Alsina A (1993). Predicate composition: a theory of syntactic function alternations. Ph.D. diss., Stanford University.
Andrews A III & Manning C D (1999). Complex predicates and information spreading in LFG. Stanford, CA: CSLI Publications.
Asudeh A (2004). Resumption as resource management. Ph.D. diss., Stanford University.
Bod R & Kaplan R M (1998). ‘A probabilistic corpus-driven model for Lexical-Functional analysis.’ In Proceedings of COLING/ACL98. Montreal. 145–151.
Bresnan J (1978). ‘A realistic transformational grammar.’ In Halle M, Bresnan J & Miller G A (eds.) Linguistic theory and psychological reality. Cambridge, MA: MIT Press. 1–59.
Bresnan J (ed.) (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press.
Bresnan J (2001). Lexical–Functional syntax. Oxford: Blackwell Publishers.
Bresnan J & Kanerva J M (1989). ‘Locative inversion in Chicheŵa: A case study of factorization in grammar.’ Linguistic Inquiry 20(1), 1–50. [Reprinted in Stowell et al. (1992).]
Bresnan J & Zaenen A (1990). ‘Deep unaccusativity in LFG.’ In Dziwirek K, Farrell P & Mejías-Bikandi E (eds.) Grammatical relations: a cross-theoretical perspective. Stanford, CA: CSLI Publications. 45–57.
Butt M (1996). The structure of complex predicates in Urdu. Stanford, CA: CSLI Publications.
Dalrymple M (1993). CSLI lecture notes 36: The syntax of anaphoric binding. Stanford, CA: CSLI Publications.
Dalrymple M (ed.) (1999). Semantics and syntax in Lexical Functional grammar: the resource logic approach. Cambridge, MA: MIT Press.
Dalrymple M (2001). Syntax and semantics 34: Lexical Functional grammar. New York: Academic Press.
Dalrymple M & Kaplan R M (2000). ‘Feature indeterminacy and feature resolution.’ Language 76(4), 759–798.
Dalrymple M, Kaplan R M, Maxwell J T III & Zaenen A (eds.) (1995). Formal issues in Lexical–Functional grammar. Stanford, CA: CSLI Publications.
Falk Y N (2001). Lexical-Functional grammar: an introduction to parallel constraint-based syntax. Stanford, CA: CSLI Publications.
Kaplan R M & Wedekind J (2000). ‘LFG generation produces context-free languages.’ In Proceedings of the 18th International Conference on Computational Linguistics (COLING2000). Saarbruecken. 425–431.
King T H (1995). Configuring topic and focus in Russian. Stanford, CA: CSLI Publications.
Kroeger P (1993). Phrase structure and grammatical relations in Tagalog. Stanford, CA: CSLI Publications.
Kroeger P (2004). Analyzing syntax: a Lexical–Functional approach. Cambridge, UK: Cambridge University Press.
Levin L S (1986). Operations on lexical forms: unaccusative rules in Germanic languages. Ph.D. diss., MIT.
Levin L S, Rappaport M & Zaenen A (eds.) (1983). Papers in Lexical Functional grammar. Bloomington, IN: Indiana University Linguistics Club.
Manning C D (1996). Ergativity: argument structure and grammatical relations. Stanford, CA: CSLI Publications.
Maxwell J T III & Kaplan R M (1991). ‘A method for disjunctive constraint satisfaction.’ In Tomita M (ed.) Current issues in parsing technology. Dordrecht: Kluwer Academic Publishers. 173–190. [Reprinted in Dalrymple et al. (eds.). 381–401.]
Maxwell J T III & Kaplan R M (1993). ‘The interface between phrasal and functional constraints.’ Computational Linguistics 19(4), 571–590.
Maxwell J T III & Kaplan R M (1996). ‘An efficient parser for LFG.’ In Butt M & King T H (eds.) On-line Proceedings of the LFG96 Conference. Available at: http://cslipublications.stanford.edu.
Mohanan T (1994). Arguments in Hindi. Stanford, CA: CSLI Publications.
Nordlinger R (1998). Constructive case: evidence from Australian languages. Stanford, CA: CSLI Publications.
Sells P (2001). Structure, alignment and optimality in Swedish. Stanford, CA: CSLI Publications.
Simpson J (1991). Warlpiri morpho-syntax: a Lexicalist approach. Dordrecht: Kluwer Academic Publishers.
Stowell T, Wehrli E & Anderson S R (eds.) (1992). Syntax and semantics 26: Syntax and the lexicon. San Diego: Academic Press.
Toivonen I (2003). Non-projecting words: a case study of Swedish particles. Dordrecht: Kluwer Academic Publishers.
Wedekind J & Kaplan R M (1996). ‘Ambiguity-preserving generation with LFG- and PATR-style grammars.’ Computational Linguistics 22(4), 555–568.

Lexical Phonology and Morphology
G Booij, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Lexical and Postlexical Phonology

The term ‘lexical phonology’ is used for two different but related purposes. First, it refers to the range of phonological processes or constraints in a language that pertain to the domain of the word. In this use, it is a synonym of ‘word phonology,’ and stands in opposition to the term ‘postlexical phonology’ or ‘phrasal phonology.’ With the latter term we denote the processes or constraints that apply across the board, not only within the domain of the word, but also across word boundaries in the domain of larger constituents such as phrases. The distinction between these two domains of phonology can be illustrated by means of the following example. In Dutch, obstruents (stops and fricatives) are always voiceless at the end of a syllable (a generalization referred to as final devoicing). This constraint is part of word phonology, as illustrated by the following minimal pair:

vind-er [vin.der] ‘finder’
vind er [vin.ter] ‘find her’

The first example is a noun ending in the suffix -er derived from the verb vind ‘to find’. Final devoicing is not applicable here since the stem-final /d/ is syllabified as an onset (syllable boundaries in phonetic forms are indicated by dots). In the second example the clitic pronoun er /er/ ‘her’ forms one prosodic word with the preceding verb. However, the /d/ is realized as [t], showing that it must have been in coda position at the presyntactic level, before this word was combined with the clitic pronoun. This shows that final devoicing is a constraint that is part of the word phonology of Dutch, whereas the syllabification of the verb and the following clitic as one prosodic word belongs to the domain of phrasal (postlexical) phonology. As this example illustrates, rules of word phonology must apply before phrasal phonology. Another example of this distinction is that Dutch simplex and derived words do not have geminate consonants. The past tense form eette /e:t-te/ of the verb eet ‘to eat’, with the past tense suffix -te, for instance, is realized as [e:te]. Hence, there is an obligatory rule of degemination as part of the word phonology of Dutch. Across word boundaries, however, geminate consonants do occur, but they may be reduced to one consonant in casual or fast speech. Thus, the same process may function as an obligatory rule at the level of word phonology, and as an optional rule in phrasal phonology. The term ‘lexical phonology’ is also used to denote a particular theory about the interaction between morphology and phonology, in which the distinction between word phonology and phrasal phonology discussed above plays an important role. This theory will be discussed in the next section.
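The ordering of word phonology before phrasal phonology in the vinder versus vind er pair can be sketched as follows. This is a toy model under stated assumptions, not a description of Dutch phonology: ordinary letters stand in for segments, the syllabifier assigns at most one consonant to an onset, the devoicing table covers only a few obstruents, and the function names syllabify, final_devoicing, word_level, and cliticize are hypothetical.

```python
VOWELS = set("aeiou")
DEVOICE = {"b": "p", "d": "t", "v": "f", "z": "s"}  # illustrative subset only


def syllabify(segments):
    """Split into syllables, giving each non-initial vowel one onset consonant if available."""
    boundaries = []
    seen_vowel = False
    for i, seg in enumerate(segments):
        if seg in VOWELS:
            if seen_vowel:
                boundaries.append(i - 1 if segments[i - 1] not in VOWELS else i)
            seen_vowel = True
    syllables, prev = [], 0
    for b in boundaries:
        syllables.append(segments[prev:b])
        prev = b
    syllables.append(segments[prev:])
    return syllables


def final_devoicing(syllables):
    """Devoice obstruents that follow the vowel of their syllable (coda position)."""
    out = []
    for syl in syllables:
        nucleus = max((i for i, s in enumerate(syl) if s in VOWELS), default=-1)
        coda = "".join(DEVOICE.get(s, s) for s in syl[nucleus + 1:])
        out.append(syl[:nucleus + 1] + coda)
    return out


def word_level(stem, suffix=""):
    """Word phonology: morphology first, then syllabification and final devoicing."""
    return "".join(final_devoicing(syllabify(stem + suffix)))


def cliticize(word_form, clitic):
    """Postlexical step: resyllabify host plus clitic; devoicing does not reapply."""
    return ".".join(syllabify(word_form + clitic))


finder = ".".join(syllabify(word_level("vind", "er")))   # suffix added in the lexicon
find_her = cliticize(word_level("vind"), "er")           # clitic added after syntax
```

In this sketch the suffixed noun keeps its /d/ because the consonant is an onset when devoicing applies, whereas the verb is devoiced as a word before the clitic is attached.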

Lexical Phonology and Morphology

Lexical phonology is a theory about the interface between phonology and morphology developed by Paul Kiparsky (1982, 1985) and a number of other phonologists. The basic issue is to what extent and how the morphological structure of words determines their phonetic realization. Lexical phonology may be qualified as a set of related but independent hypotheses about the morphology–phonology interface (see Booij, 2000 for a detailed survey). The basic claim of lexical phonology is that morphology and the rules of word phonology apply in tandem. Given a word with its underlying phonological form, the relevant rules of word phonology are applied to that word. You may then apply a morphological rule to that word in its derived phonological form. This creates a new domain of application for the rules of word phonology. Thus, we derive the lexical phonetic forms of words. These words will subsequently be combined into phrases and larger constituents by the rules of syntax. Postlexical phonology is accounted for by a component of postlexical phonology that applies after syntax. Thus, the organization of the grammar is proposed as in Figure 1.

Figure 1 Lexical phonology.

Suppose we want to compute the phonetic form of the 1.sg. form of the Dutch verb vind /vind/ ‘to find’, which is vind [vint]. The stem for vind is first fed into morphology, which has no phonological effect since there is no overt morphological ending for 1.sg. verbs. The form is then syllabified, as one syllable (vind). To this form, the rule of final devoicing applies, resulting in the form [vint]. This is the form that is fed into syntax. At the postlexical level, this form may undergo further phonological rules such as assimilation. We do not have to label the rule of final devoicing explicitly as a lexical rule. Instead, we might assume the principle ‘apply a rule when possible.’ Thus, the rules of syllabification will apply to a word. Subsequently, the presence of syllable structure will trigger application of the rule of final devoicing to voiced obstruents in syllable-final position. When we derive the deverbal noun vinder with the phonetic form [vin.der] we might proceed as follows. On the assumption that stems do not trigger the application of phonological rules, the stem vind is fed into morphology, where the noun vinder is created. Subsequently, this word will be syllabified as vin.der. The /d/ is now in onset position and hence it will not be devoiced, as required. This word is then, with this computed phonological form, available for lexical insertion into syntactic structure.

The theory of lexical phonology in its original form claimed that word phonological rules may apply cyclically, in an outward fashion. The concept of cyclic rule application originates from Chomsky and Halle (1968) (usually referred to as SPE), which proposed to apply the stress rules of English in a cyclic fashion. For example, the stress patterns of the word condensation in the (deverbal) interpretation ‘the act of condensing’ and of the similar word compensation are derived in two steps. On the first cycle, main stress is assigned to their verbal stems condense and compensate respectively, which results in main stress on the second and first syllable respectively. The addition of the suffix -ation creates a second domain of application of the English main stress rule, with main stress on the first vowel of -ation. There is also some degree of stress on the word-initial syllables of these words. These words differ, however, in the phonetic realization of the second vowel. The second, unstressed vowel of compensation, which is unstressed on both cycles, is realized as a schwa, whereas the second vowel of condensation is pronounced as a full vowel [e] since it still has some degree of stress, a reflex of the stress pattern of its verbal stem condense. The difference in phonetic realization of the second vowel of these words thus reflects their derivational history (Chomsky and Halle, 1968: 116).
Whereas in SPE cyclic application of the English stress rules is stipulated, in lexical phonology this manner of application follows from the organization of the grammar as outlined above, since the derived phonological forms of words can be fed back into the morphological component. Therefore, it has been proposed that the rules of lexical phonology apply as soon as possible, hence cyclically. An additional advantage of this theory is that it predicts that morphological operations can be dependent on derived phonological properties of words. An example of this kind of dependence is that the English suffix -al can only be attached to verbs that bear main stress on their final syllable; hence the difference between try–trial versus organize–*organizal (the only exception to this constraint is bury–burial). If stress can be computed by rule in a cyclic fashion, the grammar will first assign stress to the verbal stems, and subsequently, the word formation rule for -al can be applied to verbal stems provided with the required stress pattern.

This proposal raises a number of issues. First, stress is a property of syllables. Hence, if stress is cyclic, syllabification must be applied in a cyclic fashion as well. Affixation may, however, affect the syllabification of the stem. In particular, stem-final consonants will be syllabified as codas, but syllabified as onsets when a vowel-initial suffix is attached to that stem, as in find–fin.der. Cyclic syllabification thus implies some form of resyllabification of the stem on the next cycle. Secondly, it has become clear that certain phonological rules should not apply cyclically. This is, for example, the case for the Dutch rule of final devoicing, which should not apply to the stem of the deverbal noun vind-er discussed above because otherwise the wrong phonetic form [vin.ter] is predicted. Therefore, Booij and Rubach (1987) proposed a refinement of the model of lexical phonology, and introduced a third level, a category of presyntactic word-level rules (also called postcyclic rules) that apply after the set of cyclic phonological rules, but still within the lexicon. Hence, we get three levels of application of phonological rules:

1. Cyclic phonological rules (interacting with word formation)
2. Word-level phonological rules
3. Postlexical phonological rules.

A related hypothesis is that cyclic rules are subject to the condition that they apply in derived environments only. This means that they can only apply in a context created by the application of a previous phonological or morphological operation (see Booij, 2000 for detailed discussion). Furthermore, it has been hypothesized that lexical rules are structure preserving; that is, they only introduce phonological segments that also occur in the underlying forms of words. In some varieties of lexical phonology one also finds the hypothesis of level ordering.
This hypothesis claims that the morphological processes of a language may be organized in two or more levels or strata, each with their own set of phonological rules applying to the complex words created at that level. For English, for instance, it has been proposed that there are two levels. Level 1 is the level of nonnative suffixation, which triggers the application of a specific set of phonological rules such as the main stress rule. Level 2 is the level for stress-neutral (native derivational and inflectional) suffixes. At this level, the main stress rule no longer applies, thus accounting for the stress-neutrality of these suffixes. This level-ordering hypothesis also predicts that stress-neutral suffixes are peripheral to those that influence the stress patterns of words, such as the stress-bearing nonnative (Romance) suffixes of English. For instance, in the complex noun contrastiveness the stress-shifting suffix -ive precedes the stress-neutral suffix -ness. The level-ordering hypothesis for English is discussed and criticized in detail by Fabb (1988) and Plag (1999), and defended in Giegerich (1999).
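The ordering prediction of the level-ordering hypothesis can be stated as a simple check on suffix sequences. The sketch below is illustrative only: the function respects_level_ordering and the table LEVEL are hypothetical, and apart from -ive (level 1) and -ness (level 2), which follow the contrastiveness example, the level assignments are assumptions made for the demonstration. As the text notes, the hypothesis itself is contested.

```python
# Assumed level assignments; only -ive and -ness are taken from the example in the text.
LEVEL = {"ive": 1, "ity": 1, "ation": 1, "ness": 2, "less": 2, "ful": 2}


def respects_level_ordering(suffixes):
    """True if no level 1 suffix is attached outside a level 2 suffix.

    `suffixes` lists the suffixes from innermost to outermost.
    """
    levels = [LEVEL[s] for s in suffixes]
    return all(inner <= outer for inner, outer in zip(levels, levels[1:]))


ok = respects_level_ordering(["ive", "ness"])    # contrast-ive-ness
bad = respects_level_ordering(["less", "ity"])   # level 1 outside level 2
```

A sequence passes when the level numbers never decrease from the stem outward, which is one way of saying that stress-neutral suffixes are peripheral to stress-affecting ones.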

Constraint-Based Phonology

To what extent do the insights of lexical phonology carry over to constraint-based phonological theories such as optimality theory (OT)? An important insight of lexical phonology is that morphological structure may determine the domain of application of phonological constraints. This insight has been carried over to OT in the subtheory of alignment (McCarthy and Prince, 1994). In this approach, the boundaries of prosodic constituents such as syllables and prosodic words must be aligned as much as possible with morphological boundaries. Thus, morphological boundaries codetermine the boundaries of prosodic constituents, and hence indirectly the domain of application of phonological rules. The effects of cyclic application of phonological rules within complex words in rule-based frameworks might be obtained by making use of output–output identity constraints that refer to the phonetic form of morphologically related words or word forms (Benua, 2000). The distinction between word phonology and phrase phonology can be carried over to OT in the form of derivational optimality theory (DOT), as advocated in Booij (1997), Rubach (2000), and Ito and Mester (2001). This means that the evaluation of the candidate phonetic forms of words takes place in two steps, first at the word level, and subsequently at the postlexical level. Thus a restricted form of derivation is maintained. This approach can account for the difference in phonetic form between vinder and vind er mentioned above.

See also: Generative Phonology; Morphotactics; Rule Ordering and Derivation in Phonology.

Bibliography

Benua L (2000). Phonological relations between words. New York/London: Garland.
Booij G (1997). ‘Non-derivational phonology meets lexical phonology.’ In Roca I (ed.) Derivations and constraints in phonology. Oxford: Oxford University Press. 261–288.
Booij G (2000). ‘The phonology–morphology interface.’ In Cheng L & Sybesma R (eds.) The first Glot International state-of-the-article book. Berlin/New York: Mouton de Gruyter. 287–306.
Booij G & Rubach J (1987). ‘Postcyclic versus postlexical rules in lexical phonology.’ Linguistic Inquiry 18, 1–44.
Chomsky N & Halle M (1968). The sound pattern of English. New York: Harper and Row.
Fabb N (1988). ‘English suffixation is constrained only by selectional restrictions.’ Natural Language and Linguistic Theory 6, 527–539.
Giegerich H (1999). Lexical strata in English: morphological causes, phonological effects. Cambridge: Cambridge University Press.
Ito J & Mester A (2001). ‘Structure preservation and stratal opacity in German.’ In Lombardi L (ed.) Segmental phonology in optimality theory. Cambridge: Cambridge University Press. 261–295.
Kiparsky P (1982). ‘From cyclic phonology to lexical phonology.’ In van der Hulst H & Smith N (eds.) The structure of phonological representations, vol. 1. Dordrecht: Foris. 131–175.
Kiparsky P (1985). ‘Some consequences of lexical phonology.’ Phonology Yearbook 2, 85–138.
McCarthy J J & Prince A (1994). ‘General alignment.’ In Booij G & Marle J van (eds.) Yearbook of Morphology 1993. Dordrecht: Kluwer. 79–154.
Plag I (1999). Morphological productivity: structural constraints in English derivation. Berlin/New York: Mouton de Gruyter.
Rubach J (2000). ‘Glide and glottal stop insertion in Slavic languages: a DOT analysis.’ Linguistic Inquiry 31, 271–317.

98 Lexical Semantics: Overview

Lexical Semantics: Overview J Pustejovsky, Brandeis University, Waltham, MA, USA © 2006 Elsevier Ltd. All rights reserved.

Word Knowledge Semantic interpretation requires access to knowledge about words. The lexicon of a grammar must provide a systematic and efficient way of encoding the information associated with words in a language. Lexical semantics is the study of what words mean and how they structure these meanings. This article examines word meaning from two different perspectives: the information required for composition in the syntax and the knowledge needed for semantic interpretation. The lexicon is not merely a collection of words with their associated phonetic, orthographic, and semantic forms. Rather, lexical entries are structured objects that participate in larger operations and compositions, both enabling syntactic environments and acting as signatures to semantic entailments and implicatures in the context of larger discourse. There are four basic questions in modeling the semantic content and structure of the lexicon: (1) What semantic information goes into a lexical entry? (2) How do lexical entries relate semantically to one another? (3) How is this information exploited compositionally by the grammar? and (4) How is this information available to semantic interpretation generally? This article focuses on the first two. The lexicon and lexical semantics have traditionally been viewed as the most passive modules of language, acting in the service of the more dynamic components of the grammar. This view has its origins in the generative tradition (Chomsky, 1955) and has been an integral part of the notion of the lexicon ever since. While the aspects model of selectional features (Chomsky, 1965) restricted the relation of selection to that between lexical items, work by McCawley (1968) and Jackendoff (1972) showed that selectional restrictions must be available to computations at the level of derived semantic representation rather than at deep structure. Subsequent work by Bresnan (1982), Gazdar et al. 
(1985), and Pollard and Sag (1994) extended the range of phenomena that can be handled by the projection and exploitation of lexically derived information in the grammar. With the convergence of several areas in linguistics (lexical semantics, computational lexicons, and type theories), several models for the determination of selection have emerged that put even more compositional power in the lexicon, making explicit reference to the paradigmatic systems that allow for grammatical constructions to be partially determined by selection. Examples of this approach are generative lexicon theory (Bouillon and Busa, 2001; Pustejovsky, 1995) and construction grammar (Goldberg, 1995; Jackendoff, 1997, 2002). These developments have helped to characterize the approaches to lexical design in terms of a hierarchy of semantic expressiveness. There are at least three such classes of lexical description: sense enumerative lexicons, where lexical items have a single type and meaning, and ambiguity is treated by multiple listings of words; polymorphic lexicons, where lexical items are active objects, contributing to the determination of meaning in context, under well-defined constraints; and unrestricted sense lexicons, where the meanings of lexical items are determined mostly by context and conventional use. The most promising direction seems to be a careful and formal elucidation of the polymorphic lexicons, and this will form the basis of the subsequent discussion of both the structure and the content of lexical entries.

Historical Overview The study of word meaning has occupied philosophers for centuries, beginning at least with Aristotle’s theory of meaning. Locke, Hume, and Reid all paid particular attention to the meanings of words, but not until the 19th century did the rise of philological and psychological investigations of word meaning occur, with Bréal (1897), Erdmann (1900), Trier (1931), Stern (1931/1968), and others focused on word connotation, semantic drift, and word associations in the mental lexicon as well as in social contexts. Interestingly, Russell, Frege, and other early analytic philosophers were not interested in language as a linguistic phenomenon but simply as the medium through which judgments can be formed and expressed. Hence, there is little regard for the relations between senses of words, when not affecting the nature of judgment, for example, within intensional contexts. Nineteenth-century semanticists and semasiologists, on the other hand, viewed polysemy as the life force of human language. Bréal, for example, considered it to be a necessary creative component of language and argued that this phenomenon better than most in semantics illustrates the cognitive and conceptualizing force of the human species. Even with their obvious enthusiasm, semasiology produced no lasting legacy to the study of lexical semantics. In fact, there was no systematic research into lexical meaning until structural linguists extended the relational techniques of Saussure (1916/1983) and elaborated the framework of componential analysis for language meaning (Jakobson, 1970).


The idea behind componential analysis is the reduction of a word’s meaning into its ultimate contrastive elements. These contrastive elements are structured in a matrix, allowing for dimensional analysis and generalizations to be made about lexical sets occupying the cells in the matrix. This technique developed into a general framework for linguistic description called distinctive feature analysis (Jakobson and Halle, 1956). This is essentially the inspiration for Katz and Fodor’s 1963 theory of lexical semantics within transformational grammar. In this theory, usually referred to as ‘markerese,’ a lexical entry in the language consists of grammatical and semantic markers and a special feature called a ‘semantic distinguisher.’ In Weinreich (1972) and much subsequent discussion, it was demonstrated that this model is far too impoverished to characterize the compositional mechanisms inherent in language. In the late 1960s and early 1970s, alternative models of word meaning emerged (Fillmore, 1965; Gruber, 1965; Jackendoff, 1972; Lakoff, 1965) that respected the relational structure of sentence meaning while encoding the named semantic functions in lexical entries. In Dowty (1979), a model theoretic interpretation of the decompositional techniques of Lakoff, McCawley, and Ross was developed. Recently, the role of lexical–syntactic mapping has become more evident, particularly with the growing concern over projection from lexical semantic form, the problem of verbal alternations and polyvalency, and the phenomenon of polysemy.

Ambiguity and Polysemy Given the compactness of a lexicon relative to the number of objects and relations in the world, and the concepts we have for them, lexical ambiguity is inevitable. Add to this the cultural, historical, and linguistic blending that contributes to the meanings of our lexical items, and ambiguity can appear arbitrary as well. Hence, ‘homonymy’ – where one lexical form has many meanings – is to be expected in a language. Examples of homonyms are illustrated in the following sentences: (1a) Mary walked along the bank of the river. (1b) Bank of America is the largest bank in the city. (2a) Drop me a line when you are in Boston. (2b) We built a fence along the property line. (3a) First we leave the gate, then we taxi down the runway. (3b) John saw the taxi on the street. (4a) The judge asked the defendant to approach the bar. (4b) The defendant was in the pub at the bar.

Weinreich (1964) calls such lexical distinctions ‘contrastive ambiguity,’ where it is clear that the senses associated with the lexical item are unrelated. For this reason, it is generally assumed that homonyms are represented as separate lexical entries within the organization of the lexicon. This accords with a view of lexical organization that has been termed a ‘sense enumeration lexicon’ (cf. Pustejovsky, 1995). That is, a lexicon is sense enumerative when every word that has multiple senses stores these senses as separate lexical entries. This model becomes difficult to maintain, however, when we consider the phenomenon known as ‘polysemy.’ Polysemy is the relationship that exists between different senses of a word that are related in some logical manner rather than arbitrarily, as in the previous examples. We can distinguish three broad types of polysemy, each presenting a novel set of challenges to lexical semantics and linguistic theory. a. Deep semantic typing: single argument polymorphism b. Syntactic alternations: multiple argument polymorphism c. Dot objects: lexical reference to objects that have multiple facets The first class refers mainly to functors allowing a range of syntactic variation in a single argument. For example, aspectual verbs (begin and finish), perception verbs (see, hear), and most propositional attitude verbs (know, believe) subcategorize for multiple syntactic forms in complement position, as illustrated in (5)–(7): (5a) Mary began to read the novel. (5b) Mary began reading the novel. (5c) Mary began the novel. (6a) Bill saw John leave. (6b) Bill saw John leaving. (6c) Bill saw John. (7a) Mary believes that John told the truth. (7b) Mary believes what John said. (7c) Mary believes John’s story.

What these and many other cases of multiple selection share is that the underlying relation between the verb and each of its complements is essentially identical. For example, in (7), the complement to the verb believe in all three sentences is a proposition; in (5), what is begun in each sentence is an event of some sort; and in (6), the object of the perception is (arguably) an event in each case. This has led some linguists to argue for semantic selection (cf. Chomsky, 1986; Grimshaw, 1979) and others to argue for structured


selectional inheritance (Godard and Jayez, 1993). In fact, these perspectives are not that distant from one another (cf. Pustejovsky, 1995): in either view, there is an explicit lexical association between syntactic forms that is formally modeled by the grammar. The second type of polysemy (syntactic alternations) involves verbal forms taking arguments in alternating constructions, the so-called ‘verbal alternations’ (cf. Levin, 1993). These are true instances of polysemy because there is a logical (typically causal) relation between the two senses of the verb. As a result, the lexicon must either relate the senses through lexical rules (such as in head-driven phrase structure grammar (HPSG) treatments; cf. Pollard and Sag, 1994) or assume that there is one lexical form that has multiple syntactic realizations (cf. Pustejovsky and Busa, 1995). (8a) The window opened suddenly. (8b) Mary opened the window suddenly. (9a) Bill began his lecture on time. (9b) The lecture began on time. (10a) The milk spilled onto the table. (10b) Mary spilled the milk onto the table.

The final form of polysemy reviewed here is encountered mostly in nominals and has been termed ‘regular polysemy’ (cf. Apresjan, 1973) and ‘logical polysemy’ (cf. Pustejovsky, 1991) in the literature; it is illustrated in the following sentences: (11a) Mary carried the book home. (11b) Mary doesn’t agree with the book. (12a) Mary has her lunch in her backpack. (12b) Lunch was longer today than it was yesterday. (13a) The flight lasted 3 hours. (13b) The flight landed on time in Los Angeles.

Notice that in each of the pairs, the same nominal form is assuming different semantic interpretations relative to its selective context. For example, in (11a) the noun book refers to a physical object, whereas in (11b) it refers to the informational content. In (12a), lunch refers to the physical manifestation of the food, whereas in (12b) it refers to the eating event. Finally, in (13a) flight refers to the flying event, whereas in (13b) it refers to the plane. This phenomenon of polysemy is one of the most challenging in the area and has stimulated much research (Bouillon, 1997; Bouillon and Busa, 2001). In order to understand how each of these cases of polysemy can be handled, we must first familiarize ourselves with the structure of individual lexical entries.

Lexical Relations Another important aspect of lexical semantics is the study of how words are semantically related to one another. Four classes of lexical relations, in particular, are important to recognize: synonymy, antonymy, hyponymy, and meronymy. Synonymy is generally taken to be a relation between words rather than concepts. One fairly standard definition states that two expressions are synonymous if substituting one for the other in all contexts does not change the truth value of the sentence where the substitution is made (cf. Cruse, 1986, 2004; Lyons, 1977). A somewhat weaker definition makes reference to the substitution relative to a specific context. For example, in the context of carpentry, plank and board might be considered synonyms, but not necessarily in other domains (cf. Miller et al., 1990). The relation of antonymy is characterized in terms of semantic opposition and, like synonymy, is properly defined over pairs of lexical items rather than concepts. Examples of antonymy are rise/fall, heavy/light, fast/slow, and long/short (cf. Cruse, 1986; Miller, 1991). It is interesting to observe that co-occurrence data illustrate that synonyms do not necessarily share the same antonyms. For example, rise and ascend as well as fall and descend are similar in meaning, yet neither fall/ascend nor rise/descend are antonym pairs. For further details see Miller et al. (1990). The most studied relation in the lexical semantic community is hyponymy, the taxonomic relationship between words, as defined in WordNet (Fellbaum, 1998) and other semantic networks. For example, specifying car as a hyponym of vehicle is equivalent to saying that vehicle is a superconcept of the concept car or that the set car is a subset of those individuals denoted by the set vehicle. One of the most difficult lexical relations to define and treat formally is that of meronymy, the relation of parts to the whole. 
The relation is familiar from knowledge representation languages with predicates or slot-names such as ‘part-of’ and ‘made-of.’ For treatments of this relation in lexical semantics, see Miller et al. (1990) and Cruse (1986).
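The subset reading of hyponymy described above can be made concrete with a small sketch. The taxonomy below is a hypothetical toy example: its entries and the names HYPERNYM, hypernym_chain, and is_hyponym_of are assumptions made for illustration, not WordNet data or its interface. It shows only that hyponymy, read as inclusion, is transitive and asymmetric.

```python
# Hypothetical toy taxonomy: each word maps to its immediate hypernym.
# The entries are illustrative assumptions, not taken from WordNet.
HYPERNYM = {
    "car": "vehicle",
    "truck": "vehicle",
    "vehicle": "artifact",
}


def hypernym_chain(word):
    """Return the hypernyms of `word`, nearest first."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain


def is_hyponym_of(word, ancestor):
    """True if `ancestor` lies anywhere above `word` in the taxonomy."""
    return ancestor in hypernym_chain(word)


print(hypernym_chain("car"))
print(is_hyponym_of("car", "artifact"))
print(is_hyponym_of("vehicle", "car"))
```

A single-parent dictionary is the simplest possible choice; a taxonomy that allows multiple inheritance would need a mapping from each word to a set of hypernyms.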

The Semantics of a Lexical Entry It is generally assumed that there are four components to a lexical item: phonological, orthographic, syntactic, and semantic information. Here, we focus first on syntactic features and then on what semantic information must be encoded in an individual lexical entry.


There are two types of syntactic knowledge associated with a lexical item: its category and its subcategory. The former includes traditional classifications of both the major categories, such as noun, verb, adjective, adverb, and preposition, as well as the minor categories, such as adverbs, conjunctions, quantifier elements, and determiners. Knowledge of the subcategory of a lexical item is typically information that differentiates categories into distinct, distributional classes. This sort of information may be usefully separated into two types, contextual features and inherent features. The former are features that may be defined in terms of the contexts in which a given lexical entry may occur. Subcategorization information marks the local syntactic context for a word. It is this information that ensures that the verb devour, for example, is always transitive in English, requiring a direct object; the lexical entry encodes this requirement with a subcategorization feature specifying that a noun phrase (NP) appear to its right. Another type of context encoding is collocational information, where patterns that are not fully productive in the grammar can be tagged. For example, the adjective heavy as applied to drinker and smoker is collocational and not freely productive in the language (Mel’čuk, 1988). ‘Inherent features,’ on the other hand, are properties of lexical entries that are not easily reduced to a contextual definition but, rather, refer to the ontological typing of an entity. These include such features as count/mass (e.g., pebble vs. water), abstract, animate, human, physical, and so on. Lexical items can be systematically grouped according to their syntactic and semantic behavior in the language. For this reason, there have been two major traditions of word clustering, corresponding to this distinction.
Broadly speaking, for those concerned mainly with grammatical behavior, the most salient aspect of a lexical item is its argument structure; for those focusing on a word’s entailment properties, the most important aspect is its semantic class. In this section, these two approaches are examined and it is shown how their concerns can be integrated into a common lexical representation.

Lexical Semantic Classifications

Conventional approaches to lexicon design and lexicography have been relatively informal with regard to forming taxonomic structures for the word senses in the language. For example, the top concepts in WordNet (Miller et al., 1990) illustrate how words are characterized by local clusterings of semantic properties. As with many ontologies, however, it is difficult to discern a coherent global structure for the resulting classification beyond a weak descriptive labeling of words into extensionally defined sets. One of the most common ways to organize lexical knowledge is by means of type or feature inheritance mechanisms (Carpenter, 1992; Copestake and Briscoe, 1992; Evans and Gazdar, 1990; Pollard and Sag, 1994). Furthermore, Briscoe et al. (1993) described a rich system of types for allowing default mechanisms into lexical type descriptions. Similarly, type structures, such as that shown in Figure 1, can express the inheritance of syntactic and semantic features, as well as the relationship between syntactic classes and alternations (cf. Alsina, 1992; Davis, 1996; Koenig and Davis, 1999; Sanfilippo, 1993) and other relations (cf. Pustejovsky, 2001; Pustejovsky and Boguraev, 1993). In the remainder of this section, we first examine the approach to characterizing the weak constraints imposed on a lexical item associated with its arguments. Then, we examine attempts to model lexical behavior by means of internal constraints imposed on the predicate. Finally, it is shown how, in some respects, these are very similar enterprises and both sets of constraints may be necessary to model lexical behavior.

Figure 1 Type structures.

Argument Structure

Once the base syntactic and semantic typing for a lexical item has been specified, its subcategorization and selectional information must be encoded in some form. There are two major techniques for representing this type of knowledge: 1. Associate ‘named roles’ with the arguments to the lexical item (Fillmore, 1985; Gruber, 1965; Jackendoff, 1972). 2. Associate a logical decomposition with the lexical item; meanings of arguments are determined by how the structural properties of the representation are interpreted (cf. Hale and Keyser, 1993; Jackendoff, 1983; Levin and Rappaport, 1995).


One influential way of encoding selectional behavior has been the theory of thematic relations (cf. Gruber, 1976; Jackendoff, 1972). Thematic relations are now generally defined as partial semantic functions of the event being denoted by the verb or noun, and they behave according to a predefined calculus of roles relations (e.g., Dowty, 1989). For example, semantic roles such as agent, theme, and goal can be used to partially determine the meaning of a predicate when they are associated with the grammatical arguments to a verb. (14a) put (14b) borrow

Thematic roles can be ordered relative to each other in terms of an implicational hierarchy. For example, there is considerable use of a universal subject hierarchy such as is shown in the following (cf. Comrie, 1981; Fillmore, 1968): (15) AGENT > RECIPIENT/BENEFACTIVE > THEME/PATIENT > INSTRUMENT > LOCATION
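One common reading of a subject hierarchy such as (15) is that, among the roles a predicate assigns, the highest-ranked one is the preferred candidate for subject. The sketch below encodes that reading; the reading itself, the numeric ranks, and the names RANK and preferred_subject are assumptions made for illustration rather than part of the hierarchy as stated.

```python
# Ranks follow the order in (15); a lower number means a higher position.
# Roles separated by a slash in (15) share a rank (an assumption).
RANK = {
    "AGENT": 0,
    "RECIPIENT": 1,
    "BENEFACTIVE": 1,
    "THEME": 2,
    "PATIENT": 2,
    "INSTRUMENT": 3,
    "LOCATION": 4,
}


def preferred_subject(roles):
    """Return the role in `roles` that ranks highest in the hierarchy."""
    return min(roles, key=lambda role: RANK[role])


print(preferred_subject(["THEME", "AGENT", "LOCATION"]))
print(preferred_subject(["INSTRUMENT", "THEME"]))
```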

Many linguists have questioned the general explanatory coverage of thematic roles, however, and have chosen alternative methods for capturing the generalizations they promised. Dowty (1991) suggested that theta-role generalizations are best captured by entailments associated with the predicate. A theta-role can then be seen as the set of predicate entailments that are properties of a particular argument to the verb. Characteristic entailments might be thought of as prototype roles, or proto-roles; this allows for degrees or shades of meaning associated with the arguments to a predicate. Others have opted for a more semantically neutral set of labels to assign to the parameters of a relation, whether it is realized as a verb, noun, or adjective. For example, the theory of argument structure as developed by Williams (1981), Grimshaw (1990), and others can be seen as a move toward a more minimalist description of semantic differentiation in the verb’s list of parameters. The argument structure for a word can be seen as the simplest specification of its semantics, indicating the number and type of parameters associated with the lexical item as a predicate. For example, the verb die can be represented as a predicate taking one argument, kill as taking two arguments, whereas the verb give takes three arguments: (16a) die (x) (16b) kill (x,y) (16c) give (x,y,z)
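The argument lists in (16) can be written down as a minimal data structure. This is only a sketch: the dictionary ARGUMENT_STRUCTURE and the helpers arity and is_saturated are hypothetical names, and the representation records nothing beyond the number and order of parameters.

```python
# Argument lists as in (16): each verb maps to a tuple of parameter names.
ARGUMENT_STRUCTURE = {
    "die": ("x",),
    "kill": ("x", "y"),
    "give": ("x", "y", "z"),
}


def arity(verb):
    """Number of parameters the verb takes as a predicate."""
    return len(ARGUMENT_STRUCTURE[verb])


def is_saturated(verb, arguments):
    """True if exactly as many arguments are supplied as the verb requires."""
    return len(arguments) == arity(verb)


print(arity("give"))
print(is_saturated("kill", ["John", "Bill"]))
print(is_saturated("die", ["John", "Bill"]))
```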

What originally began as the simple listing of the parameters or arguments associated with a predicate

has developed into a sophisticated view of the way arguments are mapped onto syntactic expressions. Williams’s (1981) distinction between external (the underlined arguments above) and internal arguments and Grimshaw’s proposal for a hierarchically structured representation (cf. Grimshaw, 1990) provide us with the basic syntax for one aspect of a word’s meaning. Similar remarks hold for the argument list structure in HPSG (Pollard and Sag, 1994) and Lexical Functional Grammar (Bresnan, 1994). The interaction of a structured argument list and a rich system of types, such as that presented previously, provides a mechanism for semantic selection through inheritance. Consider, for instance, the sentence pairs in (17): (17a) The man/the rock fell. (17b) The man/*the rock died.

Now consider how the selectional distinction for a feature such as animacy is modeled so as to explain the selectional constraints of predicates. For the purpose of illustration, the arguments of a verb will be identified as being typed from the system shown previously. (18a) λx: physical[fall(x)] (18b) λx: animate[die(x)]
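The typed predicates in (18) can be read as checks on the type of the argument. The sketch below is one way to make the judgments in (17) mechanical, under stated assumptions: the small hierarchy is hypothetical (in particular the type label 'inanimate' and the placement of 'animate' under 'physical' are invented for illustration), as are the names SUPERTYPE, NOUN_TYPE, REQUIRES, is_subtype, and selects.

```python
# Hypothetical type hierarchy: each type maps to its immediate supertype.
SUPERTYPE = {
    "human": "animate",
    "animate": "physical",
    "inanimate": "physical",  # assumed label for non-animate physical things
}

# Assumed types for the nouns in (17).
NOUN_TYPE = {"man": "human", "rock": "inanimate"}

# Argument types required by the predicates in (18).
REQUIRES = {"fall": "physical", "die": "animate"}


def is_subtype(candidate, target):
    """True if `candidate` equals `target` or lies below it in the hierarchy."""
    while candidate is not None:
        if candidate == target:
            return True
        candidate = SUPERTYPE.get(candidate)
    return False


def selects(verb, noun):
    """True if the noun's type satisfies the verb's argument requirement."""
    return is_subtype(NOUN_TYPE[noun], REQUIRES[verb])


for verb in ("fall", "die"):
    for noun in ("man", "rock"):
        print(verb, noun, selects(verb, noun))
```

Walking upward from the noun's type is a simple stand-in for traversal of a type lattice; it assumes each type has at most one immediate supertype.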

In the sentences in (17), it is clear how rocks cannot die and men can, but it is still not obvious how this judgment is computed, given what we would assume are the types associated with the nouns rock and man, respectively. What accomplishes this computation is a rule of subtyping, Θ, that allows the type associated with the noun man (i.e., ‘human’) to also be accepted as the type ‘animate,’ which is what the predicate die requires of its argument as stated in (18b) (cf. Carpenter, 1992). (19) Θ[human ⊑ animate]: human → animate

The rule Θ applies since the concept ‘human’ is subtyped under ‘animate’ in the type hierarchy. Parallel considerations rule out the noun rock as a legitimate argument to die since it is not subtyped under ‘animate.’ Hence, one of the concerns given previously for how syntactic processes can systematically keep track of which ‘selectional features’ are entailed and which are not is partially addressed by such lattice traversal rules as the one presented here.

Event Structure and Lexical Decomposition

The second approach to lexical specification mentioned previously is to define constraints internally to the predicate. Traditionally, this has been known as ‘lexical decomposition.’ In this section, we review


the motivations for decomposition in linguistic theory and the proposals for encoding lexical knowledge as structured objects. We then relate this to the way in which verbs can be decomposed in terms of eventualities (Tenny and Pustejovsky, 2000). Since the 1960s, lexical semanticists have attempted to formally model the semantic relations between lexical items such as between the adjective dead and the verbs die and kill (cf. Lakoff, 1965; McCawley, 1968) in the following sentences: (20a) John killed Bill. (20b) Bill died. (20c) Bill is dead.

Assuming the underlying form for a verb such as kill directly encodes the stative predicate in (20c) and the relation of causation, generative semanticists posited representations such as (21): (21) (CAUSE (x, (BECOME (NOT (ALIVE y)))))

Here the predicate CAUSE is represented as a relation between an individual causer x and an expression involving a change of state in the argument y. Carter (1976) proposes a representation quite similar, shown here for the causative verb darken: (22) (x CAUSE ((y BE DARK) CHANGE))
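Decompositions such as (21) can be treated as nested terms, which makes the relation among kill, die, and dead in (20) visible as structural embedding. The sketch below assumes a tuple encoding and that die and dead correspond to the BECOME and NOT ALIVE subterms of (21); the function names kill, die, dead, subterms, and embeds are hypothetical. It illustrates containment of one representation in another and is not a semantics for CAUSE or BECOME.

```python
def dead(y):
    """Stative core, as in (20c)."""
    return ("NOT", ("ALIVE", y))


def die(y):
    """Change of state into the stative core, as in (20b)."""
    return ("BECOME", dead(y))


def kill(x, y):
    """Causation of the change of state, following (21)."""
    return ("CAUSE", x, die(y))


def subterms(term):
    """Yield the term itself and every term nested inside it."""
    yield term
    if isinstance(term, tuple):
        for part in term[1:]:
            yield from subterms(part)


def embeds(outer, inner):
    """True if `inner` occurs somewhere inside `outer`."""
    return any(sub == inner for sub in subterms(outer))


print(kill("John", "Bill"))
print(embeds(kill("John", "Bill"), die("Bill")))
print(embeds(die("Bill"), dead("Bill")))
```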

Although there is an intuition that the cause relation involves a causer and an event, neither Lakoff nor Carter make this commitment explicitly. In fact, it has taken several decades for Davidson’s (1967) observations regarding the role of events in the determination of verb meaning to find their way convincingly into the major linguistic frameworks. A new synthesis has emerged that attempts to model verb meanings as complex predicative structures with rich event structures (cf. Hale and Keyser, 1993; Parsons, 1990; Pustejovsky, 1991). This research has developed the idea that the meaning of a verb can be analyzed into a structured representation of the event that the verb designates, and it has furthermore contributed to the realization that verbs may have complex, internal event structures. Recent work has converged on the view that complex events are structured into an inner and an outer event, where the outer event is associated with causation and agency, and the inner event is associated with telicity (completion) and change of state (cf. Tenny and Pustejovsky, 2000). Jackendoff (1990) developed an extensive system of what he calls ‘conceptual representations,’ which parallel the syntactic representations of sentences of natural language. These employ a set of canonical predicates, including CAUSE, GO, TO, and ON, and canonical elements, including Thing, Path, and

Event. These approaches represent verb meaning by decomposing the predicate into more basic predicates. This work owes obvious debt to the innovative work within generative semantics, as illustrated by McCawley’s (1968) analysis of the verb kill. Recent versions of lexical representations inspired by generative semantics can be seen in the lexical relational structures of Hale and Keyser (1993), where syntactic tree structures are employed to capture the same elements of causation and change of state as in the representations of Carter, Levin and Rapoport, Jackendoff, and Dowty. The work of Levin and Rappaport, building on Jackendoff’s lexical conceptual structures, has been influential in further articulating the internal structure of verb meanings (Levin and Rappaport, 1995). Pustejovsky (1991) extended the decompositional approach presented in Dowty (1979) by explicitly reifying the events and subevents in the predicative expressions. Unlike Dowty’s treatment of lexical semantics, where the decompositional calculus builds on propositional or predicative units (as discussed previously) a ‘syntax of event structure’ makes explicit reference to quantified events as part of the word meaning. Pustejovsky further introduced a tree structure to represent the temporal ordering and dominance constraints on an event and its subevents. For example, a predicate such as build is associated with a complex event such as the following (cf. also Moens and Steedman, 1988): (23) [transition[e1:PROCESS] [e2:STATE]]

The process consists of the building activity itself, whereas the state represents the result of there being the object built. Grimshaw (1990) adopted this theory in her work on argument structure, where complex events such as break are given a similar representation. In such structures, the process consists of what x does to cause the breaking, and the state is the resultant state of the broken item. The process corresponds to the outer causing event as discussed previously, and the state corresponds in part to the inner change of state event. Both Pustejovsky and Grimshaw differ from the previous authors in assuming a specific level of representation for event structure, distinct from the representation of other lexical properties. Furthermore, they follow Higginbotham (1985) in adopting an explicit reference to the event place in the verbal semantics. Rappaport and Levin (2001) adopted a large component of the event structure model for their analysis of the resultative construction in English. Event decomposition has also been employed for properties of adjectival selection, the interpretation of compounds, and stage and individual-level predication.
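An event structure such as (23) is a small tree whose daughters are ordered in time. The following sketch assumes a simple class for events; the names Event, transition, and bracketed, and the label e0 for the whole event, are hypothetical. It reproduces the bracket notation of (23) from the tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    label: str                      # e.g. "e1"
    sort: str                       # "PROCESS", "STATE", or "TRANSITION"
    subevents: List["Event"] = field(default_factory=list)  # in temporal order


def transition(process_label="e1", state_label="e2"):
    """Build a transition: a process followed by its resulting state."""
    process = Event(process_label, "PROCESS")
    state = Event(state_label, "STATE")
    return Event("e0", "TRANSITION", [process, state])


def bracketed(event):
    """Render an event tree in the bracket notation used in (23)."""
    if not event.subevents:
        return f"[{event.label}:{event.sort}]"
    inner = " ".join(bracketed(sub) for sub in event.subevents)
    return f"[{event.sort.lower()}{inner}]"


build = transition()
print(bracketed(build))
```

Keeping the subevents in a list encodes their temporal order directly; dominance is given by nesting.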

Qualia Structure

Thus far, we have focused on the lexical semantics of verb entries. All of the major categories, however, are encoded with syntactic and semantic feature structures that determine their constructional behavior and subsequent meaning at logical form. In generative lexicon theory (Pustejovsky, 1995), it is assumed that word meaning is structured on the basis of four generative factors, or ‘qualia roles’, that capture how humans understand objects and relations in the world and provide the minimal explanation for the linguistic behavior of lexical items (these are inspired in large part by Moravcsik’s (1975, 1990) interpretation of Aristotelian aitia). These are the formal role, the basic category that distinguishes the object within a larger domain; the constitutive role, the relation between an object and its constituent parts; the telic role, its purpose and function; and the agentive role, factors involved in the object’s origin or ‘coming into being.’ Qualia structure is at the core of the generative properties of the lexicon since it provides a general strategy for creating new types. For example, consider the properties of nouns such as rock and chair. These nouns can be distinguished on the basis of semantic criteria that classify them in terms of general categories such as natural_kind and artifact_object. Although very useful, this is not sufficient to discriminate semantic types in a way that also accounts for their grammatical behavior. A crucial distinction between rock and chair concerns the properties that differentiate natural_kinds from artifacts: Functionality plays a crucial role in the process of individuation of artifacts but not of natural kinds. This is reflected in grammatical behavior, whereby ‘a good chair’ or ‘enjoy the chair’ are well-formed expressions reflecting the specific purpose for which an artifact is designed, but ‘good rock’ or ‘enjoy a rock’ are semantically ill formed since for rock the functionality (i.e., telic) is undefined.
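The contrast between chair and rock can be sketched as a record with one field per qualia role. This is a minimal illustration under stated assumptions: the class Qualia, the role values 'sit_in', 'make', and 'climb', and the helpers has_function and construe_for are hypothetical and are not the notation of generative lexicon theory. Only the claim that the telic role is defined for chair and undefined for rock comes from the text.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Qualia:
    formal: str                          # basic category
    constitutive: Optional[str] = None   # relation to constituent parts
    telic: Optional[str] = None          # purpose and function
    agentive: Optional[str] = None       # origin or coming into being


# Role values are illustrative assumptions.
CHAIR = Qualia(formal="physical_object", telic="sit_in", agentive="make")
ROCK = Qualia(formal="physical_object")  # telic left undefined


def has_function(entry):
    """True if expressions like 'a good N' can draw on a telic role."""
    return entry.telic is not None


def construe_for(entry, activity):
    """Return a copy of the entry with a telic role supplied by context."""
    return replace(entry, telic=activity)


print(has_function(CHAIR))
print(has_function(ROCK))
print(has_function(construe_for(ROCK, "climb")))
```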
Exceptions exist when new concepts are referred to, such as when the object is construed relative to a specific activity, as in ‘The climber enjoyed that rock’; rock takes on a new meaning by virtue of having telicity associated with it, and this is accomplished by integration with the semantics of the subject NP. Although chair and rock are both physical_object, they differ in their mode of coming into being (i.e., agentive): artifacts are man-made, rocks develop in nature. Similarly, a concept such as food or cookie has a physical manifestation or denotation, but also a functional grounding, pertaining to the relation of ‘eating.’ These apparently contradictory aspects of a category are orthogonally represented by the qualia structure for that concept, which provides a coherent structuring for different dimensions of meaning.

See also: Compositionality: Semantic Aspects; Lexicon, Generative; Semantics of Prosody; Syntax-Semantics Interface.

Bibliography
Alsina A (1992). ‘On the argument structure of causatives.’ Linguistic Inquiry 23(4), 517–555. Apresjan J D (1973). ‘Regular polysemy.’ Linguistics 142, 5–32. Bouillon P (1998). Polymorphie et semantique lexicale: le cas des adjectifs. Lille: Presse du Septentrion. Bouillon P & Busa F (2001). The language of word meaning. Cambridge [England], New York: Cambridge University Press. Bréal M (1897). Essai de sémantique (science des significations). Paris: Hachette. Bresnan J (ed.) (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press. Bresnan J (1994). ‘Locative inversion and the architecture of universal grammar.’ Language 70, 72–131. Briscoe T, de Paiva V & Copestake A (eds.) (1993). Inheritance, defaults, and the lexicon. Cambridge, UK: Cambridge University Press. Carpenter R (1992). ‘Typed feature structures.’ Computational Linguistics 18, 2. Chomsky N (1955). The logical structure of linguistic theory. Chicago: University of Chicago Press (Original work published 1975). Chomsky N (1965). Aspects of the theory of syntax. Cambridge: MIT Press. Comrie B (1981). Language universals and linguistic typology. Chicago, IL: The University of Chicago Press. Copestake A & Briscoe T (1992). ‘Lexical operations in a unification-based framework.’ In Pustejovsky J & Bergler S (eds.) Lexical semantics and knowledge representation, Berlin: Springer Verlag. Cruse D A (1986). Lexical semantics. Cambridge, UK: Cambridge University Press. Cruse D A (2004). Meaning in language: an introduction to semantics and pragmatics (2nd edn.). Oxford: Oxford University Press. Davidson D (1967). ‘The logical form of action sentences.’ In Rescher N (ed.) The logic of decision and action. Pittsburgh: Pittsburgh University Press. Davis A (1996). Lexical semantics and linking and the hierarchical lexicon. Ph.D. diss., Stanford University. Davis A & Koenig J-P (2000). ‘Linking as constraints on word classes in a hierarchical lexicon.’ Language 2000.
Dowty D R (1979). Word meaning and Montague grammar. Dordrecht, The Netherlands: D. Reidel. Dowty D R (1989). ‘On the semantic content of the notion “thematic role”.’ In Chierchia G, Partee B & Turner R (eds.) Properties, types, and meaning, vol. 2. Semantic issues. Dordrecht: D. Reidel. Dowty D (1991). ‘Thematic proto-roles and argument selection.’ Language 67, 547–619. Erdmann K (1900). Die Bedeutung des Wortes: Aufsätze aus dem Grenzgebiet der Sprachpsychologie und Logik. Leipzig: Avenarius. Evans R & Gazdar G (1990). ‘The DATR papers: February 1990.’ Cognitive Science Research Paper CSRP 139, School of Cognitive and Computing Science, University of Sussex, Brighton, England. Fillmore C (1965). Entailment rules in a semantic theory. POLA Report 10. Columbus, OH: Ohio State University. Fillmore C (1968). ‘The case for case.’ In Bach E W & Harms R T (eds.) Universals in linguistic theory. New York: Holt, Rinehart and Winston. Gazdar G, Klein E, Pullum G & Sag I (1985). Generalized phrase structure grammar. Cambridge, MA: Harvard University Press. Goldberg A (1995). Constructions: a construction grammar approach to argument structure. Chicago: University of Chicago Press. Grimshaw J (1979). ‘Complement selection and the lexicon.’ Linguistic Inquiry 10, 279–326. Grimshaw J (1990). Argument structure. Cambridge: MIT Press. Gruber J S (1965/1976). Lexical structures in syntax and semantics. Amsterdam: North-Holland. Gruber J S (1976). Lexical structures in syntax and semantics. Amsterdam: North-Holland. Hale K & Keyser J (1993). On argument structure and the lexical expression of syntactic relations: the view from building 20. Cambridge, MA: MIT Press. Halle M, Bresnan J & Miller G (eds.) (1978). Linguistic theory and psychological reality. Cambridge: MIT Press. Higginbotham J (1985). ‘On Semantics.’ Linguistic Inquiry 16, 547–593. Hjelmslev L (1961). Prolegomena to a theory of language. Whitfield F (ed.). Madison: University of Wisconsin Press (Original work published 1943). Jackendoff R (1972). Semantic interpretation in generative grammar. Cambridge: MIT Press. Jackendoff R (1983). Semantics and cognition.
Cambridge, MA: MIT Press. Jackendoff R (1990). Semantic structures. Cambridge: MIT Press. Jackendoff R (1992). ‘Babe Ruth homered his way into the hearts of America.’ In Stowell T & Wehrli E (eds.) Syntax and the lexicon. San Diego: Academic Press. 155–178. Jackendoff R (2002). Foundations of language: brain, meaning, grammar. Oxford: Oxford University Press. Jakobson R (1970). Recent developments in linguistic science. Perenial Press. Jakobson R (1974). Main trends in the science of language. New York: Harper & Row. Jakobson R & Halle M (1956). Fundamentals of language. The Hague, The Netherlands: Mouton. Katz J (1972). Semantic theory. New York: Harper & Row.

Katz J & Fodor J (1963). ‘The structure of a semantic theory.’ Language 39, 170–210. Lakoff G (1965/1970). Irregularity in syntax. New York: Holt, Rinehart, and Winston. Levin B & Rappaport Hovav M (1995). Unaccusativity: at the syntax–semantics interface. Cambridge: MIT Press. Lyons J (1977). Semantics (2 volumes). Cambridge: Cambridge University Press. McCawley J (1968). ‘Lexical insertion in a transformational grammar without deep structure.’ Proceedings of the Chicago Linguistic Society 4. Mel’cuk I A (1988b). Dependency syntax. Albany, NY: SUNY Press. Miller G (1991). The science of words. New York: Scientific American Library. Miller G, Beckwith R, Fellbaum C, Gross D & Miller K J (1990). ‘Introduction to WordNet: an on-line lexical database.’ International Journal of Lexicography 3, 235– 244. Moens M & Steedman M (1988). ‘Temporal ontology and temporal reference.’ Computational Linguistics 14, 15–28. Moravcsik J M (1975). ‘Aitia as generative factor in Aristotle’s philosophy.’ Dialogue 14, 622–636. Moravcsik J M (1990). Thought and language. London: Routledge. Parsons T (1990). Events in the semantics of English. Cambridge, MA: MIT Press. Pinker S (1989). Learnability and cognition: the acquisition of argument structure. Cambridge: MIT Press. Pollard C & Sag I (1994). Head-driven phrase structure grammar. Chicago University of Chicago Press, Stanford CSLI. Pustejovsky J (1991). ‘The syntax of event structure.’ Cognition 41, 47–81. Pustejovsky J (1995). The generative lexicon. Cambridge: MIT Press. Pustejovsky J (2001). ‘Type construction and the logic of concepts.’ In Bouillon P & Busa F (eds.) The syntax of word meaning. Cambridge: Cambridge University Press. Pustejovsky J & Boguraev P (1993). Lexical knowledge representation and natural language processing. Artificial Intelligence 63, 193–223. Pustejovsky J & Busa F (1995). ‘Unaccusativity and event composition.’ In Bertinetto P M, Binachi V, Higginbotham J & Squartini M (eds.) 
Temporal reference: aspect and actionality. Turin: Rosenberg and Sellier. Rappaport Hovav M & Levin B (2001). ‘An event structure account of English resultatives.’ Language 77, 766–797. Sanfilippo A (1993). ‘LKB encoding of lexical knowledge.’ In Briscoe T, de Paiva V & Copestake A (eds.) Inheritance, defaults, and the lexicon. Cambridge: Cambridge University Press. Saussure F de (1983). Course in general linguistics. Harris R (trans.). (Original work published 1916). Stern G (1968). Meaning and change of meaning. With special reference to the English language. Bloomington: Indiana University Press (Original work published 1931).

Tenny C & Pustejovsky J (2000). Events as grammatical objects. Chicago: University of Chicago Press. Trier J (1931). Der deutsche Wortschatz im Sinnbezirk des Verstandes: Die Geschichte eines sprachlichen Feldes. Band I. Heidelberg: Heidelberg.

Weinreich U (1972). Explorations in semantic theory. The Hague, The Netherlands: Mouton. Williams E (1981). ‘Argument structure and morphology.’ Linguistic Review 1, 81–114.

Lexicalization K Bakken, University of Oslo, Oslo, Norway © 2006 Elsevier Ltd. All rights reserved.

Lexicalization – A Variable Concept
The term lexicalization is used with a broad range of related but distinct meanings within linguistics. One very basic meaning of the word pertains specifically to conceptualization. Concepts that are expressed by a lexical form may be described as ‘lexicalized.’ A lexicalized concept can be contrasted with a concept that is not lexicalized, for example, in a comparison of languages (cf. Lyons, 1977: 235; Talmy, 1985). The concept ‘the succession of one day and one night’ is thus lexicalized as døgn in Norwegian (and Danish), but has no lexical equivalent in English. New concepts that are given lexical form become part of the lexicon of a language, and the process of establishing a new unit in any specific lexicon is commonly referred to as lexicalization. All new words and meanings may thus potentially be lexicalized, and the term is often invoked in discussions of the integration of loan words, the development of new meanings, and the establishing of idioms as lexical units. Most important in this respect is the concept’s relevance to regular word-formation processes, which is discussed below.

Another use of lexicalization indicates a contrast to grammaticalization. Lehmann (1989) compares the two concepts by defining them as processes that move signs in different directions across the border between lexicon and grammar. Moreno Cabrera (1998), on the other hand, defines the relationship between the phenomena as complementary; grammaticalization involves metaphorical abstraction processes, whereas lexicalization has to do with metonymical concretion. Grammaticalization viewed in this light represents a well-developed field of linguistic research, whereas lexicalization thus conceived does not. Hopper and Traugott (1993: 49) briefly refer to lexicalization as the rare “process whereby a nonlexical form such as up becomes a fully referential lexical item,” thus evoking the difference between grammatical and lexical words.

Finally, a fairly common use of lexicalization refers to the process when morphosyntactic, syntactic, or pragmatic means of conveying information are taken over by lexical forms. One example is Moore (1988), who demonstrates how productive morphosyntactic categories are lexicalized word by word in a situation of language obsolescence; another is Bybee (1985: 18), who refers to examples where once productive causatives or present-tense morphemes are reanalyzed as lexical forms.

Lexicalization and the Lexicon
In broad terms, lexicalization has to do with including new items in the lexicon. Therefore, one’s conceptualization of the lexicon is a prerequisite for the use one makes of the term lexicalization. Pawley (1986) makes a point of contrasting the then-dominant structuralist or generative notion of the lexicon with that implied by lexicographers and ‘lay lexicographers.’ Since 1986, linguistics has undergone a major shift in the relative amount of interest paid to the lexicon. The ‘lexicographers’ lexicon’ that Pawley pointed to can much more easily be harmonized with the new host of usage-based theories, with cognitive linguistics being the most elaborated framework (cf. Langacker, 1987, 1991, 2000). Here, the mental lexicon is seen as the primary locus of the human language capacity, the network is a metaphor much used to describe its organization, and the distinction between productive rules and stored units is to a large degree erased. Lexicalization is thus dependent on theories of the lexicon. Lexicalization used within generative theories such as the Lexicalist approach during the 1970s (cf. Bauer, 1988) or within Lexical Morphology during the 1980s evokes a host of theoretical assumptions that cannot easily be compared to those evoked by linguists working within usage-based theories (cf., for example, Svanlund, 2002).

Lexicalization and Word Formation
It is only in relation to regular word formation that lexicalization has been the focus of major research on a broader basis. Here, Lipka (1977), Bauer (1983), Bakken (1998), and Svanlund (2002) will be used as the major points of orientation. A long-recognized problem when specifying word-formation rules is that they are not easily formulated on the basis of a corpus of established words. Productive word-formation rules produce new complex words, but having come into existence, these words ‘step out’ of the word-formation grammar. They develop according to other principles and it is these ‘other principles’ that lexicalization research has focused on.

Lexicalization as Conventionalization

The starting point for the lexicalization of complex words is the nonce formation (Bauer, 1983: 45; Bakken, 1998; Morita, 1995). For a new word form to be established in a socially or mentally defined lexicon, a conventional connection needs to be established between the concept and the new word form; the word must be established as a conventional sign. Lipka (1977: 155) refers to this conventional connection of form and meaning by the term Hypostasierung and sees it as the immediate cause of the lexicalization process proper as he defines it. Bauer (1983) refers to the phenomenon as ‘institutionalization’ and, like Lipka, keeps institutionalized and lexicalized words apart. In Bakken (1998), conventionalization is the preferred term for the process of establishing a new unit in the lexicon and, more importantly, conventionalization is seen as the first phase of a gradual lexicalization process, thus making lexicalization a gradient quality of a complex word. Bakken (1998) is inspired by Cognitive Linguistics, where entrenchment is the term translating into conventionalization. Usage is seen as the sanctioning factor that enhances the degree of conventionalization and thus lexicalization. Svanlund (2002) also works within the framework of Cognitive Linguistics, but wants to reserve the term lexicalization for the establishing process. Where other accounts seek to relate the semantics of the complex word to the conventionalization process, Svanlund argues that no logical connection can be established between the degree of conventionalization and the semantics of the complex word. (See Lexicalization as an Aspect of Meaning below.)

Lexicalization as Irregularity

If the lexicon is defined negatively in relation to grammar, as Bloomfield (1933) does, it is simply a place where linguistic units are stored rather than generated by rule. A natural consequence of this view of the lexicon is to define lexicalization as a concomitant of irregularity. Bauer (1983) is the most explicit representative of this approach. Since one must accept that nonce formations are produced by word-formation rules, the irregularity of a complex word must arise after the word formation. It has often been noted that complex words tend to develop in unpredictable directions once they have come into existence. These developments are of different kinds, and Bauer (1983) refers to them as “types of lexicalization,” distinguishing between syntactic, semantic, morphological, and phonological lexicalization. Semantic lexicalization (referred to as idiomatization by Lipka, 1977) has occurred when, for example, a compound has developed a meaning that is unpredictable given the meanings of its parts. Phonological lexicalization has to do with phonotactic developments blurring the original make-up of the word, and morphological lexicalization characterizes complex words with unpredictable inflection. Bauer gives a nuanced discussion of the reasons for the irregularities and is careful to note that whereas some of the discrepancies between the thus lexicalized words and the productive rules are caused by word-internal developments, some are indeed caused by changes in the grammar at large or even changes in the material world (which pertain to word meanings). In Bakken (1998: 61–124), it is argued that the lexicalization process essentially is a semantic process, whereby the originally complex word loses its compositional meaning and gradually becomes an arbitrary sign. The occurrence of formal idiosyncrasies is seen as secondary, since they allegedly have non-compositionality as a prerequisite. More importantly, Bakken (1998: 28–35) questions the sharp distinction between regular and irregular word formation, in line with Cognitive Linguistics, and argues that motivation rather than predictability must be the evaluating measure for the description of complex words. It is essentially the theory of the lexicon that is at stake, but the consequences for lexicalization models are obvious. This line of reasoning is followed up by Svanlund (2002), who demonstrates how particularly problematic the notion of semantic lexicalization is (see below).

Lexicalization as an Aspect of Meaning

When a complex word is labeled as lexicalized on semantic grounds, the most common measure has been unpredictability. The Norwegian word barsel was originally a compound consisting of the elements barn (‘child’) and øl (‘beer’). It used to refer to the party given on the occasion of a childbirth. It now refers to the confinement itself. This straightforward example would logically be referred to as semantic lexicalization by Bauer, on the grounds that the meaning of the whole is not (any longer) predicted by the meaning of its parts. However, the notion of compositionality is itself problematic, and Svanlund (2002) raises the question of whether any complex word at any time is truly compositional and predictable in meaning. He shows how polysemy, metaphor, and the number of possible relations between elements undermine any naive idea of predictability. On this basis, he presents an alternative model of analyzing the meanings of complex words. The key word here is motivation and one of his main claims is that the model is as applicable to nonce words as to well-established ones, thus demonstrating how the semantics of a complex word is independent of its conventionalization. Bakken (in press), on the other hand, seeks to justify the connection between conventionalization and word meaning. She concedes that the meaning of complex words can be more or less motivated, as Svanlund claims, and that a linguist could apply Svanlund’s model to both nonce words and well-established words. However, she insists that the importance lies not in the conscious analysis of meaning elements, but in the salience of these elements for the language user. Conventionalization enhances automation, and it is in this respect that the non-compositionality of words, i.e., their meaning, could be interpreted as being relevant to the lexicalization process.

See also: Cognitive Linguistics; Grammaticalization; Lexical Phonology and Morphology; Lexical Semantics: Overview; Lexicon: Structure; Metaphors, Grammatical; Polysemy and Homonymy; Word Formation.

Bibliography
Bakken K (1998). Leksikalisering av sammensetninger. En studie av leksikaliseringsprosessen belyst ved et gammelnorsk diplommateriale fra 1300-tallet. Acta Humaniora 38. Oslo: Universitetsforlaget. Bakken K (in press). ‘Leksikaliseringsmodellar og leksikografisk praksis.’ In Lie S, Nedrelid G & Omdal H (eds.) MONS 10, Utvalde artikler frå det tiande møtet om norsk språk i Kristiansand (2003). Kristiansand: Høyskoleforlaget. 107–116. Bauer L (1983). English word-formation. Cambridge, UK: Cambridge University Press. Bauer L (1988). Introducing linguistic morphology. Edinburgh: Edinburgh University Press. Bloomfield L (1933). Language. Chicago–London: The University of Chicago Press.

Bybee J L (1985). Typological studies in language. 9: Morphology: A study of the relation between meaning and form. Amsterdam: John Benjamins. Grimm U (1991). Lexikalisierung im heutigen Englisch am Beispiel der -er Ableitungen. Tübingen: Gunter Narr Verlag. Hopper P J & Traugott E C (1993). Grammaticalization. Cambridge, UK: Cambridge University Press. Langacker R W (1987). Cognitive linguistics 1: Theoretical prerequisites. Stanford: Stanford University Press. Langacker R (1991). Cognitive linguistics 2: Descriptive application. Stanford: Stanford University Press. Langacker R W (2000). ‘A dynamic usage-based model.’ In Barlow M & Kemmer S (eds.) Usage based models of language. Stanford: CSLI Publications. 1–64. Lehmann C (1989). ‘Grammatikalisierung und Lexikalisierung.’ Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung (ZPSK) 42(1), 11–19. Lipka L (1977). ‘Lexikalisierung, Idiomatisierung und Hypostasierung als Probleme einer synchronischen Wortbildungslehre.’ In Brekle H E & Kastovsky D (eds.) Perspektiven der Wortbildungsforschung. Beiträge zum Wuppertaler Wortbildungskolloquium vom 9–10 Juli 1976. Anlässlich des 70. Geburtstags von Hans Marchand am 1. Oktober 1977. Bonn: Bouvier Verlag Herbert Grundmann. 155–163. Lyons J (1977). Semantics 1. Cambridge, UK: Cambridge University Press. Moore R E (1988). ‘Lexicalization versus lexical loss in Wasco-Wishram language obsolescence.’ International Journal of American Linguistics 54(1), 453–468. Moreno Cabrera J C (1998). ‘On the relationships between grammaticalization and lexicalization.’ In Ramat A G & Hopper P (eds.) Typological studies in language 37: The limits of grammaticalization. Amsterdam/Philadelphia: John Benjamins. 211–229. Morita J (1995). ‘Lexicalization by way of context-dependent nonce word formation.’ English Studies 5, 468–473. Pawley A (1986). ‘Lexicalization.’ In Tannen D & Alatis J E (eds.) Georgetown University Round Table on Languages and Linguistics 1985. Languages and linguistics: The interdependence of theory, data and application. Washington, DC: Georgetown University Press. 98–120. Ryder M E (2000). ‘Complex -er nominals: Where grammaticalization and lexicalization meet?’ In Contini-Morava E & Tobin Y (eds.) Between grammar and lexicon: Current issues in linguistic theory. Amsterdam/Philadelphia: John Benjamins. Svanlund J (2002). ‘Lexicalization.’ Språk & Stil. Tidskrift för svensk språkforskning 12, 7–45. Talmy L (1985). ‘Lexicalization patterns: Semantic structure in lexical forms.’ In Shopen T (ed.) Language typology and syntactic description 3: Grammatical categories and the lexicon. Cambridge, UK: Cambridge University Press. 57–149.


Lexicase R Kikusawa, National Museum of Ethnology, Osaka, Japan © 2006 Elsevier Ltd. All rights reserved.

Introduction
Lexicase is a type of European-style dependency/valency grammar, which evolved out of American-style generative transformational grammar. It is a monostratal syntactic theory, which claims that the information for well-structured sentences is implied in each lexical item (or ‘word’) as its features. Thus, all aspects of the grammar, such as morphological and syntactic mechanisms and semantic relations, are described as features carried by lexical items. The assignment of three different types of cases, namely case relations, case forms, and the macrorole actor, is considered to be a prominent part of the grammar (thus the name lexi-case). The theory was developed by Stanley Starosta at the University of Hawai‘i, where it has been applied in the study of some 40 languages, mainly Pacific and Asian languages, and English, with some limited work on Greek, Swahili, and Native American languages. In Lexicase, sentences are analyzed as spoken or written, that is, without going through hypothetical levels, such as ‘underlying structures’ and ‘movements.’ This enables the direct comparison of syntactic structures of different languages. Lexicase has also been applied in some morphosyntactic (historical) comparisons, mainly of Austronesian languages, as well as in computational language processing.

Regent-Dependent Relationship and Grammatical Description
Since Lexicase is a version of Dependency Grammar, the elements constituting a sentence are analyzed pair-wise as a sum of regent (or ‘head’) and dependent relations. A regent carries features that specify what kind of dependents it must take, may take, and cannot take. These include syntactic properties and word order, as well as morphological and semantic characteristics of the dependent. When all the regents in a sentence have dependents that meet all such features, the sentence is well-formed and ‘grammatical.’ How this operates with actual language data is illustrated here with a few English prepositional phrases headed by to. The locative preposition to carries features such as the following: (i) it has to take a noun as its dependent; (ii) its dependent noun has to be accusative; (iii) its dependent noun has to be locational; and (iv) it must be followed by its dependent noun. These features thus specify the word category of the dependent, the form of the dependent, the semantic nature of the dependent, and the relative word order between the dependent and the regent itself. In Lexicase formalism, the features that a word carries are listed under each form, as shown in (1). The symbol ‘?’ indicates that whatever follows this symbol is required as a dependent by this form; the feature ‘@ndex’ indicates the respective position of this particular form in a sentence. Both of these symbols are variables and are replaced with appropriate position numbers once an actual sentence or phrase is formed.

(1) to
    @ndex       ... position index of this word
    P           ... lexical category of this word
    ?[N]        ... required feature of the dependent (a Noun) (i)
    ?[N Acc]    ... required feature of the dependent (an Accusative noun) (ii)
    ?[N loc]    ... required feature of the dependent (a noun indicating location) (iii)
    @ < ?[N]    ... required feature of the dependent (follows the form to) (iv)

The phrases to her and to Amsterdam are both well-structured, since both her and Amsterdam in these phrases satisfy all the features required by their regent. They are shown in (2), where the variables (‘?’ and ‘@’) are replaced with appropriate position numbers.

(2a) to            her
     1ndex         2ndex
     P             N
     2[N]          Acc
     2[N Acc]      loc
     2[N loc]
     1 < 2[N]

(2b) to            Amsterdam
     1ndex         2ndex
     P             N
     2[N]          loc
     2[N Acc]
     2[N loc]
     1 < 2[N]

On the other hand, in the phrases *to she and *Amsterdam to, in each case one of the required features is not satisfied, as indicated with the symbol ‘*’ in (3); the former violates the case requirement and the latter violates the word order requirement. Both are ungrammatical.

(3a) *to           she
     1ndex         2ndex
     P             N
     2[N]          Nom
     *2[N Acc]
     1 < 2[N]

(3b) *Amsterdam    to
     1ndex         2ndex
     N             P
                   1[N]
                   1[N Acc]
                   *2 < 1[N]


It should be noted that only the relevant features of each word are listed here in (1) through (3), and words are considered to actually carry a far greater number of grammatical features. In Lexicase, the sum of such features of words constituting a sentence determines whether the whole sentence is well-formed, thus the expression ‘the grammar comes from the lexicon.’
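The checking procedure behind examples (1) through (3) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not an implementation of the Lexicase formalism: the class names Word and Requirement and the function violations are hypothetical, the feature sets are reduced to those discussed above, Amsterdam is assumed to be compatible with the accusative requirement (the display in (2b) lists only N and loc for it), and she is assumed to carry loc so that only the case violation of (3a) surfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str
    category: str                       # lexical category, e.g. "P" or "N"
    features: frozenset = frozenset()   # e.g. {"Acc", "loc"}


@dataclass(frozen=True)
class Requirement:
    """What a regent demands of its dependent, after features (i)-(iv)."""
    category: str             # (i) word category of the dependent
    features: frozenset       # (ii), (iii) case form and semantic features
    dependent_follows: bool   # (iv) the dependent must follow the regent


# Hypothetical lexical entries, restricted to the features discussed above.
TO = Word("to", "P")
TO_REQ = Requirement("N", frozenset({"Acc", "loc"}), True)
HER = Word("her", "N", frozenset({"Acc", "loc"}))
SHE = Word("she", "N", frozenset({"Nom", "loc"}))              # loc is an assumption
AMSTERDAM = Word("Amsterdam", "N", frozenset({"Acc", "loc"}))  # Acc is an assumption


def violations(phrase, regent_index, req):
    """Return the requirements that a two-word phrase fails to satisfy."""
    dep_index = 1 - regent_index
    dep = phrase[dep_index]
    problems = []
    if dep.category != req.category:
        problems.append(f"category {req.category}")
    for feat in sorted(req.features - dep.features):
        problems.append(f"feature {feat}")
    if req.dependent_follows and not regent_index < dep_index:
        problems.append("word order")
    return problems


print(violations([TO, HER], 0, TO_REQ))        # to her
print(violations([TO, SHE], 0, TO_REQ))        # *to she
print(violations([AMSTERDAM, TO], 1, TO_REQ))  # *Amsterdam to
```

An empty list corresponds to a well-formed phrase; a non-empty list names the requirement that would be starred in a display such as (3).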

Word Categories
One of the basic features that words carry is their category, or ‘part of speech.’ Word categories are determined by their syntactic distribution. For example, if a form occurs as a dependent of and only of a noun, always occurring at the outer edge of an NP (on either side), and if it never occurs as an immediate dependent of a pronoun, then it is analyzed as a Determiner. The eight recognized word categories, namely, Noun (N), Verb (V), Determiner (Det), Adjective (Adj), Adverb (Adv), Pre/Postposition (P), Conjunction (Conj), and Sentence Particle (Sprt), and the regent-dependent relationship among these categories, summarized in Table 1, are considered to be cross-linguistic features. It should be noted that various ‘functions’ that each word may carry are considered to be separate features from the word category, and are language-specific. For example, a Determiner, defined as above, may carry functions such as indicating a case form, marking definiteness or gender, etc., depending on the language. Because word categories are determined based on their syntactic distribution, multiple homophones are assumed in a Lexicase grammar. For example, the word ‘look’ in the sentence Let’s have a look, with a Determiner preceding it, would be analyzed as a noun, while the same form in the phrase Look at me would be analyzed as a verb. The two forms look1 and look2 are analyzed as different lexical items that have a derivational relationship with each other. An example of a well-formed English sentence, ‘The man saw a big dog,’ is shown in (4) with word category analysis. Elements indicated with parentheses are optional, so that in this example ?([Adj]) means that an Adjective optionally occurs with the form man. Therefore, even when the variable ‘?’ is not fulfilled, it does not affect the well-formedness of the sentence. Thus, optional features, including word order flexibility, are also considered to be a part of the grammatical information carried by each word.

(4) The     man        saw     a       big     dog.
    1ndex   2ndex      3ndex   4ndex   5ndex   6ndex      ... position index of each word
    Det     N          V       Det     Adj     N          ... word category of each word
            1[Det]     6[N]                    4[Det]     ... dependent required by each word
            ?([Adj])   2[N]                    5([Adj])   ... dependent required by each word

Three Cases – Case Relations, Case Forms, and the Macrorole

In a Lexicase grammar, nouns are considered to carry three different ‘cases,’ namely, case relations, case forms, and the macrorole. The assignment of these different types of cases enables various language-internal and cross-linguistic grammatical generalizations to be captured, especially generalizations shared by both ergative and accusative types of languages. In a sentence, every noun carries a case relation (or the feature +prdc [predicate], which is considered to grammatically commute with case relations) and a case form, while only one noun in a clause carries the macrorole.

Table 1 Word categories and dependency relations

                      Dependent
Regent                V     N     P     Adv   Adj   Det   Conj  Sprt
Verb                  +     +     +     +     -     -     +     +
Noun                  +     +     +     -     +     +     +     +
Pre/postposition      +     +     +     -     -     -     +     +
Adverb                -     -     -     -     +     -     -     -
Adjective             -     -     -     +     -     -     -     -
Determiner            -     -     -     -     -     -     -     -
Conjunction           +     +     +     +     +     -     +     -
Sentence particle     -     -     -     -     -     -     -     -

Each column indicates whether the word category indicated in the top row can be a dependent of the word category indicated in the left column. The symbol ‘+’ indicates that it can, and the symbol ‘-’ indicates that it cannot.


There are five case relations:

1. Patient (PAT)
2. Agent (AGT)
3. Locus (LOC)
4. Means (MNS)
5. Correspondent (COR)

Among these, PAT is required by every verb (including all intransitive verbs) as its complement (He in (5), him in (6), and He in (7)), while AGT is required by every transitive verb as its additional complement (Mary in (6)).

Case forms refer to any morphological and/or syntactic configurations that characterize a set of nouns in a sentence. The number of case forms that occur differs from language to language. The major case forms are labeled as follows:

• Nominative (Nom): The case form that indicates both the [PAT] of intransitive sentences and either the [PAT] or the [AGT] of transitive sentences. This case form is shared by all languages. Note that by definition this covers also what is often referred to as ‘absolutive.’
• Accusative (Acc): The case form that marks the [PAT] of transitive sentences in an accusative language.
• Ergative (Erg): The case form that only marks the [AGT] of a transitive clause. However, if the form also marks instruments, it is labeled as Instrumental (Ins), and if it marks adnominal possessors, it is labeled as Genitive (Gen).

Unlike case relations and case forms, which are assigned to every occurring Noun, there is only one macrorole in a sentence, ‘actor’ ([actr]), which is assigned to the same noun as the [AGT] of a transitive clause or the [PAT] of an intransitive clause. Roughly speaking, the notion [PAT] corresponds to the thematic roles patient and theme in Case Grammar, and the theta role patient in Government and Binding theory. The Nominative case form roughly corresponds to what is often referred to as the ‘(grammatical) subject,’ and the notion [actr] to what is often referred to as the ‘logical subject.’ Some sentence examples with the three cases indicated are shown in (5) through (7). The case form of the word Mary in (6) cannot be determined from the form of the word itself, but is determined by its immediate preverbal position and by the potential alternation with the nominative pronoun she.
(5) He       ran.
    N        V                                     . . . word category
    Nom      -trns                                 . . . case form (N), transitivity (V)
    PAT                                            . . . case relation
    actr                                           . . . macrorole

(6) Mary     hit      him.
    N        V        N                            . . . word category
    Nom      +trns    Acc                          . . . case form (N), transitivity (V)
    AGT               PAT                          . . . case relation
    actr                                           . . . macrorole

(7) He       was      hit      by       her.
    N        V        V        P        N          . . . word category
    Nom      -trns    +adjc             Acc        . . . case form (N), transitivity (V)
    PAT                                            . . . case relation
    actr                                           . . . macrorole

Certain grammatical characteristics, including the actancy system (whether a language is ergative or accusative), agreement patterns, and patterns of control, are described using these three cases. For example, an ergative system is defined in Lexicase as one where the [PAT] of both transitive and intransitive sentences is marked by the same case form, while an accusative system is one where the [actr] of both transitive and intransitive sentences is marked by the same case form.
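The two definitions above are mechanical enough to be written down as a small check. The following Python sketch is an illustration only: the function names and the dictionary layout are assumptions made here, the English values are read off examples (5) and (6), and the second data set is a hypothetical ergative language, not data from any real one.

```python
def actor_relation(transitive):
    """Case relation of the noun carrying [actr]: AGT if transitive, else PAT."""
    return "AGT" if transitive else "PAT"


def classify_actancy(case_forms):
    """Classify a system from a mapping (clause type, case relation) -> case form."""
    intransitive_pat = case_forms[("intransitive", "PAT")]
    transitive_pat = case_forms[("transitive", "PAT")]
    intransitive_actr = case_forms[("intransitive", actor_relation(False))]
    transitive_actr = case_forms[("transitive", actor_relation(True))]

    pat_alike = intransitive_pat == transitive_pat      # same form for [PAT]
    actr_alike = intransitive_actr == transitive_actr   # same form for [actr]

    if pat_alike and not actr_alike:
        return "ergative"
    if actr_alike and not pat_alike:
        return "accusative"
    return "undetermined"  # the two criteria do not single out one type


# Case forms as shown in examples (5) and (6).
ENGLISH = {
    ("intransitive", "PAT"): "Nom",
    ("transitive", "AGT"): "Nom",
    ("transitive", "PAT"): "Acc",
}

# Hypothetical language, for illustration only.
HYPOTHETICAL_ERGATIVE = {
    ("intransitive", "PAT"): "Nom",
    ("transitive", "AGT"): "Erg",
    ("transitive", "PAT"): "Nom",
}

print(classify_actancy(ENGLISH))                # accusative
print(classify_actancy(HYPOTHETICAL_ERGATIVE))  # ergative
```

Because the Nominative is defined as covering what is elsewhere called ‘absolutive,’ both systems can be described with the same label set; only the distribution of the labels over [PAT] and [actr] differs.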

The Application of Rules in Lexicase Grammar

Word categories and case types are the basic features that are employed to analyze and describe sentences. In addition, various rule types are employed to capture generalizations within the language and to enable cross-linguistic generalizations to be made. For example, Redundancy Rules (RRs) and Subcategorization Rules (SRs) are implicational statements about relations between features. The former simply state that if a word carries one or more particular features, it will also carry certain other features. For example, if a word carries the feature [+prnn] (pronominal), then it must carry the feature [N] (Noun). The latter are the reverse of the former. For example, if a word carries the feature [N], then it may be subcategorized into either [+prnn] (pronominal) or [-prnn] (nonpronominal). Subcategorization Rules also include what can be referred to as ‘Inflectional Subcategorization Rules,’ which break up stems into paradigms of inflected forms. For example, if a word is [-prnn] (nonpronominal, which implies that it is a subcategory of a Noun), it is inflected into [±plrl] (plural or nonplural). Linking Rules (LRs) copy the index of the dependent word onto the regent matrix as the value of one of its contextual features (the replacement of the symbols ‘?’ in the regent with the numbers that fill the symbols ‘@’ in (2) and (3)). Chaining Rules (CRs) are also used to copy indices from items in higher or lower domains to satisfy contextual features in cases where the dependency relationship is between words in different dependency domains. Chaining Rules include the index mapping related to infinitival complementation (‘control’) and displacement (‘wh-movement’). In addition, there are Derivational Rules (DRs), which state analogical relationships among subsets of lexical entries and are used to account for the active-passive relationship, ‘raising’ to subject and object, etc.; Morphological Rules (MRs), which specify morphological marking of inflectional features; and Inflectional Redundancy Rules (IRRs), which specify agreement relationships and word order patterns. The relationship among these rules is shown in Figure 1.

Figure 1 Relationship among lexicase rules.

See also: Case; Dependency Grammar; Role and Reference Grammar; Thematic Structure.
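As an illustration of the Redundancy Rules and Subcategorization Rules described in the section on rule application, the following Python sketch treats a lexical entry as a set of feature strings. The rule tables contain only the examples given in the text ([+prnn] and [-prnn] imply [N]; [N] subcategorizes into [+prnn] or [-prnn]; [-prnn] inflects into [+plrl] or [-plrl]); the data layout and function names are assumptions made for this sketch and are not Lexicase notation.

```python
# Redundancy Rules: if an entry carries all features on the left,
# it also carries the features on the right.
REDUNDANCY_RULES = [
    (frozenset({"+prnn"}), frozenset({"N"})),
    (frozenset({"-prnn"}), frozenset({"N"})),
]

# Subcategorization Rules: a feature and the alternatives it splits into.
SUBCATEGORIZATION_RULES = {
    "N": ["+prnn", "-prnn"],
    "-prnn": ["+plrl", "-plrl"],  # inflectional subcategorization
}


def apply_redundancy(features):
    """Add every implied feature until no rule contributes anything new."""
    result = set(features)
    changed = True
    while changed:
        changed = False
        for condition, implied in REDUNDANCY_RULES:
            if condition <= result and not implied <= result:
                result |= implied
                changed = True
    return result


def subcategories(feature):
    """Return the alternatives a feature may be subcategorized into."""
    return SUBCATEGORIZATION_RULES.get(feature, [])


print(sorted(apply_redundancy({"+prnn"})))  # ['+prnn', 'N']
print(subcategories("N"))                   # ['+prnn', '-prnn']
print(subcategories("-prnn"))               # ['+plrl', '-plrl']
```

Storing only the unpredictable features in each entry and deriving the rest by rule is what lets the lexicon stay compact while each word still carries its full feature matrix.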

Bibliography

Acson V (1979). ‘A diachronic view of case-marking systems in Greek: a localistic-lexicase analysis.’ Ph.D. diss., University of Hawai‘i.
Alves M J (2000). ‘A Pacoh analytic grammar.’ Ph.D. diss., University of Hawai‘i.
Brown E K & Miller J (eds.) (1996). Concise encyclopedia of syntactic theories. Oxford: Elsevier Science.
de Guzman V P & Bender B W (eds.) (2000). Grammatical analysis: morphology, syntax and semantics: studies in honor of Stanley Starosta. Oceanic Linguistics Special Publication (29). Honolulu: University of Hawai‘i Press.
Fraser N (1989). ‘Review of “The case for lexicase: an outline of lexicase grammatical theory” by Stanley Starosta. Pinter 1988.’ Linguistics 15–2, 114–115.
Indrambarya K (1994). ‘Subcategorization of verbs in Thai: a lexicase dependency approach.’ Ph.D. diss., University of Hawai‘i.
Jeong Hy-sook (1992). ‘A valency subcategorization of verbs in Korean and Russian: a lexicase dependency approach.’ Ph.D. diss., University of Hawai‘i.
Kikusawa R (2002). Proto Central Pacific ergativity: its reconstruction and development in the Fijian, Rotuman and Polynesian languages. Pacific Linguistics (520). Canberra: Pacific Linguistics.
Lee N I (1993). ‘Complementation in Japanese: a lexicase analysis.’ Ph.D. diss., University of Hawai‘i.
Nagao M (ed.) (1986). Proceedings of the eleventh international conference on computational linguistics (COLING 86). Bonn: University of Bonn.
Pagotto L (1987). ‘Verb subcategorization and verb derivation in Marshallese: a lexicase analysis.’ Ph.D. diss., University of Hawai‘i.
Reid L A (2000). ‘Sources of Proto-Oceanic initial prenasalization: the view from outside Oceanic.’ In de Guzman V P & Bender B W (eds.). 30–45.
Sak-Humphry C (1996). ‘Khmer nouns and noun phrases: a dependency grammar analysis.’ Ph.D. diss., University of Hawai‘i.
Savetamalya S (1989). ‘Thai nouns and noun phrases: a lexicase analysis.’ Ph.D. diss., University of Hawai‘i.
Springer H K (1993). ‘Perspective-shifting constructions in Japanese: a lexicase dependency analysis.’ Ph.D. diss., University of Hawai‘i.
Starosta S (1988). The case for lexicase: an outline of lexicase grammatical theory. London: Pinter Publishers Limited.
Starosta S (1996). ‘Lexicase.’ In Brown E K & Miller J (eds.). 231–241.
Starosta S & Nomura H (1986). ‘Lexicase parsing: a lexicon-driven approach to syntactic analysis.’ In Nagao M (ed.). 127–132.
Wilawan S (1993). ‘A reanalysis of so-called ‘serial verb constructions’ in Thai, Khmer, Mandarin, and Yoruba.’ Ph.D. diss., University of Hawai‘i.


Lexicography: Overview
P Hanks, Brandeis University, Waltham, MA, USA and Berlin-Brandenburg Academy of Sciences, Berlin, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Lexicography is the art or craft of writing dictionaries. In contrast to lexicology (see Lexicology), which is the theoretical study of words, their use, and their meaning, lexicography is a practical activity that involves compiling an inventory of the words in a language (or some particular subset of them) and saying something about each of them. Lexicographers were described by Trench (1858) as “the inventory clerks of language.”

There are many different kinds of dictionaries, giving many different kinds of information about words. The main ones are: (1) scholarly dictionaries of record; (2) practical dictionaries for everyday use; (3) pedagogical dictionaries; (4) dictionaries of linguistic phenomena such as slang or idioms; and (5) special-subject dictionaries. All the foregoing kinds are monolingual. To these we must add: (6) bilingual dictionaries; (7) onomasiological dictionaries (thesauruses, synonym dictionaries); and (8) term banks. There are also hybrid dictionaries, for example, monolingual dictionaries for second-language learners with marginal glosses in a relevant foreign language.

Four issues of general principle that must be considered for all serious types of dictionaries in any language are: (1) breadth, not depth; (2) consistency; (3) descriptive versus prescriptive approach to the language; and (4) historical versus synchronic approach. The third of these, in particular, is controversial: there are many circumstances in which a dictionary, rightly or wrongly, is expected to play a prescriptive role in a language.

General procedures for many kinds of lexicography include collecting, ‘tokenizing,’ and ‘lemmatizing’ words. Having established a draft word list, the lexicographer must decide what to say about each word. This includes decisions about (1) orthography of both the headword and inflected forms; (2) guidance on pronunciations; (3) word-class classification (part of speech); (4) definition writing; (5) examples; (6) phraseology; (7) disputed points of usage; and (8) etymology and word histories.
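The collecting, tokenizing, and lemmatizing steps mentioned above can be pictured with a toy Python sketch. Everything here is an assumption made for illustration: the regular expression is a deliberately crude tokenizer, and `LEMMAS` is a tiny hand-made lookup table standing in for the much larger resources or morphological analysis that a real dictionary project would need.

```python
import re
from collections import Counter

# Hypothetical lookup from inflected form to headword (lemma).
LEMMAS = {
    "dictionaries": "dictionary",
    "words": "word",
    "meanings": "meaning",
    "explains": "explain",
    "explained": "explain",
}


def tokenize(text):
    """Split running text into lower-cased word tokens (crude, English-only)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lemmatize(token):
    """Map a token to its headword, falling back to the token itself."""
    return LEMMAS.get(token, token)


def draft_word_list(texts):
    """Count how often each lemma occurs in the collected texts."""
    counts = Counter()
    for text in texts:
        counts.update(lemmatize(token) for token in tokenize(text))
    return counts


sample = [
    "Dictionaries explain words.",
    "A dictionary explains the meanings of a word.",
]
word_list = draft_word_list(sample)
print(word_list["dictionary"], word_list["explain"], word_list["word"])  # 2 2 2
```

The resulting counts are only a draft word list: deciding which lemmas deserve an entry, and what to say about each, remains the lexicographer's judgment.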

Brief History of Lexicography

Lexicography had its beginnings in the ancient civilizations of the Middle East (see Bilingual Lexicography) and ancient Chinese civilization (see Chinese Lexicography). In ancient Greece there were already dictionaries in the fifth century B.C. These were compiled to explain words used by Homer and other writers from the Archaic period, which had become rare or obsolete by the Classical period. Classical Greek lexicography was exclusively monolingual until quite a late period (see Greek Lexicography, Classical). It was left to the Romans to introduce bilingual lexicography (Latin → Greek, Greek → Latin), from the first century B.C. onward. In many of the languages of Europe, the origins of lexicography can be traced back to interlinear glosses in medieval manuscripts (see, for example, English Lexicography, German Lexicography). Monks noted the vernacular equivalents of unfamiliar Latin words on the manuscript as an aid to understanding the text. In some monasteries, the glosses were subsequently collected and arranged roughly in alphabetical order in a separate manuscript. The development of European dictionaries as we understand them today, however, had to await the invention of printing technology in Germany by Johannes Gutenberg in 1450. Printed dictionaries became practical tools that could be mass-produced by the hundreds and thousands, enabling identical information about words to be disseminated extremely rapidly throughout a community. A major milestone in the development of European lexicography was the publication in Paris in 1531 of the Dictionarium, seu Latinae linguae thesaurus by the great Renaissance scholar and printer Robert Estienne (Robertus Stephanus). In 1538 he followed this with a Latin–French dictionary and in 1539 a French–Latin dictionary. These magnificent works remained unchallenged as authorities for over two hundred years and were the source of innumerable derivatives and revisions.
Bilingual dictionaries played a major part in the revival of learning (Latin and Greek) during the Renaissance and an equally important part, first in the spread of Italian and French culture into northern Europe, and subsequently in the interplay of languages that was characteristic of Europe from the 15th century to the late 20th century, at which time, once again, an international lingua franca (English this time, not Latin) was to emerge. European dictionaries of the 15th and 16th centuries were of two kinds: those that played a role in the learning of Latin and those that were designed to explain the words of one vernacular in terms of another. Many of these early printed dictionaries displayed great energy, learning, and sophistication, although an examination of them also provides a reminder of the important part played by the linguistic philosophy of the Enlightenment (17th and 18th centuries) in developing lexicographical standards of accuracy and consistency. Sixteenth-century dictionaries were full of information, but much of it was anecdotal and discursive. Balance and consistency were not among the lexicographical virtues of Renaissance dictionaries. Multilingual dictionaries were also popular during the Renaissance; valiant attempts were made to represent the meaning of words in one language by equivalents in several others. Only gradually did people come to realize how difficult it is to represent one language in terms of another. Languages do not map easily onto one another, and trying to map three or more languages onto each other is an impossibility, except in the case of superficial equivalents such as terms for artifacts in a technology that is shared by many linguistic cultures.

During the 18th century, large dictionaries were compiled in all the languages of Europe that were not the tongue of an oppressed minority. Academies were established with the aim of preserving the purity of vernacular languages (see Academies: Dictionaries and Standards). Great advances were made in data collection. (The aim was to collect and define all the words of a language, not just the ‘hard words.’) Scientific standards of definition writing were pursued. Some 18th-century lexicographers made strenuous efforts to implement the ‘substitution principle,’ a notion that can be traced back to Leibniz’s dictum that two things are the same if the one can be substituted for the other without affecting the truth value (salva veritate).
The idea was – and is – that a definition should be substitutable in any context for the word being defined (the definiendum). The substitution principle, coupled with unthinking reductionism (according to which every sense of every word is defined as an independent linguistic entity, rather than as part of a phraseological whole), is responsible for some of the convolutions still to be found in present-day dictionaries, for example in the definitions of English reciprocal verbs:

meet (sense 2): to come into or be in conjunction or contact with (something or someone else or each other).

This definition may be convoluted, but at least it is an attempt to capture the reciprocal syntactic potential of the verb, as in examples 1–5 below. Some other English dictionaries present these as separate senses, while others ignore the syntax completely or indicate only ‘transitive’ and ‘intransitive.’

1. John met Sally.
2. Sally met John.
3. John and Sally met.
4. John and Sally met each other.
5. John and Sally met with each other.

Eighteenth-century dictionaries saw their role not only as recorders of the words of a language, but also as reinforcers of ‘correct’ usage, to warn against barbarisms. Thus, Johnson (1755) included citations from ‘‘the best authorities’’ to show how English words are used, and included comments on words that he disapproved of (e.g., ‘‘clever: a low word’’). During the 19th century, a shift from such prescriptive attitudes to scientific descriptive principles occurred in many important European lexicography projects. The proposal by Sir William Jones in 1786 that Sanskrit, Greek, Latin, and other languages were ‘‘sprung from some common source which, perhaps, no longer exists’’ had a profound effect on scholarly lexicography. Although the American lexicographer Noah Webster (see American Lexicography) rejected the Indo-European hypothesis, preferring to believe that English words were derived from ‘Chaldee,’ scholarly lexicography in Europe throughout the 19th century devoted much energy to discovering the Indo-European roots of everyday words and establishing cognate relationships among words in different languages. Etymology, which in the 18th century had been a matter of wild guesswork, became a painstaking scientific discipline and came to dominate lexicography (see Etymology). The task of the scholarly lexicographer in all the leading languages of Europe was seen partially, if not primarily, as a historical one: not only to define the meaning of words but also to trace their origins. In 1864 Webster’s successor Noah Porter and his etymologist C. A. F. Mahn quietly dropped ‘Chaldee’ and brought the leading American English dictionary into line with the rest of Western scholarship. The establishment of lexicography as a scholarly discipline on historical principles led to the establishment of scholarly dictionaries of record for many languages. 
Also in the 19th and 20th centuries, large scholarly dictionaries of ancient and medieval languages began to be compiled, presenting a detailed account of the recorded vocabulary of major languages at earlier historical periods. Some of these great scholarly projects continued for many decades. The culmination of historical and in particular Indo-European lexicographical scholarship is still being worked out. In 1949 Carl Darling Buck published A dictionary of selected synonyms in the principal Indo-European languages. At the University of Leiden, a major project is currently underway with the goal of creating an Indo-European etymological database on the Internet and, eventually, compiling a new Indo-European etymological dictionary, designed to supersede Julius Pokorny’s Indogermanisches etymologisches Wörterbuch (1959).

In the 20th century another school of lexicography grew up, the purpose of which was to explain the meaning and use of words in the contemporary language, relegating etymology and obsolete senses to a subsidiary role. Historical principles were superseded in these dictionaries by synchronic principles, more suitable for practical everyday works. The first work to make this distinction explicitly was Funk and Wagnalls’ Standard dictionary of the English language (1893). Some but not all American dictionaries followed Funk and Wagnalls’ lead, for example the American College dictionary (1949), its successor the Random House dictionary (1964), and the American Heritage dictionary (1969). Others, including the Merriam-Webster dictionaries, clung to the more traditional historical principles. Among British and Australian one-volume dictionaries, it is now standard to follow synchronic principles rather than historical principles. One-volume dictionaries of the English language (both historical and synchronic) characteristically focus on explaining meaning, with comparatively few examples. In other languages, for example German and Greek, the focus of one-volume dictionaries is often very much more on phraseology and usage, with many more examples of usage than are found in one-volume English dictionaries and comparatively less space devoted to definitions. In Russia, dictionary compilation was associated with important developments in linguistic theory (see Russian Lexicography), for example the Meaning-Text theory of Mel’čuk (see Mel’čuk, Igor Aleksandrovic (b. 1932)). Beginning in the late 1980s, lexicography began to respond to the opportunities offered by computer technology, in particular the use of computers to collect and process evidence of word use on a very large scale (see Corpus Lexicography).

Typology of Dictionaries

There are innumerable works calling themselves ‘dictionaries’ in almost every language in the world that has a literature. Indeed, modern lexicography also plays a major part in recording the vocabulary of and establishing written norms for languages that do not have a literature. These works are extremely heterogeneous: some of them have nothing in common other than the word ‘dictionary’ in the title. Others are clearly members of the same family of reference works. A typology of dictionaries may be attempted, identifying the following types: scholarly dictionaries of record, practical dictionaries for everyday use, pedagogical dictionaries, dictionaries of slang and/or idioms, special-subject dictionaries – all of which are monolingual and explanatory in nature – along with bilingual dictionaries, onomasiological dictionaries, and term banks. In addition, there are various less frequent classes, for example hybrid works such as monolingual dictionaries for second-language learners with marginal glosses in a relevant foreign language.

Scholarly Dictionaries of Record

The 19th and 20th centuries saw an explosion of scholarly lexicographical activity. Every language in the world that has an established literary tradition now has (or will soon have) a major dictionary of record, recording the vocabulary of the language and usually but not always organized on historical principles, with plentiful citations illustrating usage and supporting the definitions. Classic examples of great historical dictionaries are the Oxford English dictionary (OED) (discussed in English Lexicography), the Trésor de la langue française (see French Lexicography), and the Deutsches Wörterbuch of the brothers Grimm (see German Lexicography). The present encyclopedia includes articles on the history and current state of the art of lexicography in all the world’s major languages. There are also articles on major areas of historical lexicographical scholarship (see Old English Dictionaries, Middle English Dictionaries, and Old French Dictionaries). In other cases, historical dictionaries are surveyed in the relevant national articles (see, for example, Dutch and Flemish Lexicography, German Lexicography). As a subgroup of the category of scholarly dictionaries, dictionaries of etymology and word history should also be mentioned. These range from popular works such as Chantrell’s very readable Oxford dictionary of word histories (2002) and Picoche’s Dictionnaire étymologique du français (1992) to larger works of scholarship such as Kluge’s Etymologisches Wörterbuch der deutschen Sprache (1st edn., 1883; 24th edn., 2002), the Dutch Etymologisch woordenboek by P. van Veen and N. van der Sijs (1997), V. Machek’s Etymologický slovník jazyka českého (1997), and Pokorny’s Indogermanisches etymologisches Wörterbuch (1959).

Practical Dictionaries for Everyday Use

Despite the high cost of compiling a dictionary, most languages offer a range of choices of one-volume dictionaries, varying greatly in size and scope. In size, they range from small pocket books giving simple definitions by synonym and spelling dictionaries listing only word forms (together with occasional comments on points of orthographical difficulty), to large and ambitious works that give detailed information about many aspects of word meaning and word use, as well as facts about the world, such as famous people and places or the distribution and habitat of a plant or its New Latin name. The emphasis in practical dictionaries nowadays is often on presenting a wealth of complex facts about words in a way that is as accessible as possible for users. Lexicographers of an earlier generation appear to have been less concerned with user-friendliness. For example, the first edition of the Concise Oxford dictionary (1911) seems very dense and hard to use when compared with the 10th edition (1999), where a great deal of careful planning went into the page design and layout, as well as the selection of information. The compilers of such dictionaries are faced with choices at each step of the work, to which the answer is not always obvious. For example, should a practical dictionary record rare and obsolete words? Some dictionary publishers take the view that this is the raison d’être of a practical dictionary. No user, it is argued, will look up a very ordinary, familiar word, so as little space as possible should be wasted on such words. The likelihood that a word will be unfamiliar and therefore looked up in a dictionary (if encountered by a reader in some text) increases with its rarity, until the point is reached where the word is so rare that no one will ever encounter it. Another decision concerns the presentation of grammatical information. Facts about such matters as complementation patterns, countability, and selectional preferences are the linguist’s stock in trade, but the practical lexicographer has to bear in mind that many ordinary users are not trained linguists and (especially in the English-speaking world) may actually be hostile to grammar. A judgment must be made whether anything at all should be said about each grammatical point concerning each word, and, if the answer is yes, how to present it. Often, exemplification, on the basis of which a user can make analogies, is considered to be a more effective explanatory technique than an abstract metalanguage.

Pedagogical Dictionaries

There are two main kinds of pedagogical dictionaries: those compiled for foreign learners of a language and those compiled for schoolchildren who are native speakers. Dictionaries for foreign learners are often formidable works of scholarship, with a much greater emphasis on syntax and usage than is usual in dictionaries for native speakers (see Learners’ Dictionaries). Here, the pedagogical lexicographer has to decide whether the emphasis should be on helping the user to ‘encode’ or ‘decode’ the language. Many foreign learners use dictionaries to help with language encoding – i.e., to write naturally and idiomatically in the foreign language. A dictionary designed for encoding use typically has a smaller word list and many more examples of usage than one for decoding use. An encoding dictionary rigorously eschews eccentric or fanciful usage and is sensitive to selectional preferences. Dictionaries for schoolchildren are often nothing more than simplified versions of small dictionaries for adults. However, there are also some fine examples of dictionaries that take their pedagogical function seriously, not only explaining the meanings of words in suitable terms and focusing on ‘hard words’ but also presenting facts about grammar selected for the appropriate age group (see, for example, English Lexicography).

Dictionaries of Slang, Idioms, Etc.

Good slang dictionaries (see Slang Dictionaries, English) in many ways resemble the great scholarly dictionaries. Great efforts are made to collect citations from a vast variety of sources, many of them ephemeral, and a selection of the citations is presented in the dictionary itself, often with bibliographical information. The definitions and examples in slang dictionaries are of the greatest importance, because (unlike general dictionaries) many of the headwords will be quite unfamiliar to users. Dictionaries of idioms (see Idiom Dictionaries) are more often aimed at foreign learners than at native speakers. They are classified here with slang dictionaries, because their aim is to collect and explain a specific subset of the vocabulary, rather than the whole language. The same is true of dictionaries of phrasal verbs. There are several other kinds of dictionary that focus on subsets of the vocabulary of a language, for example dictionaries of loanwords such as the Dictionary of European Anglicisms (2001).

Dictionaries of Rare and Endangered Languages

The need for a lexical inventory is also felt by languages that have no large literary tradition. A great deal of lexicographical work is currently being done on rare and endangered languages. Lexicographical tools for field linguists to use in compiling dictionaries and corpora of rare languages are available, for example, SIL’s Toolbox. The motivation for such dictionaries is various: sometimes, the lexicographer is trying to establish a written standard for a language that previously was not written down. Sometimes, in the case of severely endangered languages, the motivation is simply to record the lexicon and phraseology of the language before it is lost forever. Much more often, however, the motivation for such a dictionary allies scholarship to practical needs: providing access for speakers of a rare language to the wider world and to modern facilities, markets, and technology, plus the more traditional motivation of providing missionaries with a tool to help them to do good works, translate the Bible, and preach the word of God to speakers of a minority language. Such dictionaries are inevitably bilingual, with varying degrees of emphasis on the source language and the target language, depending on the local circumstances. Simpson (1993) discussed the objectives and methods for dictionaries of Australian Aboriginal languages. The target language of such dictionaries is typically English, although in certain geographical locations it may be Russian, Spanish, or Portuguese (see Bilingual Lexicography; Endangered Languages for further discussion).

Special-Subject Dictionaries

The number of subjects of which dictionaries have been made is vast, ranging from huge dictionaries of medicine and law to quite small dictionaries of particular sports or games. Typically, special-subject dictionaries differ from general dictionaries in that they give more detailed and discursive encyclopedic information about each term, rather than merely a definition, while at the same time including little or no linguistic information. The focus is on objects and concepts, rather than on usage and linguistic behavior.

Bilingual Dictionaries

Bilingual dictionaries are typically practical tools for interlingual communication and learning, rather than scholarly studies (see Bilingual Lexicography and Dictionaries).

Onomasiological Lexicography

A distinction can be made between semasiological and onomasiological dictionaries. Semasiological dictionaries consist of an alphabetical list leading the user from the word to its meaning(s), which is the principal characteristic of the dictionary types discussed up to now in this article. By contrast, the idea behind an onomasiological dictionary is to help the user to find the appropriate word for a particular meaning or concept. Various techniques are used in the attempt to achieve this aim. The classic example of an onomasiological dictionary is Roget’s Thesaurus, in which words are arranged in hierarchies under more general terms (hypernyms, also called superordinates). Thus, the terms sofa and settee may be found listed under furniture (see Thesauruses). Another technique is to provide lists of synonyms and antonyms for a headword, in the hope that one of them may be the term that the user is looking for. An example of an onomasiological dictionary that helps users to discriminate between near synonyms is McArthur’s Longman lexicon (1981). Another kind of onomasiological lexicography is the compilation of terminological databases (see Terminology and Terminological Databases). In a world of increasingly complex engineering and technology in a multilingual environment, vast on-line terminological databases serve vital needs for standardization of terminology for many specialist communities.

Principles of Lexicography

The needs, problems, and opportunities presented by the lexicon of each language differ, and where appropriate these are discussed in the articles devoted to the lexicography of each language. Four general issues of principle, applicable to most or all types of dictionary in most or all languages, may be discerned. These are: (1) breadth, not depth; (2) consistency; (3) descriptive versus prescriptive approach to the language; and (4) historical versus synchronic approach.

Breadth, Not Depth

Unlike other kinds of scholarship, lexicography generally aims at breadth rather than depth. A dictionary does not say everything that could possibly be said about a particular word or linguistic phenomenon. Instead, it tries to present a reasonably comprehensive inventory of the vocabulary and to state just those facts that are most salient or most relevant about each word. This inventory may be a list of all the words of the language that are in common use, all the words that a learner needs to know, all the words relevant to a particular subject, or all the words ever used in the literary record of a particular language. But as far as the entries themselves are concerned, it is necessary for dictionaries to idealize – and often simplify – word meaning and word use. To attempt to account in detail for all possible uses of words would be to attempt the impossible, for usage is open-ended and shades of meaning are determined by context. Furthermore, if a dictionary presents too much information about a particular word, there is a danger that the user may not be able to see the woods for the trees. There are principled as well as practical reasons for dictionaries to be economical with space.

Consistency

Having collected the inventory, the lexicographer must say something about everything in it and must aim at consistency of coverage and presentation or wording. The user of a dictionary is entitled to expect that all differences in the presentation represent some differences in the language, i.e., stylistic variation for its own sake is avoided. For this reason, unlike other kinds of writers, lexicographers aim to repeat the same forms of words (in definitions and etymologies) wherever possible, so that only real differences in meaning or use are reflected in the wording. It is no good saying “of or pertaining to X” at one entry and “relating to Y” at another entry, if “of or pertaining to” and “relating to” are intended to convey the same message. More seriously, consistency of coverage is always the goal in modern dictionaries. Allied to this is the principle of ‘consistency of sets’: the selection of entries and the definitions of a set of related items should be worded, as far as possible, consistently. A modern dictionary will include entries for all chemical elements, even rare ones, on the principle of consistency of sets. On the other hand, it would be quite impossible to include entries for all possible chemical compounds, so some additional principle – e.g., common usage – must be invoked. Computer technology is a great help here: it enables the chief editor of a new dictionary to divide up the work by topic rather than in alphabetical order and to assign the work to relevant specialists, so that, for example, the chemistry editor can make a reasonable selection of terms denoting compounds; the medical editor can define all the bones of the body, all the organs, all the physiological processes, all the diseases; and so on in accordance with a systematic policy, without being hampered by the need to work in alphabetical order (or to use cumbersome card indexes), as was the case before computer text processing became commonplace.

Consistency of treatment is all very well, but there is also a danger that consistency may become a false god, forcing the members of a lexicographical team to distort subtle linguistic differences in the name of consistency. The lexicographical apparatus available to the members of a lexicographic team must be powerful enough to reflect subtle differences in linguistic facts, and team members must know how to use it. For example, in an English dictionary, for the sake of economy of space, the word lexicographical may be listed as a rare variant of lexicographic. No separate entry is necessary. But the same treatment certainly cannot be accorded to historical and historic or economical and economic. So much is obvious. However, other cases may be more difficult to decide. What, for example, is to be done about evangelistical? Should it be treated simply as a rare variant of evangelistic, or is there a subtle difference of meaning, requiring separate entry? Careful scrutiny of the evidence is required, rather than application of a rule by rote in the name of ‘consistency.’

Descriptive or Prescriptive?

Most modern lexicographers agree that their job is to describe the language, not to legislate about right and wrong. Those few who have, from time to time, set themselves up as pundits and pontificators have had remarkably little success in halting the endless process of language change. Some changes in language may be a matter for regret, while others may be welcome. However, the lexicographer can only hope to describe what is going on in the language, not to change it. This is a very different principle from the principles that motivated 17th- and 18th-century European lexicographers, for example, the Académie Française (see Academies: Dictionaries and Standards). The lexicographers of the Enlightenment had explicit goals to ‘fix’ the language, to set standards of correct usage, and to combat any decline in those standards. It fell to Samuel Johnson to abandon his original commission to ‘fix’ the English language and to draw attention to the impossibility of such a goal:

When we see men grow old and die at a certain time one after another, from century to century, we laugh at the elixir that promises to prolong life to a thousand years; and with equal justice may the lexicographer be derided, who being able to produce no example of a nation that has preserved their words and phrases from mutability, shall imagine that his dictionary can embalm his language, and secure it from corruption or decay. . . . With this hope, however, academies have been instituted, to guard the avenues of their language, to retain fugitives, and repulse intruders; but their vigilance and activity have hitherto been vain; sounds are too volatile and subtle for legal restraint; to enchain syllables, and to lash the wind, are equally the undertakings of pride, unwilling to measure its desires by its strength. (English dictionary, 1755, Preface)

Nevertheless, no matter how vehemently the lexicographer may agree with Johnson and protest, with Clarence Barnhart in the General Introduction to the American College dictionary (1949), that “it is not the function of the dictionary-maker to tell you how to speak, any more than it is the function of the mapmaker to move rivers or rearrange mountains or fill in lakes,” it must be acknowledged that many users will inevitably treat a dictionary as if it were a prescriptive tool, looking to it for guidance on good usage. As Jess Stein wrote in the preface to the first edition of the Random House dictionary (1966):

Since language is a social institution, the lexicographer must give the user an adequate indication of the attitudes of society toward particular words or expression, whether he regards those attitudes as linguistically sound or not.

Various expedients have been devised to deal with this problem, of which the most successful is to comment explicitly on disputed points of usage, discussing the evidence in a usage note (see later section on ‘The Internal Structure of Dictionary Entries’).

Historical Principles or Synchronic Description?

As a general rule, scholarly dictionaries of record are compiled on historical principles – that is to say, they first record the etymology of each word in the language (see Etymology), then define the earliest sense of the word, and then go on to outline the word’s semantic and grammatical development from that point on. Large dictionaries of record also give archaic and obsolete spellings. Because languages are unstable, i.e., word forms, conventions of usage, and meanings change over time, many entries in the great scholarly dictionaries of literary languages start with a sense that is now obsolete. Current senses are sometimes buried in the middle of an article or near the end. To take a simple example, the word camera in English is now used almost exclusively to denote an apparatus for taking photographs, but formerly it meant ‘a small room’ (the sense of the Latin word that was taken into English in the 17th century) or ‘a legislative chamber’ and, more specifically, ‘the treasury department of the papal curia.’ Placing these three senses first in the entry for the English word camera raises interesting theoretical questions about the nature of literal word meaning. The methodology of historical principles is elaborated in Burchfield (1989).

‘Synchronic principles’ – placement of the modern meaning of each word first – may seem to be no more than a matter of common sense. But in many cases it turns out not to be easy to decide what is ‘the modern meaning.’ For example, it is not clear whether the ‘modern meaning’ of the English word oasis is (1) a haven of calm in a big city or other hectic place or situation, or (2) a fertile spot in a desert. (1) seems to be at least as common as (2). For practical purposes, it might be thought desirable to place (1) first, since few English speakers live in a desert. Appeals to etymology are often made, but if etymology were a good guide to the literal meaning of modern words, then the first meaning of camera would be ‘a small room’ and the first meaning of dope would be ‘a thick varnish.’ To decide which meaning to place first, the synchronic lexicographer needs to take a principled view about what counts as literal meaning and to balance the (sometimes conflicting) claims of frequency, etymology, concrete versus abstract, cognitive plausibility, and other factors. Synchronic principles are sometimes expressed in terms of ‘arranging senses in order of frequency,’ but in fact this is never done. The best that a synchronic lexicographer can do is to start with a sense that has a reasonable claim to be literal and currently common and then to arrange the other senses in some sort of coherent order thereafter.

Dictionary entries have a discourse structure of their own. Different senses of a word are not merely mutually exclusive alternatives. Rather, the entry for a polysemous word is to be read as a whole, in which senses (2) and (3) are influenced by and draw on what has been stated at sense (1). Thus, it is easier to explain and understand reference to an oasis of calm in a big city if it can be read as a secondary sense, a conventionalized metaphor exploiting a primary definition that describes an oasis as a tranquil green location in a desert.

Lexicographical Procedures

Collecting the Evidence

As already mentioned, it is now widely accepted that the central part of the lexicographer’s task is to compile an inventory, collecting evidence for words and showing how they are actually used. The traditional method of collecting evidence for a dictionary is to set up a reading program, i.e., for readers to read books or journals and to transcribe citations for interesting words onto small slips of paper, which are then stored in a filing cabinet before being examined as evidence for a possible definition, in some cases many years after the slip was first collected. This methodology is extraordinarily slow, expensive, and labor-intensive; nevertheless, it has been the methodology that has supported most of the great historical dictionary projects of European languages.

All this is now changing with the increasing emphasis on collecting electronic text archives. These can be processed computationally, providing a concordance for all uses of a word in all the texts that are accessible in electronic form. Before long, it will be possible for historical lexicographers to see how the usage of great writers of the past compared with the everyday conventional usage of their day, insofar as ordinary nonliterary texts from the relevant period survive and have been put into machine-readable form. This may be expected to lead to new developments in historical lexicography.

Large practical dictionaries often place considerable store on explaining technical terms and technical senses of words in science and technology, medicine, law, economics, business, sports, and so on. For such dictionaries, specialized reading programs with a focus on textbooks and manuals in various disciplines are sometimes set up.

The growth of electronic corpora since the mid-1980s has had a profound effect on the collection of lexicographical evidence (see Corpus Linguistics, Corpus Lexicography, Corpus Approaches to Idiom). A modern corpus containing hundreds of millions of words of text, with a good concordance program, can provide instant access to more evidence than any lexicographer could possibly use, showing how each ordinary word is ordinarily used in the language. Statistical tools such as the Sketch Engine (see Computers in Lexicography) help the lexicographer to decide which word associations or collocations are important and which can be safely ignored. In addition to this, the Internet itself can now be used as a corpus, providing evidence for unusual as well as normal uses of words, on the basis of billions of pages of electronic text (see Grefenstette and Kilgarriff, 2003; Church and Mercer, 1993). As mentioned above, these techniques are equally applicable to historical dictionaries, provided that the relevant texts are available and accessible in machine-readable form.
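
As a rough illustration of what a concordance program does with a corpus, the following Python sketch prints keyword-in-context lines for a word and counts its neighbors. It is a minimal sketch under stated assumptions: the toy corpus, the function names, and the three-word window are all invented for the example, and the raw neighbor counts do not reproduce the statistics of the Sketch Engine or any other real tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines: `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def collocates(tokens, keyword, width=3):
    """Count the words that occur within `width` tokens of the keyword."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            window = tokens[max(0, i - width):i] + tokens[i + 1:i + 1 + width]
            counts.update(window)
    return counts


# Toy corpus, invented purely for illustration.
corpus = (
    "The forest fire spread quickly. A wood fire burned in the hearth. "
    "They lit a fire in the hearth."
)
tokens = tokenize(corpus)
kwic = concordance(tokens, "fire")
neighbours = collocates(tokens, "fire")

for line in kwic:
    print(line)
```

Raw counts of this kind are dominated by very frequent words such as the and a, which is one reason a statistical measure that weighs co-occurrence against overall frequency is needed before a collocation can be judged important.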

Identifying Words: Tokenization, Lemmatization

Once words have been identified, their usage and meaning can be studied. But what is a word? To readers of English, Spanish, or French, it seems obvious that a word is a string of letters bounded by a space or a punctuation mark. It is easy to overlook how reliant on an artificial cultural convention this notion of word boundaries is. Establishing word boundaries is a problem for computational language processing in German and many other languages written alphabetically, not to mention languages such as Chinese and Japanese, which have other writing systems. Two examples of the problem may be given from German. On the one hand, German prints as a single ‘word’ some items that in English require at least four ‘words’ – for example, Binnenschifffahrtsstraßenverkehrsordnung ‘inland waterways traffic regulation.’ On the other hand, certain verbs in German are ‘separable,’ i.e., they contain a particle that is separated from the root verb in finite contexts, e.g., abfahren ‘drive away’ is found in finite contexts such as Er fuhr langsam ab ‘he drove slowly away’; thus a single ‘word’ is printed as two ‘words’ at opposite ends of the sentence. Such conventions present interesting challenges for tagging, sorting, and grouping lexicographical evidence. Whereas a concordancing program for English can rely on spaces and punctuation marks as indicators of word boundaries and simple string matches for finding different inflected forms of a word, for German it is absolutely necessary to pre-process a corpus by ‘lemmatizing’ it, i.e., by identifying all the inflected and separated forms of each word and grouping them together, also identifying boundaries between bound morphemes, i.e., word boundaries that are not represented in written texts by a space.

There are at least four different senses of the word word. To avoid confusion, the term is often avoided altogether in technical writing about language, being replaced with more specific terms.
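
The lemmatizing step described above for German can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two lookup tables are hypothetical and hand-built, the function name is invented, and the rule handles nothing beyond a clause-final particle of the kind seen in Er fuhr langsam ab. A real lemmatizer would draw on a full morphological lexicon.

```python
# Hypothetical, hand-built lookup tables; a real lemmatizer would rely on a
# full morphological lexicon rather than a few entries.
FORM_TO_LEMMA = {"fuhr": "fahren", "fährt": "fahren", "fahre": "fahren"}
SEPARABLE_VERBS = {("ab", "fahren"): "abfahren"}


def lemmatize_clause(clause):
    """Map each token of a simple German clause to a lemma.

    If the clause ends in a particle that combines with a verb earlier in the
    clause to form a known separable verb, the two are reunited as one lemma.
    """
    tokens = clause.lower().rstrip(".!?").split()
    lemmas = [FORM_TO_LEMMA.get(tok, tok) for tok in tokens]
    if not lemmas:
        return lemmas
    particle = lemmas[-1]
    for i, lemma in enumerate(lemmas[:-1]):
        joined = SEPARABLE_VERBS.get((particle, lemma))
        if joined:
            # Replace the verb with the full lemma and drop the stray particle.
            return lemmas[:i] + [joined] + lemmas[i + 1:-1]
    return lemmas


result = lemmatize_clause("Er fuhr langsam ab")
print(result)  # ['er', 'abfahren', 'langsam']
```

With the forms grouped in this way, a search for the lemma abfahren can retrieve the clause even though the string abfahren never appears in it, which a simple string match could not do.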
The four main competing senses of word are:

1. (synonym: type) A unique spelling form considered as an abstract category; a particular string of letters, without any reference to the meaning of the string. A type is always just a single type, no matter how often it is used or how many senses it may have.
2. (synonym: token) Any one occurrence of a type in a text. The text the cat sat on the mat contains six tokens, but only five types, because there are two tokens of the type the.
3. (synonym: lemma) A further abstract category, consisting of all the different inflected forms of a word type grouped together under a single heading. Thus, the English verb lemma take consists of five types: take, takes, taking, took, taken. In highly inflected languages such as Czech, 30 or 40 different types may be grouped together as inflected forms of a single lemma.
4. (synonym: lexical item) A minimal meaning-bearing unit of language. In this sense, word overlaps with ‘bound morphemes’ such as un- and pre- on the one hand and with ‘multiword expressions’ such as gas burner and air cushion vehicle on the other hand.

Dictionaries typically record (3) and (4), grouping words into lemmas and also recording bound morphemes, together with a selection of multiword expressions (see Word).

Multiword Expressions

The number of multiword expressions in a language is unimaginably vast. They could not possibly all go into a dictionary, even one that was unconstrained by the physical limitations of printed books. Nevertheless, some multiword expressions receive entries in dictionaries. These are selected on semantic grounds, not on the basis of frequency, i.e., a dictionary entry is made when the meaning is not recoverable from analysis of the parts. There is no point in putting a frequent collocation into a dictionary if it is perfectly obvious what it means, i.e., if the normal relationship of modifier + noun applies, no matter how frequent the expression may be. The purpose of including a multiword expression in a dictionary is to explain something that needs to be explained. Having said that, it must be acknowledged that there are many multiword expressions with special meanings that are ignored by dictionaries. For example, in English, forest and wood are synonyms, so it ought to follow that forest fire and wood fire are synonyms. But they are not. A forest fire occurs only in the wild – it is dangerous and out of control; a wood fire implies deliberately burning wood in a hearth, under human control, to generate heat. Innumerable similar examples could be cited. Multiword expressions remain a challenge for lexicography. Can a lexicographical project correct the reductionist expectations that would equate wood fire with forest fire, and if so, how can it be done?

Names in Dictionaries

The traditional criterion for distinguishing names from words is not entirely satisfactory. Common sense suggests that the name ‘Samuel Courtauld’ denotes a single individual. But in fact there are (or have been) several Samuel Courtaulds in the history of the world, only one of whom founded the Courtauld Institute of Art in London. Is Courtauld a word of English? And then what about all the other Samuel Courtaulds? The term Courtauld has no particular semantic function; instead, it has a series of denotations, separately picking out all the individuals who bear and bore this surname. Is Courtauld a word of English? It is undeniably a type in a corpus. And then, what about the names of all the billions of other people, past and present, in the rest of the world, whether English-speaking or not? There is no difficulty in regarding them as types, because a type is just a processing notion, without reference to meaning. But it would be absurd to call them ‘words of English’ or put them in a dictionary.

However, lexicons for computer use are a different matter. Some of these lexicons contain very large lists of conventional names as an aid to ‘named entity recognition.’ Some writers consider that the lexicon is a “cultural system of reference,” a criterion that would justify including in a dictionary names that have collective cultural reference, such as London, Washington, Shakespeare, and Courtauld. Many larger dictionaries such as the American Heritage dictionary and the New Oxford dictionary of English (NODE) include names that are part of the cultural reference system. Other dictionaries do not, with occasionally bizarre results, for example when Jesus gets an entry as an oath (a general word of the language), but not as the name of the founder of the Christian religion. In dictionaries that make this distinction, English gets in (presumably, it is a word), but England does not (it is a name).

Names of products raise other problems. The term Mars Bar is generally considered a name rather than a word, but why? It clearly denotes a class of entities, has a plural inflection, and behaves in other respects like an ordinary English count noun. Yet it is neglected by dictionaries. This may be regarded as a triumph for common sense but not for systematicity. As a type (or rather, a collocation of two types), Mars Bar qualifies for entry in large electronic lexicons for natural language processing, since one cannot process English effectively by computer without being able to process Mars Bar. And for many applications, knowing that a Mars Bar is candy and not a steel girder is important knowledge, of just the kind that needs to be encapsulated in a computational lexicon.

Bilingual dictionaries, too, include names, but typically only place names for which different language equivalents exist, for example German München/English Munich. There is, however, an increasing fashion in the modern world to use the local form of place names, so that, for example, Leghorn (the former English name for Livorno) or Ratisbon (Regensburg) must be labeled as archaic if they are included in a dictionary at all.

Foreign Borrowings

Languages and cultures borrow from one another. For example, speakers and writers freely use French, German, Italian, Spanish, and other expressions in an English-speaking context. The converse is even more true: English is the source of very many foreign borrowings in all the languages of the world. This raises problems for the inventorist. Is tete-a-tete an English word? There are at least three uses of it as a perfectly normal word of English in the British National Corpus. The French form (tête-à-tête) is, of course, spelled with accents, but in English it can equally well be written without accents. Many similar examples could be given.

Expletives

Are the English expletives er, um, oh, unh-huh, phwoah, etc., words? Should they be in a dictionary? (The same question applies to expletives in any language.) They occur as types in careful transcriptions of spoken English. To that extent, they may be regarded as words, and indeed, nowadays they often make an appearance in dictionaries.

Technical Terminology

Are the coinages of chemists words of English, e.g., 2,3,7,8-tetrachlorodibenzoparadioxin? Most dictionaries take the view that highly artificial domain-specific terminology of this kind should be classified quite separately from the general vocabulary. Such terms are not words at all, it is said – no more than are the symbols used by logicians and mathematicians. The problem of terminology is not restricted to the sciences. Sports, too, for example, have an increasingly rich and rarefied range of domain-specific terms. To take an example from cricket: should silly mid-off be classified as a ‘word’? It makes perfectly good sense in a cricketing context, but is not part of general English. The term has a space in it, which in English normally counts as a boundary between types, but as we have already seen, this is not an infallible guide to identification of word types for a dictionary. Some tokenizers treat the hyphen, too, as a boundary between types, which would make this expression three types. There is no sense (except etymologically) in which it counts as a head noun (mid-off) modified by the adjective silly. Such an analysis would be a further example of misplaced syntactic reductionism.

Many other examples could be given of the difficulties of processing the evidence of words in text to produce the abstraction that is a dictionary headword list. The fact that there are traditional solutions to such problems, many of them unchallenged, should not blind us to their existence. In this age of computational text processing, lexicographers may need to revisit their most basic assumptions about what a word is.
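
The effect of these tokenization decisions can be made concrete with a short Python sketch. It is a minimal illustration, not a description of any particular tokenizer: the function names and the split_hyphens switch are assumptions introduced for the example. It counts tokens and types in the cat sat on the mat and shows how silly mid-off comes out under the two hyphen policies.

```python
import re
from collections import Counter


def tokenize(text, split_hyphens=False):
    """Split text on whitespace and punctuation, optionally also on hyphens."""
    pattern = r"[A-Za-z]+" if split_hyphens else r"[A-Za-z]+(?:-[A-Za-z]+)*"
    return re.findall(pattern, text.lower())


def type_token_counts(tokens):
    """Return (number of tokens, number of distinct types)."""
    return len(tokens), len(Counter(tokens))


cat = tokenize("the cat sat on the mat")
n_tokens, n_types = type_token_counts(cat)

keep_hyphen = tokenize("silly mid-off")
split_hyphen = tokenize("silly mid-off", split_hyphens=True)

print(n_tokens, n_types)  # 6 5
print(keep_hyphen)        # ['silly', 'mid-off']
print(split_hyphen)       # ['silly', 'mid', 'off']
```

Neither policy returns silly mid-off as a single unit, so a headword of this kind has to be supplied from a separate list of multiword expressions rather than discovered by the tokenizer.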

The Internal Structure of Dictionary Entries

Once the evidence has been collected and organized into alphabetical order or some other order, the lexicographer must start writing the entry. In practice, especially in dictionaries involving large teams of lexicographers, the two components of the work go hand in hand. There is considerable overlap. Collection of evidence – especially evidence for new and ‘trendy’ words beloved of publishers’ marketing departments – normally continues until the last possible moment before publication.

Dictionaries are among the most highly structured texts known to humans. They consist of hundreds of thousands of little bits of information, carefully pieced together in an ordered pattern. An entry in a large dictionary typically contains at least the following information types: orthography, pronunciation, grammar, definitions, examples of its use, phraseology, prescriptive comments and usage notes, and a word history.

Orthography

The single most common use of practical dictionaries in English and other languages is to check spelling. English spelling, though eccentric, is settled (with very few exceptions) and inflections are few, so it would be easy for an English lexicographer to overlook the amount of work that goes into deciding on orthography in other languages, for example, Irish Gaelic, for which a completely new orthography was developed in the 20th century, or German, which undergoes periodic bouts of spelling reform. In non-European languages it is sometimes necessary for a lexicographer to make a principled decision about which dialect should be represented as the lexicographical standard or how many variants should be shown. Great historical dictionaries, such as the OED, typically list the variant spellings that were in use at different periods in history.

The orthography of inflected forms, too, must be represented. Again, English is at an advantage in that it has comparatively few inflections and those that exist are mostly regular. It is no great burden to list the inflected forms of the few strong verbs that exist and other irregular forms. In German, plurals of all countable nouns and inflected forms of strong verbs must be indicated. In highly inflected languages such as Czech, inflections can generally be represented by reference to tables of declensions and conjugations. However, there are often variants and irregularities, all of which must be represented somehow. The lexicographer must also decide systematically which inflected forms must be represented and which can safely be ignored (either because they are regular or because they are rare or obsolete).

Pronunciation

Many dictionaries offer a guide to pronunciation of the headwords (and in some languages also of inflected forms), especially dictionaries of languages such as English, where the relationship between orthography and phonetics is not entirely regular. If it is decided to represent the pronunciation of words independently of their orthography, there are two major questions that the lexicographer must address: which accent to represent, and which transcription system to use. A question that is of great importance but that does not arise in relation to dictionaries is whether the representation should be of the word as it is normally pronounced in coherent speech or whether it should be a ‘reading-list pronunciation.’ Dictionaries deal with words in isolation, so they can only offer a reading-list pronunciation, which is a kind of abstract idealization.

Electronic dictionaries can, in principle, offer a choice between a written transcription and an audio representation. As long ago as 1969, Webster’s New World dictionary pioneered the practice of accompanying the printed text of a dictionary with an audio representation of the pronunciation of words. Other dictionary publishers have followed suit from time to time, and now several have taken advantage of the potential of electronic technology in this respect. The audio representation may be taken from a list of words read by a human being, a text-to-speech synthesis, or both. In either case, a decision has to be made about which accent should be represented. As electronic dictionaries in hypertext form become better established, it may be expected that more detailed lexicographical attention can be given to the pronunciation of words in different accents.

Which accent to represent? This is a tricky question. The dictionary must either choose one accent as being standard and representative, or it must attempt a representation of several different standards, which can sometimes be done by using symbols that have an ‘archetypal’ value, i.e., each symbol can have a different value for each different accent. Of course, this only works insofar as the relationship between accents is regular; otherwise, alternative transcriptions must be offered. Even in languages where one accent is generally recognized as standard, problems can arise. For British English, the representation of pronunciation is usually in a standard accent, which Daniel Jones called ‘received pronunciation’ (RP), defining it in the 1926 edition of his English pronouncing dictionary as the accent of English “which is the everyday speech of families of Southern English persons whose menfolk have been educated at the great public boarding schools.” Thus, it is not only a regional accent but also a class accent. At that time, and for many years thereafter, English learners and many non-RP speakers aimed at RP in formal contexts, with varying degrees of success. In the intervening 80 years, however, since Jones wrote those words, RP itself has changed, although guides to English pronunciation, by and large, have not. In the Oxford dictionary of English pronunciation (ODEP, 2001), Clive Upton commented that the accent represented by Jones’s model had become “the possession of a small minority restricted in terms of age, class, and region.” The ODEP offered “a younger, unmarked RP.” The differences are mainly in terms of vowel quality. The ODEP also offered guidance on American pronunciation. Upton’s co-author, William Kretzschmar, commented that “English in the United States has no obvious standard spoken model” but that there was a “trend among educated speakers, especially those of the younger generation, toward limitation of the use of marked regional features while speaking in formal settings.”

As far as Australian English is concerned, the Macquarie dictionary offers a transcription of English words in a standard Australian accent, and several other dictionaries of national varieties of English likewise offer a representation of a local standard accent. Languages that have a rich variety of dialect pronunciations, no one of which can be regarded as standard, tend to avoid giving guidance in dictionaries on pronunciation of words, since to do so risks raising more problems than it solves.


Which transcription system to use?
The obvious scholarly choice for many dictionaries is the International Phonetic Alphabet (IPA) (see International Phonetic Association and Phonetic Transcription: Analysis). This, however, has its disadvantages. The symbols of IPA have an absolute value, so that even a broad transcription is accent-specific. Moreover, since accents change over time, an IPA transcription becomes out of date more rapidly than the alternative (a spelling-rewrite system). Perhaps these are among the reasons why almost all American English dictionaries prefer a spelling-rewrite system (borrowing only the schwa symbol, /ə/, from IPA).

Grammar

The presentation of grammar in dictionaries (as distinct from part-of-speech labels) is, for English at least, a controversial subject. There are many things that must be included – e.g., that the verb put requires not only a direct object, but also an adverbial of place – but the public are said to be resistant to too much grammar. Inclusion of detailed grammatical information about word usage is not a popular selling point. There are two ways in which dictionaries deal with this problem. One, adopted by NODE in 1998, is to make a stringently considered selection of necessary grammatical statements and to present them in unabbreviated English words and phrases such as ‘‘with adverbial of place.’’ The second is to devise a notational system that refers to an appendix on syntactic patterns of word use. The first edition of the Longman dictionary of contemporary English (LDOCE, 1978) devised such a system. It is very detailed and has the merit of taking up very little space, but it is not self-explanatory, and only the most hardened seekers of grammatical truth seek out the appendixes and unpack all the grammatical information relevant to each word.

Definitions

In the English-speaking world, definition writing is often seen as the central part of the lexicographical task. The project leader of a large lexicographic project keeps a set of compilers’ instructions and tries to ensure that all members of the team are writing in roughly the same style and in a consistent way. Some of the principal issues in definition writing are discussed elsewhere in this encyclopedia (see Definition). A dictionary that sets out to record the whole vocabulary of a language must devise separate styles for explaining function words and for discourse organizers, sentence adverbs, and other ‘pragmaticky’ expressions, as well as many kinds of nonlexical information, such as the New Latin terms for species of plants and animals or the formulae of chemical compounds.

Examples of Words in Use

For many European dictionaries, the selection of examples of usage is seen as a central activity. By contrast, exemplifying usage is a neglected aspect of some older one-volume English dictionaries. In scholarly historical dictionaries such as OED, examples of words in use are cited from literature, with date, author, and other bibliographical details. It is this, and not only the size of the word list itself or the richness of definition, that contributes principally to the multivolume bulk of such dictionaries. One-volume dictionaries make a selection of examples, showing how different senses of words are ordinarily used. Nowadays, with the easy availability of machine-readable corpus evidence, such examples are usually chosen from authentic texts rather than invented by the lexicographer, but still careful selection is needed to avoid the risk of giving examples that are taken from eccentric or high-flown literary usage. Despite the efforts of linguists in recent decades, selectional preferences are still not well understood (let alone well described), so prudence dictates that the lexicographer should use authentic examples rather than invented ones. Many learners’ dictionaries and textbooks of the 1970s contain examples of usage concocted by the writers which (presumably unconsciously) violate selectional preferences. On the other hand, authenticity alone is not enough: authentic texts are full of colorful and eccentric language. The lexicographer who wishes to help a user encode the language is under an obligation to choose examples that show words in their most central and typical usage. One might almost say that such examples should be chosen for their dullness.

Phraseology

Brief mention must also be made of phraseology in dictionaries. Entire dictionaries are devoted to idioms and idiomatic expressions (see Idiom Dictionaries). Common idioms are also explained in general dictionaries, typically at the end of the entry for one of the key words in the idiom. Bilingual dictionaries give a great deal of information about phraseological equivalents in different languages, seeking to help the user overcome the problem that many expressions cannot be translated word for word. Large monolingual dictionaries of some European languages, such as German and modern Greek, devote more space to listing common phraseology, showing how words are used, than to explanations of meaning.

Prescriptive Comments and Usage Notes

As stated above, users often expect guidance from a dictionary on matters of ‘correct’ and ‘incorrect’ usage, paying no heed to the lexicographers’ protestations that their task is merely descriptive. Some prescriptive works do exist, for example, H. W. Fowler’s Modern English usage (1926), revised and updated by R. W. Burchfield in 1983, which is much admired for its robust and sometimes humorous advice on various aspects of English style. Perennially popular, too, are more basic practical works, such as the Oxford guide to style (2002) (a successor to Hart’s rules for compositors and readers at the Oxford University Press) and Judith Butcher’s Copy-editing: the Cambridge handbook for editors, authors, and publishers, which concern themselves with such matters as hyphenation, punctuation, italicization of foreign words and phrases, and consistency of spelling. Large practical dictionaries deal with the problem of the demands of ordinary users for prescriptive instructions about word usage either by ignoring them or by including a commentary on controversial issues, such as the use (in English) of split infinitives, competing pairs such as disinterested versus uninterested, and the use of taboo words. These comments may be no more substantial than register labels such as ‘slang’ or ‘taboo’ (see Register: Dictionaries), or they may be fairly substantial essays. NODE made a concerted attempt in its longer usage notes to bring corpus evidence to bear, comparing what writers actually write with what the pundits say they ought to write. The American Heritage dictionary contains usage notes, edited by Geoffrey Nunberg, in which the opinions of a panel of pundits (173 of them in the 1992 edition) are summarized on different issues, e.g., whether it is correct to use impact as a verb: ‘‘. . . 
84 percent of the Usage Panel disapproves of the construction to impact on, as in the phrase social pathologies, common in the inner city, that impact on such a community; and fully 95 percent disapproves of the use of impact as a transitive verb in the sentence Companies have used disposable techniques that have a potential for impacting our health. . .’’

Word Histories

Bilingual dictionaries and dictionaries for foreign learners do not normally say anything about etymologies, word histories, or obsolete senses. However, most larger dictionaries for native speakers see it as an essential part of the lexicographical task to explain not only the meaning but also the history and semantic development of each word, or at least of the main root words. The meaning of a word can often be illuminated by knowledge of its etymology, for example, that the English verb concatenate is ultimately derived from the Latin noun catena ‘chain’ or that the modern German verb leisten (a word for which there is no English equivalent, the translation varying according to context: Gehorsam leisten is ‘to obey,’ Hilfe leisten is ‘to help,’ Arbeit leisten is ‘to do work,’ einen Eid leisten is ‘to swear an oath’) is derived from an ancient Germanic verb meaning ‘to follow in the footsteps (of a feudal superior).’ Etymologies tell a culturally unifying story about the interactions between languages and cultures. In a hybrid language such as English, even something as simple as an indication that a word is of Germanic or Latin origin, as the case may be, is illuminating information included in many small dictionaries, including dictionaries for schoolchildren.

A dictionary on historical principles places the etymologies first in the explanations, immediately following the headword and pronunciation. Etymologies in great scholarly dictionaries such as the third edition of OED, now in progress, can be quite extensive and amount almost to scholarly articles in their own right. Practical dictionaries on synchronic principles take a slightly different view of word histories. First, there is more emphasis on semantic development; second, there is less emphasis on morphological development; and third, information about important obsolete senses (which in a dictionary on historical principles usually appears first among the definitions) may be placed in the etymology or ‘word history.’

A primary concern of dictionaries on historical principles is with dating. Many practical synchronic dictionaries of English give information about the earliest date at which a word was recorded in the language. Since this can rarely be done with absolute precision, the date is generally given in terms of a spread of dates; Collins gives the century in which the word was first recorded, relying largely on the historical evidence of OED; the Random House dictionary of the English language (2nd edn., 1987) generally indicates the decade (or some other spread of years) in which the word was first recorded and seems to be similarly indebted to OED.

The Future of Lexicography

The Challenge of Corpus Evidence

The inventory of the words of well-known languages like English, Spanish, French, Italian, Russian, and German is well established (within certain conventional limits). Increasingly sophisticated efforts have been devoted to recording the words of these languages for over three centuries. They are widely spoken and have a vast literature, and the main task of the lexicographer has traditionally been to record new words, new phrases, and new senses of words as they develop. In other words, the task of the lexicographer is (or was) to maintain an existing dictionary, adding to the existing text new words and new senses as they arise. This is still the case in the United States. However, this comfortable conservative notion has been shaken somewhat since the mid-1980s with the development of large machine-readable corpora of languages (see Corpus Linguistics). Fashionable linguistic theories of the 1970s and 1980s required a language to consist of a ‘lexicon’ (a fixed inventory of lexical items) and a ‘grammar’ (a fixed set of rules), but corpus analysis has shown that the lexicon consists not only of a large number of regular words, but also of an unbounded number of words that have occurred only once, even in very large corpora, and that may never occur again – words such as giraffishness, which occurred once in an Associated Press news item for 1988 and has never been seen since – while even quite common words are sometimes used in a way that is not satisfactorily accounted for by received grammatical rules and ‘selectional restrictions’ (a term that must be contrasted with ‘selectional preferences’). In other words, the lexicon itself is dynamic. New words are coined all the time, both by recombination of existing morphemes and by productive use of the phonological system of the language. Moreover, the meaning of a word (in particular, the everyday words of a language, as opposed to technical terms such as mammal, which have scientifically stipulated definitions) is to a large extent dependent on the context in which it is used. For example, the verb treat denotes attitude or behavior if it is used with an adverbial of manner, but is more likely to denote a medical procedure if there is no adverbial.
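The observation that a corpus contains many words occurring only once can be illustrated with a minimal sketch. The toy corpus, the simple tokenizer, and the function names below are assumptions made for illustration only; they are not taken from any of the dictionary projects discussed here, and a real corpus study would need far more careful tokenization and lemmatization.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lower-case the text and extract alphabetic word forms
    (allowing internal apostrophes and hyphens)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def hapax_legomena(tokens: list[str]) -> list[str]:
    """Return, in alphabetical order, the word forms that occur exactly once."""
    counts = Counter(tokens)
    return sorted(word for word, n in counts.items() if n == 1)


# Hypothetical toy corpus; a real study would use millions of words.
corpus = (
    "The giraffe ate the leaves. The giraffe is tall. "
    "Its giraffishness amused the keeper."
)

tokens = tokenize(corpus)
hapaxes = hapax_legomena(tokens)
print(f"{len(hapaxes)} of {len(set(tokens))} distinct word forms occur only once")
print(hapaxes)
```

Even in this tiny sample most distinct word forms are singletons, which hints at why a fixed inventory of lexical items cannot be read off from corpus frequencies alone.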
The sentence ‘‘I hazarded various Stuartesque destinations like Florida, Bali, Crete, and Western Turkey’’ (British National Corpus) can only be explained as a dynamic exploitation of a normal use of the verb hazard, involving the use of ellipsis (‘‘I hazarded a guess at various Stuartesque destinations . . .’’). In addition, the word Stuartesque is not regular in English; it must be explained in terms of the personal name Stuart and the bound morpheme –esque. Corpora offer many such challenges to lexicographers; on the other hand, they can equally well be used as a pool in which to fish for examples that support existing hypotheses.

Monolingual lexicography is only just beginning to come to terms with the challenge of corpus evidence. English dictionaries that have devised corpus-driven procedures for the description of the lexicon from the outset are Cobuild (1st edn., 1987), the Cambridge international dictionary of English (1995), NODE (1998), and the Macmillan English dictionary (2002). Older established dictionaries, including OALD (the Oxford Advanced Learner’s dictionary) and LDOCE, have revised their entries systematically in the light of corpus evidence but have not radically revised their theoretical foundations. All except NODE are dictionaries for foreign learners of English. There are other dictionaries whose publishers claim or imply a corpus connection, even though the text remains virtually unchanged from earlier editions, compiled before electronic corpora were available. In other languages, the situation is similar. For example, the Duden dictionaries contain some recent updates of entries that appear to be beholden to the evidence of the corpus of the Institut für deutsche Sprache, but a systematic new corpus-based analysis of the German lexicon has still to appear. Ironically, it is dictionaries of languages with smaller populations, such as Irish Gaelic and Korean, that seem to be taking the lead in exploiting the new opportunities afforded by corpora.

Online Lexical Resources

The Internet is tailor-made for lexicographical projects. The constraints of the printed page (two-dimensional, limited space, printed in expensive multivolume format, etc.) do not apply, although this opens the door to possible faults, such as prolixity and excessive ingenuity. At the present time, however, these faults are not a major preoccupation. Online dictionaries have not yet taken full advantage in any language of the potential of computer technology. This may in part be due to the great expense of setting up something such as a hypertext dictionary for a whole language and the lack of a clear business model. The economics of dictionary publishing in book form, though sometimes nerve-wracking, have the merit of being well understood. The economics of online publishing are as yet somewhat unstable: they have not yet had a chance to stand the test of time. A well-known and universally admired traditional project such as the OED succeeded in attracting online subscriptions from major institutional libraries to help finance the preparation of its third edition, but such funding is less likely to be forthcoming for a more radical, innovative project. The situation is further complicated by the fact that the funding agencies that can normally be expected to fund radical and innovative research projects are shy of infrastructure projects such as dictionaries and even more reluctant to fund new editions of existing resources. Major academic research projects on the lexicon in recent years, such as WordNet and FrameNet, point the way to what is possible, but they are only a beginning. A full multimedia account of the lexicon of any language in hypertext form is at present no more than a pipe dream.

Information Sources, Conferences, and Associations

The leading journals for lexicography are The International Journal of Lexicography, published quarterly, and Dictionaries: The Journal of the Dictionary Society of North America, published annually. The former contains articles not only in English but also in French and German. It takes a global view of lexicography, with articles on dictionaries of all languages, although the chief focus is, not surprisingly, on dictionaries of the major European languages – monolingual and bilingual, synchronic and diachronic, pedagogical and encyclopedic.

Conferences and workshops on lexicography are regularly organized by lexicography associations in every continent in the world. Chief among these are Euralex (biennial conferences since 1983 plus other workshops), Australex (biennial conferences since 1990), Afrilex (annual conferences since 1996 plus other workshops; see African Lexicography), Asialex (biennial conferences since 1997), and the Dictionary Society of North America (annual conferences since 1975). A regular conference on computational lexicography, Complex, has been hosted each year since 1992 by the Linguistics Institute of the Hungarian Academy of Sciences in Budapest. A seminal conference was held as long ago as June 1972 under the auspices of the New York Academy of Sciences, the proceedings of which appeared in 1973, edited by McDavid and Duckert and including papers by Bolinger, Zgusta, McIntosh, Quirk, Burchfield, Lakoff, McCawley, Malkiel, Quine, Urdang, and others. Although the title of the collection is Lexicography in English, some of the papers touch on issues that are of general relevance to lexicography in all languages.

Reliable and informative discussions of lexicography as a discipline include those by Landau (2001), Jackson (2002), and Svensén (1993), and the papers collected by Hartmann (2003). An important reference work is a massive three-volume collection of essays on all major aspects of lexicography, edited by Hausmann et al. (1989–1991). A survey of the terminology of the field is Hartmann and James (1998). Other important contributions were made by Zgusta (1971, 1988).

See also: Academies: Dictionaries and Standards; African Lexicography; American Lexicography; Bilingual Lexicography; Chinese Lexicography; Computational Lexicons and Dictionaries; Computers in Lexicography; Corpus Linguistics; Corpus Lexicography; Definition; Dictionaries and Encyclopedias: Relationship; Dictionaries; Dutch and Flemish Lexicography; English Lexicography; Etymology; French Lexicography; German Lexicography; Greek Lexicography, Classical; Idiom Dictionaries; International Phonetic Association; Jones, William, Sir (1746–1794); Learners’ Dictionaries; Lexicology; Middle English Dictionaries; Old English Dictionaries; Old French Dictionaries; Phonetic Transcription: Analysis; Register: Dictionaries; Slang Dictionaries, English; Terminology and Terminological Databases; Thesauruses; also articles on lexicography in particular languages, under the name of the relevant language.

Bibliography

Burchfield R (1989). Unlocking the English language. London: Faber.
Church K W & Mercer R L (eds.) (1993). Computational Linguistics 19(1). Special Issue on Using Large Corpora.
Dictionaries: Journal of the Dictionary Society of North America (1975–).
Grefenstette G & Kilgarriff A (eds.) (2003). Computational Linguistics 29(3). Special Issue on ‘Web as Corpus’.
Hanks P. ‘Lexicography.’ In Mitkov R (ed.) The Oxford handbook of computational linguistics. Oxford, New York: Oxford University Press. 48–69.
Hartmann R R K (ed.) (2003). Lexicography: critical concepts (3 vols.). London: Routledge.
Hartmann R R K & James G (1998). Dictionary of lexicography. London and New York: Routledge.
Hausmann F J, Reichmann O, Wiegand H E & Zgusta L (eds.) (1989–1991). Wörterbücher, dictionaries, dictionnaires: an international handbook of lexicography (3 vols.). Berlin: de Gruyter.
International Journal of Lexicography (1988–).
Jackson H (2002). Lexicography: an introduction. London and New York: Routledge.
Johnson S (1755). Dictionary of the English language.
Landau S I (2001). Dictionaries: the art and craft of lexicography (2nd edn.). New York: Scribners.
McDavid R I & Duckert A (eds.) (1973). Lexicography in English. New York: Annals of the New York Academy of Sciences.
Simpson J (1993). ‘Making dictionaries.’ In Walsh M & Yallop C (eds.) Language and culture in Aboriginal Australia. Canberra: Aboriginal Studies Press. 123–144.
Sinclair J (1987). Looking up: an account of the Cobuild project in lexical computing. London and Glasgow: Harper Collins.
Svensén B (1993). Practical lexicography: principles and methods of dictionary-making. Sykes J & Schofield K (trans.). Oxford, New York: Oxford University Press.
Trench R C (1858). ‘On some deficiencies in our English dictionaries.’ In Proceedings of the Philological Society.
Wilks Y, Slator B M & Guthrie L (1996). Electric words: dictionaries, computers, and meaning. Cambridge, MA: Bradford Books (MIT Press).
Zgusta L (1971). Manual of lexicography. The Hague: Mouton.
Zgusta L (1988). Lexicography today: an annotated bibliography of the theory of lexicography. Tübingen: Niemeyer.

Relevant Websites

http://www.indo-european.nl/ – Website of the Indo-European Etymological Dictionary, a research project at the University of Leiden.
http://www.sil.org/computing/toolbox/ – The Field Linguist’s Toolbox, produced by SIL International.

Lexicology
A Cowie, University of Leeds, Leeds, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction: The Scope of Lexicology

Lexicology is ‘the study of the lexicon or lexis (specified as the vocabulary or total stock of words of a language)’ (Lipka, 1992: 1). For English-speaking linguists familiar with teaching and research in the subject in France and Germany, there is a ‘classical’ as well as an ‘evolved’ view of what is meant by the term. The classical view is brought home by examining the contents of standard textbooks written in Germany for university students of English. The title of one of the best known is Englische Lexikologie, with the subtitle Einführung in Wortbildung und lexikalische Semantik (‘Introduction to word formation and lexical semantics’), which highlights its central elements (Hansen et al., 1992). And the handbook does indeed deal in successive major sections with word formation (broken down into ‘compounding,’ ‘derivation’ [including ‘zero-derivation’], ‘reduplication,’ and ‘blends’) and lexical semantics, in which one major subsection deals with ‘relations within words’ (including ‘homonymy’ and ‘polysemy,’ and ‘metonymy’ and ‘metaphor’) and another with ‘paradigmatic semantic relations between words’ (including ‘antonymy’ and ‘hyponymy’), which are in Britain often referred to as sense relations (Lyons, 1977). Matching for the most part this view of what a lexicology textbook should contain is Tournier’s broad, masterly work (Tournier, 1985), with major chapters devoted to derivation, compounding, and conversion (and thus to word formation), and also to metasémie (or meaning change), which features several of the themes that, in the German work, are subsumed under lexical semantics. There are of course differences of detail between the two works.

Interestingly, there are no parallels in Britain for textbooks dealing with the lexicon in quite the way I have just described. Further, the term ‘lexicology’ has not until comparatively recently been used in linguistics textbooks published in Britain or defined in many ‘desk-sized’ dictionaries (not, for example, in the Concise Oxford dictionary until the eighth edition of 1990). Moreover, in British practice the two main topics that are given prominence by both Tournier and Hansen et al. are hived off into separate volumes: word formation to Bauer (1983), lexical semantics to Cruse (1986). However, in many areas where British lexicologists work in close collaboration with colleagues from various European countries, the term of choice is almost universally ‘lexicology’ (rather than ‘lexical semantics’). In the proceedings of recent congresses of the European Association for Lexicography, for instance, we find section headings such as ‘Computational Lexicography and Lexicology’ and ‘Reports on Lexicographical and Lexicological Projects’ where the coupling of related areas of professional concern is underpinned by the formal relatedness of the terms (Braasch and Povlsen, 2002a). There are of course other linkages of form and meaning that help to explain a growing preference. In addition to the connection to lexicography, lexicology is related in form to a number of terms (phonology, morphology, phraseology) denoting other levels of linguistic analysis. Then again, the term has been broader in its application than lexical semantics, having, with


suitable modification, referred to diachronic lexical studies (hence ‘historical lexicology’; cf. Ullmann, 1957: 39) and contrastive lexical studies (hence ‘contrastive lexicology’; cf. Van Roey, 1990). A number of major topics from outside the traditional core are currently subsumed under the heading ‘lexicology.’ This brings me to my reference earlier to an ‘evolved’ view. One cannot avoid noticing a greater breadth of coverage in the latest edition of what is probably the best-known German textbook (Lipka, 2002). Not only do we find the topics treated that we have come to take for granted in such a text – homonymy vs. polysemy, lexical fields and hierarchies, and lexical rules and semantic processes – but there are other themes that fall outside its traditional limits, including corpus linguistics and cognitive linguistics – both to be considered below – while there is extensive discussion in Lipka’s Introduction of ‘the expanding field.’ This article attempts to reflect this greater diversity of themes. Polysemy and homonymy in the first section belong to the traditional core of the subject (Cowie, 1994). The second section is devoted to cognitive approaches to the analysis of metaphor based on the work of Lakoff and to a project in the ‘syntagmatics of metaphor’ that draws extensively on very large text corpora. The final section is taken up with an approach to lexical description – ‘frame semantics’ – that brings together syntactic and semantic levels of description in a rigorous and systematic way.

Polysemy and Homonymy

Though polysemy and homonymy are long-established concepts in lexicology, they continue to excite discussion among linguists (e.g., Cowie, 2001; Ravin and Leacock, 2000). The concepts themselves and the relationship between them are of crucial importance to lexicographers, since they require sound criteria to determine whether different occurrences of a given word represent different senses and, if different, how different. As we shall see, approaches to meeting this need which lay emphasis on the contrastive lexical environments of such occurrences (their differing collocability) have much to offer, especially as evidence is now available from immensely large text corpora (Hanks, 2004a). Polysemy and homonymy can be defined in general terms as follows. When a given word form (i.e., in the written language, any sequence of letters bounded on either side by a space) realizes two or more related though separate meanings of the same lexical item, we have polysemy. (Compare: thick (a) ‘having density’; thick (b) ‘with a large number of units close together’.)

When, however, the word form functions as the realization in speech or writing of more than one lexical item, we have homonymy. (Compare: meal1 ‘flour’; meal2 ‘repast’.) Though the phenomena are often linked, polysemy is more widespread than homonymy, and of much greater significance. Homonymy may come about through the chance convergence of two distinct forms (meal1 coming from Old English melo and meal2 from Old English mæl). Polysemy, however, is typically the result of lexical creativity and is crucial for the functioning of a language as an efficient semiotic system (Lyons, 1977: 567; Lipka, 1992: 136). It is also true that the distinction between homonymy and polysemy is not the simple dichotomy that is sometimes believed, but is in the nature of a scale or cline. This becomes unarguable once it is accepted that every case of polysemy involves relatedness as well as difference of meaning (Cowie, 2001). As recent studies have shown, however, there is still a tendency to emphasize the latter at the expense of the former (Ravin and Leacock, 2000b). Three major issues need to be addressed in treating this opposition. First, we need to take account of the phenomenon of ‘modulation,’ or ‘contextual variation,’ of a single sense (Cruse, 1986). It is sometimes claimed, rather loosely, that the sense of a (polysemous) item is determined by its context. Such claims, however, fail to distinguish between the unrestricted, and often ephemeral, ‘shaping’ of a word by its context and the more familiar variation attributable to context, which is the activation by different contexts of existing and possibly well-established senses of polysemous words. An example of a meaning difference attributable to modulation – but where there is only one sense – appears below (cf. Moon, 1987): He took the cigarette out of his mouth [the opening]. She had a wide and smiling mouth [the lips].

The second issue that needs to be considered is whether, when attempting to identify polysemy or homonymy, we need to take account of more than one criterion (for checklists, see Lipka, 1986; Cowie, 2001). A further, related, issue is whether we should give prominence to semantic rather than formal criteria. To begin with the question of multiple criteria, it seems vital, given the nature of polysemy as a scalar phenomenon, to refer in each case to two or more criteria. If one takes two or more occurrences of the same word form, and finds that they have similar distributions at a number of descriptive levels – syntactic, morphological, collocational – then the assumption is that they will be close in meaning and


at the most remote point from homonymy. Consider, with this in mind, the following senses of the noun tour, noting especially the possible conversion of the noun – in one or more senses – to a verb, and any other characteristic derivatives (Cowie, 2001: 46).

a. tour (holidays)             tour (v), tourist, tourism
b. tour (visit, inspection)    tour (v)
c. tour (military)             tour (v)
d. tour (artistic)             tour (v)
e. tour (sporting)             tour (v), tourist

We can see that in terms of conversion (‘zero derivation’) the various senses are alike, while as regards derivation proper, only the first stands out from the others (note also the compounds package tour and tour operator). By contrast, near-homonymy will be indicated when, for a given set of criteria, one finds wide distributional differences. Such is the case with three senses of the adjective sheer, where the synonyms are sharply distinct and the pattern with sheerness is only fully acceptable in the first sense:

sheer_a (‘very steep’) = perpendicular (cf. the sheerness of the rock face)
sheer_b (‘absolute’) = unmitigated (cf. *the sheerness of his folly)
sheer_c (‘so fine as to be transparent’) = diaphanous (cf. ?the sheerness of her tights)
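The multiple-criteria comparison illustrated by tour and sheer can be sketched in a few lines of Python. This is only an illustration: the property labels, the set encoding, and the use of a Jaccard overlap score are assumptions made for the sketch and are not part of the checklists cited above. The marginal ‘?sheerness’ judgment for the third sense of sheer is simply treated as unacceptable here.

```python
from itertools import combinations

# Each sense is encoded as the set of formal properties it shows.
# The labels are hypothetical encodings of the tour and sheer examples.
TOUR = {
    "holidays": {"verb:tour", "deriv:tourist", "deriv:tourism"},
    "visit": {"verb:tour"},
    "military": {"verb:tour"},
    "artistic": {"verb:tour"},
    "sporting": {"verb:tour", "deriv:tourist"},
}
SHEER = {
    "very steep": {"syn:perpendicular", "deriv:sheerness"},
    "absolute": {"syn:unmitigated"},
    "transparent": {"syn:diaphanous"},  # '?sheerness' counted as absent
}


def overlap(a, b):
    """Jaccard overlap of two property sets (1.0 means identical)."""
    return len(a & b) / len(a | b)


def mean_overlap(senses):
    """Average pairwise overlap across all senses of one word form."""
    pairs = list(combinations(senses.values(), 2))
    return sum(overlap(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print("tour:", round(mean_overlap(TOUR), 3))
    print("sheer:", round(mean_overlap(SHEER), 3))
```

On this toy encoding the senses of tour share much of their distribution, while the senses of sheer share none, which is one way of picturing the cline from close polysemy towards near-homonymy.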

As for giving priority to semantic criteria, we should take account of the view of Cruse (1986, 2000) on how polysemy should be defined, while noting its limitations. Cruse favors semantic (or ‘direct’) tests, which he describes as more successful and reliable for diagnostic purposes. One such test depends on the fact that separate senses of an item cannot be brought into play in the same sentence without oddness. Consider for example this sentence, where activating two senses of expired at the same time gives rise to zeugma: ?John and his driving license expired last Tuesday. Nevertheless, the meanings of different lexical items differ quite widely as to the degree of oddness that is revealed when they are brought into play. In this sentence, for instance, where senses of tour are (again) involved, no oddness arises: The England cricketers and the Royal Ballet were both touring South Africa. Yet this is only to be expected if the senses of a polysemous word are – in differing degrees – related as well as separate. More positively, one could conclude that, while no better in this respect than grammatical tests, semantic tests do provide evidence of the scalar nature of polysemy.

Metaphor and the Differentiation of Meanings

As many entries in ordinary dictionaries confirm, metaphor is one of the commonest means by which new meanings develop from existing senses. The prevalence of metaphor more generally, as an integral part of the structure of the lexicon, has been highlighted in the field of cognitive semantics and in particular in the work of Lakoff and Johnson (1980) and Lakoff (1987). Lakoff and Johnson argued that we can point to a number of very general concepts whose metaphorical structuring is reflected in the phrasal lexicon of the language (1980: 52), and they gave weight to their hypothesis by showing that particular metaphors, such as ‘ideas are cutting instruments,’ are manifested by a cluster of specific phrases:

That’s an incisive idea. That cuts right to the heart of the matter . . . He has a razor wit. He had a keen mind. (Lakoff and Johnson, 1980: 48).

It is worth noting that elsewhere the authors pointed out that a particular set of phrases ‘structured by a single metaphorical concept’ (‘life is a gambling game’) are ‘phrasal lexical items’ (or, to use a common equivalent, multiword units). Such examples reinforce the point made earlier, that metaphorical concepts are characteristically realized by expressions from the phrasal lexicon (Lakoff and Johnson, 1980: 51). The link between Lakoff and Johnson’s hypothesis and phraseology has been seized on by a number of linguists. Referring to a collocational database enriched by having had ‘lexical functions’ assigned to each of the 70 000 pairs of collocations it contains, Fontenelle noticed that other metaphors are commonly used in the language (Fontenelle, 1994: 275; and for lexical functions cf. Mel’chuk and Zholkovsky, 1984; Fontenelle, 1997). He considered the following examples from the database (introduced by the function ‘Mult,’ expressing ‘a group or set of something’):

Mult (arrow) = cloud, rain, shower, storm
Mult (bullet) = rain
Mult (missile) = storm
Mult (stone) = shower

Arrow, bullet and so on are projectiles aimed at a target. But as Fontenelle explained, these terms are used in collocation with words pertaining to meteorological phenomena, enabling us to posit the existence of the metaphor ‘a projectile is a meteorological phenomenon.’
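The kind of query that brings such a metaphor to light can be sketched as follows. This is a minimal illustration in Python, not a description of Fontenelle's actual database or its query language: the dictionary layout, the hand-made list of meteorological nouns, and the function name are all assumptions, and the assignment of collocates to keywords follows one reading of the Mult examples above.

```python
# Toy collocational database for one lexical function, Mult:
# keyword -> collocates naming 'a group or set of' that keyword.
MULT = {
    "arrow": {"cloud", "rain", "shower", "storm"},
    "bullet": {"rain"},
    "missile": {"storm"},
    "stone": {"shower"},
}

# Hypothetical hand-made semantic class (an assumption of this sketch).
METEOROLOGICAL = {"cloud", "hail", "rain", "shower", "storm"}


def metaphor_candidates(db, domain):
    """Return the keywords whose Mult collocates all belong to `domain`."""
    return sorted(k for k, vals in db.items() if vals and vals <= domain)


if __name__ == "__main__":
    # Every projectile noun patterns with weather nouns, which is the
    # regularity behind 'a projectile is a meteorological phenomenon'.
    print(metaphor_candidates(MULT, METEOROLOGICAL))
```

Grouping keywords by the semantic class of their collocates, rather than inspecting pairs one at a time, is what allows a regularity of this kind to surface from a large collocational resource.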


The approach to the definition of metaphor followed by Patrick Hanks must be viewed as part of his much broader theory of norms and exploitations, developed as a response to the pervasiveness in language use of what he has called “stereotypical syntagmatic patterns” (Hanks, 2004a). What the term points to is repetitive use (abundantly illustrated by Hanks), relative fixity, and preference for the form in question over plausible alternatives (well illustrated by native speakers preferring a storm of protest to a torrent of protest). Such examples as a storm of protest and a torrent of abuse represent the ‘norm.’ As Hanks explains, ‘some uses are stereotypical; others exploit stereotypes, typically for rhetorical effect’ (2004b: 246), and he goes on to develop the point by suggesting that whereas stereotyped phrases and idioms are ‘norms,’ innovative adaptations – often based on existing phrases, as in the vultures are circling vs. the lawyers are circling – are ‘exploitations.’ Clearly, dead metaphors will figure in the first category and freshly minted metaphors in the second.

The treatment of metaphor provided by Hanks is detailed and often illuminating. It is clear, first, that we must recognize ‘degrees’ of metaphoricity, in the sense that neither, one, or both of the constituents of a phrase may be metaphorical:

a. The lowest point on the scale is represented by storms blow, storms abate, where there are arguably two literal elements and ‘a reductionist interpretation is appropriate’ (Hanks, 2004b: 256).
b. We come next to cases where storm is literal but the associated verb is figurative: storms brew and rage; storms batter, lash, and ravage. Such combinations are so common that ‘it is easy to overlook the metaphorical status of the verb’ (Hanks, 2004b: 256). It is worth noting, too, that in treatments influenced by ‘classical’ Russian theory, storm would be said to ‘shape’ the figurative senses of batter, lash, etc. (cf. Cowie, 1999).
c. In the third type, storm is metaphorical, as in a political storm, a storm of protest.
d. At the final stage we find idioms, such as a storm in a teacup.

There is a particularly interesting comment on storm (i.e., as a conventional metaphor) as the direct object of a causative verb. The causative may be literal, as witness cause, create and possibly raise. But ‘storm in this sense is found as the direct object of both literal and metaphorical causative verbs’ (Hanks, 2004b: 259). In the latter case, we have mixed metaphors: spark a storm (of protest, etc.), whip up a storm. The mixed metaphor effect is perhaps caused by spark (say), which acts originally as a collocate of explosion, later carrying over the appropriate features to storm (for ‘oblique’ metaphors, see Cowie, 2004). Finally, and still on the subject of conventional metaphors, Hanks made two observations that only access to extensive data could make possible. In causative brew up a storm, whip up a storm, etc., the noun is almost certainly metaphorical. In inchoative a storm is brewing up, by contrast, the noun could be either. Other contrastive comments are prompted by a storm of something, where the final noun is almost always a conventional metaphor and negative: a storm of protest, a storm of criticism, etc. Less common are positive reactions, such as a storm of applause.

Frame Semantics

Considerable interest is currently being shown by lexicologists and specialists in lexicography and natural language processing in frame semantics, an approach to the study of lexical meaning based on work by Charles Fillmore and his associates. A particular strength of the theory is that it brings together syntactic and semantic levels of analysis in a rigorous and systematic way. But of particular significance is the contribution that the theory makes to the analysis of words, and thus to lexicography. As Atkins, Fillmore and Johnson pointed out, experienced lexicographers will use various clues to identify differences in the meaning of a word when examining citations in which it appears. In the case of the verb argue, for example, the indicators of difference will include different synonyms (quarrel in one case, reason in another) and contrastive sets of prepositions: ‘in the quarrelling sense, you argue with someone, about something, while in the reasoning sense you argue for or against something, or that something should be done’ (Atkins et al., 2003: 253). By making implicit use, as here, of various kinds of linguistic information, the practical lexicographer can make considerable progress towards identifying or separating senses. However, if recognized explicitly, the information can lead to deeper understanding. One kind of information has to do with the syntactic contexts in which argue occurs; the other concerns the meanings of the prepositions (with, for, and so on) which function as its complements. Beyond that awareness is the recognition that we need a theory that links the meanings of words very explicitly to the syntactic contexts in which those words are used, and to the semantic properties of those contexts. In frame semantics, this requires a further step, which is to recognize that ‘word meanings must be described in


relation to semantic frames – schematic representations of the conceptual structures and patterns of beliefs, practices, institutions, images, etc. that provide a foundation for meaningful interaction in a speech community’ (Fillmore et al., 2003: 235). Those working on semantic frames have an associated computational lexicography project, FrameNet, which extracts information about the related semantic and syntactic properties of English words from large text corpora, and analyzes the meanings of words ‘‘by directly appealing to the frames that underlie their meanings and studying the syntactic properties of words by asking how their semantic properties are given syntactic form’’ (Fillmore et al., 2003: 235). In this way are the grammatical and semantic features of argue, presented informally earlier on, given more rigorous and systematic shape. The notion of valence – at the grammatical level, the requirement that a word combines with particular phrases in a sentence – plays an important part in the theory. However, in FrameNet, information about valence must be stated in semantic as well as syntactic terms. That is, the semantic roles that syntactic elements such as subject or object play with reference to the meaning of the word must be fully accounted for. In FrameNet, ‘the semantic valence properties of a word are expressed in terms of the kinds of entities that can participate in frames of the kind evoked by the word’ (Fillmore et al., 2003: 237). These roles are called frame elements (FEs). FEs must be taken to include not only the human participants – aggressors, victims, etc. – but conditions and objects relevant to the central concepts of the frame. In the semantics of the key term risk – the subject of a detailed study by Fillmore and Atkins (1992) – such elements include uncertainty about the future (the frame element ‘chance’) and a potential unwelcome development (the frame element ‘harm’). 
According to the study, these two FEs alone make up the core of our understanding of several other items within the field, including peril, danger, venture, and hazard. Semantic frame theory offers a number of opportunities to lexicologists and lexicographers in treating the vocabulary of risk and other semantic domains (Cowie, 2002). First is the fact that the model brings together frame elements and syntactic functions, so that the one-to-two (or one-to-many) relations that often hold between FEs and lexicosyntactic structures are clearly demonstrated. To illustrate this point, we can see, just below, that the frame element ‘aggressor’ is common to both examples. This perception leads us to bring together – as semantically related – two contrasting structures incorporating the noun threat, in which ‘aggressor’ is

realized first as a grammatical subject and second as a prepositional phrase introduced by from (Cowie, 2002: 327):

1. AGGRESSOR {the dolphins} were a threat VALUED OBJECT {to the local fishing industry}
2. an imagined threat AGGRESSOR {from the few remaining ex-revolutionaries}
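The one-to-many relation between a frame element and its syntactic realizations can be made concrete with a small data structure. The Python sketch below is illustrative only: the class name, the function labels such as "PP[from]", and the helper function are assumptions made here and do not reproduce FrameNet's own annotation format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Realization:
    frame_element: str  # semantic role, e.g. "AGGRESSOR"
    function: str       # grammatical realization (labels are assumptions)
    text: str           # the annotated stretch of the example


# The two 'threat' examples, annotated by hand for this sketch.
EXAMPLES = {
    "the dolphins were a threat to the local fishing industry": [
        Realization("AGGRESSOR", "subject", "the dolphins"),
        Realization("VALUED OBJECT", "PP[to]",
                    "to the local fishing industry"),
    ],
    "an imagined threat from the few remaining ex-revolutionaries": [
        Realization("AGGRESSOR", "PP[from]",
                    "from the few remaining ex-revolutionaries"),
    ],
}


def realizations_of(frame_element, examples):
    """Collect the distinct grammatical functions realizing one FE."""
    return sorted({r.function
                   for annotations in examples.values()
                   for r in annotations
                   if r.frame_element == frame_element})


if __name__ == "__main__":
    print(realizations_of("AGGRESSOR", EXAMPLES))
```

Indexing the annotations by frame element rather than by syntactic pattern is what allows the two structurally different examples to be retrieved together as semantically related.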

A further way in which frame semantics can benefit lexical description is that it enables us to take a set of supposed synonyms and to use the theory to elucidate crucial differences of meaning and distribution – and incidentally show up the inadequacies of definitions in several current dictionaries. This has been demonstrated in Atkins’s analysis of the verbs of seeing (1994). Comparing approaches to defining the verbs behold, descry, notice, spot, and spy in two mother tongue dictionaries, she notes how the verbs are defined in terms of each other. She is also aware that recourse to corpus data alone, rich though it may be, does not enable one to identify the tiny shifts of sense which divide one verb from another (Cowie, 1999). Atkins’s approach, in the case of the verbs see, behold, catch sight of, spot, spy is to recognize a ‘perception frame,’ whose chief elements include a ‘passive’ rather than an ‘active experiencer’ (since seeing, unlike looking, does not include an element of volition), a ‘percept’ (or object perceived) and a ‘judgment,’ which refers to the opinion which the experiencer forms of the percept as a result of seeing, etc. In the first example below, the judgment being made is comparative, while in the second example it is in the nature of an inference:

He looked to me like a yellow budgerigar. (judgment–simile)
Peter looks relaxed. (judgment–inference)

As we can see from these examples, the relationship between semantic frame elements and grammatical functions is often far from straightforward. In the first example above, the experiencer is realized as the prepositional phrase to me; in a sentence such as Mary saw the duck, the experiencer is Mary, the subject of the perception verb see (Atkins, 1994: 5). The work described briefly here forms part of a continuing descriptive project, of considerable scope and ambition, whose results are already proving invaluable to lexicographers and computational linguists. See also: Collocations; Corpus Approaches to Idiom;

Fillmore, Charles J. (b. 1929); Frame Semantics; Lexicon Grammars; Mel’čuk, Igor Aleksandrovic (b. 1932); Phraseology; Russian Lexicography.

Bibliography

Allen R E (ed.) (1990). The concise Oxford dictionary of current English (8th edn.). Oxford: Clarendon Press.
Atkins B T S (1994). ‘Analyzing the verbs of seeing: a frame semantics approach to corpus lexicography.’ In Gahl S, Johnson C & Dolbey A (eds.) Proceedings of the twentieth annual meeting of the Berkeley Linguistics Society. Berkeley: University of California at Berkeley. 1–17.
Atkins B T S, Fillmore C J & Johnson C R (2003). ‘Lexicographic relevance: selecting information from corpus evidence.’ International Journal of Lexicography 16(3), 251–280.
Bauer L (1983). English word-formation. Cambridge: Cambridge University Press.
Braasch A & Povlsen C (2002a). ‘Introduction.’ In Braasch & Povlsen (eds.). 1.
Braasch A & Povlsen C (eds.) (2002b). Proceedings of the Tenth Euralex International Conference, Euralex. Copenhagen: Center for Sprogteknologi.
Cowie A P (1994). ‘Applied linguistics: lexicology.’ In Asher R E (ed.) The encyclopedia of language and linguistics 1. Oxford and New York: Pergamon Press. 177–180.
Cowie A P (1999). English dictionaries for foreign learners: a history. Oxford: Clarendon Press.
Cowie A P (2001). ‘Homonymy, polysemy and the monolingual English dictionary.’ Lexicographica 17, 40–60.
Cowie A P (2002). ‘Harmonising the vocabulary of risk.’ In Braasch & Povlsen (eds.). 325–330.
Cowie A P (2004). ‘Oblique metaphors and restricted collocations.’ In Palm-Meister C (ed.) Europhras 2000: internationale Tagung zur Phraseologie vom 15.–18. Juni 2000 in Aske/Schweden. Berlin: Stauffenburg. 45–50.
Cruse D A (1986). Lexical semantics. Cambridge: Cambridge University Press.
Cruse D A (2000). ‘Aspects of the micro-structure of word meanings.’ In Ravin & Leacock (eds.). 30–51.
Fillmore C J & Atkins B T S (1992). ‘Towards a frame-based lexicon: the semantics of RISK and its neighbors.’ In Lehrer A & Kittay E (eds.) Frames, fields, and contrasts. Hillsdale, NJ: Lawrence Erlbaum Associates. 75–102.
Fillmore C J, Johnson C R & Petruck M R L (2003). ‘Background to FrameNet.’ International Journal of Lexicography 16(3), 235–250.
Fontenelle T (1994). ‘Using lexical functions to discover metaphors.’ In Martin W, Meijs W, Moerland M, ten Pas E, van Sterkenburg P & Vossen P (eds.) Euralex 1994 proceedings. Amsterdam: Vrije Universiteit Amsterdam. 271–278.
Fontenelle T (1997). Turning a bilingual dictionary into a lexical-semantic database. Tübingen: Max Niemeyer.
Hanks P (2004a). ‘Corpus pattern analysis.’ In Williams G & Vessier S (eds.) Euralex International Congress: proceedings (3 vols). Lorient, France: Université de Bretagne-Sud. 87–97.
Hanks P (2004b). ‘The syntagmatics of metaphor and idiom.’ International Journal of Lexicography 17(3), 245–274.
Hansen B, Hansen K & Neubert A (1992). Englische Lexikologie (2nd edn.). Berlin: Langenscheidt.
Lakoff G (1987). Women, fire, and dangerous things: what categories reveal about the mind. Chicago/London: University of Chicago Press.
Lakoff G & Johnson M (1980). Metaphors we live by. Chicago/London: University of Chicago Press.
Lipka L (1986). ‘Homonymie, Polysemie oder Ableitung im heutigen Englisch.’ Zeitschrift für Anglistik und Amerikanistik 34, 128–138.
Lipka L (1992). An outline of English lexicology (2nd edn.). Tübingen: Max Niemeyer.
Lipka L (2002). English lexicology: lexical structure, word semantics and word-formation (3rd edn.). Tübingen: Gunter Narr.
Lyons J (1977). Semantics (2 vols). Cambridge: Cambridge University Press.
Mel’chuk I & Zholkovsky A (1984). Tolkovo-kombinatornyi slovar’ russkogo iazyka / Explanatory and combinatorial dictionary of modern Russian. Vienna: Wiener Slawistischer Almanach.
Moon R (1987). ‘Monosemous words and the dictionary.’ In Cowie A P (ed.) The dictionary and the language learner. Tübingen: Max Niemeyer. 173–182.
Ravin Y & Leacock C (2000a). ‘Polysemy: an overview.’ In Ravin & Leacock (eds.). 1–29.
Ravin Y & Leacock C (eds.) (2000b). Polysemy: theoretical and computational approaches. Oxford: Oxford University Press.
Tournier J (1985). Introduction descriptive à la lexicogénétique de l’anglais contemporain. Paris: Champion/Geneva: Slatkine.
Ullmann S (1957). Principles of semantics (2nd edn.). Oxford: Blackwell.
Van Roey J (1990). French-English lexicology: an introduction. Louvain-la-Neuve, Belgium: Peeters.


Lexicon Grammars

A Geyken, Berlin-Brandenburg Academy of Sciences, Berlin, Germany
© 2006 Elsevier Ltd. All rights reserved.

History

The term ‘lexique-grammaire’ (lexicon-grammar) was introduced by Maurice Gross in 1984, but its basic principles had already been established in the 1950s by Zellig Harris. However, despite the linguistic insights revealed by his transformational methods, Harris never applied them to more than a handful of examples. The large-scale implementation of Harris’s ideas was the achievement of Gross, his student at the University of Pennsylvania. After returning to Paris in the late 1960s, Gross founded a research laboratory, LADL (Laboratoire d’Automatique Documentaire et Linguistique), where he and his team began to build a very large empirically based inventory of a lexicon-grammar of French. In 1975, Gross published Méthodes en syntaxe, which lays the foundations for the methodology of lexicon-grammars. Méthodes also contains a complete and systematic description of French verbs that select for sentential complements. Two additional publications (Boons et al., 1976; Guillet and Leclère, 1992) describe simple verbs without sentential complements. A large class of verb–complement combinations could not be characterized in terms of selectional restrictions. They are described by M. Gross (1981). Several lexicon-grammars of support verb constructions followed: Vivès (1983), Giry-Schneider (1987), and G. Gross (1988). Another focus of intense research was fixed expressions (M. Gross, 1982). Finally, lexicon-grammars have also been applied to other lexical forms: adjectives (Meunier, 1981), adverbs (M. Gross, 1990), and compound nouns (G. Gross, 1990). Lexicon-grammars have been applied to the study of languages as diverse as Arabic, Chinese, English, French, German, Greek, Italian, Japanese, and Korean. Since 1981, RELEX, the international network of research on lexicon-grammars, has organized annual conferences devoted to lexis and grammar.
At the very basis of lexicon-grammars are the morphological DELA lexicons (DELA is an acronym for Electronic Dictionary of the LADL), which serve as index keys to the lexicon-grammars. DELA consists of several sublexicons: a lexicon of simple forms, containing 90 000 lemmas with complete morphological information, a lexicon containing phonemic information, and a lexicon of compound nouns with about 100 000 entries. The DELA lexicon format has

been applied to English, Italian, German, Greek, Norwegian, Portuguese, and Spanish. Maurice Gross’s death slowed the momentum of lexicon-grammar development. A special issue of Linguisticae Investigationes in his honor, with more than 50 contributions, shows, however, that research in lexicon-grammar is still active (Leclère et al., 2004).

Goals and Assumptions of Lexicon-Grammar

The goal of lexicon-grammar is to give a precise and complete description of the senses and usages of lexical units in elementary sentences on the basis of distributional and transformational methods (M. Gross, 1990: 39). The lexicon-grammar approach is built upon the following important assumptions:

• The minimal context of linguistic description is the elementary sentence, i.e., a verb with only its essential complements (or strictly subcategorized complements in the terminology of generative grammar), and not the word. As a consequence, lexicon-grammars make no strict separation between lexicon and grammar.
• The transformational description in lexicon-grammar is based on Z. H. Harris’s theoretical framework. Since Harris’s transformations preserve the meaning of elementary sentences, lexicon-grammar describes meanings and not merely constructions.
• The approach is empirical since the syntactic behavior of lexical units is believed to be too idiosyncratic to be described with general rules. Rather than proceeding from a theory, Maurice Gross insists on a systematic description of the syntactic behavior of all or at least a large number of lexical elements of a language before deriving general rules (M. Gross, 1975: 46).

The Elementary Sentence

An elementary sentence consists of a predicative lexical unit and its essential complements. In accordance with valence theory, essential complements cannot be added freely to predicative lexical forms. In the case of verbs, the elementary sentences are of the form P := N0 V W, where N0 is the subject, V the verb, and W an arbitrary sequence of verb complements. Lexicon-grammar claims that the form of an elementary sentence cannot be determined a priori but has to be discovered empirically for each lexical unit “since tests for characterizing essential complements are highly lexical and tend to apply more to individual verbs or small groups rather than to broad classes” (M. Gross, 1996). The main principle of lexicon-grammar is that the elementary sentence is taken as the minimal unit for the study of meaning and syntax, for several reasons. First, sentential idioms such as Max took the bull by the horns constitute elementary units of meaning. Likewise, noncompositional expressions such as compound nouns show that words are not always the minimal unit of meaning. Further, predicative lexical elements can only be fully described in their elementary sentence context, which is more than a description of verb–argument structure. Indeed, a fine-grained analysis reveals that more syntactic properties than is commonly assumed depend on the verb. For example, determiners or adjectives are mostly described as being constrained by their noun. However, in (1) and (2) they are shown to be verb-specific (M. Gross, 1993):

(1) Bob wants some beer
    *Bob loves some beer (quantifier reading)
(2) Bob is building a future mansion = Bob is building a mansion
    Bob is eating a future cake ≠ Bob is eating a cake

Current linguistic models share the view that grammar and lexicon interact. However, this assumption was remarkable in a period when the overwhelming majority of linguists strictly separated lexical and grammatical levels. In contrast to this traditional point of view, lexicon-grammars assume that lexical units always carry grammatical information and, vice versa, what is traditionally described by grammars depends very much on lexical variability.

Transformations

Transformational rules in lexicon-grammar are based on Harris’s theoretical framework (e.g., Harris, 1957, 1964). Unlike the transformational rules of early generative grammar, rejected by Gross because of the problematic status of their constituents (M. Gross, 1978), Harris’s transformations operate on what generative grammar would call surface structure, i.e., on sequences of word classes (constructions) and their distributions (k-occurrences). Constructions are sequences of elementary classes (noun, verb, adjective, article, preposition, conjunction, adverb) and morphemes (i.e., affixes). K-occurrences define the paradigmatic set of acceptable lexical units of sequences of elementary classes. For example, the k-occurrence of all adjectives occurring with the noun property in an adjective–noun sequence is determined by the set {legal, small, common, immovable . . .}, which is different from the entire class of all adjectives (*sallow property, *yellow property).

Two or more constructions are transformations of one another if they:

a. share the same set of elementary classes (affixes, auxiliaries, or promorphemes such as pronouns are not considered),
b. have the same k-occurrence, and
c. have the same broad distribution, i.e., occur in the same sentence environment.

Harris’s transformations are equivalence classes if the homonyms of lexical units are distinguished. If homonyms are not distinguished, transformations are neither symmetric nor transitive. For example, the passive transformation in The wreck was seen by the seashore cannot be reversed because of the homonymy of the by construction (Harris, 1957: 395). Transformations thus preserve meaning, as summarized by Harris (1957: 395): “That many sentences which are transforms of each other have more or less the same meaning except for different external grammatical status . . . is an immediate impression. This is not surprising, since meaning correlates closely with the range of occurrence, and transformations maintain the same occurrence range.” One consequence of (c) is that while Harris’s transformations operate on sentences, they generally do not allow transformations between sentences and noun phrases. For example, for the sentence King John attacks the city, Harris (1964) formulates the relation of nominalization in (3) rather than (4), as suggested by early versions of generative grammar:

(3) King John launches an attack against the city
(4) the attack of the city by King John

This relation of nominalization led to the notion of support verb, a type of construction where the selectional constraints are not between the verb and the complements, but between the predicative noun and its complement. Again, this is justified by a transformational approach. Consider for example (5), a sentence that has the same surface structure as (3). However, the two constructions differ with respect to extraposition. Here, the double analysis holds for support verb constructions (7a), whereas it is not acceptable for the distributional verbs in (7b).

(5) King John criticizes an attack against the city
(6a) It is an attack against the city that King John launches
(6b) It is an attack against the city that King John criticizes
(7a) It is an attack that King John launches against the city
(7b) *It is an attack that King John criticizes against the city

Table 1 Sample of the class 36DT, the class of transfer verbs. N0, N1 and N2 denote subject, direct and indirect object. Transformations are listed under P.A.

Constructing Lexicon-Grammars

The main motivation for constructing a lexicon-grammar is to determine systematically the rules of a language and their range of application. This process is illustrated here for verbs, but it works similarly for other predicative elements. The description of verbs and the selection of distributional and transformational properties were carried out by the linguists in Gross’s team in an interdependent process. For two sentences with a given verb, it had to be determined whether the sentences had the same meaning or whether they could be related by transformations. Or, if the meaning was different, the goal was to find a transformation that was acceptable for one but not the other. For example, the different meanings of the morphological verb arroser (‘to water’) in the two following sentences can be demonstrated by the following reciprocal transformation:

(8) Luc arrose la plante avec un arrosoir = *Luc et l’arrosoir arrosent la plante
‘Luc waters the plant with a can’ = ‘*Luc and the can water the plant’
(9) Luc arrose la nouvelle année avec ses amis = Luc et ses amis arrosent la nouvelle année
‘Luc toasts (lit.: waters) the New Year with his friends’ = ‘Luc and his friends toast the New Year’

Another goal is to mark all the syntactic properties for each verbal entry, i.e., the elementary sentence of the form subject-verb-essential complements. Judgments of acceptability are binary: an entry does or does not have a certain property. In order to correct the shortcomings of introspection that underestimate the actual production of forms (M. Gross, 1975), as a rule of thumb, dubious syntactic properties were marked as acceptable (M. Gross, 1996). Even though lexicon-grammars were constructed long before large electronic corpora were available for collecting attested data, this procedure has been criticized as leading to biased judgments and the acceptance of far too many properties (Busse, 1980).
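The binary marking of properties can be sketched in a few lines of Python. The two entries correspond to the uses of arroser in (8) and (9); the entry names and property labels are hypothetical and are not the codes used in the published tables. The `mark` helper encodes the rule of thumb that dubious properties count as acceptable.

```python
# Each entry is one use of a verb; each property is either held (True) or not (False).
# Entry names and property labels are hypothetical illustration labels.
ENTRIES = {
    "arroser_1 (water)": {"N0 V N1 avec N2": True, "N0 et N2 V N1": False},
    "arroser_2 (toast)": {"N0 V N1 avec N2": True, "N0 et N2 V N1": True},
}


def mark(judgment):
    """Map a judgment to a binary mark; dubious cases are marked as acceptable."""
    return judgment in ("acceptable", "dubious")


def distinguishing_properties(a, b):
    """Properties on which two entries differ, i.e., evidence for separate entries."""
    return sorted(p for p in ENTRIES[a] if ENTRIES[a][p] != ENTRIES[b][p])


print(distinguishing_properties("arroser_1 (water)", "arroser_2 (toast)"))
print(mark("dubious"))
```

Here the reciprocal construction is the only property that separates the two entries, which mirrors the argument made with (8) and (9).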

Results and Discussion

More than 30 years of systematic description by a team of linguists have resulted in a huge lexicon-grammar of French elementary sentences based on predicative lexical units. It consists of:

• 12 000 verbs organized according to 500 properties (stored as 15 000 verbal entries)
• 44 000 entries of fixed sentences
• 50 000 support verb constructions
• 7000 frozen adverbs (e.g., back and forth).

These units are stored in matrices where each line is an entry and each column corresponds to either a distributional or a transformational property. The cells are populated by a ‘+’ if the property holds for the entry or a ‘−’ if it does not. Since there is considerable redundancy in these matrices, entries are grouped together in submatrices, called classes, if they have similar syntactic properties. A sample of the matrix representation is given in Table 1. A detailed description of the organization of French lexicon-grammar as well as the statistical distribution of all constructions can be found in Leclère (2002).

A comparison of all the lines of the matrix revealed that only a few verbs share the same set of constructions. This finding can be seen as an a posteriori justification of the lexicon-grammar’s methodology: “every verb has to be described on an individual basis” (M. Gross, 1996). Lexicon-grammars separate meanings by applying the theoretical framework of Harris, but they have nothing to say on how the different entries are related. For example, the literal meaning and metaphorical uses of entries such as (8) and (9) can be separated by differences in form but they are unrelated. Laporte comments on this as “the price to be paid for an essential epistemological feature: the model is designed so that it can be confronted with reality through a set of sufficiently reproducible experimental processes, which includes acceptability judgment, differential acceptability, paraphrase judgment, ambiguity judgment, and a few others, all applied to sentences” (Laporte, 2004).

See also: Computers in Lexicography; Constituent Structure; Corpus Lexicography; French Lexicography; French; Gross, Maurice (1934–2001); Harris, Zellig S. (1909–1992); Lexicology; Lexicon, Generative; Phraseology; Valency Grammar.

Bibliography

Boons J-P, Guillet A & Leclère C (1976). La structure des phrases simples en français. Constructions intransitives. Genève: Droz.
Busse W (1980). ‘Distributionelle Beschreibung der Französischen Verbsyntax als Grundlage zur Kritik der transformationellen Grammatik.’ Zeitschrift für französische Sprache und Literatur XC(1), 25–45.
Giry-Schneider J (1987). Études de prédicats nominaux en français. Les constructions faire N. Genève: Droz.
Gross G (1989). Les constructions converses du français. Genève: Droz.
Gross G (1990). ‘Définitions des noms composés dans un lexique-grammaire.’ Langue française 87, 84–91.
Gross M (1975). Méthodes en syntaxe. Régimes des constructions complétives. Paris: Hermann.
Gross M (1979). ‘On the failure of generative grammar.’ Language 55(4), 3–17.
Gross M (1981). ‘Les bases empiriques de la notion de prédicat sémantique.’ Langages 63, 7–53.
Gross M (1990). ‘Sur la notion harrissienne de transformation et son application au français.’ Langages 98, 48–58.
Gross M (1993). ‘Constructing lexicon-grammars.’ In Atkins B T S & Zampolli A (eds.) Computational approaches to the lexicon. Oxford: Oxford University Press. 259–316.
Gross M (1996). ‘Lexicon-Grammars.’ In Brown K & Miller J (eds.) Concise encyclopedia of syntactic theory. Oxford: Pergamon. 244–258.
Guillet A & Leclère C (1992). La structure des phrases simples en français. Constructions transitives locatives. Genève: Droz.
Harris Z (1957). ‘Co-occurrence and transformation in linguistic structure.’ Language 33, 238–340. Reprinted in Papers in structural and transformational linguistics (1970). Dordrecht: Reidel. 390–457.
Harris Z (1964). Elementary transformations. Philadelphia: University of Pennsylvania. T. D. A. P. no. 54. Reprinted in Papers in structural and transformational linguistics (1970). Dordrecht: Reidel. 472–482.
Laporte E (2004). ‘Préface de syntax, lexis and lexicon-grammar.’ In Leclère C, Laporte E, Piot M & Silberztein M (eds.). xi–xxi.
Leclère C (2002). Organization of the Lexicon-grammar of French Verbs. Lingvisticae Investigationes 25:1, Amsterdam: John Benjamins.
Leclère C, Laporte E, Piot M & Silberztein M (eds.) (2004). Syntax, lexis and lexicon-grammar. Papers in honour of Maurice Gross. Lingvisticae Investigationes Supplementa 24, Amsterdam, Philadelphia: Benjamins.
Meunier A (1981). Nominalisations d’adjectifs par verbes supports. Ph.D. diss. Université Paris 7.
Vivès R (1983). Avoir, prendre, perdre: constructions à verbe support et extensions aspectuelles. Ph.D. diss. Université Paris 8.

Lexicon, Generative
J Pustejovsky, Brandeis University, Waltham, MA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Generative Lexicon (GL) introduces a knowledge representation framework that offers a rich and expressive vocabulary for lexical information. The motivations for this are twofold. Overall, GL is concerned with explaining the creative use of language; we consider the lexicon to be the key repository holding much of the information underlying this phenomenon. More specifically, however, it is the notion of a constantly evolving lexicon that GL attempts to emulate; this is in contrast to currently prevalent views of static lexicon design, where the set of contexts licensing the use of words is determined in advance, and there are no formal mechanisms offered for expanding this set.

One of the most difficult problems facing theoretical and computational semantics is defining the representational interface between linguistic and nonlinguistic knowledge. GL was initially developed as a theoretical framework for encoding selectional knowledge in natural language. This in turn required making some changes in the formal rules of representation and composition. Perhaps the most controversial aspect of GL has been the manner in which lexically encoded knowledge is exploited in the construction of interpretations for linguistic utterances. Following standard assumptions in GL, the computational resources available to a lexical item consist of the following four levels:

(1a) Lexical typing structure: giving an explicit type for a word positioned within a type system for the language;
(1b) Argument structure: specifying the number and nature of the arguments to a predicate;
(1c) Event structure: defining the event type of the expression and any subeventual structure it may have;
(1d) Qualia structure: a structural differentiation of the predicative force for a lexical item.

The qualia, inspired by Moravcsik’s (1975) interpretation of the aitia of Aristotle, are the modes of explanation associated with a word or phrase in the language, and are defined as follows (Pustejovsky, 1991):

(2a) Formal: the basic category that distinguishes the meaning of a word within a larger domain;
(2b) Constitutive: the relation between an object and its constituent parts;


(2c) Telic: the purpose or function of the object, if there is one;
(2d) Agentive: the factors involved in the object’s origins or ‘coming into being.’

Conventional interpretations of the GL semantic representation have been as feature structures (cf. Bouillon, 1993; Pustejovsky, 1995). The feature representation shown below gives the basic template of argument and event variables, and the specification of the qualia structure.
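As a rough illustration of such a template, the following Python sketch models a lexical entry as a record with argument structure, event structure, and the four qualia roles listed in (2a)–(2d). The class and field names are assumptions made for this sketch, and the fillers given for book are illustrative only, not a quotation of any published entry.

```python
from dataclasses import dataclass, field


@dataclass
class Qualia:
    """The four qualia roles; empty strings mean 'unspecified'."""
    formal: str = ""        # what kind of thing it is
    constitutive: str = ""  # what it is made of
    telic: str = ""         # its purpose or function
    agentive: str = ""      # how it comes into being


@dataclass
class LexicalEntry:
    lemma: str
    argstr: dict = field(default_factory=dict)    # argument variables and their types
    eventstr: dict = field(default_factory=dict)  # event variables and their types
    qualia: Qualia = field(default_factory=Qualia)


# Hypothetical fillers, chosen only to show how the template is populated.
book = LexicalEntry(
    lemma="book",
    argstr={"ARG1": "x"},
    qualia=Qualia(
        formal="book(x)",
        constitutive="pages(x)",
        telic="read(e, y, x)",
        agentive="write(e', z, x)",
    ),
)

print(book.qualia.telic)
```

Keeping the qualia in their own record makes it easy for later operations to inspect a single role, such as Telic, without touching the argument or event structure.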

Traditional Lexical Representations The traditional organization of lexicons in both theoretical linguistics and natural language processing systems assumes that word meaning can be exhaustively defined by an enumerable set of senses per word. Lexicons, to date, generally tend to follow this organization. As a result, whenever natural language interpretation tasks face the problem of lexical ambiguity, a particular approach to disambiguation is warranted. The system attempts to select the most appropriate ‘definition’ available under the lexical entry for any given word; the selection process is driven by matching sense characterizations against contextual factors. One disadvantage of such a design follows from the need to specify, ahead of time, the contexts in which a word might appear; failure to do so results in incomplete coverage. Furthermore, dictionaries and lexicons currently are of a distinctly static nature: the division into separate word senses not only precludes permeability; it also fails to account for the creative use of words in novel contexts. GL attempts to overcome these problems, both in terms of the expressiveness of notation and the kinds of interpretive operations the theory is capable of supporting. Rather than taking a ‘snapshot’ of language at any moment of time and freezing it into lists of word sense specifications, the model of the lexicon proposed here does not preclude extensibility: it is open-ended in nature and accounts for the novel, creative uses of words in a variety of contexts by positing procedures for generating semantic expressions for words on the basis of particular contexts.

Adopting such a model presents a number of benefits. From the point of view of a language user, a rich and expressive lexicon can explain aspects of learnability. From the point of view of linguistic theory, it can offer improvements in robustness of coverage. Such benefits stem from the fact that the model offers a scheme for explicitly encoding lexical knowledge at several levels of generalization. In particular, by making lexical ambiguity resolution an integral part of a uniform semantic analysis procedure, the problem is rephrased in terms of dynamic interpretation of a word in context; this is in contrast to current frameworks which select among a static, predetermined set of word senses, and do so separately from constructing semantic representations for larger text units. There are several methodological motivations for importing tools developed for the computational representation and manipulation of knowledge into the study of word meaning, or lexical semantics. Generic knowledge representation (KR) mechanisms, such as inheritance structures or rule bases, can – and have been – used for encoding linguistic information. However, not much attention has been paid to the notion of what exactly constitutes such linguistic information. Traditionally, the application area of knowledge representation formalisms has been the domain of general world knowledge. By shifting the focus to a level below that of words (or lexical concepts) one is able to abstract the notion of lexical meaning away from world knowledge, as well as from other semantic influences such as discourse and pragmatic factors. Such a process of abstraction is an essential prerequisite for the principled creation of lexical entries. Although GL makes judicious use of knowledge representation (KR) tools to enrich the semantics of lexical expressions, it preserves a felicitous partitioning of the information space. 
Keeping lexical meaning separate from other linguistic factors, as well as from general world knowledge, is a methodologically sound principle; nonetheless, GL maintains that all of these should be referenced by a lexical entry. In essence, such capabilities are the base components of a generative language whose domain is that of lexical knowledge. The interpretive aspect of this language embodies a set of principles for richer composition of components of word meaning. As illustrated later in this entry, semantic expressions for word meaning in context are constructed by a fixed number of generative devices (cf. Pustejovsky, 1991). Such devices operate on a core set of senses (with greater internal structure than hitherto assumed); through composition, an extended set of word senses is obtained when individual lexical items are considered jointly with others in larger phrases. The language presented below thus becomes an expressive tool for capturing lexical knowledge, without presupposing finite sense enumeration.

The Nature of Polysemy One of the most pervasive phenomena in natural language is that of systematic ambiguity, or polysemy. This problem confronts language learners and natural language processing systems alike. The notion of context enforcing a certain reading of a word, traditionally viewed as selecting for a particular word sense, is central both to global lexical design (the issue of breaking a word into word senses) and local composition of individual sense definitions. However, current lexicons reflect a particularly ‘static’ approach to dealing with this problem: the numbers of and distinctions between senses within an entry are ‘frozen’ into a fixed grammar’s lexicon. Furthermore, definitions hardly make any provisions for the notion that boundaries between word senses may shift with context – not to mention that no lexicon really accounts for any of a range of lexical transfer phenomena. There are serious problems with positing a fixed number of ‘bounded’ word senses for lexical items. In a framework that assumes a partitioning of the space of possible uses of a word into word senses, the problem becomes that of selecting, on the basis of various contextual factors (typically subsumed by, but not necessarily limited to, the notion of selectional restrictions), the word sense closest to the use of the word in the given text. As far as a language user is concerned, the question is that of ‘fuzzy matching’ of contexts; as far as a text analysis system is concerned, this reduces to a search within a finite space of possibilities. This approach fails on several accounts, both in terms of what information is made available in a lexicon for driving the disambiguation process, and how a sense selection procedure makes use of this information. 
Typically, external contextual factors alone are not sufficient for precise selection of a word sense; additionally, often the lexical entry does not provide enough reliable pointers to critically discriminate between word senses. In the case of automated sense selection, the search process becomes computationally undesirable, particularly when it has to account for longer phrases made up of individually ambiguous words. Finally, and most importantly, the assumption that an exhaustive listing can be assigned to the different uses of a word lacks the explanatory power necessary for making generalizations and/or predictions about how words used in a novel way can be reconciled with their currently existing lexical definitions.

To illustrate this last point, consider the ambiguity and context-dependence of adjectives such as fast and slow, where the meaning of the predicate varies depending on the noun being modified. Sentences (3a)–(3e) show the range of meanings associated with the adjective fast. Typically, a lexicon requires an enumeration of different senses for such words, to account for this ambiguity:

(3a) The island authorities sent out a fast little government boat to welcome us: ambiguous between a boat driven quickly and one that is inherently fast.
(3b) a fast typist: a person who performs the act of typing quickly.
(3c) Rackets is a fast game: the motions involved in the game are rapid and swift.
(3d) a fast book: one that can be read in a short time.
(3e) My friend is a fast driver and a constant worry to her cautious husband: one who drives quickly.

These examples involve at least four distinct word senses for the word fast (WordNet 2.0 has ten senses for the adjectival reading of fast):

fast (1): moving quickly;
fast (2): performing some act quickly;
fast (3): doing something requiring a short space of time;
fast (4): involving rapid motion.

In an operational lexicon, word senses would be further annotated with selectional restrictions: for instance, fast (1) may be predicated of an object belonging to a class of movable entities, and fast (3) may relate the action ‘that takes a little time’ – e.g., reading, in the case of (3d) above – to the object being modified. Upon closer analysis, each occurrence of fast above predicates in a slightly different way. In fact, no finite enumeration of word senses will account for creative applications of this adjective in the language. For example, consider the two phrases fast freeway and fast garage. The adjective fast in the phrase a fast freeway refers to the ability of vehicles on the freeway to sustain high speed, while in fast garage it refers to the length of time needed for a repair. As novel uses of fast, these are clearly new senses that are not covered by the enumeration given above. Part of GL’s argument for a different organization of the lexicon is based on a claim that the boundaries between the word senses in the analysis of fast above are too rigid.

Still, even if we assume that enumeration is adequate as a descriptive mechanism, it is not always obvious how to select the correct word sense in any given context: consider the systematic ambiguity of verbs like bake (discussed by Atkins et al., 1988), which require discrimination with respect to change-of-state versus create readings, depending on the context (see sentences [4a] and [4b], respectively).

(4a) John baked the potatoes.
(4b) Mary baked a cake.
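The limits of sense enumeration discussed above for fast can be made concrete with a short Python sketch. It stores the four senses of fast with selectional restrictions and selects a sense by matching the class of the modified noun. The glosses follow the list above; the noun classes and restriction labels are hypothetical choices made for this sketch.

```python
# Enumerated senses of "fast": sense number -> (gloss, admissible noun classes).
# The class labels are invented for illustration.
SENSES = {
    1: ("moving quickly", {"movable_entity"}),
    2: ("performing some act quickly", {"agent"}),
    3: ("requiring a short space of time", {"readable"}),
    4: ("involving rapid motion", {"activity"}),
}

# Hypothetical semantic classes for the nouns discussed in the text.
NOUN_CLASS = {
    "boat": "movable_entity",
    "typist": "agent",
    "driver": "agent",
    "book": "readable",
    "game": "activity",
    "freeway": "road",
    "garage": "business",
}


def select_sense(noun):
    """Return the first sense whose restriction matches the noun, else None."""
    noun_class = NOUN_CLASS[noun]
    for number, (_gloss, allowed) in SENSES.items():
        if noun_class in allowed:
            return number
    return None


for noun in ("typist", "book", "freeway", "garage"):
    print(noun, select_sense(noun))
```

The lookup succeeds for the nouns anticipated when the senses were written and returns nothing for freeway and garage; covering them would require adding further senses by hand, which is the open-endedness problem the text describes.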

The problem here is that there is too much overlap in the ‘core’ semantic components of the different readings; hence, it is not possible to guarantee correct word sense selection on the basis of selectional restrictions alone. (Jackendoff (1985) correctly points out, however, that deriving one core meaning for all homographs of a word form may not be possible, a view not inconsistent with that proposed here.) Another problem with this approach is that it lacks any appropriate or natural level of abstraction. As these examples clearly demonstrate, partial overlaps of core and peripheral components of different word meanings make the traditional notion of word sense, as implemented in current dictionaries, inadequate. Within this approach, the only feasible solution would be to employ a richer set of semantic distinctions for the selection of complements than is conventionally provided by the mechanism of selectional restrictions.

It is equally arbitrary to create separate word senses for a lexical item just because it can participate in several subcategorization forms; yet this has been the only approach open to computational lexicons that are based on a fixed number of features and senses. A striking example of this is provided by verbs such as believe and forget. The sentences in (5a)–(5e) show that the syntactic realization of the verb’s object complement determines how the phrase is interpreted semantically. The that-complement, for example, in (5a) exhibits a property called factivity (Kiparsky and Kiparsky, 1971), where the object proposition is assumed to be a fact regardless of what modality the whole sentence carries. Sentence (5d) contains a ‘concealed question’ complement (Grimshaw, 1979), so called because the phrase can be paraphrased as a question. These different interpretations are usually encoded as separate senses of the verb, with distinct lexical entries.

(5a) Mary forgot that she left the light on at home. (a factive reading)
(5b) Mary forgot to leave the light on for the delivery man. (a nonfactive reading)
(5c) I almost forgot where we’re going. (an embedded question)
(5d) She always forgets the password to her account. (a concealed question)
(5e) He leaves, forgets his umbrella, comes back to get it . . . (ellipsed nonfactive)

These distinctions could be easily accounted for by simply positing separate word senses for each syntactic type, but this misses the obvious relatedness between the different syntactic contexts of forget. Moreover, the general ‘core’ sense of the verb forget, which deontically relates a mental attitude with a proposition or event, is lost between the separate senses of the verb. GL, on the other hand, posits one definition for forget which can, by suitable composition with the different complement types, generate all the allowable readings (cf. Pustejovsky, 1995).

Levels of Lexical Meaning

The richer structure for the lexical entry proposed in GL takes to an extreme the established notions of predicate-argument structure, primitive decomposition and conceptual organization; these can be seen as determining the space of possible interpretations that a word may have. That is, rather than committing to an enumeration of a predetermined number of different word senses, a lexical entry for a word now encodes a range of representative aspects of lexical meaning. For an isolated word, these meaning components simply define the semantic boundaries appropriate to its use. When embedded in the context of other words, however, mutually compatible roles in the lexical decompositions of each word become more prominent, thus forcing a specific interpretation of individual words within a specific phrase. It is important to realize that this is a generative process, which goes well beyond the simple matching of features. In fact, this approach requires, in addition to a flexible notation for expressing semantic generalizations at the lexical level, a mechanism for composing these individual entries on the phrasal level.

The emphasis of our analysis of the distinctions in lexical meaning is on studying and defining the role that all lexical types play in contributing to the overall meaning of a phrase. Crucial to the processes of semantic interpretation that the lexicon is targeted for is the notion of compositionality, necessarily different from the more conventional pairing of verbs as functions and nouns as arguments. If the semantic load in the lexicon is entirely spread among the verb entries, as many existing lexicons assume, differences like those exemplified above can only be accounted for by treating bake, forget, and so forth as polysemous verbs. If, on the other hand, elaborate lexical meanings of verbs and adjectives could be made sensitive to components of equally elaborate decompositions of nouns, the notion of spreading the semantic load evenly across the lexicon becomes the key organizing principle in expressing the knowledge necessary for disambiguation.

To be able to express the lexical distinctions required for analyzing the examples in the last section, it is necessary to go beyond viewing lexical decomposition as based only on a predetermined set of primitives; rather, what is needed is to be able to specify, by means of sets of predicates, different levels or perspectives of lexical representation, and to be able to compose these predicates via a fixed number of generative devices. The ‘static’ definition of a word provides its literal meaning; it is only through the suitable composition of appropriately highlighted projections of words that we generate new meanings in context. In order to address the phenomena and inadequacies mentioned above, Generative Lexicon argues that a theory of computational lexical semantics must make reference to the four levels of representation mentioned above:

1. Lexical Typing Structure. This determines the ways in which a word is related to other words in a structured type system (i.e., inheritance). In addition to providing information about the organization of a lexical knowledge base, this level of word meaning provides an explicit link to general world (commonsense) knowledge.
2. Argument Structure. This encodes the conventional mapping from a word to a function, and relates the syntactic realization of a word to the number and type of arguments that are identified at the level of syntax and made use of at the level of semantics (Grimshaw, 1991).
3. Event Structure. This identifies the particular event type for a verb or a phrase. There are essentially three components to this structure: the primitive event type – state (S), process (P), or transition (T); the focus of the event; and the rules for event composition (cf. Moens and Steedman, 1988; Pustejovsky, 1991b).
4. Qualia Structure. This defines the essential attributes of objects, events, and relations, associated with a lexical item. By positing separate components (see below) in what is, in essence, an argument structure for nominals, nouns are elevated from the status of being passive arguments to active functions (cf. Moravcsik, 1975; Pustejovsky, 1991a). We can view the fillers in qualia structure as prototypical predicates and relations associated with this word.

A set of generative devices connects the four levels, providing for the compositional interpretation of words in context. These devices include subselection, type coercion, and cocomposition. In this article, we will focus on the qualia structure and type coercion, an operation that captures the semantic relatedness between syntactically distinct expressions. As an operation on types within a λ-calculus, type coercion can be seen as transforming a fixed semantic language into one with changeable (polymorphic) types. Argument, event, and qualia types must conform to the well-formedness conditions defined by the type system and the lexical inheritance structure when undergoing operations of semantic composition. Lexical items are strongly typed yet are provided with mechanisms for fitting to novel typed environments by means of type coercion over a richer notion of types.

Qualia Structure

Qualia structure is a system of relations that characterizes the semantics of a lexical item, very much like the argument structure of a verb (Pustejovsky). To illustrate the descriptive power of qualia structure, the semantics of nominals will be the focus here. In effect, the qualia structure of a noun determines its meaning in much the same way as the typing of arguments to a verb determines its meaning. The elements that make up a qualia structure include familiar notions such as container, space, surface, figure, or artifact. One way to model the qualia structure is as a set of constraints on types (cf. Copestake and Briscoe, 1992; Pustejovsky and Boguraev, 1993). The operations in the compositional semantics make reference to the types within this system. The qualia structure, along with the other representational devices (event structure and argument structure), can be seen as providing the building blocks for possible object types. Figure 1 illustrates a type hierarchy fragment for knowledge about objects, encoding qualia structure information. In Figure 1, the term nomrqs refers to a ‘relativized qualia structure,’ a type of generic information structure for entities (cf. Calzolari, 1992 for discussion). Further, ind.obj represents ‘individuated object.’

Figure 1 Type hierarchy fragment.


The tangled type hierarchy above shows how qualia can be unified to create more complex concepts out of simple ones. Following Pustejovsky (2001, 2005), we can distinguish the domain of individuals into three ranks or levels of type:

(6a) Natural Types: Natural kind concepts consisting of reference only to Formal and Const qualia roles.
(6b) Functional Types: Concepts integrating reference to purpose or function.
(6c) Complex Types: Concepts integrating reference to a relation between types.

For example, a simple natural physical object (7) can be given a function (i.e., a Telic role) and transformed into a functional type, as in (8).

(7)

(8)

Functional types in language behave differently from naturals, as they carry more information with them regarding their use and purpose. For example, the noun sandwich contains information of the ‘eating activity’ as a constraint on its Telic value, due to its position in the type structure; that is, eat(P,w,x) denotes a process, P, between an individual w and the physical object x.

(9)

From qualia structures such as these, it now becomes clear how a sentence such as Mary finished her sandwich receives the default interpretation it does; namely, that of Mary eating the sandwich. This is an example of type coercion, and the semantic compositional rules in the grammar must make reference to values such as qualia structure, if such interpretations are to be constructed on-line and dynamically.

Coercion and Compositionality

Type coercion is an operation in the grammar ensuring that the selectional requirements on an argument to a predicate are in fact satisfied by the argument in the compositional process. The rules of coercion presuppose a typed ontology such as that outlined above. By allowing lexical items to coerce their arguments, we obviate the enumeration of multiple entries for different senses of a word. We define coercion as follows (Pustejovsky, 1995):

(10) Type Coercion: a semantic operation that converts an argument to the type that is expected by a function, where it would otherwise result in a type error.

The notion that a predicate can specify a particular target type for its argument is a very useful one, and intuitively explains the different syntactic argument forms for the verbs below. In sentences (11) and (12), noun phrases and verb phrases appear in the same argument position, somehow satisfying the type required by the verbs enjoy and begin. In sentences (13) and (14), noun phrases of very different semantic classes appear as subject of the verbs kill and wake.

(11a) Mary enjoyed the movie.
(11b) Mary enjoyed watching the movie.
(12a) Mary began a book.
(12b) Mary began reading a book.
(12c) Mary began to read a book.
(13a) John killed Mary.
(13b) The gun killed Mary.
(13c) The bullet killed Mary.
(14a) The cup of coffee woke John up.
(14b) Mary woke John up.
(14c) John’s drinking the cup of coffee woke him up.

If we analyze the different syntactic occurrences of the above verbs as separate lexical entries, following the sense enumeration theory outlined in previous sections, we are unable to capture the underlying relatedness between these entries; namely, that no matter what the syntactic form of their arguments, the verbs seem to be interpreting all the phrases as events of some sort. It is exactly this type of complement selection that type coercion allows in the compositional process.
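The coercion of a noun phrase into an event reading can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the formalism of Generative Lexicon itself: the names `Qualia`, `Noun`, `Event`, and `coerce_to_event` are hypothetical, qualia structure is reduced to a Formal and a Telic value, and only the Telic role is consulted when an event is reconstructed.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Qualia:
    """A drastically reduced qualia structure (hypothetical, for illustration)."""
    formal: str                   # what kind of thing the noun denotes
    telic: Optional[str] = None   # its purpose or function, if it has one


@dataclass(frozen=True)
class Noun:
    form: str
    qualia: Qualia


@dataclass(frozen=True)
class Event:
    predicate: str
    agent: str
    theme: str


# Toy entries. The Telic values for sandwich (eat) and book (read) follow the
# text; the entry for rock is an assumed example of a natural type.
SANDWICH = Noun("sandwich", Qualia(formal="physical_object", telic="eat"))
BOOK = Noun("book", Qualia(formal="physical_object", telic="read"))
ROCK = Noun("rock", Qualia(formal="physical_object"))


def coerce_to_event(agent: str, complement: Union[Noun, Event]) -> Event:
    """Give an event-selecting verb (begin, finish, enjoy) the event it expects."""
    if isinstance(complement, Event):
        return complement  # already of the expected type: nothing to coerce
    if complement.qualia.telic is not None:
        # Reconstruct the event from the noun's Telic role.
        return Event(complement.qualia.telic, agent, complement.form)
    # Without a Telic value this simplified sketch reports the type error.
    raise TypeError(f"no Telic role to coerce {complement.form!r} into an event")


# "Mary finished her sandwich" is read as Mary eating the sandwich.
print(coerce_to_event("Mary", SANDWICH))
```

A single entry for the verb then covers both Mary began a book and Mary began reading a book: the second complement already denotes an event and passes through unchanged, while the first is coerced through the Telic value of book. Raising an error for a noun with no Telic value is a simplification chosen for this sketch.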

Complex Types in Language

One of the more unique aspects of the representational mechanisms of GL is the data structure known as a complex type (or dot object), introduced to explain several phenomena involving the selection of conflicting types in syntax. There are well-known cases of container-containee and figure-ground ambiguities, where a single word may refer to two aspects of an object’s meaning (cf. Apresjan, 1973; Wilks, 1975; Lakoff, 1987; Pustejovsky and Anick, 1988). The words window, door, fireplace, and room can be used to refer to the physical object itself or the space associated with it:

(15a) They walked through the door.
(15b) She will paint the door red.
(16a) Black smoke filled the fireplace.
(16b) The fireplace is covered with soot.

In addition to figure-ground and container-containee alternations, there are many other cases in natural language where two or more aspects of a concept are denoted by a single lexicalization. As with nouns such as door, the nouns book and exam denote two contradictory types; books are both physical form and informational in nature; exams are both events and informational.

(17a) Mary doesn’t believe the book.
(17b) John bought his book from Mary.
(17c) The police burnt a controversial book.
(18a) John thought the exam was confusing.
(18b) The exam lasted more than two hours this morning.

What is interesting about the above pairs is that the two senses of these nouns are related to one another in a specific way. The apparently contradictory nature of the two senses for each pair actually reveals a deeper structure relating these senses, something that is called a dot object. For each pair, there is a relation that connects the senses, represented as a Cartesian product of the two semantic types. There must exist a relation R that relates the elements of the pairing, and this relation must be part of the definition of the semantics for the dot object to be well-formed. For nouns such as book, disk, and record, the relation R is a species of ‘containment,’ and shares grammatical behavior with other container-like concepts. For example, we speak of information in a book, articles in the newspaper, as well as songs on a disc. This containment relation is encoded directly into the semantics of a concept such as book – i.e., hold(x, y) – as the formal quale value. For other dot object nominals such as prize, sonata, and lunch, different relations will structure the types in the Cartesian product, as we see below. The lexical structure for book as a dot object can be represented as in (19).

(19)

Nouns such as sonata, lunch, and appointment, on the other hand, are structured by entirely different relations.

Recent Developments in Generative Lexicon

As the theory has matured, many of the analytic devices and the linguistic methodology of Generative Lexicon have been extended and applied to languages and phenomena well beyond the original scope of the theory. Cocomposition has been applied to a number of phenomena, particularly light verb constructions, with a fair amount of success, in Korean and Japanese (Lee et al., 2003). Qualia structure has proved to be an expressive representational device and has been adopted by adherents of many other grammatical frameworks. For example, Jensen and Vikner (1994) and Borschev and Partee (2001) both appeal to qualia structure in the interpretation of the genitive relation in NPs, while many working on the interpretation of noun compounds have developed qualia-based strategies for interpretation of noun–noun relations (Johnston and Busa, 1996, 1997; Lehner, 2003; Jackendoff, 2003). Van Valin (2005) has adopted qualia roles within several aspects of RRG analyses, where nominal semantics have required finer grained representations. Perhaps one of the biggest developments within the theory in recent years has been the integration of type coercion into a general theory of the mechanisms of selection in grammar (Pustejovsky, 2002, 2005). On this view, there are three mechanisms that account for all local syntagmatic and paradigmatic behavior in the grammar: pure selection, type exploitation, and type coercion. The challenges posed by Generative Lexicon to linguistic theory are quite direct and simple: semantic interpretation is as creative and generative as syntax if not more so. But the process operates under serious constraints and inherently restrictive mechanisms. It is GL’s goal to uncover these mechanisms in order to model the expressive semantic power of language.

See also: Argument Structure; Compositionality: Semantic Aspects; Computational Lexicons and Dictionaries; Lexical Conceptual Structure; Lexical Semantics: Overview; Lexicon: Structure; Syntax-Semantics Interface; Thematic Structure.

Bibliography

Alsina A (1992). ‘On the Argument Structure of Causatives.’ Linguistic Inquiry 23(4), 517–555.
Apresjan J D (1973). ‘Regular Polysemy.’ Linguistics 143, 5–32.
Apresjan J D (1973). ‘Synonymy and synonyms.’ In Kiefer F (ed.). 173–199.
Asher N & Morreau M (1991). ‘Common sense entailment: a modal theory of nonmonotonic reasoning.’ In Proceedings to the 12th International Joint Conference on Artificial Intelligence, Sydney, Australia.
Asher N & Pustejovsky J (in press). ‘The metaphysics of words,’ ms. Brandeis University and University of Texas.
Atkins B T, Kegl J & Levin B (1988). ‘Anatomy of a verb entry: from linguistic theory to lexicographic practice.’ International Journal of Lexicography 1, 84–126.
Baker M (1988). Incorporation: a theory of grammatical function changing. Chicago: University of Chicago Press.
Bierwisch M (1983). ‘Semantische und konzeptuelle Repräsentationen lexikalischer Einheiten.’ In Ruzicka R & Motsch W (eds.) Untersuchungen zur Semantik. Berlin: Akademische-Verlag.
Boguraev B & Briscoe E (1989). Computational lexicography for natural language processing. Harlow/London: Longman.
Boguraev B & Pustejovsky J (1996). Corpus processing for lexical acquisition. Cambridge: Bradford Books/MIT Press.
Borschev V & Partee B H (in press). ‘Genitives, types, and sorts.’ In Kim J-Y, Lander Y & Partee B H (eds.) Possessives and beyond: semantics and syntax. Amherst: GLSA.
Borschev V & Partee B H (2001). ‘Genitive modifiers, sorts, and metonymy.’ Nordic Journal of Linguistics.
Borschev V & Partee B H (2001a). ‘Genitive modifiers, sorts, and metonymy.’ Nordic Journal of Linguistics 24, 140–160.
Borschev V & Partee B H (2001b). ‘Ontology and metonymy.’ In Jensen P A & Skadhauge P (eds.) Ontology-based interpretation of noun phrases. Proceedings of the First International OntoQuery Workshop. Kolding: Department of Business Communication and information Science, University of Southern Denmark. 121–138.
Bouillon P (1997). ‘Polymorphie et sémantique lexicale: le case des adjectifs.’ Ph.D. diss., Paris VII. Paris.
Bresnan J (1994). ‘Locative Inversion and the architecture of universal grammar.’ Language 70(1), 2–31.
Briscoe T, de Paiva V & Copestake A (eds.) (1993). Inheritance, defaults, and the lexicon. Cambridge: Cambridge University Press.
Busa F (1996). Compositionality and the semantics of nominals. Ph.D. diss., Brandeis University.
Calzolari N (1992). ‘Acquiring and representing semantic information in a lexical knowledge base.’ In Pustejovsky J & Bergler S (eds.) Lexical semantics and knowledge representation. New York: Springer Verlag.
Carpenter B (1992). ‘Typed feature structures.’ Computational Linguistics 18, 2.
Chomsky N (1955). The logical structure of linguistic theory. Chicago: University of Chicago Press.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge: MIT Press.
Choueka Y (1988). ‘Looking for needles in a haystack, or locating interesting collocational expressions in large textual databases.’ Proceedings of the RAIO. 609–623.
Copestake A & Briscoe E (1992). ‘Lexical operations in a unification-based framework.’ In Pustejovsky J & Bergler S (eds.) Lexical semantics and knowledge representation. New York: Springer Verlag.
Copestake A (1992). The Representation of lexical semantic information. CSRP 280, University of Sussex.
Copestake A (1993). ‘Defaults in the LKB.’ In Briscoe T & Copestake A (eds.) Default inheritance in the lexicon. Cambridge: Cambridge University Press.
Davis A & Koenig J-P (2000). ‘Linking as constraints on word classes in a hierarchical lexicon.’ Language 76(1).
Davis A (1996). Lexical semantics and linking and the hierarchical lexicon. Ph.D. diss., Stanford University.
de Miguel E & Fernandez Lagunilla M (2001). ‘El operador aspectual se.’ Revista Española de Lingüística 39(1), 13–43.
de Miguel E (2000). ‘Relazioni tra il lessico e la sintassi: classi aspettuali di Verbi ed il passivo in spagnolo.’ In Simone R (ed.) Classi di parole e conoscenza lessicale. Studi Italiani di Linguistica Teorica e Applicata (SILTA), 2.
Dölling J (1992). ‘Flexible Interpretationen durch Sortenverschiebung.’ In Zimmermann I & Strigen A (eds.) Fügungspotenzen. Berlin: Akademie Verlag.
Dowty D R (1979). Word meaning and Montague Grammar. Dordrecht: D. Reidel.
Dowty D R (1985). ‘On some recent analyses of control.’ Linguistics and Philosophy 8, 1–41.
Dowty D (1991). ‘Thematic proto-roles and argument selection.’ Language 67, 547–619.
Egg M & Lebeth K (1995). ‘Semantic underspecification and modifier attachment ambiguities.’ In Kilbury J & Wiese R (eds.) Integrative Ansätze in der Computerlinguistik. Düsseldorf: Seminar für Allgemeine Sprachwissenschaft.
Fauconnier G (1985). Mental spaces. Cambridge: MIT Press.
Goldberg A E (1995). Constructions: a Construction Grammar approach to Argument Structure. Chicago: University of Chicago Press.
Grimshaw J (1979). ‘Complement selection and the lexicon.’ Linguistic Inquiry 10, 279–326.
Grimshaw J (1990). Argument structure. Cambridge: MIT Press.
Grimshaw J & Mester A (1988). ‘Light verbs and θ-marking.’ Linguistic Inquiry 10, 205–232.
Gruber J S (1976). Lexical structures in syntax and semantics. Amsterdam: North-Holland.
Gunter C (1992). Semantics of programming languages. Cambridge: MIT Press.
Guthrie L, Pustejovsky J, Wilks Y & Slator B (1996). ‘The role of lexicons in natural language processing.’ Communications of the ACM 39(1).
Hale K & Keyser J (1993). ‘On argument structure and the lexical expression of syntactic relations.’ In Hale K & Keyser J (eds.) The view from building 20. Cambridge: MIT Press.
Halle M, Bresnan J & Miller G (eds.) (1978). Linguistic theory and psychological reality. Cambridge: MIT Press.
Higginbotham J (1985). ‘On semantics.’ Linguistic Inquiry 16, 547–593.
Higginbotham J (1989). ‘Elucidations of meaning.’ Linguistics and Philosophy 12, 465–517.
Hirst G (1987). Semantic interpretation and the resolution of ambiguity. Cambridge: Cambridge University Press.
Hjelmslev L (1961). Prolegomena to a theory of language. Whitfield F (trans.). Madison: University of Wisconsin Press, first published in 1943.
Ingria R (1986). ‘Lexical information for parsing systems: points of convergence and divergence.’ Automating the Lexicon. Italy: Marina di Grosseto.
Ingria R, Boguraev B & Pustejovsky J (1992). ‘Dictionary/lexicon.’ In Shapiro S (ed.) Encyclopedia of artificial intelligence. 2nd ed. New York: Wiley.
Jackendoff R (1972). Semantic interpretation in Generative Grammar. Cambridge: MIT Press.
Jackendoff R (1985). ‘Multiple subcategorization and the theta-criterion: the case of climb.’ Natural Language and Linguistic Theory 3, 271–295.
Jackendoff R (1990). Semantic structures. Cambridge: MIT Press.
Jensen P A & Vikner C (1996). ‘The double nature of the verb have.’ In LAMBDA 21, OMNIS Workshop 23–24 Nov. 1995. Handelshøjskolen i København: Institut for Datalingvistik. 25–37.
Jensen P A & Vikner C (1994). ‘Lexical knowledge and the semantic analysis of Danish genitive constructions.’ In Hansen S L & Wegener H (eds.) Topics in knowledge-based NLP systems. Copenhagen: Samfundslitteratur. 37–55.
Johnston M (1995). ‘Semantic underspecification and lexical types: capturing polysemy without lexical rules.’ In Proceedings of ACQUILEX Workshop on Lexical Rules, August 9–11, 1995, Cambridgeshire.
Kayser D (1988). ‘What kind of thing is a concept?’ Computational Intelligence 4, 158–165.
Kiparsky P & Kiparsky C (1971). ‘Fact.’ In Steinberg D & Jakobovitz L (eds.) Semantics. Cambridge: Cambridge University Press. 345–369.
Lee C & Kim Y (2003). ‘The lexico-semantic structure of Korean inchoative verbs: With reference to -e-ci-ta class.’ In Bouillon P (ed.) Proceedings of International Workshop on Generative Lexicon. Geneva: University of Geneva.
Lee C (2000). ‘Numeral classifiers, (In-)Definites and Incremental Theme in Korean.’ In Lee C & Whitman J (eds.) Korean syntax and semantics: LSA Institute Workshop, Santa Cruz, ’91. Seoul: Thaehaksa.
Lee C (2003). ‘Change of location and change of state.’ In Bouillon P (ed.) Proceedings of the 2nd International Workshop on Generative Lexicon. Geneva: University of Geneva.
Lee C (2004). ‘Motion and state: verbs of tul-/na-(K) and hairu/deru (J) enter/exit.’ In Hudson E et al. (eds.) Japanese/Korean Linguistics 13. Stanford: CSLI.
Lee C & Im S (2003). ‘How to combine the verb ha-‘do’ with an entity type noun in Korean – Its cross-linguistic implications.’ In Bouillon P (ed.) Proceedings of International Workshop on Generative Lexicon. Geneva: University of Geneva.
Levin B & Rappaport Hovav M (1995). Unaccusativity: at the syntax–semantics interface. Cambridge: MIT Press.
Levin B (1993). Towards a lexical organization of English verbs. Chicago: University of Chicago Press.
Lyons J (1968). Introduction to theoretical linguistics. Cambridge: Cambridge University Press.
Maier D (1983). The theory of relational databases. Computer Science Press.
McCawley J D (1968). ‘The role of semantics in a grammar.’ In Bach E & Harms R T (eds.) Universals in linguistic theory. New York, NY: Holt, Rinehart, and Winston.
McCawley J (1968). Lexical insertion in a Transformational Grammar without Deep Structure. Proceedings of the Chicago Linguistic Society 4.
Mel’čuk I A (1988). ‘Semantic description of lexical units in an explanatory combinatorial dictionary: basic principles and heuristic criteria.’ International Journal of Lexicography 1, 165–188.
Miller G (1990). ‘WordNet: an on-line lexical database.’ International Journal of Lexicography 3, 235–312.
Miller G (1991). The science of words. Scientific American Library.
Moravcsik J M (1975). ‘Aitia as generative factor in Aristotle’s philosophy.’ Dialogue 14, 622–636.
Nunberg G (1979). ‘The non-uniqueness of semantic solutions: polysemy.’ Linguistics and Philosophy 3, 143–184.
Ostler N & Atkins B T (1992). ‘Predictable meaning shift: some linguistic properties of lexical implication rules.’ In Pustejovsky J & Bergler S (eds.) Lexical semantics and knowledge representation. Berlin: Springer Verlag.
Partee B H & Borschev V B (2000). ‘Possessives, favorite, and coercion.’ In Riehl A & Daly R (eds.) Proceedings of ESCOL 99. Ithaca: CLC Publications, Cornell University. 173–190.
Partee B H & Borschev V (2003). ‘Genitives, relational nouns, and argument-modifier ambiguity.’ In Lang E, Maienborn C & Fabricius-Hansen C (eds.) Modifying adjuncts. Berlin: Mouton de Gruyter. 67–112.
Partee B & Rooth M (1983). ‘Generalized conjunction and type ambiguity.’ In Bäuerle, Schwarze & von Stechow (eds.) Meaning, use, and interpretation of language. Walter de Gruyter.
Pinkal M (1995). ‘Radical underspecification.’ In Proceedings of the Tenth Amsterdam Colloquium.
Pinker S (1989). Learnability and cognition: the acquisition of Argument Structure. Cambridge: MIT Press.
Poesio M (1994). ‘Ambiguity, underspecification, and discourse interpretation.’ In Bunt H, Muskens R & Rentier G (eds.) International Workshop on Computational Semantics. University of Tilburg.
Pollard C & Sag I (1994). Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press/Stanford CSLI.
Pustejovsky J (1991). ‘The generative lexicon.’ Computational Linguistics 17(4).
Pustejovsky J (1991). ‘The syntax of event structure.’ Cognition 41, 47–81.
Pustejovsky J (1992). ‘Lexical semantics.’ In Shapiro S (ed.) Encyclopedia of artificial intelligence, 2nd ed. New York: Wiley.
Pustejovsky J (1994). ‘Semantic typing and degrees of polymorphism.’ In Martin-Vide (ed.) Current Issues in Mathematical Linguistics. Holland: Elsevier.
Pustejovsky J (1995). The Generative Lexicon. Cambridge: MIT Press.
Pustejovsky J (1995a). ‘Linguistic constraints on type coercion.’ In Saint-Dizier P & Viegas E (eds.) Computational lexical semantics. Cambridge: Cambridge University Press.
Pustejovsky J (1995b). The generative lexicon. Cambridge: MIT Press.
Pustejovsky J (1998). ‘Generativity and explanation in semantics: a reply to Fodor and Lepore.’ Linguistic Inquiry 29(2).
Pustejovsky J (1998). ‘The Semantics of lexical underspecification.’ Folia Linguistica.
Pustejovsky J (2001).
Pustejovsky J (2005).
Pustejovsky J & Boguraev B (1993). ‘Lexical knowledge representation and natural language processing.’ Artificial Intelligence 63, 193–223.
Reyle U (1993). ‘Dealing with ambiguities by underspecification: construction, representation, and deduction.’ Journal of Semantics 10, 123–179.
Rosen S (1989). Argument structure and complex predicates. Ph.D. diss., Brandeis University.
Sanfilippo A (1990). Grammatical relations, thematic roles, and verb semantics. Ph.D. diss., University of Edinburgh.
Sanfilippo A (1993). ‘LKB encoding of lexical knowledge.’ In Briscoe T, de Paiva V & Copestake A (eds.) Inheritance, defaults, and the lexicon. Cambridge: Cambridge University Press.
Schabes Y, Abeille A & Joshi A. ‘Parsing strategies with lexicalized grammars.’ In Proceedings of the 12th International Conference on Computational linguistics, Budapest.
Sowa J (1992). ‘Logical structures in the lexicon.’ In Pustejovsky J & Bergler S (eds.) Lexical semantics and knowledge representation. New York: Springer Verlag.
Steedman M (1997). Surface structure interpretation. Cambridge: MIT Press.
Talmy L (1975). ‘Semantics and syntax of motion.’ In Kimball J P (ed.) Syntax and semantics 4. New York: Academic Press.
Talmy L (1985). ‘Lexicalization patterns: semantic structure in lexical forms.’ In Shopen T (ed.) Language typology and syntactic description 3: grammatical categories and the lexicon. Cambridge: Cambridge University Press. 57–149.
Talmy L (1985). ‘Lexicalization patterns.’ In Shopen T (ed.) Language typology and syntactic description. Cambridge.
Tenny C & Pustejovsky J (2000). Events as grammatical objects. Stanford: CSLI Publications University of Chicago Press.
van Deemter K & Peters S (eds.) (1996). Ambiguity and underspecification. Stanford: CSLI, Chicago University Press.
Vendler Z (1967). Linguistics and philosophy. Ithaca: Cornell University Press.
Vikner C & Jensen P (2002). ‘A semantic analysis of the English genitive. Interaction of lexical and formal semantics.’ Studia Linguistica 56, 191–226.
Weinreich U (1959). ‘Travels through semantic space.’ Word 14, 346–366.
Weinreich U (1963). ‘On the semantic structure of language.’ In Greenberg J (ed.) Universals of language. Cambridge: MIT Press.
Weinreich U (1964). ‘Webster’s third: a critique of its semantics.’ International Journal of American Linguistics 30, 405–409.
Weinreich U (1972). Explorations in semantic theory. The Hague: Mouton.
Wilks Y (1975). ‘A preferential pattern seeking semantics for natural language inference.’ Artificial Intelligence 6, 53–74.
Williams E (1981). ‘Argument structure and morphology.’ Linguistic Review 1, 81–114.


Lexicon: Structure
K Allan, Monash University, Victoria, Australia
© 2006 Elsevier Ltd. All rights reserved.

A lexicon is a bin for storing listemes, language expressions whose meaning is not determinable from the meanings (if any) of their constituent forms and that, therefore, a language user must memorize as a combination of form and meaning. The way that a lexicon is organized depends on what it is designed to do.

1. The traditional Indo-European desktop dictionary is organized into alphabetically ordered entries in order to maximize look-up efficiency for the literate user.
2. Kitab al-Ayn, the Arabic dictionary of Al Khalil Ben Ahmad, has a phonetically based listing, beginning with velars and ending with bilabials (Kniffka, 1994).
3. A grammar that requires random lexical entry according to morphological class needs to access a lexicon through the morphosyntactic category of the lexicon entry.
4. The hearer accesses the mental lexicon from the phonological or graphological form of the entry.
5. The speaker accesses the mental lexicon through the meaning. However, slips of the tongue suggest that access presents a choice among several activated forms, from among which the inappropriate ones need to be inhibited (Aitchison, 2003: 220).
6. Someone producing an alliteration (e.g., around the rugged rocks the ragged rascal ran) needs to access the onsets to phonological forms.
7. Someone looking for a rhyme needs to be able to access entries via the endings of the phonological form.
8. The meaning of one word can remind us of another with a similar or contrary meaning; this suggests that meanings in the lexicon must be organized into a network of relations similar to those in a thesaurus.

Summing up, a lexicon must be accessible from three directions – form, morphosyntax, and meaning – none of which is intrinsically prior.

Formal specifications include graphological specification such as alternative spellings, hyphenation location, and use of uppercase. The phonological specification includes information on alternative pronunciations, syllable weight and structure, and the relative pitch and/or tone of the various syllables.

Morphosyntactic specifications include morphological and syntactic properties such as the inherent morphosyntactic (lexical) category of the item. Traditionally, the nouns helicopter and move and the semantically related verbs helicopter and move are all different listemes; yet, despite their different syntactic properties and concomitant meaning differences, the transitive verb move and the intransitive verb move traditionally share the same polysemous dictionary entry under different subheadings. The COBUILD English dictionary lists all forms of a listeme; for example, for clear it has clearer, clearest, clears, clearing, cleared (not divided between Adj and V listemes). Derived words are either generated from the listeme by morphological rules (e.g., Adv clearly from Adj clear) or else listed as distinct listemes. Should dictionaries list just morphemes or every word form, including inflected forms, or something in between? For various views see Butterworth (1983), Flores d’Arcais and Jarvella (1983), Altman (1990), Altman and Shillcock (1993), Allan (2001), and Aitchison (2003). Regularities need to be signaled, for example, the declension and gender of Latin nouns and the conjugation of verbs. So, too, must irregularities be signaled, for example, the past tense of English strong verbs, and constraints on range, such as the fact that an object NP can interrupt the V + vPrt of some phrasal verbs (run NP down; ‘denigrate NP’) but not others (put *NP up with; ‘endure NP’).

Semantic specifications identify the senses of a listeme based on the salient characteristics of its typical denotatum (see Stereotype Semantics). The form of the semantic specification depends on the preferred metalanguage (see Metalanguage versus Object Language; Cognitive Semantics; Dynamic Semantics; Frame Semantics; Lexicon, Generative; Lexical Conceptual Structure; Natural Semantic Metalanguage).
Boguraev and Briscoe (1989: 5) add encyclopedic knowledge to the list of semantic specifications, and many lexicographers would agree that a lexicon should also be a cultural index; thus, the Collins English dictionary (Hanks, 1998), New Oxford dictionary of English (Pearsall, 1998), and Collins English dictionary (Butterfield, 2003) list encyclopedic information about bearers of certain proper names (on the networked relationship between lexicon and encyclopedia, see Dictionaries and Encyclopedias: Relationship). Bauer (1983: 196) proposed a category of ‘stylistic specifications’ to distinguish among piss, piddle, and micturate to reflect the kind of metalinguistic information found in traditional desktop dictionary tags such as ‘colloquial,’ ‘slang,’ ‘derogatory,’ ‘medicine,’ and ‘zoology.’ Such metalinguistic information is more appropriate to the encyclopedia entry directly connected with every lexicon entry. The encyclopedia will also supply etymological information, which is not essential to the proper use of the listemes, and an account of the connotations (see Connotation) of listemes, which is. Bauer also suggested a signaling of ‘related lexemes’ such as that edible is related to eat. This should be identified through the semantic specification, which will differentiate the sense of edible from eatable (cf. Lehrer, 1990: 210). Contraries and antonyms are linked to a listeme via its semantic field (see Lexical Fields).

Metalanguage terms used in semantic specifications are often semantically identical with object language listemes; for example, Pustejovsky (1995: 101) specified book as a “physical object” that “holds” “information” created by someone who “write[s]” it and whose function is to be “read.” Certainly, there is a relation among book, write, and read that needs to be accounted for either in the semantic specification or the associated encyclopedia entry. Used in the morphosyntactic specifications, such category terms as noun, verb, adjective, and feminine are part of the metalanguage, not the object language. But they also appear in the lexicon as expressions in the object language. Suppose that the sense ‘typically names a person, place, thing or abstract entity’ has an address in the lexicon (this is explained later). Next, suppose that in the lexicon of English, one form at this address is noun; in the lexicon of French, one form at this address is nom; and in the lexicon of Greek, it is ónoma. Suppose that in the lexicon of the metalanguage there is a form N at this same address. This move simply reflects the well-known and widely accepted fact that there are translation (near) equivalences between the object language and metalanguage, just as there are between different natural languages.
The lexicon is used in string matching among formal specifications, such as retrieving all listemes beginning with /d/ or those rhyming with cord. Similarly, a search of the morphosyntactic specification of every entry is the best way to compile a list of all nouns in the lexicon. And the semantic specification of a listeme will locate it within its semantic field and relate it to superordinate items, contraries, antonyms, and the like. Many of these relations are not primarily lexical but arise from the relationships characteristic of the denotatum of a listeme; for example, because a cat is perceived to be a kind of animal, the listeme animal is semantically superordinate to the listeme cat. Any two listemes may be connected through their formal, morphosyntactic, and/or semantic specifications in the lexicon and through the stylistic or etymological properties identified in the encyclopedia.
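The three kinds of search just described can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Entry` fields, the integer addresses, and the helper names are hypothetical, spelling stands in for the phonological form, a fixed-length spelled ending stands in for the rhyme, and the glosses for door, cord, and lord are invented placeholders rather than real semantic specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    address: int    # arbitrary identifier standing in for the lexicon address
    form: str       # spelling, standing in for the phonological form
    category: str   # morphosyntactic category, e.g. "N" or "V"
    sense: str      # informal gloss, standing in for a semantic specification


# A toy lexicon; the two dog entries share a form but differ in category.
ENTRIES = [
    Entry(101, "dog", "N", "canine animal"),
    Entry(102, "dog", "V", "follow dog-like"),
    Entry(103, "door", "N", "hinged barrier closing an entrance"),
    Entry(104, "cord", "N", "thin rope"),
    Entry(105, "lord", "N", "nobleman"),
]


def with_onset(entries, onset):
    """Forms that begin with the given onset (formal specification)."""
    return sorted({e.form for e in entries if e.form.startswith(onset)})


def rhyming_with(entries, word, rime_length=3):
    """Other forms that share the last `rime_length` letters of `word`."""
    rime = word[-rime_length:]
    return sorted(
        {e.form for e in entries if e.form != word and e.form.endswith(rime)}
    )


def of_category(entries, category):
    """Forms whose morphosyntactic specification matches `category`."""
    return sorted({e.form for e in entries if e.category == category})


print(with_onset(ENTRIES, "d"))
print(rhyming_with(ENTRIES, "cord"))
print(of_category(ENTRIES, "N"))
```

Each helper scans the same list of entries through a different specification, which reflects the point that no single access route is privileged. A realistic lexicon would index phonological representations rather than spellings.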

Figure 1 Jackendoff-style lexicon entry for dog.

Lyons (1977: 516) suggested that each lexicon entry be assigned a random number that serves as its address for access through any of the three modes. This proposal is very similar to one in Jackendoff (1995; 1997: 89), which “licence[s] the correspondence of certain (near-) terminal symbols of syntactic structure with phonological and conceptual structures” (see Lexical Conceptual Structure). Jackendoff proposed that a lexicon entry has three parts, as shown in Figure 1, where they are tagged with our specification types. In Jackendoff’s theory, the morphosyntactic specification is the linchpin: it is co-indexed by the subscript j with the formal specification on the left-hand side and by the subscript n with the semantic specification on the right. This seems correct. If we ask someone the meaning of the word fly (which is ambiguous in many ways), two possible answers are ‘travel through the air’ (V) and ‘cloth cover for an opening’ (N). Similarly, the word dog means ‘canine animal’ (N) and ‘follow dog-like’ (V). The word canine means ‘a member of the dog family’ (N) and ‘pertaining to the dog family’ (Adj). The difference between such pairs is indicated by the combination of formal and morphosyntactic specification, which corresponds to Jackendoff’s subscript j in Figure 1. This distinguishing factor is exactly the reason that a traditional desktop dictionary identifies the morphosyntactic category immediately after identifying the form of a listeme. Example (1) shows how much morphosyntax contributes to meaning.

(1) the toves gimbled in the wabe

In this example, toves is clearly a plural noun and therefore denotes more than one entity, gimbled is the past tense of a verb and so denotes some act or state of the toves, and wabe is a noun; because it falls within the scope of the preposition in, wabe identifies a place or time. Thus, the morphosyntactic specification of these words gives us some meaning to work on, as suggested by Jackendoff’s subscript n in Figure 1. In addition, the form distinguishes between the nouns tove and wabe. For the speaker, the main function of the lexicon is to find a form to express the intended meaning, but not just a form; it should also find its morphosyntactic category, which is often at least partly determined by the content. For instance, a reference to a thing will usually require using a noun; a reference to an event will usually require a verb. Except when parroting a phrase, it is impossible when speaking a language to use a form without assigning it a morphosyntactic specification. Likewise, it is impossible to assign a morphosyntactic specification to a meaning without also assigning it a form (the appropriate form for a zero morph is, of course, null). Someone who asks ‘What’s the word that means “get back at”?’ may not be able to retrieve the form retaliate, but the intended meaning is conveyed using the forms of other lexemes. In terms of Figure 1, the speaker goes from the subscript n on the semantic specification to the morphosyntactic specification jNn that links to the formal specification through the j subscript. For the hearer, the main function of the lexicon is to attach meaning to forms that normally occur within syntactic structures, so it is a matter of finding a semantic specification for something that is both formally and morphosyntactically specified. In terms of Figure 1, the hearer goes from j to n. Jackendoff’s subscripts correspond to connection points for bidirectional lines in a network as in Figure 2, which shows that their sequence with respect to the formal specifications (F), morphosyntactic specifications (M), and semantic specifications (S) is a convention without substance; that is, fF = Ff and so forth.

Figure 2 Networked components of an entry in the hypothetical lexicon. F, formal specs; M, morphosyntactic specs; S, semantic specs.

Suppose there are the grossly oversimplified components in Table 1. The formal and semantic addresses are randomly selected numbers, tagged with f and s, respectively. Table 1 ignores (1) the fact that the semantic specifications must somehow reflect the facts that a dog is an animal, a living thing, and a physical object and (2) the need to identify the kind of countable and uncountable environments in which a noun can conventionally occur; the thematic structure, valency, or frame of a verb; and the gradability of an adjective. Figure 3, which represents a part of this network of relations, does not account for the question of how the graphological and phonological forms should be correlated.

Table 1 Partial entries for dog and canine (a)

F               M                S
f8899 dog       f8899Ns0017      dog′ s0017
                f8899Vs3214      follow_like_a_s0017′ s3214
f7656 canine    f7656Ns7439      conical_pointed_tooth′ s7439
                f7656Adjs1227    pertaining_to_a_s0017′ s1227

(a) F, formal specs; M, morphosyntactic specs; S, semantic specs.

Figure 3 Part of the network for the lexicon entries for dog (N and V) and canine (N and Adj).
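The network built from Table 1 can be sketched as three small stores joined by co-indexed links. The addresses and glosses are those of Table 1; the Python layout and the function names hearer, speaker, and senses_of are assumptions made for illustration, and the sketch inherits every simplification that Table 1 makes.

```python
# F: formal specifications, keyed by formal address.
FORMS = {"f8899": "dog", "f7656": "canine"}

# S: semantic specifications, keyed by semantic address (glosses as in Table 1).
SENSES = {
    "s0017": "dog'",
    "s3214": "follow_like_a_s0017'",
    "s7439": "conical_pointed_tooth'",
    "s1227": "pertaining_to_a_s0017'",
}

# M: each morphosyntactic specification co-indexes a formal address,
# a category, and a semantic address, e.g. f8899 N s0017.
LINKS = [
    ("f8899", "N", "s0017"),
    ("f8899", "V", "s3214"),
    ("f7656", "N", "s7439"),
    ("f7656", "Adj", "s1227"),
]


def hearer(form, category):
    """Hearer's direction (F -> M -> S): form plus category to sense glosses."""
    return [SENSES[s] for f, c, s in LINKS if FORMS[f] == form and c == category]


def speaker(sense_address):
    """Speaker's direction (S -> M -> F): sense address to (form, category) pairs."""
    return [(FORMS[f], c) for f, c, s in LINKS if s == sense_address]


def senses_of(form):
    """All categories and glosses reachable from one form."""
    return {c: SENSES[s] for f, c, s in LINKS if FORMS[f] == form}


print(hearer("dog", "V"))
print(speaker("s1227"))
print(senses_of("canine"))
```

Because each link is a plain triple, the same data serve both directions of lookup, which is one way to read the claim that the connectors are bidirectional and that the order of the indices is a convention without substance.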

The form of a lexicon entry is a co-indexed tripartite network in which the indices correspond to bidirectional connectors between the components of a lexicon entry, as shown in Figure 2. Each of the formal, morphosyntactic and semantic specifications is also linked with the encyclopedia (see Dictionaries and Encyclopedias: Relationship). Such connectors model cognitive pathways that presumably have a neurological basis and could be empirically disconfirmed (e.g., by studies of aphasics). There is evidence for a spreading activation along pathways within the mental lexicon that leads to selection among partial matches between meaning and form for the appropriate listeme (Altman, 1990; Levelt, 1993; Aitchison, 2003). There is, of course, nothing to be gained by rewriting desktop dictionaries in terms of networks if they are going to be consulted in the same way human beings have consulted them in the past. See also: Cognitive Semantics; Connotation; Dictionaries and Encyclopedias: Relationship; Dynamic Semantics; Formal Semantics; Frame Semantics; Lexical Conceptual Structure; Lexical Fields; Lexicon, Generative; Metalanguage versus Object Language; Natural Semantic Metalanguage; Polysemy and Homonymy; Semantics in Psychology; Speech Errors: Psycholinguistic Approach; Stereotype Semantics; Word Recognition, Written.



Ælfric (fl. 987–1010) M Cummings, York University, Toronto, Canada © 2006 Elsevier Ltd. All rights reserved.

Recognized as the greatest Old English prose stylist, Ælfric is also the author of the first grammar in English. In 987 he went as master of novices to the monastery of Cernel (modern Cerne Abbas, Dorset), already in orders, and thus he was at least 30 years of age. He had been a pupil of the reforming Bishop Æthelwold in the school at Winchester. With his contemporary Wulfstan, Archbishop of York (obit. 1023), he was one of the two most prominent literary figures of the Benedictine Reform. His production of the Catholic Homilies belongs to the period 990–995, that of the Lives of Saints was no later than 998, and he became the first Abbot of Eynsham, near Oxford, in 1005. His work in language education includes the grammar, a glossary, and the Colloquy on the Occupations. The first of these works is the Excerptiones de arte grammatica anglice, designed to introduce junior students to the Latin grammar. It is based on the Excerptiones de Prisciano, a Latin treatise of the 9th century, compiled from the 6th century Institutiones grammaticae of Priscian, and like it concentrates on Latin syllabification, graphology, phonology, parts of speech, and morphology. The Glossary is a compilation of Latin vocables with Old English glosses. The Colloquy is an imaginary Latin dialogue between a master and his pupils, who represent themselves as having various occupational roles, the ploughman, the huntsman, the fisherman, and so forth. Each describes the habits and rigors of the occupation, sometimes sympathetically. Ælfric’s other extant works include the Catholic Homilies; the Lives of Saints; a De temporibus anni; and a collection of letters, most notably one to the bishop of Sherborne, Wulfsige, and two to Archbishop Wulfstan, designed as expert responses

to theological questions. His prose is remarkably lucid in comparison with that of some authors of the period. Parts of his work are rendered in a heightened style that has been called rhythmic prose, which in its metrical regularity shares some of the characteristics of Old English verse but is distinct from it. His work has long been taken as the exemplar of the Late West Saxon historical and regional dialect. The Catholic Homilies were produced in two series that together comprise alternate English sermons for Sundays and saints’ days. Like the grammar, however, they are based on a Latin exemplar, the late 8th century Homiliary of Paul the Deacon, which was created for reading in the monastic offices. The Lives of Saints are English versions of lives and martyrdoms for saints celebrated in the monastic liturgy, mainly taken from a 9th century Latin collection. The De temporibus anni inculcates the chronology necessary for the calculation of the Easter date. See also: Christianity and Language in the Middle Ages;

Christianity, Catholic; English, Old English; Grammar, Early Medieval; Language Education: Grammar; Language Education: Vocabulary; Language Teaching: History; Latin Lexicography; Latin; Learners’ Dictionaries; Priscianus Caesariensis (d. ca. 530); Roman Ars Grammatica; Traditional Grammar.

Bibliography Clemoes P A M (1959). ‘The chronology of Ælfric’s works.’ In Clemoes P A M (ed.) The Anglo-Saxons: Studies in some aspects of their history and culture, presented to Bruce Dickens. London: Bowes and Bowes. 212–247. Godden M (2000). Ælfric’s Catholic homilies: Introduction, commentary and glossary (Early English Text Society, s. s. 18). Oxford: Oxford University Press. Porter D W (ed.) (2002). Excerptiones de Prisciano: The source for Ælfric’s Latin-Old English grammar. Cambridge: D. S. Brewer.

Affixation A Carstairs-McCarthy, University of Canterbury, Christchurch, New Zealand © 2006 Elsevier Ltd. All rights reserved.

Definition
An affix is a bound morph that (1) is not a root and (2) is a constituent of a word rather than of a phrase or sentence. Some examples that follow illustrate the implications of (1) and (2). The next section surveys the kinds of affixation that occur, and the last sections discuss theoretical issues. In most complex words, identifying a root (or roots, if the word is a compound) presents little difficulty. For example, in the words misfortunes and premeditated, the roots are clearly fortune and

Lhuyd, Edward (ca. 1660–1709) 151

Bibliography
Aitchison J (2003). Words in the mind: an introduction to the mental lexicon (3rd edn.). Oxford: Blackwell.
Allan K (2001). Natural language semantics. Oxford and Malden, MA: Blackwell.
Altman G T M (ed.) (1990). Cognitive models of speech processing: psycholinguistic and computational perspectives. Cambridge, MA: MIT Press.
Altman G T M & Shillcock R (eds.) (1993). Cognitive models of speech processing. Hove, NJ: Lawrence Erlbaum.
Bauer L (1983). English word-formation. Cambridge, UK: Cambridge University Press.
Boguraev B & Briscoe T (1989). ‘Introduction.’ In Boguraev B & Briscoe T (eds.) Computational lexicography for natural language processing. London: Longman. 1–40.
Butterfield J et al. (eds.) (2003). Collins English dictionary: complete and unabridged (6th edn.). Bishopriggs (Glasgow), UK: HarperCollins.
Butterworth B (1983). ‘Lexical representation.’ In Butterworth B (ed.) Development, writing and other language processes, vol. 2. London: Academic Press.
Flores d’Arcais G B & Jarvella R J (eds.) (1983). The process of language understanding. New York: John Wiley.
Hanks P et al. (eds.) (1998). Collins English dictionary (4th edn.). Glasgow: HarperCollins.
Jackendoff R S (1995). ‘The boundaries of the lexicon.’ In Everaert M, van der Linden E-J, Schenk A & Schreuder R (eds.) Idioms: structural and psychological perspectives. Hillsdale, NJ: Erlbaum. 133–165.
Jackendoff R S (1997). Architecture of the language faculty. Cambridge, MA: MIT Press.
Kniffka H (1994). ‘Hearsay vs autoptic evidence in linguistics.’ In Nagel T et al. (eds.) Zeitschrift der deutschen morgenländischen Gesellschaft. Stuttgart: Franz Steiner. 345–376.
Lehrer A (1990). ‘Polysemy, conventionality, and the structure of the lexicon.’ Cognitive Linguistics 1, 207–246.
Levelt W J M (ed.) (1993). Lexical access in speech production. Oxford: Blackwell.
Lyons J (1977). Semantics (2 vols). Cambridge, UK: Cambridge University Press.
Pearsall J et al. (eds.) (1998). New Oxford dictionary of English. Oxford: Oxford University Press.
Pustejovsky J (1995). The generative lexicon. Cambridge, MA: MIT Press.

Lhuyd, Edward (ca. 1660–1709) B F Roberts, Ceredigion, UK © 2006 Elsevier Ltd. All rights reserved.

The first Celtic scholar in the modern sense, Edward Lhuyd was familiar with all the insular Celtic vernaculars, and it was his systematic analyses of the phonological relationships between them and continental Celtic that justified the use of the term ‘Celtic’ as a linguistic category. More generally, the theory and methodology that underpin his study of language relationships are significant contributions to European comparative philology. Edward Lhuyd (or Lhwyd, the Welsh form of his family name Lloyd, which he adopted about 1690), the illegitimate son of Edward Lloyd, last squire of the bankrupt estate of Llanforda, Oswestry, Shropshire, England, and of Bridget Pryse, daughter of a cadet branch of the influential Pryse family of Gogerddan, Cardiganshire, Wales, was born about 1660 and raised in Shropshire where he lived with his father at Llanforda. The Lloyds were an old Oswestry family who had been patrons of poets and harpists and who had played their part in local administration, but none of these activities survived the English Civil Wars (1642–1646, 1650–1651). Edward Lloyd, in spite of his reputation as being

both reckless and dissolute, is an interesting example of 17th-century scientific culture. He was an enterprising horticulturalist, carried out his own chemical experiments, and was patron to an experienced professional gardener and field botanist, Edward Morgan. Relationships between father and son were often strained, but Lhuyd regarded the aging Morgan as one of his tutors, and there is no doubt that he took an interest in his father’s experiments. Following his father’s death, Lhuyd went to Jesus College, Oxford, in 1682 to begin his training in law, but practical botany and related field work, which he had already undertaken in north Wales and around Oswestry, and chemical and other experiments at Llanforda led him to the recently established Ashmolean Museum and to the meetings of the Oxford Philosophical Society. Robert Plot, professor of chemistry and Keeper of the Museum, was sufficiently impressed by the young man’s skills and ideas to appoint him as an assistant at the Ashmolean, where, in turn, he became Keeper in 1691. Lhuyd threw himself into the work of the museum – cataloguing library accessions, seeking out new samples to enrich the fossil collections (his catalogue of British ‘formed stones’ appeared in 1699), and developing his knowledge of flora habitats, especially the Alpine plants of Snowdonia, north Wales, where he discovered Lloydia serotina, the


‘Snowdon lily,’ sometime before 1696. Typical of his time, Lhuyd did not draw lines of demarcation between disciplines; he was as interested in antiquities, onomastics, and languages as in paleontology and botany, seeing them all as aspects of history, natural and human. These interests were brought together in 1693 when he was invited to revise the entries on the Welsh counties in Edmund Gibson’s new English edition of Camden’s Britannia (1695). The field work Lhuyd undertook whetted his appetite, and he welcomed an invitation by ‘some gentlemen in Glamorganshire’ to write a ‘British Dictionary, Historical and Geographical . . . and a Natural History of Wales.’ He seized the opportunity to undertake an extended tour of ‘Celtic Britain’ (Wales, the Scottish Highlands, Ireland, Cornwall, and Brittany) from 1697 to 1701 to gather material for an ambitious series of books bearing the general title Archaeologia Britannica. The invitation was not fortuitous, as Lhuyd’s broadening interests in antiquities and in language as a tool for historians were well known. He had already collaborated with John Ray in the latter’s Collection of English words not generally used (1691) in a discussion on the similarities between some north-country English dialect words and Welsh, a discussion that reveals Lhuyd’s familiarity with current linguistic thinking and questions of language contact. He began learning Irish about this time, and linguistic queries were increasingly being put to him. Lhuyd’s tour with his four assistants produced a remarkable collection of raw data: samples of flora and fauna, descriptions of parishes, antiquities and customs, collections of literary manuscripts and transcripts, dictionaries and wordlists, and transcriptions of spoken vernacular material. Lhuyd analyzed much of the linguistic data on his return to Oxford in 1701 and in 1707 saw the publication of Glossography, volume I of Archaeologia Britannica, the only part to appear.
He died two years later, in 1709. The underlying theme of Lhuyd’s project was historical, one of his aims being to present a ‘Clearer Notion of the first Planters of the Three Kingdoms [Wales was subsumed within the crown of England] and a better understanding of our Ancient Names of Persons and Places.’ The language of place-name elements and naming compounds would indicate the peoples responsible for their creation, and knowledge of the ‘original’ languages of Britain, their inscriptions, and early literatures was a necessary skill for the historian. Glossography, therefore, contained grammars of Irish, Breton, and Cornish (Welsh was already catered for) together with dictionaries of Breton, Welsh, and Irish (Lhuyd’s Cornish dictionary was set aside in the expectation that another was being prepared, but it remains in manuscript). But

even more important than language skills was the recognition that the key to the history lay in the vocabulary of these languages; an analysis of the phonology of words in the cognate ‘original’ languages would reveal their affinities while borrowings were an indicator of contacts between their speakers. Etymology was already an accepted tool of historical linguistics, but without a rational framework to categorize sound changes, it was prone to deteriorate into meaningless comparisons of superficially similar words in any two or three languages. Lhuyd’s major contribution was to reduce the arbitrary element in etymology by formulating criteria for examining and recognizing regular phonetic correspondence and determining what languages might bear valid comparison. These criteria he set out in the opening chapter of Glossography, ‘Comparative Etymology, or Remarks on the Alteration of Letters.’ His guidelines for etymology are reasoned, e.g., identity of signification, declensions, and conjugations are proper to individual languages, old or obsolete words may be found in modern dialects, the words to be compared should be the oldest and most simple, the obvious and necessary. Lhuyd argued that languages develop from a parent by changes in the sense of words, by an accidental transposition of sounds or syllables, by the addition or loss of syllables, by the use of different prefixes or suffixes, by mispronunciation, and by borrowings; and he then worked through a host of examples according to a number of ‘observations’ summarized under 10 headings, including seven classes of phonetic change. Lhuyd’s reasoned analysis of permutatio litterarum allowed him to define better than ever before the relationship of Irish, Welsh, and ‘Gaulish.’ Most of Lhuyd’s criteria for language change were phonological; he realized that different orthographical conventions in the various languages could mislead the etymologist. 
He devised his own phonetic alphabet, a ‘general alphabet’ to facilitate genuine comparison between languages. In this, as in his methodology of comparative linguistics, Edward Lhuyd was a pioneering forerunner of the Celtic studies and linguistic scholarship of the 19th century. See also: Breton; Cornish.

Bibliography
Cram D (1999). ‘Edward Lhuyd and the doctrine of the permutation of letters.’ In Hassler G & Schmitter P (eds.) Sprachdiskussion und Beschreibung von Sprachen im 17. und 18. Jahrhundert. Münster. 317–335.
Ellis R (1907). ‘Some incidents in the life of Edward Lhuyd.’ Transactions of the Honourable Society of Cymmrodorion. 1–51.
Emery F V (1969). ‘“The best naturalist now in Europe”, Edward Lhuyd, FRS, (1660–1709).’ Transactions of the Honourable Society of Cymmrodorion 54–69.
Emery F V (1971). Edward Lhuyd, FRS, 1660–1709. Cardiff: University of Wales Press.
Gunther R T (1945). Life and letters of Edward Lhwyd. Early Science in Oxford, XIV.
Lhuyd E (1971). Archaeologia Britannica, vol. 1, Glossography. Oxford. Facsimile reprints, Menston: Scolar Press, 1969; Shannon: Irish University Press, 1971; vol. 2, pt. 1 of Celtic Linguistics, 1700–1800, London/New York: Routledge, 2000.
Luidius E (1699). Lithophylacii Britannici Ichnographia sive Lapidum aliorumque Fossilium Britannicorum. . . Distributio Classica. London.
Roberts B F (1980). Edward Lhuyd: the making of a scientist. Cardiff: University of Wales Press.
Roberts B F (1986). ‘Edward Lhuyd and Celtic linguistics.’ In Evans D E, Griffith J G & Jope E M (eds.) Proceedings of the Seventh International Congress of Celtic Studies (Oxford, 10–15 July, 1983). Oxford: D. Ellis Evans. 1–9.
Roberts B F (1998). ‘Cyhoeddiadau Edward Lhwyd’ (Edward Lhuyd’s Publications). Welsh Book Studies I 21–58.
Roberts B F (1999). ‘The discovery of Old Welsh.’ Historiographia Linguistica XXVI 1–21.
Sommerfelt A (1952). ‘Edward Lhuyd and the comparative method in linguistics.’ Norsk Tidsskrift for Sprogvidenskap 16, 370–374.
Williams G J (1973–1974). ‘The history of Welsh scholarship.’ Studia Celtica 8/9, 195–219.

Li Fang-Kuei (1902–1989) R J LaPolla, La Trobe University, Bundoora, Australia © 2006 Elsevier Ltd. All rights reserved.

Fang-Kuei Li was one of the foremost scholars of Thai and Sino-Tibetan studies and a major contributor to Amerind studies. Born in China, he was one of the early scholars sent to the United States to study. He had developed an interest in language while learning English, Latin, and German as part of his studies in China, and so he decided to study linguistics in the United States. In 1924, he went to the University of Michigan at Ann Arbor, receiving his B.A. 2 years later, then moved to the University of Chicago, where he received his M.A. and Ph.D., studying with Edward Sapir, Leonard Bloomfield, and Carl Darling Buck. Sapir had the greatest influence on Li. From Sapir he learned phonetics, field methods, and about the languages of Native Americans. Sapir also encouraged Li to read articles on East and Southeast Asian languages. It was Sapir’s influence that led Li to take up Thai, Sino-Tibetan and Amerind linguistics. In 1927, Sapir took Li to California to do field work. At first they worked together, but then Sapir sent Li off on his own to work on the Mattole language. There were only two Mattole Indians left at that time, and Li’s notes, which he later used as the basis of his Ph.D. dissertation, are the only record of that language. With Bloomfield, Li studied Germanic linguistics and the methods of text analysis. With Carl Darling Buck, Li studied Indo-European linguistics, especially Greek and Latin. Buck also got Li a fellowship to Harvard in 1928, where Li studied Sanskrit and

Tibetan for 6 months. Then, with letters of recommendation from Franz Boas, Li went to Europe to visit linguists there, such as Walter Simon. After returning to North America in 1929, Li spent 3 months living on an island in the middle of the Mackenzie River north of the Arctic Circle doing fieldwork on the Hare language. On his return to China, he was invited to become a member of the Academia Sinica, where he worked on the Chinese dialects on Hainan Island and also continued to work on Tibetan. Li married Ying Hsu in 1932 and later had three children, Lindy, Peter, and Annie. In 1933, Li went to Thailand to learn Classical Thai, then went to Guangxi to study the Tai languages there. This led to some of his major contributions. Li returned to the United States in 1937 to be a visiting professor at Yale for 2 years, then spent the war years in southwest China, from 1943 to 1946 as a visiting professor at the relocated Yanjing University in Chengdu. In 1946, he returned again to the United States, accepting visiting professorships at Harvard for 2 years and at Yale for 1 year. In 1949, Li accepted a job at the University of Washington in Seattle and stayed there until he retired in 1969, then accepted an appointment at the University of Hawaii at Manoa, where he stayed until 1974, retiring a second time. Though he retired from teaching, Li continued doing research and publishing. In all he authored nine books and over 100 articles. In addition to his academic accomplishments, Li was a talented painter in Chinese and Western watercolors. He played the Chinese flute well and enjoyed singing and teaching Kunqu, the musical drama of the Ming dynasty.

Li Fang-Kuei (1902–1989) 153 Emery F V (1969). ‘‘‘The best naturalist now in Europe’’, Edward Lhuyd, FRS, (1660–1709).’ Transactions of the Honourable Society of Cymmrodorion 54–69. Emery F V (1971). Edward Lhuyd, FRS, 1660–1709. Cardiff: University of Wales Press. Gunther R T (1945). Life and letters of Edward Lhwyd. Early Science in Oxford, XIV. Lhuyd E (1971). Archaeologia Britannica, vol. 1, Glossography. Oxford. Facsimile reprints, Menston: Scolar Press, 1969; Shannon: Irish University Press, 1971; vol. 2, pt. 1 of Celtic Linguistics, 1700–1800, London/New York: Routledge, 2000. Luidius E (1699). Lithophylacii Britannici Ichnographia sive Lapidum aliorumque Fossilium Britannicorum. . . Distributio Classica. London.

Roberts B F (1980). Edward Lhuyd: the making of a scientist. Cardiff: University of Wales Press. Roberts B F (1986). ‘Edward Lhuyd and Celtic linguistics.’ In Evans D E, Griffith J G & Jope E M (eds.) Proceedings of the Seventh International Congress of Celtic Studies (Oxford, 10–15 July, 1983). Oxford: D. Ellis Evans. 1–9. Roberts B F (1998). ‘Cyhoeddiadau Edward Lhwyd’ (Edward Lhuyd’s Publications). Welsh Book Studies I 21–58. Roberts B F (1999). ‘The discovery of Old Welsh.’ Historiographia Linguistica XXVI 1–21. Sommerfelt A (1952). ‘Edward Lhuyd and the comparative method in linguistics.’ Nord Tidsskrift for Spogvidenskap 16, 370–374. Williams G J (1973–1974). ‘The history of Welsh scholarship.’ Studia Celtica 8/9, 195–219.

Li Fang-Kuei (1902–1989)
R J LaPolla, La Trobe University, Bundoora, Australia © 2006 Elsevier Ltd. All rights reserved.

Fang-Kuei Li was one of the foremost scholars of Thai and Sino-Tibetan studies and a major contributor to Amerind studies. Born in China, he was one of the early scholars sent to the United States to study. He had developed an interest in language while learning English, Latin, and German as part of his studies in China, and so he decided to study linguistics in the United States. In 1924, he went to the University of Michigan at Ann Arbor, receiving his B.A. 2 years later, then moved to the University of Chicago, where he received his M.A. and Ph.D., studying with Edward Sapir, Leonard Bloomfield, and Carl Darling Buck.

Sapir had the greatest influence on Li. From Sapir he learned phonetics, field methods, and the languages of Native Americans. Sapir also encouraged Li to read articles on East and Southeast Asian languages. It was Sapir’s influence that led Li to take up Thai, Sino-Tibetan, and Amerind linguistics. In 1927, Sapir took Li to California to do fieldwork. At first they worked together, but then Sapir sent Li off on his own to work on the Mattole language. There were only two Mattole Indians left at that time, and Li’s notes, which he later used as the basis of his Ph.D. dissertation, are the only record of that language. With Bloomfield, Li studied Germanic linguistics and the methods of text analysis. With Carl Darling Buck, Li studied Indo-European linguistics, especially Greek and Latin. Buck also got Li a fellowship to Harvard in 1928, where Li studied Sanskrit and Tibetan for 6 months. Then, with letters of recommendation from Franz Boas, Li went to Europe to visit linguists there, such as Walter Simon.

After returning to North America in 1929, Li spent 3 months living on an island in the middle of the Mackenzie River north of the Arctic Circle doing fieldwork on the Hare language. On his return to China, he was invited to become a member of the Academia Sinica, where he worked on the Chinese dialects on Hainan Island, and also continued to work on Tibetan. Li married Ying Hsu in 1932 and later had three children, Lindy, Peter, and Annie. In 1933, Li went to Thailand to learn Classical Thai, then went to Guangxi to study the Tai languages there. This led to some of his major contributions.

Li returned to the United States in 1937 to be a visiting professor at Yale for 2 years, then spent the war years in southwest China, including the years 1943–1946 as a visiting professor at the relocated Yanjing University in Chengdu. In 1946, he returned again to the United States, accepting visiting professorships at Harvard, for 2 years, and Yale, for 1 year. In 1949, Li accepted a job at the University of Washington in Seattle and stayed there until he retired in 1969. He then accepted an appointment at the University of Hawaii at Manoa, where he stayed until 1974, retiring a second time. Though he retired from teaching, Li continued doing research and publishing. In all he authored nine books and over 100 articles.

In addition to his academic accomplishments, Li was a talented painter in Chinese and Western watercolors. He played the Chinese flute well, and enjoyed singing and teaching Kunqu, the musical drama of the Ming dynasty.

See also: Bloomfield, Leonard (1887–1949); Sapir, Edward (1884–1939); Sino-Tibetan Languages; Tai Languages.

Bibliography
Li F-K (1930). Mattole, an Athabascan language. Chicago: University of Chicago Press.
Li F-K (1936–1937). ‘Languages and dialects.’ In The Chinese year book. 121–128. (First and very influential classification of the languages of China. Reprinted in Journal of Chinese Linguistics 1.1:1–13, 1973.)
Li F-K (1940). The Tai dialects of Lungchow. Monographs of the Institute of History and Philology, Academia Sinica. Shanghai: Commercial Press. (Reprinted as The Zhuang language of Wu-Ming. Beijing: Chinese Academy of Sciences, 1953.)
Li F-K (1943). Notes on the Mak language. Monographs of the Institute of History and Philology, Academia Sinica. Shanghai: Commercial Press. (Also in Bulletin of the Institute of History and Philology 19:1–80, 1948.)
Li F-K (1956). ‘The Tibetan inscription of the Sino-Tibetan Treaty of 821–822.’ T’oung Pao 44, 1–99.
Li F-K (1977). A handbook of comparative Tai. Manoa: The University of Hawaii Press.
Li F-K (1977). A study of the Sui language. Monographs of the Institute of History and Philology, Academia Sinica. Taipei: Commercial Press.
Li F-K (1980). Shanggu yin yanjiu (Studies on Old Chinese). Beijing: Shangwu Yinshu Guan. (Very influential reconstruction of Old and Middle Chinese.)
Li F-K (1989). Linguistics East and West: Sino-Tibetan, Tai, and American Indian, recorded and edited by Ning-ping Chan & Randy J LaPolla. Transcript of interviews with Prof. Fang-Kuei Li to record his oral history. Published by the Regional Oral History Office, a department of the Bancroft Library, UC Berkeley.
Li F-K & Coblin W South (1987). A study of the Old Tibetan inscriptions. Institute of History and Philology, Academia Sinica Special Publications #91. Taipei: Commercial Press.

Liberia: Language Situation
Editorial Team © 2006 Elsevier Ltd. All rights reserved.

Liberia, one of Africa’s oldest republics, has a population of 3 390 635 (July 2004 estimate) over an area of 111 370 sq km. It is located in western Africa, bordering the North Atlantic Ocean, between Ivory Coast and Sierra Leone, in a terrain that consists of mostly flat to rolling coastal plains, rising to rolling plateau and low mountains in the northeast. It shares land borders with Guinea (563 km), Ivory Coast (716 km), and Sierra Leone (306 km).

Although Liberia was founded by freed American slaves, 95% of the population is now composed of indigenous Africans of the ethnic groups Kpelle, Bassa, Gio, Kru, Grebo, Mano, Krahn, Gola, Gbandi, Loma, Kisi, Vai, Dei, Bella, Mandingo, and Mende. The descendants of slaves make up only the remaining 5% of the population, equally divided between the Americo-Liberians (descendants of American slaves) and the Congo People (descendants of Caribbean slaves).

The official language is English, although only 20% of the population speak it. Most Liberians speak one or more of the 29 African languages spoken in the country, a few of which are written and are used in correspondence. All languages of Liberia are Niger-Congo languages, and they belong to one of three groups within that family: Mande, Kru, or Mel (Table 1).

Literacy rates have shown great improvement in the last decade, with a total adult literacy of 57.5% in 2003 compared to 25% in 1989. The 2003 rate is 73.3% for men and 41.6% for women.

Table 1 Number of speakers according to language

Group   Language          Number of speakers (1991 est.)
Mel     Gola              99 300
Mel     Kisi              115 000
Mande   Bandi             70 000
Mande   Dan               200 000
Mande   Kpelle            487 400
Mande   Loma              141 800
Mande   Mano              185 000
Mande   Manya             45 400
Mande   Mende             1970
Mande   Vai               89 500
Kru     Bassa             347 600
Kru     Dewoin            8100
Kru     Gbii              5600
Kru     Glaro-Twabo       3900
Kru     Grebo dialects    193 200
Kru     Klao              184 000
Kru     Krahn dialects    47 800
Kru     Kuwaa             12 800
Kru     Sapo              31 600
Kru     Tajuasohn         9600

See also: Kru Languages; Mande Languages; Niger-Congo Languages.
Language Maps (Appendix 1): Map 18.



Libya: Language Situation
Editorial Team © 2006 Elsevier Ltd. All rights reserved.

The Great Socialist People’s Libyan Arab Jamahiriya (i.e., ‘state ruled by the masses’) is a country of 1.77 million square kilometers on the southern Mediterranean coast, in North Africa. More than 90% of the country is desert or semi-desert, and the majority of the population live along the 1750-kilometer-long coastal strip. Libya borders on Tunisia and Algeria in the west, Chad and Niger in the south, and Egypt and Sudan in the east.

The sole official language of Libya is Arabic, and this is also the first language of the large majority of the 5.5 million Libyans. Like many countries in the Arabic-speaking world, Libya is in a diglossic situation, in which Modern Standard Arabic (Arabic, Standard), which is acquired only as a second language, is used in education, in the media, for writing, and in some formal situations, while different Libyan dialects of Arabic are used for daily communication. In the south and in the west of the country, Berber-speaking communities are found, including 140 000 speakers of Nafusi in western Libya and 25 000 Tuaregs, speaking Hoggar and Ghat dialects of Tamahaq (Tamahaq, Tahaggart) (or Tamasheq). Ghadamès, Awjilah, and Sawknah are smaller Berber languages, and the last two may be extinct.

The language of education is Arabic, and literacy in Arabic has significantly increased during the last decade, from 68.1% in 1990 to 81.7% in 2002. However, there is an imbalance between literacy of females (70.7%) and males (91.8%). English is the most widely taught foreign language and is used as a language of instruction in some parts of tertiary education. Italian, the language of the former colonial power, while still used in business, plays no part in the wider public life, government, or the education system, partly as a result of a strong Arabization program following the revolution in 1969, during which legal decrees ensured that street signs, shop windows, signboards, traffic signs, etc., were written in Arabic.

The media in Libya use Arabic in print and broadcasting. An international edition of Al-Fajir al-Jadid is published fortnightly in English, and the external radio service Voice of Africa broadcasts in Arabic, English, and French.

See also: Arabic; Berber.
Language Maps (Appendix 1): Map 7.

Liechtenstein: Language Situation
Editorial Team © 2006 Elsevier Ltd. All rights reserved.

The tiny Principality of Liechtenstein is sandwiched between Austria to the east and Switzerland to the west. The official language of Liechtenstein is German, and it is in fact the only country in the world where German is the only official language. Liechtenstein is in a diglossic situation: while the official language is standard German – including the 1998 spelling reform to which Liechtenstein is party – the spoken language is an Alemannic dialect of German, similar to the German spoken in Switzerland. There is some dialectal variation, and in particular the dialect of Triesenberg stands apart as a distinct conservative variety of Walser Alemannic, whose speakers came to Liechtenstein around 1300 from the western Swiss canton of Wallis.

In addition to the local variety of German, a number of other languages are spoken by foreign residents, who comprise more than one-third of the total population (34% of 34 294 in 2003). The majority of foreign residents are from other German-speaking countries: Switzerland (10.8% of the total population), Austria (5.9%), and Germany (3.4%). Others come from Italy (3.3%), countries of the former Republic of Yugoslavia (3.3%), and Turkey (2.6%).

The language of education is German, while Latin, French, English, and Spanish are offered as foreign languages in schools. There are two German-language newspapers, the Liechtensteiner Vaterland and the Liechtensteiner Volksblatt, and a


radio provider, but most media, and all TV, come from outside Liechtenstein.

See also: German; Switzerland: Language Situation.
Language Maps (Appendix 1): Map 140.

Bibliography
Stricker H, Banzer T & Hilbe H (1999). Liechtensteiner Namenbuch (6 vols). Vaduz: Historischer Verein für das Fürstentum Liechtenstein.

Lillooet
See: Salishan Languages.

Limits of Language
G Priest, University of Melbourne, Melbourne, Victoria, Australia © 2006 Elsevier Ltd. All rights reserved.

One thing that language (to the sense of which notion we will return in a moment) obviously does well is express things. In particular, it can be used to express information, often of a very complex kind. It might therefore be wondered whether there are limits to this: is there any kind of information that cannot be expressed in language (Alston, 1956)?

It is not uncommon to hear it said that there is of course such information: one cannot express the taste of a peach, or the color red. But obviously one can express such things. The color red is – what else? – red. What is usually meant by inexpressibility claims, say, about redness, is that there is nothing that can be said in words that will conjure up the mental image of red for someone who has never experienced this before. Maybe so. But to identify meaning with such images, though a natural enough view, is hardly tenable. As the 20th-century Austrian philosopher Ludwig Wittgenstein pointed out in the Philosophical investigations (Wittgenstein, 1953), there are many words and phrases that conjure up no such images; and if a person is able to use a word correctly in the company of others, what mental images they are experiencing, if, indeed, any at all, is quite immaterial. Their words are meaningful, and so convey their information, in the usual way. (Private images are irrelevant to public meaning.)

It is sometimes claimed that representations such as pictures, maps, and diagrams can convey information that cannot be captured verbally. No doubt they can often express information more effectively, but it is hard to find examples of information expressible only in this way (especially once one has jettisoned the view that mental imagery has any intrinsic connection with meaning). And, in any case, it is natural enough to think of representations of this kind as languages. It seems profitless (to me, anyway) to dispute whether or not such things really are a language. They share with verbal language at least this: they are structured forms of representation that can be used to convey information of a kind that may never have been expressed before. If, therefore, we are looking for information that cannot be represented, we will have to look elsewhere.

It is presumably uncontentious that relative to most – maybe all – systems of representations there will be information that cannot be expressed. Thus, a medieval monk did not have the conceptual resources to speak about microchips and quantum fields. Such information can still, of course, be represented by some other system – maybe the old one augmented by the appropriate concepts. Similarly, relative to any abstract system of representations, there are likely to be things that can be represented, but that, because of their computational complexity, outstrip the resources available to a human brain, and so are inaccessible. These things could become accessible with the help of a different form of representation, however. (Thus, the multiplication of numbers using Roman numerals is computationally much harder than multiplication using Arabic numerals.)

The interesting question is whether there is information that is unrepresentable not just because of contingent constraints of this kind but essentially. If there is, it is unlikely that this will be demonstrable without appeal to some substantial metaphysical views. One such view concerns the nature of God. In certain kinds of Christian theology, God is taken to


be so different in kind from anything that humans can conceive of that no human concepts can be correctly applied to Him. Because all language deploys only such concepts, the true nature of God cannot be expressed in language (Alston, 1998).

The claim that the nature of God is ineffable is sometimes buttressed by other considerations, especially in the Neo-Platonist tradition. God, it is claimed, is the ground of all beings. That is, He is that which creates and sustains all beings. As such, He is not Himself a being: not a this rather than a that. His nature cannot, therefore, be communicated in words: to say anything about Him would be to say that He is a this, rather than a that, and so treat Him simply as another being. The thought that beings have a ground of this kind is not restricted to Christianity but seems to be a perennial one. It is found in Neo-Platonism (Christian and non-Christian), in which the One plays this role (O’Meara, 1993); it is found in Hinduism, in which Brahman plays this role; it is found in Taoism, in which the Tao plays this role (Lau, 1982); it is found in the writings of the 20th-century German philosopher Martin Heidegger, in which Being (Sein) plays this role (Krell, 1977).

A closely related, but different, view is that there is a fundamental or ultimate reality such that the reality that we perceive or conceive is obtained by the imposition of a conceptual grid thereupon. To say what it is like, in itself, is therefore impossible, as anything said about it will deploy our conceptual grid, which is simply a superposition. Again, the existence of such a reality seems a perennial thought. It is the role played by chora (χώρα) in Plato’s Timaeus; it is the role played by ultimate reality (emptiness, śūnyatā) in various branches of Mahayana Buddhism, especially Yogacara (Williams, 1998). Indeed, when Taoism and Indian Mahayana fused to give Chan (Zen), this theme merged with the previous one. Fundamental reality (Buddha nature) can be appreciated only via a direct experience. It is a simple ‘thusness’ or ‘suchness,’ beyond all words (Kasulis, 1989).

In some ways, the views of the 18th-century German philosopher Immanuel Kant, as expressed in his Critique of pure reason (Kemp Smith, 1923), are similar. For Kant, the empirical world is not independent of us but is partly constituted by our mental concepts. These include the forms of space and time. More particularly for present concerns, these include the logical categories that we apply when we make judgments (such as all/some, is/is not). These categories, moreover, depend for their applicability on temporal criteria. Reason forces us, however, to think (about) reality as it is in itself, independent of our mental constructions: ‘things in themselves’ (Dinge an sich). And because such things are outside time, there is no way that we can apply our categories to them, and so make judgments about them. Thus, although we are forced to recognize the existence of such a reality, there is nothing that can be said about it.

About a century and a half later, but for quite different reasons, Wittgenstein ended up in a similar situation when he wrote the Tractatus logico-philosophicus (Pears and McGuinness, 1961). For Wittgenstein, reality is constituted by certain states of affairs; these are composed of objects configured in certain ways. Language, on the other side of the fence, is constituted by propositions; these are composed of names configured in certain ways. A proposition represents a state of affairs if the names in it correspond to the objects in the state, and the configuration of names in the proposition is isomorphic to (has the same form as) the configuration of objects in the state. (This is the so-called picture theory. Wittgenstein is reputed to have been provoked into it by noting how the icons in a scale-representation work.) It is a consequence of this view that any situation that is not a configuration of objects cannot be expressed in a proposition. Indeed, attempts to do so will produce semantic nonsense. Such situations cannot, therefore, be described. An irony of this is that Wittgenstein’s theory itself requires him to talk, not just of objects but also of propositions, configurations, form; and for various reasons these cannot be objects. (For example, propositions can be asserted or denied; objects cannot. And the form of a state of affairs is not one of the objects in it: it is the way that those objects are structured together.) His own theory is therefore an attempt to do the impossible. This triggers the spectacular dénouement to the Tractatus, in which Wittgenstein pronounces his own theory to be nonsense.
The final considerations that I will mention that drive toward things inexpressible concern the infinite, as it is understood in modern logic and mathematics (Moore, 1985). According to this, there are different sizes of infinity. The smallest of these, countable infinity, is the size of the natural numbers (0, 1, 2, . . .). Because the totality of objects (or even of numbers) is larger than this, so will be the totality of facts about them. (For each object, for example, it is either finite or infinite.) But any language (at least of the kind that is humanly usable) can be shown to have only countably many sentences. There will therefore be many facts that cannot be expressed. Of course, for all that this shows, each of these facts could be expressed by some richer language; for example, one obtained by adding another name. But there is more to it than this. To say something


about an object, one has to refer to it; to do this, one has to be able to single it out in some way; and the totality of all objects is so rich that it will contain objects that are entirely indiscriminable from each other by our finite cognitive resources, and so that cannot be singled out. (Points in a continuum, for example, may be so close as to be indistinguishable by any cognitive mechanism.) There is much, therefore, that will be inexpressible. As we have now seen, there is a wide variety of metaphysical views that deliver the conclusion that there are things that are ineffable. Evaluating these views goes well beyond anything possible here. I will conclude with a brief discussion of a structural feature of (discussions of) the ineffable. As is probably clear, theories that claim that certain things are inexpressible have a tendency to say just such things. Thus, Christians say much about God, Buddhists say much about emptiness, Heidegger says much about Being, Kant says much about Dinge an sich, and Wittgenstein says much about the relation between language and reality. What is one to say about this? The only thing one can say is that these claims are either literally false or meaningless. The second move is made by Wittgenstein in the Tractatus. The first move is more common: all one can do is deny any claim made about the object in question. One finds this move in Christian negative theology (Braine, 1998) and some versions of Hinduism, for example, the Advaita Vedānta of Śankara (Sengaku Mayeda, 1992). Kant struggles with a version of this view when he distinguishes between a legitimate negative notion of Ding an sich and an illegitimate positive one. This cannot be the whole story, however, as each position does appear to endorse various claims about the ineffable. How are these to be understood? The most common move is to suggest that one has to understand such assertions as metaphorical, analogical, or in some other nonliteral way.
So understood, they can ‘point to’ the ineffable, although not express it. In Christian theology, this move is made by the 11th-century theologian St. Anselm; similar claims also can be found in the Zen tradition; and Heidegger uses the notion of writing under erasure (‘ ’) in an attempt to indicate that his words are not to be taken literally. There is something very unsatisfactory about this, though. One thing that each tradition gives is a set of reasons as to why the thing in question cannot be described: God is beyond categorization; the ground of being is not itself a being; ultimate reality has no features; categories cannot be applied to things outside time; propositions and form are not objects. If

one does not understand these claims as literally true, then the very ground for supposing the things in question to be inexpressible falls away. (If ‘Juliet is the sun’ is not to be taken literally, there is no reason to suppose that she is made of hydrogen and helium.) Indeed, at the very heart of the view that language has limits is a fundamental paradox (Priest, 1995). To claim that language has limits is to claim that there are things that cannot be talked about; but to say this is exactly to talk about them. The paradox manifests itself in a precise form in some of the paradoxes of self-reference in modern logic. There are many ordinal numbers that cannot be referred to. So there is a least such. But ‘the least number that cannot be referred to’ refers to that number – König’s paradox. For skeptics about sizes of infinity, there is even a finite version of this. There is only a finite number of names (i.e., proper names or definite descriptions) with less than (say) 100 letters (in English); there is therefore a finite number of (natural) numbers that can be referred to by names of this kind. So there will be numbers that cannot be referred to in this way. ‘The least number that cannot be referred to with less than 100 letters’ (which has less than 100 letters) refers to one of these – Berry’s paradox. Various responses to these paradoxes have been proposed in modern logic, but none that is either generally accepted or unproblematic. One reaction to the fundamental paradox is to reject the notion of the limits of language altogether: there is nothing that it is beyond the ability of language to express. But any theory according to which there are limits to language – including, it would seem, contemporary logic – would appear to be stuck with this contradiction.

See also: Language of Thought; Thought and Language: Philosophical Aspects.

Bibliography

Alston W P (1956). ‘Ineffability.’ Philosophical Review 65, 506–522.
Alston W P (1998). ‘Religious language.’ In Craig E (ed.) Routledge encyclopedia of philosophy, vol. 8. London: Routledge. 255–260.
Braine D (1998). ‘Negative theology.’ In Craig E (ed.) Routledge encyclopedia of philosophy, vol. 6. London: Routledge. 759–763.
Kasulis T P (1989). Zen action, zen person. Honolulu: University of Hawaii Press.
Kemp-Smith N (trans.) (1923). Immanuel Kant’s Critique of pure reason (2 edn.). London: Macmillan.
Krell D F (ed.) (1977). Martin Heidegger: basic writings. New York: Harper & Row.
Lau D C (trans.) (1982). Tao te ching. Hong Kong: Chinese University Press.
Moore A (1985). The infinite. London: Routledge.
O’Meara D J (1993). Plotinus: an introduction to the Enneads. Oxford: Clarendon Press.
Pears D F & McGuinness B F (trans.) (1961). Tractatus logico-philosophicus. London: Routledge and Kegan Paul.
Priest G (1995). Beyond the limits of thought. Cambridge: Cambridge University Press. 2nd edn., Oxford: Oxford University Press, 2002.
Sengaku Mayeda (trans.) (1992). A thousand teachings: the Upadesasahasri of Sankara. Albany: State University of New York Press.
Williams P (1998). ‘Buddhist Concept of Emptiness.’ In Craig E (ed.) Routledge encyclopedia of philosophy, vol. 2. London: Routledge. 76–80.
Wittgenstein L (1953). Philosophical investigations. Oxford: Basil Blackwell.

Linearity/Syntagmatics
G Lepschy, University College London, London, UK
© 2006 Elsevier Ltd. All rights reserved.

There is a group of concepts usually referred to in modern linguistics using the terms linearity and syntagmatics. These concepts are rooted in the beginning of grammatical reflection in classical antiquity, but they emerged explicitly as topics central to linguistic theory with the spread of Saussure’s ideas at the beginning of the 20th century. This article discusses linearity and syntagmatics under three headings: (1) in Saussure’s Course in general linguistics, (2) in syntactic theory, and (3) in the history of grammar.

Linearity and Syntagmatics in Saussure

In Saussure’s Cours de linguistique générale (Saussure, 1916/1972), there are three areas that refer explicitly to our topic: (1) the linear nature of the signifier, (2) syntagmatic relations, and (3) syntax. These are briefly examined here, giving quotations from Saussure’s Course in general linguistics, using the English translation by Wade Baskin (Saussure, 1959), which is more literal than the freer and more sophisticated, but also more eccentric, translation by Roy Harris (Saussure, 1983). Concerning the first question, the linear character of the signifier, the Course (Saussure, 1959: 70) states (Part 1: General principles; Chap. I: Nature of the linguistic sign; Sec. 3, Principle II: The linear nature of the signifier):

The signifier, being auditory, is unfolded solely in time from which it gets the following characteristics: (a) it represents a span, and (b) the span is measurable in a single dimension; it is a line.

The Course adds that, although this principle is obvious, “apparently linguists have always neglected to state it, doubtless because they found it too simple; nevertheless it is fundamental, and its consequences are incalculable.” The link between linearity and the time dimension was familiar to Saussure’s contemporaries and was repeatedly taken up later in the 20th century. One of the best-known discussions is offered by the valued philosopher of science Hans Reichenbach (1957, 109–110), who stressed the basic difference between time, which is one dimensional, and space, which is, at least at the level of our senses, three dimensional. It would, of course, be desirable to clarify the relation between the single temporal dimension of speech sounds succeeding one another and the single spatial dimension of symbols written one after the other along a line. On the other hand, if the spoken message is inevitably produced and perceived through time, the written message may be perceived as a simultaneous whole, a Gestalt in which different parts are linked by spatial and not temporal relationships.

Concerning syntagmatics, the Course states (Part 2: Synchronic linguistics; Chap. 5: Syntagmatic and associative relations; Sec. 2: Syntagmatic relations) that the notion syntagm applies “not only to words but also to groups of words, to complex units of all lengths and types (compounds, derivatives, phrases, whole sentences)” (Saussure, 1959: 124). And later:

An objection might be raised at this point. The sentence is the ideal type of syntagm. But it belongs to speaking [elle appartient à la parole], not to language [non à la langue] [. . .] Does it not follow that the syntagm belongs to speaking? I do not think so. Speaking is characterized by freedom of combinations; one must therefore ask whether or not all syntagms are equally free. (Saussure, 1959: 124).

This is followed by a discussion of expressions belonging to la langue rather than la parole, ending, however, with the caveat (Saussure, 1959: 124–125):

we must realize that in the syntagm there is no clear-cut boundary between the language fact [le fait de langue], which is a sign of collective usage, and the fact that belongs to speaking [le fait de parole] and depends on individual freedom.



Concerning syntax, the Course mentions (Part 2: Synchronic linguistics; Chap. 7: Grammar and its subdivisions; Sec. 1: Definitions: traditional divisions) “the interchange of simple words and phrases within the same language” (Saussure, 1959: 136), for example, considérer and prendre en considération or se venger de and tirer vengeance de, which can replace one another.

Functionally, therefore, the lexical and the syntactical may blend. There is basically no distinction between any word that is not a simple, irreducible unit and a phrase, which is a syntactical fact. The arrangement of the subunits of the word obeys the same fundamental principles as the arrangement of groups of words in phrases. (Saussure, 1959: 136–137)

Later, the Course states (Sec. 2: Rational divisions; Saussure, 1959: 136–137) that syntax (i.e., theory of word groupings, according to the most common definition) goes back to the theory of syntagms [rentre dans la syntagmatique], for the groupings always suppose at least two units distributed in space. Not every syntagmatic fact is classed as syntactical, but every syntactical fact belongs to the syntagmatic class [tous les faits de syntaxe appartiennent à la syntagmatique].

Earlier on (at the beginning of Chap. 5; Saussure, 1959: 123), the editors of the Course had stressed in a footnote: ‘‘It is scarcely necessary to point out that the study of syntagms is not to be confused with syntax. Syntax is only one part of the study of syntagms.’’ The paradoxical and contradictory way that these three aspects of Saussurean linguistics (linearity, syntagmatics, and syntax) are treated in the Course cannot fail to strike the observer. On the one hand, their importance is explicitly stressed; on the other, they are largely disregarded in the book. Even the relations among these three concepts are presented in a confusing and unclear way that has caused many interpreters of Saussure’s ideas to disagree about how they should be properly clarified or to reject some of these assumptions and proposals once their meaning has been established. (See in particular Jakobson’s 1939 comments in a lecture given in Copenhagen on the simultaneity of distinctive features, published in Jakobson, 1962: 304–305; Godel, 1957; Engler, 1974; Wunderli, 1981; Amacker, 1995; Lo Cascio, 1995.) The dichotomy between syntagmatic and associative relations has often caused misunderstandings. It is presented as an opposition between in praesentia vs. in absentia relations, that is, relations between items that are all present in the same syntagm (they are linked by syntagmatic relations) versus relations between items of which only one is present

and that is linked to other, absent items by associative (or paradigmatic, as they are generally called, following Hjelmslev, 1961) relations. This constitutes a paradigm, to use Saussure’s (1959: 126–127) term: Whereas a syntagm immediately suggests an order of succession and a fixed number of elements, terms in an associative family occur neither in fixed numbers nor in a definite order [. . .]. A particular word is like the center of a constellation; it is the point of convergence of an indefinite number of coordinated terms [. . .]. But of the two characteristics of the associative series – indeterminate order and indefinite number – only the first can always be verified; the second may fail to meet the test. This happens in the case of inflectional paradigms, which are typical of associative groupings. Latin dominus, domini, domino, etc., is obviously an associative group formed around a common element, the noun theme domin-, but the series is not indefinite as in the case of enseignement, changement, etc.; the number of cases is definite. Against this, the words have no fixed order of succession, and it is by a purely arbitrary act that the grammarian groups them in one way rather than another.

As a matter of fact, depending on the school tradition, the Latin paradigm is given as dominus, domini, and so on or as dominus, dominum, and so on.

Linearity and Syntax

One of the most striking features of linguistic theory during the last 50 years is the change from a Saussurean to a Chomskyan perspective. Syntax had a vital part in bringing about this change and, in turn, acquired a different complexion in this process. Chomskyan syntax clearly belongs to langue rather than parole and resides in the individual’s competence rather than being a social fact, to use Saussure’s notions. For Chomsky, it was part of I-language and not of E-language. The traditional view of the spoken chain, or of sentences consisting of strings of words following one another, is obviously misleading. A sentence, spoken or written, consists of constituents, variously called phrases, syntagms, or word groups. Since at least the 18th century, word groups (Graffi, 2001: 136), not individual words, were traditionally considered the building blocks of syntax, that is, the basic units entering into grammatical relations with other units or having certain functions, such as subject or object. A central insight in Chomskyan syntax (Chomsky, 1957) is that grammatical functions (or relations), such as subject and object, are defined, not in terms of meaning (thematic roles or theta roles, such as Agent, Patient, etc.), or of the communicative


dynamism of the message (the theme, or topic of which a person is speaking, and the rheme, or comment about it), but in configurational terms, on the basis of the position that a category symbol, such as Noun Phrase (NP), occupies within the tree representing the syntactic structure of the sentence: The subject is the NP that is directly dominated by the node Sentence (S), and the object is the NP that is directly dominated by the node Verb Phrase (VP). Trees are configurations studied by the branch of topology called graph theory, based on the work of the great Swiss mathematician Leonhard Euler (1707–1783). Traditionally, a sentence was seen as a string of symbols following one another in time or along a single line (hence the expressions linearity and linear word order). But trees, which were introduced into many different disciplines from the beginning of the 19th century, suggested a different image, with branches departing from nodes, and allowed a different graphic representation of syntactic structure, such as that of the phrase markers adopted by generative linguistics. A tree can be equivalently represented by a string in which category symbols are enclosed in brackets. For the sentence John hit the boy, the phrase marker of the tree diagram illustrating the structure of the sentence may appear as the following string:

[S [NP [N John ]N ]NP [VP [V hit ]V [NP [Det the ]Det [N boy ]N ]NP ]VP ]S

But this representation is only apparently linear. In fact, the labeled brackets (like the tree) convey the crucial information, not available from the linear word order, that John, because it is directly dominated by S, is the subject and that the boy, because it is directly dominated by VP, is the object. The sentence is more than a string of words. It has a syntactic structure. More information is conveyed by the tree (or the equivalent series of self-embedded brackets) than by the linear order of words following one another along one single dimension, temporal in spoken discourse and spatial in written transcription. (For the formal properties of grammars see Luce et al., 1963; for discussions of linearity in generative grammar, see Vincent, 1979; Corver and van Riemsdijk, 1994; Karimi, 2003.)
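The configurational definitions can be made concrete with a short sketch. It is only an illustration under stated assumptions, not part of the original article: the bracket notation is simplified so that only opening brackets carry a label, and the names Node, parse, words, and daughter are hypothetical helpers introduced here. The sketch reads the bracketed string into a tree, then picks out the subject as the NP directly dominated by S and the object as the NP directly dominated by VP.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                    # category symbol, e.g. "S" or "NP"
    children: list = field(default_factory=list)  # Node objects or word strings


def parse(text):
    """Read a well-formed string such as '[S [NP [N John]] ...]' into a Node tree."""
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        label = tokens[pos + 1]   # tokens[pos] is the opening "["
        pos += 2
        node = Node(label)
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                node.children.append(read_node())
            else:
                node.children.append(tokens[pos])
                pos += 1
        pos += 1                  # consume the closing "]"
        return node

    return read_node()


def words(node):
    """Return the words under a node in their linear (left-to-right) order."""
    result = []
    for child in node.children:
        result.extend(words(child) if isinstance(child, Node) else [child])
    return result


def daughter(parent, label):
    """Return the first node with this label that parent directly dominates."""
    for child in parent.children:
        if isinstance(child, Node) and child.label == label:
            return child
    return None


# Simplified phrase marker for "John hit the boy" (assumed notation).
PHRASE_MARKER = "[S [NP [N John]] [VP [V hit] [NP [Det the] [N boy]]]]"

tree = parse(PHRASE_MARKER)
subject = daughter(tree, "NP")                         # NP directly under S
direct_object = daughter(daughter(tree, "VP"), "NP")   # NP directly under VP

print(" ".join(words(tree)))           # the linear word order alone
print(" ".join(words(subject)))        # John
print(" ".join(words(direct_object)))  # the boy
```

The words function recovers nothing more than the linear order, whereas daughter relies on dominance, which is exactly the extra information that the brackets add to the string.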

Linearity in the History of Grammar

The contrast between the central importance of syntax in Chomskyan linguistics and its modest and peripheral position in a Saussurean perspective should not suggest that syntax is a young discipline with only a few decades of life. In fact, the opposite is true – and the attempts to identify the abstract, invisible

structure of grammatical relations or functions behind the superficial, visible layer of word order offer important insights into the history of grammatical notions. It has been suggested that syntax in the modern sense is a medieval creation (Robins, 1980). In Greek and Roman classical culture, subject, predicate, and object (hupokeimenon/subiectum, kategoroumenon/praedicatum, antikeimenon/obiectum) were notions belonging to logic. It was only in medieval times that the grammatical notions of suppositum and appositum for subject and predicate (i.e., verb, or verb + object) were introduced with Petrus Helias and the French schools of the 12th century (Black, 2001: 70–71, 92–93). As for the terms ‘subject’ and ‘predicate,’ and their equivalents in modern European languages, they have a complex and varied history that needs to be studied in detail. One of the major experts in medieval logic, L. M. de Rijk (1980: 230), noted that “word order was viewed as the rendez-vous of grammar and ontology” and pointed out (Rijk, 1967) the curious attempts made by medieval logicians to read referential distinctions into different word orders, in particular the position of a noun before (a parte ante) or after the verb (a parte post), so that crudas carnes comedi ‘raw meat I ate’ is true even if the meat was well cooked when I ate it because the construction a parte ante suggests that it was raw earlier, when I bought it. This does not apply to comedi crudas carnes ‘I ate raw meat’ because the construction a parte post requires, for the statement to be true, that the meat was still raw when I ate it (Buridanus, 1977: 59, 67–68; 2001: 877, 886–887). This is not, of course, the way in which Latin grammar works.
But even the humanists who contemptuously criticized the barbaric Latin of medieval logicians like Buridan acknowledged that different word orders could correspond to different meanings, for instance in bibas priusquam edas ‘drink before you eat’ versus edas priusquam bibas ‘eat before you drink.’ It is worth remembering that, unexpectedly for modern readers, classical and Renaissance grammars of Greek and Latin found it difficult to explain accusative and infinitive structures, in which both the subject and the object of an infinitive surface as accusatives. The most important ancient syntactician, the Alexandrian grammarian Apollonius Dyscolus (1981: 185–186), stressed the ambiguity in the Iliad (Book 5: 118) of dos de t’em’ andra helein, which could mean ‘grant that I may kill the man’ or ‘grant that the man may kill me.’ The famous Renaissance grammarian Thomas Linacre (1524: Fo.51r) deplored the difficulty of constructions such as Chremetem percussisse Demeam, in which the subject and the object are


both in the accusative and we have to decide whether Chremes hit Demea or vice versa. Linacre (1524: Fo.78v, Fo. LXXVIIr) also contrasted volo discere ‘I want to learn’ and volo te discere ‘I want you to learn’ and explained that a Hellenism is introduced when we read of Catullus’s phaselus ‘vessel’ that it ait fuisse navium celerrimus ‘says it was the fastest of ships,’ with a nominative instead of an accusative. Similarly Sanctius (Francisco Sánchez) in his Minerva (Sanctius, 1587: 134v) alleged that cupio esse dives ‘I want to be rich,’ with dives in the nominative, is the Greek-style expression, whereas (according to him) the Latin expression should be cupio esse divitem. Linacre (1524: Fo. 50r) distinguished two kinds (genera) of construction, the iustum ‘normal’ and the figuratum ‘figurative.’ They differ mainly in word order. And it is striking that some of the most insightful discussions of syntax in the Italian Renaissance are offered by treatises not of grammar but of rhetoric (Lepschy, 2004). This contrast between normal and figurative constructions finds its place within the complex history, which still waits to be properly illustrated, of the distinction between natural and artificial order (ordo naturalis and ordo artificialis) (Scaglione, 1972; Weijers, 1994: 59–82; Black, 2001: 281; Rizzo, 2002: 198). In the tradition of Latin school teaching, we meet the expression ordo constructionis, ordo verborum, or just ordo. This term, now largely forgotten, has its own entry in the Oxford English Dictionary: “In old Latin School-books (ordo verborum). The arrangement of words required in translating into English.” The ordo changed the Latin word order (ordo artificialis), preserving the original Latin forms but rearranging them according to English syntax (ordo naturalis), and, in doing so, provided the reader with the (syntactic, if not lexical) meaning of the passage.
This process throws light, from an unexpected angle, on the distinction, mentioned at the beginning, between linear word order and syntactic structure.

See also: Chomsky, Noam (b. 1928); Configurationality; Grammar, Early Medieval; Graph Theory; Paradigm versus Syntagm; Peter Helias (12th Century A.D.); Relations and Functions; Rhetoric: History; Sanctius, Franciscus (1523–1600); Saussure, Ferdinand (-Mongin) de (1857–1913); Saussurean Tradition in 20th-Century Linguistics; Syntactic Constructions; Word Order and Linearization.

Bibliography

Amacker R (1995). ‘Y a-t-il une syntaxe saussurienne?’ In De Mauro & Sugeta (eds.). 67–88.
Apollonius Dyscolus (1981). The syntax of Apollonius Dyscolus. Householder F W (trans. and ed.). Amsterdam: John Benjamins.
Black R (2001). Humanism and education in medieval and Renaissance Italy: tradition and innovation in Latin schools from the twelfth to the fifteenth century. Cambridge, UK: Cambridge University Press.
Buridanus J (1977). Sophismata. Stuttgart: Frommann-Holzboog.
Buridanus J (2001). Summulae de dialectica. New Haven/London: Yale University Press.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Corver N & van Riemsdijk H (eds.) (1994). Studies on scrambling: movement and non-movement approaches to free word-order phenomena. Berlin: Mouton de Gruyter.
De Mauro T & Sugeta S (eds.) (1995). Saussure and linguistics today. Tokyo: Waseda University/Rome: Bulzoni.
Engler R (1974). ‘La linéarité du signifiant.’ In Amacker R, De Mauro T & Prieto L J (eds.) Studi Saussuriani per Robert Godel. Bologna, Italy: il Mulino. 111–120.
Godel R (1957). Les sources manuscrites du Cours de linguistique générale de F. de Saussure. Geneva/Paris: Droz/Minard.
Graffi G (2001). 200 years of syntax: a critical survey. Amsterdam: John Benjamins.
Hjelmslev L (1961). Prolegomena to a theory of language (2nd edn.). Madison, WI: University of Wisconsin Press.
Jakobson R (1962). Selected writings 1: Phonological studies. The Hague: Mouton.
Karimi S (ed.) (2003). Word order and scrambling. Oxford: Blackwell/Malden.
Lepschy G (2004). ‘L’ordine delle parole.’ In Milani C & Finazzi R B (eds.) Per una storia della grammatica in Europa. Atti del convegno 11–12 settembre 2003. Milan: Università Cattolica. 13–33.
Linacre T (1524). De emendata structura latini sermonis libri sex. London: Pynson.
Lo Cascio V (1995). ‘Linearity and null categories.’ In De Mauro & Sugeta (eds.). 267–295.
Luce R D, Bush R R & Galanter E (eds.) (1963). Handbook of mathematical psychology (vol. 2). New York: John Wiley & Sons.
Reichenbach H (1957). The philosophy of space and time. New York: Dover.
Rijk L M de (1967). Logica modernorum: a contribution to the history of modern terminist logic (vol. 2). Assen: Van Gorcum & Prakke.
Rijk L M de (1980). ‘Each man’s ass is not everybody’s ass: on an important item in 13th-century semantics.’ Historiographia Linguistica 7(1–2), 221–230. (Reprinted in Rijk L M de (1989). Through language to reality: studies in medieval semantics and metaphysics. Bos E P (ed.). Northampton, UK: Variorum Reprints. Chap. 8.)
Rizzo S (2002). Ricerche sul latino umanistico (vol. 1). Rome: Edizioni di storia e letteratura.
Robins R H (1980). ‘Functional syntax in medieval Europe.’ Historiographia Linguistica 7(1–2), 231–240.
Sanctius F (1587). Minerva, seu de causis linguae latinae. Salamanca, Spain: I. & A. Renart.
Saussure F de (1916/1972). De Mauro T (ed.) Cours de linguistique générale. Paris: Payot.
Saussure F de (1959). Course in general linguistics. Baskin W (trans.). New York: Philosophical Library.
Saussure F de (1967–1974). Engler R (ed.) Cours de linguistique générale (4 vols). Wiesbaden, Germany: Harrassowitz.
Saussure F de (1983). Course in general linguistics. Harris R (trans. and ed.). London: Duckworth.
Scaglione A (1972). The classical theory of composition from its origins to the present: a historical survey. Chapel Hill, NC: University of North Carolina Press.
Vincent N (1979). ‘Word order and grammatical theory.’ In Meisel J M & Pam M D (eds.) Linear order and generative theory. Amsterdam: John Benjamins. 1–22.
Weijers O (ed.) (1994). Etudes sur le vocabulaire intellectuel du Moyen Age 8: Vocabulary of teaching and research between Middle Ages and Renaissance. Turnhout, Belgium: Brepols.
Wunderli P (1981). Saussure-Studien: exegetische und wissenschaftsgeschichtliche Untersuchungen zum Werk von F. de Saussure. Tübingen: Narr.

Lingua Francas as Second Languages
C Meierkord, University of Erfurt, Erfurt, Germany
© 2006 Elsevier Ltd. All rights reserved.

In 1953, UNESCO defined a lingua franca as “a language which is used habitually by people whose mother tongues are different in order to facilitate communication between them.” Although a number of attempts have been made to introduce an engineered lingua franca, such as Esperanto, to facilitate worldwide communication, the languages used as lingua francas today are largely natural languages. Some of these languages are, or were, used worldwide, notably English. Others, such as Arabic, French, Spanish, and Russian, are employed for international communication in a particular geographic area. Thus, Arabic serves as a lingua franca in the states of the Maghreb and the Middle East, Spanish is the major lingua franca of Central and South America, and Russian linguistically united the Soviet Union. In addition, there are languages that are used as intranational lingua francas only. Afrikaans, for instance, is used in large parts of South Africa and Namibia to allow for communication among speakers who do not share another language. Similarly, Modern Chinese is employed for communication across the individual speech communities in the People’s Republic of China. Numerous other languages also serve groups of different sizes as lingua francas (Languages of Wider Communication).

All of the lingua francas mentioned above have native speaker speech communities. However, they are second languages for the majority of their users, and this has implications both for the individual who uses more than one language and for those societies in which several languages coexist. This article considers both the multilingual societies in which lingua francas are used and the bilingual individual using the language. The first part covers aspects such as multilingualism, diglossia, and language shift, and their effects on education and language policy. In the second part, the use of a lingua franca is discussed as a case of language contact, which involves transfer, borrowing, code-switching, nativization processes, and similarities with learner languages. The later sections are devoted to a description of lingua franca communication at the international level and a review of approaches toward lingua francas from within the field of foreign language teaching. The article closes with a brief look into current research trends.

Lingua Francas and Multilingual Societies

Lingua francas are used for different purposes. Communication among speakers of mutually unintelligible languages occurs across countries as well as within them. This section examines lingua francas used for intranational, intraregional, and international communication. It draws on data available for Afrikaans, Standard Arabic, Mandarin Chinese, English, French, Russian, and Spanish. Table 1 presents the figures that have been estimated for the numbers of second language users of these languages.

Table 1 Second language speakers of individual lingua francas

Language used as lingua franca    Number of L2 speakers in millions
Afrikaans                         appr. 3.7
Arabic, Standard                  no estimates available
Chinese, Mandarin                 278
English                           167
French                            51
Russian                           110
Spanish                           59

All figures are taken from the Ethnologue at www.ethnologue.org.

When a language has come to be used as a lingua franca, it is frequently assigned the status of a national or official language of individual countries, as documented in Table 2. More detailed descriptions of the linguistic situation and the status of the lingua francas in these countries are available from the Ethnologue (Ethnologue) or from Baker and Prys Jones (1998). Table 1 and Table 2 indicate that lingua francas are used in different contexts: several languages, such as Arabic, English, French, and Spanish, have achieved a special status within a large number of individual countries. Others, such as Afrikaans, Chinese, and Russian, have high numbers of second language users but are used in few countries. The following subsections provide an overview of the diverse scenarios in which a lingua franca may be encountered, followed by a discussion of the implications its use has at the societal level.

Scenarios of Lingua Franca Usage

Spanish in the Americas and French in Africa

As a result of colonization, both Spanish and French are used on a worldwide scale, but they mainly serve restricted geographical areas. Spanish as a lingua franca is mainly used in the Americas, from California to Tierra del Fuego. Here, Spanish is spoken by a total of approximately 330 million people (cf. Noll, 2001). Most of the second language speakers of Spanish in this area are descendants of Native Americans. However, Native American languages, such as Arawak, Carib, Nahuatl, Maya, Chibcha, Quechua, Aymara, Mapuche, and Guaraní, continue to play a prominent role only in Mexico, Guatemala, the Andes, and Paraguay. In other countries, the majority of their
inhabitants have shifted to Spanish as their first language. Cuba and the Dominican Republic have predominantly Black populations. Here, Spanish speakers are mainly descendants of the formerly enslaved African peoples and the European colonizers.

The francophone world covers large parts of Europe, America, and Africa. French has the status of a lingua franca in many sub-Saharan countries, such as the Ivory Coast, in countries in the Indian Ocean, in the states of the Maghreb, and in Andorra, Luxembourg, the Aosta Valley, Vanuatu, Lebanon, etc. (cf. Kleineidam, 1992). More than half of all francophones live in Africa. Unlike Spanish in Latin America, however, French is not spoken by the vast majority of these countries’ inhabitants. In fact, French is often accessible only to a middle-class elite. However, Kleineidam (1992) documents that the numbers of second language speakers of French in Africa rose steadily between 1980 and 2000: there was an increase of 267% in sub-Saharan Africa and of 160% in the Maghreb states.

Arabic is used inside as well as across more than twenty countries, mainly located in the Middle East and northern Africa. In these countries, Modern Standard Arabic fulfills the function of an official language, as the medium of education and administration, and for all written communication. In addition, all Muslim states are unified through the literary and written standard provided by Classical Arabic, the language of the Islamic religion and of the Qur’ān (cf. Youssi, 1995).

Table 2 Lingua francas enjoying a special status
Language (L): Afrikaans; Arabic, Standard; Chinese, Mandarin; English; French; German; Russian; Spanish
Columns: Number of countries in which L is spoken; Number of countries in which L is a national language; Number of countries in which L is an official language; Number of countries in which L enjoys a special status
Cell values, in source order: 1 2; 10 25; – 19; 4 2; 16; 2; 5; 105 54; 48 7; 28 24; 10 4; 31 44; – 19; 1 1; 2 4

Intranational Lingua Francas: Afrikaans, Russian, and Chinese

Whenever the political frontiers of a nation were drawn regardless of ethnolinguistic realities, the country is frequently inhabited by speakers of diverse languages. Such multilingual states often decide on one particular language for intranational communication across their different speech communities. This is – or was – the case, for example, in China, the former Soviet Union, and South Africa.

In China, which officially recognizes 56 nationalities and 125 minority languages (cf. e.g., Sun, 2001 cited in Poa and LaPolla, forthcoming), Mandarin Chinese – also called Modern Standard Chinese or pǔtōnghuà – is the sole official language. It was formally defined and standardized after World War II and became the medium of instruction in schools, the working language for government and administration, and the language of the media (cf. Chen, 1999: 27). Stern measures to implement Modern Standard Chinese were never adopted in mainland China. Rather, migration of people across the country has been encouraged, if not enforced, by the government. As a result, bilingualism in Mandarin Chinese and a local variety is the norm in large parts of China today. In 1984, 50% of the population had a speaking proficiency in Modern Standard Chinese, and 90% had a comprehension proficiency in the language.

Strategies were similar in the former Soviet Union, which contained peoples speaking more than 130 different languages, but where the government never installed Russian as the national language. Instead, Russian was introduced as a compulsory school subject and promoted as a key to being a full Soviet citizen, based on the “Marxist principle that at some time in the future all ethnic groups will fuse into one” (Comrie, 1981: 37). Social planners supported migration of Russians into remote regions of the state, producing multilingual communities and mixed marriages, mostly with Russian as the dominant language. In 1979, a total of 61.2 million speakers of Russian as an additional language lived in the Soviet states, and the federation was characterized by frequent instances of bilingualism, such as Estonian-Russian, Armenian-Russian, etc. (cf. Haarmann, 1985).

A language that has been vigorously enforced as an intranational lingua franca is Afrikaans.
Afrikaans was designated as the national language when the Afrikaner Nationalist Party came into power in 1948. Partly in response to the aggressive anglicization policy formerly pursued by the British, Afrikaans was promoted and made the compulsory language of instruction in all Black schools in 1957. Following the abolition of apartheid, Afrikaans is now one of eleven official languages, but because of this history it is still a major lingua franca in South Africa and parts of Namibia. The language has been a lingua franca ever since it developed in a sustained situation of language contact between the indigenous Khoekhoe, Dutch settlers, and slaves (cf. McCormick, 2002). It is spoken widely in large parts of South Africa, both as a first and as a second language, by speakers from a variety of ethnicities.

Global Lingua Francas: English

Among the different lingua francas which exist around the world, English is the language which has gained the status of a global language. It is used for worldwide communication between nations and individuals who speak mutually unintelligible languages, but it also serves as an intranational lingua franca in a large number of countries which had previously been colonized by the British (cf. Table 2). The dominance of English at the international level has sometimes been discussed as linguistic imperialism. Phillipson and Skutnabb-Kangas (2001: 573) argue that “a Western-dominated globalization agenda is being implemented by the transnational corporations, the World Bank, the IMF, and the World Trade Organization.” The status of English is strengthened through these global alliances but also through organizations that operate at a regional level. Moreover, increased global mobility, migration, and modern media help the spread of English today. As a result, English has gained enormous prestige, and mastering the language is associated with access to the global market. Crystal (1997) estimates that approximately 1000 million people speak English as a second language today, albeit at different levels of competence.

Societal Effects of Lingua Franca Usage

Languages used as intranational lingua francas frequently have the status of a national or official language and are used for governmental and administrative affairs. Raising the status of one of several languages spoken in a country eventually results in the language being more prestigious and associated with professional success and upward social mobility. These often diglossic situations have an impact on education and may eventually cause language shift.

Diglossia

Frequently, the two or more languages used within one nation coexist in a diglossic situation (Fishman, 1980): one language is used for ‘high’ functions, i.e., in formal interactions such as official discourse, in church and schools, and especially in written communication. The other language is used for so-called ‘low’ functions: informal, mundane, mainly oral interactions. As a result, the high language is often held to be more logical and aesthetic and is associated with literary heritage; it is also the one which is acquired through formal education. For example, Modern Standard Arabic is used in most formal contexts in the Maghreb states, whereas a colloquial variety of Arabic or, in Morocco for example, Berber is used in informal, oral situations. In Paraguay, where Spanish and Guaraní are the two official languages, Spanish performs the high functions.


Educational Issues

The prestige of a lingua franca performing high functions in a country may also influence language policy, and there may be parental or societal demand that the lingua franca be used as the medium of instruction in schools or taught at the expense of other languages. Such is the case, for example, in South Africa, where English is now increasingly used as a medium of instruction in schools where neither the pupils nor the teachers speak English as their mother tongue. However, designating a lingua franca as the sole or major medium of instruction may have severe consequences for pupils’ social, cognitive, and academic development. Desai (2003) documents that Xhosa children who receive their primary school education through English not only fail to acquire English effectively but also do not develop proficiency, especially in reading comprehension, in their mother tongue. Also, using the lingua franca may exclude those sections of the population who do not have access to it from socioeconomic advancement. In some cases, the language used as lingua franca may be completely inaccessible to certain groups of society, yielding a bilingual or multilingual elite. For example, several rural areas of China are still without electricity, and access to schools is also limited in some cases, so that citizens have no opportunity to acquire pǔtōnghuà through the media or in school. Even if the lingua franca is not used intranationally, as with English in Germany, individuals may prefer to study this language rather than another foreign language if it is more prestigious and required for the more attractive professions. In fact, Europe today witnesses an enormous increase in English language learning, often to the detriment of multilingualism.

Language Shift

When proficiency in the lingua franca is a prerequisite to professional and social achievement, language shift is also likely to occur, i.e., bilingual communities will tend to give up their original mother tongue and gradually, usually over three generations, acquire a new language. In South Africa, Indians gradually shifted to English (cf. Mesthrie, 1992), and at present, Black parents increasingly raise their children in English instead of Afrikaans, because English seems to guarantee upward social mobility (cf. McCormick, 2002). Haarmann (1992: 111) points out that following a phase of unbalanced bilingualism with Russian as the dominant language, “by 1979 about 16.3 million non-Russians (such as Ukrainians, Latvians, Georgians) had shifted to Russian as their first language.” And in China, the Qiang have been undergoing language shift to the dominant pǔtōnghuà due to a lack of literature, media, and education in Qiang and the rapid increase in Chinese-Qiang bilingualism. The process of language shift is, however, not always completed. The Black community in Cape Town, South Africa, on which McCormick (2002) reports, uses English in education and at the workplace. However, the individuals in this community still enact their identities in a mixed Afrikaans-English code.

Lingua Francas and the Multilingual Individual

The wide range of social contexts in which lingua francas are used is reflected in the heterogeneity of the forms which languages assume when they are used as lingua francas. Depending on the presence and status of the lingua franca in a particular country, speakers acquire and use the language differently. Speakers living in nations that utilize a lingua franca for intranational purposes frequently acquire a nativized variety of the lingua franca. But in countries where the lingua franca does not enjoy a particular official status and where it is mainly acquired for communication with interlocutors from abroad, the model has usually been either British English (in Europe) or the American Standard variety (e.g., in Japan). Interestingly, Korea has adopted a strategy of teaching a variety called Codified Korean English. Furthermore, the environment in which a lingua franca is acquired varies considerably. Lingua francas are usually taught as second languages in institutional settings. But they are also present in everyday life: in official documents, on television and on the radio, in advertisements, etc. These different options for contact with the lingua franca yield diverging opportunities for either instructed or informal acquisition of the language.

Formal and Natural Acquisition of Lingua Francas

In many cases, a lingua franca is learned in an institutionalized context. This is the case whenever the language is not used intranationally. For example, English is a school subject in Germany, Italy, Turkey, Japan, and many other countries. If a language is used as an intranational lingua franca, it is often still acquired in an institutionalized context. Thus, mother tongue speakers of Xhosa or Zulu in South Africa learn English from primary school onward. At the same time, however, the lingua franca is usually used in several public domains, especially in the media, and speakers may simultaneously acquire it naturally so that formal instruction and natural acquisition supplement each other. This is the case, for example, in China, where Modern Standard Chinese is not only taught and used as the medium of instruction in schools, but is also the language of access to modern media. Radio and television expose even very young children to Modern Standard Chinese, which thus becomes part of their local context.

But despite the availability of formal instruction in the languages used as lingua francas, for a large number of their speakers, especially for those who do not have access to formal education, acquisition through uninstructed channels is frequently the sole option. They pick up the language in everyday interaction with either native or nonnative speakers, and learning results from direct participation and observation. Although such informal settings potentially create rich language learning environments with a vast array of communicative events, they also leave learners with the task of constructing the correct rules from the input they receive. Acquiring the lingua franca in contexts such as the workplace, and without additional instructed learning, may produce errors that remain uncorrected so that nontarget forms eventually become fixed and fossilize. As a result, the speakers achieve a variety of levels of competence, and large numbers will only be able to master a basilectal form of the lingua franca. This is illustrated by the following sequence, produced by a South African Xhosa speaker who acquired English mainly through interaction with other nonnative speakers.

    my child is in Johannesburg but he’s working sometimes wor – not working. I don’t know\ another one’s three, three years here. and then uh, lady’s has got as a house, in Khayelitsha. working in factory to sewing, everything. oh sorry. and another, the lastborn, my lastborn, is in working, he’s passed his standard ten last of last year. but he’s working now. it’s better but eh. all the children is got a problem. because it is not help her parents. you know/

A similar situation characterizes parts of Latin America. Lipski (1994: 143) points out that in Latin America “radio is often the only means of communication for vast rural areas, and for large segments of the population who suffer partial or total illiteracy.” Radio stations in these areas provide a medium for adult education and Spanish language classes for the indigenous communities. However, the model that speakers encounter on the radio is heterogeneous in that most broadcasters do not receive training due to a lack of financial resources.

Nativized and Interlanguage Forms of Lingua Francas

When the lingua franca is regularly used for intranational communication, it frequently develops into a nativized form. Transfers from the speakers’ mother tongues, which occur at the different linguistic levels, eventually stabilize, and the language is adapted to a new sociocultural context, a process which affects both cultural and formal dimensions. This is most apparent in the lexicon of these varieties. For example, speakers of English in Nigeria have coined the expression head-tie to denote a piece of cloth worn around the head by the women in the country. For English, the individual indigenized varieties spoken throughout the world have been thoroughly documented in a vast number of publications, e.g., by McArthur (2002) and especially in the papers published from within the International Corpus of English (ICE) project. With regard to Spanish, sociolinguistic research on Latin America has established that “Latin American Spanish exhibits numerous supraregional characteristics, and a well-defined if unofficial prestige norm valid across two continents” (Lipski, 1994: 149). Instead of being an imitation of Castilian, the Latin American prestige form “is a set of common denominators in which regionally or ethnically marked items do not appear” (Lipski, 1994: 149).

If learning the lingua franca takes place in the classroom context only, speakers’ productions frequently display structures and strategies that differ from both the mother tongue and the second language. These interlanguages are approximative, transitional systems, which reflect the developmental stage of the individual learner. They are both unstable and variable, and they are furthermore characterized by communication strategies, employed to compensate for deficits in the second language. For example, paraphrases are used to express lexical concepts for which the interlanguage does not yet contain a specific lexical item. Also, pauses occur frequently between and also within turns, possibly because learners pause to solve production problems.
But long pauses that occur between turns may also result from the speakers’ reliance on pauses as turn-taking signals, which implies that they do not sufficiently recognize and produce other turn-taking signals. Learners have also been shown to display a low variation in ritual speech acts, which seems to be a further classroom- or textbook-induced characteristic. Pöll (1998: 108) documents that speakers of French as a lingua franca in sub-Saharan Africa frequently produce utterances in which individual structures of the French language have been simplified (e.g., dans is used as the sole locative preposition in Senegal). Also, concord rules may be violated, as is the case in les filles ne fait plus ça (observed in Cameroon).

Linguistic Processes in Lingua Franca Productions

Individuals using a language as a lingua franca by definition command more than one language. As a result of such language contact, the speech of bilinguals using interlanguage varieties as well as nativized forms of individual languages is commonly characterized by borrowing, transfer, code switching, and code mixing.

Borrowing and Transfer

In present-day Bolivian Spanish, contact with the diverse Native American languages is reflected in the form of lexical borrowings from the indigenous languages Aymara, Chiquitana, Guaraní, and Quechua. For example, apallar ‘to harvest’ (cf. Lipski, 1994: 194) and ¡cosa!, an exclamation meaning ‘very good, excellent,’ have both been copied into Bolivian Spanish. In the diglossic context of Paraguay, borrowing from Guaraní is particularly noticeable. Items for flora, fauna, food, etc., are often of Guaraní origin. For example, mitaí is used for ‘child’ instead of niño/niña. Also, the lingua francas display evidence of transfer in that structures of the indigenous languages influence the speakers’ productions in the lingua franca. In Peru, Spanish reflects transfer from Quechua in that the clitic pronoun system has been morphologically simplified, and Quechua word order and tenses have been retained in Spanish (cf. Klee, 1996). Transfer does not, however, radically change the syntactic structure of Spanish. Rather, “speakers of Andean Spanish have found a way to maintain the word order patterns and evidential system of Quechua in a manner that is compatible with the pragmatic use and structure of Spanish” (Klee, 1996: 89).

Code Switching and Code Mixing

Lingua franca speakers have also been found to mix the indigenous language and the lingua franca into a hybrid consisting of words and phrases of both languages (code mixing). The intrasentential use of two languages seems to be a universally attested phenomenon in the productions of second language speakers in multilingual settings.

    ka fa izy edy izany/ça semble lourd quand c’est en malgache/vous voyez (Babault, 2001: 137; Malagasy in italics).

    Estamos haciendo pruebas, y tlamis tikisaske recreo iwa tlamis tikalakiske oksepa, tlamis de nuevo vamos a estudiar (MacSwan, 1999: 136; Nahuatl in italics).

But code switching may also occur intersententially, i.e., speakers also change codes between utterances within one speech event. Often, especially in diglossic situations, this is done for stylistic purposes, or because of changes in the formality of the speech event. The following excerpt taken from McCormick (2002: 168) demonstrates a switch from English to Afrikaans, which does not usually serve formal functions in the community she studied. When speaker 1 becomes uncertain about the date of the next meeting, speakers switch into Afrikaans for a more informal discussion. The conversation changes back to English when the formal reading of the minutes is resumed (Afrikaans in italics).

    Speaker 1: The matter could only be entertained when our next AGM elections would be held the chairman closed the meeting and told the members that on Wednesday ninth
    Speaker 2: seventeen
    Speaker 3: sixteenth
    Speaker 4: laas week was die sestien en die week tevore was die laaste meeting
    Speaker 5: nee man hy praat van die sestiende sixteenth May
    Speaker 1: (resumes reading the minutes) The chairman closed the meeting and told the members that on Wednesday sixteenth May nineteen eighty four there would be no meeting

McCormick also documents that code switching and code mixing may evolve into a mixed language, which can develop into a marker of identity.

International Lingua Franca Interactions

At the international level, interaction in a lingua franca often implies communication between speakers of different varieties of the lingua franca: indigenized forms and nonnative or learner varieties potentially meet, and as a result interactions in a lingua franca are highly heterogeneous. Lingua franca communication (LFC) implies interaction between various linguistic and cultural systems. Consequently, LFC has been approached from different perspectives: some authors have discussed LFC as a particular type of intercultural communication, whereas others have placed their emphasis on aspects related to interlanguage communication (cf. Meierkord and Knapp, 2002 for a more comprehensive overview).

As a form of intercultural communication, LFC has been conceived as interaction between participants from different cultures or discourse systems. Based on the assumption that speakers in LFC would not share enough signs and conventionalized representations to allow them to interpret each other’s utterances successfully, LFC has often been approached as being potentially problematic. Participants in LFC have been assumed to misinterpret silences, tone of voice, expressions for speech act functions, etc. Several scholars have assumed that interferences from the individual mother tongue will result in more frequent and also more complex problems if speakers need to resort to a language that neither of them speaks as a mother tongue. Such interferences may occur “in the types of communicative events that learners expect to occur in a given situation, the manner of their participation in them, the specific types of acts they perform and the ways they realize them, the ways topics are nominated and developed, and the way discourse is regulated” (Ellis, 1994: 187). In fact, early studies carried out in the learner languages paradigm, such as those by Schwarz (1980) and Varonis and Gass (1985), which investigated the negotiation of meaning between non-native speakers of English with different linguistic backgrounds, discovered such problematic issues.

A number of studies describe the formal properties which English assumes in lingua franca communication. The vast bulk of these studies concentrates on the discourse level. Firth (1996) challenges the above view and argues that participants in lingua franca interactions strive to make interaction ‘normal’ in the sense that misunderstandings tend to be negotiated following a ‘let it pass’ attitude. Speakers seem to accept that understanding may be impaired to a certain extent, and they allow for a certain amount of opacity to make the interaction less vulnerable. However, others have documented frequent instances of misunderstanding in the data they elicited (cf. the papers in Knapp and Meierkord, 2002) and have even held that mutual understanding in LFC is a myth (House, 1999). Descriptive studies investigating the other linguistic levels are still scarce.
Meierkord (2004) presents analyses of the syntax in interactions across international Englishes, which reveal an overwhelming similarity of the lingua franca productions with those commonly found in native speaker varieties of English. Particular features, which have been documented for New Englishes or learner varieties, also occur but are infrequent. On the whole, the interactions are characterized by processes of leveling and regularization.

Lingua Francas and Second Language Teaching
The status which English enjoys as a global lingua franca and the variability of the forms which English today assumes have also concerned scholars in the field of English language teaching. The issue had

been at the center of discussions in the 1980s, and it has recently gained renewed attention. In the early papers addressing the need for a reorientation of English language teaching, authors suggested that most learners of English would employ the language mainly for communication with other nonnative speakers. On this assumption, they argued that basing foreign language teaching on a native-speaker model was doubtful and problematic, because it did not sufficiently prepare learners for using the foreign language as a lingua franca. Hüllen (1982: 87) concludes that ‘‘[. . .] the linguistic norms – grammatical, lexical, but also those of pronunciation – will have to change. It is also certain that new means of linguistic politeness and agreed-upon roles in communication will arise.’’ Like him, other authors concentrated on the sociopragmatics of English for international communication and on the negotiation of meaning, both based on the assumption that cross-cultural differences in these areas are likely to cause problematic interaction. A number of authors have also attempted to model a form of English that would be easier to learn than the ones based on the native speaker varieties, yet be communicatively adequate. Ogden’s (1933) Basic English is one of the earliest proposals for such a variety. It combines a restricted set of grammatical rules with a lexicon of approximately 850 words. Quirk (1985) proposed Nuclear English, a form of English characterized by, among other things, a reduction of homosemy, of the complexity of, for example, restrictive relative clauses, and of the system of modal verbs. Many other authors did not perceive a need to design a particular variety of English as an international lingua franca. For example, Smith (1984) and Crystal (1997) assume that English will develop without the active intervention of linguistic engineers.
Smith (1984: 55) assumes that the term English as an International Language refers to the different uses of the language only, that individual users will employ a number of different forms of English, e.g., nativized varieties, and that participants using English as a lingua franca will need to cope with the resulting heterogeneity. Crystal (1997: 136 ff.) proposes that a form which he calls World Standard Spoken English would arise, a form that would coexist with other varieties. Crystal conceives this form of English as characterized by ‘‘careful pronunciation, conventional grammar, and standard vocabulary’’ (Crystal, 1997: 137) and as a variety which ‘‘takes the form, for example, of consciously avoiding a phrase that you know is not likely to be understood outside your own country’’ (Crystal, 1997: 137 ff.). By contrast, Burger (2000: 10 ff.) attempts to embrace current research findings and proposes the following: a revision of the native speaker as a model


for English language teaching, an acceptance of hybrid learner varieties, a dominance of communicativity over correctness, an increased coverage of second language varieties of English, an inclusion of nonnative varieties in listening training, stressing intelligibility of pronunciation over native speaker acceptance, the training of negotiation of meaning, and activities to raise intercultural awareness.

Outlook: Current Research Trends
Scholars have started to investigate lingua francas from a corpus linguistic perspective. Jenkins (2000) searched her corpus of genuine lingua franca English conversations for instances of native and nonnative speaker pronunciation leading to unintelligibility and misunderstanding. Based on her findings, she argues that a number of sounds, suprasegmentals, and articulatory settings could be modified in comparison to the British or General American English model. In her proposal of what she calls the Lingua Franca Core, for example, /θ/ and /ð/ could easily be replaced by /t/ and /d/ or /s/ and /z/ without impeding intelligibility. Similarly, the dark /l/ might be substituted by /o/. A number of projects that aim at the compilation of corpora of LFC are under way. At the universities of Berne, Basle, and Fribourg, Watts, Allerton, and Trudgill, respectively, look into what they have labeled Pan Swiss English, a form of nonnative English emerging among native speakers of Swiss German, French, and Italian in Switzerland. In Austria, Seidlhofer has initiated the compilation of a corpus of English as a lingua franca at the University of Vienna (Seidlhofer, 2004; http://www.univie.ac.at/voice/). Mauranen collects data for English as an academic lingua franca at the University of Tampere. A slightly different perspective is offered by Meierkord, who relates the findings yielded by the analyses of her corpus of English as a lingua franca in South Africa to identity construction in contemporary South African society.

See also: Bilingualism and Second Language Learning; English: World Englishes; Languages of Wider Communication; Nonnative Speaker Teachers; Third Language Acquisition.

Bibliography
Babault S (2001). ‘Les jeunes et le discours mixte à Madagascar: quelles tendances?’ In Canut C & Caubet D (eds.) Comment les langues se mélangent. Codeswitching en Francophonie. 135–158.
Baker C & Prys Jones S (1998). Encyclopedia of bilingualism and bilingual education. Clevedon: Multilingual Matters.
Chen P (1999). Modern Chinese. History and sociolinguistics. Cambridge: Cambridge University Press.
Comrie B (1981). The languages of the Soviet Union. Cambridge: Cambridge University Press.
Crystal D (1997). English as a global language. Cambridge: Cambridge University Press.
Desai Z (2003). ‘A case for mother tongue education?’ In Brock-Utne B, Desai Z & Qorro M (eds.) Language of instruction in Tanzania and South Africa (LOITASA). Dar-es-Salaam: E & D Limited. 45–68.
Ellis R (1994). The study of second language acquisition. Oxford: Oxford University Press.
Ethnologue. http://www.ethnologue.org.
Firth A (1996). ‘The discursive accomplishment of normality: On ‘‘lingua franca’’ English and conversation analysis.’ Journal of Pragmatics 26(2), 237–260.
Fishman J A (1980). ‘Bilingualism and biculturism as individual and as cosocietal phenomena.’ In Fishman J A, Gertner M H, Lowy E G & Milán W G (eds.) The rise and fall of the ethnic revival. Berlin: Mouton.
Haarmann H (1985). ‘The impact of group bilingualism in the Soviet Union.’ In Kreindler I (ed.) Sociolinguistic perspectives on Soviet national languages. Berlin: Mouton de Gruyter. 313–344.
Haarmann H (1992). ‘Measures to increase the importance of Russian within and outside the Soviet Union – a case of covert language spread policy (A historical outline).’ International Journal of the Sociology of Language 95, 109–129.
House J (1999). ‘Misunderstanding in intercultural communication: Interactions in English as a lingua franca and the myth of mutual intelligibility.’ In Gnutzmann C (ed.) Teaching and learning English as a global language. Tübingen: Staufenberg. 73–93.
ICE (International Corpus of English). http://www.ucl.ac.uk.
Klee C A (1996). ‘The Spanish of the Peruvian Andes.’ In Roca A & Jensen J B (eds.) Spanish in contact. Issues in bilingualism. Somerville, MA: Cascadilla Press. 73–91.
Kleineidam H (1992). ‘Politique de diffusion linguistique et francophonie: l’action linguistique menée par la France.’ International Journal of the Sociology of Language 95, 11–31.
Knapp K & Meierkord C (eds.) (2002). Lingua franca communication. Frankfurt a.M: Lang.
Lipski J M (1994). Latin American Spanish. London: Longman.
MacSwan J (1999). A minimalist approach to intrasentential code switching. New York: Garland Publishing Inc.
McArthur T (2002). The Oxford guide to World English. Oxford: Oxford University Press.
McCormick K (2002). Language in Cape Town’s District Six. Oxford: Oxford University Press.
Meierkord C (2004). ‘Syntactic variation in interactions across international Englishes.’ English World-Wide 25:1, 109–132.
Meierkord C & Knapp K (2002). ‘Approaching lingua franca communication.’ In Knapp K & Meierkord C (eds.) Lingua franca communication. Frankfurt a.M: Lang. 9–28.
Mesthrie R (1992). English in language shift. Edinburgh: Edinburgh University Press.
Noll V (2001). Das amerikanische Spanisch. Ein regionaler und historischer Überblick. Tübingen: Niemeyer.
Phillipson R & Skutnabb-Kangas T (2001). ‘Linguistic imperialism.’ In Mesthrie R (ed.) Concise encyclopedia of sociolinguistics. Amsterdam: Elsevier. 570–574.
Poa D & LaPolla R J (forthcoming). ‘Minority languages of China.’ In Miyako O & Krauus E (eds.) The vanishing languages of the Pacific Rim. Oxford: Oxford University Press.
Pöll B (1998). Französisch außerhalb Frankreichs. Geschichte, Status und Profil regionaler und nationaler Varietäten. Tübingen: Niemeyer.
Schwartz J (1980). ‘The negotiation for meaning: Repair in conversations between second language learners of English.’ In Larsen-Freeman D (ed.) Discourse analysis in second language research. Rowley, MA: Newbury. 138–153.
Seidlhofer B (2004). ‘Research perspectives on teaching English as a Lingua Franca.’ Annual Review of Applied Linguistics 24, 209–239.
UNESCO (1953). ‘The use of vernacular languages in education.’ In Fishman J A (ed.) (1968). Readings in the sociology of language. The Hague: Mouton.
Varonis E M & Gass S (1985). ‘Non-native/non-native conversations: a model for negotiation of meaning.’ Applied Linguistics 6, 71–90.
Youssi A (1995). ‘The Moroccan triglossia: facts and implications.’ International Journal of the Sociology of Language 112, 29–43.

Linguistic Anthropology
S Gal, University of Chicago, Chicago, USA
© 2006 Elsevier Ltd. All rights reserved.

Linguistic anthropology is the study of language in culture and society. The field analyzes linguistic practices as culturally significant actions that constitute social life. The situated use of language is exemplary of the meaning-making process that shapes social worlds saturated with contrasting values and contested interests, with opposed political positions and identities, with variable access to institutions, resources and power. Linguistic anthropology examines the role of social interaction – and the semiotic processes on which it relies – in making, mediating and authorizing those contrasts and differences. Aspects of context enter into this process through linguistic form itself, as form signals speaker alignments and cultural presuppositions that are called into play during social interaction. Presuppositions invoked during interaction can draw on any cultural realm: categories of contrasting identities, folk ontologies, notions of truth, space, time, cosmological order, and morality. Such presuppositions are invariably linked to language ideologies, that is, culturally specific conceptions about language and its role in social life. Recent research on language-in-context has resulted in new definitions of the field’s fundamental concepts: ‘language,’ ‘metalanguage,’ ‘discourse,’ ‘context,’ ‘event,’ and ‘text.’ Metalanguage is crucial because it makes possible the reflexivity that is a necessary feature of verbal interaction. Reflexivity has methodological as well as theoretical implications. Speakers’ categories of speech, events and personae are reflexive in that they create frames of

interpretation for social interaction and are not necessarily uniform within any group. They are indispensable starting points for empirical investigation of talk. Such categories provide perspectival views on interaction and contrast with the linguists’ own perspectives and direct observations. Scholarly discourses and debates about language are of theoretical interest as well. Like the metadiscursive categories about language and interaction of ordinary speakers, expert debates also create frames of interpretation; they participate in cultural systems, and often legitimate relations of power. In concert with some poststructuralist philosophies, yet in quite different ways, linguistic anthropology analyzes linguistic practices not only as the instruments of social life, but also as the ground on which social and cultural conflicts are fought. A key issue has been the creation of cultural authority through communication. As a result, current research in linguistic anthropology has considerable significance in the study of political and economic formations, scientific and religious enterprises, as well as in the more traditional study of group boundaries and social identities. Contemporary linguistic anthropology provides the semiotic concepts necessary to understand how social institutions – including ‘‘language’’ and linguistic structure – are reproduced, authorized, and continually transformed.

Terms and Turfs
The label ‘linguistic anthropology’ was coined in the late 19th century by scholars at the American Bureau of Indian Affairs who collected folkloric material among Native Americans. Its current use in the United



States dates from the early 1960s, when ‘linguistic anthropology’ became a cover term for the study of language in social life and conversely for the study of social context in shaping linguistic structure and use. Two commitments have remained central to the field, and link it distinctively to anthropology: First, ethnography is its indispensable methodology, though augmented by elicitation, interview and audiovisual technologies. Second, linguistic form and function are studied within a cross-cultural, comparative framework, with attention to human universals along with historical, regional, and power-laden sociocultural differences. A set of other hybrid labels emerged at roughly the same time, also proposing to study language in social, cultural and psychological terms. Scholars from a number of disciplines came together under labels such as: sociolinguistics (later split into interactional and variationist), ethnography of communication, linguistic stylistics, linguistic pragmatics, psycholinguistics, ethnomethodology, ethnolinguistics, conversation analysis, discourse analysis, interactional analysis, sociology of language, and anthropology of language, among others. Some of these terms became alternate designations within linguistic anthropology (e.g. ethnography of communication, interactional sociolinguistics), others have come to mark differences of emphasis (e.g. discourse analysis, pragmatics). Still others, like variationist sociolinguistics and conversation analysis, have developed characteristic methodologies, remaining closely aligned with their disciplines of origin – linguistics and sociology respectively. The intellectual excitement and energy evidenced by the proliferation of terms, conferences, and edited volumes in the 1960s was an ironic response to the establishment of separate departments of linguistics in American universities in the post-WWII period. 
These departments provided institutional backing for the formalist study of language as an autonomous phenomenon, putting aside concerns with the contexts of language. They thereby indirectly re-invigorated such contextual questions in other institutional venues. Only psycholinguistics was directly spurred by generative grammar, which legitimated cognitive questions in the face of the reigning behaviorism. The new contextual fields were surely also encouraged by more funding for cybernetics and ‘communication research,’ which became policy sciences during the Cold War. Within anthropology, the new labels joined the much older term ‘anthropological linguistics,’ which was closely tied to fieldwork-based typological research on Native North American languages as established by Franz Boas in the early 20th century. Those

who now adopt the label of anthropological linguistics are oriented to linguistics departments, to descriptive work in the structuralist tradition or to historical reconstruction of language and verbal art in unwritten languages. Those identifying as linguistic anthropologists orient to anthropology departments and to language and speech as cultural practice. Many individual scholars are active in both kinds of research. Differences of emphasis notwithstanding, linguistic anthropology and anthropological linguistics have been used interchangeably as labels in textbooks and encyclopedias. The Boasian tradition gives linguistic anthropology significant institutional recognition, intellectual influence and prestige within the discipline of anthropology. In the hybrid fields of the 1960s, practitioners held themselves accountable to different departmental audiences (linguistics, sociology, anthropology), resulting in different emphases and preferred topics. Collections of articles in the last few decades, however, have usually included scholars from several disciplines, all writing on a single theme. The roster of substantive topics has included: the linguistic marking of social relations and identities; conversational interaction and other speech genres; political processes mediated by speech such as decision-making and dispute settlement; language and nation; multilingualism, linguistic variation and multidialectalism; standardization and literacy; national language policy; narrative, performance, verbal art and ritual; the emergence, circulation and desuetude of languages, linguistic varieties, registers and styles; the acquisition of cultural competence through language; the relation of cognition to linguistic categories as coded in grammars and lexicons; the mechanisms of language change. The last half century has also brought new issues such as the globalization of languages and the effect of novel communicative technologies.

Roots and Shoots
Linguistic anthropology is often called an interdisciplinary field. But considered as an intellectual (rather than a departmental or institutional) endeavor it is rather a set of lineages or kinship lines that are read and invoked for inspiration and legitimation. As in all segmentary lineage organization, naming one’s ancestors is also a means of forming alliances and oppositions in today’s controversies. (What the field would analyze as the relation of narrated and narrative event.) This very brief recitation of family ties is not a history of linguistic anthropology but an overview of a usable past on which current practitioners rely. The turn of the 20th century is the conventional starting point, especially the work of Franz Boas,


Edward Sapir and their students. They collected textual materials to document peoples whose cultures were rapidly changing under brutal colonial pressure. These scholars were inspired by the previous century’s German tradition that considered language as a historical guide to the customs and values of a group. Starting with Boas’s studies of verbal art and folklore, poetics has played a continuing role, strengthened by contacts with parallel interests among Prague School linguists. Contact with avatars of European dialectology of the 19th and 20th centuries helped raise concerns about regional distributions of forms, definitions of languages, standardization, dialect boundaries and historical change. A francophone line of structuralist linguistics starting with Saussure, through Benveniste, is perhaps best characterized as the immediate source for the autonomous linguistics that has been linguistic anthropology’s intellectual foil. In this respect, at least, American structuralists from Bloomfield to Harris to Chomsky have been the heirs of Saussure. For linguistic anthropology, by contrast, Saussure’s project is significant as part of a broader understanding of sign phenomena. The other sources of sign theory were the Americans C. S. Peirce and to a lesser extent the more behaviorist Charles Morris. The significance of Peirce’s semiotics for linguistics was emphasized by Roman Jakobson, himself a central node in a kin network that connected American structuralism to Prague School functionalism and Russian formalism while also helping to bring the work of Bakhtin’s decidedly anti-formalist Russian school of poetics and literary studies of the 1920s and 30s to international attention. Literary studies have repeatedly been ‘captured’ as relatives by linguistic anthropology, or vice versa, as in the mid-century dramatism of Kenneth Burke, the semiotic readings of Roland Barthes, or the writings of Raymond Williams and other neo-marxist critics. 
There were also the quasi-literary interests of Malinowski in language function and ‘context of situation,’ not taken up by British social anthropologists but rather by functional linguists and later in the century within anthropology by Gregory Bateson. Philosophies of language have also been significant interlocutors for linguistic anthropology. Austin’s ordinary language philosophy was particularly important as it side-stepped the Fregean concern with truth conditionality. This was followed by Searle and Grice on speech acts and implicatures, and in a different line by Wittgenstein on language games. A later and contrary branch of this lineage is represented by Hilary Putnam and Saul Kripke on the indexicality of reference. Finally, linguistic anthropology rightly claims important kin connections with phenomenologists such as Husserl who inspired sociologists

from Schutz to Goffman to Garfinkel, though these doubtless also saw themselves as descendants of G. H. Mead, himself very likely a reader of Peirce. Relying on all these sources, linguistic anthropology was consolidated in the 1960s through two intellectual strategies familiar from the history of science: The first was a bid to constitute an ‘object’ of analysis that had not before been the focus of research. Most broadly stated, this object was the process of face-to-face interaction. One can rephrase the argument this way: the primary datum for all the ethnographic and language-based disciplines is the contingent stream of eventful talk in everyday life, along with its concomitant non-verbal signaling systems. Each disciplinary enterprise abstracts from this in a principled manner. What is not within its focus becomes an obstacle to research and is bracketed or theoretically discounted. For instance, Saussure was quite explicit that a synchronic linguistics should (for the moment) ignore what he defined as ‘external’ though admittedly important facts, instead studying structural relationships of contrast and opposition he defined as ‘internal’ to language. In a parallel way, the sociocultural anthropology of the 1960s treated language as a vehicle for recounting cultural content, thereby excluding from study the situation in which the telling occurred. This first project has had considerable success.
Linguistic anthropology abstracted something new from that accessible stream of verbal activity, finding systematicity where others had found only noise: Goffman’s ‘‘neglected situation’’; ‘‘naturally occurring talk’’ and ‘conversation’ as defined by Schegloff and Sacks; the ‘‘speech event’’ and its functions, as defined by Jakobson; studies of performance genres by Hymes, Friedrich, Albert, and Ervin-Tripp; Barth’s notion of interactions as boundaries; Austin and Searle’s ‘‘speech acts’’; and the organization of ‘‘social meaning’’ that Gumperz and Labov found in linguistic variation and codeswitching. Goffman declared the relative analytical independence of an ‘‘interactional order’’ governed by a separate set of principles not directly related to larger social structures. There ensued a period of description, typologizing, and cross-cultural comparison. The second, more radical intellectual project was a double-edged critique, targeting both linguistics and social science as then constituted. Hymes’s dictum was a classic performative, disguised as mere description: ‘‘. . . whereas the first half of the century was distinguished by a drive for the autonomy of language as an object of study and a focus upon description of structure, the second half was distinguished by a concern for the integration of language in sociocultural context . . . .’’ (1964: 11). This project continues


to inspire theory and research. It has produced detailed criticisms of mainstream linguistics, sociology and anthropology. In the argument with generative linguistics, linguistic anthropology retained much of structuralist analysis, but rejected the asocial definition of language. What had been peripheralized became central in a series of changes in focus. Linguistic anthropology emphasized multilingual, stratified speech communities instead of the ideal hearer/speaker; performance instead of linguistic competence; linguistic repertoire and speech act instead of abstract grammar and sentence; speech acts and speech events instead of the disembodied sentence. As sources of evidence, contextually located and tape-recorded interaction replaced intuitions about grammaticality. Some of this dovetailed with European initiatives to study sentence level phenomena through their cohesion into larger units. In mainstream American linguistics, however, the subjects taken up by linguistic anthropology were relegated to subfields such as pragmatics and sociolinguistics. In sociology, it was methods and epistemology that were attacked by language-centered approaches. Study of interaction highlighted the situatedness of all sociological descriptions, indeed the unavoidable role of the interviewer in shaping the answers that made up a sociological report. This insight about the ‘reactivity’ of measurement was recognized as important, but was so corrosive to sociological business-as-usual that it was isolated as the workings of a ‘micro-order,’ to be studied separately from the institutional, organizational and demographic issues that occupied the mainstream of sociology. Without theories of how micro and macro were linked, there was a continuing side-lining of language as subject matter, and the trivialization of interactional process as merely the enactment of patterns determined elsewhere, the faithful reflection of supposedly more powerful ‘macro’ forces. 
The role of linguistic anthropology within anthropology was more complicated. The position of language was significantly transformed in the 1980s in two ways. First, through a redefinition of culture. Rather than a symbolic or cognitive phenomenon (the two previous approaches) culture came to be seen as a set of embodied practices within institutions; practices that, in certain conjunctures, could change the institutions themselves. ‘Language’ was often invoked as a powerful means of constructing reality. Ethnographies of speaking that analyzed race, gender, ethnic conflict or dispute settlement fit well into practice theories such as those inspired by Bourdieu, by Birmingham cultural studies, colonial studies and Gramscian notions. But even when

recognized as important, linguistic practice was rarely analyzed in any detail. Simultaneously, a second enterprise was also launched, related to language but largely independent of linguistic anthropology. Under the influence of literary studies, anthropology mounted a reflexive critique of the poetics and rhetoric of anthropology’s own prose genres, especially ethnographic monographs. Anthropologists joined continental theorists such as Foucault and Derrida in unpacking and undermining the idea of objective knowledge. Metadiscourse, texts, their materiality, their authorization, their ability to ‘objectify’ and devalue others, all took center stage in sociocultural anthropology. But these concerns were often separated from the classic ethnographic and comparative goals of the discipline. For linguistic anthropology, the poststructuralist philosophers’ discussions of discourse, rhetoric and poetics as shapers of ‘truth’ and subjectivity rang familiar tunes – if in unfamiliar keys. As a result, they provoked spirited responses. This critical engagement was guided by the internal logic of linguistic anthropology itself during intensive discussions in the 1980s and 1990s. The debates with poststructuralism encouraged a synthesis within linguistic anthropology that was aimed at developing a processual, event-based, political economy of texts in social life, and a semiotic perspective on culture. The overall project of linguistic anthropology remains the reshaping of linguistic theory from an interactionalist and culturalist perspective, and the revamping of anthropological investigations of meaning and action from the perspective of a semiotically grounded understanding of language, culture and social institutions. Within these broad aims, the last twenty years have brought substantial revisions in theoretical concepts.

Concepts and Controversies

The orienting concepts discussed here are not strictly separable; there are overlaps and echoes among them. Each section traces continuities with earlier formulations, discusses points of recent controversy and consensus, and then outlines briefly the implications of current approaches in linguistic anthropology for both linguistic and anthropological theory.

Indexicality, Metalanguage, Materiality

The multifunctionality of language was a pillar of 1960s linguistic anthropology. Jakobson (1960) enumerated emotive, poetic, metalinguistic, phatic and conative (action) functions. These operate simultaneously. Depending on the nature and goal of interaction, some are highlighted more than others. Yet

linguistic anthropologists observed that in many cultural contexts experts and laypeople alike privileged referentiality, believing that the naming of things in the world and predication about them was the pre-eminent role of language. Early linguistic anthropology proposed the category of ‘social meaning’ to designate what is communicated through a disparate set of formal linguistic devices in which picking out a referent is only secondarily involved, or absent altogether. These included Labovian phonological markers of class or regional identity, speech levels, grammatical alternates specific to males vs. females, avoidance registers, and codeswitching between languages and dialects. The conceptual unity of these phenomena has been clarified through more concentrated attention to the non-referential, metalinguistic and poetic functions of language. This has been done through a foundational critique of structural linguistics, fortified by a culturalist reading of Peircean semiotics. The structuralist tradition of grammatical analysis, no less than western common-sense, implicitly relies on the assumption of a stable referentiality for linguistic units. Saussure created a semiotics in which signs link a concept (signified) with a sound image (signifier) in systems of value-creating contrast. But he left unanalyzed the circumstances under which signs would be instantiated. His form of structuralism is able to explicate the workings of grammar as a system of oppositions, sequences and substitutions. But severing an abstract system of types (langue) from their tokens in contexts-of-use (parole) had serious limitations. Most importantly, it could not analyze what Jakobson called ‘shifters:’ linguistic phenomena whose referential value is not entirely fixed within an abstract system, but relies in part on features of the situation in which they are used. 
There is no type-level stability in the reference of ‘I’; it varies with the instance of utterance, always identifying the speaker of the moment. More generally, not only reference but also the interpretations of speech acts, implicatures and presuppositions are necessarily linked to events of speech. Thus, speech as social action is not adequately described as the ‘putting to use’ of a separately analyzed grammar. On the contrary, grammar is full of devices – deictics, tense, mood, evidentials – that gain their interpretation only in part from type-level contrasts, and in part as tokens of use in specific contexts. These phenomena make an autonomous grammar impossible in principle: to describe them fully one needs pragmatics. That is, the speech event in which they occur must be analyzed in ethnographic detail and systematically linked to linguistic form. To do so, Jakobson drew on C. S. Peirce’s triadic semiotic theory in which a sign is linked to an object for an interpretant. Indexical signs, for Peirce, stand for their objects by virtue of a culturally noticed, real-world contiguity. In contrast to symbols, defined by Peirce as signs that stand for their objects by virtue of a general law, indexes simply point to their objects; they signal through a co-existence between the sign and the objects and speech events of its occurrence. In these terms, shifters are partially indexical, partially symbolic. As Silverstein (1976) argued, the linguistic phenomena earlier identified as having ‘social meaning’ (e.g. phonological variants, codeswitches) are non-referential indexes, relying for their interpretation on their contiguity (indexicality) with contextual features of the speech event in which they occur. That is how a phonological variant can signal the social relations of the speakers in an event, their relation to the topic of talk and/or the nature of the event itself. Non-referential indexes can be placed on a continuum with shifters. Indeed, the philosophical work of Putnam and Kripke showed that any act of reference necessarily has an indexical component. Referential indexes (shifters) and non-referential indexes have two further significant properties. They need some metadiscursive frame in which to be interpreted (see next section). And they can be either presupposing or creative. If presupposing, then their use signals that some aspect of the context is taken for granted as existent; if creative/entailing, then the use of the form itself brings into social relevance (into apparent ‘existence’) the objects or categories with which the form is culturally associated. By linking shifters and non-referential indexes, a Peircean analysis provides a conceptual unity to social indexicals and thus to what used to be called ‘social meaning,’ thereby clinching the case against an autonomous grammar. It also provides conceptual materials for an alternative theory of linguistic structure. 
Classic empirical studies of indexicals include Errington’s work on speech levels in Java; Silverstein’s re-analyses of Labovian phonological variables and of T/V pronoun usage in the history of English; Irvine on Wolof registers; Ochs on indexicals of gender and stance in language socialization; Duranti and Agha on honorifics; and Haviland on Australian avoidance register. Brown and Levinson handled politeness phenomena, which are also of this kind, with a decidedly different approach. The presumption that referentiality and propositionality are the pre-eminent functions of language is part of an ancient western ideology. Not as old, but still powerful is the related idea that metalanguage and poetic forms are mere ornaments to reference. Because language has so often served as a model of culture, these widespread assumptions have implications for anthropological theory. For instance,

Lévi-Strauss borrowed from structural linguistics the idea of distinctive features; ethnoscience borrowed generative grammar’s idea that there are rules of competence. Interpretive anthropology borrowed from philology the notion of text. In each of these otherwise different cases, it was the referential capacity of language that served as the model for culture. Culture, like grammar, was seen as organized symbolic content that could be extracted from the real-time social action and historical positioning in which it was created. This taken-for-granted move of decontextualization reproduced the Cartesian assumption of a chasm between world and word. Accordingly, approaches in anthropology that emphasized practice, political economy and materiality were assumed to be opposed to those concerned with meaning, representation and ideation. In contrast, a linguistic anthropology that places indexicality and speech-as-action at the center of attention provides a different synergy with sociocultural anthropology. Propositionality, however significant, is recognized as a feature peculiar to language. It is least like the rest of culture. Instead, the indexical aspects of linguistic practices, as interpreted by metadiscourses, are among the best examples of cultural meaning-making. Indexical signs are not only linguistic; they are also gestural, visual and sartorial, among other modalities. They are not ‘reflections’ of some other, more (or less) important reality. Rather, they are constitutive of the real-time creation of social-material reality through interaction. Peircean semiotics, with its tripartite emphasis on the object, as mediated by the sign and interpretant, insists on the materiality of communication, and conversely on the semiotic organization of material practices.

Context and Contextualization

Speech events were a fundamental unit of early linguistic anthropology. Studies focused on their constituent features (e.g. speaker, hearer, topic), social functions and cross-cultural typologies (e.g. Gumperz and Hymes, 1972; Bauman and Sherzer, 1974). In the last few decades, the structural description of speech events has been transformed into a more flexible concern with the ‘context’ of discourse and performance, bringing several important changes in the understanding of context. Good overviews of these issues are offered by Bauman and Briggs (1990), and Duranti and Goodwin (1992). The notion of ‘context’ as a set of social, spatial and physical features surrounding talk was commonsensical but inadequate. It implied the possibility of infinite regress in the number of features; it neglected the perspective from which context was viewed; and it assumed a firm divide between talk and context. The problem of infinite regress arose from the effort of the analyst to list exhaustively the factors that might affect the nature and form of the talk. In order to choose which of the many features are relevant, one must address the question of perspective. Features defined from the point of view of the analyst are useless for understanding social process; it is the selective attention by participants to aspects of the social surround that analysts ought to be describing. Conversation analysts such as Schegloff, Sacks, Jefferson, Heritage, Charles and Marjorie Goodwin took as an axiom the importance of discerning what participants orient to on a moment-by-moment basis in the local management of sequential talk. Participants need not share perspective among themselves any more than they do with the analyst. They might well have to negotiate a definition of the situation. Context then becomes a joint accomplishment. Infinite regress is avoided because it is the participants who together signal when ‘enough is enough,’ or defeat each other’s attempts to include more (or less) of the surround. Such signaling is not necessarily propositional speech, yet is certainly communicative. Talk itself signals the frames for its own interpretation, supplying the cues for what is to be taken as its own context. There is no firm divide between a strip of talk, its co-text (linguistic context) and its sociocultural context. It is not context, then, that is of interest, but contextualization: how participants attend to on-going discourse, conveying their assessments, evaluations, presuppositions as well as predictions about the definition of the activity that is occurring, the event-specific roles of the participants, the intentions of speakers, the direction the activity is likely to take, as well as unexpected switches in all of these. 
This process relies on culture-specific folk theories about social actors, intentions, events and goals, while recreating those very categories in the process of communication. These theories are not necessarily shared. What Putnam observed about the lexicon is equally apt here: in any group there is likely to be a division of linguistic labor and of the expertise it requires. In addition to local knowledge, contextualization relies on the universal metacommunicative capacity of language, and on the universal ability of speakers to attend to and respond to metamessages about the relationship of talk to its surround. The several concepts that have been crafted for the analysis of this metacommunicative process differ in certain respects, but bear a family resemblance. Lucy (1993) provides a good review of these. Bateson proposed ‘framing’ to denote metamessaging that signals some activity to be play or not-play. This is

common even among many non-human animal species. Chafe and Fillmore made linguistic use of this formulation. Goffman (1979) extended it with his notion of the footing or stance taken in an interaction, and the participant roles or role fragments – such as ‘author,’ ‘animator,’ and ‘principal’ of an utterance – that are thereby evoked. Philips described participant structures, and Cicourel proposed the notion of schema for related phenomena. Gumperz (1982) introduced contextualization cue to name the many kinds of linguistic signals (e.g. prosody, codeswitching) from which one infers what kind of activity is in effect. Silverstein (1993) distinguished between metasemantics, by which speakers define the meanings of words, and metapragmatics. Metapragmatic discourse is explicit commentary or evaluation of language use (e.g. that some event was gossip). Metapragmatic function, by contrast, is implicit signaling to suggest which cultural frame or activity is in effect. Bakhtin (1981) and Voloshinov proposed literary analysis of reported speech and voicing as metacommunicative devices that present the perspective of one speaker on the speech of another. Bauman (1986) showed that performance is itself reflexive: the speaker assumes responsibility for speaking well, thereby drawing attention to the code and poetic forms through which speech genres are created and thus expectations about them are signaled. A crucial aspect of framing or voicing is the possibility that frames can be embedded in other frames; they can be transposed and projected both forward and backward in time. Furthermore, speakers create interactional tropes, treating interlocutors and events ‘as though they were someone/something else,’ thereby achieving novel communicative and social effects. As part of such effects, voices can be reported in quotation or in various forms of indirect discourse. 
Social interaction is thus an endless lamination of narrated events and the narrative events within which the stories are told. For linguistics this implies a complexity in patterns of pronouns, tense, evidentials, discourse markers and anaphora that signal such embeddings of frames. These cannot be handled without theorizing indexical phenomena. For sociocultural anthropology the embedding of frames and their interpenetration during narratives and conversation allows analysts to understand processes such as the relationality of personhood and the fragmentation of selves, as well as subject formation, role distance, and the cultural conceptualization of what counts as authenticity. It is a small example of the reality-constructing processes involved in reported speech that the speech reported need never have happened, or not in the way reported. Yet the report – culturally framed as, say, gossip, journalism, court testimony, or oracle – can have far-reaching consequences in shaping subsequent social relations. Framing and the propositional content of talk always occur simultaneously. Metamessages allow the analyst to track participants’ interactional moves in an encounter. These moves (the interactional text) include the open-ended set of acts that can be done with words: promises, teases, threats, and the unnamed, more general alignment or antagonism among speakers. These acts are accomplished in part by small observable behaviors such as sequencing, body position, and conversational repair. But they are just as importantly accomplished by the ways in which the names for objects and actions that are the subject matter of talk are selected from the many equally accurate denotational labels available. As Schegloff (1971) pointed out for the limited case of place-names, how one formulates a label is always relative to a particular event of talk, that is, indexical. Selection of a term that picks out a referent involves a delicate (and not always conscious or aware) negotiation of social relationships, assumptions about participants’ levels and types of knowledge, hence their identities and social location. In turn, the use of one rather than another referring expression is creative/performative. The difference between ‘dine,’ ‘take a repast,’ ‘chow down,’ or ‘put on the old nosebag’ is not only a matter of lexical register. Each claims a speaker identity, positions speakers with respect to each other, with respect to the event, the referent, and to the cultural discourses indexed by the labels selected. Framing and the indexicality of reference together accomplish contextualization: the moment-by-moment means through which interaction creates and transforms social relations.

Text and Entextualization

There is an irony in the effort of linguistic anthropology to discern how participants contextualize stretches of discourse. For scholars themselves spend most of their time ripping snippets of discourse out of context in order to translate, transcribe and analyze them. The process of decontextualization is a key (reflexive) step in social science methodology. Yet the examples of transposition, reporting the speech of others, and embedded frames discussed earlier show that decontextualization is just as familiar from everyday life. It is the flip side of contextualization. This is what Bakhtin evocatively characterized as our mouths being full of other people’s words. It has been a focus of analysis in linguistic anthropology for the last two decades. Close analysis of the process requires a distinction between ‘text’ and ‘discourse.’ ‘Text’ is any

objectified unit of discourse that is lifted from its interactional setting. ‘Entextualization’ is the process of transforming a stretch of discourse into such a unit of text, undoing its indexical grounding by detaching it from its co-text and surroundings, yet taking some trace of its earlier context with it to another setting which is thereby changed and which reciprocally transforms the text itself. Certain formal properties can enhance the likelihood of entextualization. For instance, poetic features of cohesion, or genre conventions can signal a boundary to an interaction, therefore a chunking of text. But any stretch of discourse can enter into interdiscursive relations by which it seems to be recalled, repeated or echoed in further discourse. It can be picked out and seemingly frozen as text, to be involved in further intertextual relations that link it to previous and subsequent versions of text. Interdiscursive and intertextual links create the impression that text fragments ‘circulate’ across texts, events and among speakers. The linguistic means of entextualization are the same metadiscursive signals that are essential for contextualization: devices of framing, cueing, metapragmatic discourses and functions. For the purposes of linguistic analysis, it is necessary to study the transformations that discourse undergoes as it is entextualized and then (re)contextualized. Footing or genre might change, as might indexical grounding (as signaled by changes in deictics of person, space and time). The function of the text might also change (e.g. from everyday act to ritualized tradition). New forms, functions and meanings may be emergent in the newly re-contextualized text. And there are questions of access, power and inequality involved in the social arrangements that constrain what sorts of persons and statuses can entextualize in what institutional settings. 
Processes widely analyzed under other names – translation, codeswitching, glossing, among others – are amenable to further scrutiny in these terms. Furthermore, by cultural definition, some texts are more or less accessible for de- and recontextualization. The anthropological implications of this line of work extend in at least three directions. First, such analysis can reveal how the social magic of authority – political, legal, epistemic – is created in and across interactions. The relationship between narrated event and the story-telling event in which it occurs is a delicate nexus at which to ‘calibrate’ voicing through the projected relations of teller to tale, to audience, to source, and to previous and subsequent events and tellings (Silverstein, 1993). Through metapragmatic framing, speakers can construe the interactive event that is recounted as being distinct from the on-going event of talk (reportative calibration), or

as the same event (reflexive calibration), or as emanating from some other epistemic realm such as the sacred, the universal, or mythic (nomic calibration). Within specific institutional settings, these calibrations create different sorts of authority: claims to knowledge, or claims to (and the social effect of) speaking as/for the people, the ancestors, the gods, or the laws of science. Contests over the metapragmatic framing of sources for statements can create (and destroy) the authority of texts and hence the power of speakers. As Silverstein and Urban (1996) note, a significant part of politics is the struggle to entextualize authoritatively. Linguistic practices, then, are the very grounds of politics, not the medium, description or reflection of them. This includes the assignment of responsibility and blame, credibility and doubt. Hill and Irvine (1993) show that across widely different cultural settings, such attributions are managed through metapragmatic devices such as reported speech and the distribution of voices across participant roles. Second, the processes of entextualization and links among texts (intertextuality) allow a deeper understanding of temporality, spatiality and social connectedness. Despite the undeniably linear, sequential ordering of speech, there is no single ‘now’ in interaction. Any utterance is weighted with the earlier source from which it can be heard to have originated, and the implicit future recounting in which it might participate. The interdiscursive links among interactional contexts can be extended and projected without temporal limit (Irvine, 1996). Even within a single narration there are often layered successions of retellings which, embedded in each other, can make traditionalization visible as a temporal process. 
Similarly, study of intertextuality can highlight systematic relations among events in spatial extension, inviting scholars to rethink the relation of face-to-face interaction and what used to be called ‘larger social structures.’ This will require further theorization of the various kinds of linkages between interactions. For instance, we need to know how circulation – a shorthand term for intertextual links, echoes, repetitions – has differential ‘reach’ across interactions that are distinguished according to their degrees and types of institutionalization, geographical range, political economic consequentiality, form of mediation as broadcast, print, or face-to-face talk. The phenomenon of translation (linguistic, cultural) deserves considerably more attention in these terms, as it too is a form of multiply layered intertextuality. Finally, there are methodological implications of this perspective on text. Social science research always involves reflexive language. Jakobson remarked that there could be no linguistics without metalanguage, as scholars have to ask for glosses, acceptability judgments and paraphrases. The reactivity of fieldwork became an issue for sociocultural anthropologists of the 1980s (just as it had for sociologists in the 1960s), when they noted that fieldwork is ‘dialogic.’ It is not the positivistic observation of an object by a subject but an encounter between two subjects – the informant and the anthropologist – with different relevances, values, and different sets of ultimate audiences in mind. Much of the subsequent critical commentary focused on issues of objectification in ethnographic writing. Less attention was paid to the fieldwork encounter itself as dialogic and mutually objectifying. Like any interaction, fieldwork and interview are always susceptible to the confusion of mismatched or even incommensurable metapragmatic signals among interactants that Gumperz called ‘cross-talk,’ and that is made worse by power differentials. This is less a problem specific to anthropological research than an insight about the nature of human interaction. As Mannheim and Tedlock (1995) have remarked, the people anthropologists study have been ‘objectifying’ each other well before the arrival of the fieldworker. The task is to specify what kinds of objectification and incommensurability in metacommunication are operating, and how.

Language Ideologies

Language ideologies are cultural conceptions about language, its nature, structure and use, and about the place of communicative behavior in social life. Useful definitions and exemplary studies are presented in Woolard and Schieffelin (1994) and Schieffelin et al. (1998). Ideas about speech and language are common in all social groups and are as culturally diverse as linguistic practices themselves. In the linguistic anthropology of the 1960s the study of language attitudes and native models of politeness, language variation, honorifics and appropriateness were grist for cross-cultural typologies and comparisons. These research themes, along with others detailed below, are unified under the rubric of language ideology. The term ‘ideology,’ though polysemous, most often evokes ideas connected to politics and power. Such concerns have a long pedigree in linguistic anthropology. Boas as public intellectual brought anthropological and linguistic evidence to bear against racist science and anti-immigration policies. Overtly political concerns about inequality and race were also present in the 1960s, for instance in the debates among Bernstein, Hymes, Gumperz, Kay and Labov on the existence, value, and consequences of ‘restricted codes’ in working class and Black speech. Current controversies that have strong political implications

include the increasingly global hegemony of English, the linguistic mediation of inequality, the future of endangered languages, and the stigmatization of multilingualism and of certain accents and dialects. What is different today is scholars’ reflexive analysis of communicative processes in their own work and in large scale politics. Language ideologies always include metapragmatics, that is, local suppositions about the relation of speech forms to speakers’ identities and their social situations. But language ideologies are never only about language. They include whatever other conceptual systems are taken to be relevant to language by the speakers and institutions under study. In the analysis of language ideology, as in the study of metapragmatics, there is a split between those approaches privileging explicit, propositional content, and others that focus on implicit ideological patterns inscribed in linguistic, institutional, ritual and other material practices. Language ideologies are never unitary and so the study of ideology commits the theorist to a perspectival approach. As Woolard has emphasized, one must ask: whose ideology is at issue and in what practices and institutions is it sited. There are likely to be contradictions among ideologies. For instance, Bateson’s notion of the double bind consists of two contradictory ideological (meta) messages, delivered in different modalities simultaneously. There is also likely to be contestation among ideologies evident at different social locations. Nor are ideologies likely to be shared within social groups in a world characterized by linguistic divisions of labor. In a single population, language ideologies inscribed in the practices of schooling can conflict with or override those evident in families, friendship networks or other institutions, raising questions about the relative authority of different ideologies. 
Early discussions of ‘linguistic ideology’ emphasized the tendency of explicit ideological statements to rationalize and thereby distort linguistic practices. Boas and Bloomfield saw speakers’ models of their own speech as obstacles to genuine linguistic analysis. More recently, scholars have noted that there is no access to linguistic materials except through the filter of metalinguistic assumptions – whether these are the assumptions of speakers and/or of analysts. Current use of the term ‘language ideology’ considers such filters not as regrettable distortions but as part of the perspectival nature of ideologies, their necessary partialness and partiality. This includes the perspective of linguists. Language ideologies are grounded in social position and experience, in moral and political stances. But they are not an automatic reflex of these. Rather, ideology mediates between social position and linguistic practice in diverse domains. Students of language

socialization (Ochs and Schieffelin, 1994) have shown that children do not simply learn linguistic skills. Rather, local ideologies mediate between talk itself and assumptions about the proper relationship between childhood, talk and forms of mothering. Similarly, in the study of literacy, Collins finds that language ideologies add their own contributions as interpretive filters, defining who can be expected to read and write in what way and for what purpose, thereby contributing to the creation of many distinct forms of literacy. The linkage between linguistic practices and categories of identity is also mediated by language ideologies. How are maleness and femaleness indexed in speech? When such indexes appear in interaction, other dimensions of social life – such as the expression of desire, sexual activity, typified emotions, rank and social position – are entailed, in part on the basis of local cultural images of masculinity and femininity (Cameron and Kulick, 2003). By viewing language ideology as an inescapably perspectival lens on social interaction, linguistic anthropology engages in debate with neo-Marxist lineages of ideology-critique. Some studies in linguistic anthropology have marshaled evidence from language use to challenge social theorists’ proposals about the workings of symbolic domination and cultural hegemony. Other work has reconsidered influential formulations about ideology by Bourdieu, Foucault, Althusser, Žižek and others to reveal their unexamined assumptions about language and semiosis. Social theorists of ideology often and unreflectively rely on implicit linguistic models that, because they seem commonsensical or self-evident, help to make their theories more persuasive. 
Most generally, ideologies that present themselves as concerning language can work as displacements or coded stories about political, religious or scientific systems; ideologies that seem to be about religion, political theory, human subjectivity or science are often implicit entailments of language ideologies, or the precipitates of widespread linguistic practices. The term ‘displacement’ can be further analyzed here as a form of voicing. However, to recast a language debate as a coded dispute about religion, aesthetics, morality or politics is not, in itself, an explanation. Rather, the goal of analysis in studies of language ideology is to show how such a displacement works in semiotic terms, how it is instantiated in practices, and how it legitimates, justifies or mediates action in quite other areas of social life. Conversely, a particular definition of language may itself be made more credible by its connection to other, non-linguistic, sociopolitical concerns and especially to their supporting institutions. Ethnolinguistic nationalism provides a familiar example. Over several centuries, European philosophical

and political practice did the ideological work of making the connection between the cultural categories of ‘language’ and ‘nation’ appear a necessary, natural and self-evident one, united as much in everyday political practice as in scholarly arguments. This occurred in part through the establishment of a science of language that defined a bounded and unified object of study (‘language’) as a natural entity, out there to be discovered. The ideologically constructed unity of language-and-culture in a populace was seen as the ultimate source of political authority: those who spoke one language constituted a ‘people’ whose united voice would replace the authority of imperial rule. By this logic, any group claiming to speak the same language could use that fact as proof of its nationhood and thus justification for a state of its own. A somewhat different example of authority through language ideology is the political theory of the ‘public sphere’ as guarantor of democratic politics. According to European notions of a public sphere, as dissected by Habermas, the worth of a speaker’s argument is judged not by speakers’ social status. Ideally, democratic citizens make anonymous contributions to policy debate and to critiques of the state. It is the form and rationality of their contributions, not their identities, that is supposed to guarantee the fairness of a democratic polity. From the perspective of this theory of democracy, it is evident that a model of ideal linguistic interaction underpins the semblance of impartiality and hence the legitimacy of democratic process (Gal and Woolard, 2001). It is not only politics that is legitimated by images of language and social life. Bauman’s study of Quakers and Keane’s more recent report on Christian missionizing both suggest that the relations envisioned between speakers and listeners within these religious communities implied forms of interiority and intentionality that became models for various forms of Christian belief. 
Other forms of belief are also underwritten by understandings about language in social life. For instance, Shapin’s historical account of 17th century science shows that polite conversation among gentlemen was the model that, when transferred to gentlemanly interaction at the Royal Society, created the credibility and assumed replicability of early scientific experimentation. In these examples, linguistic ideologies underpin social institutions, providing the supposedly self-evident background that authorizes new social forms. Language ideologies are cultural frames. As such they have their own histories, which are instantiated and circulated in specific institutions and genres of speech and writing such as the etiquette book, instruction in oratory or realist novel. Another such genre in the west is linguistic philosophy. Its analyses


are supposedly universal, yet very much rooted in the history of European cultural understandings about language. While the notion of ‘intentionality’ of the speaker is a key term in western philosophy of language, comparative study shows this to be but one historically specific version of an interiority-centered language ideology. As Duranti and Rosaldo have shown, in many social groups outside of Europe, inferences about speakers’ intentionality are not decisive or indispensable in the interpretation of speech acts. Bauman and Briggs’s study of the western philosophical tradition focuses mainly on Herder and Locke, tracing the historical conditions out of which emerged the regimentation of linguistic practices that would subsequently count as examples of ‘folklore’ on the one hand, and ‘objective speech’ on the other. Another such genre is linguistic analysis itself, especially as it has intersected with colonial projects. Historical studies show how language ideologies fit into fields of debate with which they are contemporaneous, and that concern other, diverse matters: the nature of human difference and inequality, competition among scholarly disciplines, or the competence and vision of a particular monarch’s ruling group.

Differentiation: Registers, Communities, Variation and Change

Speech community, linguistic repertoire, variation, register and style are among the foundational concepts of linguistic anthropology. Adopted from earlier frameworks of research, they were redefined by the work of the 1960s. In the last twenty years they have been transformed once again in light of the notions of indexicality and metapragmatics/ideology. In any social group, images linking typical persons to typical activities and typical linguistic practices draw on culturally salient and elaborated principles of differentiation (e.g. presupposed notions of caste or occupation, folk theories of gender and personhood) that are often perceived by participants as necessary and inherent distinctions. These ideological principles – axes of differentiation – mediate between social and linguistic characteristics and orient the practices and relations of interactants. Speech communities and language communities are emergent effects built out of such axes of differentiation. Linguistic variation often appears to speakers (and to analysts) as a reflection or diagram of social differentiation. A famous example is the finding by Labov and his students that phonological variables correlate with situational style and the socioeconomic status of speakers. The analytical task is to specify the ideological – or more precisely the semiotic – processes by which these correlations arise and

become significant. Why and how do particular chunks of linguistic material coalesce into recognizable and nameable ways of speaking (registers) that gain significance as signs of particular populations, activities and settings, and are heard as appropriate to certain events? Furthermore, how is it that in any interaction, the expected correlations can be subverted or transposed, thereby signaling quite unexpected messages? The extension of a Peircean theory of signs has been productive in approaching these issues. Linguistic features that form co-occurring clusters or registers are indexical of (point to) categories of speakers who regularly use them, or to situations and activity types in which the features regularly occur. But not all real-world co-occurrences form indexical signals. The co-occurrences must be noticed and formulated within some cultural or ideological system. To make such linkages and render them meaningful for speakers often requires extensive discursive efforts and the effects of media circulation. Another means of establishing meaningful indexicalities is through ritual and institutionalization. When both the ways of speaking and the people or activities are typified, schematized and conceptually linked, the result is a system of registers that evokes a system of stereotypes. Formulations of referents in minute-to-minute interaction rely on these associations. Registers often include not only linguistic material but also other signaling systems such as clothing, demeanor and gesture. Linguistic-forms-in-use that are thus ideologized as distinctive and implicating distinctive kinds of people can always be resignified, further ideologized (or misrecognized) as emblematic of other social, political, or moral characteristics in what Silverstein has dubbed multiple orders of indexicality. Another of Peirce’s sign relations – iconicity – is key in differentiation, according to Irvine and Gal (2000).
Peirce distinguished between indexes that point to their objects and icons that share the qualities of their objects, for some interpretant (e.g. a theory or ideology). In sociolinguistic differentiation, there is always a set of contrasting indexes pointing to contrasting objects in a relation that Peirce would call diagrammatic iconicity. Furthermore, the indexical links between linguistic signs and speakers, characteristics or events are understood not simply as a co-occurrence but as a sharing of quality. When an index is thus perceived as an icon, the resulting sign is a Peircean ‘rheme.’ Essentialization is in part constructed semiotically, through the perception that the sign and the object are iconically linked. A system of such contrasts, salient at one level or scale, can be projected, in a fractally recursive manner, onto other scales of social and linguistic relation, either broader or


narrower. This allows for the proliferation of the same or similar difference at greater and smaller scales. Social or linguistic aspects of the sociolinguistic scene that do not fit such systems of stereotypes are semiotically erased. That is, they are ignored, backgrounded and sometimes physically eliminated. When a system of such indexical signals is the basis of social interaction, then participants can have fairly strong expectations about communication. Even if the participants do not share what is usually called a single language, they recognize the kinds of speech that signal different sorts of people and activities, and a speech community can be said to exist. Note the similarity to the Prague School’s notion of Sprechbund. Since precolonial times, networks of exchange, commerce, travel and exploration have been creating speech communities that are diverse in social function, stability and extent. It is important to make an analytical distinction. Speech communities consist of people who can interpret each other’s pragmatic, indexical signals to varying degrees. Language communities are groups of people bearing loyalty to the norms of a denotational system. Usually the denotational form receives a name – English, Swahili, Taiap – and is imagined as bounded and separate from other comparable units. Language communities emerge as cultural systems in the context of heterogeneous speech communities when difference in denotational practice is ideologized as significant. Thus contact and interaction – not isolation – produce distinct language communities. Language communities, although always characterized by loyalty to code, are nevertheless culturally distinct. Sometimes a single person’s speech is recognized as exemplary and aesthetically pleasing. In other cases the form of speech used in a certain setting or event (kiva, longhouse, oratory) is considered the model worthy of emulation.
More common in the world today is the language community that is linked to a state system and oriented not to beauty but to standardized forms of correctness, monitored by language academies, school systems and grammar books. Named languages do not simply exist in the world. Through institutions they are constantly being made and reconstructed, their boundaries policed and defended. In the process of consolidation, standard languages often become gate-keeping devices in national labor markets, providing speakers who control them with increased access to jobs and other resources (Bourdieu, 1981). But the value of standard languages does not derive from such direct market activity; rather their market value depends on semiotic processes of differentiation. Speakers who are incorporated into colonial empires through bureaucracy, trade, or conquest,

but do not speak the language of the state, come to see their own linguistic practices through the eyes of the powerful center. Therefore, they come to see themselves relationally, as peripheral. For such populations, the switch in perspective produces novel self-understandings as ‘minority,’ ‘local’ or ‘indigenous.’ In states organized as democratic and multicultural, legitimating one’s indigeneity or minority standing requires at least partial adoption of the state’s standardizing ideology. Whatever their own ideologies about linguistic practice, such populations must often produce a denotational code different enough from others to count as a ‘language’ of their own. For many decades, such ‘local’ languages were the special province of anthropological linguists, whose descriptions deliberately erased – as inauthentic – the contact languages and multilingualism that tied indigenous speakers to their neighbors and colonial rulers. Part of the problem is that Euro-American linguists’ notion of language as morphosyntax-with-sound pattern is often at odds with local definitions that focus on lexical co-locations, place names, prosodic features and textual organization. These differences acquire increased significance when indigenous languages are considered endangered. The question of what merits documentation becomes a highly consequential matter, argued by scholars, by courts, and among members of the language community. As Hill and others have shown, whatever counts as linguistic knowledge in indigenous communities often endows its owner with authority and access to local resources. The position of Euro-American linguists as experts and arbiters in these matters is rife with moral contradictions that have been a focus of professional writing in recent years. The exploration of metapragmatics and language ideology has produced new approaches to language change.
Change is often the unintended consequence of people attending to linguistic structures through the prism of their own language ideologies, limited as these are by cognitive constraints on awareness and sociopolitical framings of what is significant. Increasing linguistic differentiation occurs through patterns of schismogenesis among interacting speakers, or conversely through simultaneous use of genetically different denotational codes in codeswitching (Heller, 1988). Codeswitching itself can become the focus of loyalty, thereby producing a new language community. Other processes of differentiation result in language obsolescence, or contrariwise in language revival and the creation of ‘heritage’ languages for diasporic populations (Dorian, 1989). Also common is the commodification of language or linguistic practice for touristic purposes and the concomitant ‘ethnicization’ of local denotational codes when they


co-exist with a standardized state language. In this process there are often structural changes in the local language that mark it as iconic of the group with which it is identified. Ideologies of ‘modernity/tradition,’ ‘male/female,’ ‘purity/dirt,’ and presuppositions about typified emotional states, notions of self, and quite local political issues, all can mediate between the socioeconomic situations of speakers and the forms of language change they experience (Gal and Irvine, 1995). Ideological framings of difference penetrate significantly into grammar. A semiotic analysis of differentiation has implications for the study of processes beyond linguistic practices. The tendency for nationalisms to recursively evoke internal divisions of the populace into foreign-natives vs. native-natives is well explained by the semiotics of differentiation. In colonial and imperial circumstances, details of cultural practices have been interpreted as evidence of the relative ‘humanness’ of conquered populations in contrast to conquerors. Projections of this kind are not presupposing indexes of existing features, but creative (performative) acts that bring into interactional relevance the very iconic similarities they seem to be merely describing. In yet another form of differentiation, speakers adopt the Goffmanian ‘figures’ of others. That is, they take on, for varying periods of time, the registers, objects and activities seen as iconic of others in acts of Bakhtinian mimicry, quotation, parody and/or admiring emulation. By attending to the semiotics of differentiation, linguists study the dynamics of heteroglossic social orders.

Language and Thought

Whether and how grammatical categories influence habitual thought and ‘perceptions of reality’ are among the oldest concerns of anthropological linguistics/linguistic anthropology. They were first raised for European science by colonial exploration and contact with languages whose grammatical structures seemed exotic in relation to the patterns familiar from study of geographically more proximate populations. In one way or another such differences worried Boas, Sapir, and Whorf, and well before them inspired Humboldt, Herder and Condillac. The issue has continued to draw scholarly interest throughout the 20th century. If one were to consider the social power to be gained from the ability to define social ‘reality’ – as in Gramsci’s cultural hegemony, Foucault’s discourse, Bourdieu’s doxa – then this set of questions would parallel those raised in the rest of linguistic anthropology. Characteristically, however, studies of linguistic relativity have taken a narrower view of linguistic

practices, and have considered neither power nor a socially located and mediating ideology. Positing a more direct relation between language and thought, they have studied psychological and cognitive processes in themselves. There is currently a reversal in this trend, however, bringing studies of linguistic relativity closer to the issues of identity formation, politics, conflict and social differentiation that characterize the rest of linguistic anthropology. Recent work suggests that issues of translation, register and interaction will become as important in studies of linguistic relativity as they are increasingly becoming in other areas of linguistic anthropology. Two reviews provide excellent guides to the state of research and its disputed history: Hill and Mannheim (1992) and Gumperz and Levinson (1996). It will suffice here to note some areas of consensus among scholars, before taking up three contested issues to give a sense of the debates and the way terms such as language and thought have been redefined. These matters are agreed: First, linguistic relativity (or the so-called Sapir-Whorf hypothesis) is not a hypothesis to be ‘tested’ but an axiom or starting point for research. Grammatical categories, to the extent that they are obligatory or habitual and relatively inaccessible to speakers’ consciousness, form a privileged location for reproducing cultural and social categories because they constrain the ontology taken for granted by speakers. There is no assumption about the coherence of entire ‘world views’ in this as in any other corner of anthropology. Second, although evidence of universals in human cognition has been thought to undermine a search for language-specific cognitive phenomena, all researchers acknowledge both. The interesting questions concern the relative strength, nature, sequence and role of universals vs. cultural-linguistic specificities and what those specificities might be. Third, it follows that Whorfian effects exist.
This is hardly surprising given the discussions above about creative indexicality and projections. Finally, research priorities have shifted during the 20th century. Due in part to the Chomskyan ‘rationalist’ program in linguistics, the cognitive turn in psychology, and the empirical results of Berlin and Kay’s (1969) research on color terminology, universals took center stage in the 1970s. Currently, there is a renewed interest in Whorfian effects from a number of different perspectives. Turning now to contested issues, the first concerns the category ‘grammar.’ Whorf proposed that languages differ in the grammatical analogies they make. By handling substantively different lexicon within the same grammatical frame, they invite speakers to treat the otherwise different items in a similar way. Thus, English treats days, years and months not as cyclical


events but with the same grammatical devices as ordinary object nouns. English speakers expect – by unconscious analogy – to count time in the same way as they count tables. They ask about the substance out of which days are made, on the analogy of wood as the substance out of which tables are made. Hence the objectification of time as a substance. Such analogies are unquestioned background assumptions. They become apparent to analysts if one analogy system is compared to another that provides different hidden parallels. Careful methodology is fundamental here: when two systems are compared, neither can be taken as the standard or metalanguage for the other. Some theorists suggest that the privileging of morphosyntax and its effect on semantic categories is misplaced, in a world of multilingualism. Friedrich proposed instead that the tropic or ‘poetic’ aspects of language, inflected by ethnopoetics, will differ most across cultures. (This echoes cognitive linguists’ claims that habitual metaphors structure thought.) Similarly, if the poetic form of narration changes during language shift, there is a loss of a distinct cultural pattern for organizing experience. Others counter that narrative organization signals merely a difference in the way that experience is packaged for the purpose of talk, and is not necessarily reflective of cognition. This formulation runs into trouble, however, if people must use obligatory linguistic categories to encode experience in order to plan for future recountings. A second set of arguments starts from experimental or cognitive psychology and the presumptive priority of universal cognitive processes. For some, linguistic relativity is not an issue because they assume language and cognition to be isomorphic, with thought as ‘inner speech.’ Linguistic relativity is also irrelevant for domains assumed to be unmediated by language: physical, musical or craft skills that are thought to be coded in somatic schema. 
Theories about universals of thought derive also from the Kantian tradition that takes categories of time, space and cause as the fundamental grounds of human reasoning. For many domains, there are also likely to be universal constraints imposed by the nature of the domain itself, and the specialized anatomical and neurophysiological adaptations of humans to a concrete world: wavelength for color; gravity in the case of space. Even in the realm of language there might be universals of structure or lexical organization. What does linguistic specificity add to such universals? Levinson suggests that in the case of space and possibly many other domains, linguistic relativity is still powerfully involved. Universals substantially underdetermine

the possibilities of conceptual solutions to describing spatial arrangements. A third controversy takes up Hymes’s early suggestion that there is a linguistic relativity of language use as much as of linguistic structure. Populations differ in the genres and events they recognize. Interpretations that participants derive from utterances are always dependent on sociocultural context. Thus, the fit between language and thought is mediated by habitual practice, social interaction and cultural beliefs (ideologies) about the everyday world. For instance, deictics of space are found in all languages. Nevertheless, as Hanks argues, they encode culturally specific information, and they map social and experiential fields, not objective spaces. Thus, cultural schema of several kinds mediate between the use of a deictic term and its proper interpretation. Furthermore, these frames and schema are not always equally available to all speakers in a community. A linguistic division of labor is often evident, as is the consequent necessity to negotiate meanings between interactants. Clearly, this brings to the study of linguistic relativity questions of indexicality, entextualization and ideology. For the study of linguistic relativity the implications are significant: There might be as much variation between speakers in their access to alternate perspectives and theories as there is across ‘cultures.’ Furthermore, distinguishing between ‘language,’ ‘culture’ and ‘thought’ is at best a rough methodological tactic. The object of investigation for linguistic anthropology, in current practice, is exactly ‘culture’ as a process that is simultaneously semiotic, interactional and linguistic.

Bibliography

Bakhtin M (1981). The dialogic imagination: four essays. Austin, Texas: University of Texas Press.
Bauman R (1986). Story, performance and event: contextual studies of oral narrative. New York: Cambridge University Press.
Bauman R & Briggs C (1990). ‘Poetics and performance as critical perspectives on language and social life.’ Annual Review of Anthropology 19, 59–88.
Bauman R & Sherzer J (eds.) (1974). Explorations in the ethnography of speaking. New York: Cambridge University Press.
Berlin B & Kay P (1969). Basic color terms: their universality and evolution. Berkeley: University of California Press.
Bourdieu P (1981). Language and symbolic power. Cambridge, MA: Harvard University Press.
Cameron D & Kulick D (2003). Language and sexuality. New York: Cambridge University Press.
Dorian N (1989). Investigating obsolescence: studies in language contraction and death. New York: Cambridge University Press.
Duranti A & Goodwin C (1992). Rethinking context: language as an interactive phenomenon. New York: Cambridge University Press.
Gal S & Irvine J (1995). ‘The boundaries of languages and disciplines: how ideologies construct difference.’ Social Research 62, 967–1001.
Gal S & Woolard K (eds.) (2001). Languages and publics: the making of authority. Manchester, UK: St. Jerome’s Publishers.
Goffman E (1979). ‘Footing.’ Semiotica 25, 1–29.
Gumperz J J (1982). Discourse strategies. New York: Cambridge University Press.
Gumperz J J & Hymes D (eds.) (1972). Directions in sociolinguistics: the ethnography of communication. New York: Holt, Rinehart, & Winston.
Gumperz J J & Levinson S (eds.) (1996). Rethinking linguistic relativity. New York: Cambridge University Press.
Heller M (1988). Codeswitching: anthropological and sociolinguistic perspectives. New York: Mouton de Gruyter.
Hill J & Irvine J (eds.) (1993). Responsibility and evidence in oral discourse. Cambridge: Cambridge University Press.
Hill J & Mannheim B (1992). ‘Language and world view.’ Annual Review of Anthropology 21, 381–406.
Hymes D (1964). ‘Introduction.’ In Hymes D (ed.) Language in culture and society: a reader in linguistics and anthropology. New York: Harper & Row. 1–14.
Irvine J (1996). ‘Shadow conversations: the indeterminacy of participant roles.’ In Silverstein M & Urban G (eds.). 131–159.
Irvine J & Gal S (2000). ‘Language ideology and linguistic differentiation.’ In Kroskrity P (ed.) Regimes of language. Santa Fe, NM: School of American Research. 35–84.
Jakobson R (1960). ‘Concluding statement: linguistics and poetics.’ In Sebeok T (ed.) Style in language. Cambridge, MA: MIT Press. 350–377.
Lucy J (ed.) (1993). Reflexive language: reported speech and metapragmatics. New York: Cambridge University Press.
Mannheim B & Tedlock D (1995). ‘Introduction.’ In Tedlock D & Mannheim B (eds.) The dialogic emergence of culture. Urbana and Chicago: University of Illinois Press. 1–32.
Ochs E & Schieffelin B (1996). ‘The impact of language socialization on grammatical development.’ In Fletcher P & MacWhinney B (eds.) Handbook of child language. New York: Blackwell.
Schegloff E (1971). ‘Notes on a conversational practice: formulating place.’ In Sudnow D (ed.) Studies in social interaction. New York: Free Press.
Schieffelin B, Woolard K & Kroskrity P (eds.) (1998). Language ideologies: practice and theory. New York: Oxford University Press.
Silverstein M (1976). ‘Shifters, verbal categories and cultural description.’ In Basso K & Selby H (eds.) Meaning in anthropology. Albuquerque: University of New Mexico Press. 11–55.
Silverstein M (1993). ‘Metapragmatic discourse and metapragmatic function.’ In Lucy J (ed.). 33–58.
Silverstein M & Urban G (1996). Natural histories of discourse. Chicago: University of Chicago Press.
Woolard K & Schieffelin B (1994). ‘Language ideology.’ Annual Review of Anthropology 22, 55–82.

Linguistic Decolonization
A Jaffe, California State University, Long Beach, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

Linguistic decolonization describes both the actions taken in postcolonial contexts to undo the social, political, and cultural effects of the dominance of colonial languages and a philosophical challenge to the Western language ideologies that underpinned the colonial project and that have persisted in the postcolonial period. A wide view of ‘colonization’ includes not only the classic cases of Western expansionism but also ‘internal colonialism’ involving indigenous and minority populations within the nation-state (see Minorities and Language). We can speak of linguistic decolonization in a multitude of contexts, ranging from new state formation in Africa and Asia and the former republics of the Soviet Union

to indigenous language planning in the Pacific, North and South America, to minority language movements in Western Europe. Given this vast scope, no pretense will be made here to cover all possible contexts and the vast literature in language planning and postcolonial studies; rather the aim is to outline some of the common features and challenges of documented processes of linguistic decolonization and what they have to say about language ideologies and policies in general (see Linguistic Rights). Linguistic decolonization always takes place within a nationalist project: either as an element of new nation-building, or as an effort to legitimate languages and identities that were unrecognized or actively suppressed under colonialism. Linguistic decolonization projects have thus been preoccupied with redressing linguistic inequality and cultural oppression in the public sphere, particularly in education and in official/governmental life, by replacing

Dorian N (1989). Investigating obsolescence: studies in language contraction and death. New York: Cambridge University Press.
Duranti A & Goodwin C (1992). Rethinking context: language as an interactive phenomenon. New York: Cambridge University Press.
Gal S & Irvine J (1995). ‘The boundaries of languages and disciplines: how ideologies construct difference.’ Social Research 62, 967–1001.
Gal S & Woolard K (eds.) (2001). Languages and publics: the making of authority. Manchester, UK: St. Jerome’s Publishers.
Goffman E (1979). ‘Footing.’ Semiotica 25, 1–29.
Gumperz J J (1982). Discourse strategies. New York: Cambridge University Press.
Gumperz J J & Hymes D (eds.) (1972). Directions in sociolinguistics: the ethnography of communication. New York: Holt, Rinehart, & Winston.
Gumperz J J & Levinson S (eds.) (1996). Rethinking linguistic relativity. New York: Cambridge University Press.
Heller M (1988). Codeswitching: anthropological and sociolinguistic perspectives. New York: Mouton de Gruyter.
Hill J & Irvine J (eds.) (1993). Responsibility and evidence in oral discourse. Cambridge: Cambridge University Press.
Hill J & Mannheim B (1992). ‘Language and world view.’ Annual Review of Anthropology 21, 381–406.
Hymes D (1964). ‘Introduction.’ In Hymes D (ed.) Language in culture and society: a reader in linguistics and anthropology. New York: Harper & Row. 1–14.
Irvine J (1996). ‘Shadow conversations: the indeterminacy of participant roles.’ In Silverstein M & Urban G (eds.). 131–159.
Irvine J & Gal S (2000). ‘Language ideology and linguistic differentiation.’ In Kroskrity P (ed.) Regimes of language. Santa Fe, NM: School of American Research. 35–84.
Jakobson R (1960). ‘Concluding statement: linguistics and poetics.’ In Sebeok T (ed.) Style in language. Cambridge, MA: MIT Press. 350–377.
Lucy J (ed.) (1993). Reflexive language: reported speech and metapragmatics. New York: Cambridge University Press.
Mannheim B & Tedlock D (1995). ‘Introduction.’ In Tedlock D & Mannheim B (eds.) The dialogic emergence of culture. Urbana and Chicago: University of Illinois Press. 1–32.
Ochs E & Schieffelin B (1996). ‘The impact of language socialization on grammatical development.’ In Fletcher P & MacWhinney B (eds.) Handbook of child language. New York: Blackwell.
Schegloff E (1971). ‘Notes on a conversational practice: formulating place.’ In Sudnow D (ed.) Studies in social interaction. New York: Free Press.
Schieffelin B, Woolard K & Kroskrity P (eds.) (1998). Language ideologies: practice and theory. New York: Oxford University Press.
Silverstein M (1976). ‘Shifters, verbal categories and cultural description.’ In Basso K & Selby H (eds.) Meaning in anthropology. Albuquerque: University of New Mexico Press. 11–55.
Silverstein M (1993). ‘Metapragmatic discourse and metapragmatic function.’ In Lucy J (ed.). 33–58.
Silverstein M & Urban G (1996). Natural histories of discourse. Chicago: University of Chicago Press.
Woolard K & Schieffelin B (1994). ‘Language ideology.’ Annual Review of Anthropology 23, 55–82.

Linguistic Decolonization A Jaffe, California State University, Long Beach, CA, USA © 2006 Elsevier Ltd. All rights reserved.

Linguistic decolonization describes both the actions taken in postcolonial contexts to undo the social, political, and cultural effects of the dominance of colonial languages and a philosophical challenge to the Western language ideologies that underpinned the colonial project and that have persisted in the postcolonial period. A wide view of ‘colonization’ includes not only the classic cases of Western expansionism but also ‘internal colonialism’ involving indigenous and minority populations within the nation-state (see Minorities and Language). We can speak of linguistic decolonization in a multitude of contexts, ranging from new state formation in Africa and Asia and the former republics of the Soviet Union

to indigenous language planning in the Pacific, North and South America, to minority language movements in Western Europe. Given this vast scope, no pretense will be made here to cover all possible contexts and the vast literature in language planning and postcolonial studies; rather the aim is to outline some of the common features and challenges of documented processes of linguistic decolonization and what they have to say about language ideologies and policies in general (see Linguistic Rights). Linguistic decolonization always takes place within a nationalist project: either as an element of new nation-building, or as an effort to legitimate languages and identities that were unrecognized or actively suppressed under colonialism. Linguistic decolonization projects have thus been preoccupied with redressing linguistic inequality and cultural oppression in the public sphere, particularly in education and in official/governmental life, by replacing

all or part of the colonial language’s public functions with one or more local, indigenous, or minority languages. Projects of linguistic decolonization are thus profoundly shaped by dominant ideologies of language and nationalist ideologies about how language is related to cultural and political identities. These ideologies include the central premises that languages are ‘natural’ and clearly bounded entities that map onto equally natural human boundaries, and that there is an essential or primordial link between a single language (conceived of as the ‘mother tongue’) and a single/unitary identity (either personal or collective). A combination of social, ideological, and pragmatic issues complicates the process of linguistic decolonization and its goals of democratization and cultural legitimation for previously colonized groups. On the practical level, in postcolonial contexts it is rare that linguistic decolonizers have access to all the material and political resources needed to replace the status and functions of the colonial language in all domains. It is also the case that both local and global political economies of language are resistant to challenges to those powerful, former colonial languages. First, those languages often still constitute social and political capital at the local level, where they can be used by local elites to legitimate their social positions. Second, those languages (particularly English) have currency in new, global markets, motivating postcolonial social actors to embrace dominant language education in search of economic mobility despite the cultural value of schooling in the minority or indigenous language. It is also the case that when minority or indigenous languages are ‘developed’ in the image of the colonial languages they replace, language planning efforts (normalization, officialization, standardization, codification) have the potential to create new forms of linguistic hierarchy.
First of all, language planning policies may favor particular languages or dialects and thus their speakers. In some cases, those hierarchies are accepted by the population, to the benefit of particular speakers. In other cases, this political dimension of language policy ends up creating a permanent crisis of legitimacy that prevents postcolonial language planners from reaching any consensus. More generally, the introduction of minority or indigenous languages in the spheres of literacy and education inevitably creates a divide between ‘good/pure’ and ‘bad/mixed’ codes and validates new forms of expert linguistic knowledge owned by a minority. With respect to linguistic purism in postcolonial contexts, two points can be made. First, it is an outcome of dominant ideologies of authoritative

language (see Power and Pragmatics). Second, as a result of contact with the dominant language, both the form and use of the dominated language are inevitably ‘mixed’ and therefore stigmatized, once ‘high’ forms of that language are introduced. The issue of linguistic purism highlights the difficulty of resisting or challenging the single language/single identity premise of the nationalist project. At the political level, agents of linguistic decolonization are often forced to legitimate noncolonial languages in dominant terms because those terms are imposed by powerful gatekeepers. In addition, there is the question of the ‘colonized mentality’: the fact that the validity of dominant ideologies of language, including the denigration of nondominant languages, has often been internalized by the general population. Postcolonial linguistic agents are often faced with a double-bind: if they use the colonial language, they are seen as traitors to their cultural/ethnic group; if they use the dominated language, their voice has a more limited power and reach (see Pragmatics: Linguistic Imperialism). Truly radical linguistic decolonization projects would thus have to challenge the fundamental premises of dominant linguistic and cultural ideologies and practices. These radical forms of resistance involve the legitimation of plural or hybrid linguistic forms, practices, and identities, including the appropriation and reworking of colonial languages. This kind of linguistic decolonization is rare in the public sphere, although it can be seen in some indigenous education projects and is more noticeable in domains of artistic, creative linguistic practice. See also: Language Adaptation and Modernization; Language Development: Overview; Minorities and Language; Minority Languages: Education; Nationalism and Linguistics; Power and Pragmatics; Pragmatics: Linguistic Imperialism; Standardization.

Bibliography

Akkari A (1998). ‘Bilingual education: beyond linguistic instrumentalization.’ Bilingual Research Journal 22(2, 3, 4), 103–125.
Altbach P G (1971). ‘Education and neocolonialism.’ In Ashcroft B, Griffiths G & Tiffin H (eds.) The postcolonial studies reader. London: Routledge. 447–451.
Apter A (1982). ‘National language planning in plural societies: the search for a framework.’ Language Problems and Language Planning 6(3), 219–240.
Blommaert J (1996). ‘Language planning as a discourse on language and society: the linguistic ideology of a scholarly tradition.’ Language Problems and Language Planning 20(3), 199–222.
Brathwaite E K (1984). History of the voice: the development of nation language in anglophone Caribbean poetry. London: New Beacon.
Calvet L J (1974). Linguistique et colonialisme: petit traité de glottophagie. Paris: Payot.
Eastman C (1992). ‘Sociolinguistics in Africa: language planning.’ In Herbert R K (ed.) Language and society in Africa. Johannesburg: Witwatersrand University Press. 95–114.
Eastman T (1995). ‘The national longing for form.’ In Ashcroft B, Griffiths G & Tiffin H (eds.) The postcolonial studies reader. London: Routledge. 170–175.
Fabian J (1986). Language and colonial power: the appropriation of Swahili in the former Belgian Congo 1880–1938. Cambridge: Cambridge University Press.
Fanon F (1952). Black skin, white masks (Markmann C F (trans.) (1968)). London: MacGibbon and Kee.
Fanon F (1991). The wretched of the earth (Farrington C (trans.) (1963)). New York: Grove Press.
Fardon R & Furniss G (1994). African languages, development and the state. London: Routledge.
Fishman J, Ferguson C & Das Gupta (eds.) (1968). Language problems of developing nations. New York: John Wiley and Sons.
Kachru B B (1986). The alchemy of English: the spread, functions and models of non-native Englishes. Oxford: Pergamon Press.
Laitin D (1992). Language repertoires. New York: Cambridge University Press.
New W H (1978). ‘New language, new world.’ In Narasimhaiah C D (ed.) Awakened conscience: studies in Commonwealth literature. London: Heinemann.
O’Brien E (1997). ‘At the frontier of language: literature, theory, politics.’ Minerva – An online Journal of Philosophy 1. http://www.ul.ie/~philos.
Ogunjimi B (1995). ‘Currents on language debates in Africa.’ Africa Update 23. http://www.ccsu.edu/afstudy/upd2-3.html#Z3.
Pennycook A (2001). Critical applied linguistics. Mahwah NJ: Erlbaum.
Talib I S (2002). The language of postcolonial literatures. London: Routledge.
Wa Thiong’o N (1981). Decolonising the mind: the politics of language in African literature. London: James Currey.
Weinstein B (1990). Language policy and political development. Norwood NJ: Ablex.
Williams G (1992). Sociolinguistics: a sociological critique. London: Routledge.

Linguistic Ethnonationalism P Eisenlohr, Washington University, St. Louis, MO, USA © 2006 Elsevier Ltd. All rights reserved.

The scholarly treatment of linguistic ethnonationalism is intimately linked to the question of the origins of the nation. A main division in scholarship on nationalism, between those who conceive the nation as an entirely modern phenomenon and those who trace its roots further back in time, has also resulted in contrasting positions regarding the role of language and linguistic practice in the rise of the nation. In the European context, proponents of the latter position have tended to treat linguistic difference as one of the principal markers of preexisting ‘ethnic’ difference that later developed into full-fledged nationhood in the 18th and 19th centuries. Anthony Smith (1989), for example, described awareness of linguistic boundaries and of sharing a vernacular language with emerging literary traditions as one of the features of ‘ethnic cores’ established in the premodern period, which provided a base for modern nationalism later on. Josep Llobera has argued that in the case of the territories later developing into England, France, Germany, Italy, and Spain, a sense of nationhood, even if restricted to a small part of the population

and varying in salience among those territories, was already in place in the late Middle Ages. These early notions of nationhood also rested in part on the sharing of what were identified as common vernacular languages (Llobera, 1994). In contrast, modernists such as Benedict Anderson and Ernest Gellner categorically reject continuities between premodern forms of political identification and modern nationhood, nor have they treated premodern apprehensions of linguistic difference as significant for the constitution of modern nations. Instead, for them the importance of language and linguistic practice in the creation of modern nationhood lies above all in their integrative functions in processes of vernacular standardization and mass communication through print, a line of inquiry initially associated with Karl Deutsch (1953). There is little doubt that a concept of the nation existed in parts of Europe before the 18th century, as for example in the case of Spain, where in the early 17th century nationhood was already explicitly linked to the concept of a distinct vernacular language (Woolard, 2004). However, both modernists and those insisting on medieval or early modern origins of the nation tend to agree that nationalism as a political ideology demanding a form of popular sovereignty rather than a dynastic polity cannot be traced further back than the 18th century, and this also

applies to forms of nationalism drawing on perceptions of linguistic difference as a justification for popular sovereignty. The first comprehensive formulation of linguistic ethnonationalism is commonly traced to Herder’s critique of Kant’s transcendental philosophy of subjectivity. According to Herder (1968), intellectual production and human subjectivity are not to be understood as based on a putatively universal pure reason but as grounded in and created out of particular linguistic traditions, which constitute humankind as a plurality of ‘peoples.’ Accordingly, the status of a group as a ‘people’ with a claim to nationhood is upheld by cultural difference, which is inseparably linked to a particular linguistic tradition. The ‘spirit’ of a nationalized ‘people’ is thus both expressed and shaped through cultural creativity necessarily bound to the use of a certain language. Herder held that language is not a transparent medium of cultural production; rather, the particular characteristics of a linguistic variety or linguistic tradition have a profoundly shaping impact on the cultural life of a ‘people.’ Thus, according to Herder, language necessarily mediates cultural traditions not in constituting a neutral means of expressing them and conveying them to others, but in the sense of playing a productive role in cultural life and leaving an indelible imprint of its particularities on everything it cocreates and mediates. This formulation has been extremely influential in the formation of German nationalism and other nationalisms of central and eastern Europe, and continues to inspire nationalist ideologues in a wide range of locations today.
However, it is precisely the inevitability of the link suggested by Herder between a particular linguistic tradition and a particular sense of nationhood, which has the effect of naturalizing language-based nationhood, that has been widely rejected among scholars of nationalism, even by those insisting on premodern origins of the nation. Instead, the way apprehensions of linguistic difference have constituted and legitimized claims to nationhood has been shown to be a historically contingent process and subject to dynamic cultural construction and contestation. However, scholars have differed considerably in how much latitude they have assigned to such processes of construction. Another important division in approaches to linguistic ethnonationalism is between those who emphasize politicized images of linguistic belonging and difference as creative forces in the rise of nationalism since the 18th century and those who privilege the functional roles of language in modern systems of mass communication.

‘Modernist’ Theories of Language and Nationalism

Ernest Gellner’s and Benedict Anderson’s approaches to nationalism tend to stress the latter, ‘modernist’ position. For Gellner (1983), the emergence of the nation is a product of industrialization in the modern world. In order for industrial civilization to be possible, the multitude of local cultural and linguistic particularities characteristic of agrarian societies has to be superseded by large-scale cultural and linguistic uniformization. In particular, the spread of modern standardized vernacular languages protected by state-sponsored educational systems is described by Gellner as functionally necessary for the workings of industrialization. Nationalism emerges as a new ideology serving an integrative function for the larger social aggregates formed by industrial processes of production as well as vernacular literacy and mass communication, which crucially underpin them. Thus, nations are modern inventions, and can only arise under conditions of industrialization, while the cultural contents of nationalisms and the kind of standardized languages they are based on are ultimately arbitrary. Benedict Anderson depicted a related scenario in explaining the rise of nationalism, in which language also plays a central role (Anderson, 1991). Anderson shifted the focus from industrialization to print capitalism in arguing that the circulation of books, newspapers, and pamphlets in vernacular languages in Europe and its settler colonies in the Americas since the early modern period led to the creation of ‘imagined communities’ of nations transcending face-to-face interaction in new circuits of mass mediated communication. Thus, nations are essentially conceived as reading publics separated from each other by the use of different print vernaculars.
The association of ethnolinguistic nationalism with the relative decline of sacred languages and the rise of standardized vernaculars is a common theme in these approaches, postulating a close relationship between language and nationalism while explaining the latter as the result of a transformation of patterns of mass communication under conditions of capitalism or industrialization. In this context, both Anderson and Gellner tended to treat the spread of standardized vernaculars more as a matter of functional and administrative convenience than as a political process driven at least in part by ideologies of linguistic difference. In Anderson’s view, language is important because print languages function as channels of communication regimenting access to and exclusion from the national community, while constituting a shared medium through which ideas of the nation can be disseminated. Both Anderson and Gellner

tended to downplay how linguistic ethnonationalism privileges language as an ideological site where the imagination of the nation is produced and boundaries of national communities are conceived and formulated. The ways in which language, especially standardized vernacular language, emerges as an ideological prism through which membership of such communities is defined and justified are not just confined to the communicative separation of different reading publics or social aggregates of industrialism due to vernacular linguistic difference. Also, vernacular standardization has often been not only a matter of administrative convenience but also an eminently political process, mediating membership in and exclusion from national communities. Anderson, for example, described the rise of standard French and English as relatively unplanned and driven by ‘pragmatic’ reasons (1991: 40–42), also stating that in 19th-century France and England, “for quite extraneous reasons, there happened to be, by mid-century, a relatively high coincidence of language-of-state and language of the population” (1991: 78). However, the political and military events behind such coincidence need also to be addressed in an understanding of linguistic ethnonationalism. As the work of Eugen Weber shows, a highly ideologized state-driven process of linguistic unification and suppression of regional languages was still ongoing in late 19th-century France, as “the Third Republic found a France in which French was a foreign language to half its citizens” (Weber, 1976: 70). That is, the spread of standardized vernaculars was also furthered through coercion against those now deemed minorities or foreigners in a national community conceived through the lens of political representations of linguistic difference.
Thus, Anderson’s important work on the rise of the nation as a consequence of new forms of mass mediated communication needs to be complemented with a stronger emphasis on politically charged ideas of linguistic differentiation as motivating forces in the construction of new national communities.

Linguistic Anthropological Approaches

The relationship between linguistic practice and senses of nationhood has been conceived in different terms in linguistic anthropology. Dell Hymes and John Gumperz argued in the 1960s that the use of a linguistic variety as a shared medium of communication does not necessarily turn those sharing a linguistic code into a community or ethnic unit. This tradition of skepticism regarding integrationist approaches assigning shared standardized vernaculars a key role in the formation of modern nations puts contemporary linguistic anthropology at odds

with the revival of the assumption in contemporary theories of nationalism that ‘shared language’ results in the creation of groupness. Instead, linguistic anthropologists have emphasized the importance of ideological mediation in establishing a link between language and nationality. In a research paradigm of ‘language ideology’ (Silverstein, 1979; Woolard and Schieffelin, 1994) linguistic anthropologists have described how such politically charged ideas about language and its users have resulted in the projection of national communities (Gal, 1993; Silverstein, 1996; Schieffelin et al., 1998; Errington, 2000). While concurring with the ‘modernist’ approach to nationalism in treating nations as dynamically constructed cultural phenomena of relatively recent origin, they also extend this perspective to the construction of languages and linguistic communities. Linguistic varieties and linguistic communities are not simply present ‘on the ground,’ ready to be made use of by modern systems of mass communication to generate a sense of shared nationhood; they are frequently themselves the outcome of scholarly and administrative construction informed by nationalist ideologies (Gal, 1995). That is, linguistic anthropologists have insisted that vernacular standardization is a political process, which is often the result of the rise of the nation rather than merely its precondition. Urla’s study of Basque linguistic ethnonationalism is a case in point, not only showing that linguistic standardization of Basque was in part motivated by an emerging Basque nationalism, but also that a focus on Basque became central to Basque nationalism precisely because many of those considered members of a Basque nation turned out to have little knowledge of it.
The sense of a language being ‘threatened’ can thus emerge as a catalyst for linguistic ethnonationalism, and the language portrayed as an emblem of the nation may not always be the language employed in the systems of mass communication used for national mobilization and the creation of a national public (Urla, 1988, 1993). The latter point is especially important in assessing approaches to linguistic ethnonationalism proposed by political scientists explaining it as a reaction to blocked social mobility and political exclusion due to imposed linguistic barriers in state institutions, as Inglehart and Woodward (1972) did for the 19th-century Austro-Hungarian monarchy and, more recently, DeVotta (2004) did for Sri Lanka. According to this interpretation, linguistic ethnonationalism arises when members of a group sharing a vernacular language, often a minority or otherwise subordinate group within a state, face discrimination when the state they live in imposes another linguistic variety

as the sole medium of education and administration. As a counterreaction, influential members of the group start to agitate for a separate nation-state whose institutions would function in their vernacular. The new nation-state to be created would thus provide the opportunities for social mobility that are denied by the institutions of the state that members of such a minority have so far lived in. However, this model does little to account for frequently widespread bi- or multilingualism among linguistic minorities, nor for those scenarios where knowledge of and especially literacy skills in the dominant language of state are more common among members of a minority than corresponding skills in the minority language promoted as the emblem of a new nation. As Urla wrote about the origins of Basque linguistic ethnonationalism in the late 19th century, “when Sabino Arana y Goiri, the son of a Bilbao industrialist and founder of the PNV (Partido Nacionalista Vasca, Basque Nationalist Party), declared Euskera to be one of the defining features of the Basque nation, the first thing he had to do was to learn it himself” (Urla, 1993: 822). The Basque case exemplifies such a situation where bilingual and in many instances predominantly Spanish-using members of a Basque middle class did not face linguistic exclusion in a previously Spanish-only system of education and administration. The model also does little to explain why, as in the Basque and the Austro-Hungarian cases, precisely those multilingual urban intellectuals with a full command of the dominant language of state were the initiators of separatist linguistic ethnonationalisms rather than the monolingual rural populations who were often portrayed as exemplary members of the new nations to be created. In other words, linguistic ethnonationalism needs to be taken seriously on its own terms, as the imagination and construction of communities through images of language and linguistic difference.
It thus has to be understood as a politically creative force in its own right and as a form of regulating social life, and not be treated as the epiphenomenon of an essentially nonlinguistic political process of competition over scarce resources and state power. Also, the national boundaries thus created through ideologies of language cannot always be reduced to the boundaries between different reading publics, which, according to Anderson, rest on the “fatality of human linguistic diversity” (Anderson, 1991: 43). It is important to realize that linguistic ethnonationalism as a social and political force has frequently reshaped regimes of linguistic diversity by mapping social and political differentiation onto perceived linguistic differences in novel ways. For some scholars, this circumstance has been nowhere more clearly illustrated than in

European attempts at linguistic description and reorganization in the colonial world.

The Colonial Context and the Worldwide Spread of Linguistic Ethnonationalism

An important question in assessing the global spread of Herderian ideologies of linguistic ethnonationalism is whether the thesis of the modularity of nationalism generally postulated by Anderson (1991: 4) can also be extended to linguistic ethnonationalism. As such, the question of linguistic ethnonationalism in colonial and postcolonial contexts is bound up with wider debates about whether colonial and postcolonial nationalism exhibits irreducible differences from the European–American and European nationalisms that were its first historical instances, as, for example, suggested by Partha Chatterjee (1993). European colonial governments as well as other European nonstate actors active in the colonial world, such as missionary societies, embarked on extensive projects of identifying, describing, classifying, and standardizing vernacular languages of their subject populations, which frequently led to large-scale transformations of the linguistic and political situation in colonies (see also Errington, 2001). One of the main ways in which such colonially induced transformations came about was the introduction of new ideas of linking social to perceived linguistic differentiation. In particular, the Herderian principle of linguistic ethnicity, according to which a population is constituted as an ethnic group by virtue of sharing a common vernacular language, informed such linguistic classificatory work and language construction in the colonies. A second assumption derived from the European trajectory of linguistic ethnonationalism also had a large impact on the colonial reorganization of linguistic landscapes: the idea that a vernacular linguistic tradition necessarily has a standard form. Wherever this standard was not readily discernible, the task was to find and identify the ‘real’ standard form of vernacular varieties, which inevitably resulted in language construction.
Colonial India provides a good example of how both these tendencies set off political processes whose outcomes did not always conform to the intentions of the colonizers. For example, the quest for a ‘language of command’ for northern and central India led scholars of Fort William College in Calcutta to search for the ‘real Hindustani,’ knowledge of which was expected to facilitate the rule and administration of large parts of India. Undeterred by the initial puzzlement of their Indian subjects, whose visions of linguistic differentiation were largely dominated by an opposition of standardized sacred (Sanskrit, Arabic) and imperial (Persian [Farsi]) languages on one hand, and largely unstandardized local vernacular varieties on the other, colonial scholars and administrators constructed a standard Hindustani. They thus contributed to the emergence of the Hindi–Urdu conflict, the partition of India, and the creation of Modern Standard Hindi as a national language of India later on (Cohn, 1985; Lelyveld, 1993). In India, the introduction of the Herderian principle of linguistic ethnicity through the colonial mapping of vernacular languages was institutionalized as an alternative to previously established modes of conceiving linguistic differentiation, above all in terms of links between ranked endogamous descent groups (jati) and linguistic varieties in eminently local contexts (Washbrook, 1991). This also turned out to have important consequences for postcolonial politics. Linguistic ethnicity in the end provided the basis for political claims for distinct ‘homelands’ for users of particular standardized linguistic varieties, which resulted in a reorganization of federal states along assumed vernacular linguistic lines in postcolonial India. One of the ironies of the global spread of linguistic ethnonationalism is that in many instances this new political principle originated in colonial attempts at linguistic classification and language construction, which were intended to facilitate and legitimate colonial rule, only to provide a platform for the emergence of anticolonial nationalisms later on. The colonial creation of a standard Malay, for example, not only provided a unified administrative language for a large and populous archipelago with great linguistic diversity, but also facilitated the emergence of an anticolonial public among local elites, who subsequently used the new language, now labeled Indonesian, against their Dutch colonizers, turning it into a key emblem of a newly imagined Indonesian nation whose citizens were expected to become users of the language (Errington, 2000).
Related processes were at work in colonial Africa, where the construction of standardized vernaculars for missionary or administrative purposes often led colonial administrators to assign populations distinct ‘ethnic’ or ‘tribal’ identities on the basis of such newly defined linguistic affiliations. These practices had important political consequences once such classifications became established and accepted by large numbers of those subjected to such colonial mapping of social and political difference (Harries, 1988; Irvine, 1993; Meeuvis, 1999). Linguistic anthropologists have seen in these colonial policies particularly illustrative instances of the creative power of linguistic ideologies, where politically charged images of language and linguistic differentiation motivated the creation of new social aggregates with claims to ethnic and national identities (Irvine and Gal, 2000). Nevertheless, it would be inadequate to speak of a simple imposition of European assumptions of linguistic ethnicity and linguistic ethnonationalism by colonial actors on colonized populations, since colonial vernacular language standardization and language construction often resulted in cultural forms profoundly different from European understandings of linguistic ethnonationalism. Sumathi Ramaswamy’s study of Tamil language-focused nationalism provides a vivid analysis of how colonial perspectives interacted with local visions of language and group identities in unforeseen ways (Ramaswamy, 1997). In taking up European characterizations and comparative analyses of Tamil, Tamil activists linked the Herderian idea of linguistic ethnicity to established notions of devotional practice. They thus arrived at a vision of the Tamil language as a female deity, Tamiltay, and Tamils as a family of faithful children determined to protect a language simultaneously conceived as deity and mother. Ramaswamy has argued that this colonial and postcolonial form of linguistic ethnonationalism should better be understood as ‘language devotionalism.’ The Tamil scenario provides a compelling case for reexamining and refining Anderson’s thesis of the modularity of nationalism and its language-based aspects. Ramaswamy’s work suggests that the spread of nationalism in the colonial and postcolonial worlds should be conceptualized as the result of complex articulations of European understandings of nationality and linguistic belonging with precolonial traditions of establishing links between linguistic practice and social identifications, rather than as a directly modular adoption of European forms of belonging.

Contemporary Movements of Language Renewal and Linguistic Ethnonationalism

Linguists have recently voiced concern about the increasing loss of linguistic diversity in the contemporary world, seeking to mobilize public opinion in Western countries in favor of supporting the continued use of those languages with often smaller numbers of users undergoing language shift. In doing so, they have often drawn on discourses of biodiversity, literally describing these languages as ‘endangered’ species which need to be saved from extinction. Another theme in professional discourses of ‘endangered’ languages is a cosmopolitan concern to prevent the loss of intellectual production that would benefit humanity in its entirety, a loss feared by many as a consequence of the eventual demise of many lesser-used languages. However, actual investigation of the social and political contexts of more successful instances of reversing language shift shows that such cosmopolitan views about the necessity to preserve knowledge for ‘humanity’ or the benefits of linguistic diversity portrayed as equivalent to those of biodiversity have played only minor roles. Instead, language activists have often rallied in favor of continued use of a language in response to perceived threats to positively valued forms of groupness, often cast in ethnic or national terms. Thus, such movements for reversing language shift have frequently drawn on the Herderian theme of a necessary link between a particular linguistic tradition and the constitution of a population as an ethnic group with the ethnonationalist claims often implied in a politics of recognition. Furthermore, such contemporary language renewal movements also represent an interesting parallel to earlier scenarios of linguistic ethnonationalism in their extensive use of mass-mediated communication, now also in its electronic and digital forms (Eisenlohr, 2004). First, for example, in digitally recording linguistic practice in an ‘endangered’ language for purposes of documentation and instruction, language activists and linguists also engage in the production and archiving of electronic artifacts in a way that recalls earlier modern forms of storing and displaying the heritage of a ‘people,’ such as the museum and the archive (Silverstein, 2003). Such techniques of collecting and curating have in turn been closely linked to the history of the rise of nationalism, the display and archiving of ‘traditional’ objects and art forms being one of the ways of producing the imagined community of the nation (Anderson, 1991).
In creating and selecting documented linguistic material for use in projects of language revitalization, language activists and supporting linguists often define the linguistic ‘tradition’ of the people who are interested in the renewal of a language they identify as ‘theirs.’ They thus play a key role in constructing boundaries of communities through representations of linguistic difference in a manner recalling the roles played by lexicographers, folklorists, and grammarians in 19th-century European nationalism (Anderson, 1991; Hobsbawm, 1990). Second, in their attempts to create mass-mediated publics for users of lesser-used languages, language activists have often sought to promote the use of such varieties by replicating the creation of new forms of community and political solidarity often associated with the circulation of mass-mediated discourse in a shared vernacular. At the same time, new electronic practices of mass mediation are not just ways of disseminating and circulating discourse differently; in the eyes of its consumers, mediated discourse is also often invested with particular qualities, since such practices of mass mediation are frequently associated with notions of modernity and sophistication. Language activists frequently draw on such evaluations of electronically mass-mediated discourse by seeking to transform valuations of a lesser-used language by using it on television, radio, CD-ROM, DVD, and the Internet. Thus, the use of languages to be revitalized in electronic media is aimed not only at the creation of new publics but also at changing images of the language among its users through a demonstration that the linguistic variety and its users are indeed part of a modern world. Nevertheless, the example of language activism on behalf of lesser-used languages shows that linguistic ethnonationalism can also focus on ‘heritage’ or ‘ancestral’ languages, which are only rarely used in everyday interaction or in practices of literacy, as is for example the case among several Native American groups in the United States and Canada. Such scenarios, not infrequently found among users of a language in advanced stages of language shift, demonstrate that given the appropriate ideological background, the particular linguistic variety adopted as an emblem of ethnonationality need not always be related to those varieties used either in vernacular practice or in mass mediation of discourse.

Bibliography

Anderson B (1991). Imagined communities: reflections on the origin and spread of nationalism. London: Verso.
Chatterjee P (1993). The nation and its fragments: colonial and postcolonial histories. Princeton, NJ: Princeton University Press.
Cohn B S (1985). ‘The command of language and the language of command.’ In Guha R (ed.) Subaltern studies: writings on South Asian history and society 4. Delhi: Oxford University Press. 276–329.
Deutsch K W (1953). Nationalism and social communication: an inquiry into the foundations of nationality. Cambridge, MA: MIT Press/New York: John Wiley.
DeVotta N (2004). Blowback: linguistic nationalism, institutional decay, and ethnic conflict in Sri Lanka. Palo Alto, CA: Stanford University Press.
Eisenlohr P (2004). ‘Language revitalization and new technologies: cultures of electronic mediation and the refiguring of communities.’ Annual Review of Anthropology 33, 21–45.
Errington J (2000). ‘Indonesian(’s) authority.’ In Kroskrity (ed.). 205–228.
Errington J (2001). ‘Colonial linguistics.’ Annual Review of Anthropology 30, 19–39.
Gal S (1993). ‘Diversity and contestation in linguistic ideologies: German speakers in Hungary.’ Language in Society 22, 337–359.
Gal S (1995). ‘Lost in a Slavic sea: linguistic theories and expert knowledge in 19th century Hungary.’ Pragmatics 5, 155–166.
Gellner E (1983). Nations and nationalism. Ithaca, NY: Cornell University Press.
Harries P (1988). ‘The roots of ethnicity: discourse and the politics of language construction in Southeast Africa.’ African Affairs 87, 25–54.
Herder J G ([1784–1791] 1968). Reflections on the philosophy of the history of mankind. Chicago: University of Chicago Press.
Hobsbawm E J (1990). Nations and nationalism since 1780: programme, myth, reality. Cambridge: Cambridge University Press.
Inglehart R & Woodward M ([1967] 1972). ‘Language conflicts and the political community.’ In Giglioli P P (ed.) Language and social context. Harmondsworth: Penguin.
Irvine J (1993). ‘Mastering African languages: the politics of linguistics in nineteenth century Senegal.’ Social Analysis 33, 27–46.
Irvine J & Gal S (2000). ‘Language ideology and linguistic differentiation.’ In Kroskrity (ed.). 35–84.
Kroskrity P V (ed.) (2000). Regimes of language: ideologies, polities and identities. Santa Fe: School of American Research Press.
Lelyveld D (1993). ‘The fate of Hindustani: colonial knowledge and the project of a national language.’ In Breckenridge C A & van der Veer P (eds.) Orientalism and the postcolonial predicament. Philadelphia: University of Pennsylvania Press. 189–214.
Llobera J (1994). The god of modernity: the development of nationalism in Western Europe. Oxford: Berg.
Meeuvis M (1999). ‘Flemish nationalism in the Belgian Congo versus Zairian anti-imperialism: continuity and discontinuity in language ideological debates.’ In Blommaert J (ed.) Language ideological debates. Berlin: Mouton de Gruyter. 381–424.
Ramaswamy S (1997). Passions of the tongue: language devotion in Tamil India. Berkeley: University of California Press.
Schieffelin B B, Woolard K A & Kroskrity P V (eds.) (1998). Language ideology: practice and theory. Oxford: Oxford University Press.
Silverstein M (1979). ‘Language structure and linguistic ideology.’ In Clyne P R, Hanks W F & Hofbauer C L (eds.) The elements: a parasession on linguistic units and levels. Chicago: Chicago Linguistic Society. 193–247.
Silverstein M (1996). ‘Monoglot “standard” in America: standardization and metaphors of linguistic hegemony.’ In Brenneis D & Macaulay R K S (eds.) The matrix of language: contemporary linguistic anthropology. Boulder, CO: Westview Press.
Silverstein M (2003). ‘The whens and wheres – as well as hows – of ethnolinguistic recognition.’ Public Culture 15, 531–558.
Smith A D (1989). ‘The origins of nations.’ Ethnic and Racial Studies 12(3), 340–367.
Urla J (1988). ‘Ethnic protest and social planning: a look at Basque language revival.’ Cultural Anthropology 3, 379–394.
Urla J (1993). ‘Cultural politics in an age of statistics: numbers, nations and the making of Basque identity.’ American Ethnologist 20(4), 818–843.
Washbrook D (1991). ‘“To each a language of his own”: language, culture and society in colonial India.’ In Corfield P J (ed.) Language, history and class. Oxford: Blackwell. 179–203.
Weber E (1976). Peasants into Frenchmen: the modernization of rural France, 1870–1914. Palo Alto, CA: Stanford University Press.
Woolard K A (2004). ‘Is the past a foreign country? Time, language origins, and the nation in early modern Spain.’ Journal of Linguistic Anthropology 14(1), 57–80.
Woolard K A & Schieffelin B (1994). ‘Language ideology.’ Annual Review of Anthropology 23, 55–82.

Linguistic Features
G G Corbett, University of Surrey, Surrey, UK
© 2006 Elsevier Ltd. All rights reserved.

When attempting to model and to understand the complexity of natural language, linguists typically have recourse to features. At the simplest level, features are used to factor out common properties. We can recognize a feature NUMBER, with the value ‘plural,’ as present in forms like books, loaves, men, oxen. Using a feature captures the idea that the plural items are in some sense the same (for example, for agreement purposes, since all take these rather than this) even though number is realized differently. Other examples include TENSE (present, past, etc.) and PERSON (1st, 2nd, 3rd). Features show consistency across entities, and to some extent across languages. They have proved invaluable for analysis and description, and have a major role in contemporary linguistics, from the most abstract theorizing to the most applied computational work. Our examples have been of morphosyntactic features. Features may also be phonological (specifying, for example, the height or backness of a vowel), morphological (specifying the inflectional class of an item), syntactic (for syntactic categories such as V or N) or semantic (such as ANIMACY).
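
The idea that differently realized plurals share one feature value can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the mini-lexicon, the category label DET, and the function names agrees and select_determiner are hypothetical and are not taken from any particular framework. It shows how a single NUMBER value lets books, loaves, men, and oxen all select these rather than this.

```python
# Hypothetical mini-lexicon: each form maps to a bundle of feature-value pairs.
# The feature names follow the text (NUMBER); CAT and the label "DET" are
# assumptions made for this illustration.
LEXICON = {
    "book":   {"CAT": "N",   "NUMBER": "singular"},
    "books":  {"CAT": "N",   "NUMBER": "plural"},   # suffix -s
    "loaves": {"CAT": "N",   "NUMBER": "plural"},   # stem change plus suffix
    "men":    {"CAT": "N",   "NUMBER": "plural"},   # vowel change
    "oxen":   {"CAT": "N",   "NUMBER": "plural"},   # suffix -en
    "this":   {"CAT": "DET", "NUMBER": "singular"},
    "these":  {"CAT": "DET", "NUMBER": "plural"},
}


def agrees(determiner: str, noun: str, feature: str = "NUMBER") -> bool:
    """Return True if both forms carry the same value for the given feature."""
    return LEXICON[determiner][feature] == LEXICON[noun][feature]


def select_determiner(noun: str) -> str:
    """Return the demonstrative whose NUMBER value matches that of the noun."""
    for form, features in LEXICON.items():
        if features["CAT"] == "DET" and agrees(form, noun):
            return form
    raise ValueError(f"no agreeing determiner for {noun!r}")
```

For instance, select_determiner("oxen") and select_determiner("books") both return "these", although the two nouns mark plural in different ways; the agreement rule refers only to the feature value, never to the form of the marker.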

The notion of feature emerged in discussions on the nature of the phoneme, particularly by Trubetzkoy and Jakobson, and this research was crystallized in Jakobson et al. (1952). In the 1960s, features were given an important place in lexical semantics (Katz and Fodor, 1963), in syntax (Harman, 1963; Katz and Postal, 1964; and Chomsky, 1965) and in morphology (Matthews, 1965). A major development was Generalized Phrase Structure Grammar (Gazdar et al., 1985), which brought together the linguistic work of Stockwell, Schachter, and Partee with the formal work of Martin Kay in an attempt at a fully articulated theory of features. In subsequent years, features have taken on an even greater role, yet conceptualization has lagged behind. The notion of default has been employed widely in linguistics, but with varying interpretations (distinguished in formal terms in Fraser and Corbett, 1997). In Lexical Functional Grammar and Head-Driven Phrase Structure Grammar, work on unification is central to the use of features. In Government and Binding Theory and Minimalism, the notion of checking performs a similar but significantly different role (see, for example, Chomsky, 2000). Even though the theoretical machinery available has been extended by the introduction of typed feature structures (see Carpenter, 1992), we still lack a convincing account for some basic phenomena. However, there are encouraging moves to bring formal accuracy together with a range of interesting data, to bring us closer to an adequate theory of linguistic features.
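
Since unification is named above as central to the use of features, a minimal sketch may help to fix the idea. The following Python function is an illustration under simplifying assumptions, not the formalism of any of the theories mentioned: feature structures are modelled as plain nested dictionaries, there is no structure sharing (reentrancy), and there are no types in the sense of Carpenter (1992). The name unify and the feature AGR are hypothetical.

```python
from typing import Optional


def unify(a: dict, b: dict) -> Optional[dict]:
    """Unify two feature structures given as (possibly nested) dicts.

    Returns a new dict containing the information of both inputs, or None
    if the inputs assign conflicting values to the same feature.
    """
    result = dict(a)  # shallow copy so that the input is not modified
    for feature, b_val in b.items():
        if feature not in result:
            # Only one structure specifies this feature: just add it.
            result[feature] = b_val
            continue
        a_val = result[feature]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            # Both values are complex: unify them recursively.
            sub = unify(a_val, b_val)
            if sub is None:
                return None
            result[feature] = sub
        elif a_val != b_val:
            # Conflicting atomic values (or atom versus structure): failure.
            return None
    return result
```

Under these assumptions, unify({"NUMBER": "plural"}, {"NUMBER": "plural", "PERSON": "3rd"}) yields {"NUMBER": "plural", "PERSON": "3rd"}, whereas unify({"NUMBER": "singular"}, {"NUMBER": "plural"}) yields None. Agreement can then be stated as the requirement that the agreement features of two items unify.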

See also: Case; Feature Organization; Gender; Number; Tense.

Bibliography

Carpenter B (1992). The logic of typed feature structures: with applications to unification grammars, logic programs and constraint resolution. Cambridge: Cambridge University Press.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky N (2000). ‘Minimalist inquiries: the framework.’ In Martin R, Michaels D & Uriagereka J (eds.) Step by step: essays on minimalist syntax in honor of Howard Lasnik. Cambridge, MA: MIT Press. 89–155.
Fraser N M & Corbett G G (1997). ‘Defaults in Arapesh.’ Lingua 103, 25–57.
Gazdar G, Klein E, Pullum G K & Sag I A (1985). Generalized Phrase Structure Grammar. Oxford: Blackwell.
Harman G (1963). ‘Generative grammars without transformational rules: a defense of phrase structure.’ Language 39, 597–616.
Jakobson R C, Fant G & Halle M (1952). Preliminaries to speech analysis: the distinctive features and their correlates. Cambridge, MA: MIT Press.
Katz J J & Fodor J A (1963). ‘The structure of a semantic theory.’ Language 39, 170–210.
Katz J J & Postal P M (1964). An integrated theory of linguistic descriptions. Cambridge, MA: MIT Press.
Matthews P H (1965). ‘The inflectional component of a word-and-paradigm grammar.’ Journal of Linguistics 1, 139–171.

Acknowledgment

The support of the ESRC under grant RES051270122 is gratefully acknowledged.

Linguistic Habitus
I Gogolin
© 2006 Elsevier Ltd. All rights reserved.
This article is reproduced from the Concise Encyclopedia of Sociolinguistics, pp. 650–652, © 1994, Elsevier Ltd.

Pierre Bourdieu’s (1981) concept of habitus denotes a modality that enables the individual to act routinely as well as creatively and innovatively. Central to Bourdieu’s theory is the attempt to describe the dynamic relationships between the structural conditions of an individual existence, the individual’s activities as a product of socialization under these conditions, and the open-ended yet strictly limited capacity of the individual for action. In the process of socialization (see Socialization), a system of permanent dispositions is created in the individual, including sensitivity as one prerequisite for personal development. This system of dispositions and sensitivity is a necessary precondition for successful social activity. Bourdieu emphasizes a circularity between ‘structure,’ ‘habitus,’ and ‘practice.’ Habitus functions as an awareness-matrix, an action-matrix, and a

thought-matrix for the individual; however, it does not alone determine behavior. A habitus is acquired under a certain set of social conditions, which Bourdieu calls ‘objective structures.’ To these belong the existential requirements that characterize and define a social class. A habitus is therefore generated and regenerated by the specific objective structures of a class, and at the same time defines and redefines, generates and regenerates the lifestyle and practice of its members. Bourdieu describes this circularity as follows: habitus function as “structured structures, which are suitable to function and work as structuring structures” (Bourdieu, 1979: 167). Implicit in a habitus is a tendency toward self-stabilization, despite the fact that socialization is not completed until death. New experiences are integrated into a habitus, so that its form changes constantly while remaining relatively stable. The term ‘linguistic habitus’ thus refers to the set of dispositions in the field of language: to a person’s notion of linguistic ‘normality’ and of ‘good’ language, and to a society’s notion of ‘proper’ ways of language behavior and of ‘legitimate’ language variations and practice. The term does not describe language as a means of communication in a narrow sense, but refers primarily to the symbolic relations and signs by which language becomes a medium of power. Gogolin (1994) defines the monolingual habitus, which is common to the classical European nation-states, as an example of a linguistic habitus. In the process of their foundation in the 18th and 19th centuries, the basic and deep-seated belief was created that monolingualism is the universal norm for an individual and for a society (cf. Hobsbawm, 1990). This idea was disseminated and traditionalized by institutions of the nation-state: jurisprudence, military forces, and political and administrative bodies.
Probably most influential for the creation and dissemination of these fundamental elements of the concept of nation-state were the modern state school systems, which were established at the same time as the nation-states themselves. Linguistic homogenization was the main motive for the development of public education systems. The establishment of one national language and of a monolingual national society was (and often still is) seen as essential for the success of the nation-state, especially at the economic level. In reality, hardly any nation in the world has ever had a monolingual population. In most countries – especially in the Australasian, Asian, and African world, but also in Europe – people speak more than one language, and often many (see Bilingualism). Despite this reality, the deep-seated belief in monolingualism as natural in a society governs individual and public opinions towards language and language practice. In the case of the European nation-states, this preference becomes obvious in connection with immigration. Roughly one-third of the people under the age of 35 years in the member states of the European Union are migrants, speaking languages other than the national language(s) of the respective state (see Migration and Language). For example, in the year 2000, more than 350 languages were spoken by children in London schools. Nevertheless, the European state school systems act as if they had monolingual populations. The teaching is centered on the ‘legitimate’ language of the state or the area, that is to say the national language or the official language of the region. The other languages are rarely taught at schools; children remain illiterate in the second or third languages they live in. Consequently, they are unable to develop their other languages to the most elaborated state (see Bilingual Education). By these mechanisms, the hierarchy and power relations between languages become stabilized. Those who have little or no access to the legitimate language are at risk of being excluded from participation and equal success in a society. The linguistic habitus is the governing mechanism of these processes. The notion of linguistic normality was traditionalized in history, despite multilingualism. There is no collective remembrance of these historical events; the fact that this notion once was created and implemented explicitly has disappeared from collective memory. According to Bourdieu’s theory, a habitus functions better the less conscious its owner is of it. A variation of the monolingual habitus seems to be at work in multilingual nation-states as well, because there is usually one selected language of power. This language need not be a national language.
In colonial or postcolonial situations especially, the language of power is very often not a national language with constitutionally legitimate status. Instead, it is the language that guarantees survival and success in the linguistic market – mainly the language of the former colonists. Whereas in Europe the standard national languages are usually considered the legitimate variant, in colonial situations symbolic power is linked to the languages of the former oppressors (cf. Goke-Pariola, 1993). The linguistic habitus ensures the relative stability of these situations, and thus contributes to the durability of societies with marked disparities in power between different social groups.

See also: Bilingualism; Bilingual Education; Migration and Language; Socialization.


Bibliography

Bourdieu P (1979). Entwurf einer Theorie der Praxis. Frankfurt am Main: Suhrkamp.
Bourdieu P (1981). ‘Structures, strategies, and habitus.’ In Lemert C C (ed.) French sociology: rupture and renewal since 1968. New York: Columbia University Press.
Crystal D (1987). The Cambridge encyclopaedia of language. Cambridge: Cambridge University Press.
Gogolin I (1994). Der monolinguale Habitus der multilingualen Schule. Münster: Waxmann.
Goke-Pariola A (1993). ‘Language and symbolic power: Bourdieu and the legacy of Euro–American colonialism in an African society.’ Language and Communication 13(3), 219–234.
Hobsbawm E J (1990). Nations and nationalism since 1780: programme, myth, reality. Cambridge: Cambridge University Press.

Linguistic Influence
H Gottlieb, University of Copenhagen, Copenhagen, Denmark
© 2006 Elsevier Ltd. All rights reserved.

No Language Is an Island

All languages borrow lexical and grammatical features from each other, and down through the ages, English has borrowed intensely from a host of other languages, dead (such as Latin and Greek) or alive (e.g., Italian, French, and German). French loanwords now constitute one-third of the lexical inventory of English, a fact that has led to the epithet ‘a semi-Romance language’ (McArthur, 2002: 135). Although English today is better known as a linguistic donor than a receiver, English keeps adopting and adapting foreign words, many of which are later re-exported to other speech communities. As a case in point, several exotic late-20th-century food terms, e.g., taco and tortilla (originally from Spanish) and sushi (from Japanese), have reached European kitchens and dictionaries after stopovers in America or Britain. This phenomenon is an obvious parallel to relay translation – as one finds in, for instance, the interpreting services of EU bodies – where language contacts between minor languages are typically mediated by a more dominant language, most often English or French. In the following section, we will look at some of the mechanics of linguistic influence in general, providing a backdrop for a more elaborate discussion of the English influence on other languages, by far the most conspicuous and wide-ranging type of linguistic influence found today.

Borrowing from Other Languages: Blessing or Curse?

One question that immediately springs to mind when discussing linguistic influence is how such influence comes about. What does it take to export language features, or – viewed from the receiving end – why and how are loans introduced in a language? In answer to these questions, I will cite two experts in historical linguistics, Lars-Erik Hedlund and Birgitta Hene. In their book Lånord i svenskan (Hedlund and Hene, 1992: 70), they establish the following taxonomy of the raisons d’être of lexical innovation through borrowing (all terms are translated by me), summarized in Table 1. In contrast to this descriptive and quite complex model, a prescriptive, somewhat simplistic view distinguishes between loans that are needed and loans that are superfluous. As concluded in a pioneering Danish work on Anglicisms (Sørensen, 1973: 131):

Table 1 The reasons behind lexical borrowing
| Background | Motivation | Function |
| A. Voids in the language | 1. Name new phenomenon | Verbalization |
| A. Voids in the language | 2. Generalize/specify | Verbalization |
| A. Voids in the language | 3. Express oneself in neutral terms | Information |
| A. Voids in the language | 4. Express a value statement | Expressiveness |
| A. Voids in the language | 5. Create certain associations | Persuasion |
| A. Voids in the language | 6. Add humorous effect | Entertainment |
| A. Voids in the language | 7. Avoid sounding repetitive | Rhetorical appeal |
| A. Voids in the language | 8. Obtain a more handy expression | Ease |
| A. Voids in the language | 9. Express personal or group identity | Psychosocial marker |
| B. Voids in the sender/receiver | 1. Compensate personal lexical voids | Creating a message |
| B. Voids in the sender/receiver | 2. Compensate presupposed voids in the audience | Getting the message across |
| C. Foreign-language original | 1. Represent a foreign culture in translation | All above functions |


As legitimate loans we may accept those words and phrases which serve a sensible purpose, i.e. those that primarily fulfil an informative function. As opposed to this, one should be wary of loans suggested for prestige purposes. [my translation]

However, even superfluous, prestige-driven loans tend to carve semantic niches for themselves (Gottlieb, 2004), and, moreover, it is not easy to judge which motivation(s) may be at play in the borrowing process. In a recent discussion on German loanwords in English – high-register items such as Zeitgeist and Sprachgefühl – American-based lexicographer A. J. Meier bridges the need-or-prestige dichotomy with these words:

The line between need and prestige, however, can be somewhat obscure, given the tendency for foreign words to belong to a more “educated register.” Indeed, I would submit that, by virtue of its foreignness, a word attains greater saliency and thus, to some extent, is imbued with greater expressive power, a power concordant with both need and prestige. (Meier, 2000: 169)

As any discussion concerning linguistic borrowing is bound to involve notions of cultural and linguistic power, the debate tends to become emotionally heated. Internationally, the public debate oscillates between the following statements and claims (Table 2). Still, despite the arguments against borrowing – nowadays typically from English – most speech communities continue doing so with increasing speed; even in self-protective societies such as Iceland, purist measures have a limited effect. Foreign words travel without passports, and as in politics, power is in the hands of laymen, not experts.

The Semantic Functions of Borrowings

Having gained ground, borrowings fall into the following semantic categories:

1. Additions. Borrowed expressions refer to new phenomena in the world outside the speech communities adopting them: in many Germanic languages, the Anglicism AIDS (or aids) came with the disease, so to speak. France managed with a loan translation of the four lexical elements (Acquired Immuno-Deficiency Syndrome), thus coming up with SIDA.
2. Replacements. Anglicisms and other -isms often pop up in situations where the entities they refer to already exist in the domestic language: at present, Danish sceneskræk (a direct translation of the English ‘stage fright’) is gaining ground at the expense of lampefeber (from German Lampenfieber). This replacement of a Germanism by an Anglicism is typical of the recent development in minor Germanic speech communities; ironically, in this case the two German components are both found in English (‘lamp’ + ‘fever’), whereas the English-inspired elements scene and skræk were originally borrowed from French (scène) and German (Schreck), respectively.
3. Differentiators. Finally, by wedging themselves in, some borrowings may contribute to semantic differentiation: as rollemodel (from English ‘role model’) is now establishing itself in contemporary Danish usage, the existing term forbillede (after German Vorbild) seems to be acquiring a strictly metaphorical meaning. In other words, part of the semantic range of forbillede (that referring to persons) is presently being taken over by rollemodel, which can be attested by Danish corpus data. However, a borrowing of this sort, which in the beginning covers only a sub-sense of a domestic word, may one day take over the entire semantic field of that word. In that event, a differentiator turns into (or turns out to be) a replacement. In the case of rollemodel in Danish, this has not happened – yet.

Table 2 Standard arguments for and against linguistic borrowing
| For borrowing | Against borrowing |
| Facilitates learning of donor language | Impedes reading of national classics |
| Shortens distance between languages and cultures | Increases distance between generations and social groups |
| Provides expressive enrichment | Leads to linguistic impoverishment |
| Makes translation simpler | Kills the fascination with foreign languages and cultures |
| Fights chauvinism and provincialism | Paves the way for foreign cultural dominance |

Influence from English

For more than a century, English has been immensely successful in influencing languages worldwide. Still, English possesses very few features that fill existing semantic voids in other languages. Naturally, many concepts and products were introduced by English-speaking countries, companies, and individuals. But the fact that, for instance, many Japanese and European brands and product names are English – even names for domestic-market goods – shows that, in certain domains at least, an anglophone ring adds to the value of things. The English influence is generated by three interconnected factors:


1. Most opinion leaders worldwide are dependent on English as a second language, both for personal communication and for information gathering, and even children in a number of officially non-English-speaking countries rely heavily on their command of English when surfing the Internet, listening to rap and rock songs, and watching TV series and films (whether subtitled or not).
2. English is unrivaled as a cover-all language: it is usually the first language in which new technology, trends, and lifestyles are presented internationally, and a language increasingly used worldwide as a lingua franca, even when no native speakers of English are present. In virtually every country on the planet, English is now the first foreign language taught in school. Within the last two decades, Russian, French, Spanish, and German have all lost the status of primary foreign language outside the nations and regions in which they are officially spoken.
3. An increasing part of the media input in the modern world consists of texts translated or in some other way derived from English sources. In great parts of the world, most films, TV series, computer games, and novels are anglophone imports, all of this having a major impact on the national languages (Gottlieb, 1999, 2001).

National and regional languages are not just influenced by English, but in academic and business communication, they have de facto lost several domains. Today, in a number of countries, the credo ‘certain things are best expressed in English’ not only is heard among blasé cosmopolitans, but also is often uttered by government officials and businessmen, even by school children.

The Notion of Anglicism: Establishing a New Paradigm

In the various national debates concerning contemporary English-language influence, emotions often replace empirical facts, as ill-defined concepts are used for polemical effect. Below I will suggest a neutral terminology, including a typology to embrace all types of linguistic influence, with English as the influential language, and with Anglicism as a key concept. However, before defining this term, two related key concepts will be addressed:

1. English. In today’s language-political discussions, ‘English’ rarely refers to England, not even to Britain. The central reference here is, not surprisingly, the United States. In most speech communities, ‘influence from English’ means (linguistic) influence from the United States, the dominant anglophone nation. Since the breakthrough of sound film in Hollywood in the late 1920s, Britain has played second fiddle in the spread of Anglo-Saxon values and linguistic features, although in most European countries most teachers of English still try to emulate a British, i.e., RP, accent in their daily work.
2. Loan words. First, thousands of Anglicisms are not direct loanwords, but belong to one of the many other categories of Anglicism (see the typology below). Second – as goes for all borrowings – Anglicisms are not loans to be paid or handed back to the donor language at a later stage. Instead, they can be seen as either ‘stolen goods’ or rather as the fertile offspring of other speech communities’ planting of English seeds. The foreign soil is also the reason why, quite often, when working into English, the best translation of an Anglicism is not the same expression in English.

Traditionally, scholars have defined an Anglicism as “a word borrowed from the English language which is adapted with respect to the linguistic system of the receptor language and integrated into it” (Sicherl, 1999: 12). This definition should be praised for emphasizing that, once borrowed, loans – if they manage to survive – may remain forever with the borrowing language and may in that process change pronunciation, spelling, meaning, etc. However, two serious objections to this type of definition can be raised:

1. It is too narrow. It only looks at the most conspicuous elements of language: the individual lexical items. Morphological, syntactic, and other features are ignored.
2. It expresses a naive integrational paradigm, which is no longer generally valid, that of stable domestic language structures which eventually digest and integrate all (English) loans. What happens between English and other languages today points to a paradigm of systemic influence.
In an increasing number of speech communities, especially Germanic ones, English linguistic features – even grammatical ones – are now adopted, rather than adapted. And even in exotic speech communities under the spell of English, the common practice of code-shifting overrules potential gaps in language systems that would prevent direct transfer of English language features and norms.

What is, then, a reasonable definition of Anglicism? In order to cover the entire spectrum of present-day influence from English, the notion of Anglicism should be defined as ‘any individual or systemic language feature adapted or adopted from English, or inspired or boosted by English models, used in intralingual communication in a language other than English.’

Based on this definition, the taxonomy in Tables 4–6 encompasses all linguistic phenomena caused by English influence. As illustrated in Table 3, the taxonomy rests on a two-by-two categorization based on two distinctive features:

1. The distinction found in the Anglicism definition above, between, on the one hand, items that are either adopted (i.e., retained, and thus obviously of English heritage) or adapted (i.e., camouflaged or literally translated into the recipient language) and, on the other hand, items that are inspired or numerically boosted by English language phenomena.
2. The distinction between microlanguage items (including morphemes, phonemes, lexemes, phraseology, and syntax) and macrolanguage ones (phenomena found at clause, sentence, or text level).

The reason why my categorization has yielded a tripartite taxonomy – and not four main categories – is the following: I have not included reactive macrolanguage Anglicisms, a potential category that is almost impossible to operationalize, as it is hard to determine whether a given sentence or text (type) is inspired by English or not.

Table 3 Key parameters in categorizing Anglicisms
| | Subclause items | Clause, sentence, and text items |
| Adapted or adopted from English | Active Anglicisms | Code-shifts |
| Inspired or boosted by English models | Reactive Anglicisms | [not included in present model] |

In Tables 4, 5, and 6, each subcategory is exemplified, and the English trigger behind each example is given, something that is especially needed when dealing with reactive Anglicisms, which by definition hide their English ancestry to native speakers and foreign observers alike. Some of the terms used in the taxonomy are well established and universally agreed upon – e.g., semantic loans and morphosyntactic calques – while others are new (sentence-internal vs. sentence-shaped shifts, for instance). Finally, there are terms that are debated, yet too established to deserve being discarded altogether: ‘borrowing’ and ‘loan’ are thus preferred to the neologism ‘import word,’ although the latter term reflects the true nature of linguistic influence.

Table 4 Active Anglicisms
| Category | Type | Examples | English trigger |
| Overt lexical borrowings | Single-word unit | branding (Danish) | ‘branding’ |
| Overt lexical borrowings | Multi-word unit | Learning by Doing (German) | ‘learning by doing’ |
| Overt lexical borrowings | Sub-word unit | -minded (Norwegian) | ‘-minded’ |
| Covert lexical borrowings | Single-word unit | keks (Slovene) | ‘cakes’ |
| Covert lexical borrowings | Multi-word unit | Stop en halv! (Danish) [literally ‘stop one half’] | ‘Stop and haul!’ |
| Loan translations | Compound substitute | involtino primavera (Italian) | ‘spring roll’ |
| Loan translations | Multi-word substitute | tað er at siga (Faroese) | ‘that is to say’ |
| Hybrids | Partial loan translation | Computerkunst (German) | ‘computer art’ |
| Pseudoanglicisms | Archaism | butterfly (Danish) | ‘butterfly tie’ (= bow tie) |
| Pseudoanglicisms | Semantic change | overhead (Norwegian) | ‘overhead’ (= slide, OHP transparency) |
| Pseudoanglicisms | Contamination | after-ski (Swedish) | ‘after’ + ‘ski’ (English: ‘après-ski’) |
| Pseudoanglicisms | Morphological change | fit for fight (Swedish) | ‘fighting fit’ |
| Pseudoanglicisms | Jocular derivation | webmoster (Danish) [literally ‘web auntie’] | ‘webmaster’ |

Anglicisms and Acceptability

In addition to the structural classification presented in Tables 4, 5, and 6, a language-political categorization can be applied to Anglicisms, as not all of them reach acceptance by (all) language users in the affected speech community. At any given time, depending on a number of factors – the type of Anglicism, a particular item’s (lack of) prestige, its usage history, etc. – individual items can be ranked as follows, in decreasing order of acceptability:

1. Integrated items (not intuitively identified as English loans, accepted by all): French flanelle < English flannel, probably from Welsh gwlanen
2. Naturalized items (identified as English loans and commonly accepted): Romanian interviu < English interview; Hungarian marketing (= English)
3. Implants (English-sounding, accepted by certain user groups only): Finnish benchmarking (= English); Danish hænge ud < English hang out
4. Interfering items (often inaccurate solutions, including mistranslations): Danish militære barakker < (military) barracks; correct term: kaserne, originally from French caserne via German Kaserne

Extending the natural metaphors used in the four terms above, one could say that these four categories represent not only a cline in terms of acceptability, but also a Darwinist race for survival, with many Anglicisms beginning their life as interfering items, which – as in the above example – may mislead the unsuspecting reader: in Danish, barakker are poorly built one-story houses. Some new, interfering Anglicisms, which in written sources are often printed in quotation marks or italics, reach the implant stage, and out of these only a few become naturalized, or end up as fully integrated items.

With respect to idiomaticity, the watershed runs between types 2 and 3 above. To most language users, items of types 3 and 4 are seen as foreign or unwanted.

Table 5 Reactive Anglicisms
| Category | Type | Anglicism | Standard | Trigger |
| Semantic loans | Extensions | lernen (German) | wissen (= know) | ‘learn’ |
| Semantic loans | Reversions | overhøre (Danish) | høre (= hear) [overhøre = ignore] | ‘overhear’ |
| Semantic loans | Limitations | morgen (Danish) | nat (0–5 A.M.), morgen (5–9 A.M.), formiddag (9–12 A.M.) | ‘morning’ |
| Orthographic loans | Changed spelling | literatur (Danish) | litteratur | ‘literature’ |
| Orthographic loans | Changed punctuation | Den erfarne amerikanske senator, Joseph Biden, har en anden udlægning. (Danish) | Den erfarne amerikanske senator Joseph Biden har en anden udlægning. | ‘. . . American senator, Joseph Biden, has . . .’ |
| Phonetic loans | Phonetic changes | unik pronounced as [you’nik] (Danish) | [oo’nik] | ‘unique’ |
| Phonetic loans | Prosodic changes | falling intonation in exclamations (Brazilian Portuguese) | slightly rising intonation | Standard American intonational pattern |
| Morphosyntactic calques | Inflections | autobahns (Danish) | plural -er | -s |
| Morphosyntactic calques | Phraseology | Hier sind Sie. (German) | Bitte sehr. | ‘Here you are.’ |
| Morphosyntactic calques | Constructions | Es un maestro de escuela (Spanish) | Es maestro de escuela | ‘He’s a teacher’ |
| Morphosyntactic calques | Word order | Dog, han vil ikke . . . (Danish) | Han vil dog ikke . . . | ‘However, he will not . . .’ |
| Morphosyntactic calques | Valency | Ring en ekspert (Danish) | Ring til en ekspert / Tilkald en ekspert | ‘Call an expert’ |
| Translationese | Prepositional choices | ud af vandet (Danish) | op af vandet [op = up] | ‘out of the water’ |
| Translationese | Favorized cognates | còpia (Catalan) | exemplar | ‘copy’ [. . . of a book] |
| Translationese | Default equivalents | anlända (Swedish) | komma | ‘arrive’ |

When Linguistic Influence Leads to Language Death

It is not the intention here to warn against foreign-language influence, a primary source of renewal and growth of any language, including English. On the other hand, as with drugs and other substances, what is found stimulating in small doses may kill in large quantities, and language death is indeed an issue here. In northern Europe, English has all but wiped out several Celtic languages – with Welsh presently fighting its way back from near-extinction – and it certainly has exterminated the Norse language spoken in the Orkneys until a few hundred years ago. Today, more than military invasions or political power, the media are instrumental factors behind the continuing triumphs of English on the world map. And as people add English to their personal repertoire of languages, Anglicisms are adapted and adopted into their local vernaculars, whether spoken by hundreds of millions, like Chinese or Portuguese, or only by hundreds of thousands, like Icelandic and Faroese. Whether local languages lose domains in the process or even end up losing all their speakers is decided by their users, not by native speakers of English or anglophone institutions.

Taking Denmark as a typical example of a speech community under English influence, the most important agents in the ongoing anglification are the national media, especially those whose texts are based on anglophone sources. Responding to a seminal Danish study on the impact of English (summarized in Preisler, 2003) – in which influence from above, i.e., the educational system and the business world, is amplified by influence from below, e.g., U.S.-inspired youth subcultures – the doyen of Danish Anglicism research, Knud Sørensen, points out that in his view “the most important impact of English might be termed influence from the middle, i.e. the influence exerted by the press” (Sørensen, 2003: 354). Ironically, a large part of the media-generated Anglicisms are not deliberately created for effect; strict deadlines simply do not go well with idiomaticity when you are juggling two languages. In Sørensen’s terms, “a journalist who works with an English-language source at his elbow will often be tempted to take the line of least resistance and make a rough translation of an English idiom, sometimes forgetting whether it will make sense to his Danish readers” (Sørensen, 2003: 349).

With an all-pervasive issue such as the English influence on cognate languages, we easily find ourselves in a chicken-and-egg situation. Whether we argue that the media merely reflect the linguistic realities of youthful customers and audiences – the position taken by Preisler – or that the media play a more proactive role in the continuing anglification of other languages – Sørensen’s point – does not make much difference. The fact remains that the English influence comes from everywhere, is felt by everyone, and has effects everywhere in the given domestic language, with Danish as a case in point. As Danish dialects are dying out – at a rate still matched by few other European languages – we may soon advance to a situation where Danish itself becomes a dialect, shared by future bilinguals from the Danish Isles. Just as Danish speakers of the Southern Jutland dialect sønderjysk used to switch off Standard Danish when meeting old friends in urban Denmark, Danes reading English paperbacks, watching American films, and using English at work may in the future switch off English when they spend time with family and friends, in what may be the only domain of Danish not yielding to English: the intimate sphere. Today, Danes who master English – and who accept a modicum of English influence on Danish – are comparable to those who, 100 years ago, were ahead of their peers in adopting Standard Danish as a prestigious alternative to their traditional dialects, now nearly extinct (Hjarvard, 2004).

Table 6 Code-shifts
| Category | Example | Pragmatic context |
| Tags | , okay? | Standard oral interpersonal assurance formula |
| Sentence-internal shifts | . . . en person som under press kan ‘blow the cover sky high!’ | Lines from a (Norwegian) novel |
| Bilingual wordplay | ‘De Frygtløse’ – ‘The Muuhvie’ (Danish title for the American animated movie [featuring cows] ‘Home on the Range,’ 2004) | Common linguistic device in commercial punchlines and political slogans in semibilingual speech communities |
| Sentence-shaped shifts | Way to go, girl! | The final words in a (Danish) music review |
| Total shifts | ‘Visit the World’s Biggest LEGO Shop’ (Website for Danish toy company only in English) | Addressing locals and foreigners through English-only communication |
| Domain losses | ‘Layout construction: a case study in algorithm engineering’ | Title of an academic research paper written by four (Danish) scientists. (In some countries up to 90% domain losses in computer games, scientific papers, pop lyrics, and certain business documents; more moderate losses in domains such as advertisements, commercial brands, and film titles.) |


Metaphors We Drown By: Forever Flooded by Anglicisms?

The debate on Anglicisms is international and goes back more than 100 years to the days when millions of Europeans fled their countries hoping for a better future in English-speaking North America and Australia. This wave of poor emigrants coincided with the heyday of the British Empire around the turn of the 20th century. And that was the time when the first critical voices were raised in the apparently never-ending debate on English linguistic influence abroad. Especially in Germany, the impact was strongly felt at an early stage (Dunger, 1899). Even the flood metaphor, portraying the exposed language as a victim of some natural disaster, seems to be a veteran term: referring to the period before 1914, a later observer noted that ‘‘the flooding of German life and the German language with English had reached such an extent that the whole situation for Germany appeared almost threatening’’ (Stiven, 1936: 101, cited in Viereck, 1986: 110). Also in France, a country renowned for its linguistic conservatism, oceanic terms have been used for what happened to the national vernacular in the late 20th century. According to one German linguist:

Purism has sterilized the language. The French are desperately afraid of coining new words. This linguistic Malthusianism is ultimately responsible for the fact that thousands of foreign words flood into the gap which has to be filled somehow. (Hausmann, 1986: 87)

This textbook case of German Schadenfreude demonstrates that in a modern world, what seemed to be difficult already before World War I is now impossible: English cannot be kept back. Instead, bilingualism could be the answer, paired with an increased awareness of everything that pertains to (one’s own) language and culture. Still, waves are always washed away by other waves, and ‘‘many loans seem to be part of trends and waves that make a lot of noise, but are relatively soon forgotten’’ (Graedler, 2002: 79). What is crucial here, however, is that these recurrent phenomena – short-lived or not – are always English; neither Europe’s most spoken mother tongues (Russian and German) nor the only billion-plus language worldwide (Mandarin Chinese) seems likely to compete with English in this respect. When it comes to successful export of linguistic features, the only near-future rival to English may be Arabic, representing an Islamic culture that is now the only vocal challenge to Anglo-Saxon globalization, and thus a language with an enormous potential for

covert prestige in the eyes of non-anglophone subcultural and linguistically trend-setting groups the world over. So far, apart from a few ‘tasty’ words such as kebab and shawarma, recent European loans from Arabic tend to represent notions related to religion and politics, words that often acquire sinister connotations in the process: fatwa, ayatollah, mullah, imam, sharia, ramadan, intifada, sunna, burka, and jihad.

The Development of Anglicisms

As we have seen, influence from English may take many forms and create many different types of linguistic offspring. An example of how differently English loans may develop over time is found in Hong Kong Chinese and Japanese, respectively (Chan and Kwok, 1986; Ishiwata, 1986). In Chinese as spoken in Hong Kong, the situation prior to the integration of the former British crown colony into China was typically as follows:

Some loanwords may enjoy a brief popularity and are gradually replaced by terms which are more meaningful. [. . .] Take the example of ‘laser’. It entered the Chinese lexicon as [lœy se]; today it is being replaced by the descriptive [gik gwoN] or ‘piercing ray’ (Chan and Kwok, 1986: 415).

Expressed in structural terms, what happens with the ‘laser’ type Anglicisms in Hong Kong Chinese is that what was launched as an incomprehensible phonetic loan (an overt lexical borrowing) ended up as an all-Chinese neologism, immediately comprehensible to the speaker. Of course, this is not a uniquely Asian phenomenon; in several languages, not only loan translations offer themselves as successors to pioneering English-sounding terms: quite often the original Anglicisms yield to expressions that cannot by any standards be termed Anglicisms.

In Japan, the opposite tendency seems to have been at work for more than a century. Whereas prior to the Meiji Restoration in 1868, Japanese coined its own translations of (scientific) terms of English origin, ‘‘it has become more and more common to adopt the western terms as they are without any attempt at translation’’ (Ishiwata, 1986: 459). As indicated earlier, this is exactly the situation in several European speech communities today. While, for instance, 50 years ago the Greco-English compound ‘television’ became Fernsehen in German, decades later ‘telefax’ was taken over as Telefax. Along the same lines, in Denmark cinema ‘thrillers’ were baptized gysere (a literal translation of the term) as this genre was introduced to Danish cinema audiences in the early 1930s, but today’s Danish DVD


patrons exclusively use the term ‘thriller’. In the passing years, the word ‘gyser’ has increasingly been used metaphorically: in recent press reportage, the term typically refers to sporting events, political elections, and the like. This wedging in of a new term (ironically, in this case, the original English term) allows the existing word to take on a new meaning, or at least new connotations. If the old term is not lost in the process, we have a case of language enrichment.

Dictionaries on –isms: Testimonies of Linguistic Influence

Given the historical and present importance of borrowings, a vast number of general foreign-word dictionaries have been compiled. Yet dictionaries attesting the influence of languages such as French, German, and Russian are largely nonexistent and sorely missed, a fact that is lamented by one of Europe’s leading names in contact linguistics in his chapter entitled ‘Wanted? Dictionaries of Gallicisms, Germanisms and neo-classic diction’ (Görlach, 2003: 124–162). Anglicism dictionaries, however, seem to flourish, as documented in a comprehensive bibliography comprising dictionaries and other publications on Anglicisms (Görlach, ed., 2002). Still, as long as national lexicographical definitions and resources vary considerably, one cannot draw conclusions from dictionary data on Anglicisms about linguistic realities such as their relative numbers, or their frequencies, in a given set of languages, as shown in a study comparing Anglicism dictionaries from four Germanic speech communities (Gottlieb, 2002).

The Introduction of Anglicisms

Anglicisms may be introduced through either personal or impersonal contacts between an anglophone source and a non-anglophone target.

Personal Contacts

The first wave of Anglicisms in the speech communities surrounding Britain was found in the 18th century. Mediated by sailors, English nautical terms were introduced overseas. Today, most of these loans are integrated in the borrowing languages, and in Denmark nobody except etymologists realizes that words such as kutter (fishing vessel) and splejse, meaning ‘sharing expenses between friends’ (from ‘splicing’ ropes) are English loanwords. When this influence from below was first felt outside Britain, the ruling classes abroad began mediating English influence from above in adopting English terms and habits, including everything from gentleman and gin to tennis and golf.

Outside Europe, colonial elites picked up English, and since then the English influence has been felt in national and local languages throughout what are now termed the Commonwealth nations. In similar fashion, non-English-speaking countries in the American sphere of influence – the Philippines, for instance – were heavily influenced by American English long before the Second World War. Even today, in most of the world’s non-anglophone speech communities, personal contacts constitute an important channel through which English language features are introduced. A telling example is found in the influential foreign-based reporters often operating in English even when based in an area where English is not the national language, e.g., the Middle East. Yet since the 1940s at least, most Anglicisms have resulted from impersonal contacts. They are introduced in target languages – directly or via intermediary languages – through literature and the mass media, as discussed in the following sections.

Impersonal Contacts

Today the media play a more decisive role in language change than do personal contacts; the latter nowadays tend to stimulate foreign-language skills rather than influence the native tongues of those involved.

Original Products: Direct English Input

A major dividing line within the impersonal contacts category runs between translated and original (or nontranslated) entities. Most of the original products are nonverbal, though as symbols of Anglo-American lifestyle, they have a major impact on the language in the cultures in question. Original nonverbal products include clothes, food, media technology, etc. Among the verbal products presently consumed untranslated in non-anglophone speech communities, not least by the young, are rock songs, video games, printed fiction, CNN-type news coverage, and Internet communication in English, phenomena contributing to the possible future loss of domains for the domestic languages.

Translated Products: Anglicisms Every Minute

This subcategory covers books, technical documentation, films, DVDs, and TV programs, often comprising a major part of the total consumption in non-anglophone countries. In Denmark, some 40% of what people read – TV subtitles included – is translated from English (Gottlieb, 1997: 148–153). All over the world, the audiovisual, or polysemiotic, media – television, video, DVD, and film – are instrumental in introducing language change. The


first study to focus on Anglicisms on the screen (Sajavaara, 1991) highlighted the role of TV subtitles in the ongoing English influence on Finnish, a language which does not belong to the Indo-European language family. In Spain, one of Europe’s major dubbing countries, critical observers have long talked about ‘‘the effect that English is having on the Spanish speaker at home as a result of the vast quantities of badly translated material flooding the spheres of journalism, radio, television, and advertising’’ (Lorenzo, 1996: 18, quoting A. Gooch: Spanish and the onslaught of the Anglicism).

A phenomenon often mentioned in this context is the all-pervading morphosyntactic calque (cf. Table 5). Certain types of calque are more representative of dubbing, in which the translated lines should fit the rhythm of the original dialogue, often leading to unidiomatic and English-sounding versions (Herbst, 1994 on German dubbing; Gottlieb, 2001 on dubbing in Denmark). Other types have become almost second nature to subtitling from English, in which viewers hear – and very often understand – the actors’ original lines. An often-cited example of this is the transfer of the English ‘question plus affirmative answer’ sequence in dialogue situations where many other languages use the opposite pattern to express the same verbal exchange. In subtitling, the idea of viewers hearing a yes, but reading a nej (‘no’) seems to terrify most translators. Even in dubbing countries, the transfer of such questions-cum-answers seems to be a problem. In order to avoid Anglicisms, the Catalan Televisió de Catalunya dubbing stylebook (1997: 62) urges translators to render the lines

– Because you didn’t want to be a witness, right?
– Yes.

as

– Perquè vostè no volia fer de testimoni, oi?
– No.

As in subtitling, the idiomatic solution often sounds wrong, by English standards, which may prevail if morphosyntactic calques continue to appear as frequently as now. That such calques and other reactive Anglicisms are indeed common in contemporary (film and TV) translations, both in dubbed and in subtitled versions, is documented in a study (Gottlieb, 1999) looking at unintegrated and unidiomatic Anglicisms in the subtitles of two American films broadcast by Danish public-service TV. These two films contained an average of 0.43 and 0.57 unassimilated Anglicism tokens per minute, respectively.

In a follow-up study comparing the dubbed and subtitled video versions of three American family films (Gottlieb, 2001), the subtitled versions displayed Anglicism densities of 0.50, 0.57, and 0.73 tokens per minute, while the dubbed versions contained more than twice as many Anglicisms: 1.04, 1.77, and 1.85 tokens per minute, respectively. In conclusion, both screen translation methods seem to play a very active role in the anglification of the target languages involved. Within the realm of books and other monosemiotic media, there is a scarcity of empirical work comparing Anglicisms in translations and original texts. However, a major Swedish study comparing the vocabulary in 27 novels translated from English with 29 native Swedish novels found a general tendency toward translationese, with many default equivalents (cf. Table 5), thus confirming common knowledge among translation scholars and critics: ‘‘Many English words seem to trigger a standard translation in Swedish although the Swedish translation differs stylistically from the English original.’’ (Gellerstam, 1986: 91).
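As a quick arithmetic check on the densities reported above for the three family films (Gottlieb, 2001), the following Python sketch averages the subtitled and dubbed figures and computes their ratio. The variable names are illustrative only, and the film-by-film comparison assumes that both sets of figures are listed in the same film order, which the wording ‘respectively’ suggests but does not state outright.

```python
# Anglicism densities (tokens per minute) as reported for Gottlieb (2001).
# Assumption: both lists follow the same film order.
subtitled = [0.50, 0.57, 0.73]
dubbed = [1.04, 1.77, 1.85]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


mean_subtitled = mean(subtitled)
mean_dubbed = mean(dubbed)
overall_ratio = mean_dubbed / mean_subtitled

# Film-by-film ratios, meaningful only under the ordering assumption above.
per_film_ratios = [d / s for d, s in zip(dubbed, subtitled)]

print(f"mean subtitled: {mean_subtitled:.2f} tokens/min")
print(f"mean dubbed:    {mean_dubbed:.2f} tokens/min")
print(f"dubbed / subtitled: {overall_ratio:.2f}")
print("per-film ratios:", [round(r, 2) for r in per_film_ratios])
```

On these figures every ratio exceeds 2, which is consistent with the statement that the dubbed versions contained more than twice as many Anglicisms as the subtitled ones.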

Anglification Beyond Words

It is not only translated texts – which make up a significant part of present-day mass communication in any minor speech community – that reveal that non-anglophone language systems are currently being anglified. Original discourse, too, may tell of English influence, as the personae we create in fiction and the world view we express in nonfictional genres often emulate British and American models. In several speech communities, lexis, phraseology, semantics, syntax, and morphology are in a state of flux; some established domestic words resembling their English synonyms obtain boosted frequencies (cf. the favorized cognates subcategory in Table 5), and even the phonemic system is undergoing changes, with English phonemes getting a foothold in standard pronunciation.

In quantitative terms, Anglicisms may not seem so conspicuous in European languages today. Yet a considerable part of the present growth and development of Western languages is triggered by English. In Danish, for instance, the vocabulary is being reshuffled, and not only are more than half of all new words inspired by English (Gottlieb, 2004: 49), but these loans – typically nouns – tend to carry significant semantic weight: ‘‘Those are the words that are instrumental in creating our world view, [. . .] and this means that to an increasing extent we let another culture with its language govern our reality’’ (Jarvad, 1995: 135; my translation). In the same vein, Finnish linguist Paavo Pulkkinen notes that since World War II, the number of new


semantic loans in Finnish has increased at a higher rate than the number of loan translations. He suggests that the reason for this shift (from active to reactive Anglicisms) is that ‘‘numerous Finns have recently begun thinking partly along Anglo-American lines’’ (Pulkkinen, 1989: 92; my translation).

Clearly, the notion of Anglicism has major implications nationally as well as internationally. With English as a modern lingua franca, the more cross-national communication, the more Anglicisms in the world’s languages, the more easily people will understand each other. In turn, this may imply that real English is changed in the process. Especially in Europe, non-native speakers of English sometimes understand each other better if their English contains shared un-English syntactic or semantic features transferred from their individual languages. Imagine for instance a Frenchman and a German successfully using the word ‘eventually’ in the same non-native way. Even outside Europe, such shortcuts may make sense: in Japanese, the word barakku means the same as the Danish barak, as opposed to the meaning of barracks in English.

However, the danger remains that the world is reconceptualized in Anglo-American terms. But again, in a politically and economically lopsided world – with anglophone cultures setting the agenda more than ever – getting rid of Anglicisms in defense of linguistic purity would only be possible with draconian measures. Paradoxically, in modern society the steady anglification of domestic languages may prove to be a litmus test of their viability. Although loss of domains is an immanent risk to national languages, in our day and age, a pure language is a fossilized one.

See also: Bilingualism; Dubbing; Language Attitudes; Language Policies: Policies on Language in Europe; Lexicography: Overview; Neologisms; Subtitling.

Bibliography

Chan M & Kwok H (1986). ‘The impact of English on Hong Kong Chinese.’ In Viereck W & Bald W-D (eds.) English in contact with other languages. Budapest: Akadémiai Kiadó. 407–431.
Dunger H (1899). ‘Wider die Engländerei in der deutschen Sprache.’ Zeitschrift des Allgemeinen Deutschen Sprachvereins 14, 241–251.
Gellerstam M (1986). ‘Translationese in Swedish novels translated from English.’ In Wollin L & Lindquist H (eds.) Translation studies in Scandinavia. Lund: Lund University Press. 88–95.
Gooch A (1971). ‘Spanish and the onslaught of the Anglicism.’ Vida Hispánica 19, 17–21.
Görlach M (ed.) (2002). An annotated bibliography of European Anglicisms. Oxford: Oxford University Press.
Görlach M (2003). English words abroad. Amsterdam: John Benjamins.
Gottlieb H (1997). Subtitles, translation & idioms. University of Copenhagen: Center for Translation Studies, English Department.
Gottlieb H (1999). ‘The impact of English: Danish TV subtitles as mediators of anglicisms.’ ZAA, Zeitschrift für Anglistik und Amerikanistik 47(2), 133–153.
Gottlieb H (2001). ‘In video veritas: are Danish voices less American than Danish subtitles?’ In Chaume F & Agost R (eds.) La traducción en los medios audiovisuales. Castelló de la Plana: Publicacions de l’UJI, Universitat Jaume I. 193–220.
Gottlieb H (2002). ‘Four Germanic dictionaries of anglicisms: when definitions speak louder than words.’ In Gottlieb H, Mogensen J E & Zettersten A (eds.) Symposium on lexicography X. Lexicographica. Series Maior 109. Tübingen: Niemeyer. 125–143.
Gottlieb H (2004). ‘Danish echoes of English.’ NJES, Nordic Journal of English Studies 3(2), Special issue: The influence of English on the languages in the Nordic countries. 39–65.
Graedler A-L (2002). ‘Norwegian.’ In Görlach M (ed.) English in Europe. Oxford: Oxford University Press. 57–81.
Hausmann F J (1986). ‘The influence of the English language on French.’ In Viereck W & Bald W-D (eds.) English in contact with other languages. Budapest: Akadémiai Kiadó. 79–105.
Hedlund L-E & Hene B (1992). Lånord i svenskan. Om språkförändringar i tid och rum. Stockholm: Wiken.
Herbst T (1994). Linguistische Aspekte der Synchronisation von Fernsehserien. Tübingen: Niemeyer.
Hjarvard S (2004). ‘The globalization of language. How the media contribute to the spread of English and the emergence of medialects.’ In Carlsson U (ed.) Nordicom, Nordic Research on Media & Communication Review 25 (1–2), 75–97.
Ishiwata T (1986). ‘English borrowings in Japanese.’ In Viereck W & Bald W-D (eds.) English in contact with other languages. Budapest: Akadémiai Kiadó. 457–471.
Jarvad P (1995). Nye ord – hvorfor og hvordan? Copenhagen: Gyldendal.
Lorenzo E (1996). Anglicismos hispánicos. Madrid: Editorial Gredos, Biblioteca Románica Hispánica.
McArthur T (2002). The Oxford guide to world English. Oxford: Oxford University Press.
Meier A J (2000). ‘The status of ‘‘foreign words’’ in English: the case of eight German words.’ American Speech 75(2), 169–183.
Preisler B (2003). ‘English in Danish and the Danes’ English.’ International Journal of the Sociology of Language 159, 109–126.
Pulkkinen P (1989). ‘Anglicismerna i finska språket.’ In Bojsen E (ed.) Språk i Norden 1989. Nordisk Språksekretariats Skrifter 10. Oslo: J. W. Cappelens Forlag. 89–93.
Sajavaara K (1991). ‘English in Finnish: television subtitles.’ In Ivir V & Kalogjera D (eds.) Languages in contact and contrast. Essays in contact linguistics. Berlin: Mouton de Gruyter. 381–390.
Sicherl E (1999). The English element in contemporary standard Slovene: phonological, morphological and semantic aspects. University of Ljubljana: Znanstveni inštitut Filozofske fakultete.
Sørensen K (1973). Engelske lån i dansk. Dansk Sprognævns skrifter 8. Copenhagen: Gyldendal.
Sørensen K (2003). ‘250 Years of English influence on the Danish language.’ In Sevaldsen J (ed.) Britain and Denmark. Political, economic and cultural relations in the 19th and 20th centuries. University of Copenhagen: Museum Tusculanum Press. 345–355.
Stiven A B (1936). Englands Einfluß auf den deutschen Wortschatz. Zeulenroda: Sporn.
Televisió de Catalunya (1997). ‘Criteris lingüístics sobre traducció i doblatge.’ Barcelona: Edicions 62.
Viereck W (1986). ‘The influence of English on German in the past and in the Federal Republic of Germany.’ In Viereck W & Bald W-D (eds.) English in contact with other languages. Budapest: Akadémiai Kiadó. 107–128.
Viereck W & Bald W-D (eds.) (1986). English in contact with other languages. Budapest: Akadémiai Kiadó.

Linguistic Paradigms

T W Stewart, Truman State University, Kirksville, MO, USA
© 2006 Elsevier Ltd. All rights reserved.

The word paradigm has garnered much attention since Kuhn’s (1962) influential treatment of scientific revolution with respect to paradigm shifts. ‘Paradigm’ there refers to a broad frame of reference for scientific conceptualization and investigation, following from a set of shared assumptions about what the world is like. In the linguistic context, however, ‘paradigm’ is primarily used as a morphological term that refers to an organized space of potential words or word-forms related to a common base element. The most commonly encountered instance of the linguistic paradigm is a systematically collected array of inflectionally related word-forms. Such an array is laid out in the form of a table (see Table 1), and columns and rows within the table are defined by contrasting values of inflectional features (see Design Features of Language). There is a long-standing tradition of presentation of word-forms in paradigms in both descriptive and teaching grammars, where systematic elicitation or presentation guides the design of visual representation.

It is not the case, however, that all approaches to morphological description agree on the adequacy of the notion of paradigm for theory construction. Linguistic morphological theory can proceed from one of two fundamental assumptions about the basic units of the lexicon. The decision whether to adopt a morpheme-based or a lexeme-based conceptualization of morphology’s most relevant level of description determines the relevance (or irrelevance) of the paradigm as a descriptive tool.

In a morpheme-based theory of morphology, the grammar describes the assembly of complex morphological objects out of minimal meaningful forms, i.e., morphemes. In a morpheme-based theory, inflected word-forms are not integrally related to one another; they simply contain a greater or lesser number of morphemes in common. The paradigm, therefore, is a merely descriptive conceit – it is epiphenomenal, and not part of the grammar proper. For this reason, morpheme-based theories make no special prediction about the relative relatedness of inflected words containing the same lexical stem versus those containing the same inflectional affix. In a lexeme-based theory, by contrast, the lexicon is organized according to abstract word-like units, or lexemes, and inflectionally related word-forms are related in the lexicon via their dependence on a unique lexeme. In a lexeme-based theory, therefore,

Table 1 Paradigm of the Latin adjective bonus ‘good’

Case | Masculine singular | Masculine plural | Feminine singular | Feminine plural | Neuter singular | Neuter plural
Nominative | bonus | bonī | bona | bonae | bonum | bona
Genitive | bonī | bonōrum | bonae | bonārum | bonī | bonōrum
Dative | bonō | bonīs | bonae | bonīs | bonō | bonīs
Accusative | bonum | bonōs | bonam | bonās | bonum | bona
Ablative | bonō | bonīs | bonā | bonīs | bonō | bonīs

The paradigm is defined with respect to the inflectional features ‘Gender,’ ‘Number,’ and ‘Case.’


the paradigm is not merely convenient for description; it is an organizing principle of the inflectional component of grammar, and the cells of the paradigm contain forms that instantiate the lexeme as suitably modified morphologically for use in particular grammatical contexts. The prediction may follow from this that the inflectional forms of a single lexeme are related in a special way, and this may be borne out in, for example, reaction times in a psycholinguistic priming task. The traditional typological characterization of languages as isolating, fusional, or agglutinating (a division that is both descriptively nonexhaustive and an oversimplification for almost any given individual language) may bear on the apparent utility of the paradigm on even a descriptive level. For example, a highly agglutinating language, such as Turkish, has very few co-occurrence restrictions on inflectional morphology, and so it has a paradigm that is potentially quite complex (see Table 2). On the other hand, a highly isolating language such as (Mandarin) Chinese makes most of its grammatical contrasts by introducing discrete function words (adpositions, auxiliaries, etc.,) and distinct syntactic constructions. In other words, there is little inflectional morphology in Mandarin, and so there is little motivation for an abstract lexeme separate from the surface word, and the notion of the paradigm is therefore rendered otiose. Since isolating languages have no inflectional morphological paradigms to speak of, the question may arise as to the possibility of morphological analogy in such languages, that is, of so-called ‘paradigmleveling,’ in which synchronically irregular forms are remade on analogy with inflectionally related, regular word-forms. Proponents of the Lexical Diffusion school of language change, many of whom work principally with Chinese data, dismiss out of hand the possibility of analogical influence in cases of irregular change. 
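The paradigm leveling just mentioned is often pictured as a four-part proportion, a : b :: c : x, in which a regular pair serves as the model for remaking an irregular form. The Python sketch below is only an illustration under strong simplifying assumptions: it works on spelling rather than phonology, it handles word-final alternations only, and the function name solve_proportion is hypothetical. The Latin example (nominative honos remade as honor beside genitive honoris, on the model of oratoris : orator) is a standard textbook case of leveling and is not taken from this article.

```python
def solve_proportion(a, b, c):
    """Solve a : b :: c : x by swapping word-final material.

    Returns x, or None when a has no distinct ending of its own or
    when c does not end in the same material as a.
    Purely orthographic; a simplifying assumption, not a full model.
    """
    # Length of the longest common prefix of the model pair a, b.
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    ending_a, ending_b = a[i:], b[i:]
    if not ending_a or not c.endswith(ending_a):
        return None
    # Strip the ending of a from c and attach the ending of b.
    return c[: len(c) - len(ending_a)] + ending_b


# Model pair: genitive oratoris, nominative orator ('speaker').
# Target: genitive honoris, whose inherited nominative was honos.
leveled = solve_proportion("oratoris", "orator", "honoris")
print(leveled)
```

Here the sketch returns honor, the leveled nominative that replaced honos. Nothing in the procedure requires the model pair and the target to belong to the same inflectional paradigm, so the same proportion could in principle operate over any pair of related forms.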
Analogy as a linguistic strategy has never been limited to inflectional paradigms, so although it is a fair step to set aside considerations of paradigm leveling in Chinese, it does not necessarily follow that analogy is not applicable in Chinese linguistic change.

The paradigm thus seems most useful for the presentation of languages with considerable inflectional morphology, but at the same time, with limits on the number of features that may be realized on any given word-form. This best describes fusional languages, and it is perhaps not surprising that the paradigm is especially common in the history of linguistic description of Indo-European languages, many of which can be characterized as fusional, or else minimally agglutinative.

The above descriptive difficulties associated with Turkish and Mandarin point to two further theoretical issues:

1. Do speakers store full paradigms of word-forms in their heads and select words from this set, no matter how many forms this entails (cf. Odden, 1981 on the potential for literally trillions of grammatically distinct verb forms for each and every Shona verb), or do the contrastive features define the paradigm space, and general morphological rules allow for the computation of inflected word-forms?
2. Is the only sensible use of the paradigm to be found in a traditional morphology with the word-form as the upper limit? Might one rather extend ‘grammatically related’ into the realm of collocations of content and function words, i.e., periphrastic expressions?

Both of these questions are open and active in linguistic research, although the traditional conceptualization of morphology (as the formation of words and the study of their constituent parts and/or relationships) has militated against multiword units within paradigms.

It is only in the context of a paradigm that one may refer to a defective paradigm, i.e., the situation in which a lexeme lacks a particular inflected

Table 2 A fragment of the morphological paradigm for Turkish deniz ‘ocean’

                ‘X’           ‘to X’         ‘of X’          ‘in X’          ‘from X’
Singular        deniz         denize         denizin         denizde         denizden
Plural          denizler      denizlere      denizlerin      denizlerde      denizlerden
Diminutive sg.  denizcik      denizcike      denizcikin      denizcikde      denizcikden
Diminutive pl.  denizcikler   denizciklere   denizciklerin   denizciklerde   denizciklerden
‘my X’          denizim       denizime       denizimin       denizimde       denizimden
‘your (sg.) X’  denizin       denizine       denizinin       denizinde       denizinden
‘our X’         denizimiz     denizimize     denizimizin     denizimizde     denizimizden
‘your (pl.) X’  deniziniz     denizinize     denizinizin     denizinizde     denizinizden
‘my Xs (pl.)’   denizlerim    denizlerime    denizlerimin    denizlerimde    denizlerimden
‘our Xs (pl.)’  denizlerimiz  denizlerimize  denizlerimizin  denizlerimizde  denizlerimizden


word-form when it would be expected, all else being equal, to possess such a word-form. In English, an example is in the paradigm of the verb STRIVE, which for most speakers lacks a past/passive participle, or at least there is a significant lack of consensus on what it should be: ?striven, ?strove, ?strived. Any of these may be generated by a more or less productive rule of English verb morphology (cf. DRIVE vs. DIVE vs. THRIVE), and as such, the hesitation to choose a form with certainty indicates a (perhaps accidental) gap in the paradigm. A full discussion of gaps is beyond the scope of this article, but it should be pointed out that whole classes of lexemes may lack a particular form or set of forms for semantic or pragmatic reasons or for reasons of pure historical accident. In either case, the result is a paradigm that may be seen as defective with respect to the inflectional paradigms of other, comparable lexemes in the language in question.

As a final consideration, there may be in some cases a motivation for the addition of derivationally related words to the discussion of paradigms. That is, one may speak of regular derivation as operating within the space of a derivational paradigm, with different base words and their derivatives standing in comparable relations to their counterparts in other derivational paradigms. For example, sing > song > singer is analogous to shoot > shot > shooter within an abstract paradigm {Action} > {Result} > {Agent}. Discussion of derivation in paradigmatic terms is somewhat problematic, given that each lexeme in turn licenses an inflectional paradigm of its own. Although there exists considerable discussion of a formal compartmentalization of derivation and inflection (the so-called ‘split-morphology hypothesis’), there are numerous languages in which a strict sequencing of inflectional markings outside of derivational ones is not respected. In such cases, therefore, paradigmatic intermingling may be more clearly motivated.

See also: Design Features of Language.
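The first of the two theoretical issues raised in this article, whether inflected word-forms are stored or computed, can be made concrete with a small sketch. The Python fragment below is an illustration only, not a claim about how speakers or any published analysis handle Turkish. It assumes that every cell of Table 2 can be derived by concatenating one suffix per slot onto the stem, with the suffix shapes copied from the table for deniz ‘ocean’. The slot names, the feature labels, and the function names inflect and paradigm are hypothetical, and vowel harmony and consonant alternations are deliberately ignored, so the sketch should not be applied to other stems.

```python
from itertools import product

# Suffix slots in the order they attach to the stem. The surface shapes are
# copied from Table 2 and are assumed to be valid only for the stem "deniz";
# vowel harmony and consonant alternations are deliberately ignored.
SLOTS = [
    ("diminutive", {"plain": "", "dim": "cik"}),
    ("number", {"sg": "", "pl": "ler"}),
    ("possessor", {"none": "", "1sg": "im", "2sg": "in", "1pl": "imiz", "2pl": "iniz"}),
    ("case", {"nom": "", "dat": "e", "gen": "in", "loc": "de", "abl": "den"}),
]


def inflect(stem, **features):
    """Build one word-form by concatenating one suffix per slot.

    Slots that are not mentioned take their first (empty) option.
    """
    form = stem
    for slot, options in SLOTS:
        default = next(iter(options))  # first key, e.g. "sg" or "nom"
        form += options[features.get(slot, default)]
    return form


def paradigm(stem):
    """Enumerate the whole feature space as {feature tuple: word-form}."""
    names = [slot for slot, _ in SLOTS]
    cells = {}
    # Iterating over each options dict yields its feature labels.
    for combo in product(*(options for _, options in SLOTS)):
        cells[combo] = inflect(stem, **dict(zip(names, combo)))
    return cells


if __name__ == "__main__":
    print(inflect("deniz", number="pl", possessor="1pl", case="abl"))
    print(len(paradigm("deniz")))
```

Under these assumptions the four slots define 100 cells, of which Table 2 lists 50. A call such as inflect("deniz", number="pl", possessor="1pl", case="abl") returns denizlerimizden without that form ever having been stored, which is the sense in which contrastive features may be said to define the paradigm space.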

Bibliography

Bochner H (1993). Simplicity in generative morphology. New York: Mouton de Gruyter.
Carstairs[-McCarthy] A (1988). Allomorphy in inflexion. London: Croom Helm.
Carstairs-McCarthy A (1998). ‘Paradigmatic structure: inflectional paradigms and morphological classes.’ In Spencer A & Zwicky A M (eds.) Handbook of morphology. Oxford: Blackwell. 322–334.
Kuhn T S (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.
Matthews P H (1972). Inflectional morphology: a theoretical study based on aspects of Latin verb conjugation. Cambridge: Cambridge University Press.
Odden D (1981). Problems of tone assignment in Shona. Ph.D. diss., University of Illinois.
Stump G T (1991). ‘A paradigm-based theory of morphosemantic mismatches.’ Language 67, 675–725.
Stump G T (2001). Inflectional morphology: a theory of paradigm structure. Cambridge: Cambridge University Press.

Linguistic Reality
L Wetzel, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.

Linguistic reality is that portion or aspect of reality that linguistics seeks to understand and explain, and in virtue of which true linguistic claims are true. Linguistics is the scientific study of language, its units, nature, and structure. To see what the units, nature, and structure of language are, we should look to what linguists say. A reasonable place to begin is An encyclopedia of language. It reveals that in addition to languages and language users, phonetics cuts up linguistic reality into vowels, consonants, syllables, words, and sound segments, the human vocal tract and its parts (the tongue has five), among other things (see Phonetic Pedagogy). Phonology is also concerned with sounds, but organizes them in terms of phonemes, allophones, alternations, utterances, phonological representations, underlying forms, syllables, words, stress-groups, feet, and tone groups (see Phonology: Overview). In grammar, morphology describes morphemes, roots, affixes, and so forth, and syntax analyzes sentences, semantic representations, LF representations, among other things (see Grammar). Semantics studies signs, their meanings, their sense relations, propositions, etc. Pragmatics deals with speech acts, speaker meanings, sentence meanings, implicatures, presuppositions, etc. Lexicography investigates nouns, verbs, words, their stems, definitions, forms, pronunciations, and origins.


Why Those Objects?

What is the linguist’s justification for positing, or recognizing, vowels and consonants, phonemes, morphemes, words, sentences, and so on? A general though unhelpful answer is that countenancing such objects produces theories that have a great deal of classificatory, explanatory, and predictive power. The more specific question ‘but why posit phonemes?’ must be answered within phonology by comparing a theory with phonemes to one without, as must the more specific question of why [b] and [p] are phonemes of English. Why certain noises are enshrined as dictionary entries needs to be justified within lexicographical methodology, rather than, say, within philosophy. The point is that the justification is to be found within linguistic theory, and in most cases within a particular subdiscipline, since subdisciplines have different assumptions. (Phonetics and phonology approach the sound signal differently and hence parse it in terms of different units. Lexicography might give credence to the thesis that all linguistic objects have instances, since each of the million or so words recognized by the O.E.D. has been uttered or inscribed by someone at some time. But grammar might not; there are more, perhaps infinitely many more, sentences than will ever be used.) Assuming there is sufficient internal justification for recognizing the existence of the above-mentioned objects, we may now ask the following question.

What Are They?

What, for example, is that quintessential linguistic object, an expression? For that matter, what is a language? One answer to the former would be that an expression is anything that is a noun, verb, adjective, etc., or a sequence of such things built up by means of the grammatical rules. But that merely raises the question of what these other things are. Obviously, nouns and verbs are for the most part words, but what are words? Are they physical objects, particular events, abstract objects, kinds, classes, or mental entities? (Parallel questions may be asked of languages: are they social practices, abstract objects, classes of expressions, or psychological products of the language faculty? While it is beyond the scope of this short encyclopedia article to address the more complicated question about language, it is worth noting that the proposed answers mirror those given below on what a word is.)

Types and Tokens

Before such questions can be tackled, we have to disambiguate the word ‘word,’ which could mean either ‘word type’ or ‘word token.’ There is only one definite article in English, the word ‘the’ – the word type – but there are millions of the’s on pages in the Library of Congress and the lips of members of Congress – the word tokens (see Type versus Token). Word tokens are particulars; they might be composed of ink, sounds, or smoke signals. They have a unique spatiotemporal location, unlike types. Types are unique but capable of more than one instantiation or representation, unlike their tokens. A good indicator that there are both types and tokens is the different counting procedures associated with the same word. When it is said that an educated person’s vocabulary is upward of 10 000 words, that there are exactly 26 letters of the English alphabet, or that English has 18 cardinal vowels, types are being counted, since the numbers would exceed a million if we were counting tokens. As the examples suggest, not only words but also vowels, phonemes, letters, utterances, sentences, and most of the rest of the linguistic objects found in our survey come in types and tokens.

Which sorts of objects, then, does linguistics explicitly refer to and quantify over – types or tokens? Types, even though linguistics is an empirical science, based on empirical data and applicable to particulars (tokens) in the causal realm. To ensure generality in an economical format, most of the references and quantifications that appear in linguistic theory involve types and their relationships to each other. Expressions, for example, may be composed of words, and words composed of syllables, and syllables of phonemes, and phonemes of features – all of which are types. (This might be thought puzzling: how can an expression such as ‘Pease porridge hot, pease porridge cold’ be six words long, if there are only four words, each of them unique, of which it might be composed? For a resolution of this puzzle, see Type versus Token.) Linguistics, in other words, is awash in references to and quantifications over types, including words. Our original question then becomes the following one.
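The two counting procedures described above can be illustrated with a short sketch. The Python fragment below rests on simplifying assumptions that are not part of the article: a deliberately crude tokenizer that keeps only runs of letters, and case folding, so that ‘Pease’ and ‘pease’ are treated as the same word type. The function names tokenize and count_tokens_and_types are hypothetical. Strictly speaking, the sketch counts word occurrences within a quoted string rather than physical tokens, but the contrast between the two tallies is the same.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words (a crude tokenizer, assumed for illustration)."""
    return re.findall(r"[a-z]+", text.lower())


def count_tokens_and_types(text):
    """Return (number of word occurrences, number of distinct words, frequencies)."""
    tokens = tokenize(text)
    frequencies = Counter(tokens)  # maps each word type to its occurrence count
    return len(tokens), len(frequencies), frequencies


if __name__ == "__main__":
    line = "Pease porridge hot, pease porridge cold"
    n_tokens, n_types, frequencies = count_tokens_and_types(line)
    print(n_tokens, n_types)
    print(dict(frequencies))
```

For the line quoted above, the sketch reports six word occurrences but only four distinct words, with ‘pease’ and ‘porridge’ each counted twice.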

What Are Word Types?

It should be clear that word types (hereafter: words) are not particular physical objects/events, but it remains to be seen whether they are abstract objects, kinds, classes, or mental entities – or perhaps whether they do not exist at all. Even among word types, there are several sorts of words. Yet one of the lexicographer’s uses of the word ‘word’ stands out. A rough characterization of this sort of word is that it is the sort of thing that merits a dictionary entry. (Rough, because some entries in the dictionary, e.g., il-, -ile, and metric system, are not words, and some words, e.g., place names and other proper names, do not get a dictionary entry.)

To fix our thoughts on what an account must explain, let us consider the word color, or colour. According to the O.E.D., the noun color is from early modern English; is pronounced [ˈkʌlər]; has two ‘modern current or most usual spellings’ [colour, color]; 18 earlier spellings [collor, collour, coloure, colowr, colowre, colur, colure, cooler, couler, coullor, coullour, coolore, coulor, coulore, coulour, culler, cullor, cullour]; and 18 different senses – divided into four branches – with numerous subsenses. The verb color is a different word, but with the same pronunciation and spellings (O.E.D., vol. 2: 636–639). Webster’s assures us that the word schedule has four current pronunciations: [ˈske-(ˌ)jü(ə)l], [ˈske-jəl] (US), [ˈshe-jəl] (Can.) and [ˈshe-(ˌ)dyü(ə)l] (Eng.) (O.E.D., vol. 2: 1044). Thus, a word can be written or spoken; it can have more than one correct spelling, more than one correct spelling at the same time, more than one sense at the same time, the same correct spelling and pronunciation as a different word; and lastly, a word may have more than one correct pronunciation at a given time. These linguistic facts have to be accommodated by any account of words.

Realism

Probably the most popular account of words is given by platonic realism: words are abstract objects – acausal objects, like numbers, that have no spatiotemporal location. As can be seen from the preceding paragraph, they are very abstract entities indeed, for there is no relatively simple property, like spelling or pronunciation or meaning, that all tokens of the word color have in common; not even all their written tokens have the same correct spelling. (Indeed, the realist may argue that that is one of the primary justifications for positing word types: being a token of the word color might be the only glue that binds the considerable variety of space-time particulars together.) This should discourage the misconception that realism is committed to a platonic ‘form’ (the spelling, say) that all instances resemble the way a cookie resembles a cookie cutter (although the old view lives on in the fact that spellings are called ‘forms’ of the word). Family resemblance in the standard cases is the most that might be hoped for, but intentionality and context are such important factors in determining what type a token is a token of that even resemblance can fail in nonstandard cases. (It should be noted, if it is not already clear from the foregoing, that a physical object that is a token of a type is not one intrinsically – merely by being a certain sequence of shaped ink marks, say. It is only a token relative to a type, a language, and perhaps an orientation. Moreover, it may need to have been produced with a certain intention and in accordance with certain conventions.)

Platonic realism, whether of words or any other abstract objects, must face a serious epistemological challenge, namely, to explain how it is that spatiotemporal creatures such as ourselves can have knowledge of these abstract things if we cannot causally interact with them. Admittedly, we do causally interact with word tokens, but if the tokens are as diverse as was emphasized above, how do we arrive at the properties of the type at all?

There are various realist responses to this problem (responses that are not necessarily mutually exclusive). One is to appeal to intuition. Another is to claim that just as maps represent a city, so tokens represent their types, thus reducing the problem to one of characterizing representation. Another is to reject platonic realism altogether in favor of Aristotelian realism. This involves dropping the claim that words have no spatiotemporal location, and claiming instead that they have many such locations; each type is ‘in’ each of its tokens. Such a position suggests that there could not be an uninstantiated type. While plausible for words, it is not plausible for sentences. A further response is to claim that words are kinds, just as species are, thus reducing the problem of how we arrive at knowledge of types to one of induction.

Conceptualism

As the terms ‘platonic’ and ‘Aristotelian realism’ suggest, we have run into the old philosophical problem of universals in virtue of the fact that types have instances. Not surprisingly, the same camps are in evidence here. The traditional opponents of universals/abstract objects were the conceptualists and the nominalists. The conceptualists argued that there are no general things such as man; there are only general ideas – that is, ideas that apply to more than one thing. Applied to words, the thesis would be that words are not abstract objects ‘out there,’ but objects in the mind. Their existence, then, would be contingent on having been thought of. While this contingency may have a good deal to recommend it in the case of linguistic items, by itself conceptualism is just a stopgap measure. For ideas also appear to come in types and tokens (as evidenced by the fact that two people sometimes have the same idea). So either the conceptualist is proposing that word types are idea types – which would be a species of realism – or she is proposing that there are no types, only mental particulars in particular persons, which is a species of nominalism.


Nominalism

The problem for those hostile to universals and abstract objects is to account for our apparent theoretical commitment to types, which are clearly not spatiotemporal particulars. Traditional nominalists argued (as their name implies) that there are no general things, there are only general words, which apply to more than one thing. But this too is not a solution to the current problem, presupposing as it does that there are word types – types are the problem. Class nominalists have proposed that a word type is just the class, or set, of its tokens. But this is unsatisfactory because, first, classes are abstract objects too, so it is hard to see how this is really a form of nominalism about abstract objects. And second, classes are ill-suited for the job, since classes have their membership and their cardinality necessarily, but how many tokens a word has is a contingent matter. (One less token would not annihilate the word.)

Initially more promising is the nominalistic claim that talk of types is harmless because it is unnecessary – it is just shorthand for talk of tokens. ‘The mountain lion is a mammal’ is easily translated as ‘every mountain lion is a mammal.’ So to refer to the noun color, say, we need only refer instead to all its tokens. One problem is how to do this. We cannot say ‘every token of the noun “color” . . . ,’ because ‘color’ refers to a type. And ‘every noun “color” . . .’ does not seem grammatical, a fact that is even more apparent if we consider sentences (e.g., ‘every “the cat is on the mat” . . .’). Even if we could, truths will convert to falsehoods (using the ‘every’-conversion). The noun color is pronounced [ˈkʌlər], but particular inscriptions of it are not audible at all. So the question is how we might identify these tokens grammatically but without referring to the noun color itself and still say something true and (in some appropriate sense) equivalent.

The idea seems to be that the type must embody certain similar features that all and only its tokens have. This is a beguiling idea, until one tries to find such a feature, or features, amid the large variety of its tokens – even the well-formed tokens. Consider color and schedule again. They demonstrate that neither same spelling, same sense, nor same pronunciation prevails. As the preeminent nominalist Goodman observed, “Similarity, ever ready to solve philosophical problems and overcome obstacles, is a pretender, an impostor, a quack . . . . Similarity does not pick out inscriptions that are ‘tokens of a common type’ . . . . Only our addiction to similarity deludes us into accepting similarity as the basis for grouping inscriptions into the several letters, words, and so forth” (Goodman, 1972: 437–438).

Further undermining the reductive approach being considered is that each of the possible defining features mentioned (e.g., spelling, pronunciation) involves reference to types: letter types in the spellings, phoneme types in the pronunciation. (Types are defined in terms of each other.) These too would have to be analyzed away in terms of quantifications over particulars. If, as a last resort, one were to specify a massive disjunction that avoided all references to types, one that captured each and every token of the noun color, one would capture tokens of other words, too. Paraphrasing quantifications over word types would be extraordinarily difficult. The moral is that whatever word types are, they are indispensable.

See also: Grammar; Phonetic Pedagogy; Phonology: Overview; Type versus Token.

Bibliography

Armstrong D (1986). ‘In defense of structural universals.’ Australasian Journal of Philosophy 64, 85–88.
Asher N (1993). Reference to abstract objects in discourse. The Netherlands: Kluwer.
Bromberger S (1992). ‘Types and tokens in linguistics.’ In his On what we know we don’t know. Chicago: University of Chicago Press.
Chomsky N (1957). Syntactic structures. The Hague: Mouton & Co.
Collinge N E (ed.) (1990). An encyclopedia of language. London: Routledge.
Goodman N (1951/1977). Structure of appearance, 3rd edn. Dordrecht, Holland: Reidel.
Goodman N (1972). ‘Seven strictures on similarity.’ In his Problems and projects. Indianapolis: Bobbs-Merrill.
Goodman N & Quine W V (1947). ‘Steps toward a constructive nominalism.’ Journal of Symbolic Logic 12, 105–122. Reprinted in Goodman (1972).
Hale B (1987). Abstract objects. Oxford/New York: Basil Blackwell.
Hutton C (1990). Abstraction and instance: the type-token relation in linguistic theory. Oxford: Pergamon Press.
Katz J J (1981). Languages and other abstract objects. Totowa, NJ: Rowman and Littlefield.
Lewis D (1986a). ‘Against structural universals.’ Australasian Journal of Philosophy 64, 25–46.
Lewis D (1986b). ‘Comment on Armstrong and Forrest.’ Australasian Journal of Philosophy 64, 92–93.
Mish F C et al. (eds.) (1993). Merriam Webster’s collegiate dictionary (10th edn.). Springfield: Merriam Webster, Inc.
Murray J A H et al. (eds.) (1971). The Oxford English dictionary. Oxford: Oxford University Press.
Peirce C S (1931–58). Collected papers of Charles Sanders Peirce. Hartshorne & Weiss (eds.). Cambridge: Harvard University Press.
Quine W V (1953). ‘On what there is.’ In his From a logical point of view. Cambridge: Harvard University Press.
Quine W V (1987). Quiddities: an intermittently philosophical dictionary. Cambridge: Harvard University Press. 216–219.
Simons P (1982). ‘Token resistance.’ Analysis 42(4), 195–203.
Wetzel L (1993). ‘What are occurrences of expressions?’ Journal of Philosophical Logic 22, 215–220.
Wetzel L (2000). ‘The trouble with nominalism.’ Philosophical Studies 98(3), 361–370.
Wetzel L (2002). ‘On types and words.’ Journal of Philosophical Research 27, 239–265.
Wetzel L (in press). Types and tokens: an essay on universals. Cambridge: MIT Press.

Linguistic Rights
T Skutnabb-Kangas, Roskilde Universitetscenter, Roskilde, Denmark
© 2006 Elsevier Ltd. All rights reserved.

Central Concepts

All the rights that individuals, groups, organizations, and states have in relation to languages (their own or others) are linguistic rights or language rights (LRs). Languages may similarly have rights. Strictly speaking, only binding rights (coded in laws or regulations of various kinds) count. In most cases, these also include a duty-holder who has to see to it that the rights can be enjoyed. A state or a regional authority can, for instance, have the duty to organize education through the medium of a certain language for certain individuals or groups in a specific place. In addition to rights proper, there are many nonbinding recommendations, declarations, and other nice intentions and wishes about LRs.

An individual in a specific country may have the right to use her or his mother tongue in various contexts, for instance in dealing with authorities, local, regional, or state-wide, orally or in writing or both; the authorities do not necessarily need to reply in the same language. The mother tongue is often for legal purposes defined in a strict way, as the first language that a person learned, and still speaks, and with which s/he identifies. A definition often used in situations where forced assimilation of indigenous peoples has made the older generation speak the dominant language to their children is sometimes much less strict: a mother tongue, with LRs connected to it, is defined as a language which is, or has been, the first language of the individual her-/himself, or of (one of) the parents or grandparents.

Often, whether individuals (speakers or signers) belonging to a group have any LRs depends on how many of them there are in a country (area, region, municipality, etc.); the group has to have a certain size. Two of the most important European LRs documents, from the Council of Europe, use group size as a criterion, but do not in any way define it: the European Charter for Regional or Minority Languages and the Framework Convention for the Protection of National Minorities, both in force since 1998, use formulations such as “in substantial numbers” or “pupils who so wish in a number considered sufficient” or “if the number of users of a regional or minority language justifies it.” (See the latest news about these documents and their ratifications at the Council of Europe. The treaty numbers are 148 and 157.)

If an individual can use the right to all mother tongue medium services anywhere in her or his country, we speak of a principle of personality. Usually, only members of large groups with excellent protection have this kind of right, which in most cases is restricted to the dominant language speakers in a country. Often, such speakers are not even aware of how precious these rights are, and how unusual it is to possess such rights for any of the world’s linguistic minorities (or even some linguistic majorities: e.g., in several African countries, the old colonial languages still have more rights than the indigenous African languages have). If LRs are connected to a specific region, such as is the case in Switzerland, where German-speakers can use their language in official contexts only in certain cantons, while French, Italian, or Romansch-speakers can use their respective languages only in certain other cantons (where German speakers do not have the right to use German), we speak of a principle of territoriality. That means in practice that if Italian-speaking Swiss parents want their children to be educated through the medium of Italian, they have to live in the only canton (Ticino, or Tessin) where this is a right.
Many international organizations and most states have language policies that spell out the official languages of the organization or state and, by implication, the LRs of the people, groups, and states dealing with, and working within, that entity. The United Nations has six official languages, the Council of

212 Linguistic Reality Quine W V (1953). ‘On what there is.’ In his From a logical point of view. Cambridge: Harvard University Press. Quine W V (1987). Quiddities: an intermittently philosophical dictionary. Cambridge: Harvard University Press. 216–219. Simons P (1982). ‘Token resistance.’ Analysis 42(4), 195–203.

Wetzel L (1993). ‘What are occurrences of expressions?’ Journal of Philosophical Logic 22, 215–220. Wetzel L (2000). ‘The trouble with nominalism.’ Philosophical Studies 98(3), 361–370. Wetzel L (2002). ‘On types and words.’ Journal of Philosophical Research 27, 239–265. Wetzel L (in press). Types and tokens: an essay on universals. Cambridge: MIT Press.

Linguistic Rights T Skutnabb-Kangas, Roskilde Universitetscenter, Roskilde, Denmark ! 2006 Elsevier Ltd. All rights reserved.

Central Concepts

All the rights that individuals, groups, organizations, and states have in relation to languages (their own or others) are linguistic rights or language rights (LRs). Languages may similarly have rights. Strictly speaking, only binding rights (coded in laws or regulations of various kinds) count. In most cases, these also include a duty-holder who has to see to it that the rights can be enjoyed. A state or a regional authority can, for instance, have the duty to organize education through the medium of a certain language for certain individuals or groups in a specific place. In addition to rights proper, there are many nonbinding recommendations, declarations, and other statements of good intentions and wishes about LRs.

An individual in a specific country may have the right to use her or his mother tongue in various contexts, for instance in dealing with authorities, local, regional, or state-wide, orally or in writing or both; the authorities do not necessarily need to reply in the same language. The mother tongue is often for legal purposes defined in a strict way, as the first language that a person learned, and still speaks, and with which s/he identifies. The definition used in situations where forced assimilation of indigenous peoples has made the older generation speak the dominant language to their children is sometimes much less strict: a mother tongue, with LRs connected to it, is defined as a language which is, or has been, the first language of the individual her-/himself, or of (one of) the parents or grandparents.

Whether individuals (speakers or signers) belonging to a language group have any LRs often depends on how many of them there are in a country (area, region, municipality, etc.); the group has to have a certain size. Two of the most important European LRs documents, from the Council of Europe, use group size as a criterion, but do not in any way define it: the European Charter for Regional or Minority Languages and the Framework Convention for the Protection of National Minorities, both in force since 1998, use formulations such as ‘‘in substantial numbers’’ or ‘‘pupils who so wish in a number considered sufficient’’ or ‘‘if the number of users of a regional or minority language justifies it.’’ (See the latest news about these documents and their ratifications at the Council of Europe. The treaty numbers are 148 and 157.)

If an individual can use the right to all mother tongue medium services anywhere in her or his country, we speak of a principle of personality. Usually, only members of large groups with excellent protection have this kind of right, which in most cases is restricted to the dominant language speakers in a country. Often, such speakers are not even aware of how precious these rights are, and how unusual it is for any of the world’s linguistic minorities to possess them (or even for some linguistic majorities: e.g., in several African countries, the old colonial languages still have more rights than the indigenous African languages have).

If LRs are connected to a specific region, as is the case in Switzerland, where German speakers can use their language in official contexts only in certain cantons, while French, Italian, or Romansch speakers can use their respective languages only in certain other cantons (where German speakers do not have the right to use German), we speak of a principle of territoriality. In practice, this means that if Italian-speaking Swiss parents want their children to be educated through the medium of Italian, they have to live in the only canton (Ticino, also known as Tessin) where this is a right.
Many international organizations and most states have language policies that spell out the official languages of the organization or state and, by implication, the LRs of the people, groups, and states dealing with, and working within, that entity. The United Nations has six official languages, the Council of
Europe only two (English and French). The European Union has several times increased the number of its official languages, such that after its latest expansion, in May 2004, the Union now has 20 official languages; all official documents have to be made available in all of them. Many organizations also have working languages; their number may be more restricted. A number of states have only one official (or state) language; most have two or more (English is an official language in more than 70 states; see Skutnabb-Kangas, 2000). South Africa has 11 official languages, India 22. In addition, many states specify one or several national, additional, link, or national heritage languages in their constitutions; in most cases, these have fewer rights than the official languages have (see de Varennes, 1996).

Linguistic Human Rights (LHRs)

The very recent and still somewhat unclear concept of Linguistic Human Rights (LHRs) combines language rights (LRs) with human rights (HRs). LHRs are those (and only those) LRs which, first, are necessary to satisfy people’s basic needs (including the need to live a dignified life), and which, second, are therefore so basic, so fundamental, that no state (or individual or group) is supposed to violate them. Some basic rights prohibit discrimination on the basis of language (negative rights); others ensure equal treatment to language groups (positive rights).

There are many LRs which are not LHRs. It would, for instance, be nice if everybody could, even in civil court cases, have a judge and witnesses who speak (or sign) her or his language, regardless of how few users the language has. Today, it is mostly only in criminal cases that one has a linguistic HUMAN right to be informed of the charge against oneself in a language one understands (i.e., not necessarily the mother tongue); in all other contexts, people may or may not have a LANGUAGE right, depending on the country and language; in the best cases, interpreters paid for by the state are used. Likewise, it would be nice if the following demands were to be met:

‘‘All language communities are entitled to have at their disposal all the human and material resources necessary to ensure that their language is present to the extent they desire at all levels of education within their territory: properly trained teachers, appropriate teaching methods, textbooks, finance, buildings and equipment, traditional and innovative technology.’’

But such demands are completely unrealistic and cannot be considered part of LHRs. (They come from Article 25 of the 1996 Draft Universal Declaration of Linguistic Rights.) At this moment, only a few dozen language communities in the world have these kinds of rights; this has to be seen against the background of the fact that there are some 6500 to 7000 spoken languages (and perhaps an equal number of sign languages) in the world (see Ethnologue; see Skutnabb-Kangas, 2000, Chapter 1, for the unreliability of the statistics).

Two kinds of interest in LHRs can be distinguished. One is ‘‘the expressive interest in language as a marker of identity,’’ the other an ‘‘instrumental interest in language as a means of communication’’ (Rubio-Marín, 2003: 56). The expressive (or noninstrumental) language rights ‘‘aim at ensuring a person’s capacity to enjoy a secure linguistic environment in her/his mother tongue and a linguistic group’s fair chance of cultural self-reproduction’’ (Rubio-Marín, 2003: 56). It is only these rights that Rubio-Marín calls ‘‘language rights in a strict sense’’ (Rubio-Marín, 2003: 56); in other words, these could be seen as LHRs. The instrumental language rights ‘‘aim at ensuring that language is not an obstacle to the effective enjoyment of rights with a linguistic dimension, to the meaningful participation in public institutions and democratic process, and to the enjoyment of social and economic opportunities that require linguistic skills’’ (Rubio-Marín, 2003: 56). So far, it is not at all clear what should and what should not be considered LHRs, as the lively ongoing debates about the topic show.

Language is one of the four most important human characteristics (among many others that are also listed in human rights instruments) on the basis of which discrimination is never allowed (the others are gender, ‘race,’ and religion). Still, language often disappears in the educational paragraphs of binding HRs instruments. One example: the paragraph on education (Article 26) of the Universal Declaration of Human Rights (1948) does not refer to language at all.
Educational linguistic human rights, especially the right to mother tongue medium education, are among the most important rights for any minority. Without them, a minority whose children attend school usually cannot reproduce itself as a minority: it cannot integrate with the majority, but is forced to assimilate to it. Binding educational clauses of human rights instruments have more opt-outs, modifications, alternatives, etc., than other Articles of such instruments have. One example is the UN Declaration on the Rights of Persons Belonging to National or Ethnic, Religious and Linguistic Minorities of 1992 (emphases added: ‘obligating’ and positive measures in italics, ‘opt-outs’ in bold):

1.1. States shall protect the existence and the national or ethnic, cultural, religious, and linguistic identity of minorities within their respective territories, and shall encourage conditions for the promotion of that identity.
1.2. States shall adopt appropriate legislative and other measures to achieve those ends.
...
4.3. States should take appropriate measures so that, wherever possible, persons belonging to minorities have adequate opportunities to learn their mother tongue or to have instruction in their mother tongue.

The Council of Europe’s Framework Convention for the Protection of National Minorities and The European Charter for Regional or Minority Languages, both in force since 1998, also have many of these modifications, alternatives, and opt-outs. The Framework Convention’s education Article reads as follows (emphasis added): In areas inhabited by persons belonging to national minorities traditionally or in substantial numbers, if there is sufficient demand, the parties shall endeavor to ensure, as far as possible and within the framework of their education systems, that persons belonging to those minorities have adequate opportunities for being taught in the minority language or for receiving instruction in this language.

The opt-outs and alternatives (‘claw-backs’) in the Charter and the Convention permit reluctant states to meet the requirements in a minimalist way, something they can legitimize by claiming that a provision was not ‘possible’ or ‘appropriate,’ or that numbers were not ‘sufficient’ or did not ‘justify’ a provision, or that the state ‘allowed’ the minority to organize teaching of their language as a subject, but at their own cost.

Without binding educational linguistic human rights, most minorities have to accept ‘subtractive’ education through the medium of a dominant/majority language. In subtractive language learning, a new (dominant/majority) language is learned at the cost of the mother tongue, which is displaced, leading to diglossia and often to the replacement of the mother tongue. Diglossia means a situation with functional differentiation of languages, e.g., one at home and in the neighborhood, another for use at school and with authorities.

Assimilationist, subtractive education of indigenous and minority children is genocidal; it fits two of the five definitions of genocide in the UN International Convention on the Prevention and Punishment of the Crime of Genocide (E793, 1948): Article II(e): ‘‘forcibly transferring children of the group to another group,’’ and Article II(b): ‘‘causing serious bodily or mental harm to members of the group’’ (emphasis added; see Skutnabb-Kangas, 2000 for details).

The human rights system should protect people in the globalization process, rather than give market forces free rein. Human rights, especially economic and social rights, are, according to human rights lawyer Katarina Tomasevski (1996: 104), supposed to act as correctives to the free market. She claims that ‘‘The purpose of international human rights law is . . . to overrule the law of supply and demand and remove pricetags from people and from necessities for their survival.’’ These necessities for survival thus include not only basic food and housing (which would come under economic and social rights), but also basics for the sustenance of a dignified life, including basic civil, political, and cultural rights. It should, therefore, be in accordance with the spirit of human rights to grant people full linguistic human rights.

See also: Endangered Languages; Intercultural Pragmatics and Communication; Linguistic Decolonization; Pragmatics: Linguistic Imperialism; Social Aspects of Pragmatics.

Bibliography

Rubio-Marín R (2003). ‘Language rights: exploring the competing rationales.’ In Kymlicka W & Patten A (eds.) Language rights and political theory. Oxford: Oxford University Press. 52–79.
Skutnabb-Kangas T (2000). Linguistic genocide in education – or worldwide diversity and human rights? Mahwah, NJ: Lawrence Erlbaum Associates.
Tomasevski K (1996). ‘International prospects for the future of the welfare state.’ In Reconceptualizing the welfare state. Copenhagen: The Danish Centre for Human Rights. 100–117.
de Varennes F (1996). Language, minorities and human rights. Dordrecht: Martinus Nijhoff.

Relevant Websites

http://conventions.coe.int – Council of Europe.
http://www.sil.org – Ethnologue.
http://www.linguistic-declaration.org – Draft Universal Declaration of Linguistic Rights.


Linguistic Terminology
J B Walmsley, Universität Bielefeld, Bielefeld, Germany
© 2006 Elsevier Ltd. All rights reserved.

A precise terminology is a conditio sine qua non if one wants to understand grammatical facts . . . . (Jespersen, 1924: 315)

The terminology of a discipline consists of a set of concepts structuring the field, together with the labels associated with the concepts. Linguistic terminology is the terminology we use to describe the object ‘language.’ The study of linguistic terminology thus lies on the one hand at the intersection between terminology research in general (terminology science, terminology management, history of terminology, terminographical recording, etc.), and the study of language, or linguistics, on the other – linguistics being, as Firth remarked, nothing other than language turned back on itself. As objects of description, natural languages are unique in that, so far as we know, they all contain within themselves the resources they need to refer to themselves and to the ways in which they are used (Lyons, 1980: 293). Native speakers have the means to talk about their own language, speech, failures in communication, etc., as objects of discourse, and this ability appears to be a skill that children acquire as part of the natural process of language acquisition (Gombert, 1997). Seen from this perspective, any linguist wanting to provide a comprehensive account of the syntax and semantics, and even the pragmatics, of ‘a language,’ must also provide an account of its metalinguistic terms and uses. In discourse about the object ‘language,’ terminology needs to be distinguished both from the metalinguistic discourse of which it is a subpart, and from the (partly overlapping) ‘nomenclature’ (names for concepts): a nomenclature can be a list of names, whereas a terminology is a set of systems and subsystems of names. Interest in the metalinguistic functions of language in the Anglo-Saxon world has mainly concentrated on such things as type-token relations, token reflexivity, and the study of propositions (such as the truth conditions of sentences like ‘This sentence is false’). But metalinguistic uses of natural languages pervade much more extensive areas of everyday discourse. 
The boundary between the use of a natural language as a metalanguage for everyday purposes and its use for technical descriptive or theoretical purposes is not always easily drawn. This can be illustrated in simple, everyday terms: What is the meaning of ‘word’ in ‘English has a vocabulary of
half-a-million words,’ or in ‘This dictionary contains 55 000 words’? From a terminological point of view, the English lexicon contains different kinds of words (or, preferably, ‘lexemes’): lexemes whose function is almost entirely restricted to their technical use (‘adjective,’ ‘infinitive,’ ‘deixis’), lexemes that have both a technical sense and a less-well-defined everyday sense (‘word,’ ‘voice,’ ‘gender’), and the vast majority of other lexemes, which do not form part of the linguistic terminology. The problems arise with the first two categories, and it is this that distinguishes the terminology of linguistics from terminologies in the natural sciences. In the more mature natural sciences, terminology has kept pace with scientific developments, and it is based on widespread agreement as to the nature and status of the concepts concerned. Such agreement does not exist in linguistics. (For example, how many parts of speech are there? Should they be called ‘form classes,’ ‘word classes,’ or even ‘lexeme classes’? How many tenses does English have?) The evidence suggests that terminological agreement in linguistics is becoming less rather than more likely. Linguistic terminology thus faces three major problem areas: the categories underlying the terminology, the terms themselves, and the proliferation of terms, leading to the ‘translation problem.’ Attempts to deal with these problems have followed two major routes: normative or descriptive. Those taking the normative route seek to change the terminology either radically (Barnes, Fries) or by tinkering with it (Sonnenschein, Jespersen). Those taking the descriptive route write dictionaries of linguistic terminology (Nash, Gibbon). Between about 1885 and 1950, the simplification and harmonization of terminology was a topic of particular interest to linguists worldwide. 
With the foundation of the Birmingham Grammatical Society in 1885, Sonnenschein initiated a campaign to simplify and harmonize the terminologies of the main foreign languages taught in schools. The most significant outcome of this movement was the publication of the Parallel grammar series – a set of textbooks that ultimately comprised more than 25 volumes covering eight languages. The success of this series helped to stimulate similar work in other countries, which Sonnenschein tried to coordinate on an international scale. These efforts culminated in the publication of schemes of terminology for France (1910), England and Wales (1911), and the United States (1913) (Walmsley, 2001). Terminology became one of the main themes at the Sixth International
Congress of Linguists, Paris, 1948, and was also one of Fries’ targets (1952). Nevertheless, none of the normative proposals seems to have enjoyed significant success. The argument against the traditional Latinate terminology is that it invites the grammarian to press his or her description of a language into categories to which it is not suited. There is no a priori reason why the terminology should work like this. In principle, four strategies are available to us: we can invent new terms but apply them to the conventional categories; we can retain the conventional nomenclature but establish new categories; we can postulate new categories and devise a new terminology to go with them; or we can stick with the conventional categories and the old nomenclature. (This last option would be one way of viewing what we think of as ‘traditional grammar.’) To first address the problem of the categories (as opposed to their labels), the evidence shows that adopting the traditional categories of Latin grammar did indeed have a decisive influence on the history of English grammaticography. Levin described as a ‘‘fallacy . . . discussing the grammar of English on the basis of preconceptions derived from the grammar of another language (say Latin)’’ (Levin, 1960: 262). Following Levin, Crystal complained, ‘‘Often a grammarian would take over and work within a traditional frame of reference assuming that it was satisfactory for his purposes . . . a more conscious awareness of linguistic principles would have shown that it was not; . . . the best example . . . [of this is] . . . the attempt to describe modern languages as if they were variants of Latin’’ (Crystal, 1971: 69). Despite such admonitions, concepts not appropriate to – or even non-existent in – present-day English continue to be taught in introductory textbooks and teaching grammars, among them voice, case, a future tense, and subjunctive mood. 
As long ago as 1922, McKerrow, for instance, wrote, ‘‘if we were now starting for the first time to construct a grammar of modern English, without knowledge of or reference to the classics, it might never occur to us to postulate a passive voice at all’’ (McKerrow, 1922: 163; cf. Andersen, 1991). It is sometimes claimed that English nouns have ‘only’ two cases and that some pronouns have a distinctive form when used as objects (an ‘objective,’ i.e., accusative, form). Of the latter, Sweet wrote: ‘‘the so-called accusative of the personal pronoun is functionally not a case at all . . . The real difference between ‘I’ and ‘me’ is that ‘I’ is an inseparable prefix used to form finite verbs, while ‘me’ is an independent or absolute pronoun, which can be used without a verb to follow’’ (Sweet, 1875–1876: 495; cf. Hudson, 1995).

The limitations of the traditional core of terminology also mean that categories that could – or should – be recognized in English go unlabeled, or even unidentified. If we set up a table of the modal verbs (can, could, may, might, must, shall, should, will, would), for instance, we find that traditionally we have names for some syntagms (will go, ‘future’; should go, ‘subjunctive’), but not for others. ‘‘Why should the combinations would go and would have gone have special terms rather than might go and might have gone, or dared go, etc.? The only reason is that these forms serve to translate simple tense-forms of certain other languages’’ (Jespersen, 1924: 281). Observations such as these would seem to justify the Structuralists’ criticisms of earlier grammarians. Nevertheless, the evidence shows that earlier grammarians, far from being the ill-informed, complacent practitioners pictured by Crystal et al., were well aware of the difficulties inherent in changing the categories and the nomenclature: ‘‘there is probably no language so different from the English as the Latin is . . .’’ (Ward, 1767: iv); ‘‘if an English Grammar were to be made for the sole Purposes of those who propose to learn, and to use no Language but the English only, it might be put into a different Form from that of the Grammars of the learned Languages’’ (Ward, 1767: vii). The problem Ward saw was that to the extent that we set up categories and terminologies outside the Latin tradition, we create difficulties in cross-linguistic comparison. And it was this crosslinguistic comparability that earlier grammarians exploited when they first began to teach Latin through the medium of English in the 14th century (Walmsley, 2004). Since much of the teaching was done through translation, it is not surprising that pupils were taught to discover in English exactly those categories that already existed in Latin. 
And when the first grammar of English to be written in English – William Bullokar’s – appeared about two centuries later (Turner, 1980), again, it was not ignorance that induced the author to model his description so closely on Latin; rather, it was his desire to prove that English, too, could be ‘‘. . .‘a perfect ruled tongue, / conferable to grammar art / as any ruled long’. English [was] as orderly and subjectable to grammatical rules as any language that has been the object of prolonged teaching’’ (Robins, 1994: 21). Concerning the names for the concepts, grammarians have been criticized for ‘‘obscuring the grammatical requirements of English by discussing facts of English form in the vocabulary designed to codify Greek and Latin’’ (Dinneen, 1967: 170) – in other words, for using Latinate labels for the concepts. Again, earlier grammarians were not unaware of the problem: ‘‘It has been asserted . . . that in order to
produce a complete Grammar of our language, we should imitate the Latin in the formation of our Cases, Moods, and Tenses; whilst others . . . affirm that the English tongue has no similitude whatever with the Latin in this respect, and therefore they are for preventing our making use of the common technical terms usually employed in Grammars of our language’’ (Goldsmith, in Wiseman’s Complete English Grammar, 1764: Preface; quoted in Michael, 1970: 502–503). The difficulty lies in finding appropriate alternative terms. This has not prevented reformers from proposing ‘native’ English alternatives to terms of Latin origin. Barnes (1878), for instance, included a list of native alternatives for more than 350 Latinate words, including terms for grammar. But it is difficult to see what advantage a native terminology brings, if coupled to the traditional categorial framework. The most radical attempt to break out of the system was that of Fries (1952). On the basis of strictly distributional criteria, he postulated four major parts of speech and 15 classes of function words, which he labeled 1–4 and A–O, respectively. Fries warned against equating his word classes with those of traditional grammar: ‘‘The reader, familiar with the conventional grammar, will probably attempt . . . to equate these class numbers with the usual names, ‘nouns,’ ‘verbs,’ ‘adjectives,’ and ‘adverbs’ . . . If he does, he will certainly find increasing difficulty in the following chapters’’ (Fries, 1952: 87; cf. Fries, 1952: 2). Fries’s scheme, however, did not succeed in replacing the old Latinate terms – testimony to the robustness of the traditional terminology. The third and final problem area concerns the proliferation of linguistic terminologies. The 20th century saw an unprecedented increase in the number of grammatical theories. The interested customer could choose from about 50 or 60 theories of grammar. 
These theories frequently draw the boundaries between themselves and neighboring theories in part at least by means of their terminology. Here, as elsewhere, it is not simply a question of choosing a new label for the same concept: if a term has a place in a different system, it becomes a different concept. ‘Subject’ does not mean the same thing in a system in which it is a primitive as it does in a system in which it is a derived category, as is the case in Government-(and)-Binding (GB) theory. The proliferation of grammar theories leads to an apparent paradox: if each theory has its own terminology – either introducing new terms or redefining old ones – how is it that linguists of different persuasions can communicate with one another? The language of any specific theory can be viewed as a sort of sub-metalanguage. Linguists recognize when a
colleague is ‘speaking GPSG,’ for instance, and informal translations are not infrequently provided to make the theory comprehensible to linguists of other schools. For example, ‘To put the same thing in HPSG terms . . .’ or ‘In Cognitive Grammar terms, this would be . . . .’ For understanding to take place, linguists must be as familiar as possible with both theories, and specifically with the points of difference between them. When the adherent of one theory speaks to one of a different persuasion, the hearer needs to make ongoing adjustments – that is, make a sort of mental translation – to the running text, to take account of the differences. The current core of traditional linguistic terminology is defective insofar as it perpetuates the description of categories not attested in the languages being described and makes other categories invisible or difficult to identify. It does not reflect the individuality of specific languages, nor is it flexible in the sense of being easy to modify. It does, though, offer a coherent system as a basis for comparing languages, and ‘‘what is generally referred to as ‘traditional grammar’ . . . is much richer and more diverse than is often suggested in the cursory references made to it by many modern handbooks of linguistics’’ (Lyons, 1968: 3). In this function, the core terms act rather like empty shells or cases into which the particular sub-metalinguistic meanings for individual theories can be inserted. Further terms are added as the theory requires. From a different perspective, the lack of flexibility reveals itself as a remarkable robustness, both synchronically and through time. Provided that certain conditions are met, the terminology permits communication across theories and across languages (typology). Similarly, it facilitates accessibility to linguistic theories and descriptions over a long period of western cultural history. 
Its core of traditional terms is thus part of the glue that holds the discipline of linguistics together: without this common terminological core, the discipline would fragment into a myriad of mutually unintelligible metalanguages.

See also: Bullokar, William (c. 1531–1609); Case; Firth, John Rupert (1890–1960); Fries, Charles Carpenter (1897–1967); Jespersen, Otto (1860–1943); Lyons, John, Sir (b. 1932); Sonnenschein, Edward Adolf (1851–1929); Sweet, Henry (1845–1912); Traditional Grammar.

Bibliography

Andersen P K (1991). A new look at the passive. Frankfurt am Main: Lang.
Barnes W (1878). An outline of English speech-craft. London: C. Kegan Paul & Co.
Crystal D (1971). Linguistics. Harmondsworth: Penguin.
Dinneen F P, S J (1967). An introduction to general linguistics. New York: Holt, Rinehart & Winston.
Fries C C (1952). The structure of English. New York: Harcourt, Brace & Co.
Gibbon D (ed.) (2000). Handbook of multimodal and spoken dialogue systems: resources, terminology and product evaluation. Boston: Kluwer.
Gombert J E (1997). ‘Metalinguistic development in first language acquisition.’ In Van Lier L & Corson D (eds.) Encyclopedia of language and education. Vol. 6. Knowledge about language. Dordrecht: Kluwer. 43–51.
Hudson R A (1995). ‘Does English really have case?’ Journal of Linguistics 31, 375–392.
Jespersen O (1924). The philosophy of grammar. London: George Allen & Unwin.
Levin S R (1960). ‘Comparing traditional and structural grammar.’ College English 21, 260–265.
Lyons J (1968). Introduction to theoretical linguistics. Cambridge: Cambridge University Press.
Lyons J (1980). ‘Review of Josette Rey-Debove, Le métalangage: Étude linguistique du discours sur le langage. (Collection L’ordre des mots.) Paris: Le Robert, 1978. Pp. 318.’ Journal of Linguistics 16, 292–300.
McKerrow R B (1922). ‘English grammar and English grammars.’ Essays and Studies VIII, 148–167.
Michael I (1970). The English grammatical categories and the tradition to 1800. Cambridge: Cambridge University Press.
Nash R (1968). Multilingual lexicon of linguistics and philology: English, Russian, German, French. Coral Gables, FL: University of Miami Press.
Robins R H (1994). ‘William Bullokar’s Bref grammar for English: Text and context.’ In Blaicher G & Glaser B (eds.) Anglistentag 1993 Eichstätt. Proceedings. Tübingen: Niemeyer. 19–31.
Sonnenschein E A (1916). A new English grammar based on the recommendations of the Joint Committee of Grammatical Terminology. Oxford: Clarendon Press.
Sweet H (1875–1876). ‘Words, logic and grammar.’ Transactions of the Philological Society 1875–6, 470–503.
Turner J R (ed.) (1980). The works of William Bullokar II. Leeds: Leeds University Press.
Walmsley J (1999). ‘English grammatical terminology from the 16th century to the present.’ In Hoffmann L, Kalverkämper H & Wiegand H E (eds.) Fachsprachen-Languages for Special Purposes. Ein Handbuch zur Fachsprachenforschung und Terminologiewissenschaft. Berlin & New York: Walter de Gruyter. 2494–2502.
Walmsley J (2001). ‘The ‘‘entente cordiale grammaticale’’ – 1885–1915.’ In Colombat B & Savelli M (eds.) Métalangage et terminologie linguistique. Leuven: Peeters. 499–512.
Walmsley J (2004). ‘Latein als Objektsprache, Englisch als Metasprache in spätmittelalterlichen grammatischen Texten.’ In Hassler G & Volkmann G (eds.) History of linguistics in texts and concepts, 2 vols. Muenster: Nodus. Vol. II, 455–467.
Ward W (1767). A grammar of the English language in two treatises. York: Etherington.

Linguistic Theory in the Later Middle Ages
M Amsler, University of Wisconsin, Milwaukee, WI, USA
© 2006 Elsevier Ltd. All rights reserved.

Beginning in the late 11th century, language study was divided between pedagogical grammar, instructing nonnative speakers in Latin literacy and speaking, and theoretical (speculative) grammar, which explored the causes of imposition, modes of signification, and the relations between language, concepts, and reality. Linguistic theory was developed in commentaries on Aristotle and Priscian and treatises on universal grammar. More than Alcuin and other Carolingian grammarians, late medieval ‘higher grammar’ expanded logical approaches to language and grammar by making greater use of Priscian’s Institutiones grammaticae (c. 500) and newly available Aristotle texts (Metaphysics, Sophistical Refutations) along with the ‘old logic’ (De interpretatione, Categories, Topics).

Late medieval linguistic theory combined ancient logical and grammatical approaches to language, with Latin as the exemplar. Both Aristotelian logic and Roman grammar maintained a theory of the arbitrary linguistic sign. Aristotle defined language from the point of view of the perceiver, and De interpretatione (with Boethius’ commentary) became a base text: ‘‘Spoken sounds are symbols of affections in the soul, and written marks [are] symbols of spoken sounds . . . Speech is signifying sound, whose parts signify separately as words but not as propositions . . .’’ (De interpretatione, 16a). Aristotle’s conceptualist theory of language was fundamental to late medieval linguistic theory. Priscian defined word classes (partes) and construction (oratio) from the point of view of the speaker: ‘‘[Construction is] the comprehension of words in their most appropriate arrangement, with letters combined to form syllables, syllables combined to form words, and words combined to form sentences’’ (Inst. gram., 17.3). Unlike pedagogical grammarians, Aristotle and Priscian

Linguistic Theory in the Later Middle Ages
M Amsler, University of Wisconsin, Milwaukee, WI, USA
© 2006 Elsevier Ltd. All rights reserved.

Beginning in the late 11th century, language study was divided between pedagogical grammar, which instructed nonnative speakers in Latin literacy and speaking, and theoretical (speculative) grammar, which explored the causes of imposition, modes of signification, and the relations between language, concepts, and reality. Linguistic theory was developed in commentaries on Aristotle and Priscian and in treatises on universal grammar. More than Alcuin and other Carolingian grammarians, late medieval ‘higher grammar’ expanded logical approaches to language and grammar by making greater use of Priscian’s Institutiones grammaticae (ca. 500) and newly available Aristotle texts (Metaphysics, Sophistical Refutations) along with the ‘old logic’ (De interpretatione, Categories, Topics).

Late medieval linguistic theory combined ancient logical and grammatical approaches to language, with Latin as the exemplar. Both Aristotelian logic and Roman grammar maintained a theory of the arbitrary linguistic sign. Aristotle defined language from the point of view of the perceiver, and De interpretatione (with Boethius’ commentary) became a base text: “Spoken sounds are symbols of affections in the soul, and written marks [are] symbols of spoken sounds . . . Speech is signifying sound, whose parts signify separately as words but not as propositions . . .” (De interpretatione, 16a). Aristotle’s conceptualist theory of language was fundamental to late medieval linguistic theory. Priscian defined word classes (partes) and construction (oratio) from the point of view of the speaker: “[Construction is] the comprehension of words in their most appropriate arrangement, with letters combined to form syllables, syllables combined to form words, and words combined to form sentences” (Inst. gram., 17.3). Unlike pedagogical grammarians, Aristotle and Priscian

interrogated grammatical metalanguage and criteria for defining linguistic forms. In the 12th century, William of Conches, Peter Helias, and the anonymous author of the Glosulae wrote commentaries on Priscian focusing on the definitions of word classes, syntactic structures, and the relation between sentence meaning (sense) and reference. These Paris-based precursors of the modistae, or speculative grammarians, focused on modes of signifying (modi significandi), including verbal (grammatical) and extraverbal (semantic) features, to reframe traditional word class definitions. Two decades earlier, Abelard had challenged Aristotelian realism by arguing that modes of signifying and Aristotle’s categories (motion, rest, quantity, etc.) were features not of reality but of how language describes reality (Logica 116.35–117.2). William and Peter Helias took a different approach. In Summa super Priscianum (1140), Helias used the modes of signifying to account for the ontological ‘causes’ of the parts of speech. Nouns signify substance subsisting in matter and also signify the combination of matter and form. After the original (arbitrary) imposition of nouns to signify substance, people ‘enlarged’ speech when they used nouns to signify things ‘as if’ they are substances, ‘in the mode of substance (modo substantiae) . . . without tense, in a common or proper or almost common or proper manner’ (Helias, 1993: 120–21). Verbs signify that which acts or is acted on or semantic primitives ‘in the mode of’ acting or being acted on, with tense markers (Helias, 1993: 196). Participles share features of nouns and verbs: they signify acting or being acted on through time, but they also deploy the mode of signifying the actor (agentus) or the one acted on (patius) and thus display nominal signification (Helias, 1993: 449). 
In the 13th and 14th centuries, theoretical grammarians known as modistae, based in Paris, Bologna, and Erfurt, systematized this reanalysis of the modi significandi, parts of speech, and syntactic relations into a complex theory of language and grammar. Like William of Conches and Roger Bacon, modistae such as Michel de Marbais (fl. 1240–1290), Boethius of Dacia (fl. 1265–1280), Martin of Dacia (d. 1304), Thomas of Erfurt (fl. 1300), and Siger de Courtrai (d. 1341) argued that grammar was properly concerned with underlying structures common to all languages. Universal grammar is a speculative (theoretical) science that treats the parts of speech as aspects of syntax and procedures for combining words into grammatical sentences (congruity, government) that correspond in some respect to the nature of reality. Unlike pedagogical grammarians, the modistae used both auctores and made-up Latin sentences to

illustrate grammaticality. The modistae accepted that language is a set of arbitrary signs signifying mental concepts directly and reality only indirectly. They produced a coherent theory of language that systematically linked word class properties with the properties of mental concepts and then with the properties of universals standing behind material instantiations. They took as given the first imposition of names on reality (things, qualities, action); second imposition designates words according to forms and functions in sentences (grammatical difference). Following Aristotle and Porphyry, the modistae argued that such impositions could be necessary or arbitrary, depending on the degree of realist correspondence between words and things. Many speculative grammar treatises and commentaries retain the organization of Priscian’s Latin grammar, with sections on orthography (vox, littera, syllaba), etymology (partes orationis), syntax, and prosody, but they add another level to grammatical explanation by exploring the modes of signifying, the causes or rationales for differentiating word classes, and the bases for how each linguistic category relates to mental and external reality. As theoreticians, the modistae and their critics were most interested in universal grammar and syntax. Although less concerned with linguistic sound (phonology), some modistae, following Aristotle, stated that speech is materially percussed, disturbed air, which is moved by the soul (seat of intellect) using the vocal tract as an instrument. Some speculative grammarians took note of different Latin and vernacular pronunciations or the physiology of articulation (based on Arabic commentaries on Aristotle). Modistic linguistic theory related word classes to reality and to principles of syntax. The properties of things as they exist (modi essendi) are perceived by the intellect and conceptualized as modi intelligendi. 
These perceptions and concepts are reflected in the structures of Latin grammar and expressed in speech through the modi significandi, whose formal elements constitute an arbitrary sign system. Spoken words unite voice and meaning (dictiones) as signs of concepts. The correspondence of the structure of grammar to the nature of thought and reality can be represented as follows (from Law, 2003: 175):

THING (res) in the world: properties of the thing (modi essendi, ‘modes of being’)
CONCEPT (res intellecta) in the mind: properties of the concept (modi intelligendi, ‘modes of understanding’)
SIGNIFIED (res significata) in the word: properties of the signified (modi significandi, ‘modes of signifying’)

Because the modi significandi directly signify mental concepts and not things themselves, the intellect can also form words that do not correspond to external reality and can attribute to them morphological features of the appropriate word class’s modes of signifying (e.g., nouns chimaera, nullus). Modistic syntactic theory was built around the theory of constructions. Every part of speech is ordered to be combinable with others to form a construction. Following Aristotle, the modistae argued that every construction is comprised of a subject and a predicate (in contemporary linguistics, NP and VP). Earlier, Helias had distinguished utterance congruity (congrua voce) from conceptual congruity (congrua sensu) (Helias, 1993: 832). In the 13th century, Boethius of Dacia, contradicting Priscian, argued that impersonal expressions (fulminat ‘it lightens’) are similar to intransitive expressions (currit, ‘she runs’). Both constructions are imperfectly realized on the level of utterance (vox) yet conceptually complete (secundum intellectum) because the grammatical subjects are understood (Boethius of Dacia, Modi, q. 11, ad. 2, 46–48). Congruity and regimen (government) are the principal features of modistic syntactic theory. Helias defined regimen in syntactic terms, while he used congruitas in both syntactic and semantic contexts. Congruity demands that (1) two or more words related syntactically agree with the appropriate morphology (gender, number, case, etc.) required by their syntactic functions, or (2) the words convey a conceivable (but not necessarily true) meaning, or (3) both. Thus, constructions can be perfect grammatically and incomplete in sense, or vice versa. Ovid’s expression Turba ruunt (‘the crowd run’) illustrates how a sentence can be congruent in sense but not grammatically. Regimen determines that one word will ‘force’ another word to be included in the utterance with a certain case or form in order to complete (perfect) the construction. 
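The difference between congruity and regimen can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the `Word` record, the `congruent` and `perfect` helpers, and the feature values are hypothetical names introduced here, congruity is reduced to subject-verb agreement in number (the semantic sense of congruitas is not modeled), and regimen is reduced to a transitive verb requiring an object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    """A toy lexical entry; the feature set is an assumption for illustration."""
    form: str
    number: str               # "sg" or "pl"
    transitive: bool = False  # only meaningful for verbs


def congruent(subject: Word, verb: Word) -> bool:
    """Grammatical congruity, reduced here to agreement in number."""
    return subject.number == verb.number


def perfect(verb: Word, obj: Optional[Word]) -> bool:
    """Regimen, reduced here to: a transitive verb demands an object."""
    return (not verb.transitive) or obj is not None


turba = Word("turba", "sg")
ruunt = Word("ruunt", "pl")
socrates = Word("Socrates", "sg")
accusat = Word("accusat", "sg", transitive=True)

# Turba ruunt: singular subject with plural verb, so not grammatically congruent.
print(congruent(turba, ruunt))
# Socrates accusat: subject and verb agree, but the transitive verb has no object.
print(congruent(socrates, accusat), perfect(accusat, None))
```

Keeping the two checks as separate functions mirrors the point that a construction can pass one test and fail the other.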
Socrates accusat is a congruent construction in that the subject and verb agree grammatically, but the sentence does not generate a perfect sense because the regens, accusat, is a transitive verb and determines an object for the perfected construction (see Helias, 1993: 835, 1051). The modistae distinguished at least two grammatical levels for each part of speech, structured by several modes of signifying. The essential modi significandi constitute words’ being, while accidental modes constitute their inflection. Essential modes are subdivided into general (common to several parts of speech) and specific (particular to an individual part). Nouns and pronouns have in common some modes of signifying in that they ‘‘signify by the mode of habit and quietness [stability and permanence]’’; verbs and participles ‘‘signify by the mode of motion and becoming’’

(Sophismata, I, 43?). At the next level of essential signification, nouns are distinct from pronouns in that nouns deploy ‘the mode of determinate apprehension’ (qualifying and defining substance); pronouns deploy the ‘mode of indeterminate apprehension’ (Simon of Dacia, Martin of Dacia). And so forth. The accidental modes of signifying correspond generally to the traditional accidents of Roman grammar. Most accidents establish how an individual part can be used in a construction (respectivi, syntactic inflections), but some parts’ accidents mark grammatically how the word relates to the modes of being of the things signified (absoluta, semantic features such as that the noun nullus rather than a verb signifies ‘nothing’ as a substance) (Thomas of Erfurt, Gram. spec., 150). The modistae developed a strong syntactic theory of transitivity and voice, based on dependens (dependency). Every construction has at least one dependent that can have one of two modes of signifying: either the dependent goes to the first constructible or to an element that in turn goes to the first constructible (subject), or the dependent does not go to the first constructible (Martin of Dacia, Modi signif, ss. 203–211, 90–94). The theory of transitivity and syntactic firsts and seconds suggests how the modistae mostly adopt vernacular and ecclesiastical Latin’s SVO word order as ordo naturalis (‘natural order’), in contrast to ordo artificialis (‘artificial order,’ often inverted word order). The first mode is the principle of an intransitive, the second of a transitive construction. The modistae’s use of the terms transitivus and intransitivus, very different from Priscian’s, is based on Aristotle’s account of action. In a transitive construction, the action passes into a (direct) object; in an intransitive construction, the action is immanent in the subject. As moderate realists, the modistae argued that verbs’ grammatical inflections (currit = 3rd pers. sing. pres.) 
mark potential action which, depending on the sentence, can either be drawn from the verb and transferred to another constructible or remain immanent in the verb. Thus in the sentence Socrates legit librum (‘Socrates reads a book’), the relation between the first (subject) noun and the verb is a constructio intransitiva, whereas the relation between the verb and the second (object) noun is a constructio transitiva in that the action of reading is passed from Socrates to the book (Thomas of Erfurt, Gram. spec., cc. 47–48). Transitivity and dependency were also key to modistic phrase structure, as the last example sentence indicates. In the sentence homo albus currit bene (‘the white man runs well’), the adjective and verb depend immediately on the subject noun, while the adverb depends immediately on the verb, which

draws it back to the subject noun. The adjective and the adverb are ‘determinants’ of the syntactic elements to which they are attached, as marked by inflection or position. The noun phrase homo albus is analyzed as an ‘intransitive construction of persons’ (constructio intransitiva personarum), while homo currit is analyzed as an ‘intransitive construction of acts’ (constructio intransitiva actuum). According to Radulphus Brito (Quaestiones super Priscianum minorem, q. 24) and others, only the latter sort of constructions form propositions. The clause Si Socrates currit (‘If Socrates runs’) is a dependent because the conjunction creates an expectation in the listener that more will follow (Thomas of Erfurt, Gram. spec., c. 54). In modistic theory, the conjunction si’s syntactic function is governed by elements outside the dependent clause in which the conjunction occurs. Thus the modistae derived the logical structure of propositions from underlying grammar. They did not restrict dependency and transitivity to the level of word class or morphology, but rather argued they are fundamental to construction itself. However, not all modistae agreed that every type of word (e.g., conjunction) had a dependency feature that must be closed or perfected in a sentence. The modistae worked out a coherent, sophisticated grammatical theory that exceeded Priscian’s largely morphological analysis and established new metalanguage for descriptive parsing and theory of language. Their syntactic theory distinguished surface from underlying structures, material utterance from grammatical congruity and conceptual meaning, form from function, and linear from logical word order. Thus, the modistae developed a theory of grammar that increasingly emphasized the autonomy and interrelation of formal, verbal features at the level of both word and sentence. 
However, some modistic theorists, as well as intentionalist critics of the modistae, relied on the apprehensions of users (speakers, writers, listeners, readers) to complete the sense of an utterance according to internal grammatical rules, constructions of possible meaning, or discursive contexts. Turba ruunt might be ungrammatical (lacking subject-verb agreement), but the sense of a crowd as a collective entity makes the sentence comprehensible. Aqua, aqua (‘Water, water!’) might be ‘imperfect,’ even ‘ungrammatical,’ but when shouted by someone whose house is afire, the utterance is understandable and effective.

See also: Aristotle and the Stoics on Language; De Dicto versus De Re; Language of Thought; Logic and Language: Philosophical Aspects; Marbais, Michel de (fl. 1255); Martin of Dacia (d. 1304); Peter Helias (12th Century A.D.); Philosophy of Language, Medieval; Priscianus Caesariensis (d. ca. 530); Realism and Antirealism; Roman Ars Grammatica; Siger de Courtrai (d. 1341); Thomas of Erfurt (fl. 1300); William of Conches (ca. 1080–1154); Word Classes/Parts of Speech: Overview.

Bibliography
Bursill-Hall G (1971). Speculative grammars of the middle ages: the doctrine of the partes orationis of the Modistae. The Hague and Paris: Mouton.
Covington M (1984). Syntactic theory in the high middle ages: Modistic models of sentence structure. Cambridge: Cambridge University Press.
Ebbesen S (1980). ‘Is ‘canis currit’ ungrammatical? Grammar in Elenchi commentaries.’ Historiographia Linguistica 7, 53–68.
Fredborg K M (1980). ‘Universal grammar according to some 12th-century grammarians.’ Historiographia Linguistica 7, 69–84.
Fredborg K M (1988). ‘Speculative grammar.’ In Dronke P (ed.) A history of twelfth-century western philosophy. Cambridge: Cambridge University Press. 177–195.
Helias P (1993). Summa super Priscianum (2 vols). Reilly L A (ed.). Toronto: Pontifical Institute of Mediaeval Studies.
Hunt R W (1941, 1950). ‘Studies in Priscian in the eleventh and twelfth centuries.’ Medieval and renaissance studies 1, 194–231 and 2, 1–56. (Rpt. in Hunt R W [1980]. The history of grammar in the middle ages. Bursill-Hall G (ed.). Amsterdam: Benjamins.)
Kneepkens C H (1990). ‘Transitivity, intransitivity and related concepts in 12th-century grammar: an explorative study.’ In Bursill-Hall G et al. (eds.) De ortu grammaticae: studies in medieval grammar and linguistic theory in memory of Jan Pinborg. Amsterdam and Philadelphia: John Benjamins. 161–190.
Kelly L (1977). Quaestiones Alberti de modis significandi. Amsterdam: John Benjamins.
Kelly L G (2002). The mirror of grammar: theology, philosophy and the Modistae. Amsterdam and Philadelphia: John Benjamins.
Law V (2003). History of linguistics in Europe: from Plato to 1600. Cambridge: Cambridge University Press. 175.
Maierù A (1994). ‘Medieval linguistics: The philosophy of language.’ In Lepschy G (ed.) History of linguistics 2: classical and medieval linguistics. Sansone E (trans.). London: Longman. 272–315.
Marmo C (1995). ‘A pragmatic approach to language in Modism.’ In Ebbesen S (ed.) Sprachtheorien in Spätantike und Mittelalter. Tübingen: Gunter Narr. 169–183.
McDermott A & Senape C (1980). Godfrey of Fontaine’s abridgement of Boethius of Dacia’s Modi significandi sive quaestiones super Priscianum maiorem. Amsterdam and Philadelphia: John Benjamins.
Percival K (1976). ‘Deep and surface structure concepts in renaissance and medieval syntactic theory.’ In Parret H (ed.) History of linguistic thought and contemporary linguistics. Berlin and New York: de Gruyter. 238–253.

Pinborg J (1967). Die Entwicklung der Sprachtheorie im Mittelalter. Münster: Aschendorff/Copenhagen: Frost-Hansen.
Rodríguez E P (2002). ‘Speculations about the potestas litterarum in medieval grammar (11th through 13th centuries).’ Historiographia Linguistica 29, 293–327.
Rosier I (1982). ‘La théorie médiévale des Modes de signifier.’ Langages 65, 117–128.
Rosier I (1983). La grammaire spéculative des Modistes. Lille: Presses Universitaires de Lille.
Rosier I (1994). La parole comme acte: sur la grammaire et la sémantique au XIIIe siècle. Paris: J. Vrin.
Thomas of Erfurt. (1972). Grammatica speculativa. Bursill-Hall G (trans.). London: Longman.

Linguistic Universals, Chomskyan
S Cristofaro, University of Pavia, Pavia, Italy
© 2006 Elsevier Ltd. All rights reserved.

One of the key assumptions of the Chomskyan approach to the study of language is the existence of Universal Grammar, that is, as Chomsky (1976: 29) puts it, a set of principles, rules, or conditions that are elements or properties of all human languages. In fact, as is pointed out by Jackendoff (2002: Ch. 4), the notion of Universal Grammar is used in the literature in three distinct, albeit related, senses. First (for example, in Chomsky, 1965) Universal Grammar has been used to refer to a set of features that all languages have in common. These features can be regarded as universals of language proper, and they are of two types: features pertaining to the types of rules and constraints that have to be present in the grammar, such as phrasal formation rules, derivation rules, and constraints thereon; and features pertaining to the material that provides the basic building blocks of linguistic structure, such as phonological distinctive features in phonology, or parts of speech and the notion of syntactic tree in syntax. In Chomsky (1965) these two feature types are indicated as formal and substantive universals respectively. The other two senses of the notion of Universal Grammar specifically pertain to the way in which speakers acquire their language. Universal Grammar is also used (for example, in Chomsky, 1972) to refer to the initial state of the mind of a language learner that makes language acquisition possible. This initial state involves knowledge of the complete range of possible human grammars, from which the language learner selects a target grammar in response to the linguistic data to which he or she is exposed. Finally, in its most common sense, Universal Grammar is taken to correspond to the initial state of the mind of a language learner along with the machinery (which Chomsky calls Language Acquisition Device)

that makes it possible for the speaker to move from the initial state to the final target grammar. The notion of Universal Grammar, in its various senses, implies that there are two universal aspects of human language. On the one hand, there is a set of features common to all languages. On the other hand, there is a universal mechanism that makes language acquisition possible. This mechanism crucially involves knowledge of the range of possible grammars, including the features common to all languages, as well as the machinery that allows the language learner to select the target grammar. The latest versions of generative theory, Principles and Parameters and Minimalism (Chomsky, 1981; Chomsky, 1995), assume that Universal Grammar, intended as the initial state of the mind of the language learner along with the Language Acquisition Device, includes specification of a number of universal principles, each associated with an open value parameter. Individual parameters may have different values in different languages, which are triggered by the linguistic input to which the language learner is exposed. Interaction between the universal principles, the open value parameters, and the input that triggers specific values of these parameters makes it possible for the learner to acquire the target grammar. The inventory of principles and parameters that are taken to be part of Universal Grammar has evolved over the decades, and different authors take different views. While some authors (e.g., Baker, 2001) have rather elaborate lists of principles and parameters, there has been a general tendency (exemplified, for instance, by Chomsky, 1995 and Jackendoff, 2002) to reduce the inventory of elements posited in Universal Grammar as much as possible, in order to posit the smallest set of elements that can still account for the data. A classical example of an open value parameter is provided by the so-called head parameter. 
The head parameter is associated with X-bar theory (Chomsky,

1981), a theory of syntax that assumes a universal principle whereby a phrase always contains a head of the same lexical category (a noun, verb, or preposition) along with possible complements. The head parameter distinguishes between languages in which complements follow the head, as in the English example in (1), and languages in which complements precede the head, as in the Japanese example in (2).

(1) eat an apple

(2) Japanese
    ringo-o     tabe-ru
    apple-ACC   eat-NONPAST
    ‘eat an apple’ (Fukui, 1995: 329)
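The contrast between (1) and (2) can be sketched in a few lines of code. This is a minimal illustration, not a model from the generative literature: the `Phrase` record and the `linearize` function are hypothetical names introduced here, and the head parameter is reduced to a single Boolean that decides whether the head precedes or follows its complement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phrase:
    """The universal part: every phrase has a head and a complement."""
    head: str
    complement: str


def linearize(phrase: Phrase, head_initial: bool) -> str:
    """The language-specific part: word order follows the parameter value."""
    if head_initial:
        return f"{phrase.head} {phrase.complement}"
    return f"{phrase.complement} {phrase.head}"


english_vp = Phrase(head="eat", complement="an apple")
japanese_vp = Phrase(head="tabe-ru", complement="ringo-o")

print(linearize(english_vp, head_initial=True))    # eat an apple
print(linearize(japanese_vp, head_initial=False))  # ringo-o tabe-ru
```

The structure of `Phrase` is the same for both languages; only the value passed for `head_initial` differs, which is the division of labor between principle and parameter that the text describes.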

Depending on the external linguistic input he or she receives, the language learner selects the value ‘head-initial’ or ‘head-final’ for the target language. In this way, the universal principle that all phrases contain a head along with possible complements interacts in the learning process with the input about the value of the head parameter in the target language, yielding the correct grammar for the target language. This model is meant to simultaneously account for universals of language and the actual diversity of languages, as revealed by cross-linguistic investigation. Universals of language are found in the principles and open value parameters that are part of Universal Grammar, while the range of possible values for each parameter determines actual cross-linguistic variation. A similar model is proposed in Optimality Theory (see, e.g., Prince and Smolensky, 1993), where cross-linguistic variation is accounted for by postulating different rankings of a number of universal principles in different languages. Insofar as it postulates a set of properties common to all languages, and a model to account for actual cross-linguistic variation, the Chomskyan approach is close to the other major approach to linguistic universals, the Greenbergian approach (see Linguistic Universals, Greenbergian). However, the Chomskyan approach displays a number of distinguishing features. A major distinguishing feature of the Chomskyan approach involves the status attributed to linguistic universals in terms of mental representation. In the Greenbergian approach, universals are properties holding for all (or a statistically significant part of) human languages, but the theory has no implication as to whether or not these properties are part of a speaker’s mental representation of his or her language. 
In the Chomskyan approach, on the other hand, universals (including both universal principles and open value parameters) have psychological reality, that is, it is assumed that they have specific representations in the human mind. This is a consequence of the explanatory value attributed to universals in

this approach. In the Chomskyan perspective, the very reason to postulate linguistic universals in the first place is the need to account for the rapidity and uniformity of language learning. This argument is traditionally known as the ‘poverty of the stimulus’ argument. It is argued that the primary linguistic data available to the language learner is largely insufficient to construct the target grammar. Therefore, language acquisition would be impossible if the language learner were not endowed with an initial prespecification of the brain specifying the form of the grammar of a possible human language. This prespecification takes the form of the universal principles and parameters that represent the bulk of Universal Grammar. These principles and parameters are ultimately thought to be innate. Thus, Universal Grammar is basically regarded as an innate toolkit humans are equipped with that makes language acquisition possible. This has two important consequences for the nature of linguistic universals in the Chomskyan theory. First, universals have explanatory value in the theory, in that it is basically assumed that the reason why languages are the way they are is because they reflect a set of universal principles represented in the speaker’s mind. This is an example of internal explanation, in that the grammatical organization of human languages is accounted for in terms of a set of principles internal to the grammar itself, which have psychological reality. The recourse to internal explanation is traditionally regarded as one of the most distinguishing features of the Chomskyan approach to linguistic universals, as opposed to other approaches such as the Greenbergian approach or functionalist approaches in general, which account for linguistic universals in terms of principles external to the grammatical system itself and ultimately rooted in semantics, pragmatics, and language use. 
It should however be noted that, as is observed by Newmeyer (1998: 104–105), linguistic universals in the Chomskyan approach are ultimately motivated in terms of innateness, and innateness is also external to the grammatical system as such. The other consequence of the toolkit nature of Universal Grammar is that the mental representation of linguistic universals includes a number of options that are not actually implemented in the grammar of individual languages. Since it is assumed that the language learner cannot construct the target grammar on the basis of the external input and that any learner can learn any language, it follows that the whole range of possible options for a grammar must be prespecified in the learner’s mind, even if not all of those options will actually be realized in the target grammar. As a result, it may be assumed that a particular mechanism ascribed to Universal Grammar


is indeed a universal of language even if it is not manifested in individual languages. For example, the fact that some languages are head-initial does not mean that the value ‘head-final’ is not specified in the mind of speakers of those languages; it just means that this option is not implemented in the grammar of the relevant languages. This brings us to another major distinguishing feature of the Chomskyan approach: Universals are motivated in terms of theory-internal criteria rather than empirical cross-linguistic investigation. In the Greenbergian approach, universals are data-driven, and they are established based on large language samples. In the Chomskyan approach, on the other hand, universals are primarily established on the basis of theory-internal considerations. The rationale behind this approach is that if it can be demonstrated that a property of a single language cannot be learned, that property has to be part of the innate endowment of the language learner, and therefore it is universal. Arguments for the innate, and therefore universal, status of a particular principle usually include the fact that the principle is so abstract or complex that it couldn’t be learned inductively, the fact that the principle appears at an extremely early stage of child development, and the fact that the principle cannot plausibly be attributed to any aspect of the external input received by the language learner (Newmeyer, 1998: 85). A consequence of the theory-driven status of language universals in the Chomskyan approach is that universals are established on the basis of in-depth investigation of individual languages rather than cross-linguistic comparison as such. The Principles and Parameters version of generative grammar has indeed pointed out the necessity of cross-linguistic comparison (see, e.g., Chomsky, 1995). 
Other generative theories, such as Lexical-Functional Grammar (Bresnan, 2001), Auto-Lexical Syntax (Sadock, 1991), and Role and Reference Grammar (Van Valin and LaPolla, 1997) have also developed a machinery designed to account for the wide variety of syntactic phenomena found across languages. As is observed by Hawkins (1988: 87–88), however, emphasis still lies on theory in the Principles and Parameters paradigm,

so that first a theory is formulated, usually on the basis of a well-known language such as English, and then it is tested and refined by looking at whatever languages seem appropriate. This is in contrast to the Greenbergian approach, where universals are descriptions of statistical tendencies revealed by cross-linguistic investigation. See also: Chomsky, Noam (b. 1928); Formalism/Formalist Linguistics; Grammar; Levels of Adequacy, Observational, Descriptive, Explanatory; Linguistic Universals, Greenbergian; Principles and Parameters Framework of Generative Grammar; Variation and Formal Theories of Language: HPSG.

Bibliography
Baker M C (2001). The atoms of language: the mind’s hidden rules of grammar. New York: Basic Books.
Bresnan J (2001). Lexical-functional syntax. Oxford: Blackwell.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge: The MIT Press.
Chomsky N (1972). Language and mind (2nd edn.). New York: Harcourt, Brace and World.
Chomsky N (1976). Reflections on language. London: Temple Smith.
Chomsky N (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky N (1995). The minimalist program. Cambridge: The MIT Press.
Fukui N (1995). ‘The principles-and-parameters approach.’ In Shibatani M & Bynon T (eds.) Approaches to language typology. Oxford: Clarendon Press. 327–372.
Hawkins J A (1988). ‘On generative and typological approaches to universal grammar.’ Lingua 74, 85–100.
Jackendoff R (2002). Foundations of language. Oxford: Oxford University Press.
Newmeyer F J (1998). Language form and language function. Cambridge: The MIT Press.
Prince A & Smolensky P (1993). Optimality theory: constraint interaction in generative grammar. Piscataway, NJ: Rutgers University Center for Cognitive Science.
Sadock J (1991). Auto-lexical syntax. Chicago: University of Chicago Press.
Van Valin R D Jr & LaPolla R J (1997). Syntax. Cambridge: Cambridge University Press.


Linguistic Universals, Greenbergian
S Cristofaro, University of Pavia, Pavia, Italy
© 2006 Elsevier Ltd. All rights reserved.

The major assumption underlying the Greenbergian, or typological, approach to the study of language, and the so-called functional-typological approach that originated from it (Croft, 2003: Ch. 1), is that the structural variation displayed by the world’s languages is ordered and can be described in terms of linguistic universals, that is, a set of constraints or restrictive principles having universal validity (Greenberg, 1963/1966; Greenberg et al., 1978). These constraints are of two types. On the one hand are constraints stating that all languages behave in the same way with respect to the distribution of single features, such as the presence or absence of vowels. These constraints state that the relevant feature is either universally present or universally absent in the world’s languages, leaving no room for variation. These constraints go under the name of ‘unrestricted universals’ and are comparable to the universal principles postulated in the Chomskyan approach (see Linguistic Universals, Chomskyan). A classical example of an unrestricted universal is the statement that all languages have vowels (see, e.g., Croft, 2003: 52). On the other hand are constraints concerning the correlation between different features. These constraints state that all languages that exhibit a feature A also exhibit a feature B. For instance, languages having nasal vowels also have corresponding oral vowels (Croft, 2003: 54). Of course, since the relevant feature need not be present in a language, these constraints cannot be regarded as universal in the sense stated above: languages do not display uniformity with respect to the relevant features. For instance, there may be languages with both oral and nasal vowels, or languages with oral vowels only. However, the restrictive principle excludes the existence of languages with nasal vowels but no corresponding oral vowels. In this case, the restrictive principle describes a pattern of variation.
Languages behave in different ways with respect to the distribution of the relevant features, but their variation must obey the limits set by the restrictive principle. What is universal in this case, then, is the fact that languages have to conform to the same pattern. However, the pattern itself allows a certain amount of variation. This second type of language universals goes under the name of ‘implicational universals.’ Implicational universals state a dependency between logically independent grammatical parameters. As such, they may be seen as an application of propositional logic to typology. A standard implicational universal of the

form A → B covers four logically possible types: A & B, ~A & B, ~A & ~B, and A & ~B. Of these, the first three types are allowed by the implication, while the fourth is excluded. Any dependency relation between logically independent grammatical parameters may thus be described by means of an implication formulated in such a way that the allowed types correspond to the actually attested ones and the prohibited ones are actually unattested. For example, cross-linguistic investigation shows that if relative clauses (Rel) precede the noun (N) in a language, then demonstratives (Dem) also precede the noun. This pattern covers three actually occurring types, RelN & DemN, NRel & NDem, NRel & DemN, and an unattested one, RelN & NDem. This is schematized in Table 1 (+ = attested; − = unattested). The distribution of attested and unattested combinations in this case makes it possible to draw the implication ‘RelN → DemN,’ which predicts the occurrence of the attested types and excludes the unattested one, just as in Table 1. In the last decades, typological research has accumulated a considerable body of evidence for implicational universals in phonology and morphosyntax, concerning, for example, word order, alignment patterns, parts of speech, animacy, and grammatical relations (see standard textbooks in typology such as Comrie, 1989 or Croft, 2003 for detailed treatment of these issues and relevant literature). In particular, individual implications turn out to be combinable in chains, or hierarchies, of the form:
(1) A → B & B → C & . . .
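The four logically possible types of a single implication, and the relative clause and demonstrative example in particular, can be checked mechanically. The following is a minimal sketch: the helper names `allowed` and `label` and the Boolean encoding (True = the modifier precedes the noun) are assumptions made for illustration, while the set of attested types is the one reported in the text.

```python
from itertools import product

def allowed(a: bool, b: bool) -> bool:
    """Return True if the combination is compatible with the implication A -> B."""
    # An implication is violated only when A holds and B does not.
    return (not a) or b

# Hypothetical encoding: True means the modifier precedes the noun.
REL = {True: "RelN", False: "NRel"}
DEM = {True: "DemN", False: "NDem"}

def label(rel_n: bool, dem_n: bool) -> str:
    """Name a language type by its relative clause and demonstrative orders."""
    return f"{REL[rel_n]} & {DEM[dem_n]}"

combinations = list(product([True, False], repeat=2))
predicted = {label(a, b) for a, b in combinations if allowed(a, b)}
excluded = {label(a, b) for a, b in combinations if not allowed(a, b)}

# Types reported as attested in the text.
attested = {"RelN & DemN", "NRel & NDem", "NRel & DemN"}

print(sorted(predicted))
print(sorted(excluded))
print(predicted == attested)
```

Under this encoding the implication RelN → DemN allows exactly the three attested types and rules out RelN & NDem, which is the distribution the implication is meant to capture.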

In implicational hierarchies, the consequence of each implication is the antecedent of the following one. In current typological practice, these implicational hierarchies are indicated in the form (2) . . . C > B > A

Table 1 Implicational universals: order of noun, demonstrative, and relative clause (Croft, 2003: 54). + = attested; − = unattested

        RelN   NRel
DemN    +      +
NDem    −      +

The predictive power of implicational hierarchies is very high. If any term involved in the hierarchy is present, then all the terms to the left of it on the chain must be present. This allows for multiple predictions based on a single assumption. On the other
hand, if any term involved in the hierarchy is absent, then all the terms to the right of it on the chain must be absent, too. This excludes a large number of language types, and the more implications involved in the chain, the more language types excluded. This is why typological research has focused on the identification of cross-linguistic hierarchies, such as the Accessibility Hierarchy for relativization (Keenan and Comrie, 1977), the hierarchies for number, animacy, and grammatical relations (these are described in detail in Croft, 2003: Ch. 5), as well as Givón’s Binding Hierarchy for complementation (Givón, 1980). In contrast to the Chomskyan approach, in which linguistic universals have psychological reality and are in themselves explanatory principles motivating language structure, typological universals are primarily meant as generalizations resulting from empirical cross-linguistic investigation that describe statistically significant distributions of particular phenomena (see on this point Newmeyer, 1998: Ch. 6; Newmeyer, 2004). This has a number of important consequences. First, cross-linguistic comparison plays a crucial role in typological research, because universals can only be established through cross-linguistic comparison and must be established on the basis of statistically relevant patterns. As a result, considerable attention has been devoted to issues such as sampling methods, the statistical significance of individual putative universals (see, e.g., Dryer, 1992; and Rijkhoff and Bakker, 1998), and criteria to consistently identify the phenomena under investigation from one language to the other (Croft, 2003: Ch. 1). The patterns described by language universals are in themselves in need of explanation. In the typological approach, universals are usually accounted for in terms of functional factors, that is, semantic, pragmatic, processing, or frequency factors.
These explanations are usually regarded as external explanations, as opposed to the explanations usually invoked within the generative paradigm, which are internal to the grammatical system. Unlike language universals as such, at least some of the functional principles invoked in the typological approach are taken to have psychological reality (see, among others, Croft, 2001). An example of functional explanation is provided by the various processing and frequency factors that have been proposed for what is perhaps the best-known typological hierarchy, Keenan and Comrie’s (1977) Accessibility Hierarchy for relativization:
(3) Subject > Direct object > Indirect Object > Oblique > Genitive > Object of comparison
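Read as a chain of implications, the hierarchy in (3) predicts that the roles a language can relativize form an unbroken stretch starting from the left edge. The sketch below tests that simplified reading only; it is not Keenan and Comrie's full formulation, and the function name `conforms` and the sample role sets are hypothetical illustrations rather than data from any particular language.

```python
# Roles in the order given in (3), from most to least accessible.
HIERARCHY = [
    "Subject",
    "Direct object",
    "Indirect Object",
    "Oblique",
    "Genitive",
    "Object of comparison",
]

def conforms(relativizable: set) -> bool:
    """Check that the relativizable roles obey the implicational hierarchy."""
    flags = [role in relativizable for role in HIERARCHY]
    # For each adjacent pair, a relativizable role on the right requires
    # a relativizable role on its left (right implies left).
    return all(left or not right for left, right in zip(flags, flags[1:]))

# Hypothetical language types, invented for illustration.
type_allowed = {"Subject", "Direct object"}
type_excluded = {"Subject", "Oblique"}

print(conforms(type_allowed))
print(conforms(type_excluded))
```

Because each pairwise check encodes one implication, a single gap anywhere in the chain is enough to exclude a language type, which is why longer hierarchies exclude more types.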

This hierarchy states that if a language can form relative clauses on any syntactic role on the hierarchy,

then it can form relative clauses on all syntactic roles to the left. Keenan and Comrie’s original explanation for this hierarchy is grounded on psychological ease of comprehension. The lower a role is on the Accessibility Hierarchy, the harder it is to understand relative clauses formed on that role. The underlying assumption is that if the speakers of a language are able to process relative clauses formed on a more difficult role, they are able to process relative clauses formed on less difficult roles. A similar analysis has been proposed by Hawkins (1994). On the other hand, Fox (1987) accounts for the Accessibility Hierarchy in discourse rather than psychological/processing terms. Relying on a corpus of spoken English data, she claims that the roles that are actually most accessible to relativization are intransitive subject and direct object. This is because the primary discourse function of relative clauses is to situate the referent that is being introduced as a relevant part of the ongoing discourse. This is typically done by means of relative clauses formed on subjects of intransitive verbs or direct objects. So this type of relative clause is the most frequent relative clause type at the discourse level, and this accounts for why the relevant syntactic roles are more accessible to relativization. Other implicational patterns have been explained in semantic (rather than processing or frequency) terms. For example, cross-linguistic investigation shows that if inalienable possession is expressed by constructions involving a certain distance between morphemes, then alienable possession will be expressed by constructions involving at least the same distance between morphemes. 
Haiman (1983, 1985) accounts for this pattern in terms of an iconic principle, whereby greater closeness between particular concepts (as manifested in the relationship between possessor and possessum in inalienable possession) is reflected by greater formal closeness between the linguistic elements expressing these concepts. In some cases, it has been suggested that the explanation for a particular implicational pattern is diachronic and that functional principles only play an indirect role. For example, a bidirectional implication (logical equivalence) exists such that if a language has prepositions, then it has noun-genitive order, and if it has postpositions, then it has genitive-noun order (see, among others, Croft, 2003: 58). This can be accounted for in terms of Hawkins’s (1983) principle of Cross-Category Harmony (which is ultimately motivated in terms of processing ease). Nouns and adpositions belong to the same category, in that they are heads of their phrases, so the correlation between the order of adpositional constructions and the order of genitival constructions reflects a general principle


whereby modifiers are always located on the same side of the head. Bybee (1988) argues, however, that the correlation is actually motivated in diachronic terms: adpositional constructions typically originate from genitival constructions, and they maintain the order of genitival constructions. In this case, Bybee argues, functional principles may be invoked to account for why adpositional constructions develop from genitival constructions in the first place, but no functional principle can actually be invoked to account for the word order correlations between the two constructions. It should finally be observed that a widespread assumption in the typological approach is that the various language types allowed by implicational universals reflect different functional principles, and the competition between the various functional principles determines cross-linguistic variation (Du Bois, 1985). For example, an implicational pattern exists such that if the singular is expressed by a certain number of morphemes, then the plural will be expressed by at least as many morphemes. This is usually explained in terms of an economic principle, whereby the singular is the most frequent category at the discourse level and therefore need not be expressed overtly. One such explanation accounts for the existence of languages in which the singular is zero-marked and the plural is marked overtly. However, the implicational pattern also allows for languages in which both singular and plural are expressed overtly. These languages reflect an iconic principle whereby all aspects of conceptual structure (in this case, the values ‘singular’ and ‘plural’) are expressed overtly (Croft, 2003). The implicational pattern allows for both language types, but the two language types reflect competing functional principles, each prevailing in different languages. 
The competition between the different principles is what determines cross-linguistic variation, as described by the implicational pattern itself. See also: Linguistic Universals, Chomskyan.

Bibliography
Bybee J (1988). ‘The diachronic dimension in explanation.’ In Hawkins J A (ed.) Explaining language universals. Oxford: Basil Blackwell. 350–379.
Comrie B (1989). Language universals and linguistic typology (2nd edn.). Oxford: Basil Blackwell.
Croft W (2001). Radical construction grammar. Oxford: Oxford University Press.
Croft W (2003). Typology and universals (2nd edn.). Cambridge: Cambridge University Press.
Dryer M S (1992). ‘The Greenbergian word order correlations.’ Language 68, 81–138.
Du Bois J A (1985). ‘Competing motivations.’ In Haiman J (ed.) Iconicity in syntax. Philadelphia: John Benjamins. 343–366.
Fox B A (1987). ‘The noun phrase accessibility hierarchy reinterpreted: Subject primacy or the absolutive hypothesis?’ Language 63, 856–870.
Givón T (1980). ‘The binding hierarchy and the typology of complements.’ Studies in Language 4, 333–377.
Greenberg J H (1963/1966). ‘Some universals of grammar with particular reference to the order of meaningful elements.’ In Greenberg J H (ed.) Universals of language, 2nd edn. Cambridge, MA: MIT Press. 73–113.
Greenberg J H, Ferguson C A & Moravcsik E A (eds.) (1978). Universals of human language (4 vols). Stanford: Stanford University Press.
Haiman J (1983). ‘Iconic and economic motivation.’ Language 59, 781–819.
Haiman J (1985). Natural syntax. Cambridge: Cambridge University Press.
Hawkins J A (1983). Word order universals. New York: Academic Press.
Hawkins J A (1994). A performance theory of word order and constituency. Cambridge: Cambridge University Press.
Keenan E L & Comrie B (1977). ‘Noun phrase accessibility and universal grammar.’ Linguistic Inquiry 8, 63–99.
Newmeyer F J (1998). Language form and language function. Cambridge: The MIT Press.
Newmeyer F J (2004). ‘Typological evidence and Universal Grammar.’ Studies in Language 28, 526–548.
Rijkhoff J & Bakker D (1998). ‘Language sampling.’ Linguistic Typology 2, 263–314.

Linguistics as a Science
B Clark, Middlesex University, London, UK
© 2006 Elsevier Ltd. All rights reserved.

A common description of linguistics is that it is the ‘scientific study of language.’ This might seem to be a loose or metaphorical use since the subject matter of

linguistics is quite different from what are often thought of as the ‘hard’ sciences such as physics or chemistry. But linguists are engaged in a process of inquiry that aims to discover facts about the world we live in, and so their work shares important properties of other sciences. Some work in linguistics (e.g., acoustic phonetics) resembles the ‘hard’ sciences in

Sprachwissenschaft.’ Beiträge zur Geschichte der Sprachwissenschaft 13, 115–126.
Schmitter P (2003b). Historiographie und Narration: Metahistorische Aspekte der Wissenschaftsgeschichtsschreibung der Linguistik. Seoul: Sowadalmedia. Tübingen: Gunter Narr (in Kommission).
Schreyer R (2000). ‘What’s wrong with the historiography of linguistics?’ Beiträge zur Geschichte der Sprachwissenschaft 10, 205–208.
Sebeok T A (ed.) (1975). Current trends in linguistics. Vol. 13: Historiography of linguistics (2 vols). The Hague: Mouton.
Simone R (1975 [1973]). ‘Théorie et histoire de la linguistique.’ Historiographia Linguistica 2, 353–378.
Simone R (1995). ‘Purus historicus est asinus: Quattro modi di fare storia della linguistica.’ Lingua e Stile 30(1), 117–126.
Streitberg W (ed.) (1916–1936). Geschichte der indogermanischen Sprachwissenschaft seit ihrer Begründung durch Franz Bopp (6 vols). Strassburg: Karl J. Trübner/Berlin: Walter de Gruyter.
Swiggers P (1997). Histoire de la pensée linguistique. Analyse du langage et réflexion linguistique dans la culture occidentale, de l’Antiquité au XIXe siècle. Paris: Presses Universitaires de France.
Szemerényi O (1971). Richtungen der modernen Sprachwissenschaft 1: Von Saussure bis Bloomfield 1916–1950. Heidelberg: Winter.
Taylor D J (ed.) (1987). The history of linguistics in the classical period. Amsterdam & Philadelphia: John Benjamins.
Tagliavini (1963). Storia di parole pagane e cristiane attraverso i tempi. Brescia: Morcelliana.

History of Linguistics: Discipline of Linguistics
N Smith, University College London, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Language makes us human. Whatever we do, language is central to our lives, and the use of language underpins the study of every other discipline. Understanding language gives us insight into ourselves and a tool for the investigation of the rest of the universe. Martians and dolphins, bonobos and bees, may be just as intelligent, cute, adept at social organization, and morally worthwhile, but they don’t share our language, they don’t speak ‘human’. Linguistics, the scientific study of language, seeks to describe and explain this human faculty. It is concerned with three things: discovering precisely what it means to ‘know a language’; providing techniques for describing this knowledge; and explaining why our knowledge takes the form it does. These concerns may seem too obvious to need discussing, but the complexity of our knowledge of language becomes strikingly apparent when we see someone lose their language after they have had a stroke, or when we observe an infant who has yet to acquire the faculty that we deploy so easily. To understand these three concerns, we need a theory, and that is what linguistics provides.

The Meaning of ‘Language’
That linguistics is ‘the scientific study of language’ has become a cliché, but what it means to be ‘scientific’ may not always be obvious, and what people mean when they use the word ‘language’ varies from occasion to occasion. Consideration of what is involved in being scientific is deferred until later in the essay; for now it suffices to observe that only a few aspects of language have been illuminated by theoretical (scientific) linguistics, so there are many areas where it has little, if anything, helpful to say. The situation is akin to that in biology, viewed as the science of living things. Despite their importance to us, biology has nothing to say about the definition of pets; similarly, despite their relevance to us, linguistics has nothing to say about the definition of dialects. In everyday usage, ‘language’ is used differently, depending on whether it is construed as a property of the individual, of society, of the species, or as an autonomous entity in the world. Linguists working in the tradition of ‘generative’ grammar, the framework that has dominated linguistics for the last 50 years, argue that an ‘individual’ approach to language is logically prior to any other, but in principle we have the possible domains shown in (1), each suggesting different kinds of questions:
(1) Language and the Individual
    Language and the Brain
    Language and Society
    Language and the Species
    Language and Literature
    Language and the World

Looking at ‘Language and the Individual’, the central question raised is ‘what constitutes our ‘‘knowledge of language’’?’ What properties or attributes does one have to have to be correctly described


as a speaker of English, or Burmese, or any other ‘natural language’ – the term linguists use to refer to languages naturally acquired and spoken by humans, as opposed to the ‘artificial’ languages of logic or computing? An extension of this question is how and where knowledge of language is represented in the brain, and what mechanisms need to be postulated to enable us to account for our use of this knowledge. Neurolinguistics is an area of remarkable growth, supported by technological advances in imaging. Under ‘Language and Society’, sociolinguists raise questions such as ‘What are the social variants (class, age, gender, power) that determine, or correlate with, the use of particular forms of the language?’ A woman might use some pronunciations or grammatical constructions with statistically significantly greater frequency than a man of the same age, or a female of a different generation. For the world’s multilingual majority, social considerations may even determine which language is used in specific situations. A Swiss from Graubünden might use Romansh at home, Swiss German at work, and High German at a conference. Looking at ‘Language and the Species’, we might be preoccupied with the puzzle that all human children learn their first language with seeming effortlessness, while the young of other species, however intelligent, show minimal such ability. Linguists, and researchers in related fields, investigate not only whether this claim to uniqueness is indeed true but, if it is, how the faculty evolved. When we turn to the relation between Language and Literature, we confront several issues: ‘What is literary form?’; that is, what are the linguistic properties that make something a novel or a novella, a sonnet or an epic? How are literary effects achieved? What are the linguistic characteristics of a particular style? Looking at ‘Language and the World’ raises issues of three different kinds. First, how does language relate to things outside the head?
That the word ‘London’ refers to the capital of the United Kingdom is innocuous enough as an informal claim, but it raises interesting, and vexed, philosophical questions. The debate revolves around the status of language as a property of an individual, rather than as an entity with independent existence. This ‘external’ notion of language is presupposed by those who write irate letters to the press, inveighing against split infinitives, and lamenting the fact that our language is becoming degenerate, either because of the sloppiness of modern youth, the pernicious influence of text messaging, or the role of multiculturalism. The third issue is in many ways the most obvious and the most puzzling: Why are there so many languages? Why does ‘human’ have so many dialects?

Knowledge of Language
The generativist claim that study of the individual’s knowledge of language must be the first or exclusive focus of a scientific linguistics is controversial; that it is a possible, indeed necessary, focus is not seriously in doubt. This individualistic claim implies that linguistics is a branch of psychology, ultimately biology, rather than, say, of sociology. This is not to deny that there are interesting domains of knowledge that take the social conditions of language use as their central focus; it is rather to claim that there is a psychological enterprise which looks at one branch of human cognition and which lends itself to rigorous investigation and, moreover, that it is logically prior to looking at the exploitation of this knowledge in society. This focus on knowledge is highlighted in the claim that the subject of linguistics is ‘I-language’, rather than ‘E-language’, where the ‘I’ stands for internal to a particular individual, and ‘E’ stands for external (to the mind of the individual). A corollary of this orientation is that the descriptions that linguists devise are said to be ‘psychologically real’, where this is not a claim about psychological experimentation or the kind of evidence used in formulating particular linguistic hypotheses, but is simply the claim that we are investigating the human mind and that current theory is the closest approximation to the truth that we have. The mind is ultimately a product of the brain (and other systems), and evidence about the mental can sometimes be gleaned from studies of the neural. In general, however, linguists remain agnostic about the details of the relation between the mind and the brain (frequently referring simply to the ‘mind/brain’). That is, we devise theories of a sub-part of human knowledge, but whether that knowledge is localized in the temporal lobe of the left hemisphere, or is distributed throughout the brain, or whatever, is less important.
This is not because of lack of interest, but simply because – at present – theories of neural structure are too embryonic to cast much light on linguistic generalizations. Different languages allow different word orders, so that Japanese puts the verb at the end of the sentence and English puts it in the middle. Linguistic theory must provide the means for describing and ultimately explaining this fact, but at present we have no inkling of how the difference between a Japanese and an English speaker might be neurally implemented, so the neurological structure of (this bit of) the language faculty is still a closed book. What do you have to know to count as a ‘speaker’ of a language? If you say you speak English, it implies that you understand English, too. The point may

History of Linguistics: Discipline of Linguistics 343

seem trivial, but knowledge of language is neutral between speaking and hearing; both activities draw on the same fund of knowledge. There is no known illness or accident which leaves you able to speak only English and understand only Portuguese, for instance. This is not to deny that you may be better at talking than listening; or that you may suffer brain damage that leaves you unable to speak while you remain perfectly able to understand. A particularly poignant example of this is provided by Bauby’s autobiographical account of ‘locked-in’ syndrome, where a stroke left the author speechless, but with his language and his ability to understand intact. In normal, nonpathological, cases, however, your ability to utter (2a): (2a) Giraffes have long necks (2b) Giraffes have necks long

involves the same ability that enables you to understand (2a), and also to judge that someone who mistakenly says (2b) has got it wrong. The implication of this observation is that the primary focus of linguistics is on characterizing this neutral knowledge, rather than the mechanisms of speaking, hearing, and judging that are parasitic on it. In other words, linguistics is (largely) about one form of cognition, and only secondarily about the deployment of that cognitive ability. In the standard terminology, this is known as the ‘competence-performance’ distinction. Your knowledge of language (your competence) underlies your ability to speak, to understand, and to give judgements of well- or ill-formedness (your performance). You may be knocked unconscious and be temporarily unable to speak or understand, but your knowledge typically remains intact – you have competence with no ability for performance. The converse situation, in which you could perform in the absence of any competence, does not occur, though it may characterize the ‘linguistic’ capabilities of the average parrot, which may be able to utter entertaining sentences of what sounds like English, but presumably without the knowledge of English grammar that underlies our abilities. To count as a speaker of English, you need first to know a large number of words: not just nouns, verbs, and adjectives – words such as cat and go and pretty, whose meaning is relatively transparent, but items such as the, under, and however, whose meaning and use are less easy to specify. Of course, not everyone has the same vocabulary: I may know technical terms in linguistics that you are ignorant of, and you may be familiar with words pertaining to reggae or arachnology that I don’t know, but if either of us were ignorant of words such as mother or and, people might be justifiably reluctant to classify us as speakers of English.

In addition to knowing the words of a language, you need to know what to do with those words – you need to know the grammar. Our knowledge of language falls into two compartments – the vocabulary (or ‘lexicon’) and the ‘computations’ we can carry out using that vocabulary. This computational system, comprising syntax and morphology, is surprisingly complex, and enables us to produce baroque examples such as Chomsky’s (1995: 88) Who do you wonder whether John said solved the problem? Such sentences are of marginal acceptability and citing them may strain the tolerance of outsiders, but this marginal status may itself provide crucial evidence for or against some theoretical claim concerning our knowledge. Henceforth, I shall assume that you and I have the same I-language, abstracting away from differences in vocabulary and grammar. Fortunately, it’s enough for the present purposes to look at the more basic, but nonetheless rich and surprising, knowledge we have of words as simple as be and the. Consider the examples that follow, which illustrate a wide range of things you know, even if you weren’t previously aware of knowing them. It’s self-evident that is and have mean different things, as shown in (3), but sometimes they seem to be used interchangeably as in (4): (3a) Tom is a problem (3b) Tom has a problem (4a) Tim is yet to win the Booker prize (4b) Tim has yet to win the Booker prize

How is it that something as basic as is can sometimes get the same interpretation as has and sometimes a different one? Or consider the so-called definite article (the), which is often said to mark the distinction between entities which are already familiar and those which are new, as in (5a) and (5b) respectively: (5a) My friend likes the penguins (5b) My friend likes penguins

But this characterization is not adequate to account for the rather macabre effects found in the newspaper report in (6b) beside the relatively unexceptionable (6a): (6a) The woman had lived with the dead man for two years (6b) The woman had lived with a dead man for two years

Still less can it account for the fact that on occasion the presence or absence of the seems to indicate the difference between subject and object, as in (7): (7a) This man is in charge of my brother (7b) This man is in the charge of my brother


In (7a) this man has control of my brother; in (7b) my brother has control of this man. So what does the really mean? Does it even make sense to ask such a question? Take a more complex example: the word last is multiply ambiguous: apart from its use as a noun or a verb, it can function as an adjective meaning either ‘final’ or ‘previous’, as illustrated in (8): (8a) This is your last chance (8b) Your last example surprised me

This ambiguity can result in dialogues which have strikingly different interpretations, as in the alternatives in (9): (9) Q ‘‘What were you doing in Paris?’’ A1 ‘‘Oh, I was collecting material for my last book’’ A2 ‘‘Oh, I’m collecting material for my last book’’

Answer 1 is itself ambiguous, with either meaning possible for last (though ‘previous’ is the more easily accessible); Answer 2 has only the interpretation that the book under discussion is planned to be the final one I write. The difference must be attributable to the contrast between the past and the present tense, as that is the only way the sentences differ, but it’s not obvious why sentences should be ambiguous or not depending on the tense they contain. Linguists thrive on such ambiguity, as it regularly provides evidence for structural differences that may not be otherwise apparent. A simple example is provided by the inscrutable notice outside our local school, given in (10): (10) This school accepts girls and boys under six

Whether the school accepts girls of any age but only small boys, or no children over six is indeterminate without more information. As we shall see under the section ‘Describing Knowledge of Language,’ (10) has two quite different syntactic structures corresponding to the two meanings. Similarly the fact that (11) has a number of different interpretations can give us clues as to how to analyze the various possibilities: (11) My son has grown another foot

If my son has become taller, the example is parallel to (12a); if he is a freak or a remarkably successful gardener, there are other possibilities, as shown in (12b) and (12c), suggesting that another foot in (11) may be correctly analyzed either as a measure phrase or as a direct object: (12a) He has grown by another foot (12b) He has grown a third foot

(12c) Another foot has been grown (in this flowerpot)

Such differences of interpretation make the complexity of our knowledge apparent, but unambiguous examples can be just as illuminating and can simultaneously provide evidence against the traditional philosophical claim that meaning can be adequately treated in terms of truth. Thus, we know that (13): (13) My first wife gave me this watch

suggests rather strongly that I have been married more than once, but I can utter it truthfully despite having been married only once: my only wife is presumably my first wife. The example is misleading, not false, and so implies that there is much more to meaning than mere truth. As shown by Chomsky’s (1957) famous Colorless green ideas sleep furiously, structure and meaning (syntax and semantics) can dissociate, so we also know that, despite being initially plausible and syntactically unexceptionable, (14) is meaningless: (14) More people have visited Moscow than I have

All the preceding examples illustrate both our knowledge of vocabulary and how it interacts with (syntactic) structure. The responsibility of linguistics is to describe the full range of such facts, not just for English, but for all human languages. Then, in virtue of its scientific pretensions, it has to (attempt to) explain why these facts rather than any others are the ones that occur – again both in English and in other languages. To do justice to the richness of what we know, it is necessary to distinguish not just the lexicon and the computational system, but to differentiate among syntax, semantics, morphology, phonology and phonetics, and to relate this knowledge to pragmatics – how we interpret utterances in context. Take our knowledge of morphology, the internal structure of words. We know that thick, thicker, thickest, and thicken are all words of English, but that there is no thinnen to accompany thin, thinner, thinnest. We know that thick relates to thicken and that rich relates to enrich, whereas richen is slightly odd, and enthick is impossible. This knowledge can’t just be a result of our never having heard thinnen or enthick before: you may never have heard texted before, as in ‘‘I’ve just texted an urgent message to Fred’’, but you know that it is possible. As linguists, we may also know that some languages, such as Vietnamese, have almost no morphology: words in this language have none of the internal structure characteristic of affix-rich items such as indecisiveness or rearranged. On the other hand, some (polysynthetic) languages, such as Inuktitut (Eskimo) or


Mohawk pile one affix on top of another so that words are often strikingly complex, and correspond to whole sentences in English. Baker (2001: 87) gives the Mohawk example in (15) with the meaning ‘‘He made the thing that one puts on one’s body ugly for her’’: (15) Washakotya’tawitsherahetkvhta’se’

Our knowledge of phonology, the sound structure of language, is equally rich. We know that past, spat, and stap are possible words of English, indeed they all exist; that stip and stup are also possible words, even though they happen not to exist; but that satp, ptas and tpas are not even possible words. Apart from knowing the segmental make-up of words, we also have knowledge of ‘supra-segmentals’: that photograph is stressed on the first syllable, photographer on the second, and photographic on the third. Two points need to be made: first, we ‘know’ this in the sense that we produce the correct pronunciations on demand, and we recognize that deviations from these pronunciations are slips of the tongue or foreigners’ mistakes; that is, knowledge of language need not be immediately available to conscious introspection. Second, the characterization in terms of ‘first’, ‘second’, and ‘third’ syllable is actually not the correct theoretical characterization of our knowledge. As we shall see below, rules of grammar (including phonology) cannot count. We know more. In an example such as (5a) above, My friend likes the penguins, we have to account for the pronunciation of the before the initial ‘p’ of penguins: a pronunciation rather different from that of the same lexical item the when it occurs before a vowel, as in My friend likes the otters. Knowledge of this kind is supplemented by phonetic knowledge which is even harder to bring to consciousness: that the ‘t’ in photographer is aspirated, but the ‘t’ in photograph is not; that the ‘r’ in grime is voiced, but that in prime it is slightly devoiced; that the vowel is longer in wed than in wet. Such facts belong to the domain of phonetics, the field that deals with the sound properties of language in general, rather than the sound structure of a particular language. Our phonological knowledge is not self-contained, but may interact in complex ways with our knowledge of the rest of the grammar. 
We know that (16a) has an alternative pronunciation of the kind given in (16b), where is is ‘contracted’ to ’s, but that (17a) cannot be matched by the impossible (17b) (impossibility is indicated by the asterisk), despite the apparent similarity of the examples: (16a) The prime minister is a war criminal (16b) The prime minister’s a war criminal

(17a) The president is a war criminal and the prime minister is too (17b) *The president is a war criminal and the prime minister’s too

An understanding of such asymmetries requires investigation of the relation between syntactic and phonological processes, and relies on an analysis of empty categories: entities that have syntactic and semantic properties but are silent. In addition to phonology and morphology, we need to account for the (semantic) fact that sentences have meaning. The examples in (18) exploit most of the same words but their meanings are radically different: (18a) My friend likes the penguins (18b) The penguins like my friend (18c) My friend doesn’t like the penguins

Moreover, the semantics is ‘compositional’ – except for idioms, the meaning of a sentence is a function of the meaning of its parts, and their syntactic configuration. The meaning difference between (18a) and (18b) is dependent on which item is subject and which object, notions that can be defined syntactically. In fact, life is a little more complicated than that, as the semantic interpretation of ‘subject’ is not uniform, and we need to advert to ‘thematic relations’ involving ideas of agentivity and patienthood, as shown by the minimal pair in (19): (19a) John undertook the surgery reluctantly (19b) John underwent the surgery reluctantly

John is the subject in both sentences, but is the agent (the surgeon) in the former, and the patient (in both senses) in the latter. These relations are internal to a single sentence, but we also need to relate (the meanings of) different sentences. There are two possibilities: relations which depend on the meaning of individual words, and relations which are purely sentential in that they are independent of such lexical relations. An example of the former is illustrated by (20): (20a) Mozart persuaded da Ponte to write a libretto (20b) Da Ponte intended to write something

where (20b) follows, by virtue of the meaning of persuade from (20a). An example of the latter is provided by pairs such as (21a) and (21b), where the truth of (21a) guarantees the truth of (21b): (21a) Torture is immoral and should be illegal (21b) Torture is immoral

In the next section, I will outline some of the descriptive mechanisms exploited by (generative) linguistics; then I will try to show how we can approach an explanation for at least some phenomena, looking at


a range of examples from English and elsewhere, and use this extension to substantiate the claim that linguistics is a science. Throughout, I shall concentrate on syntax. Phonology and phonetics, morphology, and semantics are rich disciplines in their own right, each with a massive literature, but the essence of the analysis of sentences is their syntactic structure. And life is finite.

Describing Knowledge of Language Sentences have structure of various kinds. Returning to the example, My friend likes the penguins, we need to describe it in different ways at several distinct ‘levels of representation’: phonological, semantic, and syntactic. Thus, it can be pronounced in a variety of ways – with stress on friend or on penguins, for instance, with concomitant differences of interpretation. Restricting attention to the syntax, it is intuitively clear that my and friend, and the and penguins ‘go together’ in a way that friend and likes, and likes and the do not. Each initial word of My friend and the penguins enables us to pick out some individual or individuals in the world, whereas friend likes and likes the have no such function within them. This intuition is accounted for in terms of ‘constituency’ represented by means of a simplified tree diagram of the kind in (22): (22)
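The intuition that my friend and the penguins ‘go together’ while friend likes and likes the do not can be made concrete with a small sketch. The structure used is the simplified one the text describes for (22): IP branches into NP and VP, and each NP into a Det and an N. The tuple encoding and the function names (words, bracketed, is_constituent) are illustrative assumptions, not part of any linguistic formalism or library.

```python
# Tree (22) as nested tuples: (label, child, child, ...); a leaf is (label, word).
# The shape follows the simplified description in the text; the encoding is an assumption.
TREE_22 = ("IP",
           ("NP", ("Det", "my"), ("N", "friend")),
           ("VP", ("V", "likes"),
                  ("NP", ("Det", "the"), ("N", "penguins"))))


def is_leaf(node):
    """A leaf pairs a category label with a single word."""
    return len(node) == 2 and isinstance(node[1], str)


def words(node):
    """Return the words under a node, in linear order."""
    if is_leaf(node):
        return [node[1]]
    result = []
    for child in node[1:]:
        result.extend(words(child))
    return result


def bracketed(node):
    """Render the tree as a labelled bracketing."""
    if is_leaf(node):
        return f"[{node[0]} {node[1]}]"
    return f"[{node[0]} " + " ".join(bracketed(c) for c in node[1:]) + "]"


def subtrees(node):
    """Yield the node itself and every node below it."""
    yield node
    if not is_leaf(node):
        for child in node[1:]:
            yield from subtrees(child)


def is_constituent(tree, sequence):
    """A word sequence is a constituent if a single node covers exactly that sequence."""
    target = sequence.split()
    return any(words(n) == target for n in subtrees(tree))


print(bracketed(TREE_22))
for seq in ["my friend", "the penguins", "friend likes", "likes the"]:
    print(f"{seq!r}: {is_constituent(TREE_22, seq)}")
```

On this encoding, my friend and the penguins each correspond to a single NP node, whereas no single node covers exactly friend likes or likes the.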

The top of the tree (IP) indicates that the whole sequence ‘‘My friend likes the penguins’’ is an ‘I(nflection) P(hrase)’ (it used to be called ‘Sentence’, but the terminology has changed to reflect changes in our understanding). The IP ‘branches’ into an NP and a VP, where ‘NP’ means ‘Noun Phrase’, that is a sequence consisting of a Noun (N) and something else, and ‘VP’ stands for ‘Verb Phrase’, that is a sequence consisting of a Verb (V) and something else, in this instance another Noun Phrase. The verb is the (present-tense) form likes, and the two Noun Phrases each consist of a Noun (here the singular friend and the plural penguins) preceded by a ‘Det(erminer)’, respectively my and the. Each of ‘IP’, ‘NP’, ‘VP’, ‘N’, etc., is referred to as a ‘node’ in the tree; IP, NP, and VP, etc., are said to ‘dominate’ everything below them, and to ‘immediately dominate’

everything immediately below them. So VP dominates all of V, NP, Det, N, the, and penguins, but immediately dominates only V and NP, which are known as ‘sisters’. Once one has got used to the jargon, the advantages of such an analysis are many: it simultaneously shows the linear sequence of items – the order they come in – and the relationships among the component parts: so the and penguins are more closely related, by virtue of being NPs, than are likes and the which do not form a ‘constituent’ of any kind. A constituent is defined as any sequence of items that can be traced exhaustively to a single node in the tree: likes and the can be traced back to VP (and indeed IP), but these nodes dominate other material, too (penguins, for instance) so likes the, like friend likes, are not constituents. We now have an explicit way of characterizing the example This school accepts girls and boys under six. The two interpretations of the object, girls and boys under six, can be represented with different constituent structure as in (23): (23a) [girls] and [boys under six] (23b) [girls and boys] [under six]

where the brackets mark the constituents, and indicate that the ‘scope’ of under six is respectively either just boys (23a) or includes girls and boys (23b). In addition to this syntactic constituent structure, there is morphological structure to deal with: the fact that penguins is plural is marked by the addition of the suffix –s to the base penguin, and the opposite order (with s- prefixed to penguin to give spenguin) is impossible (in English). Investigating the full range of such facts in the world’s languages is a matter of intensive research, and addresses the same immediate task of accounting for how it is that native speakers can have the intuitions and make the judgements of well- and ill-formedness that they do. This last point bears elaborating. One of the surprising facts about our linguistic ability is that it extends to knowing what is impossible as well as what is possible: we have intuitions of ill-formedness or ‘negative knowledge’. I have already traded on this fact in assuming that, even though you had probably never heard either example before, you would agree that Giraffes have necks long is wrong, whereas I’ve just texted an urgent message to Fred is acceptable. The point can be generalized: the fact that one can recognize mistakes and distinguish them from new but well-formed creations is evidence for the rule-governed nature of the language faculty. Mistakes presuppose norms, or rules. It is also noteworthy that there are ‘impossible’ mistakes: some logically possible errors just don’t happen, even though one might expect them to. Consider an example from


language acquisition and the task of the child in working out how questions and statements of the kind in (24) are related: (24a) The children are playing truant (24b) Are the children playing truant?

There are all sorts of hypotheses children might entertain: move the auxiliary (are), move the third word, permute the first and second constituents, and so on. The kinds of mistake that children do make, however, show that their hypotheses overlap with these in interesting ways. First, they sometimes make mistakes of a kind for which there is no obvious pattern in the input, even though they may be theoretically well motivated: examples such as the ‘auxiliary copying’ in (25): (25a) Is the steam is hot? (25b) Are the children are playing truant?

Second, they never try out any hypothesis that would involve them in counting: their attempts always range over modifications of linguistic structure, never of mathematical structure. It seems that all rules in all languages are what is called ‘structure-dependent’ – they depend on notions such as constituent, Noun Phrase, and so on, but not ‘third word’. Moreover, children seem not to need to learn this fact – it is a principle that guides their language acquisition from the start: it is innate. Claims of innateness have been unnecessarily controversial in modern linguistics. No one doubts that humans are innately (genetically) different from cats, chimpanzees, and dolphins, and that this difference underlies our ability to acquire language. Equally, no one doubts that humans acquire different languages depending on the environment they are brought up in: if children are brought up in Turkey rather than Greece, they learn Turkish rather than Greek. It is obvious that both nature and nurture have a crucial role to play. Where controversy is justified, and where empirically different claims can be tested, is in the detail of what needs to be ascribed to the child’s ‘initial state’, of what precisely is innate and what has to be acquired on the basis of experience. Explaining structure-dependence is an area where innateness has been repeatedly (and controversially) defended with a form of argument based on the ‘poverty of the stimulus’ – the idea that we end up knowing things that it is impossible, or at least implausible, to think that we could find in the input. Consider examples more complex than those above, such as (26): (26a) The children who were naughty are playing truant (26b) Are the children who were naughty playing truant?

If ‘moving the third word’ or ‘moving the (first) auxiliary’ were really the correct way of characterizing the relation in (24) one would expect to find example mistakes like that in (27): (27a) Who the children were naughty are playing truant? (27b) Were the children who naughty are playing truant?

Such mistakes simply do not occur. Of course, it is always (usefully) dangerous to say that something does not happen: it may happen in the next utterance one comes across. But this means that the claim is eminently falsifiable (see below), and can anyway be checked by looking for relevant counterexamples in the literature. A nice example of this kind is provided by Neeleman and Weerman’s (1997) account of acquisitional differences between Dutch and English. They predicted that Dutch children should, and English children should not, produce sentences with an adverb intervening between a verb and its object, as in (28): (28) I will eat quickly the yoghourt

They ransacked the largest international corpus of child data in checking their predictions, and happily found no exceptions. Formalizing our knowledge of language demands a complex toolkit, only a tiny fraction of which has been given here, but such formalization is a necessary prerequisite to finding explanations, to assimilating linguistics to the scientific enterprise. Given the tools developed here, we can make general hypotheses about the nature of language and begin to test them on a wider range of data from English and elsewhere.
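The contrast between counting-based and structure-dependent hypotheses about question formation, illustrated in (24) to (27) above, can be sketched as follows. This is a toy illustration only: the auxiliary list, the function names, and the hand-supplied division of a sentence into subject, auxiliary, and remainder are all assumptions made for the example, not a model of how children or parsers analyse sentences.

```python
# Toy lexicon of auxiliaries; an assumption that covers only the examples used here.
AUXILIARIES = {"are", "were", "is"}


def front(words, i):
    """Move the word at index i to the front of the sentence."""
    return [words[i]] + words[:i] + words[i + 1:]


def move_third_word(words):
    """Counting hypothesis: front whatever word happens to be third."""
    return front(words, 2)


def move_first_auxiliary(words):
    """Linear hypothesis: front the first auxiliary in the string."""
    i = next(k for k, w in enumerate(words) if w in AUXILIARIES)
    return front(words, i)


def move_main_auxiliary(subject, aux, rest):
    """Structure-dependent hypothesis: front the auxiliary that follows the whole subject NP."""
    return [aux] + subject + rest


def as_question(words):
    """Capitalize the first word and add a question mark."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "?"


simple = "the children are playing truant".split()                     # (24a)
complex_ = "the children who were naughty are playing truant".split()  # (26a)

# On the simple sentence all three hypotheses give (24b).
print(as_question(move_third_word(simple)))
print(as_question(move_first_auxiliary(simple)))
print(as_question(move_main_auxiliary(["the", "children"], "are", ["playing", "truant"])))

# On the complex sentence only the structure-dependent rule gives (26b);
# the other two give the unattested (27a) and (27b).
print(as_question(move_third_word(complex_)))
print(as_question(move_first_auxiliary(complex_)))
print(as_question(move_main_auxiliary(
    ["the", "children", "who", "were", "naughty"], "are", ["playing", "truant"])))
```

The three rules are indistinguishable on (24a), which is why the simple data alone cannot tell a learner which one is right; they come apart only on sentences like (26a).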

Explanation in Language Examples involving structure-dependence enable one to address the demand for explanation in addition to description. Let’s pursue the issue by looking at the occurrence of items such as any, ever, or anything in English (so-called ‘Negative Polarity Items’). At a descriptive level, it is sufficient simply to contrast possible and impossible sentences of the sort seen in (29a) and (29b), where those in (29a) are fully acceptable but those in (29b) are ungrammatical, or infelicitous, or just wrong: (29a) John ate something/ some salad (29b) *John ate anything/ any salad

But why is there this contrast? The example in (30) shows that any(thing) can occur happily enough in negative statements, but it occurs unhappily in positive statements:

(30) John didn’t eat anything/ any salad

Looking at such negative examples, the generalization seems to be that any(thing) needs to occur with (be ‘licensed by’) a negator. But such an account is inadequate in two different ways: first, (31) shows that it is not just negators that are relevant, but a variety of elements behave in a similar fashion. This class includes questions, conditionals, and other items that there is no space to characterize: (31a) Did John eat anything/ any salad? (31b) If John ate anything/ any salad, I’d be amazed (31c) Everyone who has any sense has left already (31d) John denied having eaten any of the cakes

Second, even if we restrict ourselves to negatives, it still seems that life is more complicated than we might wish – (32a) is unsurprisingly fine but, despite being negative, (32b) is unacceptable and none of (32c) to (32e) is acceptable: (32a) Something/ some spider bit him in the leg (32b) *Anything/ any spider didn’t bite him in the leg (32c) *Anything is annoying me (32d) *Anything isn’t annoying me (32e) *John denied any of the accusations

That is, our first approximation that any needs to be licensed by a negative fails in both directions – some sentences with negatives do not allow any; some sentences without a negative do allow any. The next obvious assumption might be that any(thing) has to be preceded by a negator of some kind (not or n’t here), but (33) shows that this hypothesis is inadequate: it works for (33a) and (33b) but not for (33c) or (33d) – where nothing is another negator:

(33a) The fact that he has come won’t change anything (33b) Nothing will change anything (33c) *The fact that he hasn’t come will change anything (33d) *That nothing has happened will change anything

The examples in (33a) to (33d) suggest another possibility: perhaps the negator has to be in the same clause as the item (any) being licensed? In (33a), the negator and anything are in the same clause (compare ‘‘This won’t change anything’’), whereas in (33c) and (33d), the negator is in a different clause. We are getting closer, but (34) shows that this is still inadequate as an explanation, as here the negator and anything are blatantly in different clauses, but the result is well-formed. (34) I don’t think he has eaten anything

The claim that the negative (or other item) must be in the same clause as any fails: some sentences have the negative in a different clause and are nonetheless grammatical; some have the negative in the same clause and are ungrammatical. The correct explanation necessitates an appeal to the notion of ‘c-command,’ a relation between ‘nodes’ in a tree. To make this comprehensible and plausible, we need to introduce a little more of the technical machinery of generative grammar. The representation of sentence structure in terms of trees of the kind shown in (22) can obviously be extended to show the structure of (29a), as shown in (35), where the only novel feature is the uncontroversial claim that some is a kind of Determiner: (35)

More complex sentences require more complex configurations. Thus, the salient property of an example such as (33a) ‘The fact that he has come won’t change anything’ is that the subject is not just a noun or noun phrase, but a noun phrase containing another sentence ‘he has come’. To a first approximation it would have the (simplified) form given in (36), and the ungrammatical example in (33c) *The fact that he hasn’t come will change anything would be characterized by the tree given in (37): (36)

Some of the details of the tree have been included for the sake of those who are already familiar with syntax. So the Complementizer Phrase (CP), optionally headed by a Complementizer such as that, and the I’ (a constituent intermediate in size between a sentence (IP) and an Inflection element like will) are there for the cognoscenti. But two things in these trees are important for everyone: first, that they contain a constituent Neg(ation), itself a subpart of a NegP(hrase); and second, that it makes sense to talk of one item being higher in the tree than another. That is, in (36), the ‘Neg’ is higher in the tree than anything, whereas in (37) the ‘Neg’ is lower in the tree than anything. (37)

To make this account rigorous, we need to define exactly what is meant by ‘higher’ and ‘lower’, and that is what is meant by ‘c-command’: a node A in a tree c-commands another node B if and only if the first branching node dominating A also dominates B. In (36), Neg c-commands anything because the first branching node above Neg (i.e., NegP) also dominates the NP anything; in (37), Neg does not c-command the word anything because the first branching node above Neg (again NegP) does not dominate anything. It may seem as if we are using a sledgehammer to crack a nut, but the beauty of the analysis is that c-command is not just an arbitrary condition introduced to account for a narrow range of data in English. Rather it extends in two directions: it is a major and essential ingredient in the explanation first of a range of other phenomena in English; and second to a wide range of phenomena in other languages, indeed in all languages: c-command is universal. Before illustrating other uses of c-command, note that if it is universal, we would like an explanation for how that is possible. The obvious answer is that it is innate, part of the faculty of language that differentiates humans from other organisms and explains why all kids but no kittens acquire language. If correct, certain implications follow immediately: c-command is not a condition that children acquiring their first language need to learn; rather (like structure-dependence) it acts as a constraint that determines the kind of hypotheses they can come up with in mastering their first language. Let us look at one generalization of the usefulness of c-command in English: its use in ‘binding theory’, the part of linguistics that deals with the distribution of pronouns and reflexives. It is a commonplace that reflexive pronouns such as myself, yourself, himself, and so on, have to agree (or ‘be compatible’) with their antecedent – the entity they refer back to, so the examples in (38) are fine, but those in (39) are ungrammatical: (38a) I admire myself (38b) The judge admires himself (38c) The waitress might flatter herself (39a) *I admire yourself (39b) *He admires herself (39c) *The waitress flattered ourselves

There are all sorts of other interesting complications with reflexives: if there are two possible antecedents, the sentence is ambiguous, so in (40) herself can refer to either the nurse or the woman:

(40) The nurse showed the woman some documents about herself

but this is true only if the two potential antecedents are in the same clause as the reflexive: (41) is unambiguous, and herself can refer only to the princess, because the queen is in a different clause: (41) The queen said the princess had disgraced herself

Neither of these extra considerations accounts for why (42a) is unambiguous and (42b) is simply ungrammatical: (42a) The mother of the princess has disgraced herself (42b) *The brother of the princess has disgraced herself
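The definition of c-command given above (A c-commands B if and only if the first branching node dominating A also dominates B) can be applied mechanically to the pair in (42). The sketch below is a minimal illustration under stated assumptions: the tree is a simplified guess at the structure of the subject (the text itself notes that the structure of possessives is contested), the FEMININE set is a toy lexicon, and the Node class and function names are hypothetical. It also adds the usual textbook proviso, not stated in the definition above, that A must not dominate B.

```python
class Node:
    """A tree node that records its parent so dominance can be checked upward."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    def dominates(self, other):
        """True if other lies somewhere below this node."""
        node = other.parent
        while node is not None:
            if node is self:
                return True
            node = node.parent
        return False


def c_commands(a, b):
    """a c-commands b iff the first branching node dominating a also dominates b."""
    # Extra proviso (an assumption, not in the text's definition): a must not dominate b.
    if a is b or a.dominates(b):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and node.dominates(b)


FEMININE = {"mother", "princess"}  # toy lexicon, illustrative only


def antecedents(head_noun):
    """Possible antecedents of 'herself' in 'The <head_noun> of the princess has disgraced herself'."""
    princess = Node("DP", Node("the"), Node("princess"))
    subject = Node("DP", Node("the"),
                   Node("NP", Node(head_noun), Node("PP", Node("of"), princess)))
    reflexive = Node("herself")
    Node("IP", subject, Node("I'", Node("has"), Node("VP", Node("disgraced"), reflexive)))
    candidates = {head_noun: subject, "princess": princess}
    return [noun for noun, dp in candidates.items()
            if noun in FEMININE and c_commands(dp, reflexive)]


print(antecedents("mother"))   # one antecedent: (42a) is unambiguous
print(antecedents("brother"))  # no antecedent: (42b) is ungrammatical
```

In both cases the princess is compatible with herself but fails to c-command it, so it is never a candidate; what differs is whether the subject as a whole is compatible.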

The question is why herself in (42) can’t refer back to the princess, but only to the mother or the brother, resulting in the judgements indicated. The answer is that the antecedent of the reflexive must not only be compatible and in the same clause, but must also c-command it. The structure of possessives such as the princess’s mother or the mother of the princess is a matter of contention, but what is not in dispute is that princess is lower in the tree than mother or brother and hence does not c-command the reflexive: compare the trees in (43a) and (43b) for (38c) and (42a):

350 History of Linguistics: Discipline of Linguistics

(43a) [tree diagram]

(43b) [tree diagram]

In both trees, the underlined DP (The waitress in (43a), The brother/mother of the princess in (43b)) c-commands herself, but the crossed-out DP The princess in (43b) does not c-command herself so cannot act as its antecedent. C-command is pervasive in the syntax of English, not just in accounting for polarity items and reflexives. More strikingly, it is pervasive in the syntax of every other human language. Consider (Cantonese) Chinese. Cantonese has a rich selection of sentence-final particles with a wide range of meanings from conveying a feeling of intimacy to indicating which element in the preceding sequence is the focus. In English, we can indicate this focus by means of stress, giving rise to the kind of difference in (44a) and (44b):

(44a) John only watches football (he doesn’t play it)
(44b) John only watches football (not cricket)

It’s even possible, with suitable pause and stress, to have (45b) with the same interpretation as (45a):

(45a) Only John watches football (not Bill)
(45b) John only, watches football (not Bill)

Just as in English, Cantonese uses stress to identify the intended focus from the set of possible foci, and the operator zaa3 (only) then associates with this intended focus, as in (46), which can have the various interpretations shown in (47):

(46) Billy tai   zukkau   zaa3
     Billy watch football zaa3

(47a) Only Billy watches football (not Peter)
(47b) Billy only watches football (he doesn’t play it)
(47c) Billy only watches football (not cricket)

There is good evidence (see Law, 2004) that zaa3 occurs in some C position of the sentence, and hence c-commands everything preceding it in the example in (46): see the tree in (48) (C is arguably final in Cantonese, not initial as it is in English):

(48) [tree diagram]

But to talk simply in terms of linear precedence or word order is inadequate. Cantonese also has a process of topicalization whereby a constituent – e.g., zukkau (‘football’) – can be moved to the front of the sentence, where it is attached even higher in the tree than zaa3, and marked with le1 (the 1 indicates a high level tone). This is shown in (49a), with a range of putative translations in (49b) to (49d). Crucially, as indicated by #, (49d) is not a possible interpretation of the Cantonese sentence.

(49a) zukkau-le1, Billy tai t zaa3
(49b) Football, only Billy watches
(49c) Football, Billy only watches
(49d) #Only football does Billy watch

Why this should be so is indicated in the tree in (50), where zukkau is not c-commanded by zaa3: (The ‘t’, for ‘trace’ in (48a) and (49a) indicates where the topicalized constituent zukkau moved from).

(50) [tree diagram]

Because zaa3 does not c-command zukkau, the attempted interpretation in (49d) is simply impossible. The examples are extremely simple, indeed extremely oversimplified, but the moral is clear: the same abstract syntactic condition (c-command)


operates in Chinese just as it does in English, and in every other language. It is worth emphasizing that successful explanations for one class of data are good to the extent that they generalize to phenomena for which they were not devised. C-command was not invented to account for Chinese, but the fact that it automatically accommodates quite subtle data in that language lends support to a theory that incorporates it. The point can be illustrated more widely. Every time one draws a tree of the kind illustrated above, one makes predictions about the well-formedness of a host of other sentences. It is striking that the trees in (36) and (37) exhibit a defining property of human language – its recursive power. That is, the possibility of including one sentence inside another sentence, potentially without limit, gives rise to the infinite expressive power of natural language syntax.
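The definition of c-command given earlier (a node A c-commands a node B if and only if the first branching node dominating A also dominates B) can be sketched as a small Python check. This is only an illustration under stated assumptions: the `Node` class, the function names, and above all the tree for (42a) are hypothetical simplifications, since the text itself notes that the structure of possessives is a matter of contention. Standard formulations usually add that neither node dominates the other; the sketch follows the wording used in this article and omits that clause.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A tree node; parent links are filled in when a node is given children."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """True if the first branching node dominating a also dominates b."""
    node = a.parent
    # Walk upwards until we reach a node with at least two children.
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


# Assumed, simplified structure for (42a):
# "The mother of the princess has disgraced herself"
princess = Node("DP: the princess")
subject = Node("DP: the mother of the princess", [
    Node("D: the"),
    Node("NP", [Node("N: mother"), Node("PP", [Node("P: of"), princess])]),
])
herself = Node("DP: herself")
sentence = Node("S", [subject, Node("VP", [Node("V: has disgraced"), herself])])

print(c_commands(subject, herself))   # the whole subject DP
print(c_commands(princess, herself))  # the embedded DP
```

Under these assumptions the whole subject DP c-commands herself, while the embedded DP the princess does not, because the first branching node above it (the PP) does not contain the reflexive.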

Linguistics as a ‘Science’

Making testable predictions of this kind is one of the hallmarks of science, and we can now elaborate on the claim that linguistics is ‘scientific’. For any discipline to be scientific it must satisfy (at least) the conditions in (51):

(51a) It must seek explanation
(51b) It must pursue universals
(51c) This will necessarily involve idealization which may well give rise to a tension between commonsense and science
(51d) Most crucially, it will make falsifiable predictions

The scientific enterprise is a search for explanatory laws or principles. That is, linguists – like physicists or molecular biologists – seek not only data, but also data that can be used as evidence for some theoretical claim. Consider again the analysis of reflexives. Early work in linguistics of the sort best exemplified by the work of Bloomfield (1935) provided detailed exemplification of a wide range of reflexive constructions from a variety of languages, but stopped short of trying to explain their distribution. One of the achievements of generative grammar has been precisely to explain – in terms of ‘binding theory’ – why reflexive pronouns have the distribution they do. To elaborate a little on the discussion given already under the ‘Explanation in Language’ section, the appearance of a reflexive pronoun is determined by Principle A of binding theory, which says that a reflexive must be ‘bound’ in some domain. As we saw, this means that it must have an antecedent which also meets a number of other conditions. Principle A is in contrast with Principle B, which determines the distribution of ‘ordinary’ pronouns. That is, between them the principles account for the range of facts discussed above as well as for the contrast between John admires him and John admires himself; why one can construe John and him as referring to the same person in (52b) but not in (52a), even though the latter seems to include the former as a proper subpart, and a host of other facts: (52a) John expects to see him (52b) I wonder who John expects to see him

Evidence for – or against – the claims of binding theory, or any part of the theoretical edifice, can be drawn from a wide variety of domains: the distribution of words in sentences; the acquisition of their first language by children, and of second and subsequent languages by both children and adults; the historical change of language over time; the processing of language – be it production or perception – in normal and abnormal circumstances; the problems that arise in pathology, as a result of language disturbance caused by a stroke or a tumor, and so on. In every case, explanation calls for concentration on those data that can provide evidence: the data themselves are trivial until embedded in a theory that can produce testable hypotheses. A concomitant of this search for explanation is that the generalizations made must carry over in relevant ways to all languages, not just to English or Latin or Chinese. That is, the quest for laws entails that any science must pursue universals, even if that means narrowing the domain of inquiry. This position has two implications: first, that the same principles should apply to Dutch and Hindi and Chinese – so ‘all languages’ is to be construed literally; but second, that the domain of application of these principles may not be superficially obvious. To take the second observation first, it is well known that there are so-called ‘emphatic’ reflexives, as illustrated in (53a) and (53b), which raise difficulties for any simple analysis of reflexivization in general: (53a) John himself came (53b) John came himself

These ‘reflexives’, so labeled because they include the morpheme {self }, have somewhat different properties from ‘real’ reflexives: for instance, they don’t have any thematic role, (came takes only one argument – you can’t ‘come somebody else’), but simply emphasize the importance of the one role mentioned. On the other hand, they clearly do obey some of the same constraints as ordinary reflexives, as witness the peculiarity of the examples in (54): (54a) *The boy came herself

(54b) *The boy’s mother himself came

This duality suggests that it might be necessary – as a temporary measure – to limit the domain of binding theory to arguments taking a thematic role, leaving the emphatic examples to be accommodated later after further research. The situation is parallel to the development of a scientific theory of motion. For Aristotle, all motion fell within the ambit of his theory of movement, even the movement of flowers growing. Galileo was able to provide a unified account of terrestrial and heavenly motion by restricting attention to mechanical motion and excluding biological growth. This should not be viewed as a retreat to a position where whatever you say turns out to be true, simply because you have excluded those areas where what you say is false. Rather it is an attempt to define an area where we can begin to understand the complexity of the real world by focusing on phenomena that are comprehensible. This narrowing is of two kinds: first, one can simply ignore data which fall outside the generalization one is attempting to explain; second, there is scientific idealization – the pretence that things are simpler than they really are. This is justified because such simplification enables one to approach an understanding of the abstract principles which underlie complex phenomena. Such idealization in linguistics was first made explicit in Chomsky’s distinction between competence and performance and his claim that ‘‘linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community.’’ We all know that real speech communities are not homogeneous, but the force of the idealization is that the heterogeneity that does exist is not a necessary component in an adequate characterization of our knowledge of language or how we come by that knowledge. Consider in this latter respect the simplifying assumption – the striking idealization – that first language acquisition is ‘instantaneous’. 
It is obvious that children take a considerable time to master the intricacies of their first language. Given how complex the knowledge they end up with is, it may still be justifiable to talk of the surprising speed with which they reach this mastery, but it is not by any stretch of the imagination ‘instantaneous’. So what is the force of the assumption? Consider the acquisition of negation. Most, perhaps all, children go through a stage in which they produce negative sentences with the negative marker (no or not in English) in peripheral position in the sentence – i.e., first or last – as in (55a) and (55b), heard from two different two-year-olds: (55a) No computer on (55b) Computer on no

The context made it clear in each case that the force of the utterance was an order not to turn the computer on. Superficially, it looks as if the two children have different grammatical systems (though they were equally proficient at understanding adult instructions, suggesting that their grammar was more sophisticated than might appear). What is relevant here, however, is the fact that – as far as is known – both children will end up with the same grammatical knowledge of English negation. That is, the different stages they go through in their acquisition of the details of the grammar have no effect on the knowledge they end up with – their adult competence. This claim may, of course, be false. It might turn out that adults who uttered (55a) as children have different grammars from those who uttered (55b) as children. It’s possible, but there is no evidence to that effect, and the idealization to instantaneity is accordingly justified. If one of the things we wish to explain is how humans can progress from a stage in which they are apparently language-less to a stage of adult knowledge, it is advantageous to be able to abstract away from the different paths they may take in acquiring that knowledge. The idealization also simplifies the account of the initial state of the language faculty: what needs to be attributed to the mental make-up of human infants to explain the fact that they do, while infant chimps do not, acquire language. Idealization of this kind is in turn likely to involve a tension between commonsense and science. The claim of instantaneous language acquisition seems blatantly silly until one considers more carefully what it means. Consider a second example, again from first language acquisition. Children regularly mispronounce the words they are learning, sometimes with surprising results, as in the case of the puzzle puzzle. When he was about two and a half, my son – like many children – used to pronounce puddle as ‘puggle’ ([pʌgəl]). 
He was perfectly consistent, and used to pronounce words of a comparable kind with the same kind of deformation: so bottle became ‘bockle’, pedal became ‘peggle’, and so on. The obvious explanation for this behavior was that, for reasons of motor control, he was unable to pronounce puddle. But at the same time as he made this mispronunciation, he was also making ‘mistakes’ with words such as zoo, pronounced as ‘do’, lazy, pronounced as ‘lady’, and so on. The result was striking: although he pronounced puddle as ‘puggle’, he consistently pronounced puzzle as ‘puddle’ ([pʌdəl]), so the reason for the former ‘mistake’ could clearly not be that he was incapable of the appropriate motor control. He could pronounce ‘puddle’, but only as his version of puzzle, not for puddle. So the commonsense explanation of the phenomenon was wrong. An


obvious alternative explanation was that he couldn’t hear the difference, but that hypothesis wasn’t much more plausible either, as his pronunciations of the two words were consistently different, indicating that he must be able to perceive the contrast. So the second ‘obvious’ commonsense explanation was equally problematic. The correct explanation was provided by Marcy Macken, who demonstrated that there was a perceptual problem, but not between puzzle and puddle, but rather between puddle and puggle. Of course, puggle is not a word of English, so I had failed to observe relevant examples. Words like riddle and wriggle provide a straightforward minimal pair, but they had not been in my son’s vocabulary. Fortunately, Macken observed that other examples made the case as well as the (missing) minimal pair did. Words such as pickle were intermittently pronounced ‘pittle’ ([pit?l]) suggesting that there was indeed perceptual confusion. The puzzle puzzle could only be solved when the difference between a variety of other examples was simultaneously taken into account. I have gone on about this example at such length because it illustrates the beauty of being (potentially) wrong. The most crucial part of the scientific enterprise is that it makes testable (or ‘refutable’ or ‘falsifiable’) predictions. Because my son regularly distinguished puddle and puzzle, and similar examples, I had claimed explicitly that he had no perceptual problem. Macken showed that I was wrong and, on the basis of my own data, showed how I was wrong, leading to an improvement in our general understanding of language acquisition, and the language faculty more generally. Such falsifiability is pervasive in linguistics as in all the sciences, and suggests that many, perhaps all, our hypotheses and principles will be in need of revision when we get a better understanding of what is going on. 
It follows that binding theory, which I have appealed to above, is probably wrong, and will need replacing by some more sophisticated theory in due course. Again this is to be welcomed, though we must simultaneously guard against the danger of ‘naive falsificationism’. There are always going to be contrary data that one’s current theory cannot explain. This is not a reason for simply jettisoning the theory and whatever insights it may provide, but a point of departure for refinement and extension. A clear example is provided by the theory of parametric variation, and the striking revision of his earlier work in Chomsky’s current Minimalist Program (1995). I have suggested that, like all principles of grammar, binding theory should be universal. But there are problems. Even though (virtually) all languages have reflexives, their distribution is subject to slightly different conditions in different languages. Consider again the contrast between (40), The nurse showed the woman some documents about herself, and (41), The queen said the princess had disgraced herself, where the former is ambiguous but the latter is unambiguous. The contrast was attributed to the fact that (in English) the antecedent of a reflexive must be in the same clause. So far so good, but if one takes equivalent examples in Chinese, it turns out that the equivalent of (40) is unambiguous, and the equivalent of (41) is ambiguous. The theory would appear to have been refuted: a prediction was made, it was tested, and found to be false. But simply giving up the theory would be defeatist, and it would also mean giving up the explanation for the data it does account for. The solution is interesting: the universality of binding theory (and likewise for other subtheories of the grammar) is maintained, but some latitude is allowed in the definitions involved – they are ‘parametrized’, as the jargon has it. In this case, all reflexives have to have an antecedent, but language learners have to choose (on the basis of the data they are exposed to) among several other options: whether they are learning a language in which that antecedent has to appear in the same clause or in some other well-defined domain; whether the antecedent has to be a subject or can bear other grammatical relations, and others. In Chinese, the antecedent of a reflexive must be a subject, so (40) is unambiguous; on the other hand, that antecedent does not have to be in the same clause, so (41) is ambiguous. If you are worried that this is too simple a get-out, an analogy with incest may be helpful: all cultures appear to have an incest taboo forbidding sexual relations between relatives (for instance, fathers and their daughters). The taboo is universal. But how that taboo is instantiated is culture-specific: for example, some groups allow cousins to marry, others do not. 
The situation with regard to language and language-learning is somewhat more complex than the cultural example, because there are many more choices to be made. The acquisitional task is more complex than it would have been if all languages were exactly like English, but it is not as severe as one might fear. The idea is that the full range of parametric choices in language is available to the child prior to experience – they are in some sense innate – and the child’s task reduces to choosing from a set of structural options it already ‘knows’.
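The parametrized treatment of reflexives described above can be illustrated with a short Python sketch. It is a deliberate simplification, not an implementation of binding theory: the class names, field names, and the two parameter settings are assumptions made for illustration, encoding only the two choices the text mentions (whether the antecedent must be in the same clause, and whether it must be a subject). C-command and feature compatibility are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A potential antecedent for a reflexive (hypothetical encoding)."""
    name: str
    is_subject: bool
    same_clause: bool  # is it in the same clause as the reflexive?


@dataclass(frozen=True)
class BindingParameters:
    """Two of the parametric choices mentioned in the text."""
    same_clause_only: bool
    subject_only: bool


def possible_antecedents(candidates: List[Candidate],
                         params: BindingParameters) -> List[str]:
    """Return the names of the candidates the parameter setting allows."""
    allowed = []
    for cand in candidates:
        if params.same_clause_only and not cand.same_clause:
            continue
        if params.subject_only and not cand.is_subject:
            continue
        allowed.append(cand.name)
    return allowed


# Assumed settings, following the description of English and Chinese above.
ENGLISH = BindingParameters(same_clause_only=True, subject_only=False)
CHINESE = BindingParameters(same_clause_only=False, subject_only=True)

# (40) The nurse showed the woman some documents about herself
EX_40 = [Candidate("the nurse", is_subject=True, same_clause=True),
         Candidate("the woman", is_subject=False, same_clause=True)]

# (41) The queen said the princess had disgraced herself
EX_41 = [Candidate("the queen", is_subject=True, same_clause=False),
         Candidate("the princess", is_subject=True, same_clause=True)]

for label, example in (("(40)", EX_40), ("(41)", EX_41)):
    print(label, "English:", possible_antecedents(example, ENGLISH))
    print(label, "Chinese:", possible_antecedents(example, CHINESE))
```

With these assumed settings, (40) has two possible antecedents under the English setting and one under the Chinese setting, while (41) shows the reverse pattern, matching the ambiguity judgements reported in the text.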

Beyond Language: Pragmatics and the Language of Thought

We have looked at a wide range of examples illustrating some of our knowledge of phonology, morphology, semantics, and (mainly) syntax, but we also have knowledge that goes beyond words and sentences. Consider (56a) and (56b): as a remark about Fred, (56a) is fine, with stress on bats as indicated by the bold print, but as a reply to the question in (56b) it is anomalous: (56a) Fred has written a book about bats (56b) Who has written a book about bats?

Such discoursal knowledge must be distinguished both from syntactic knowledge of the kind that tells us that (57) is ungrammatical: (57) Fred has written about bats a book

and from real world knowledge of the kind that prompts our scepticism about (58a) and (58b): (58a) Bananas have legs (58b) Your saucer is being aggressive again

Someone who utters (56a) in response to (56b) probably needs remedial English lessons; someone who utters either of the sentences (58a) or (58b) is either a linguist or in need of psychiatric help. This brings us into the realm of pragmatics, our interpretation of utterances in context, and to the relation of language to thought. The examples in (58a) and (58b) are felt to be odd not because of our linguistic knowledge – you get the same effect whatever language you translate them into – but because we know that the world isn’t like that. It is our encyclopedic knowledge that tells us this, not knowledge of our language (English). However, when we interpret someone’s utterances in some context, we habitually use both our knowledge of English (or whatever other language we are using) and our encyclopedic knowledge. Suppose you hear (3a) Tom is a problem. Your knowledge of English vocabulary and grammar provides you with a meaning for the sentence, but it doesn’t tell you enough to act. Is your interlocutor looking for sympathy, asking you to do something about it, hoping for a denial? Any or all of these may be what you decide is the case on a particular occasion, but you carry out this construal on the basis of your knowledge of the speaker, of Tom, of your past exchanges with both of them, and so on, indefinitely. The core notion involved is what is ‘relevant’, an idea that has been made explicit in Relevance Theory, an important extension of linguistics. We are now beyond the language faculty and can hand over responsibility to other disciplines; but one final question needs to be addressed: What is language for? There are two standard answers: for thought and for communication. Both answers are true, but both need a little hedging. First, we can obviously communicate without using language by means of coughs,

sniffs, gestures, and so on. But language is far more subtle than any other system known: conveying specific negative or conditional propositions by means of gestures or sniffing is not obviously possible. Innumerable other creatures have complex communication systems, but none of them, as far as we know, have anything with the recursive power of human syntax (see Sperber and Wilson, 1995; Hauser et al., 2002). Second, the system we use to think with must have a great deal in common with the natural languages we speak, but it is not identical to them. The language of thought can include elements that natural languages cannot – visual images, for instance; and natural languages have properties that would be unnecessary, or even unhelpful, in the language of thought – pronouns, for instance. If I tell you that she is beautiful, it’s of no use to you storing that in memory as ‘she’ is beautiful; it has to be stored with a name or some other description replacing she. Nonetheless, language has a central role in each of these domains, linking perception and articulation on the one hand to thought processes on the other. This means that the output of our language faculty must be ‘legible’ to these other systems. Language acts as a code linking representations of sound to representations of meaning. These representations must then be in a form that makes it possible for the sensorimotor apparatus to convert them into pronunciations and percepts, and for the conceptual system to use them for thinking, especially inference. So, linguistics provides an account of each of syntax, phonology, morphology, and semantics, and how they relate to each other; pragmatics then tells us how such purely linguistic representations relate to the language of thought – the medium in which we think and carry out inferencing. This relation underlies our ability to interpret the world and the people in it, but the linguistic component is only the first step on the journey. 
We normally take someone who utters ‘torture is immoral’ to believe that torture is immoral, and we expect to be able to predict (at least some of) their actions on the basis of this. But people may lie, and about that linguistics has nothing to say.

See also: Data and Evidence; Language of Thought; Linguistic Anthropology; Phonetics: Overview; Phonology: Overview; Pragmatics: Overview; Principles and Parameters Framework of Generative Grammar; Psycholinguistics: History; Syntax-Semantics Interface.

Bibliography

Baker M C (2001a). ‘The natures of nonconfigurationality.’ In Baltin M & Collins C (eds.) The handbook of contemporary syntactic theory. Oxford: Blackwell. 407–438.
Baker M C (2001b). The atoms of language: the mind’s hidden rules of grammar. Oxford: Oxford University Press.
Bauby J-D (1997). The diving-bell and the butterfly. London: Fourth Estate.
Bloomfield L (1935). Language. London: Allen & Unwin.
Carston R (2002). Thoughts and utterances: the pragmatics of explicit communication. Oxford: Blackwell.
Chierchia G & McConnell-Ginet S (2000). Meaning and grammar: an introduction to semantics (2nd edn.). Cambridge, MA: MIT Press.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky N (1994). ‘Chomsky, Noam.’ In Guttenplan S (ed.) A companion to the philosophy of mind. Oxford: Blackwell. 153–167.
Chomsky N (1995). The minimalist program. Cambridge, MA: MIT Press.
Fabb N (2002). Language and literary structure: the linguistic analysis of form in verse and narrative. Cambridge, UK: Cambridge University Press.
Fromkin V (ed.) (2000). Linguistics: an introduction to linguistic theory. Oxford: Blackwell.
Fromkin V, Rodman R & Hyams N (2002). An introduction to language.
Gussenhoven C (2002). Phonology: analysis and theory. Cambridge, UK: Cambridge University Press.
Hauser M, Chomsky N & Fitch W T (2002). ‘The faculty of language: what is it, who has it, and how did it evolve?’ Science 298 (Nov. 22), 1569–1579.
Heim I & Kratzer A (1998). Semantics in generative grammar. Oxford: Blackwell.
Huddleston R & Pullum G K (2004). The Cambridge grammar of English. Cambridge, UK: Cambridge University Press.
Hudson R A (1990). English word grammar. Oxford, UK: Blackwell.
Hudson R A (1996). Sociolinguistics (2nd edn.). Cambridge, UK: Cambridge University Press.
Jackendoff R (2002). Foundations of language: brain, meaning, grammar, evolution. Oxford, UK: Oxford University Press.
Law A (2004). ‘Sentence-final focus particles in Cantonese.’ Ph.D. thesis, University College London.
Macken M (1980). ‘The child’s lexical representation: The Puzzle-Puddle-Pickle evidence.’ Journal of Linguistics 16, 1–17.
McGilvray J (1999). Chomsky: language, mind, and politics. Cambridge, UK: Polity Press.
Neeleman A & Weerman F (1997). ‘L1 and L2 word order acquisition.’ Language acquisition 6, 125–170.
Radford A (2004). English syntax: an introduction. Cambridge, UK: Cambridge University Press.
Smith N V (1973). The acquisition of phonology: a case study. Cambridge, UK: Cambridge University Press.
Smith N V (1989). The Twitter Machine: reflections on language. Oxford, UK: Blackwell.
Smith N V (1999/2004). Chomsky: ideas and ideals. Cambridge, UK: Cambridge University Press.
Smith N V (2002). Language, bananas and bonobos: linguistic problems, puzzles and polemics. Oxford: Blackwell.
Sperber D & Wilson D (1995). Relevance: communication and cognition. Oxford: Blackwell.

History of Linguistics in Central and South America

O Zwartjes, Universiteit van Amsterdam, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Spanish and Portuguese missionaries dedicated themselves to the study of non-Indo-European languages, resulting in the publication of a great number of works in or about the Amerindian languages. Although missionaries based and structured their works in agreement with the traditional Greco-Latin grammatical framework, they were forced by the linguistic facts to search for unconventional solutions in their descriptions of languages that had never been studied before in Europe. There is no doubt that their approaches were in many cases original and creative in all subdisciplines of grammar: phonology; orthography (the rendering of sounds never heard or written before); morphology–morphosyntax (their model predated ours, not yet elaborated, used to describe phenomena such as agglutination, incorporation, different word structure, processes of derivation, and composition in typologically different languages); syntax (e.g., topicalization, ergativity, different word order, the dichotomy active and nonactive, incorporation); and lexicography – another challenge for the missionaries and an important tool for the translation of religious texts. Missionary works are important sources for other subdisciplines of linguistics, such as sociolinguistics (language variety, language change, standardization), pragmatics, translation theories, and cultural anthropology; for reasons of space these will not be treated in this article. Descriptive linguistics always requires an underlying theoretical framework and reflects the linguistic thinking of the period in question. Interest in

Linguistics as a Science 227

whereby modifiers are always located on the same side of the head. Bybee (1988) argues, however, that the correlation is actually motivated in diachronic terms: adpositional constructions typically originate from genitival constructions, and they maintain the order of genitival constructions. In this case, Bybee argues, functional principles may be invoked to account for why adpositional constructions develop from genitival constructions in the first place, but no functional principle can actually be invoked to account for the word order correlations between the two constructions. It should finally be observed that a widespread assumption in the typological approach is that the various language types allowed by implicational universals reflect different functional principles, and the competition between the various functional principles determines cross-linguistic variation (Du Bois, 1985). For example, an implicational pattern exists such that if the singular is expressed by a certain number of morphemes, then the plural will be expressed by at least as many morphemes. This is usually explained in terms of an economic principle, whereby the singular is the most frequent category at the discourse level and therefore need not be expressed overtly. One such explanation accounts for the existence of languages in which the singular is zero-marked and the plural is marked overtly. However, the implicational pattern also allows for languages in which both singular and plural are expressed overtly. These languages reflect an iconic principle whereby all aspects of conceptual structure (in this case, the values ‘singular’ and ‘plural’) are expressed overtly (Croft, 2003). The implicational pattern allows for both language types, but the two language types reflect competing functional principles, each prevailing in different languages. 
The competition between the different principles is what determines cross-linguistic variation, as described by the implicational pattern itself.

See also: Linguistic Universals, Chomskyan.

Bibliography

Bybee J (1988). ‘The diachronic dimension in explanation.’ In Hawkins J A (ed.) Explaining language universals. Oxford: Basil Blackwell. 350–379.
Comrie B (1989). Language universals and linguistic typology (2nd edn.). Oxford: Basil Blackwell.
Croft W (2001). Radical construction grammar. Oxford: Oxford University Press.
Croft W (2003). Typology and universals (2nd edn.). Cambridge: Cambridge University Press.
Dryer M S (1992). ‘The Greenbergian word order correlations.’ Language 68, 81–138.
Du Bois J A (1985). ‘Competing motivations.’ In Haiman J (ed.) Iconicity in syntax. Philadelphia: John Benjamins. 343–366.
Fox B A (1987). ‘The noun phrase accessibility hierarchy reinterpreted: Subject primacy or the absolutive hypothesis?’ Language 63, 856–870.
Givón T (1980). ‘The binding hierarchy and the typology of complements.’ Studies in Language 4, 333–377.
Greenberg J H (1963/1966). ‘Some universals of grammar with particular reference to the order of meaningful elements.’ In Greenberg J H (ed.) Universals of language (2nd edn.). Cambridge, MA: MIT Press. 73–113.
Greenberg J H, Ferguson C A & Moravcsik E A (eds.) (1978). Universals of human language (4 vols). Stanford: Stanford University Press.
Haiman J (1983). ‘Iconic and economic motivation.’ Language 59, 781–819.
Haiman J (1985). Natural syntax. Cambridge: Cambridge University Press.
Hawkins J A (1983). Word order universals. New York: Academic Press.
Hawkins J A (1994). A performance theory of word order and constituency. Cambridge: Cambridge University Press.
Keenan E L & Comrie B (1977). ‘Noun phrase accessibility and universal grammar.’ Linguistic Inquiry 8, 63–99.
Newmeyer F J (1998). Language form and language function. Cambridge: The MIT Press.
Newmeyer F J (2004). ‘Typological evidence and Universal Grammar.’ Studies in Language 28, 526–548.
Rijkhoff J & Bakker D (1998). ‘Language sampling.’ Linguistic Typology 2, 263–314.

Linguistics as a Science
B Clark, Middlesex University, London, UK
© 2006 Elsevier Ltd. All rights reserved.

A common description of linguistics is that it is the ‘scientific study of language.’ This might seem to be a loose or metaphorical use, since the subject matter of linguistics is quite different from that of what are often thought of as the ‘hard’ sciences such as physics or chemistry. But linguists are engaged in a process of inquiry that aims to discover facts about the world we live in, and so their work shares important properties of other sciences. Some work in linguistics (e.g., acoustic phonetics) resembles the ‘hard’ sciences in that it studies physical phenomena in the world. Like psychology, linguistics faces specific issues associated with the fact that its subject matter involves properties of humans, namely, linguistic knowledge and behavior. This article considers some views on what it means to say that a discipline is scientific, what it means to investigate language scientifically, and some different scientific approaches adopted by linguists.

What Is a Science?

The general, popular assumption about what constitutes a science is still probably one based on ‘inductivism,’ or logical positivism, which was the dominant view of science at the start of the 20th century. In this view, scientists must begin by making sure that they are as objective as possible and simply observe relevant data without prejudging it. They must also make every effort to ensure that they do not themselves affect the data that they are studying. After objectively observing the data, generalizations will emerge and from these generalizations, laws can be derived. Suppose, for example, that we go to the Antarctic and observe penguins. After a certain number of observations, we might notice that we have never seen a penguin fly. On the other hand, we have seen many penguins swim. So we wonder whether these facts might be the basis of generalizations. We continue to observe and every penguin we observe swims but none of them fly, so we do generalize and come up with the hypotheses that:

(1) No penguins can fly.
(2) All penguins can swim.

These hypotheses can then be tested by further observation. We might also devise specific tests. For example, we might encourage penguins into the air to see whether any of them attempt to fly. Or we might put penguins in water to make sure each of them can swim. Hypotheses (1) and (2) will hold as long as every penguin we observe swims but fails to fly. If repeated tests confirm the hypotheses, then they will be established as verified conclusions. This approach seems intuitive and clear to most people, but there are serious problems with it. Adapting a diagram from Chalmers (1999: 54), we can represent this way of approaching science as in Figure 1.

Figure 1 Inductivist model of science (based on Chalmers, 1999: 54).

Perhaps the most fundamental problem with this model is that the derivation of laws and theories is based on induction, which means that they can never be guaranteed to be true. To see this, we need to look at what is involved in induction. Induction is a process whereby general conclusions are derived based on evidence provided by specific observations in the past. If I go to the same bar often enough and order the same drink, say, a pint of stout, every time, it is possible that the bartender will conclude that I will want a pint of stout every time I come into the bar. If he is keen to make sure I am happy, he might even start pouring my stout as soon as he sees me arrive. The problem, of course, is that there is nothing to stop me deciding one day to have something different for a change. If I do order a different drink one day, say a glass of lemonade, then the bartender will have wasted his time, and possibly his stout. In other words, conclusions that have been derived through a process of induction are not secure. By contrast, deductive conclusions are guaranteed to be true as long as the premises on which they are based are true. Here are possible steps in the bartender’s inductive reasoning process:

(3a) Billy ordered a pint of stout when he came into the bar on Monday.
(3b) Billy ordered a pint of stout when he came into the bar on Tuesday.
(3c) Billy ordered a pint of stout when he came into the bar on Wednesday.
(3d) Therefore, Billy will order a pint of stout every time he comes into the bar.

And here are two examples of deductive inferences:

(4a) Billy will order a pint of stout every time he comes into the bar.
(4b) Therefore, Billy will order a pint of stout when he comes into the bar on Thursday.

(5a) Billy is drinking stout.
(5b) All stouts are alcoholic drinks.
(5c) Therefore, Billy is drinking an alcoholic drink.
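The insecurity of the inductive step in (3) can be illustrated with a minimal sketch in Python. It is offered only as an illustration, not as part of the original argument: the function name `find_counterexample`, the data layout, and the Thursday order of lemonade are assumptions introduced here, loosely following the bartender example. A generalization that fits every observation so far can still be overturned by a single later observation.

```python
from typing import Callable, Iterable, Optional, Tuple

Order = Tuple[str, str]  # (day, drink): hypothetical layout for this sketch


def find_counterexample(
    hypothesis: Callable[[Order], bool], observations: Iterable[Order]
) -> Optional[Order]:
    """Return the first observation that contradicts the hypothesis, or None."""
    for observation in observations:
        if not hypothesis(observation):
            return observation
    return None


# Observations corresponding to (3a)-(3c).
orders = [("Monday", "stout"), ("Tuesday", "stout"), ("Wednesday", "stout")]


# The inductive generalization (3d): Billy orders stout on every visit.
def always_stout(order: Order) -> bool:
    _day, drink = order
    return drink == "stout"


# The generalization fits all three past observations...
before = find_counterexample(always_stout, orders)

# ...but one further visit (assumed here) is enough to contradict it.
after = find_counterexample(always_stout, orders + [("Thursday", "lemonade")])

print(before)  # no counterexample among the first three visits
print(after)   # the Thursday order contradicts the generalization
```

No number of agreeing observations makes `before` a proof of (3d); the check can only ever report that no counterexample has been found yet.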

While (3a–c) could be true and (3d) still turn out to be false, there is no way that (4a) could be true and (4b) false, or that (5a–b) could be true and (5c) false. The unreliability of inductive inferences is a serious problem for the inductivist approach to science, since it means that no conclusion can ever be safely asserted to be true. You never know for sure that the next penguin you look at won’t be able to fly, or whether the next penguin you look at will indeed be able to swim. This means not only that we can’t be sure of our conclusions but also that we can never be sure that our scientific endeavors have resulted in any progress. It is always possible that we have just moved from one false hypothesis to another, since we have not been lucky enough to come across the data that would demonstrate our mistake.

Another problem with this view of scientific inquiry is that it is not possible to observe phenomena objectively without first making some assumptions about what might be relevant. Suppose, for example, that I decide I am interested in how children acquire their first language. How will I know which things to observe? Should I begin by observing the speech of the children themselves? Or the speech of other children around them? Or the speech of grownups around them? Or other kinds of behavior exhibited by the children themselves? Or other kinds of behavior exhibited by other children? Or other kinds of behavior exhibited by grownups around them? Or the extent to which they see or hear television? Or their diet? There are countless possibilities and we cannot begin carefully observing data without first guessing which particular data might be relevant. In the same way, hypotheses about whether penguins can fly or swim arise because someone asks the question whether they can. A scientist following the inductivist model might just as easily have decided to look at the penguins’ color, their physical behavior, how they mate, how many of them are ever together in one group, and so on. The hypotheses about their status as flightless swimmers will only arise if it is assumed that it might be relevant in this context. Not only does the inductivist model of science mean we can never be sure we’ve made any progress, we also don’t have any clear rationale for deciding on our first steps in investigating a particular phenomenon. These were acknowledged problems for the traditional, inductivist model of science.
But no superior model was available until the philosopher Karl Popper (1972, 1979) suggested a new way of thinking about science. He pointed out that, even though we can never be sure an assumption is true, we can demonstrate a conclusion to be false. If I order a glass of lemonade even once, then we know that it’s not true that I only ever order stout. If we see just one penguin that can fly, then we know for certain that not all penguins are flightless. He pointed out that falsifying a hypothesis counts as progress, not only because it removes an error from a theory but also because it usually leads to a new, improved hypothesis. Suppose, for example, that we discover a flying penguin. We will not simply reject our initial hypothesis that all penguins are flightless and start again from scratch. Instead, we will wonder what the difference is between those penguins that fly and those that don’t. We might, for example, discover that there is a particular penguin species that can fly and come up with a new hypothesis to reflect that.

Another important point Popper made was that the source of hypotheses is much less important than whether they make clear predictions and so can be tested to see if they are false. Newton’s theory of gravity, for example, clearly predicts that any object we drop will fall towards the earth. Other things being equal, we know that a floating or rising object will demonstrate that the hypothesis is false. By contrast, if your horoscope claims that ‘someone is thinking of you,’ it is clearly impossible to show that this is not so. Therefore, this is not a falsifiable, and so not a scientific, claim.

Popper’s approach suggests that observation does not need to be the first step in a process of scientific inquiry. Instead, hypotheses can, and indeed must, precede observation. Hypotheses may arise because of objective observation, because of some subjective prejudice, because we dream them, or from any source at all. What is important is that we can test them and so attempt to make progress in our understanding. Popper’s view of how science progresses can be represented diagrammatically, as in Figure 2.

Figure 2 Popperian (falsificationist) model of science.

The foundations of science, in this view, are not based on objectivity and the notion that observation precedes hypothesis formulation. Instead, what is important is that our hypotheses are clearly formulated and testable. Since we can never verify a hypothesis, we aim instead to develop the best hypotheses we can, and this is evaluated in terms of clarity and falsifiability. Because of the prominence it gives to hypothesis formation and to deductive inferences, Popper’s vision of science is also known as the ‘hypothetico-deductive’ model. Naturally, there are a number of alternative visions.
Imre Lakatos (1970) developed a more sophisticated falsificationist model based on considering the nature of scientific research programs rather than isolated hypotheses, or sets of hypotheses. Thomas Kuhn (1970) suggested a sociological model, in which he described science in terms of paradigms. In Kuhn’s view, a scientist proceeds first by learning a current paradigm. He then solves ‘puzzles’ using the tools available within the paradigm. Sometimes, puzzles are not solvable within the existing paradigm. If this continues, a crisis occurs that eventually leads to a ‘revolution’ in which the existing paradigm is discarded and a new paradigm takes its place. One important feature of Kuhn’s approach is that its sociological nature means that it does not provide a means of distinguishing between scientific and nonscientific statements other than in terms of what is accepted by the relevant group of scientists. A more radical position is taken by Paul Feyerabend (1975, 1978), who claims that the notion of a reliable scientific methodology is an illusion and that scientists have failed to demonstrate that their findings have more value than any other kinds of ‘wisdom.’

Whichever philosophy of science is assumed, work in linguistics fits the model as well as any other science. Linguists do not all agree about the nature of their scientific endeavor, but the majority of linguists do see linguistics as a science; and Popper’s views on what constitutes scientific activity have been influential in linguistics, just as they have been in other disciplines. (For an introduction to the philosophy of science, see Chalmers, 1999.)

The Scientific Study of Language

As Yngve points out, “the origins of linguistic theory can be recognized in Plato and Aristotle, but most clearly in the early Stoics from about 300 to 150 B.C.” (Yngve, 1996: 14). Modern scientific linguistics began to develop in the early 19th century in the work of scholars such as Rasmus Rask, Jacob Grimm, Franz Bopp, and Wilhelm von Humboldt. Rask (1830) referred to Linnaeus and Newton in proposing that language was a natural object that should be studied scientifically. The main focus at this time was on the comparative method, looking at similarities among different languages and using this evidence to reconstruct the ancestors of languages. (For a fuller account of the history of linguistics, see Robins, 1990.)

The notion that linguistics is a science has continued since then, while assumptions about what makes linguistics scientific have changed. Perhaps the most significant developments have been the work of Ferdinand de Saussure (1972, 1983), which is usually seen as the starting point of modern linguistics; the development of a rigorous notion of linguistics as a science by Leonard Bloomfield (1926, 1933, 1970) and his American contemporaries in the first half of the 20th century; and the overturning of Bloomfield’s approach by the work of Noam Chomsky (1957, 1965). In linguistics today, there is a wide range of approaches, methodologies, and notions of what is scientific, but the Chomskyan approach remains the dominant one.

Saussure’s ideas established the notion that linguistics could be ‘synchronic’ (concerned with a particular language at a particular point in time) as well as ‘diachronic’ (looking at how a language has developed over time). Saussure’s work had considerable influence in the development of structuralist approaches in linguistics and beyond. One particularly significant structuralist approach was that of Bloomfield and his followers in the first half of the 20th century, who developed a much more detailed view of linguistics as a science. As Robins (1990: 208) puts it, “Bloomfield was rigorously scientific, in the light of his own, mechanist, interpretation of science, concentrating on methodology and on formal analysis.” It was important for Bloomfield that linguistics should be seen as a scientific enterprise, and his view of what it meant to be scientific was an empiricist one, based on behaviorist psychology. A scientific approach was an empirical study based on objective observation of facts that would lead the scientist to discover facts about languages. It was vital to avoid subjectivity. This meant avoiding hypotheses that did not emerge from objective observation, and it also meant denying the existence of the mind or of mental phenomena. This was because the data alone, considered objectively, did not justify the assumption that mental phenomena really existed. For behaviorists, all behavior could be explained in terms of external stimuli and the reflexes that they caused. Even very complex behavior, such as linguistic behavior, could be understood in terms of complex responses to stimuli. Perhaps the main concern of linguistics was to record facts about particular languages. This was given urgency by the fact that many Native American languages were threatened with extinction. It was important that they should be recorded before they were lost forever. Linguists developed a number of ‘discovery procedures’ that could be used to scientifically (i.e., objectively) work out facts about the languages being studied.
Chomsky’s approach explicitly rejected at least the following assumptions of the Bloomfieldian approach:

• That observation should precede hypotheses
• That the ultimate aim of linguistics was to describe
• That languages were to be understood as collections of utterances (phenomena external to the mind)
• That linguistics should not presuppose the existence of mental phenomena.

Chomsky argued instead that:

• The main aim of linguistics was to construct theories of language and languages
• The ultimate aim of linguistics was to explain language and languages
• Language should be understood as a mental phenomenon (and languages as mental phenomena)
• There was strong evidence for the existence of the human mind and a cognitive system of knowledge of language.

In general, the key notion in Chomsky’s work was that there was convincing evidence for the existence of mental structures underlying human language. This evidence came from the linguistic intuitions of speakers. These included intuitions about what is and is not possible in languages they know. A famous example is the contrast between eager to please and easy to please illustrated in (6)–(7):

(6a) John is easy to please
(6b) It is easy to please John
(7a) John is eager to please
(7b) *It is eager to please John

(The asterisk is used here to indicate that most speakers judge (7b) to be unacceptable. It is also used sometimes to indicate the theoretical claim that (7b) is not grammatical in English.) Speakers of English agree that (6a) and (6b) are both acceptable and have similar meanings. Although there is no logical reason for ruling out (7b) and it is easy to see what it would mean by analogy to (6a), speakers agree that (7b) is not an acceptable utterance. This can be explained by the assumption that a cognitive system of knowledge, a grammar, licenses (6a), (6b), and (7a) but rules out (7b). Using examples like this, Chomsky argued for the existence of mental grammars (‘competence’) that underlay actual human linguistic behavior (‘performance’). In Chomsky’s (1986: 15–51) terms, linguistics should move from the study of ‘E-language,’ or ‘externalized language,’ to the study of ‘I-language,’ or ‘internalized language.’ This approach revolutionized linguistics, was a major influence on the so-called ‘cognitive revolution’ that reestablished the mind as a focus of study in psychology, and led to the establishment of the discipline of cognitive science.

So what is the scientific methodology of Chomskyan linguistics like? To a large extent, it follows the Popperian model represented in Figure 2. The focus of linguistics is very much on coming up with clearly stated, testable hypotheses, testing them, and constantly updating hypotheses based on how they respond to testing. However, things are not as straightforward as this. It is not always the case that we have a theory that copes with all of the data until we come across problematic data and we then replace that theory with a new theory. Instead, we compare theories and stick with the theory that we think is the ‘best so far’ in that it copes best with the relevant data. Most theories do not deal with all of the existing data, and we are usually aware of problems with some data. When this happens, linguists do not reject the theory that cannot cope with the difficult data. Rather, they note the difficulty and continue with their research program until they either find a way to deal with it within the existing theory or until a new theory is formulated that deals with this data and is preferable to the existing theory when looked at overall.

In some cases, data is difficult to interpret. It may be, for example, that speakers are divided over whether a particular form is acceptable. This may correlate with nonlinguistic facts about the group of subjects, such as age (e.g., speakers over a certain age make one judgment while younger speakers make another), class, gender, or geographical location. If so, then the variation in the data can be explained based on these correlations, e.g., as a dialect difference or as a difference in the language of different age groups (which could be an example of a language change in progress). In some cases, though, there may be no obvious correlation with nonlinguistic features of the subjects, i.e., the numbers of speakers who make the different assumptions about the status of a particular form might be comparable in all groups, whether divided by age, class, gender, or geography. When this happens, linguists may ‘let the grammar decide’; in other words, they may decide that whatever their existing grammatical theory predicts about the form in question is correct.

In evaluating theories, Chomsky (1986, 2000) has proposed that we can consider three ‘levels of adequacy’ that our theories should aim to meet: observational, descriptive, and explanatory adequacy. A theory is observationally adequate if it describes correctly which forms are grammatical.
It is descriptively adequate if it also characterizes knowledge that speakers have about those forms. It is explanatorily adequate if it provides an explanation for why the intuitions of speakers are as they are. It is important to note that, in the Chomskyan view, a linguist’s grammar is a theory about the competence of one or more speakers. So to say that a particular expression is ‘grammatical’ is to make a theoretical claim about the unconscious system of knowledge that speakers have about their language. This is one reason why linguists need to be careful when asking speakers for their judgments on particular utterances. Asking whether a particular form is grammatical may lead to confusion or may generate responses that reflect nonlinguistic (or metalinguistic) assumptions about the relative social status of different forms. So it is often better to ask questions like “Does this sound like a likely utterance to you?” or “Could you imagine someone saying this?”

All sciences involve idealisations. A physicist studying the effects of gravity, for example, wants to observe what a feather and a ball would do if dropped in identical atmospheres and if they were not shaped in such a way that the feather is affected to a greater extent by air resistance than the ball. Similarly, Chomsky points out that the object of study for linguistics is an idealized linguistic system:

Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance. (Chomsky, 1965: 3)

This quote has sometimes been misunderstood. Chomsky is not claiming that such a speaker-listener exists. Instead he is saying that linguistics is not concerned with those nonlinguistic properties that lead to different individuals performing differently when making judgments about their language or using it (i.e., with ‘performance’). So linguists are studying an object that does not exist as a physical entity. No two speakers share exactly the same language, since all speakers have acquired their language in slightly different circumstances. However, the scientific study of language requires that we abstract away from the differences to find out about the underlying language faculty.

This raises a number of methodological issues. Chomsky (1980: 189–192) makes an analogy between investigation of human language and investigation of thermonuclear reactions taking place inside the sun. It is not possible to set up a laboratory inside the sun and look directly at what is there. Similarly, we cannot view linguistic competence directly. However, we can observe light and heat emanating from the sun and make inferences based on this evidence about what is happening inside the center of the sun. Similarly, we can look at the performance of speakers of a language, including their ‘metalinguistic performance’ (Birdsong, 1989; Schütze, 1996), when asked to make judgments and make inferences about the system of competence that is involved in determining that performance.

In the Popperian spirit, linguistics should make no assumptions in advance about which data will be relevant in studying language, nor about what methods should be used in studying it. What is important is that hypotheses are clearly stated, testable, and tested. In practice, though, there has also always been a tendency for particular linguists to continue to use the same kinds of methods and data, so that certain groups are referred to in those terms, e.g., as ‘corpus linguists’ or ‘intuitionists.’ Much of the work of Chomsky and his followers has been based on evidence from the intuitions of speakers, gathered by personal introspection (i.e., the researcher’s own intuitions), interview, or questionnaire. But there is no reason in advance to rule out the relevance of other data, whether from corpora, psycholinguistic experiments, or other sources.

There are constant debates about the reliability of particular kinds of data and methods for acquiring and interpreting it. Corpus data has been judged problematic because the precise data depends on the accident of who says or writes what at a particular time. Unusual but relevant data is unlikely to appear. The use of intuitions has been questioned because of the risk of subjectivity and the dependence on researchers gathering and interpreting them with enough care. And so on. But the vital thing is that the relevance of particular data is explicitly discussed and justified. As with all sciences, linguists will choose to pay particular attention to some data and to ignore other data based on the hypotheses they have adopted about what will be relevant. But the important question should always be whether the data can be shown to shed light on the particular hypothesis being investigated.

Not all linguists are theoretical linguists. Descriptive linguists aim only to describe languages, not to explain them. A descriptive grammar will aim to make clear what forms exist in a particular language. Any descriptive grammar will have to make idealisations since, as mentioned above, it is not clear that any two speakers will ever have internalized exactly the same system. Where a large percentage of a language group agree on a particular form, it is fairly easy for the descriptive linguist to write a grammar in agreement with the majority view.
In other cases, it may be much harder to decide which option to adopt. The Bloomfieldian American Structuralists who preceded Chomsky were mainly concerned with describing languages. They developed efficient methods for determining facts about languages and provided a wealth of data that theoretical linguists can use in testing their hypotheses. Descriptive linguistics continues to provide useful data with a range of practical as well as theoretical applications.

Not all linguists use falsificationist, or ‘hypothetico-deductive,’ methods. Conversation analysts, for example, avoid idealisations and use inductive methods to arrive at their conclusions. Conversation analysis has its foundations in ethnomethodology, a branch of sociology that grew out of dissatisfaction with the methods of sociology in the 1960s and 1970s. Research in conversation analysis is qualitative rather than quantitative and avoids early hypothesis formation. One rationale for this is that conversation is complex behavior, and explanations of it will presuppose an understanding of that complexity that cannot be justified until we find out more about the detail of what goes on in conversations. Harvey Sacks (1995), one of the originators of conversation analysis, explicitly states that he views his work as behaviorist and that this approach has much in common with the behaviorist methodology adopted by Bloomfield and the American Structuralists in the early 20th century.

One motivation for using inductivist methods arises when research is qualitative rather than quantitative. Lazaraton (2002: 33) suggests that the following features can be seen to distinguish qualitative and quantitative research:

Qualitative Research      Quantitative Research
naturalistic              controlled
observational             experimental
subjective                objective
descriptive               inferential
process-oriented          outcome-oriented
valid                     reliable
holistic                  particularistic
ungeneralizable           generalizable
single case analysis      aggregate analysis
(Lazaraton, 2002: 33)

Qualitative research is appropriate when the aim is to discover the attitudes and experiences of speakers and the motivations and processes behind events, rather than simply counting the number of times a particular event occurs. This is not to suggest that there is a necessary link between particular types of research (e.g., quantitative or qualitative) and particular models of science (e.g., inductivist or falsificationist); neither quantitative nor qualitative research is necessarily incompatible with any particular model of science.

Not everyone agrees that linguistics is an empirical science. Some reasons for this view stem from the use of intuitions as data. Itkonen (1974), for example, suggested that the reliance on intuitions means that the claims of linguists cannot be falsified. He suggests that the occurrence of an utterance such as *girl the came in does not falsify the claim that definite articles in English precede nouns in a noun phrase, since this utterance is ‘incorrect’ and the claim is about ‘correct’ utterances and sentences. Given this, he suggests that linguistics is different from natural sciences, in which, for example, the discovery of a piece of metal that does not expand when heated would be enough to falsify the claim that all metals expand when heated. Yngve (1996) suggests that the ‘ancient semiotic-grammatical foundations’ of linguistics are not compatible with modern science. As a result, he suggests putting these foundations aside and replacing them with “new foundations that are fully consonant with modern science as practiced in the more highly developed sciences of physics, chemistry and biology” (Yngve, 1996: 309). Yngve’s proposal can be seen as an attempt to reconceptualize linguistics as a ‘hard’ science. This raises the questions of whether all of the phenomena that have been studied by linguists can fit into this new model and whether the discipline retains its interest if they do not.

For most linguists, however, linguistics is by definition scientific. While much work in linguistics can be understood in some form of Popperian, or post-Popperian, terms, there is nevertheless a wide range of views on the exact nature of the scientific study involved.

See also: Chomsky, Noam (b. 1928); Data and Evidence; Idealization; Language as an Object of Study; Levels of Adequacy, Observational, Descriptive, Explanatory; Linguistics: Approaches; Structuralism.

Bibliography
Birdsong D (1989). Metalinguistic performance and interlinguistic competence. New York: Springer-Verlag.
Bloomfield L (1926). ‘A set of postulates for the science of language.’ Language 2, 153–164. [Rpt. in Bloomfield, 1970.]
Bloomfield L (1933). Language. New York: Holt.
Bloomfield L (1970). A Leonard Bloomfield anthology. Hockett C (ed.). Bloomington: Indiana University Press.
Chalmers A F (1999). What is this thing called science? (3rd edn.). Buckingham, UK: Open University Press [1st ed., 1978].
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge: MIT Press.
Chomsky N (1980). Rules and representations. Oxford: Basil Blackwell.
Chomsky N (1986). Knowledge of language. New York: Praeger.
Chomsky N (2000). New horizons in the study of language and mind. Cambridge: Cambridge University Press.
Feyerabend P (1975). Against method: outline of an anarchistic theory of knowledge. London: New Left Books.
Feyerabend P (1978). Science in a free society. London: New Left Books.
Itkonen E (1974). ‘Linguistics and metascience.’ Studia Philosophica Turkuensia II. Risteen Kirjapaino, Kokemäki. [Republished as Itkonen E (1978). Grammatical theory and metascience: a critical investigation into the methodological and philosophical foundations of ‘autonomous’ linguistics. Amsterdam: John Benjamins.]
Kuhn T (1970). The structure of scientific revolutions. Chicago: Chicago University Press.
Lakatos I (1970). ‘Falsification and the methodology of scientific research programmes.’ In Lakatos I & Musgrave A (eds.) Criticism and the growth of knowledge. Cambridge: Cambridge University Press. 91–196. [Reprinted in Lakatos I (1978). Worrall J & Currie G (eds.). The methodology of scientific research programmes. Cambridge: Cambridge University Press.]
Lazaraton A (2002). ‘Quantitative and qualitative approaches to discourse analysis.’ Annual Review of Applied Linguistics 22, 32–51.
Popper K R (1972). The logic of scientific discovery. London: Hutchinson.
Popper K R (1979). Objective knowledge. Oxford: Oxford University Press.
Rask R ([1830] 1932–1933). ‘En Forelæsning over Sprogets Filosofi.’ In Hjelmslev L (ed.) Ausgewählte Abhandlungen, vol. II. Copenhagen: Levin and Munksgaard. 375–378.
Robins R H (1990). A short history of linguistics (3rd edn.). London: Longman.
Sacks H (1995). Lectures on conversation. Schegloff E (ed.). Oxford: Basil Blackwell.
Saussure F de (1972). [Originally published 1916.] Cours de linguistique générale. Paris: Payot. [ed. Bally C & Sechehaye A, with the assistance of Riedlinger A, with introduction and notes by Mauro T.]
Saussure F de (1983). Course in general linguistics. Harris R (ed.). London: Duckworth.
Schütze C T (1996). The empirical base of linguistics: grammaticality judgments and linguistic methodology. Chicago: Chicago University Press.
Yngve V H (1996). From grammar to science: new foundations for general linguistics. Amsterdam: John Benjamins.

Linguistics as a University Subject: Early History, in America
J S Falk, La Jolla, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

In the 1860s, William Dwight Whitney (1827–1894), professor of Sanskrit and instructor of modern languages at Yale College, became increasingly concerned that students in American higher education were beginning work in comparative philology without an understanding of ‘the grand truths and principles which underlie and give significance to their work’ (Whitney, 1867: viii). Whitney then wrote the book that became the most important linguistics text in 19th-century America, Language and the study of language: twelve lectures on the principles of linguistic science (1867). Many of his ideas on the nature of human language remain central to modern linguistics – the primacy of speech over writing, the arbitrary connection between sound and meaning, the universality and ease of language acquisition by children. But the central topics of 19th-century German linguistics – genetic and typological classification and the causes of language change – formed the core of his text and of the graduate program in philology that he established at Yale in 1869, one of the earliest graduate programs in any field in the United States.

At the turn of the 19th to the 20th century, Franz Boas (1858–1942) held one of the few professorships in anthropology in all of North America, at Columbia University in New York City. Boas considered linguistic study of Native American languages to be one of the core curricular areas for graduate studies in anthropology. He used his own field notes from languages on the Northwest Coast to demonstrate linguistic analysis in the classroom, and following experience there, students specializing in linguistics conducted work in the field. The Handbook of American Indian Languages (Boas, 1911) provided both the model of linguistic analysis to be followed in dissertations and their chief venue of publication.

Largely synchronic and focused on Native American languages, the path of Boasian linguistic education was different from that of the historical studies fostered by Whitney within the German historical paradigm. Under Whitney, students sought a foundation in classical languages, in older forms of modern languages, and, for research purposes, in the European languages of scholarship, along with such general knowledge of language as was provided in Whitney’s textbook. They were then equipped for historical linguistic work. Under Boas, there were courses in ethnography, in physical anthropology, in statistics, and courses demonstrating the analysis of American Indian languages, which prepared students for independent field work and linguistic analysis.

No matter which of these traditions was emphasized in the early decades of the 20th century, there were no autonomous university departments of linguistics. Rather, the few linguistics programs that came into existence were assembled from faculty already appointed in other departments. For example, when Yale University announced a new department of linguistics in 1931, the only person to hold the title of professor of linguistics was Edgar H. Sturtevant
(1875–1952). He and all other faculty were jointly appointed with other departments, including anthropology, classical languages and literatures, English, Germanics, and Semitic and Biblical languages and literatures. Yale initially offered only the Ph.D. in linguistics; a Master’s degree became available in 1943–1944. Instruction remained entirely at the graduate level, a pattern for most linguistics programs until the later decades of the 20th century. Yale did not teach an undergraduate linguistics course until 1947–1948 and did not grant its first undergraduate degree in linguistics until late in the 1960s (Wells, c. 1975: 18 n. 6). Even before the establishment of a linguistics department at Yale, however, linguistics in some form was taught in nearly every major American university. One of the first projects undertaken by the Linguistic Society of America (LSA), following its founding in 1924, was the ‘Survey of Linguistic Studies: Opportunities for Advanced Work in the United States’ (Kent and Sturtevant, 1926). All but one of the 26 universities surveyed (those belonging to the Association of American Universities) listed ‘courses bearing upon linguistic science’ in catalogues and bulletins for the 1926–1927 academic year. Seven institutions had particularly strong offerings, each with 15 or more affiliated faculty members teaching upward of 30 courses: University of California (Berkeley), University of Chicago, Columbia University, Harvard University, University of Michigan, University of Pennsylvania, and Yale University. Twenty-five schools had language (nonliterary) courses in modern and older languages, but the leading programs also offered coursework in general linguistics, general phonetics, comparative Indo-European, and the analysis of American Indian languages. 
In 1928, under the direction of Sturtevant, the LSA began holding an annual summer Linguistic Institute, meant to serve both graduate students in linguistics and senior scholars wishing a collegial environment in which to discuss and share research. At the first Institute, 45 people registered for courses, including four members of the faculty. Courses offered at the early Institutes represented the historical interests of the majority of LSA members, with most classes devoted to older languages, their historical development and comparison. In 1928, for example, 31 of the 37 courses were historical. But Sturtevant insisted on also scheduling courses that were not historically oriented – experimental phonetics, methods of analyzing unwritten languages, description of contemporary American English, and fieldwork on the dialects of the United States and Canada. As the Institutes became better known and increasingly prestigious, the offering of such courses spread into university programs. Thirty-five years later, at the 1963 Institute, 37 courses were distributed over four main areas, three synchronic, one diachronic; 220 people registered (1964, LSA Bulletin 37: 20).

At every Institute, the best-attended course was ‘An Introduction to Linguistic Science.’ In 1937 and 1938, the course was especially popular with faculty members as well as students when it was taught by Edward Sapir (1884–1939) and Leonard Bloomfield (1887–1949), respectively.

Bloomfield did more than any other American linguist to shape the teaching of linguistics at American universities in the second quarter of the 20th century. His textbook Language (1933) came to dominate the field, and the first two-thirds of it, focused on synchronic analysis, constituted the foundation for American structural linguistics, the paradigm that increasingly displaced more traditional diachronic work as the main orientation of American linguistics. The genius of the text was that it presented historical topics fully and clearly, while at the same time Bloomfield gave priority to a systematic exposition of synchronic descriptive analysis. Language was the preeminent textbook in American linguistics well into the 1950s.

In 1941, a “considerable body of linguists assumed the role of active language teachers and attempted wholeheartedly to apply the findings of their science to the practical problems of language teaching” (Moulton, 1961: 82). Many linguists, including Whitney and Bloomfield, had taught languages in college and university and even written grammars and language textbooks for the well-known European languages, but the outbreak of World War II in 1939 directed attention to the need for materials and speakers of languages until then unfamiliar in the American education system.
The American Council of Learned Societies in 1941 established the Intensive Language Program (ILP) which employed linguists, mostly recent doctoral recipients with experience in the analysis of American Indian languages, to develop teaching materials based on linguistic analysis. Using these materials, often while they were still in development, linguists established intensive college courses in such languages as Swahili, Hausa, Fanti, Arabic, Chinese, Japanese, and Thai. By the summer of 1942, the ILP was prepared to offer 56 courses in 26 languages at 18 universities. Approximately 700 students enrolled, some of them linguistics majors seeking knowledge of languages distinct from their native English. Over the next two decades, it became common to find linguistics programs requiring their students to gain exposure to such languages. During World War II, the United States’ armed forces adopted the ILP, and linguists were recruited to prepare language training materials for the military. One result from this concentrated period of pedagogical concern was a close association between linguistics and the less commonly taught languages, not only in research and publication, but also within the administrative structure, and thus the programs, at some universities. A second, related result was the incorporation into linguistics programs of an area commonly referred to as applied linguistics, which concerned the application of linguistic principles to language teaching. Linguistics students now had yet another area of specialization available to them.

Bloomfield’s textbook was supplemented in the 1940s by workbooks prepared for phonemics by Kenneth Pike (1912–2000) and for morphology by Eugene Nida (b. 1914). Both contained problems drawn from a wide range of languages and were initially used in training sessions for Christian missionary linguists at the Summer Institute of Linguistics (SIL). From its informal beginning in 1934 with just two students, SIL went on to train hundreds of students each summer in techniques for linguistic field work on undocumented languages (see Robbins, 1977). The texts by Pike and Nida also were adopted at secular colleges and universities for classes on the analysis of linguistic data. During this period of American descriptivism, from the 1940s until the early 1960s, problem-solving was a key component of the linguistics curriculum, where it prepared students for field research.

A good example of descriptivist training was at the University of California, Berkeley, which formalized its linguistics program in the early 1950s, and in 1952 established the Survey of California Indian Languages, directed by Mary R. Haas (1910–1996). The Survey was “funded to allow students to do fieldwork, to write dissertations, and, aided by the University’s publication policies, to publish accounts of these Indian languages” (Emeneau, 1997: 618).
The 1950s brought to maturity the American descriptivist approach to linguistics, sometimes referred to as American structuralism. Two widely adopted textbooks appeared, one by Henry A. Gleason, Jr. (b. 1917), the other by Charles F. Hockett (1916–2000). Gleason’s An introduction to descriptive linguistics (1955) dealt solely with synchronic topics and was accompanied by a workbook of problems. Hockett in A course in modern linguistics (1958) followed the pattern of Bloomfield’s Language, with the first two-thirds of the book devoted to synchronic analysis, the last third to language change, historical reconstruction, and the comparative method. The texts reflected the orientation of mid-century American linguistics with core chapters on articulatory phonetics, American structural phonemics, and morphology. Neither work contained a chapter on semantics, and discussion of syntax was limited. Just a few years later syntax became a major topic in American linguistics with the growing prominence of generative grammar.

Veterans’ education benefits led to dramatic increases in undergraduate and then graduate matriculation in the decade following World War II. Linguistics gained students and visibility within the academic world, and a number of institutions formalized their linguistics programs into departments or created new programs. When the United States Congress passed the National Defense Education Act of 1958 (NDEA), the field benefited under provisions that strengthened both science and foreign languages in higher education. Especially important were graduate fellowships under Title IV of the act, which not only provided three years of support to students but also three years of supplementary stipends to the program in which they enrolled. In 1963, five years after the passage of the NDEA, there were in the United States at least 17 departments of linguistics, 14 interdepartmental programs, and 43 other departments offering linguistic coursework; 882 students were enrolled for advanced degrees in linguistics, and 20 institutions offered the baccalaureate degree in linguistics (Newmeyer, 1986: 45–46).

Coincidentally, the impetus to graduate study in linguistics provided by the NDEA came just as generative grammar was rising in challenge to the American descriptivist tradition, and the emergence of an exciting new theoretical approach invigorated linguistics as a university subject in America.

See also: Anthropological Linguistics: Overview; Applied Linguistics in North America; Bloomfield, Leonard (1887–1949); Boas, Franz (1858–1942); Historical and Comparative Linguistics in the 19th Century; Hockett, Charles Francis (1916–2000); Linguistic Anthropology; Missionary Linguistics; Nida, Eugene Albert (b. 1914); Pike, Kenneth Lee (1912–2000); Sturtevant, Edgar Howard (1875–1952); Whitney, William Dwight (1827–1894).

Bibliography
Bloomfield L (1933). Language. New York: Holt, Rinehart and Winston.
Boas F (ed.) (1911). Handbook of American Indian languages. Washington, D.C.: Bureau of American Ethnology. [Three additional volumes published 1922, 1933–1937, and 1941.]
Darnell R (1998). And along came Boas: continuity and revolution in Americanist anthropology. Amsterdam/Philadelphia: John Benjamins.
Emeneau M B (1997). ‘Mary R. Haas and “Berkeley Linguistics.”’ Anthropological Linguistics 39, 617–619.
Falk J S (1998). ‘The American shift from historical to non-historical linguistics: E. H. Sturtevant and the first Linguistic Institutes.’ Language and Communication 18, 171–180.
Gleason H A Jr (1955). An introduction to descriptive linguistics. New York: Holt, Rinehart and Winston.
Hockett C F (1958). A course in modern linguistics. New York: Macmillan.
Kent R G & Sturtevant E H (1926). ‘Survey of linguistic studies: opportunities for advanced work in the United States.’ LSA Bulletin 1, 3–14.
Moulton W G (1961). ‘Linguistics and language teaching in the United States 1940–1960.’ In Mohrmann C, Sommerfelt A & Whatmough J (eds.) Trends in European and American linguistics 1930–1960. Utrecht/Antwerp: Spectrum. 82–109.
Newmeyer F J (1986). Linguistic theory in America (2nd edn.). Orlando: Academic Press.
Nida E A (1949). Morphology: the descriptive analysis of words (2nd edn.). Ann Arbor: The University of Michigan Press.
Pike K L (1947). Phonemics: a technique for reducing languages to writing. Ann Arbor: The University of Michigan Press.
Robbins F E (1977). ‘Training in linguistics.’ In Brend R M & Pike K L (eds.) The Summer Institute of Linguistics: its works and contributions. The Hague: Mouton. 57–68.
Sturtevant E H (1940). ‘History of the Linguistic Institute.’ LSA Bulletin 13, 83–89.
Voegelin C F & Harris Z S (1952). ‘Training in anthropological linguistics.’ American Anthropologist 54, 322–327.
Wells R S (c. 1975). ‘Linguistics at Yale, 1931–1946.’ Unpublished manuscript. Yale University Library, Yale Archives, Faculty of Arts & Sciences, Linguistics. Group No. YRG 14–V.
Whitney W D (1867). Language and the study of language: twelve lectures on the principles of linguistic science. New York: Charles Scribner & Company.

Linguistics as a University Subject: Early History, in Europe
T Lindström Tiedemann, University of Sheffield, Sheffield, UK
© 2006 Elsevier Ltd. All rights reserved.

Pinning down the history of the subject of linguistics is complicated by the fact that it is hard to know where to draw the line between linguistics, its subdisciplines, and other disciplines. Is the study of specific languages for their own sake and/or in order to understand their literature to count as linguistics? Is there a difference between the teaching and research of classical languages and modern languages? Notably, Morpurgo-Davies has claimed that a German professor of Sanskrit in the 19th century usually had much in common with professors of comparative linguistics (Morpurgo-Davies, 1998: 8). Furthermore, some departments of phonetics later became departments of linguistics (e.g., at Uppsala University, Sweden), so how should we treat them in a history of linguistics as a university subject?

Scholars have been interested in linguistic topics for thousands of years. Still, there has been talk of an institutionalization of linguistics during the 19th century (e.g., Morpurgo-Davies, 1998: 3; Hovdhaugen et al., 2000: 507). However, in many countries it took much longer for the first chairs in general linguistics to appear. In Scandinavia, for instance, general linguistics has been seen as “closely tied to structuralism” and in some countries even more so to generative grammar (Hovdhaugen et al., 2000: 507).

Nevertheless, even before the first chairs in general linguistics appeared, there were chairs established in Comparative Linguistics/Philology, Phonetics, and chairs partly devoted to Linguistics. Moreover, there were already chairs in Latin, Greek, and Oriental languages, usually meaning Hebrew primarily, at most universities during the 17th century. In the 19th century universities started to establish chairs in comparative linguistics, in modern languages, and finally in phonetics and general linguistics. This must be seen in relation to the more general changes at the universities during the 19th century: the introduction of research alongside teaching, and the specialization that replaced academic careers advancing from one professorship to the next, all the way up to theology (see e.g., Morpurgo-Davies, 1998: 10).

Germany, and in particular the Friedrich-Wilhelm-University in Berlin (today called the Humboldt University), has been recognized as the center for the important changes carried out at universities during the early 19th century. Wilhelm von Humboldt’s (1767–1835) idea was that universities had to combine research and teaching. Professors should spend equal amounts of time on both and “teaching should be linked to research” (Hovdhaugen et al., 2000: 134). In addition, one should be required to specialize in one field rather than having the possibility of advancing through the various subjects (Hovdhaugen et al., 2000: 134–135). The first university position in linguistics was created for Franz Bopp (1791–1867), who became
extraordinary professor of Orientalische Literatur und allgemeine Sprachkunde in 1821 and was made ordinary professor in 1825 (Morpurgo-Davies, 1998: 7; Killy & Vierhaus, 1998). Notably, linguistics was only part of his duties. Within 100 years all German universities had one or more positions in linguistics, although this usually meant comparative linguistics (Morpurgo-Davies, 1998: 7, with reference to Wackernagel, 1904: 206).

Before the 19th century, linguistics and language study was a gentlemanly pastime. However, with the expansion of higher education institutions things changed, and by around the end of the 19th century it had become possible to make a living as an academic specializing in a linguistic subject (Morpurgo-Davies, 1998: 7).

Nothing was left to the dilettantes and amateurs; science and learning became a profession – a development which in retrospect has perhaps turned out to be not altogether a good thing. (Aarsleff, 1983: 180–181)

In contrast to the previous centuries, the increased power and prestige of the universities led to a concentration of almost all aspects of linguistic research of any significance in the universities. [. . .] The role played by amateur linguists and other institutions such as academies became more marginal, [. . .] (Hovdhaugen et al., 2000: 135)

Still, it is clear that many of the famous linguists from the 19th century did not hold posts in the subject: Johan Nicolai Madvig (1804–1886) was Danish Secretary of Culture (cf. Hovdhaugen et al., 2000), Rasmus Rask (1787–1832) served as professor of literature, as a librarian, and later finally became professor of Oriental languages (Hovdhaugen et al., 2000: 159), Jacob (1785–1863) and Wilhelm (1786–1859) Grimm worked as ministerial secretaries, librarians, and as professors and librarians at Go¨ ttingen after which they taught in Berlin (cf Seuren, 1998: 82), and Humboldt was a minister of the Prussian government. Between 1500 and 1800, linguistic work such as grammars, dictionaries, and historical studies was primarily done by teachers, clergymen, administrators, and not by university professors and lecturers (cf. Hovdhaugen et al., 2000: 26). During the 17th and 18th centuries, it became common in Scandinavia to hire (although often without pay) people from abroad as language masters, at first primarily to teach French (Hovdhaugen et al., 2000: 27). It was only in the 19th century that universities began to appoint lecturers and professors in modern languages. It is important to remember that the changes in linguistics during the 19th century were not only due to the interest in the subject itself, but also to the

size, importance, and number of the higher education institutions. For instance, Sanderson (1975: 1) has said that British higher education “was never more radically reshaped.” At the beginning of the 19th century, England and Wales had only two institutions of higher education, the universities of Oxford and Cambridge (Morpurgo-Davies, 1998: 4), and still, before the Second World War, the only “established posts in linguistics were specialist chairs of Comparative Philology at Oxford, Cambridge and a few other universities and of Phonetics at University College London” (Brown and Law, 2002: vii). That is, if we count only comparative philology, linguistics, and phonetics, and exclude posts in specific languages. In addition, it is important to note that at the beginning of the 19th century, the establishment of a chair in a linguistic subject still did not guarantee the chance to do research in that subject, since the chairs were still primarily teaching chairs:

[Friedrich August Rosen] was in 1828 invited to fill the chair of Oriental languages in the London University [. . .] where he taught elementary lessons in Hindustani, Arabic, Persian, and Sanskrit for sixteen hours a week at a guaranteed salary of one hundred pounds a year. (Aarsleff, 1983: 177)

It was not until towards the end of the war, in 1944, that the first British chair in General Linguistics was set up and J. R. Firth (1890–1960) was elected to it (School of Oriental and African Studies, Department of Phonetics and Linguistics) (Brown and Law, 2002: vii). Oxford and Cambridge were much later in establishing departments and chairs in Linguistics: Oxford got its first chair in Linguistics in 1976 (Roy Harris), and Cambridge’s first Professor of Linguistics was Peter Matthews, who took up the post in 1980 (Matthews, 2002: 207). The University of Edinburgh, however, was earlier in establishing its first chair in Linguistics. There was a department and chair of English Language and General Linguistics from 1948 to 1964, and when the department was split in two in 1964, a new chair was established for General Linguistics. Interestingly, the new department of General Linguistics did not teach undergraduates in its first years, but instead offered only a Postgraduate Diploma and had some research students (Asher, 2002: 35; Lyons, 2002: 181). After the expansion of the 1960s and 1970s, the British universities unfortunately started to experience grant cuts that affected many small linguistics departments, which had to close down or merge with other departments during the 1980s (Lyons, 2002: 189). In recent years, this trend has repeated itself, with the decision being taken in 2003 to close down the Durham Department of Linguistics and English

Linguistics as a University Subject: Early History, in Europe 239

language, followed in May 2004 by the proposed closure of the School of Language, Linguistics and Translation Studies at the University of East Anglia by 2007. In France all universities were closed as a result of the French Revolution, and it was not until the 1880s that French universities made a proper comeback (Morpurgo-Davies, 1998: 5). Still, France had a tradition of studies of oriental languages, including Sanskrit, which, as in Germany, was in some ways a forerunner of linguistics (Morpurgo-Davies, 1998: 9). Morpurgo-Davies believes (with reference to Auroux, 1983: 253, 1984: 304ff) that linguistics in France really started to establish itself during the late 1860s, when Michel Bréal (1832–1915) became professor of comparative grammar, the Société de Linguistique de Paris was founded, and the École Pratique des Hautes Études was created (Morpurgo-Davies, 1998: 9). The situation in Italy was, according to Morpurgo-Davies, very different from that in France, although, as elsewhere, the first chairs in Sanskrit came before the chairs in linguistics (1998: 9). The main difference

was that in Italy the new kingdom (1861) made it official policy to encourage “the establishment of new chairs of Sanskrit and of Comparative Linguistics,” and so by the 1870s Turin, Florence, Bologna, Pisa, Naples, Rome, and Milan all had chairs in these subjects, and Italy was thereby ahead of both France and England (Morpurgo-Davies, 1998: 9–10). It is important to note, of course, that a subject may be taught without there being a professor in it, and that the creation of a chair or even a lectureship is not the first sign of interest in the subject. Hovdhaugen et al. (2000: 165) mention that even though the Scandinavian universities only hired professors in comparative linguistics from the 1870s, the subject was already being taught in the 1840s. Similarly, there have been different ways of making general linguistics part of the education students received. In many parts of Europe, it was introduced through comparative linguistics and Sanskrit. As an example of how it could otherwise be introduced, modern general linguistics started to appear in Sweden during the 1940s when the Språkvetenskapliga Sällskapet i Uppsala (the Linguistic Society in Uppsala) arranged a series of

Table 1 Some of the first chairs in linguistics and general linguistics in Europe

Country | University | First professor | Subject | Year
Germany | University of Berlin | Franz Bopp (1791–1867) | Oriental Literature & General Linguistics (extraordinary) | 1821
Germany | University of Berlin | Franz Bopp (1791–1867) | Oriental Literature & General Linguistics (ordinary) | 1825
Germany | University of Halle | August Friedrich Pott (1802–1887) | General Linguistics (ordinary) | 1838
Germany | University of Berlin | Heymann Steinthal (1823–1899) | General Linguistics (extraordinary) | 1862
France | École d’Anthropologie | Abel Hovelacque (1843–1896) | Linguistic Anthropology | 1867
Norway | University of Oslo | Alf Sommerfelt (1892–1965) | General Linguistics | 1931
England | School of Oriental and African Studies | John Rupert Firth (1890–1960) | General Linguistics | 1944
Scotland | University of Edinburgh | Angus McIntosh (b. 1914) | English Language and General Linguistics | 1948
Scotland | University of Edinburgh | John Lyons (b. 1932) | General Linguistics | 1964
Norway | University of Trondheim | Olov Næs (1901–1984) | General Linguistics | 1964
Sweden | University of Umeå | Karl-Hampus Dahlstedt (1917–1996) | General Linguistics | 1965
Finland | University of Helsinki | Kalevi Wiik (b. 1932) | General Linguistics | 1966
Sweden | University of Stockholm | Karl-Hampus Dahlstedt (1917–1996) | General Linguistics | 1967
Sweden | University of Lund | Bertil Malmberg (1913–1994) | General Linguistics | 1969
Denmark | University of Copenhagen | Jørgen Rischel (b. 1934) | General Linguistics | 1976
England | University of Oxford | Roy Harris (b. 1931) | General Linguistics | 1976
England | University of Sussex | John Lyons (b. 1932) | Linguistics | 1976
Finland | University of Turku | Fred Karlsson (b. 1946) | General Linguistics | 1978
England | University of Cambridge | Peter Matthews (b. 1934) | General Linguistics | 1980
Norway | University of Bergen | Helge (Julius Jakhelln) Dyvik (b. 1947) | General Linguistics | 1983
Sweden | University of Gothenburg | Jens Allwood (b. 1947) | General Linguistics | 1986
Finland | University of Joensuu | Jussi Niemi (b. 1950) | General Linguistics | 1991
Sweden | Uppsala University | Åke Viberg (b. 1945) | Linguistics | 2001


introductory lectures on the subject (Hovdhaugen et al., 2000: 333). A few years later it was introduced more formally at Uppsala University through phonetics, in which courses started during the 1950s (Hovdhaugen et al., 2000: 333). In Gothenburg, the professors of Scandinavian languages, Natanael Beckman (1868–1946) and Hjalmar Lindroth (1878–1947), organized seminars in the language departments at which issues in general linguistics were discussed from at least the 1930s. In Lund linguistics was primarily introduced from Copenhagen, but there was also a French scholar, Pierre Naert (1916–1971), who taught Saussure to his students of Scandinavian languages between 1939 and 1962 (Hovdhaugen et al., 2000: 334). So, by various means, linguistics became a university subject, starting during the 19th century but becoming more firmly established as general linguistics during the 20th century (see Table 1). See also: Historical and Comparative Linguistics in the 19th Century; Language Teaching: History; Modern Linguistics: 1800 to the Present Day; Scandinavia: History of Linguistics.

Bibliography

Aarsleff H (1983). The study of language in England 1780–1860. London: Athlone Press.
Asher R E (2002). ‘R E Asher.’ In Brown K & Law V (eds.) Linguistics in Britain: personal histories. Oxford: Blackwell. 28–42.
Auroux S (1983). ‘La première société de linguistique – 1837?’ Historiographia Linguistica 10, 241–265.
Auroux S (1984). ‘Linguistique et anthropologie en France (1600–1900).’ In Rupp-Eisenreich B (ed.) Histoires de l’anthropologie: XVI–XIX siècles. Paris: Klincksieck. 291–318.
Brown K & Law V (eds.) (2002). Linguistics in Britain: personal histories. Oxford: Blackwell.
Hovdhaugen E, Karlsson F, Henriksson C et al. (eds.) (2000). The history of linguistics in the Nordic countries. Helsinki: Societas Scientiarum Fennica.
Killy W & Vierhaus R (eds.) (1998). Deutsche Biographische Enzyklopädie. Munich: K. G. Saur.
Lyons J (2002). ‘John Lyons.’ In Brown K & Law V (eds.) Linguistics in Britain: personal histories. Oxford: Blackwell. 170–199.
Matthews P (2002). ‘Peter Matthews.’ In Brown K & Law V (eds.) Linguistics in Britain: personal histories. Oxford: Blackwell. 200–212.
Morpurgo-Davies A (1998). ‘Nineteenth-century linguistics.’ In Lepschy G (ed.) History of linguistics, vol. IV. London: Longman.
Sanderson M (1975). The universities in the nineteenth century. London: Routledge & Kegan Paul.
Seuren P (1998). Western linguistics – an historical introduction. Oxford: Blackwell.
Wackernagel J (1904). ‘Vergleichende Sprachwissenschaft.’ In Lexis W (ed.) Das Unterrichtswesen im deutschen Reich, I, die Universitäten. Berlin: Asher. 202–207.

Linguistics: Approaches N Fabb, University of Strathclyde, Glasgow, UK © 2006 Elsevier Ltd. All rights reserved.

Introduction Linguistics is the study of language, and there are many different kinds of linguistics, some mutually compatible, some in competition. This diversity of approaches to linguistics is possible because language does not present itself to investigation as a coherent and well-defined field of enquiry that is clearly distinct from other areas of investigation. Instead, language is best imagined as a landscape through which it is possible to take various journeys, its horizons redefined by each approach. Most approaches to linguistics agree on a few basic facts about language. The first is the fact of linguistic form. While the status of linguistic form is in dispute (as I show below), it is clear that linguistic events such

as utterances or inscriptions must be understood as manifestations of linguistic types. Thus utterances must be understood as tokens of combinations of words, even though it may be impossible to isolate a ‘word’ in the actual stream of sound. The words belonging to classes (such as ‘Noun’) are made by selecting sounds from an inventory of sounds (phonemes of the language) and so on; in all cases, language must be understood as drawing on inventories of types and combining those types in regular ways: a word of the type ‘Article’ precedes a word of the type ‘Noun’ in English (within the Noun Phrase). These regularities, another idealization away from the crude data, involve rules or generalizations or constraints. No account of language can ignore the fact that language is ordered and regular, based on an inventory of types and rules of combination that together constitute linguistic form. The second fact accepted by all approaches to linguistics is that form relates to


meaning. A central function of language is to enable communication, and the organization of linguistic forms has some relation to the organization of meaning. This view – in some form – is shared by many approaches to linguistics: meaning is structured, and the structure of form has some relation to the structure of meaning.

The Status of Linguistic Form One of the issues that divide approaches to linguistics relates to the ‘autonomy’ of linguistic form, in a number of senses. One view, expressed from the margins of linguistics or from outside linguistic theory, is that linguists’ discoveries of linguistic form are actually determined by their experience of writing (e.g., the notion of the phoneme is a reimagining of the alphabetic letter), and hence that linguistic form is an artifact of a particular moment in the history of work on language. A second view is that form is entirely determined by function; in its crudest form this is the traditional grammar notion that a word is of the class Noun if it is ‘a naming word’; more sophisticated accounts might see the class of a word as fully determined by the function of the word relative to other words in the sentence. A third view sees form as autonomous; thus it is a fact about a word that it is a Noun, this having the same determinate relation to the word as its sound structure or core meaning. While it may have other characteristics as a consequence of being in this word class, including function relative to other words or distinctive phonological characteristics (e.g., stress patterns in English), these functions do not determine its status and are in effect implied by its word class rather than presupposed by it. Of these three views, the first is an antilinguistic approach to linguistics, the second a strongly functionalist approach, and the third a strongly formalist approach. Within the formalist approach, one of the issues that arise is which forms are primitives and which are compound or derived forms; for example, some approaches treat Noun as a primitive class, while others see it as derived from features, so that a Noun is a composite of two primitive features, +N and -V (while Verb would be -N +V, Adjective would be +N +V, etc.).
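The idea that a word class is a composite of primitive features can be made concrete with a small sketch. The following is only an illustration, assuming the commonly cited two-feature system in which Adjective is +N +V and Preposition is -N -V; the Preposition row is not given in the text, and the names `CATEGORY_FEATURES`, `natural_class`, and `show` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed two-feature system: each category is a pair (N, V),
# where True stands for "+" and False for "-".
CATEGORY_FEATURES: Dict[str, Tuple[bool, bool]] = {
    "Noun": (True, False),          # +N -V
    "Verb": (False, True),          # -N +V
    "Adjective": (True, True),      # +N +V
    "Preposition": (False, False),  # -N -V (assumption, not stated in the text)
}


def natural_class(n: Optional[bool] = None, v: Optional[bool] = None) -> List[str]:
    """Return the categories that match the given feature values.

    A value of None leaves that feature unspecified.
    """
    return sorted(
        category
        for category, (has_n, has_v) in CATEGORY_FEATURES.items()
        if (n is None or has_n == n) and (v is None or has_v == v)
    )


def show(category: str) -> str:
    """Render a category as a feature bundle such as '+N -V'."""
    has_n, has_v = CATEGORY_FEATURES[category]
    return f"{'+' if has_n else '-'}N {'+' if has_v else '-'}V"


if __name__ == "__main__":
    print(show("Noun"))
    print(natural_class(n=True))
```

On this view a statement about "all +N categories" picks out Noun and Adjective together, which a theory with Noun as an unanalyzed primitive cannot express directly.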
The most important early discussion of whether forms are primitives is Halle’s argument that there is no coherent notion of the phoneme as a primitive of sound structure; it is instead a composite entity built from the phonological features. Underlying the discussion of the autonomy of linguistic form is the question of whether any kinds of linguistic forms are specific to language or are also found in other domains. A key text is Chomsky’s Syntactic structures (1957). He begins his book

by discussing generic kinds of forms, as expressed by rules that rewrite a symbol as a string of symbols (phrase-structure rules); simple rules of this kind could, in principle, be found in domains (including cognitive domains) outside language. But he shows that this simple kind of rule is inadequate for understanding language and that a new kind of rule – a transformational rule – must also be used. A transformational rule takes a complex linguistic representation (a tree structure) and changes it into another tree structure, and is specific to language; transformational rules crucially do not operate on unstructured strings of symbols (e.g., no rule simply inverts the sequence of a string of symbols) but instead operate on more complex structural representations. Later developments in transformational grammar showed that transformations were subject to various constraints (such as the Complex Noun Phrase Constraint, which prevented a rule from simultaneously accessing something inside and outside a complex Noun Phrase); these constraints again were specific to linguistic form. While it has occasionally been argued that transformational rules are found outside language (e.g., in music), it is likely that this kind of rule, which transforms one complex representation into another, is specific to language. These questions are particularly interesting because the existence of some kinds of forms specific to language would support the view that language is processed in the mind by modes of cognitive organization specific to language. A basic idea in linguistics is that linguistic form exists to provide formal choices that have functional (semantic or communicative) consequences. For Saussure, a sign such as the word ‘tree’ is a pairing of choices from two distinct systems: (1) the sound system, where the choices produce the three-sound sequence that makes up the spoken word and (2) the meaning system, where the meaning of ‘tree’ is chosen from a range of possible meanings.
The pairing of sound and meaning is thus a matter of choosing from different inventories. Choice is a dominating principle in Saussurean linguistics and, later, in structuralism, including the nonlinguistic structuralism found in the work of Roland Barthes such as The fashion system. Systemic functional grammar is one of the major approaches to linguistics in which the notion of choice has been fundamental. Each ‘system’ in systemic functional grammar presents ranges of options to choose from when communicating, and the theory investigates the functions expressed by making these choices. This approach has enabled systemic functional grammar to pioneer work in certain areas of linguistic description, some of which (such as


Halliday’s early work on thematic roles) have been absorbed into other linguistic theories. In particular, systemicists are interested in the choices among ‘cohesive devices’ in discourse, the devices – conceived of as choices from a range of options presented by the system – explicitly guiding the hearer or reader in understanding how sentences are related. Systemicists have developed an account of ‘register,’ the set of linguistic features associated with specific genres of verbal behavior (formal interview, informal letter, etc.), where the term ‘register’ itself is another way of saying ‘range of options from which to choose.’ In addition, systemic functional grammar has a deep interest in stylistic analysis, in which formal choices have functional significance in expressing complexities and nuances of meaning and fit well with ‘close-reading’ approaches in mainstream literary criticism. Another fundamental idea, shared to some extent by all approaches, is that linguistic form is the mediation between a thought and an utterance; that is, that linguistic form enables the expression of meaning in sound (and writing). Chomsky’s ‘minimalist enquiries’ begin by re-examining this basic idea. From his earliest work onward, Chomsky’s approach to linguistics has always been to ask what the simplest kind of language might be, and then to ask why actual human languages are not this simple. Thus, as described above, Syntactic structures establishes that sentences of a natural language cannot be generated just by rules (phrase-structure rules) of a certain low level of complexity, but require a more complex kind of rule, one that in turn requires a whole theory of linguistic representations which can be subject to transformation. ‘Why aren’t sound and meaning fully aligned?’ is a recently formulated minimalist question; the answer is that organizations of the sentence as meaning are not isomorphic with large-scale organizations of the sentence as sound. 
In Chomsky’s always-provocative terminology, the question is why language is ‘imperfect.’ Consider, for example, the Noun Phrase that expresses the thing eaten in ‘John ate the cheese’; here this unit of meaning is after the verb, but it is before the verb in ‘the cheese was eaten’ and the word ‘what’ (which substitutes for it, i.e., ‘the cheese’) is at the beginning of the sentence in ‘what did John eat?’ Perhaps the unit of meaning does not always stay in the same place because some other principle, possibly involving the informational structuring of the utterance, forces it to move. Chomsky sees this as an ‘imperfection,’ and seeks to explain why moving the unit is a necessary compromise among the various demands placed on linguistic form by the requirement that speech expresses meaning.

Rationalist and Empiricist Approaches and the Status of Data A major division in approaches to linguistics pits rationalists against empiricists, as their attitudes towards data exemplify. The key figure for rationalist approaches is Noam Chomsky. Most kinds of linguistics before Chomsky were empiricist approaches (as in the work of Bloomfield), and Chomsky defined his linguistics in opposition to those. Data always present a problem for linguistics, because they must always be idealized; two slightly different utterances of the same word must be understood by linguistic theory as two tokens of the same type. Without this idealization, there would be nothing to say about language. Approaches differ in the kinds of idealizations they adopt. Chomsky famously and substantially idealized, so that the rules in Syntactic structures generate an (infinite) set of sentences corresponding to the set of sentences accepted as grammatical by an idealized user of the language, having separated out all contextual and behavioral factors. The rules describe what someone knows to be grammatical (his or her ‘competence’), not what that person actually says. Since this rationalist approach to linguistics aims to describe a person’s knowledge of language, not his or her linguistic behavior, there are some significant consequences for the status of data. First, data must be sifted into the relevant and the irrelevant (for this particular purpose); data irrelevant for theoretical syntax might of course be relevant in a different kind of linguistics. It is never clear in advance whether data will be relevant or irrelevant.
For example, in Syntactic structures Chomsky demonstrates that transformational rules can explain the various configurations of modal, auxiliary, and main verbs in an English sentence; on the other hand, it’s also possible, since a finite and small number of such possible configurations exist, that they could actually just be learned as fixed sequences with no rule involved. There is no guarantee in advance that either approach will be correct. The second major consequence of a rationalist approach to data is that the frequency of any particular kind of data is ignored, both in the sense that in Chomskyan linguistics ‘variability of rules’ is generally ignored (see discussion below) and in the sense that statistically very rare data may nevertheless have a key theoretical role. Thus, in the early 1980s the rare ‘parasitic gap constructions’ were first discussed; an example is the grammatical sentence ‘Which books did you put on the shelves without reading?’ where the phrase ‘which books’ matches not one but


two gaps in the sentence (after ‘put’ and after ‘reading’), one of which is called parasitic. Sentences like this are unlikely to turn up in a corpus, and there are no descriptions of them before this time; nevertheless, such sentences played a major role in helping us understand fundamental aspects of the workings of sentence structure. The third major consequence for data under a rationalist approach is that these data are invented in order to test the predictions of the theory (i.e., invented and then tested against speakers’ judgments), rather than gathered and used in constructing a theory. Rationalist approaches to linguistics must assume that the theory will guide the interpretation of the data. In contrast, empiricist approaches treat theories as constructs that emerge from data. These data may be gathered into a corpus. A fundamental concern is to develop a methodology for gathering, annotating, and understanding the data. Note that rationalist approaches to linguistics do not have this commitment to a methodology, and one of Chomsky’s aims in Syntactic structures was to wrest American linguistics from the heavily method-oriented approach associated with the structural linguistics of the 1930s and 1940s. One advantage of the empiricist approach is its inherent ability, more so than a rationalist approach, to guarantee a result in the sense of a description or annotation of linguistic data. An empiricist approach can be conceptualized as a plan of action, with steps to be followed using clearly specified methods in order to process a collection of data into an account. Because an empiricist approach is supported by agreed methodologies, it can be used to gather and organize large amounts of data, and it is sometimes favored in language work when large amounts of information must be gathered before that information (e.g., from ‘endangered languages’) becomes unavailable. 
This kind of empiricist approach also favors the social over the cognitive, because what is gathered is what people say or write, rather than what they know. It has been argued (for example, by Ken Hale) that a rationalist approach, despite its associated risks, is nevertheless best even for endangered languages, because we should be interested in these speakers’ knowledge of their language, not just in their utterances. While approaches to linguistics agree on the existence of linguistic form, they disagree on how inclusive such a theory of linguistic form should be. One area of disagreement is language statistics. As with any collection of data, linguistic data can be subjected to statistical analysis and this statistical analysis can be described in terms of linguistic rules.

It is a fact that some linguistic rules are ‘variable’ for an individual speaker or for a speech community (the term ‘variable rule’ comes from Labov); an example would be ‘t-glottaling’ in British English or the ‘r-drop’ in some varieties of American English. T-glottaling is the use of a phonetic glottal stop where there is a phonemic /t/ in a word like bottle and that can be understood as a consequence of applying a rule changing /t/ to the glottal. Data gathered for a particular individual will show that the rule is used at some times and not at others, and that these data could be described by attaching a percentage to that rule for that individual (for that data) to indicate how frequently the rule is actually applied in a particular context. The percentage might differ with the social context; for most speakers, t-glottaling is more likely in more informal contexts, and this information could be appended as a statistic to the rule. The issue dividing approaches to linguistics is whether a way of relating that percentage to the rule can be built into the theory and the representations it permits. The decision depends in part on whether the theory aspires to the holistic or the modular. Holistic theories attempt to incorporate as much explanation of language as possible within a single, connected theoretical model, e.g., Systemic Functional Grammar. Such theories tend to be functionalist and tend not to allow for significant autonomy of form; the holism here claims an explanatory link among various different aspects of language. Modular theories, on the other hand, are theories that divide language into different subfields – each with its own theoretical account – where there may be limited or no relation among the theories. Modular theories, of which generative grammars are a good example, place considerable emphasis on the autonomy of linguistic form. 
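The notion of a variable rule, a rule with a percentage attached that depends on the social context, can be sketched as a small simulation. This is a minimal sketch only: the application rates, the simplified transcription of bottle, and the names `T_GLOTTALING_RATE`, `apply_t_glottaling`, and `observed_rate` are all invented for illustration and are not taken from any study.

```python
import random
from typing import List

# Hypothetical application rates for one speaker, by social context.
# The numbers are invented for illustration only.
T_GLOTTALING_RATE = {"informal": 0.8, "formal": 0.2}


def apply_t_glottaling(phonemes: List[str], context: str, rng: random.Random) -> List[str]:
    """Rewrite each /t/ as a glottal stop with the probability attached to the context."""
    rate = T_GLOTTALING_RATE[context]
    return ["ʔ" if p == "t" and rng.random() < rate else p for p in phonemes]


def observed_rate(phonemes: List[str], context: str, trials: int, seed: int = 0) -> float:
    """Estimate how often the rule applies, as a corpus count of one speaker would."""
    rng = random.Random(seed)
    sites = 0
    applied = 0
    for _ in range(trials):
        output = apply_t_glottaling(phonemes, context, rng)
        for before, after in zip(phonemes, output):
            if before == "t":
                sites += 1
                applied += after == "ʔ"
    return applied / sites


if __name__ == "__main__":
    bottle = ["b", "ɒ", "t", "l"]  # simplified phonemic form, an assumption
    for context in ("informal", "formal"):
        print(context, round(observed_rate(bottle, context, trials=5000), 2))
```

Here the percentage lives inside the rule itself, which is the kind of representation a theory would have to permit in order to build statistics into the grammar; a theory that keeps the rule categorical would instead treat the observed rates as facts about use, outside the rule.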
While generative linguistics on the whole is not interested in statistical facts, Optimality Theory (which has emerged from generative linguistics) has shown a consistent interest in them, and various ways have been suggested of incorporating statistical facts into the ranking of constraints. The distinction between rationalist and empiricist approaches is sometimes entangled in a rhetoric of ‘realism’ or ‘naturalism.’ The terms ‘natural’ or ‘real/realistic’ may be used to valorize certain kinds of (usually empiricist) linguistic theory; hence there is talk of ‘natural phonology,’ ‘psychological realism,’ or ‘real language.’ The underlying assumption in each case is that how linguistics proceeds should be subject to some external constraint on theory-formation, but that is in conflict with the basic principle of rationalist approaches that no theory is guaranteed by some external constraint to be right or successful.

Most commonly, external constraints are drawn from computer science or from psychology, but sometimes a demand is heard that linguistic theory should be constrained by a particular audience’s need to understand it. In each case, these demands are driven by practical anxieties, usually fundamentally financial: Can the theory be used to develop working software? Can it serve psychology? Will the general public understand it sufficiently to want to support it? Will the linguistics department survive the next round of university cost-cutting?

The Production of Form

All approaches to linguistics agree there is linguistic form. As I have suggested above, these approaches differ on which forms are considered fundamental; for example, are phonemes a fundamental kind of form, or are they a construct based on phonetic features that themselves are the fundamental kinds of forms? They also differ in how linguistic form is generated. What kinds of rules build form? What are the constraints on the operation or output of those rules? Much of the debate in this area has been conducted within kinds of linguistics that refer to themselves as ‘generative’ and trace their intellectual ancestry back to Syntactic structures, a book that is a key discussion of the role of rules in language. In early generative linguistics, there were just rules. These were phrase-structure rules that built a representation called a tree structure, which terminated in a string of words that made up a grammatical sentence, and there were transformational rules that changed one tree structure into another. Within a few years, it was noted that transformational rules were subject to constraints, and these were explored fully by J. R. Ross, who in 1967 wrote one of the most influential Ph.D. dissertations in linguistics, ‘Constraints on Variables in Syntax’ (published as Infinite Syntax!, 1986). Ross (1967) showed that the possibility of changing one tree structure into another was prevented if the transformational rule needed to relate two positions, one of which was in an ‘island’ (a relationally defined subarea within a tree structure, such as a sentence inside a noun phrase). This meant that in addition to rules that generated form, there were also constraints on the rules. The balance between rules and constraints – both in terms of number of rules as opposed to constraints, and complexity of rules as opposed to complexity of constraints – has changed over the course of generative linguistics and is different in different approaches.
For example, by the early 1980s, transformational rules were as simple as the rule ‘move something’ (technically formulated as ‘move alpha’),
while the constraints on this simple rule were quite extensive and complex. In addition to constraints on rules, another kind of constraint was introduced: a constraint (or ‘filter’) on the output of the rules. The best known filter in generative syntax of the 1970s was the ‘that-trace’ filter; this stated that a transformational or other rule could not have as its output a sentence in which the subordinating conjunction that was followed by a ‘trace’ (an empty subject). Thus the sentence ‘Who did you say that left?’ is ungrammatical because that is followed by an empty subject. Given the possibility of having rules, constraints on rules, and constraints on outputs, some theories have attempted to dispense with one of these elements. In the syntactic theory of the early 1980s called ‘government-binding theory,’ the terms government and binding did not describe rules but were best understood as expressing constraints on relationships among parts of the tree structure; the role of rules was minimized. In Optimality Theory, both in syntax and in phonology, the role of form-generating rules is minimized, and there are no form-changing rules (such as transformational rules); instead, the burden of explanation is carried by constraints on output. Optimality Theory, which has been dominant in approaches to sound structure, participates in a key debate about order. In Optimality Theory, constraints are ordered (the term ‘ranked’ is used) in the sense that an output may violate a constraint and yet be preferred to an output that violates another constraint, because one constraint is ordered above the other. However, this order is fundamentally different from that of the generative phonology that dominated phonology before Optimality Theory, and Optimality Theory now challenges it. In this generative phonology, rules are themselves ordered, such that the input of one rule is determined by the output of the preceding rule. 
Rule ordering is also the basis of the phonological cycle, where a sequence of rules is applied to a word, then morphological rules are applied (e.g., it is suffixed), and then the same sequence of phonological rules is applied again from the beginning.
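The contrast drawn above between ordered rules and ranked output constraints can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than an analysis from the literature: the two rewrite rules, the input forms, the candidate set, and the simplified definitions of the three constraints (named after the NoCoda, Max, and Dep constraints commonly used in Optimality Theory).

```python
from typing import Callable, Sequence

Rule = Callable[[str], str]
Constraint = Callable[[str, str], int]  # (input, candidate) -> violation count

# Ordered rules: each rule applies to the output of the preceding rule.
def apocope(form: str) -> str:
    """Toy rule: delete a word-final 'a'."""
    return form[:-1] if form.endswith("a") else form

def final_devoicing(form: str) -> str:
    """Toy rule: a word-final 'd' becomes 't'."""
    return form[:-1] + "t" if form.endswith("d") else form

def derive(form: str, rules: Sequence[Rule]) -> str:
    for rule in rules:
        form = rule(form)
    return form

# Ranked constraints: no rules change the form; candidates are compared.
def no_coda(inp: str, cand: str) -> int:
    """Toy markedness constraint: penalize a word-final consonant."""
    return 0 if cand.endswith(tuple("aeiou")) else 1

def max_io(inp: str, cand: str) -> int:
    """Toy faithfulness constraint: penalize deleted segments."""
    return max(0, len(inp) - len(cand))

def dep_io(inp: str, cand: str) -> int:
    """Toy faithfulness constraint: penalize inserted segments."""
    return max(0, len(cand) - len(inp))

def optimal(inp: str, candidates: Sequence[str],
            ranking: Sequence[Constraint]) -> str:
    """Pick the candidate whose violation profile is best under strict
    ranking, i.e. smallest when profiles are compared left to right."""
    return min(candidates, key=lambda c: tuple(con(inp, c) for con in ranking))
```

With these definitions, `derive("bada", [apocope, final_devoicing])` gives `"bat"`, while the reverse order gives `"bad"`, because devoicing finds no final `d` until apocope has applied. In the constraint-based half, `optimal("pat", ["pat", "pa", "pata"], [no_coda, dep_io, max_io])` selects `"pa"`, and reranking to `[max_io, dep_io, no_coda]` selects `"pat"`: the winner violates a constraint but still beats its competitors because that constraint is ranked lowest.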

The ‘Landscape’ of Language and its Division into Fields of Linguistic Inquiry

As we have seen, approaches to linguistics can disagree on what aspects of language are open to theoretical description. Alternatively, these approaches may agree on what should be described but dispute the subfield most suited to describing it. Linguistic theory is conventionally divided into distinct but related subfields. For sound structure there is a distinction between
phonetics (itself potentially distinguished into acoustic and articulatory phonetics) and phonology; phonetics deals with the mediated manifestations of sound, while phonology deals with the knowledge of (representations of) sound. Here a ‘border dispute’ involves the distinctness of phonetics and phonology and whether the same kinds of description – e.g., the same kinds of articulatory feature – can be used explanatorily in both phonetics and phonology. The distinct status of morphology is another area for dispute. In some theoretical frameworks, morphology and phonology are closely intertwined (e.g., lexical phonology and morphology), and thus questions arise about the similarity between morphological and phonological processes. Some morphological processes, such as reduplication (where part of a word is copied to build a larger word), raise difficult problems for phonology. However, morphology is also syntactically relevant, and it is possible that the internal structure of words – the domain of traditional morphology – can in some cases be opened up to syntactic rules. Reorganizing subkinds of linguistics relative to one another can create very productive ways for approaches to linguistics to develop and change. Another border under dispute is that between the syntax and the pragmatics. This can involve matters of information structure, as well as problematic areas of the syntax such as apposition and conjunction, where there may be pragmatic explanations for apparently syntactic processes. Does the organization of discourse require its own subfield (‘discourse analysis’) or can it be entirely incorporated under the field of pragmatics? Could a theory of discourse analysis better explain coreference than the syntax? In his recent work, Chomsky argues that some phenomena previously understood as syntactic are better understood as phonetic – explained as part of the ‘phonetic form’ of the sentence rather than its syntactic form. 
Here, for example, an interesting issue is whether the linear order of syntactic elements (such as words) can be entirely explained by some nonsyntactic principle such as the phonetics, on the basis that syntax is the study of hierarchy and phonetics the study of sequence. Approaches to linguistics can thus differ according to the linguistic material that experts think they can explain under their own subfield. But it is also possible for approaches to different kinds of data to share similar ways of theorizing, a fact that has interesting implications for the organizations of different kinds of linguistic cognition corresponding with these theories. The most significant interplay among domains is seen in generative phonology and generative syntax.

At times these have been very close – and at other times very different – in their approaches, raising the question as to whether there are good reasons for thinking that phonology and syntax are fundamentally alike or fundamentally different. A historical connection, now broken, was in the idea of the ‘cycle,’ where a linguistic object (such as a word including several suffixes or a sentence including several subordinate clauses) was built in stages; at each stage, the full set of rules was run through in sequence, starting again from the beginning at the next stage. For a period, it seemed that both words and sentences could be understood as constructed ‘cyclically’ in this way. While the basic notions of this cycle still survive in some forms and in some types of linguistics, few would now argue for the cycle as a point of similarity between the two theories. Another rich collection of interconnections involves the notion of the ‘feature,’ an idea that also jumped from linguistics to Claude Lévi-Strauss’s anthropology in the 1940s. Form understood as a set of features – each valued as + or − (such that two features can build four derived forms) – has not only been central to generative phonology, but has also been borrowed as an idea for (1) syntactic categories (e.g., the category of noun is a derived form based on the features +N −V), (2) semantic categories (e.g., ‘agent’), and (3) lexical semantics (where meanings are considered composites of underlying ‘meaning features’). More recently, the principles of Optimality Theory, which have been widely used in phonology, have been borrowed to explain syntactic phenomena. Thus, one characteristic of approaches to linguistics is a willingness to borrow ideas from other domains. When the same idea works in different linguistic domains, this promises to tell us something fundamental about the organization of linguistic form.
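The remark that two binary features build four derived forms can be shown in a short sketch. The text gives only the noun as +N −V; the other three labels follow the assignment usually made in generative syntax and are an assumption here, as are the function names.

```python
from itertools import product

FEATURES = ("N", "V")

# Only the noun assignment is given in the text; the other three labels
# are the usual generative-syntax assignment, assumed for illustration.
CATEGORIES = {
    (True, False): "noun",
    (False, True): "verb",
    (True, True): "adjective",
    (False, False): "preposition",
}

def bundles():
    """All combinations of + and - values for the features in FEATURES."""
    return list(product((True, False), repeat=len(FEATURES)))

def show(bundle):
    """Render a feature bundle in the familiar notation, e.g. '+N -V'."""
    return " ".join(("+" if v else "-") + f for f, v in zip(FEATURES, bundle))

def table():
    return {show(b): CATEGORIES[b] for b in bundles()}
```

Calling `table()` yields four entries, one per feature bundle, with `"+N -V"` mapped to `"noun"`. Adding a third feature to `FEATURES` would double the number of bundles to eight, which is the sense in which a small feature inventory derives a larger set of forms.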

Bibliography

Baker M (1988). Incorporation: a theory of grammatical function changing. Chicago: University of Chicago Press.
Chomsky N (1957). Syntactic structures. Berlin: Mouton.
Chomsky N (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky N (1995). The minimalist program. Cambridge, MA: MIT Press.
Halle M (2002). From memory to speech and back: papers on phonetics and phonology 1954–2002. Berlin: Mouton.
Halliday M A K (1985). An introduction to functional grammar. London: Edward Arnold.
Jackendoff R (1977). X’-syntax: a study of phrase structure. Cambridge, MA: MIT Press.
Kager R (1999). Optimality theory. Cambridge: Cambridge University Press.
Kenstowicz M (ed.) (2001). Ken Hale: a life in language. Cambridge, MA: MIT Press.
Labov W (1972). Sociolinguistic patterns. Oxford: Blackwell.
Newmeyer F J (1980). Linguistic theory in America: the first quarter-century of transformational generative grammar. New York: Academic Press.
Ross J R (1986). Infinite syntax! Norwood, NJ: Ablex.

Literacy Practices in Sociocultural Perspective
J Collins, University at Albany, SUNY, Albany, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article discusses some of the things meant or intended when researchers refer to ‘literacy practices.’ One meaning is that the researchers will treat literacy as an event in which the (arti)facts of inscription cannot be separated from persons, settings, and other communicative modalities. In addition to this descriptive aim, the term also usually signifies a theoretical ambition. Across multiple disciplines of social inquiry, reflecting the legacy of thinkers such as Sahlins, Williams, Foucault, Giddens, and Bourdieu, the term ‘practice’ has come to signify a range of theories and frameworks that grapple with the interplay of structure and construction, history and agency (Ortner, 1984). Although a literacy event may be as personal and fleeting as jotting down a list or glancing at an advertisement, literacy understood as the making or interpreting of inscriptions has a history of many millennia, and is closely associated with long-term and large-scale enterprises, such as cities and states, armies and schools, extensive markets and world religions. Relating the personal to the large-scale, the fleeting to the enduring, subject to structure, is thus a challenge confronting the study of literacy practices. The term ‘New Literacy Studies’ (NLS) denotes a broad framework, drawing together various critical approaches within the field of literacy studies, consciously built upon the notion of literacy as a practice, rather than, say, a psychological skill or abstract social property. Within the NLS framework, literacy practices have been explicitly discussed as entailing a twofold research intention: (a) to build upon the notion of literacy as a communicative event, itself derived from the anthropological ethnography of communication paradigm, but also (b) to push analysis beyond events to the ideological framing and institutional contextualizing that gives particular events their broader significance (Street, 1993).
In this effort, NLS work has drawn particular inspiration from the ground-breaking research of Shirley Heath on
literacy events, and it has used the anthropological notion of ‘cultural model’ to explore relations between consciousness and institutions. It has also, however, been informed by European and English traditions of critical discourse analysis, which call for the study of language as a social practice, an approach to language/society informed by Marxian analyses of ideology and hegemony and Foucault’s arguments about discourse, knowledge, and power (Rogers, 2003). In what follows, I first discuss work in the latter ‘New Literacies’ (NLS) framework, then indicate amplifications and framings suggested by sociocultural perspectives drawn from anthropology and linguistic anthropology (LA), arguing that the differing approaches only partly overlap. The article concludes by addressing contemporary literacy problems in the US and the complementary strengths and weaknesses of the two approaches (NLS and LA) in thinking about those problems as well as in providing frameworks for more general understandings of the phenomena of literacy and practice. It goes without saying that an article of this scope will unavoidably be schematic, citing only directly relevant work, leaving much that is worthwhile unmentioned, and treating as settled much that deserves further argument.

Literacy Practices in the New Literacy Studies

As noted above, within the NLS approach, priority is given to studying literacy events, the situated doings that involve acts of reading, writing, or both. Highly influential early work was that of Heath (1983), who showed that in home, school, workplace, or church, what was at issue was not a binary contrast between literacy and orality, but rather the socially situated and culturally mediated events – whether reading a bedtime story or reading the mail, filling in a job application or composing a prayer for a church service. Participating in such events, people not only decoded or encoded text but were socialized into and enacted particular views of what reading or writing might be, who took what roles in such events,

with one’s viewpoint. He claimed that philosophy is the theory of science, the subject of which is the study of human cognition. Thus, he supplemented his researches with psychological evidence and data now referred to as semantic. Twardowski divided the cognitive psychical phenomena into representations and statements, and the former into images and conceptions. According to this distinction, representation is not an element of a statement, but a necessary condition of it (idiogenetic theory of statements). For example, the conception of a rectangle consists of an image of some plane geometric figure that the agent complements with another two characteristics: rectangular and equilateral. Apart from investigating purely scientific problems, Twardowski cultivated a rule of clear speaking according to his saying, ‘the one who speaks unclearly, thinks unclearly,’ in which he always called for proofs of every statement and avoided speculations. Later, his students formed the Warsaw-Lvov School of Philosophy, and, since they followed Twardowski’s ideas, it was a school of analytic philosophy. Twardowski died on February 12, 1938 in Lvov.

See also: Kotarbinski, Tadeusz (1912–1998); Kuryłowicz, Jerzy (1895–1978).

Bibliography

Albertazzi L (1993). ‘Brentano, Twardowski and Polish scientific philosophy.’ In Conoglione F, Roberto P & Wolenski J (eds.) Polish Scientific Philosophy: The Lvov–Warsaw School. Amsterdam: Rodopi. 11–40.
Bobryk J (1989). ‘Twardowski’s theory of actions and products, and the computer metaphor in psychology.’ In Ruch Filozoficzny. no. 4 (1989). 390–395.
Czezowski T (1948). ‘Kazimierz Twardowski as a teacher.’ In Studia Philosophica 1939–1946. vol. III. 13–17.
Twardowski K (1891). Idee und Perzeption. Eine erkenntnistheoretische Untersuchung aus Descartes. Vienna: W. Konegen.
Twardowski K (1894). Zur Lehre vom Inhalt und Gegenstand der Vorstellungen. Eine psychologische Untersuchung. Vienna: Hölder.
Twardowski K (1898). Wyobrażenia i pojęcia. Lwów: n.p.
Twardowski K (1901). Główne pojęcia dydaktyki i logiki. Lwów: n.p.
Twardowski K (1910). O filozofii średniowiecznej wykładów sześć. Lwów: Nakładem Altenberga.
Twardowski K (1912). Mowy i rozprawy. Lwów: Towarzystwo Nauczycieli Szkół Wyższych.
Twardowski K (1913). ‘O psychologii, jej przedmiocie, zadaniach, metodzie, stosunku do innych nauk i o jej rozwoju.’ In Encyklopedia wychowawcza, vol. 9. Warszawa: Skład Gł. Księgarni Gubrynowicza.
Twardowski K (1924). O istocie pojęć. Lwów: Wydawnictwo Polskiego Towarzystwa Filozoficznego.
Twardowski K (1927). Rozprawy i artykuły filozoficzne. Lwów: Nakładem Komitetu Uczniów.
Twardowski K (1965). Wybrane pisma filozoficzne. Warszawa: PWN.
Twardowski K (1994). Etyka. Jadczak R (ed.). Toruń: A. Marszałek.

20th-Century Linguistics: Overview of Trends
G Graffi, University of Verona, Verona, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Two theoretical trends can be considered as hallmarks of 20th-century linguistics: structural linguistics and generative grammar. They divide this epoch almost equally: structural linguistics (or, for short, structuralism) flourished between the 1910s and the 1950s, generative grammar from the 1950s onward. Structuralism was not a unitary theory, but rather a galaxy of schools sharing some principles; furthermore, some important differences distinguish European structuralist schools from the American one. Generative grammar, by contrast, originated as a unitary theory, which subsequently divided into different schools and which stimulated several alternatives from scholars not accepting it.

Both structural linguistics and generative grammar also had an impact outside linguistics: between the 1950s and 1970s (especially in France), the former became the model for all humanities, hence a ‘structural’ anthropology, a ‘structural’ sociology, etc., were developed. Generative grammar, in its turn, was seen as one of the initial steps of the so-called cognitive revolution. Neither of these extensions was free of problems: in many cases, concepts of structural linguistics were applied to other fields with some illegitimate modifications, and the debate on what ‘cognitive’ really means has not yet come to a solution. All this, however, does not lessen the outstanding role of both structuralism and generative grammar within 20th-century linguistics and within 20th-century thought in general. This article will therefore focus almost exclusively on these two theoretical trends: even sociolinguistics will be dealt with as an alternative to generative grammar rather
than in its applied aspects (language policy, etc.). Also other fields, such as language teaching, experimental phonetics, and so on, will not be presented in this overview: this does not mean that they have not reached important results during the 20th century.

20th-Century Linguistics vs. 19th-Century Linguistics: Continuities and Breakthroughs

It is a widely held opinion that the 19th century was ‘the century of comparative and historical linguistics’ and the 20th century that of ‘general’ or ‘theoretical’ linguistics. Such an opinion is certainly not ungrounded, but it needs some qualifications. Indeed, historical linguistics in the modern sense originated in the 19th century and experienced an astonishingly fast development: in the course of about 80 years, the whole structure of historical-comparative grammar of Indo-European languages reached its final form. Later discoveries (e.g., of languages like Hittite or Mycenaean Greek) added some new data, but the overall architecture built by the Neogrammarians nevertheless remained valid, and it is still today the frame of reference for any historical linguist working in the domain of comparative grammar of Indo-European languages. However, historical-comparative grammar was not the only subject investigated by 19th-century linguists; as a matter of fact, many of them dealt with topics that one would certainly label, today, ‘general linguistics.’ This phrase may refer to somewhat different research perspectives, such as (1) speculation on language in general, hence also on language change (and investigation of the principles of historical linguistics plainly enters into this kind of research), and (2) all kinds of linguistics that are not historical (‘synchronic,’ or ‘panchronic’ in Saussure’s terms). Both kinds of general linguistics were practiced during the 19th century. W. von Humboldt (1767–1835) was not an isolated exception: his speculations on the nature of language and his typological classification of languages were developed by several of his followers, such as Heymann Steinthal (1823–1899), Georg von der Gabelentz (1840–1893), and Franz Misteli (1841–1903).
But the first generations of comparative linguists also had the study of language in general as their first goal. For example, Franz Bopp (1791–1867) reconstructed Proto-Indo-European verbal forms according to a scheme of the verb phrase that was heavily influenced by Port-Royal views. Even August Schleicher’s (1821–1868) views about language and language change belong to ‘general linguistics’ in the former of the senses alluded to earlier. The debate of the 1880s, about the ‘sound laws’
(Lautgesetze), possibly marks the highest point of this kind of general linguistics: shortly after the conclusion of this debate, it gradually became less and less important. A ‘paradigm’ in the Kuhnian sense had developed: the majority of scholars considered only a given set of problems as ‘scientific,’ namely those of a historical kind. This paradigm is generally labeled the ‘Neogrammarian’ one, but it should not be forgotten that many of the Neogrammarians also dealt with topics of general linguistics in both senses quoted earlier. The work often labeled the ‘Neogrammarian Bible,’ namely Hermann Paul’s (1846–1921) Prinzipien der Sprachgeschichte (1st ed. 1880; 5th and last ed. 1920), deals with topics both of historical linguistics and of general linguistics. For example, Paul defined some oppositions that seem to foreshadow some Saussurean dichotomies (see next section) – that of ‘descriptive grammar’ vs. ‘historical grammar’ (which could be held to correspond to that between synchronic and diachronic linguistics), or that of ‘individual linguistic activity’ vs. ‘linguistic usage,’ which could be considered analogous to that of parole vs. langue. Similar distinctions were also introduced by Gabelentz in his Sprachwissenschaft (1st ed. 1891; 2nd ed. 1901), where three meanings of the term language (Sprache) are distinguished: (a) ‘discourse’ (Rede); (b) ‘a totality of expressive means for any thought’; and (c) ‘linguistic capacity’ (Sprachvermögen), i.e., “a faculty innate to all peoples of expressing thought by means of language.” Sense (a) could be made to correspond to Saussure’s parole, sense (b) to langue, and sense (c) to faculté de langage. Even if it is quite probable that Saussure knew both Paul’s and Gabelentz’s work, such correspondences are more seeming than real, as will be seen in the next section. The fact cannot be overlooked, however, that such matters typically belong to general linguistics.
As can be seen from their life dates, neither Gabelentz nor Paul was much older than Ferdinand de Saussure (1857–1913): why, then, are they normally presented in histories of 19th-century linguistics, whereas Saussure is considered the ‘father’ of 20th-century linguistics? This is mainly due to the fact that Saussure and his followers dealt with more or less the same subject matter, but from a different perspective. Summarizing so far, one could say that topics in general linguistics show a continuity between the 19th and 20th centuries, but the way of looking at them shows a definite breakthrough.

Ferdinand de Saussure

As is well known, Saussure’s Cours de linguistique générale (Saussure, 1922) was not directly written by him, but it was compiled by two former students,
Charles Bally (1865–1947) and Albert Sechehaye (1870–1946), on the basis of the notes from class lectures given by Saussure in the academic years 1906–7, 1908–9, and 1910–11 at the University of Geneva. It is perhaps less well known that neither Bally nor Sechehaye attended any of these lectures: they simply reworked and systematized the notes that others had passed to them. As a result, their reconstruction is often considered not quite faithful to the authentic Saussurean thought, especially after detailed studies of the handwritten notes by Godel (1957) and their edition by Engler (1967–74). Tullio De Mauro’s very detailed and insightful commentary on the Cours (published since 1972 together with Saussure’s original text) stresses many points of Saussure’s original thinking that were more or less modified by the editors. Today the exact knowledge of Saussure’s ideas cannot be gained without the support of De Mauro’s commentary and/or the attentive reading of Engler’s edition. Nevertheless, because only the Bally–Sechehaye edition was available until the 1960s, this text actually influenced the immediately subsequent linguists. Hence, reference will be made in what follows almost exclusively to the Bally–Sechehaye edition. Saussure’s linguistic views are standardly epitomized by his so-called dichotomies: (1) langue vs. parole; (2) synchrony vs. diachrony; (3) signifier vs. signified (signifiant vs. signifié); and (4) associative vs. syntagmatic relations.
(1) opposes the social aspect of language, the code shared by a speaking community (langue), to the individual speech act (parole); (2) opposes the state of a language at a given moment of its history to its change over time; (3) defines the ‘two sides of the linguistic sign,’ namely the ‘acoustic image’ and the ‘concept’; and (4) opposes the relation between elements in succession, such as teach + ing in teaching, to that between elements alternative to each other: e.g., teaching as alternative to learning, or studying, etc. All these dichotomies were surely attested in 19th-century linguistics: that of the social vs. the individual aspect of language was already sketched among others by Paul and Gabelentz (see previous section; it must be added that Saussure also hinted at the notion of faculté de langage, language faculty, which shows analogies with Gabelentz’s Sprachvermögen); the synchrony/diachrony opposition could have a foreshadowing in Paul’s distinction between ‘descriptive grammar’ and ‘historical grammar’; the idea that the linguistic sign is two sided, in the sense that the meaning is not external, but internal to it, could be traced back even to the Stoics; and the existence of two kinds of relations in language could also already be found, e.g., in some of Paul’s pages. Two features, however, strongly differentiate Saussure from his predecessors: (1) a systemic
approach, and (2) a strong tendency to define the basic concepts of linguistics without anchoring them to other disciplines, such as sociology, psychology, etc. Saussure’s key notion is that of language (langue) as a ‘system of signs,’ each of which has no intrinsic value, but whose value is determined solely by its relationships with the other members of the system (“dans la langue, il n’y a que des différences”; Saussure, 1922: 166; original emphasis). This sign system is the code shared by all members of a linguistic community, and its only root lies in this common sharing, because linguistic signs do not have any intrinsic value. Linguistics is a part of a more general science called semiology, namely “une science qui étudie la vie des signes au sein de la vie sociale” (Saussure, 1922: 33; original emphasis). Langue is therefore a social notion because of its character of semiological code: parole is the use by the individual of this general code. Because linguistic signs have no foundation in external reality, but are purely differential entities, they can change over time; this is the reason for the distinction between synchrony and diachrony. In Saussure’s view, however, system exists only in synchrony, at a given moment of time; diachrony does not concern systems, but only isolated elements. A linguistic change is therefore isolated and fortuitous; only when a given sign has changed is a new system formed, because the relations between signs are different from earlier. Finally, associative and syntagmatic relations are synchronic, because they are essentially systematic. Saussure’s view of language paved the way to what was later called structural linguistics.
Even if neither ‘structure’ nor ‘structural’ (but just système) occurs throughout Saussure’s text in a technical sense, the systemic approach to language and the definition of linguistic notions and categories on a purely linguistic basis (i.e., without reference to psychological categories, and so on) became the starting points of structural linguistics.

Saussurean Trends in Europe

Geneva School

The editors of Saussure’s Cours, namely Bally and Sechehaye, were only weakly influenced by the systemic and autonomous approach to language developed in that book. This may seem paradoxical, but it has to be kept in mind that both Bally and Sechehaye had completed their linguistic training before Saussure’s lectures in general linguistics. Among Saussurean notions, Bally especially deepened the langue/parole opposition. According to him, langue preexists parole from a static point of view. This relationship, however, is reversed from the
genetic point of view, because parole preceded langue in the genesis of language (see, e.g., Bally, 1965). Sechehaye was concerned with problems of general linguistics and its relationships with psychology and sociology from his first book onward (Sechehaye, 1908); in subsequent years he analyzed some fundamental problems of syntax, such as the notion of sentence. These analyses are insightful, but essentially extraneous to the structuralist trend. A stronger structuralist approach characterizes the work of Bally’s pupil, Henri Frei (1899–1980): on the one hand, he revisited Saussure’s ideas on syntagmatic and paradigmatic relationships; on the other, he engaged with other descriptions of syntactic structure, mainly those of Bloomfield and of the American structuralist school (see later discussion).

Prague School

The so-called Prague School was formed by a group of linguists belonging to the ‘Prague linguistic circle,’ founded in 1926 by the Czech anglicist Vilém Mathesius (1882–1945). Members of this circle were, among others, the Russian scholars Serge Karcevskij (1884–1955), Roman Jakobson (1896–1982), and Nikolaj S. Trubeckoj (1890–1938). The Prague School dealt with a variety of topics, from English syntax to literary criticism; however, it is especially known for its critical development of Saussurean notions expounded in the theses presented to the first International Congress of Linguists (‘Prague Theses,’ 1928) and for the contributions of some of its members to phonology (especially Trubeckoj and Jakobson). Among the most influential statements contained in the Prague Theses, the following two can be quoted: (1) language is a functional system, whose goal is communication; and (2) the synchrony/diachrony dichotomy is not so neat as Saussure presented it: on the one hand, no linguistic state can be considered as totally independent from evolution and change; on the other, phonetic change is not blind and unsystematic as Saussure assumed, but must be considered in the framework of the sound system that underpins it. The most important Prague work on phonology is surely Trubeckoj’s posthumous and unfinished Principles of phonology (Trubeckoj, 1939). Phonology was defined by Trubeckoj as “the science of sounds of langue,” whereas phonetics is “the science of sounds of parole.” The key notion of phonology is the phoneme. The phoneme (the phonological unit) is opposed to the sound (the phonetic one): the former is abstract, the latter is concrete and ‘realizes’ the phoneme. When two sounds occur in exactly the same positions and cannot be interchanged without a change in the meaning of the words, they are realizations of different phonemes (‘rule 2’ of Trubeckoj, 1939): e.g., English [t], [p], and [k] realize three different phonemes, /t/, /p/, and /k/, because they distinguish, among others, the three words tin, pin, and kin. Two sounds may be different and nevertheless belong to the same phoneme: for example, English [t], [p], and [k] in the preceding examples are produced with an extra puff of air (aspiration), but this puff of air does not occur, e.g., in spin. The aspirated and the unaspirated sound do not distinguish any meaning: they are variants of the same phoneme. Phoneme inventories differ from language to language: sounds that realize the same phoneme in one language may be realizations of different phonemes in other languages. For example, aspirated and unaspirated stops are variants of the same phoneme in English, but they realize distinct phonemes in Hindi.

From the late 1940s, Jakobson argued that phonemes are not the ‘smallest distinctive units,’ but are actually constituted by even smaller entities, the distinctive features. For example, /d/ differs from /n/ (cf. dine vs. nine) because of the feature ‘nasality’; and it differs from /t/ (cf. do vs. to) because of the feature ‘tenseness.’ During the 1950s, Jakobson, together with Morris Halle (b. 1923), further worked out his theory: any phoneme of any language is analyzed as containing or not containing a given feature from a universally fixed set of 12 (later 14) features, whose values are + or - (binary values, hence the label of binarism given to the theory). For example, English /t/ would have the following features: [-vocalic], [+consonantal], [-compact], [-grave], [-nasal], [+tense], [-continuous] (for the meaning of these terms, see Jakobson and Halle, 1956). Both consonants and vowels are defined on the basis of the same features, and all languages have only this inventory of features at their disposal (but some languages exploit only some of them).
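The binary analysis just described can be sketched as a small data structure. The following Python fragment is only an illustration under stated assumptions: the feature values for /t/ are those quoted above, whereas the entries for /d/ and /n/ are assumed to differ from /t/ solely in the contrasts named in the text (tenseness and nasality), and the names `PHONEMES`, `contrast`, and `notation` are hypothetical.

```python
# Toy binary feature matrices in the spirit of Jakobson and Halle's binarism.
# Only /t/ is taken directly from the text; /d/ and /n/ are ASSUMPTIONS built
# from the contrasts the text names (do vs. to, dine vs. nine).
T = {"vocalic": False, "consonantal": True, "compact": False, "grave": False,
     "nasal": False, "tense": True, "continuous": False}
D = {**T, "tense": False}   # assumption: /d/ is /t/ with [-tense]
N = {**D, "nasal": True}    # assumption: /n/ is /d/ with [+nasal]

PHONEMES = {"t": T, "d": D, "n": N}


def contrast(a, b):
    """Return the sorted names of the features on which phonemes a and b differ."""
    fa, fb = PHONEMES[a], PHONEMES[b]
    return sorted(f for f in fa if fa[f] != fb[f])


def notation(p):
    """Render a phoneme's features in the bracketed +/- notation."""
    return " ".join(f"[{'+' if v else '-'}{f}]" for f, v in PHONEMES[p].items())


print(notation("t"))
print(contrast("d", "n"), contrast("d", "t"))
```

Under these assumptions, `contrast("d", "n")` isolates nasality and `contrast("d", "t")` isolates tenseness, which is the sense in which the features, rather than the phonemes, are the smallest distinctive units.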
Jakobson’s binarism was also adopted (with some modifications) by generative phonology (see the Generative Phonology section).

Copenhagen School

The most well-known linguists of the ‘Copenhagen school’ are Viggo Brøndal (1887–1942) and Louis Hjelmslev (1899–1965). Both scholars vigorously maintained the structuralist point of view and their Saussurean heritage. Nevertheless, their approach to language in general and to syntax in particular shows many differences; whereas Brøndal considered language to be based on logic, Hjelmslev’s program was to give linguistics a logical basis, in the sense of the ‘logic of science,’ which was being developed in the 1930s by the neopositivistic philosophers.


In his most important book, Hjelmslev (1943) aimed at constructing a deductive theory of language that he dubbed glossematics. Such a theory should be based on purely linguistic notions (‘immanent linguistics’) and should follow rigorous methodological standards (it has to be “self-consistent, exhaustive and as simple as possible”). Hjelmslev assigned special importance to Saussure’s statement that “language (langue) is form, not substance.” He therefore distinguished form and substance on both the phonological and the syntactico-semantic level (expression plane and content plane, respectively, in his terms). On the expression plane, a continuous stretch of sound can be articulated differently according to different phonemic inventories: e.g., where English distinguishes three nasal phonemes (/m/, /n/, and /ŋ/), Italian distinguishes only two (/m/ and /n/). On the content plane, the same continuum can be subdivided differently: e.g., English divides the color spectrum from green to brown into four sections (green, blue, gray, brown), whereas Welsh divides it into three (gwyrdd, glas, llwyd). Both planes are analyzable, according to Hjelmslev, into smallest units, limited in number, that he called figuræ: expression figuræ are phonemes, content figuræ are semantic units from which larger semantic units can be constructed (e.g., man would be formed by the content figuræ ‘human,’ ‘male,’ ‘adult’). Content figuræ and expression figuræ are not in one-to-one correspondence: this is the reason why two planes are postulated (otherwise, such a postulation would be superfluous and the theory would violate the simplicity requirement). Any structure that has an expression plane and a content plane is called by Hjelmslev a semiotic, whereas structures with one plane only are ‘symbolic systems.’ Each plane can in its turn be constituted by a semiotic, and so on.

Structural Linguistics in France: Benveniste, Martinet

Émile Benveniste (1902–1976) combined his experience in the field of historical-comparative grammar of Indo-European languages with a particular skill in the analysis of linguistic facts. He was surely well acquainted with the patterns of investigation worked out by European and American structuralism, but he resorted to them only to a limited extent. Somewhat paradoxically, this allowed him to sketch some analytical proposals that are sometimes superior to those of the structuralists. Among such proposals (see Benveniste, 1966), the best known are his remarks on Saussure’s notion of the arbitrariness of the linguistic sign and those concerning the definition and classification of grammatical persons and of pronouns, which parallel the investigations

about performative utterances developed by Austin more or less during the same years (cf. Pragmatics section). André Martinet (1908–1999) was, in the 1930s, a foreign member of the Prague linguistic circle; he consistently developed the ‘functional view’ of language explicitly stressed by the Prague theses (see Prague School section). Natural languages, in Martinet’s view, have three features in common: (a) their communicative function, (b) their use of vocal utterances (i.e., natural language is essentially and primarily a vocal phenomenon, and only derivatively a written one), and (c) the double articulation, i.e., a first articulation into significant units (‘monemes,’ a term borrowed from Frei, but with a somewhat different sense, to replace ‘morpheme’), which are in their turn articulated into distinctive units (‘phonemes’). One of the most interesting instantiations of Martinet’s functionalism is his investigation of diachronic phonology (cf. Martinet, 1955): the ‘economy’ of sound changes is the effect of the balance of two opposed tendencies, ‘minimal effort’ (which tends to lessen sound differences) and ‘communicative efficiency’ (which tends to multiply them).

Other European Scholars (Guillaume, Tesnière, London School)

The scholars presented in this section, although certainly influenced by Saussurean thought and hence ascribable to the structuralist trend, nevertheless remained somewhat aside from the debate that developed about the basic tenets of structural linguistics, especially between the two World Wars. This was partly because the most significant works of some of them (e.g., Guillaume and Tesnière) were published only posthumously (see Guillaume, 1971–1990; Tesnière, 1966). The writings of the French linguist Gustave Guillaume (1883–1960) are often difficult to read and interpret, especially because of their obscure philosophical style, which shows many influences. That of Henri Bergson (1859–1941) is especially significant in Guillaume’s analysis of the concept of ‘time’ (which he does not clearly distinguish from that of ‘tense’). Among the several topics dealt with by Guillaume, one can quote his opposition between what is a formal expression in language (‘psychosemiotics’) and what is expressed by it (‘psychosystematics’); only what has a morphophonological representation of its own can be called semiotic. Guillaume also proposed to replace Saussure’s terminological pair langue/parole with langue/discours (speech). Speech necessarily presupposes language: their relationship


can be expressed in terms of the pair ‘power’ (language) vs. ‘effect’ (speech). Lucien Tesnière’s (1893–1954) work is especially important for its contribution to syntax. The seminal notion of Tesnière’s syntax was that of valency. Tesnière compared the verb to “a kind of hooked atom” that can exert its power of attraction on a smaller or larger number of ‘participant roles’ (actants). Besides participant roles, the sentence may also contain some ‘circumstantial roles’ (circonstants), which express the conditions of place, time, manner, etc., in which the process described by the verb takes place. Participant roles are obligatory; circumstantial roles are optional. The number of participant roles varies according to the verb class to which the verb belongs, so we have several verb classes according to their ‘valency sets.’ Whereas Guillaume’s linguistic thought did not exert any special influence on subsequent scholars, Tesnière’s lies at the origin of several of the most important developments in syntax during the second half of the 20th century (cf. Functionalist Schools section).

Among the linguists of the ‘London school,’ Daniel Jones (1881–1967) and John R. Firth (1890–1960) especially have to be cited. Jones was a phonetician deeply involved in practical questions (such as, for example, the assessment of the principles of phonetic transcription), but he also addressed theoretical questions, such as the definition of the phoneme. In contrast with the more abstract view held by Trubeckoj (cf. Prague School section), who defined the phoneme on an exclusively linguistic basis, Jones opted for a ‘physical’ definition of the phoneme as “family of sounds related in character no member of which occurs in the same phonetic context of any other member.” Jones also rejected Trubeckoj’s sharp opposition between phonetics and phonology. Firth’s view of language is characterized by the key role it assigned to the notion of context.
He defined ‘meaning’ as ‘function in context’: not only words and sentences, but even phonetic units have meaning. Firth’s contextual approach was especially fruitful in phonology. In his view, phonology cannot be limited to the segmentation and classification of sounds and phonemes (‘paradigmatic units’), but must take into account also prosodic, ‘syntagmatic’ units such as the syllable (hence the name of ‘prosodic phonology’ given to Firth’s theory). It is therefore necessary to study syllabic structure in terms of general sound classes, such as C(onsonant) and V(owel), and of their respective positions. Firth also maintained that grammar (i.e., syntax and morphology) and phonology are interdependent, anticipating in this way positions that will be later held in generative phonology (cf. Generative Phonology section).

Among Firth’s students, one may cite R. H. Robins (1921–2000), especially for his historical research on classical and medieval linguistics, and M. A. K. Halliday (b. 1925), whose ‘Systemic Functional Grammar’ (see Functionalist Schools section), worked out since the 1960s in successively revised versions, has had several important applications in many fields, such as artificial intelligence, discourse analysis, and language education.

American Linguistics from 1920s through 1960s

Sapir and His Heritage

American linguistics began to show peculiar features, different from those of European linguistics, from the beginning of 20th century. At that time, mainly because of the influence of the anthropologist Franz Boas (1859–1942), American linguists oriented a lot of their research to the study of Amerindian languages. Because such languages were devoid of written tradition, these scholars were automatically led to adopt a synchronic point of view. Furthermore, given the difficulty of applying the notions of Western grammar to such languages, attempts had to be made at describing them in purely formal ways: this led to the development of ‘distributional’ and ‘classificatory’ methods, i.e., based on the observation of pure occurrences of forms. Edward Sapir (1884–1939) was himself an anthropologist and a linguist at the same time and devoted much of his research to Amerindian languages. His theoretical ideas are expressed in his book Language (Sapir, 1921), and in several papers posthumously collected (Sapir, 1949). According to Sapir, language is an ‘overlaid function’ from a physiological point of view, because it is a psychological and symbolic phenomenon. Hence, a purely physiological view of speech sounds, as was typical of late 19th-century experimental phonetics, is untenable: e.g., English wh-sound as in when, where, etc., is physiologically identical with the sound produced blowing out a candle, but the two sounds are essentially different, because only the linguistic wh- is ‘placed’ in a system that is composed ‘‘of a definitely limited number of sounds.’’ In this way, Sapir arrived at a psychological conception of phonemes and variants. Sapir’s classification of grammatical concepts, on which his new approach to language typology is based, also deserves special mention. He classified them into two main groups: concepts that express ‘material content’ and ‘relational’ concepts. 
Each of the two groups is further subdivided into two: so one obtains ‘basic concepts’ (1a), ‘derivational


concepts’ (1b), ‘concrete relational concepts’ (2a), and ‘pure relational concepts’ (2b). Concepts (1a) and (1b) are mainly semantically (or ontologically) based; they are ‘objects,’ ‘actions,’ ‘qualities’ (1a) and their derivations (1b). Gender and number belong to group (2a), whereas grammatical relations (subject, object, attribute, etc.) belong to pure relational concepts (2b). The possible combinations of the four groups of concepts yield Sapir’s classification of languages into four ‘conceptual types.’ Type (A) languages only contain concepts (1a) and (2b); those of type (B), concepts (1a), (1b), and (2b); those of type (C), concepts (1a), (2a), and (2b); those of type (D), all four kinds of concepts. Among Sapir’s followers, the important contribution of Morris Swadesh (1909–1967) to phonemics (a term that corresponds to the European ‘phonology’) is to be noted. The best-known heritage of Sapir’s thought is, however, the so-called Sapir–Whorf hypothesis, named after him and his follower, the nonprofessional linguist Benjamin L. Whorf (1897–1941). According to this hypothesis, our vision of the world is heavily conditioned by our language. The hypothesis is today rejected especially by those linguists, such as generative grammarians, who maintain that cross-linguistic differences are actually more apparent than real. On the other hand, generative grammarians have often reevaluated Sapir’s work against Bloomfield’s, because of its psychologistic (or mentalistic) approach, which the latter strongly avoided.

Bloomfield

Leonard Bloomfield’s (1887–1949) behavioristic approach (see especially chap. 2 of Bloomfield, 1933) essentially consists of describing language as a chain of stimuli and responses (S-r-s-R): a speech event takes place when a nonlinguistic stimulus (S; e.g., hunger) produces a linguistic response (r) in the speaker (give me something to eat!) and such a response in its turn induces a linguistic stimulus (s) in the hearer, which has as a consequence the hearer’s nonlinguistic response (R; e.g., providing food). Bloomfield’s major contribution to linguistics certainly does not lie in this crudely mechanistic view of language function, but instead in the working out of some analytical tools, particularly in the domains of morphology and syntax. The most influential of such tools is the so-called Immediate Constituent (IC) Analysis (see Bloomfield, 1933: chap. 13). In the classical example Poor John ran away, the immediate constituents are Poor John and ran away. The analysis goes on by partitioning poor John into poor and John, and ran away into ran and away. Furthermore, away is also analyzed into a- and way: the principle of

immediate constituent analysis applies to morphology in exactly the same way as to syntax. IC analysis was subsequently refined and formalized, not only by ‘post-Bloomfieldian’ linguists, but also within generative grammar.

Post-Bloomfieldian Structuralism

Linguists most directly influenced by Bloomfield’s thought and analytical techniques especially developed the operational and distributional features of his conception of language. One of the most typical examples of this methodological trend was the so-called prohibition of mixing levels: phonological analysis must precede grammatical analysis and must not assume any part of the latter. Among post-Bloomfieldian scholars, the following can be mentioned, according to the different linguistic domains: concerning phonology, William F. Twaddell (1906–1982), Bernard Bloch (1907–1965), and George L. Trager (1906–1992); concerning morphology and syntax, besides Bloch and Trager, Eugene Nida (b. 1914); a very important contribution in the domain of IC-analysis is due to Rulon S. Wells (b. 1919); the leader of the group can possibly be considered to be Charles F. Hockett (1916–2000), who dealt with all these different fields and also worked on theoretical problems, especially in controversy with generative grammar. The most original and influential among American structuralists is, however, Zellig S. Harris (1909–1992): in the 1940s, he was engaged in refining and formalizing Bloomfield’s analytical techniques (see especially Harris, 1951); in the early 1950s, he worked out the notion of transformation. In Harris’s framework, a transformation is seen as an equivalence relation between two different sentence-forms: e.g., Casals plays the cello and The cello is played by Casals, or he met us and his meeting us are ‘transforms’ of each other. The notion of transformation (with important modifications) was to become a cornerstone of generative grammar, especially in its first phases (see later discussion).

Tagmemics and Stratificational Grammar

Tagmemics is the name given to the linguistic theory worked out by Kenneth L. Pike (1912–2000) and his associates and students. It combines both Bloomfield’s and Sapir’s insights, but it goes beyond the boundaries of American structuralism in many respects. The Bloomfieldian side of tagmemics lies in its analytical techniques, which take up, refine, and modify Bloomfield’s. On the other hand, Pike’s approach to language is decidedly and explicitly Sapirean:


language is seen as a cultural phenomenon, strictly tied to other cultural manifestations of human life. Stratificational grammar was developed by Sydney M. Lamb (b. 1929) beginning in the late 1950s. It combines a post-Bloomfieldian approach with some European perspectives, mainly Hjelmslev’s glossematics and Halliday’s Systemic Functional Grammar. The number of assumed strata varies from two to six, according to the different versions of the theory. Mostly, four strata are assumed: semotactics, lexotactics, morphotactics, and phonotactics. Stratificational grammar aims at giving an account of all kinds of linguistic processes, i.e., those concerning both competence and performance (see later discussion): it shows, therefore, a ‘cognitive’ approach that sharply differentiates it from classical post-Bloomfieldian theories and brings it closer to generative grammar, although it is very distant from the latter theory both in its assumed principles and in many technical aspects.

The Beginnings of Typological Linguistics

Language typology was an important field in 19th-century linguistics, but was rather overlooked during the first half of the 20th century, when Sapir’s work on the topic remained an isolated exception. Things began to change radically in the 1960s, especially under the stimulus of the work of Joseph H. Greenberg (1915–2001). In Greenberg’s perspective (see Greenberg, 1966), a close link is assumed between typology on the one hand and universals on the other. Language universals are no longer exclusively conceived as features that every language must possess: to such universals, named by Greenberg ‘unrestricted’ universals, implicational universals and statistical correlations also have to be added. The best-known instances of implicational universals concern the linear ordering of elements. Greenberg assumed as the bases of his language classification three possible choices: (1) whether a language has prepositions or postpositions (‘prepositional’ vs. ‘postpositional’ languages); (2) the position of the verb (V) with respect to the subject (S) and to the object (O): of the six theoretically possible orders, only three normally occur, namely VSO, SVO, and SOV; (3) the order of the adjective with respect to the noun it modifies: A (= AN) vs. N (= NA). Such choices are systematically correlated with each other in an implicational way: this implication can be exceptionless or only statistically significant. An instance of the first case is the statement that if a language shows VSO order, it is always prepositional (Greenberg’s Universal 3). On the other hand, Greenberg’s Universal 4 is an example of ‘statistical correlation’: if a language has a normal SOV order, it is postpositional ‘‘with overwhelmingly

more than chance frequency.’’ Greenberg’s insights caused a tremendous development of typological studies, as will be seen in the Typological Linguistics section.
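The difference between an exceptionless implication and a statistical correlation can be made concrete with a short Python sketch. It is an illustration only: the six-language sample below is hand-picked for the purpose and is not drawn from Greenberg’s own sample, and the names `SAMPLE` and `check_implication` are hypothetical.

```python
# Illustrative, hand-picked sample (NOT Greenberg's data):
# (language, basic word order, adposition type).
SAMPLE = [
    ("Welsh", "VSO", "prepositional"),
    ("Irish", "VSO", "prepositional"),
    ("English", "SVO", "prepositional"),
    ("Japanese", "SOV", "postpositional"),
    ("Turkish", "SOV", "postpositional"),
    ("Persian", "SOV", "prepositional"),
]


def check_implication(sample, order, adposition):
    """Test 'if a language has this order, it has this adposition type'.

    Returns a pair (supporting languages, exceptions), looking only at the
    languages of the sample that show the given basic order.
    """
    relevant = [row for row in sample if row[1] == order]
    supporting = [lang for lang, _, adp in relevant if adp == adposition]
    exceptions = [lang for lang, _, adp in relevant if adp != adposition]
    return supporting, exceptions


# Pattern of Universal 3: no exceptions expected.
print(check_implication(SAMPLE, "VSO", "prepositional"))
# Pattern of Universal 4: a tendency, so exceptions may appear.
print(check_implication(SAMPLE, "SOV", "postpositional"))
```

In this toy sample the VSO implication has no exceptions, whereas the SOV implication has one (Persian, an SOV language with prepositions), which is why the second statement can only be phrased as a statistical correlation.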

The Birth and Rise of Generative Grammar

The Origins of Generative Grammar

Generative Grammar (GG) is the label for the linguistic theory developed by the American scholar Noam Chomsky (b. 1928) and his followers; a GG, in Chomsky’s own words, is “a system of rules that in some explicit and well-defined way assigns structural descriptions to sentences” (Chomsky, 1965: 8). Chomsky was a student of Harris (cf. previous section), but he early adopted a ‘mentalistic’ approach to the problems of language and knowledge, highly polemical against the behavioristic one typical of Bloomfieldian and post-Bloomfieldian linguistics. The first systematic version of Chomsky’s theory appeared in print in a booklet called Syntactic Structures (Chomsky, 1957), which was partly an abstract of a much more voluminous work written in the years 1955–56 and published only 20 years later, with some modifications. The main novelties of this book with respect to the tradition of American structural linguistics were the following: (1) The goal of linguistic description is no longer seen as the analysis of a given corpus, but as accounting for the intuitions of the native speaker of a given language (well-formedness of sentences, synonymy, etc.). (2) A sharp distinction is drawn between linguistic theory on the one hand and grammar on the other. (3) The IC-analysis typical of American structuralism (see previous discussion) is formalized in a system of rules called Phrase-structure (PS) grammar. (4) PS-grammar is shown to be unable to account adequately for all sentences of any natural language. For example, it cannot account for the intuitive relation that any English speaker recognizes between two sentences such as Mary gave a book to John and John was given a book by Mary, or between the latter and Who was given a book by Mary? To account for relations of this kind, it is necessary to postulate a further level of rules, called transformations. This notion was borrowed from Harris, but it is rather differently conceived.
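Point (3) above, the formalization of IC-analysis as a system of rewriting rules, can be illustrated with a minimal Python sketch applied to Bloomfield’s example Poor John ran away. The rule set and the category labels (S, NP, VP, A, N, V, Adv) are assumptions chosen for the illustration, not rules quoted from Chomsky (1957), and the names `RULES`, `LEXICON`, `derive`, `bracket`, and `words` are hypothetical.

```python
# Hypothetical toy phrase-structure rules; labels and rules are ASSUMPTIONS
# made for this sketch, not taken from the works cited in the text.
RULES = {
    "S": ["NP", "VP"],
    "NP": ["A", "N"],
    "VP": ["V", "Adv"],
}
LEXICON = {"A": "poor", "N": "John", "V": "ran", "Adv": "away"}


def derive(symbol):
    """Expand a symbol top-down into a nested tuple (label, children...)."""
    if symbol in LEXICON:
        return (symbol, LEXICON[symbol])
    return (symbol, *[derive(child) for child in RULES[symbol]])


def bracket(tree):
    """Render a tree as a labeled bracketing, one bracket pair per constituent."""
    label, *rest = tree
    if len(rest) == 1 and isinstance(rest[0], str):
        return f"[{label} {rest[0]}]"
    return f"[{label} " + " ".join(bracket(t) for t in rest) + "]"


def words(tree):
    """Read the terminal string off the leaves of a tree, left to right."""
    label, *rest = tree
    if len(rest) == 1 and isinstance(rest[0], str):
        return [rest[0]]
    return [w for t in rest for w in words(t)]


tree = derive("S")
print(bracket(tree))
print(" ".join(words(tree)))
```

Each application of a rule corresponds to one cut into immediate constituents, so the bracketing printed for S groups poor John and ran away first and only then divides each of them further.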
Whereas, for Harris, it is a relation between sentences, for Chomsky it is a relation between structures. This means that the input of a transformation is a sentence in Harris’s framework, whereas in Chomsky’s it is an abstract structure, often rather remote from the actual sentence that it underlies. The importance given to the notion of transformation in the early phase of GG had the effect that Chomsky’s


theory was initially known as transformational grammar rather than as generative grammar (actually, the use of the latter label was rather unsystematic at that time).

The Standard Theory

In the decade 1955–1965, the model of grammar described in the previous section was modified by Chomsky himself and by some of his early associates, such as Charles J. Fillmore (b. 1929), Jerrold J. Katz (1932–2002), Edward S. Klima (b. 1931), Robert B. Lees (1922–1996), and Paul M. Postal (b. 1936). The result of such changes was the so-called (by Chomsky himself) standard theory, presented in Chomsky (1965). The overall structure of the standard model is the following one: PS-rules and lexical insertion rules generate the deep structure both of simple and of complex sentences. The application of transformational rules to deep structure produces surface structures. PS-rules, lexical rules, and transformations form the syntactic component of grammar; deep structures are interpreted by the semantic component, giving the semantic representation of sentences; and surface structures are interpreted by the phonological component, giving the phonetic representation. In Chomsky (1965), also the ‘mentalistic’ interpretation of linguistic theory, explicitly defined as ‘part of theoretical psychology,’ was maintained and argued for in detail. Chomsky opposed competence, defined as ‘‘the speaker–hearer’s knowledge of his language,’’ to performance, which is defined as ‘‘the actual use of language in concrete situations.’’ The linguist has to discover ‘‘the underlying system of rules’’ (i.e., the competence) ‘‘from the data of performance’’ (Chomsky, 1965: 4). A grammar that correctly describes the competence of a native speaker of a given language is said to be descriptively adequate. A linguistic theory is said to be explanatorily adequate if it ‘‘succeeds in selecting a descriptively adequate grammar on the basis of primary linguistic data’’ (Chomsky, 1965: 25). 
The task of linguistic theory, then, becomes that of accounting for the properties of the LAD (Language Acquisition Device), i.e., the device that allows the child to construct a grammar from among a set of possible alternatives.

Generative Phonology

Generative phonology was discussed in several essays since the late 1950s and found its systematic presentation in Chomsky and Halle (1968). The starting point of generative phonology is that phonology is ‘not-autonomous’ from syntax: some phonological processes depend on morphological and syntactic

structure. For example, the falling stress contour of blackboard is opposed to the rising one of black board because the former is a compound, hence belongs to the syntactic category N, whereas the latter is a Noun Phrase. Therefore the rules that assign stress contours must refer to syntactic surface structure (cf. Chomsky and Halle, 1968: chap. 2.1). This is the reason why the phonological component is said to ‘interpret’ the syntactic component (see previous section). This strict interrelation assumed between the phonological and the syntactic level is quite contrary to the prohibition of mixing levels typical of post-Bloomfieldian structuralism (cf. section on this topic; Pike had already criticized this principle). Generative phonology considered the autonomous approach to be a basic flaw of structuralist phonology, both European and American, which it labeled ‘autonomous phonemics’: the very notion of the phoneme as conceived in such frameworks was rejected. Generative phonologists, on the one hand, exploited certain difficulties in assigning variants to phonemes that had already been noted within structuralist phonology; on the other hand, they maintained that the assumption of an autonomous phonemic level often produces a loss of significant generalizations (the classical case was that of the voicing of Russian obstruents, brought forward by Halle). Hence, generative phonology does not assume a phonemic level, but only a phonological representation and a phonetic representation. The former representation is derived from syntactic surface structure by means of readjustment rules; the latter is derived from the phonological representation by means of phonological rules, which apply in a given order. Both phonological and phonetic representations are strings of word and morpheme boundaries and of feature matrices. In such matrices, columns are segments, and rows indicate the value of features.
The features of generative phonology only partly overlap with Jakobson’s (see section on Prague School): their number is higher (about two dozen vs. 12 or 14), and they are mainly defined on an articulatory rather than on an acoustic basis. Features are ‘by definition’ binary at the level of phonological representation, whereas they are not necessarily binary at the phonetic one. An essential part of generative phonology is the so-called theory of markedness (developing, but also essentially modifying, insights of Prague phonology): features, segments, and rules are not all on the same plane, but some of them are more natural than others in the sense that they are more frequent, are acquired by the child earlier, etc. This greater or lesser naturalness is accounted for in terms of unmarkedness vs. markedness of the entities and rules concerned.

190 20th-Century Linguistics: Overview of Trends

Since the 1970s, alternative approaches to the strictly segmental or linear model of Chomsky–Halle (1968) have been developed. For example, feature values and segments were no longer seen as necessarily in one-to-one correspondence; rather, it was assumed that in some cases a single feature can extend over more than one segment and, vice versa, a single segment can successively take two opposite values of the same feature (autosegmental phonology). It was also assumed that the domain of application of phonological rules is not determined only by the syntactic surface structure and readjustment rules: the phonological representation has a hierarchic structure of its own, not necessarily coinciding with the syntactic one (prosodic phonology).

The Impact of Generative Grammar

Generative grammar (or, more exactly, generative syntax) aroused great interest among linguists shortly after the publication of Chomsky (1957). This interest became still greater in the subsequent decade, especially after the appearance of Chomsky (1965), and also reached logicians and philosophers of language. Generative tenets were not accepted by everybody: quite the contrary, many of them were sharply criticized. However, the large majority of linguists felt obliged to take a position on them. The following tenets were especially the focus of discussion:

(1) The mentalistic view of linguistics (cf. The Standard Theory), which was later called cognitive.
(2) The assumption that linguistic theory has to deal with ‘an ideal speaker–hearer,’ within a ‘homogeneous linguistic community’: i.e., the social and communicative aspects of language do not influence its structure.
(3) The notion of Universal Grammar (UG), resuscitated by Chomsky (1965) with explicit reference to the tradition of grammaire générale starting with Port-Royal. From the early 1970s, UG essentially came to mean what he had earlier dubbed the ‘language acquisition device’ (LAD; cf. The Standard Theory): it was assumed to be universal since it would be shared by all human beings.
(4) The postulation of two different levels of representation (deep and surface structure).

It is therefore possible to investigate the development of the linguistic trends that have grown up since the late 1960s according to the position they took with respect to the previously listed generative tenets.

(1) Chomsky’s cognitive view of linguistics was actually opposed to the main structuralist trends, both in Europe and America, which conceived linguistics as an autonomous field. This new view was rejected, or at least dismissed as irrelevant, by some strictly formal approaches, such as Relational Grammar (see later

discussion). It was, however, shared by the majority of trends during the last decades of the 20th century, but often in a rather different way from Chomsky’s. Indeed, whereas Chomsky simply assumed that to do linguistics is to do ‘theoretical psychology,’ many scholars maintained that linguistic explanations have to be traced back to more general psychological or cognitive factors, or, at least, that they must be supported by independent psychological evidence.

(2) Chomsky’s low evaluation of the social and communicative aspects of language contrasted with many earlier linguistic trends, even of the structuralist kind: e.g., the Prague school defined language as a ‘means of communication.’ The view of language as a social phenomenon had been maintained at least since Meillet, and it was strongly reaffirmed by scholars such as Uriel Weinreich (1926–1967) as an explicit rejection of Chomsky’s views. Other opposition came from the pragmatic approaches to linguistic analysis that were developing within the philosophical tradition. More or less explicitly, all such trends opposed a ‘social–communicative’ view of language to the ‘cognitive’ one.

(3) A revival of interest in the problem of linguistic universals had already been shown by research such as Greenberg’s; Chomsky’s notion of UG clearly developed such interest in an unprecedented way. However, Chomsky’s version of UG was not accepted by everybody: the different approaches to language universals were strictly linked with the different views of linguistics as a cognitive science and of the relationships between language on the one side and social and communicative phenomena on the other.

(4) The question of levels of representation was also often linked to the problems of linguistic universals: several scholars equated ‘deep structure’ with UG, and ‘surface structure’ with cross-linguistic variation.
These interpretations were misguided, because both ‘deep’ and ‘surface’ structure had a specific technical value within a theoretical framework (see, e.g., Chomsky, 1975: 82). Nevertheless, they exerted a not negligible impact even on trends that were very distant from the Chomskyan one. Many of the debates between the different generative schools concentrated on the question of whether a distinction between ‘deep’ and ‘surface’ structure is really necessary and on the nature of the ‘deep’ level. In the following sections, trends stemming from generative grammar are distinguished from trends alternative to it. Such a distinction only refers to historical roots: the former trends were worked out by linguists originally (i.e., at the time of the standard theory) belonging to the generative group, the latter by scholars outside it. Nevertheless, several trends of the former group eventually became wholly alternative to the generative model.


Trends Stemming from Generative Grammar

Generative Semantics and Its Heritage

Generative Semantics (GS) was worked out between the 1960s and 1970s by scholars such as George Lakoff (b. 1941), James D. McCawley (1938–1999), Paul M. Postal (b. 1936) and John R. Ross (b. 1938). It was sharply opposed to the Extended Standard Theory (EST) by Chomsky and some of his followers. Both approaches shared a realistic view of linguistics and a multilevel approach to syntax, but their way of implementing such ideas was totally different. In their first works, generative semanticists rejected some basic assumptions of the standard theory: according to them, (a) deep structure was a useless concept, and (b) linguistic description must be semantically based. This semantic basis was sought in the reduction of linguistic categories to logical and/or psychological categories: semantic representation should be made to coincide with natural logic. In later works, it was assumed that semantic representation also includes typical semantic and pragmatic categories, such as focus or presupposition. From the early 1970s, the leading ideas that had characterized the followers of GS were gradually abandoned, and each generative semanticist followed his own way. Lakoff first tried to work out a ‘fuzzy’ grammar, according to which grammatical categories are not discrete, but form a continuum from the noun at one end to the verb at the other. McCawley moved toward an empirical and somewhat skeptical approach to syntax: contrary to EST, McCawley kept on rejecting any theory of language acquisition that did not take into account general cognitive properties. From this point of view, Cognitive Grammar by Ronald W. Langacker (b. 1942) and his associates could be considered as a legitimate heir to Generative Semantics. From the middle 1970s, two linguists formerly belonging to the GS group, David M. Perlmutter (b. 1938) and Paul M. Postal, developed a theory called Relational Grammar (RG). 
RG completely abandoned the notion of transformation as an operation on hierarchically and linearly ordered phrase markers. It also explicitly rejected any aim at being ‘psychologically real.’ RG takes grammatical relations as primitives and represents clause structure as an unordered set of constituents that bear grammatical relations to each other. Grammatical relations may change from one level (‘stratum,’ in RG terminology) to another. Strata are not connected by means of transformations, but of Relational Networks, which show which different grammatical relations the constituents bear at different levels.

Fillmore’s Case Grammar was often associated with GS, but it is essentially independent of it, even if both approaches wholly replaced the standard notion of deep structure. In Fillmore’s view, the ‘basic structure’ of the sentence consists of the verb and an array of case relationships (see Fillmore, 1968). By ‘case,’ Fillmore does not mean a morphological category, but an ‘underlying syntactic–semantic relationship.’ The elements of the basic sentence structure are unordered.

‘One-Level’ Approaches to Syntax

Generative Semantics pushed the distance between ‘deep’ and ‘surface’ structure to its extreme, by identifying deep structure with semantic representation. RG preserved a multilevel approach to syntax. From the middle 1970s, other linguistic trends originated that took the opposite path, giving up the distinction between deep and surface structure and assuming a single level of syntactic representation. The first systematic proposals in this direction are due to Michael K. Brame (b. 1944). The most successful of such ‘one-level’ approaches were, however, LFG (Lexical-Functional Grammar) and GPSG (Generalized Phrase Structure Grammar). LFG was initiated by Joan Bresnan (b. 1945), a former graduate student of Chomsky’s, and GPSG by a British scholar, Gerald Gazdar (b. 1945), who was later joined in his research program by other British and American linguists. On the one hand, GPSG and LFG share several assumptions: e.g., both avoid transformations and resort to other techniques to solve problems that the standard theory dealt with in transformational terms. On the other hand, they originated from and developed with rather different goals and concerns. LFG’s original goal was the search for a ‘realistic’ grammar. GPSG was worked out mainly on the basis of formal concerns and had no special interest in building a ‘psychologically real’ grammar.

From EST to the ‘Minimalist Program’

The syntactic theory worked out by Chomsky and his closest associates in the period from the late sixties until now had as its primary goal that of implementing the notion of Universal Grammar: the development of an adequate model of UG was seen as the proper goal of the cognitive view of language. This theory was called, during the 1970s, Extended Standard Theory (EST); in the 1980s, Principles and Parameters Theory (P&P) or ‘Government-Binding Theory’ (GB-Theory); from the early 1990s, the Minimalist Program (MP). Three works of Chomsky’s could be considered the landmarks of each of these three phases: Chomsky (1973) for EST; Chomsky (1981) for P&P; and Chomsky (1995) for MP.


EST’s main concern was the definition of restrictions on the functioning and on the format of syntactic rules. The first, decisive, step in this direction was the system of conditions on transformations of Chomsky (1973). More or less in the same period, Joseph E. Emonds (b. 1940) and Ray S. Jackendoff (b. 1945) formulated some important constraints on the format of transformational rules (Emonds) and phrase structure rules (Jackendoff). The great abstractness of all such conditions was assumed to be the proof that they could not possibly have been taught by adults or inductively discovered by the child. They were assumed to belong to Universal Grammar, namely the ‘innate biological system’ that is “invariant about humans” (Chomsky, 1975: 29). The innateness hypothesis, of course, contrasts with the actual cross-linguistic diversity. The Principles and Parameters approach was the first real effort made within the Chomskyan program to provide a systematic account of cross-linguistic differences. The universal features of language were dubbed principles, and the dimensions along which languages can vary, parameters. For example, the fact that a sentence in any language must have a subject would be a principle; but in some languages (e.g., Italian, Spanish, etc., as opposed to English, French, etc.) the subject may be ‘null,’ i.e., not phonetically realized. This option is called the ‘null-subject parameter’: it has a ‘positive value’ in Italian or Spanish, and a ‘negative value’ in English and French. Although principles are innate, the values of parameters are to be fixed on the basis of experience. The ‘Principles and Parameters’ approach stimulated an amount of research much larger than anything previously done within any other framework connected with generative grammar. In particular, the notion of parameter stimulated cross-linguistic investigation of several languages.
Chomsky, however, was more interested in the depth than in the breadth of explanation (in a sense, more in explanatory adequacy than in descriptive adequacy), and since the early 1990s he has developed the ‘Minimalist Program.’ The leading criterion of MP can be considered that of economy, namely resorting to the least possible number of entities and of levels of representation. Therefore, MP disposed of the levels of ‘deep’ and ‘surface structure’ and assumed Phonetic Form (PF) and Logical Form (LF) as the only levels of representation. Nevertheless, MP cannot be equated with the ‘one-level’ approaches discussed in the preceding section. In fact, this last version of Chomskyan generative syntax also essentially assumes a very abstract relation between the phonetic and the semantic side of language: PF and LF are related by the computational system, i.e., a transformational apparatus. One of the main goals of MP is precisely to

show why transformations exist: prima facie, they would seem to be antieconomical. The answer is that they exist to replace uninterpretable features, which are also antieconomical: by so doing, both imperfections erase each other. Natural language is therefore a perfectly economical system, and from this point of view, Chomsky maintains, it is very rare among other biological systems.

Trends Alternative to Generative Grammar

Functionalist Schools

The common feature of functionalism is the assumption that language structure is conditioned by its function as a means of communication. This approach was already taken by some structuralist scholars, such as Martinet, and especially the founder of the Prague school, V. Mathesius, who distinguished between the formal (i.e., the grammatical) and the actual (i.e., the communicative) partition of the sentence. Mathesius’s insights were taken over by Prague linguists of the subsequent generation, such as František Daneš (b. 1919) and Jan Firbas (1921–2000), who coined the term Functional Sentence Perspective to mean Mathesius’s actual partition. From the 1960s, the most significant functionalist schools developed as an explicit alternative to the formal paradigm of generative grammar: Functional Generative Description (FGD), mainly worked out by Petr Sgall (b. 1926) and his associates, which represents a further stage of Prague School linguistics; Simon Dik’s (1940–1995) Functional Grammar (FG); and Halliday’s Systemic Functional Grammar (SFG). FGD proponents did not reject generative grammar as a whole, but maintained that it was too partial as an approach to language; on the other hand, they considered exclusively pragmatic approaches to be partial as well. Despite their differences, all these functionalist schools share an important common core, the main points of which are the adoption of (a) Functional Sentence Perspective, and (b) a version of Tesnière’s valency grammar, in its original form or mediated through Fillmore’s Case Grammar. Hence, their fundamental problem was to work out a device to explain the relationship between the system of Tesnière’s roles (or Fillmore’s ‘deep cases’) and the grammatical and communicative organization of the sentence.

Typological Linguistics

Since the early 1970s, typological linguistics developed mainly as an attempt to explain the word order


correlations stated by Greenberg (see the section titled ‘The Beginnings of Typological Linguistics’), and it gradually replaced purely syntactic explanations with semantically and pragmatically based ones. It is therefore independent of generative grammar both in its origins and in its achievements. However, some insights stemming from generative grammar influenced typological studies, especially in the 1970s. For example, Winfred P. Lehmann (b. 1916) started from a syntactic model analogous to that of Fillmore’s Case Grammar. He assumed an unordered ‘underlying structure,’ to be converted into a linearly ordered one by a rule with phrase-structure format: from this, VO-languages vs. OV-languages would result (cf. Lehmann, 1973). The most significant development of Greenberg’s proposals about word order universals is due to John A. Hawkins (b. 1947), who showed that cases that appear as exceptions in Greenberg’s treatment are actually not exceptions, if Greenberg’s universals are reformulated in a ‘complex’ form. An example of such a reformulation would be the following: “if a language is SOV, then, if it has AN order, it has also GN order” (cf. Hawkins, 1983: 64). Two other key notions developed in the framework of typological linguistics, especially by Edward S. Keenan (b. 1937) and Bernard Comrie (b. 1947), are continuum and prototype: categories are no longer defined, as in generative grammar, in terms of possessing or not possessing a given property, but as clusters of properties. If all such properties occur, the category concerned is ‘prototypical’; the deviations from the prototype are distributed along a ‘continuum.’

Sociolinguistics

The label ‘sociolinguistics’ was first used in 1952 by the American Haver C. Currie (1908–1993), but it became widespread only from about the late 1960s. In recent decades, this label has come to cover a variety of research, both theoretical and applied, from the ‘ethnography of speaking’ to ‘language policy.’ Between the 1960s and the 1970s, however, a sociolinguistic trend presented itself as an alternative approach to generative grammar. The leader of this trend undoubtedly was William Labov (b. 1927), to whom the notion of variable rule is due. Variable rules have the format of PS-rules of generative grammar (more exactly, of context-sensitive PS-rules): their application or nonapplication, however, is not categorical, but is conditioned by probability factors of both a linguistic and an extralinguistic (i.e., social, stylistic, regional) kind. By resorting to the device of variable rules, Labov was able to account for the different realizations of the same grammatical phenomenon across different

social groups (a paradigmatic case was that of the contraction vs. deletion of the copula be among white vs. black speakers of American English; see Labov, 1969). The status of variable rules became a topic of intensive discussion. Do they belong to competence or to performance? Labov initially assumed that they are part of competence, but he eventually (in the 1970s) rejected the usefulness of such a distinction. On the other hand, a radical revision of Chomsky’s notion of competence had been proposed in 1968 by Dell Hymes (b. 1927), who replaced it by that of communicative competence, which indicates the speaker’s ability to use language according to the different social and contextual situations. It can be remarked that, since the EST period, Chomsky too referred to a ‘pragmatic competence’ interacting with the grammatical one. Hence, the problem is whether or not grammatical competence is independent of ‘pragmatic’ or ‘communicative’ competence: Chomsky’s answer was affirmative, whereas that of sociolinguists and pragmaticists was negative.

Pragmatics

From the 1960s, pragmatics presented itself as an alternative to the Chomskyan view of language as a cognitive capacity fully independent of its use. Indeed, the roots of pragmatics lie further back than generative grammar: the term had been created in 1938 by the philosopher Charles W. Morris (1901–1979), and the research field had as its initiators, between the 1950s and the 1960s, two British philosophers of language, John L. Austin (1911–1960) and H. Paul Grice (1913–1988) (neither of whom, however, used the word pragmatics). Austin maintained that speech is action (cf. Austin, 1962). The primary evidence for this is given by the utterances called performative by Austin, such as I promise you to come, by means of which I am not only saying something, but also doing it. Performative utterances are a kind of illocutionary act: examples of illocutionary acts are question, order, etc. According to Austin, a speech act, besides the illocutionary act, consists of the locutionary act (the uttering of given words and phrases) and the perlocutionary act (the intended effect of the speech act on the hearer). This classification of speech acts was partly revised by John R. Searle (b. 1932). The original motivation of Grice’s ‘logic of the conversation’ (which dates back to essays from the 1950s, eventually collected in Grice, 1989) was to show that there is no real divergence between the meaning of symbols such as ¬, ∀, ∃, etc., of formal logic and their counterparts not, all, some, etc., in natural language: the apparent differences of meaning are due to certain principles governing conversation, the conversational maxims. If I utter a sentence such


as Some students passed the examination, I am undoubtedly telling the truth, even if every student did in fact pass the examination, but the hearer normally interprets it as meaning that only some students, and not all of them, passed the examination. This is due to the fact that I have violated the ‘maxim of quantity’: “make your contribution as informative as is required” (cf. Grice, 1989: 26). From my violation, the hearer has drawn the conversational implicature that only some students, and not all, passed the examination. The analysis of speech acts and that of the logic of conversation are still at the center of interest of pragmaticists. The interest in conversation also led pragmatics to include a good deal of text linguistics, which originally started as a project to extend formal techniques of the generative kind to units larger than sentences. In recent decades, text linguistics seems to have been replaced by the more empirical and informal conversational analysis, initiated by sociologists such as Harvey Sacks (1935–1975) or Harold Garfinkel (b. 1929), but later adopted by pragmatics-oriented linguists.

See also: Phoneme; Autosegmental Phonology; Benveniste, Emile (1902–1976); Bloomfield, Leonard (1887–1949); Case Grammar; Chomsky, Noam (b. 1928); Cognitive Grammar; Cognitive Science: Overview; Distinctive Features; Firth, John Rupert (1890–1960); Functionalist Theories of Language; Generative Grammar; Generative Phonology; Generative Semantics; Halle, Morris (b. 1923); Halliday, Michael A. K. (b. 1925); Harris, Zellig S. (1909–1992); Hockett, Charles Francis (1916–2000); Jakobson, Roman (1896–1982); Labov, William (b. 1927); Lexical Functional Grammar; Martinet, André (1908–1999); Minimalism; Nida, Eugene Albert (b. 1914); Phonemics, Taxonomic; Prague School; Principles and Parameters Framework of Generative Grammar; Saussure, Ferdinand (-Mongin) de (1857–1913); Saussure: Theory of the Sign; Structuralism; Structuralist Phonology: Prague School; Systemic Theory; Tagmemics; Tesnière, Lucien Valerius (1893–1954); Trager, George L. (1906–1992); Transformational Grammar: Evolution; Trubetskoy, Nikolai Sergeievich, Prince (1890–1938); Valency Grammar.

Bibliography

Austin J L (1962). How to do things with words. Oxford: Clarendon Press.
Bally C (1965 [1932]). Linguistique générale et linguistique française (4th edn.). Bern: Francke.
Benveniste É (1966). Problèmes de linguistique générale, 1. Paris: Gallimard.
Bloomfield L (1933). Language. New York: Holt & Co.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: The MIT Press.
Chomsky N (1973). ‘Conditions on transformations.’ In Anderson S R & Kiparsky P (eds.) A festschrift for Morris Halle. New York: Holt, Rinehart & Winston. 232–286.
Chomsky N (1975). Reflections on language. New York: Random House.
Chomsky N (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky N (1995). The minimalist program. Cambridge, MA: The MIT Press.
Chomsky N & Halle M (1968). The sound pattern of English. New York: Harper & Row.
Engler R (1967–74). Critical edition of F. De Saussure, Cours de linguistique générale. Wiesbaden: Harrassowitz.
Fillmore C J (1968). ‘The case for case.’ In Bach E & Harms R T (eds.) Universals in linguistic theory. New York: Holt, Rinehart & Winston. 1–88.
Godel R (1957). Les sources manuscrites du Cours de linguistique générale de F. de Saussure. Genève: Droz.
Greenberg J H (1966 [1963]). ‘Some universals of grammar with particular reference to the order of meaningful elements.’ In Greenberg J H (ed.) Universals of language, 2nd edn. Cambridge, MA: The MIT Press. 73–113.
Grice P (1989). Studies in the way of words. Cambridge, MA: Harvard University Press.
Guillaume G (1971–1990). Leçons de linguistique. Paris: Klincksieck; Québec: Presses de l’Université Laval [vols 1–4]. Québec: Presses de l’Université Laval; Lille: Presses Universitaires [vols 5–10].
Harris Z S (1951). Methods in structural linguistics. Chicago: The University of Chicago Press.
Hawkins J A (1983). Word order universals. New York/San Francisco/London: Academic Press.
Hjelmslev L (1943). Omkring sprogteoriens grundlæggelse. København: Munksgaard (English translation by F J Whitfield: Prolegomena to a theory of language (2nd edn.). Madison: University of Wisconsin Press, 1961).
Jakobson R & Halle M (1956). Fundamentals of language. The Hague: Mouton.
Labov W (1969). ‘Contraction, deletion and inherent variability of the English copula.’ Language 45, 715–762.
Lehmann W P (1973). ‘A structural principle of language and its implications.’ Language 49, 47–66.
Martinet A (1955). Économie des changements phonétiques. Bern: Francke.
Sapir E (1921). Language. New York: Harcourt, Brace & World.
Sapir E (1949). Selected writings of Edward Sapir in language, culture and personality. Mandelbaum D G (ed.) Berkeley & Los Angeles: University of California Press.
Saussure Ferdinand de (1922 [1916]). Cours de linguistique générale (2nd edn.). Paris: Payot. (English translation by Roy Harris. London: Duckworth, 1983).
Sechehaye A (1908). Programme et méthodes de la linguistique théorique. Paris-Leipzig-Genève: Champion-Harrassowitz-Eggimann.
Tesnière L (1966 [1959]). Éléments de syntaxe structurale (2nd edn.). Paris: Klincksieck.
Trubeckoj [Trubetzkoy] N S (1939). Grundzüge der Phonologie. Prague: Travaux du Cercle Linguistique de Prague, 7. (English translation by C. A. M. Baltaxe. University of California Press, 1969).

Two-Dimensional Semantics

C Spencer, Howard University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.

When we ask whether a sentence is true or false, we are always asking with respect to a particular world. We are typically concerned about a sentence’s truth value in the actual world, but we sometimes consider its truth value in other possible worlds as well. Thus the evaluation of any sentence is world-dependent in the sense that whether it is true (in a world) depends on the facts about that world. Context-sensitive sentences are also world-dependent in another, quite different sense: what they mean depends on facts about the context, or world, in which they are used. For instance, it’s cold here uttered in Pakistan means that it is cold in Pakistan, and when uttered in New Jersey means that it is cold in New Jersey. Two-dimensional semantics uses a formal apparatus from two-dimensional modal logic to characterize these two kinds of world-dependence. The two-dimensional framework has been applied to a variety of problems in semantics (indexicals and demonstratives and their interaction with modal operators), pragmatics (presupposition), and philosophy (accounts of the a priori/a posteriori distinction and the psychological/functional roles of thought). All of these applications depend on various assumptions, which are not implicit in two-dimensional modal logic, and many of which are controversial.

Modal logic allows that expressions may have different extensions in different possible worlds. For instance, it allows that the objects that satisfy a predicate in one world may differ from those that satisfy it in another. In one-dimensional modal logic, the rule that determines each expression’s extension in every world, called its intension, is represented as a function from possible worlds into extensions. The intension of a predicate F, for instance, is a function that takes a possible world onto the set of individuals that satisfy F in that world, and the intension of a singular term t is a function taking possible worlds to single individuals.
Two-dimensional modal logic allows that a single expression may be associated with different one-dimensional intensions in different contexts, or worlds, of use. So it associates a two-dimensional intension with each expression, which is

a function from possible worlds to one-dimensional intensions, or equivalently a function from ordered pairs of possible worlds into extensions (see Segerberg, 1973; Åqvist, 1973; van Fraassen, 1977 for expositions of a two-dimensional modal semantics for formal languages). Since a two-dimensional intension takes pairs of possible worlds onto extensions, it has the resources to represent the two different kinds of world-dependence mentioned above. One of the worlds supplies the contextual elements needed to interpret context-sensitive expressions, and the other world supplies the context of evaluation. I will call the entity that plays the former role the world of occurrence, and the entity that plays the latter role the world of evaluation, although no terminology is standard. Two-dimensional intensions can be represented in a matrix such as Figure 1, which gives a two-dimensional intension for a single expression, s. In the leftmost column of Figure 1, w1, w2, and w3 represent possible worlds considered as worlds of occurrence. In the top row, these same three worlds are considered as worlds of evaluation. Suppose that s is the sentence I am in San Francisco. In w1, Ann is the speaker of this sentence and Ann is in San Francisco. She is also in San Francisco in w3, but not in w2. In w2, Beth is the speaker, but she is in London in all three worlds. In w3, Carl is the speaker, and Carl is in San Francisco in all three worlds. The cells of this matrix are filled in with truth values, since this is the appropriate extension for sentences. The row corresponding to w1 tells us the truth value, in w1, w2, and w3, of the sentence s, considered as occurring in w1. In w1, Ann utters this sentence, so it is true just in the case where Ann is in San Francisco. Accordingly, this occurrence is true in w1 and w3, but false in w2, as the matrix indicates. Similarly, the row

Figure 1 Two-dimensional matrix.
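The construction described above can be sketched in a few lines of Python, treating a two-dimensional intension as a function from a world of occurrence to a one-dimensional intension. This is only an illustration under stated assumptions: the names `SPEAKER`, `LOCATION`, `intension` and `two_dimensional_intension` are hypothetical, and Ann's location in w2 is recorded as the placeholder "elsewhere" because the text says only that she is not in San Francisco there. The row for w1 matches the values stated in the text; the rows for w2 and w3 are simply what the stated facts about Beth and Carl yield.

```python
WORLDS = ["w1", "w2", "w3"]

# Who utters "I am in San Francisco" in each world (world of occurrence).
SPEAKER = {"w1": "Ann", "w2": "Beth", "w3": "Carl"}

# Where each person is in each world (world of evaluation).
# "elsewhere" is a placeholder: the text only says Ann is not in San Francisco in w2.
LOCATION = {
    "w1": {"Ann": "San Francisco", "Beth": "London", "Carl": "San Francisco"},
    "w2": {"Ann": "elsewhere", "Beth": "London", "Carl": "San Francisco"},
    "w3": {"Ann": "San Francisco", "Beth": "London", "Carl": "San Francisco"},
}


def intension(world_of_occurrence):
    """One-dimensional intension of the sentence as used in the given world:
    a function from worlds of evaluation to truth values."""
    speaker = SPEAKER[world_of_occurrence]  # fixes the reference of "I"

    def truth_value(world_of_evaluation):
        return LOCATION[world_of_evaluation][speaker] == "San Francisco"

    return truth_value


def two_dimensional_intension(world_of_occurrence, world_of_evaluation):
    """Equivalent view: a function from ordered pairs of worlds to extensions."""
    return intension(world_of_occurrence)(world_of_evaluation)


# One row per world of occurrence, one column per world of evaluation.
MATRIX = {
    occ: [two_dimensional_intension(occ, ev) for ev in WORLDS]
    for occ in WORLDS
}

for occ in WORLDS:
    print(occ, ["T" if value else "F" for value in MATRIX[occ]])
```

Writing `intension` as a function that returns another function mirrors the first formulation in the text (worlds to one-dimensional intensions), while `two_dimensional_intension` shows the equivalent formulation over ordered pairs of worlds.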


Literacy Practices in Sociocultural Perspective

J Collins, University at Albany, SUNY, Albany, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article discusses some of the things meant or intended when researchers refer to ‘literacy practices.’ One meaning is that the researchers will treat literacy as an event in which the (arti)facts of inscription cannot be separated from persons, settings, and other communicative modalities. In addition to this descriptive aim, the term also usually signifies a theoretical ambition. Across multiple disciplines of social inquiry, reflecting the legacy of thinkers such as Sahlins, Williams, Foucault, Giddens, and Bourdieu, the term ‘practice’ has come to signify a range of theories and frameworks that grapple with the interplay of structure and construction, history and agency (Ortner, 1984). Although a literacy event may be as personal and fleeting as jotting down a list or glancing at an advertisement, literacy understood as the making or interpreting of inscriptions has a history of many millennia, and is closely associated with long-term and large-scale enterprises, such as cities and states, armies and schools, extensive markets and world religions. Relating the personal to the large-scale, the fleeting to the enduring, subject to structure, is thus a challenge confronting the study of literacy practices. The term ‘New Literacy Studies’ (NLS) denotes a broad framework, drawing together various critical approaches within the field of literacy studies, consciously built upon the notion of literacy as a practice, rather than, say, a psychological skill or abstract social property. Within the NLS framework, literacy practices have been explicitly discussed as entailing a twofold research intention: (a) to build upon the notion of literacy as a communicative event, itself derived from the anthropological ethnographies of communication paradigm, but also (b) to push analysis beyond events to the ideological framing and institutional contextualizing that gives particular events their broader significance (Street, 1993).
In this effort, NLS work has drawn particular inspiration from the ground-breaking research of Shirley Heath on literacy events, and it has used the anthropological notion of ‘cultural model’ to explore relations between consciousness and institutions. It has also, however, been informed by European and English traditions of critical discourse analysis, which call for the study of language as a social practice, an approach to language/society informed by Marxian analyses of ideology and hegemony and Foucault’s arguments about discourse, knowledge, and power (Rogers, 2003). In what follows, I first discuss work in the New Literacy Studies (NLS) framework, then indicate amplifications and framings suggested by sociocultural perspectives drawn from anthropology and linguistic anthropology (LA), arguing that the differing approaches only partly overlap. The article concludes by addressing contemporary literacy problems in the US and the complementary strengths and weaknesses of the two approaches (NLS and LA) in thinking about those problems as well as in providing frameworks for more general understandings of the phenomena of literacy and practice. It goes without saying that an article of this scope will unavoidably be schematic, citing only directly relevant work, leaving much that is worthwhile unmentioned and treating as settled much that deserves further argument.

Literacy Practices in the New Literacy Studies

As noted above, within the NLS approach, priority is given to studying literacy events, the situated doings that involve acts of reading, writing, or both. Highly influential early work was that of Heath (1983), who showed that in home, school, workplace, or church, what was at issue was not a binary contrast between literacy and orality, but rather the socially situated and culturally mediated events – whether reading a bedtime story or reading the mail, filling in a job application or composing a prayer for a church service. Participating in such events, people not only decoded or encoded text but were socialized into and enacted particular views of what reading or writing might be, who took what roles in such events,
and how they were part of larger-purpose endeavors: bringing up children, going to school, getting and keeping a job, expressing faith. In calling attention to the fact that acts of inscription and interpretation were part of a more general communicative economy, Heath’s studies were part of a more general inquiry, also undertaken by historians and literary scholars, into the relations between orality and literacy (see Gee, 1996, for one review). With the breaching of boundaries between written text and spoken event, the question of context is sharply raised, or in a more specific analytic idiom, the issue of indexicality is posed. Shuman (1986), in an in-depth study of working-class girls’ “fight stories,” emphasized that the right or authority to provide an account (that is, to be or not be talking behind someone’s back, itself grounds for an altercation) depended on particularities of relationship and situation, not on the communicative modality of speaking or writing per se. Drawing upon work by Garfinkel and Sacks, she noted that “all utterances are indexical . . . [that is,] all utterances have multiple possible meanings and any particular meaning is an understanding of context” (Shuman, 1986: 119, emphasis added). Extending the principle of indexicality to inscribed utterances, she argued that “The recognition of indexicality admits the possibility of multiple interpretations of text” (Shuman, 1986: 120). The theoretical recognition of indexicality, and the practical possibility of multiple interpretations of text, means that there can in principle be no strict line between text, context, and interpretation. The point is not simply that multiple interpretations are possible, which is true but uninteresting in itself. Rather, it is that which interpretations gain authority is a matter of social dynamics, involving actors and institution-based understandings, as well as inscriptions.
The second aspect of the NLS understanding of literacy practices is that practices necessarily involve a cognitive-ideological dimension – of cultural models and implicit theories regarding what counts as reading or writing. One of the more systematic treatments of literacy models, Bialostok (2002), analyzes interviews with a set of middle-class parents, arguing for a “white middle class model of reading,” in which the reading of prose (fiction or non-fiction) in books is given near-exclusionary status. In Bialostok’s analysis, the statements that interviewees make, such as “Reading is part of me” or “I can easily gorge down a novel a day,” are expressions of metaphors which indirectly index a cultural model in which reading is a proxy for morality (Bialostok, 2002: 351). In this model, reading is recurrently talked about as a moral act of worthy consumption, and a series of interesting metaphors are attached to this textual consumption. Book reading is sustenance, it fills readers up; indeed, they can ‘devour’ or even ‘binge’ on books. Reading books is a fitting part of the good person: “Books fit in with my life,” says one respondent; “Reading is something that’s just part of me,” says another (Bialostok, 2002: 356). The person who doesn’t read (books) is often characterized as “missing a lot of things” (Bialostok, 2002: 357). Book reading produces economic well-being: “Books have enriched our children’s lives,” says one respondent. But its lack points to poverty: “It’s just so sad when a child doesn’t have any books or isn’t read to. They have such impoverished lives,” says another interviewee (Bialostok, 2002: 359). As Bialostok argues, this is not just a culturally specific conception of reading, coherent and articulated through interlocking images and anchored in the institution of the middle-class family. It is also the model or theory of reading promoted by the child-rearing professions of the contemporary US and, in particular, the school. It is a model that recognizes and gives legitimacy to certain class-associated literacy practices, while ignoring or derogating acts rooted in other systems of value (see also Heath, 1983). An important point about positing what we might call an ideological level of models and implicit theories that are part of literacy practices is that, as with language ideologies more broadly, such conceptions and theories mediate between real-time microinteractional events and institutional orders, that is, perduring social classifications, organizational forms, and activity types.
In his ethnography of literacy on the Pacific atoll of Nukulaelae (Tuvalu), Besnier (1995) shows that letter writing gains its full significance within a configuration of linguistic, cultural, and social features that include Nukulaelae lexical and grammatical categories of affect, norms for face-to-face conversation, norms for gender-based emotional expression, and extensive labor out-migration. Most pertinent to our current theme, Besnier shows that the strong evangelical religion practiced on the island provides weekly enactments, through sermons, of gendered personhood and authority. In Besnier’s account, Nukulaelae sermons feature a blending of literate and oral resources: sermons are written beforehand, but preaching styles draw on other ritual genres as well as biblical passages. For many Nukulaelae, sermons are key events in which religious authority and a public search for truth are performed, displayed, and claimed. The sermon is part of a more general fundamentalist epistemology: “. . . Scriptures are the ultimate arbiter of truth. If they abide by the authority of God and the Bible, humans can gain access to the unambiguous truth that is otherwise beyond their reach” (Besnier, 1995: 141). This pursuit of truth is carried out by individuals evoking and commenting upon the sacred scripture in written sermons: “[the composition and performance of which] centralize an individualistic sense of personhood and the search for the truth” (Besnier, 1995: 138). But the sermon is part of – that is, it reflects, expresses, and reproduces – an institutional hierarchy of gender and knowledge: though women as well as men may preach, only men can write a sermon.

That conceptions of reading or writing include assumptions about categories of person appropriate to a given activity is, of course, not surprising. The exclusion of women from Koranic or Talmudic scholarship is well known; Enlightenment intellectuals in Europe and Colonial North America assumed that Blacks, whether free or enslaved, were incapable of writing; and in pre-Revolutionary America, it was normative that the virtuous woman should read, but not “take up the pen,” especially for public expression (Warner, 1990). Such ideological framing of literacy events, combining cultural classification (e.g., male/female, white/black) and institutional site (e.g., the school, the pulpit, the press), shows that concepts of literacy practice must extend beyond the observable event.

Literacy Practices in Sociocultural Perspective

The New Literacy Studies has been largely an undertaking of educators, anthropologists, and sociolinguists who have successfully critiqued earlier claims about literacy as a context-independent “technology of the intellect” and called into question many school-based and official definitions of literacy (Street, 1984; Gee, 1996). As a research program it has, however, been less successful in areas germane to a sociocultural perspective on literacy practices. These areas include the following: (a) registering the consequences of the artifactualization of language wrought by inscription and the technical-institutional distribution of such text-artifacts, (b) analyzing the ideological range that accompanies literacy endeavors in comparative, global perspective, and (c) attending to institutional and social structural influences on literacy practices outside contemporary Western states and school systems. I address each of these themes below.

Artifactualized Language

One undeniable aspect of inscription is that it turns language into a thing: it renders some element of language (words, syllables, phonemes) into an
artifact, into marks made in some medium (clay, papyrus, paper, acetate). These artifacts can, in turn, become input for procedures of accumulation and distribution of varying scale. Despite the untenability of Jack Goody’s general conception of literacy, one value of his extensive research is his emphasis on procedures of accumulation and distribution of text-artifacts as essential processes in the evolution of social complexity. In what is otherwise a critical assessment of Goody’s 1986 monograph, The logic of writing and the organization of society (LWOS), Collins and Blot (2003: 20) have written:

A general virtue of LWOS is that it provides constant reminders of the semiotic dimensions of social complexity. An endless keeping of lists seems to accompany social undertakings of any significant scale: censuses of population undertaken by states and lists of priest-officiants maintained by temples are prime examples. Bookkeeping emerged early as a primary function: if not the primary function of inscription in Mesopotamia, and it figured in subsequent temple, palace, and merchant economies in the Ancient Near East.

As if in complement to Goody’s The logic of writing and the organization of society, Olson (1994) developed an extended inquiry into philosophies and practices of reading in the intellectual traditions of Western Europe. He argues that a written textual representation is important because, as an artifactualizing of language, it suggests a model of language. This model inclines users to become (a) aware of the dimensions of language represented (say, letters or words) and (b) troubled by what is not represented, that is, by what is left unsaid. In Olson’s account, an historical confluence of ideological, institutional, and technological developments in Western Europe results in a general model of reading-faithful-to-the-text, which arose out of medieval and early modern religious debates, was given substantive diffusion by the development of universities, printing, and print-capitalism, and entered into debates in 16th- and 17th-century science about how best to read the Book of nature. As articulated by Francis Bacon and other early modern philosopher-scientists, the model of reading became part of a doctrine of truth based on precise records, in “unadorned style,” in experimental and other natural scientific inquiry. This concern for realistic, replicable representations of the natural world, in turn, gave impetus to developments in various representational media which flourished in the 17th–19th centuries: portrait painting, map-making, and botanical drawing. These representational media themselves became print artifacts, mechanically reproduced, extensively distributed, and mediating far-flung engagements in a world being transformed by capitalism and colonialism.

What is often called ‘the print revolution’ was a central element in the production and distribution of artifactualized language. Historians of print (Eisenstein, 1968) have described the cultures of print emerging along with the new technology developed by Gutenberg in Germany in the 1450s, and they have described the very rapid spread of both production sites and book artifacts. As noted in Crosby’s (1997) suggestive analysis of the role of measurement and representation in European science and technology: “By 1478 they were printing in London, Cracow, Budapest, Palermo, Valencia, and a number of cities in between. By the next century millions of books had been printed” (Crosby, 1997: 231). The widespread diffusion of print artifacts, and especially books and newspapers, gave rise to new ways of viewing language, space, and time, encouraging, in turn, new ways of imagining both collectivities and individual nature. For Eisenstein, what she called “the uneven rise of reading publics” led to new forms of individual development. She argues that with the spread of books, lives became more procedure-bound as individuals and groups began “going by the book” (Eisenstein, 1968: 39). Such “going by the book” was a textual ethos and practice characterizing a “‘middle class’ secular puritan” ethos found in Protestant Europe and North America, in which domestic handbooks, marriage guides, and etiquette rule books were paramount in the conduct and judgment of lives. Eisenstein suggests that the new literate individualization also contributed to the envisioning of and identifying with new collectivities, in particular, the nation-state. This line of argument has been influentially developed by Anderson (1991), who argues that wide dissemination of newspapers and novels provided textual materials for imagining new spatial and temporal orders, for which the language-codifying nation-state was both instrument and outcome.
A more historically and culturally nuanced account of such processes is provided in Warner’s (1990) analysis of letters, literacy, and political categories in late colonial and early national U.S. history. In his account, foundational concepts such as ‘citizen’ are linked to social divisions – such as those between men and women, whites and blacks – and to particular textual practices, such as collecting personal libraries, practicing journalism, and selling an ‘American literature.’ We now live, of course, in an era not of ascendant nationalism but of globalization, wrought not by the printing press but by computing machines and their communicative infrastructure. Space considerations do not permit much discussion, but there has been considerable interest, debate, and writing about this most recent development in the production and distribution of text-artifacts. We need not enter the debates over whether the digital technologies are revolutionary newcomers or merely new communicative modalities interacting with older modes, or whether they herald an era of dystopian or utopian postliteracy (Warschauer, 1999). But we should note that with digital as with alphabetic literacy, artifactualized language provides ways of mediating communication, channeled through existing and emerging institutional forms, typically with official inducements for appropriate use and sanctions against inappropriate use, and typically also accompanied by an underground of subversive, wayward practices.

Sociocultural Practices of Language Use Associated with Inscription and Artifact Circulation

In the preceding section I focused upon the artifactualizing of language brought about by inscription, some of the major developments in the production and distribution of such artifactualized language, and some of the historical, institutional, and psychological developments associated with these communicative economies. These latter include the rise of states and general religions, the development of science, nationalism, and the rule- (or book-) bound self. It is ethnocentric and crudely techno-deterministic to argue, as some have, that the alphabet, qua script, causes modern science. I agree with Brandt (2001), however, that many students of literacy practices have gone too far in denying a technical dimension to literacy, giving short shrift to technical–institutional interactions and political economic dynamics in the history of literacy. This concern registered, it is also important to stress that there are always cultural variations in the way literate resources are taken up and put to use. Indeed, to its credit, the NLS framework has always stressed such variation. Given its concern to critique official, dominant understandings, it has not, however, brought out the playful, subversive uses of literacy; such transgressive uses often highlight the cultural shaping of literacy practice.

Burke’s (1988) ‘The uses of literacy in Early Modern Italy’ gives a picture of literacy practices in a flourishing mercantile city state. Not surprisingly, we learn that in the 16th-century city state of Florence there was a robust notarial culture in which merchants kept extensive accounts and engaged in voluminous correspondence as they tracked expenditures and profits over far-flung trade networks. Also not surprisingly, we learn of secular, sacred, and patriarchal efforts at regulation: the Florentine state pioneered the use of written passes in order to regulate subjects’ travel; the Church issued tickets that were collected at communion, part of a record book registering attendance and nonattendance; and literate Florentine males proposed what Burke calls a “rule of female illiteracy” (Burke, 1988: 38). This latter prohibition on who should read and write seems to have been caused by a familiar patriarchal plight: fear of wayward female sexuality, more particularly, that literate wives and daughters might get up to no good by trafficking in love letters. The state, for its part, feared unauthorized reading of its correspondence and pioneered the use of ciphers for encrypting correspondence. The Counter Reformation Church was troubled by both literacy and illiteracy. Literates were problematic because they might read heretical works, such as those being busily produced and disseminated by Protestant printers to the north. Illiterates, on the other hand, were worrisome because they were fond of and susceptible to superstition and magic. And the illiterate majority of early modern Florence had access to written magic: spells, incantations, and medical cures that used alphabetic sequences (such as abracadabra), printed and otherwise inscribed on cards, amulets, and other objects, in order to evoke and control the supernatural.

There is a small but interesting body of research showing that in colonial and postcolonial settings, hybrid forms of spiritual practice involve non-Western peoples appropriating the idea of supernatural powers of sacred text. This is the case among the Mende of Sierra Leone, who make spells and talismans with fragments of Classical Arabic (Koranic) script (Bledsoe and Robey, 1993); it is also the case with the Gapun villagers of Papua New Guinea, who manipulate Biblical text in search of the material benefits promised by Cargo religion (Kulick and Stroud, 1993).
Basso’s (1990) description of an indigenous Apache writing system reports the case of Silas John, a mission-raised Apache who in early adulthood experienced visions in which God apparently imparted knowledge of 62 prayers and a system for writing them down. The writing system was novel in form, drawing upon both English orthography and Apache sand-painting designs. After his visions, Silas John became the prophet of a neotraditionalist religious movement, gathering a select band of twelve disciples, to whom he taught his system of writing. He and his followers then used the writing system to record prayers and other instructions for use in rituals. In short, Silas John drew upon the idea of the power of divine writing, creating a written code, knowledge of which was necessary to successful performance of ceremonies. A traditional practice of rites was continued, but only through the agency of new literates, readers who controlled access to the divine through their manipulation of esoteric script. A case strikingly similar to this one, and roughly contemporaneous, is that of the Aladura movement in colonial Nigeria during the 1920s. It also involved a prophet, Oshitelu, who also received prayers in a vision, along with a writing system for recording them. Oshitelu’s writing system was distinct from English orthography, in which this prophet had also been schooled, and thus was accessible only to the specially trained (Probst, 1993).

In a recent study of practices of magical writing in Highland Ecuador, Wogan (2004) describes how the villagers of Salasaca attribute life-threatening power to acts of writing and erasure. Wogan argues that many practices of magical writing are a mirror-image of the Ecuadorian state’s concern to record all indigenous peoples in its Civil Registry. In a campaign advertising the need to register new births, the slogan Si el niño no está inscrito, es como si no existiera (“If a child is not registered, it’s as if he doesn’t exist”) is proclaimed on posters and television (Wogan, 2004: 52). Conversely, Salasacans attribute life-threatening power to placement of a name in the book of San Gonzalo, a local saint. Hence the mirror-image: if a child’s name is not registered in one book, the Civil Registry, it is “as if” he or she does not exist; if a name of a person is placed in another book, that of San Gonzalo, that person is in mortal danger until they have their name removed (which erasure eliminates the supernatural threat).

Ideologies of Language and Literacy

As should be apparent from the foregoing discussion of religious and magical literacy practices, the existence of text-artifacts and their use in specific local settings is often overlain with a strong evaluative dimension, that is, an ideological layering. The view that a letter, a word, or a name or a passage in a book gives access to supernatural beings and powers is a belief, an idea about language, and an element in a framework for relating practices with language artifacts to concerns at once worldly and otherworldly. The idea that some given language or form of language has special powers is widespread, and the so-called world religions are well-known for their linguistic orthodoxies. Two cases illustrate the point. For Muslims, Classical Arabic is the only language in which the holy Koran may appropriately appear; for Roman Catholics, until recent decades, Latin was the only correct language for the liturgical service. Aspects of this religious belief in the divine nature of specific languages clearly recur in the nationalist equation of “a(n artifactualized and print-disseminated) language and a people,” an equation of essentialized language and political-cultural community, which has informed political struggles and education policy throughout the world (Anderson, 1991; Bauman and Briggs, 2003).

Sometimes the focus of ideological reasoning is not language per se but the particular script in which the divine is both evoked and represented. This is apparent in the Apache and Nigerian cases described by Basso and Probst. Such cases in turn reflect a more general state of affairs: that orthographies (systems of inscription) are never neutral phenomena. They are instead often the object of sharp controversy over the best (i.e., the most authentic or scientific) way to represent a given language. The varied conflicts and their constituencies, sites, and orthography-ideas are discussed in research on language revitalization movements in Europe, South Asia, and the Americas (see, for example, the papers in Fenigson, 2003). In a more abstract fashion, thinking about literacy has incorporated specific assumptions about orthographic development and social evolution. Goody and Olson, for example, put forth claims about an “alphabetic mind” as part of arguments about the supposed superiority of alphabetic over other writing systems. Such claims were quickly refuted by anthropologists and comparative psychologists (see Street, 1984, for a general review), but they reflect an old legacy in Enlightenment thinking about literacy and society. Rousseau, in his writing on the social contract, posited an equivalence between kinds of inscription and stages of society. The scheme went like this (see discussion in Collins and Blot, 2003: 166–167):

• Savagery (non-state societies) = Picture writing
• Barbarism (non-Western Asiatic states) = Hieroglyphic writing
• Civil Man (Western states and empires) = Alphabetic writing

This evolutionary hierarchy recapitulates other now-discredited hierarchies of culture, thought, and language, but with a focus on orthography rather than grammatical form.
In yet other cases, it is neither a given language nor a script that is central to the ideologizing of literacy practices; rather, it is the encounter between a superposed world language/religion and local beliefs about both language and the supernatural, resulting in novel construals of uses of artifact-language. This is the situation reported by Kulick and Stroud (1993), in which Gapun villagers apply pre-existing theories of hidden meaning in language to their readings of the Christian Bible, as they seek passages which will unlock the secrets of Cargo wealth. Such an encounter is also the object of Pulis’ (1999) study of Rastafarian citing-up, a reading practice which emphasizes playing with the sound of scripture in order to tap meanings hidden by the literal or denotational meanings of the text. In some cases, the holy language can be used for other supernatural purposes, which are organized in accordance with local practices. In Bledsoe and Robey’s discussion of the Mende, ritual specialists substitute “Muslim magic” for indigenous magic and sorcery: “A moriman uses his command over Arabic writing, which is widely regarded as the literal word of God, to obtain God’s assistance. He evokes a verse’s power by writing it on paper, rolled into a tight wad and tied with string or inserted in a small amulet pouch” (Bledsoe and Robey, 1993: 118). Note in this case the condensation of talismanic power and artifactualized language: the verse, written in “the literal word[s] of God,” is compacted into a tight wad and inserted into an amulet pouch.

A theme explicitly discussed in Wogan’s work on magical writing is that occult uses of literacy are often in counterpoint to the official organization of literate practices in the service of state functions. For the Ecuadorian state, the practice of registering people and land is part and parcel of being a modern society. Salasacan magic writing is, of course, viewed as a premodern superstition, albeit one that shares a belief in the life-defining efficacy of writing in books. The general point here – that literacy practices are intertwined with conceptions of modernity – has been widely discussed. An early and highly controversial claim by Goody was that (Western) literate traditions were causal forces in the creation of modern, democratic forms of government; this was echoed in Olson’s early general statements about literacy and modern science.
Their claims were effectively criticized, but as an ideological assumption, as separate from what we might call historical understanding, the equation of literacy with modernity in the West is widespread and pervasive (see Street, 1984; Collins and Blot, 2003: Chaps. 2 and 4, for critical discussions). The pertinent question, of course, is what is meant by ‘literacy’ or ‘modernity.’ Some recent linguistic anthropological work provides a nuanced approach to the issues. In Voices of modernity: language ideologies and the politics of inequality, Bauman and Briggs (2003) discuss the way in which major philosophical works, developing a conception of modern versus traditional society, depended on a highly selective viewing of language. A case in point is provided by John Locke’s essays on the foundations of knowledge and forms of government. In both kinds of essay, Locke treats language as a purely literal-referential communicative device, stripped of its interactional or contextual dimensions. (His view of language is thus like overly generalized conceptions of literacy, common enough in the 20th century, which equate an undifferentiated literacy with a singular modernity.) As Bauman and Briggs show, there are many intellectual traditions, ranging from Classical Empiricism through Antiquarianism, Folklore, and Boasian Anthropology, which have differently stipulated what is essential to ‘language,’ while giving priority to specific entextualizations of language, as part of their projects of establishing boundaries between the modern and the pre-modern.

Reporting from a very different part of the world, and treating a more modest historical scope, Schieffelin (2000) also grapples with the issue of literacy and modernity. In her article she analyzes how the Kaluli of highlands Papua New Guinea have responded to Christian missionary activity and the variety of literacy practices attendant upon such activity. Analyzing missionary grammars (structural linguistic descriptions) and literacy primers, she shows how descriptive and prescriptive documents systematically skew the form of Kaluli language while importing Anglophone notions of literacy and encouraging reading and writing activities of a church-appropriate type. Kaluli literates, for their part, bring a local slant to interaction and interpretation. They read in groups, rather than in isolation, and they interpret written language as speech, in accordance with Kaluli beliefs about what is primary: talk rather than text. These local appropriations notwithstanding, Kaluli of the 1980s, after some 50 years of missionary activity, have adopted a modern ideology of authoritative truth: it resides in the written text. The blending of imported categories of description and practice with indigenous ones is part of a local modernity that Schieffelin deftly characterizes. But the Kaluli are nonetheless drawn into what she calls “colonial and missionary intrusions . . . premised on the principle of asymmetry: domination, control, and conversion . . .” (Schieffelin, 2000: 321).
One general lesson from this work is that in thinking about literacy ideologies and long-term developments, such as the emergence of modernity, it is wise to seek out variation in and dialogic exchanges between the so-called modern and non-modern. It is also important to be aware of the frequent connection between literacy practices and forms of power and authority, to which we now give our attention.

Institutions, Language, and Social Structure

Investigating ideologies of literacy and of language more generally can help in understanding literacy practices as doubly-articulated phenomena, that is, as enacted in events while also informed by ideological overlays. Efforts to understand the relation between empirical events and abstract entities such as models must, however, also attend to literacy institutions and their authority with respect to language, language use, and social differentiation. Religion and education are two well-known examples of such institutions. As Goody (1986) and numerous others have noted, the spread of ancient civilizations, with their city states, extensive taxation, and conscript armies, typically also featured priestly castes and temple complexes, which were distinguished by their knowledge and control of an esoteric literacy dividing elites from masses. World religions are also famously ‘‘religions of the Book,’’ and struggles to control the book – its script, language, or interpretation – are struggles through which knowledge becomes power and power knowledge. Even in relatively modern times, the elite concern to regulate mass forms of literacy is evident. One of the first known national literacy campaigns, in 17th century Sweden, also featured national literacy tests. Sweden was a country with high literacy rates and surprising gender equality in those rates, and its literacy campaign featured household instruction and religious oversight: parish supervision and compulsory testing to ensure not just reading but orthodox, Protestant reading (Gee, 1996: 32–34). Besnier’s close analysis of Nukulaelae church services brings out the way in which religious institutions are enacted, how categories of person, knowledge, and text are combined in communicative events: e.g., the sermon. In such events, differences between the kinds of person are presumed, displayed, and thus reproduced: for example, between the Christian and un-Christian; and between men who write and women who read. A limiting case of sorts is suggested by dissenting, hybrid religious movements. The examples of Silas John and Oshitelu show that a prophet, his writing, and chosen readers, can mobilize charismatic authority, in claiming and performing divine access through scriptal practices, without substantive institutionalization of offices, personnel, or resources.
An important point revealed in all these cases, as in those discussed by Schieffelin (2000) and Wogan (2004), is that something significant occurs with the locating of authority, truth and correctness in text. Epistemologies and ontologies shift as well as what we might call social technologies of language. Artifactualized language is subject to different dynamics of accumulation and distribution than nonartifactualized language, with different potentials for ideological articulation and institutional consolidation. Ever since Socrates’s voice was transmuted into Plato’s writings, the school as well has been a paradigmatic literacy institution. Since the time of the Greek and Roman empires, the scola has been the primary site for drawing boundaries between the idealized uniformity and consistency of civilized

language and the hybridizing variety of barbarian tongues. It is a familiar story that nationalism feeds upon and promulgates the standardizing of language, a political impulse toward centralization and control given great efficacy by the institutions and agents of the nation-state and the colonial regime, accompanied, to be sure, by local interpretations and reworkings (see Gee, 1996; and Street, 1984 and 1993, for discussion of general mechanisms and cases from national, colonial, and postcolonial settings). There is considerable historical and ethnographic work showing that schooled literacy is a particular institutional regulation of appropriate practices of reading and writing in correct forms of language, a mode of contemporary power which is also inextricably connected to dynamics of subjectivity and identity formation (see Collins and Blot, 2003, Chaps. 4 and 5, for discussion and extensive bibliography). That schooled literacy in the United States, for example, reflects and helps reproduce patterns of race and class inequality is both a familiar claim and the proverbial tough nut to crack. The undeniable persistence of racialized class inequality in literacy attainment has fueled an ongoing sense of a literacy crisis in late 20th and early 21st century America, yet neither side in the acrimonious reading wars of the last decade has much compelling evidence that its pedagogical proposals to improve school literacy achievement will do that, let alone lessen general economic inequalities. The current plight is well characterized in Rogers’ (2003) ethnographic and critical discursive analysis of the panacea of family literacy, which focuses on one African-American family’s multigenerational struggles with schools, Special Education referrals, and the damaged identities that result from literacy failure. Rogers, an ardent believer in the value of literacy and education, nonetheless writes: . . .
research in New Literacy Studies has demonstrated that parents of working-class and minority children do value education and school and are involved with their children’s education. I want to suggest that it is precisely because of this belief and involvement that working-class and minority families’ efforts are thwarted within the institution (Rogers, 2003: 144, emphasis added).

One way of thinking about valuing education and encountering failure in school and the workplace, a way of moving beyond the current political opportunism of reading reform, is suggested by Brandt’s (2001) historical analysis of literacy traditions and practices in 20th-century America. Drawing on Bourdieu’s writings about social fields and her own historical material, Brandt argues for the concept of sponsors of literacy, providing more dynamic accounts of social institutions – of family, church, school, and the workplace – and their intertwined role in promoting and regulating literacy practices.

Taking Stock: New Literacy Studies, Linguistic Anthropology and Indexical Analysis

In the preceding I have discussed the New Literacy Studies’ formulation of literacy practices and some of the empirical work developing this conception. Interwoven with and as complement to that discussion, I have drawn on research on the history and anthropology of literacy, seeing this especially as part of a general sociocultural perspective on literacy practices. I have emphasized the artifactualization of language, uses of artifactualized language, especially in the domain of religion and the occult, the ideological articulations of such uses, and the significance of literacy institutions and their authority in order to discuss analytic and substantive issues relevant for understanding literacy practices understood as sociocultural phenomena. These are issues which are mentioned in various NLS works; indeed it is a truism of NLS work that literacy is ideological, but on my reading these issues are not given sufficient theoretical attention or empirical priority in their inter-relation. One way of approaching this argument is to note, again, that most work on literacy practices posits a two-way relation: between events and models. Within linguistic anthropology, the subfield of anthropology that takes language analysis as central to a more general project of sociocultural inquiry, much work on language function and ideology has either been explicitly cast as practice (e.g., Hanks, 1996) or has developed via theory and analysis of indexicals (e.g., Ochs, 1996; Silverstein, 1985). On my reading, this work shares a fundamental assumption: that overcoming the dichotomy of the linguistic/social requires analysis of a three-way relation: between language form (system), language use (acts, events), and language evaluation (an ideological or reflexive aspect of social process and communicative conduct).
There is now a small but growing body of work which uses analysis of indexicality to investigate the sociocultural processes in which literacy practices are situated and to which they contribute. I have discussed this briefly above, with regard to Bialostok’s work on models of reading and Shuman’s discussion of talk–text interactions among working-class adolescents. Two general points may be made regarding this work and indexicality more generally. First, analysis of indexicals entails a commitment to systematic investigation of language–context relations, with healthy concern for interpretive complexities. Second, it requires as well an awareness that many

language–context indexical relations are indirectly related to – they indirectly index – other sociocultural constructs and processes, which are themselves essential to the larger analysis. Briefly put, I think that indexical analysis offers a promising entrée to the interplay between microanalytic language use (events) and macroanalytic sociocultural constructs (including models). Let me turn to some examples. Silverstein (1996) is an influential discussion of the indexical underpinnings of language standardization in the contemporary USA, linking such standardization to general semiotic-economic processes, such as commodification, and ideologizing strategies, such as naturalization. More recent explorations of indexical dimensions of literacy and learning include attention to long-term interactional development of student identities; the indexing of unspoken models of appropriate literacy; the interplay of heterogeneous voices, pedagogical innovation, and globalized discourses of learning; and the assembly of knowledge and identity in dispersed, digitally mediated work settings (see studies in Wortham and Rymes, 2002). Blommaert (2005) provides a series of illustrative cases and analyses of literacy practices within orders of indexicality – authoritative frames for reading and writing events – which organize postcolonial and diasporic encounters between Africans and Europeans. Collins and Slembrouck (2004) develop a related line of inquiry via a case study of the reading practices encountered in situations of dense, migration-based multilingualism.

Conclusion

I have concluded with a brief discussion of work using indexical analysis to investigate literacy practices because it shares with research in the NLS framework a concern to grasp the simultaneous, ongoing interplay between the microphenomena of everyday life and the macrophenomena of institutional orders, unequal resources, and accumulated power and authority. But the work on indexicality holds more closely to what I take to be an important insight from linguistic anthropology: that understanding communicative practices, in our case, literacy practices seen as total social facts, requires attention to language form, use, and evaluation. Such a three-way analytic can lead from situated communicative doings to longer-term structuring processes, while remaining attentive to cultural dynamics. I think the foregoing discussion of religion, magic and literacy practices illustrates some of the issues involved. It must be borne in mind, however, that concern with literacy practices is not just an effort to better understand literacy, though it is that. The concept emerged along with the New Literacy Studies, which were themselves an effort to consolidate a critique against earlier models of literacy, in the academy and officialdom more generally, which privileged Western literate traditions and minds over other traditions and minds. In their concern to uncover and demonstrate the logics underlying mundane practices by the economically, politically, and culturally disfavored, NLS practitioners share with anthropologists a number of philosophical, analytical, and substantive concerns, and they share also, I would suggest, an ethical commitment to anthropology as cultural critique. In the urgency of their normative stance, they may focus overly much on the legacy of the school in shaping literacy practices (an indictment in which I would place myself in earlier textual manifestations), and in widening this focus, sociocultural perspectives are invaluable. But in their normative stance, NLS practitioners also remind academic anthropology that knowledge matters – in struggles as well as deliberations. Let this article be a contribution to and encouragement of continuing exchanges, on the common ground of literacy practices, between critical anthropology and new literacy studies.

See also: Influence of Literacy on Language Development; Politics, Ideology and Discourse; Reading and Multiliteracy; Religion and Literacy; Writing and Cognition.

Bibliography

Anderson B (1991). Imagined communities (2nd edn.). London: Verso.
Basso K (1990). ‘A Western Apache writing system.’ In Western Apache language and culture. Tucson: University of Arizona Press. 25–52.
Bauman R & Briggs C (2003). Voices of modernity. Cambridge: Cambridge University Press.
Besnier N (1995). Literacy, emotion, and authority. Cambridge: Cambridge University Press.
Bialostok S (2002). ‘Metaphors for literacy: A cultural model of white, middle-class parents.’ Linguistics and Education 13, 347–371.
Bledsoe C & Robey K (1993). ‘Arabic literacy and secrecy among the Mende of Sierra Leone.’ In Street B (ed.) Crosscultural approaches to literacy. Cambridge: Cambridge University Press. 110–134.
Blommaert J (2005). Discourse. Cambridge: Cambridge University Press.
Brandt D (2001). Literacy in American lives. Cambridge: Cambridge University Press.
Burke P (1988). ‘The uses of literacy in early modern Italy.’ In Burke P & Porter R (eds.) The social history of language. Cambridge: Cambridge University Press. 21–42.
Collins J & Blot R (2003). Literacy and literacies. Cambridge: Cambridge University Press.
Collins J & Slembrouck S (2004). ‘Reading shop windows: Multilingual literacy practices and indexical orders in globalized neighborhoods.’ Working Papers on Language, Power and Identity no. 21. http://bank.bug.be/ lp1.
Crosby A (1997). The measure of reality. Cambridge: Cambridge University Press.
Eisenstein E (1968). ‘Some conjectures about the impact of printing on western society and thought.’ Journal of Modern History 40, 1–57.
Fenigsen J (ed.) (2003). Misrecognition, linguistic awareness, and linguistic ideologies. Special issue of Pragmatics 13.4.
Gee J (1996). Social linguistics and literacies (2nd edn.). London: Taylor & Francis.
Goody J (1986). The logic of writing and the organization of society. Cambridge: Cambridge University Press.
Hanks W (1996). Language and communicative practice. Boulder: Westview.
Heath S (1983). Ways with words. Cambridge: Cambridge University Press.
Kulick D & Stroud C (1993). ‘Conceptions and uses of literacy in a Papua New Guinean village.’ In Street B (ed.) Crosscultural approaches to literacy. Cambridge: Cambridge University Press. 30–61.
Ochs E (1996). ‘Linguistic resources for socializing humanity.’ In Gumperz J & Levinson S (eds.) Rethinking linguistic relativity. Cambridge: Cambridge University Press. 407–437.
Olson D (1994). The world on paper. Cambridge: Cambridge University Press.
Ortner S (1984). ‘Theory in anthropology since the sixties.’ Comparative Studies in Society and History 26, 126–166.
Probst P (1993). ‘The letter and the spirit.’ In Street B (ed.) Crosscultural approaches to literacy. Cambridge: Cambridge University Press. 198–220.
Pulis J (1999). ‘Citing[sighting]-up.’ In Pulis J (ed.) Religion, diaspora, and cultural identity. Newark, NJ: Gordon and Breach. 357–401.
Rogers R (2003). A critical discourse analysis of family literacy practices. Hillsdale, NJ: Lawrence Erlbaum Associates.
Schieffelin B (2000). ‘Introducing Kaluli literacy.’ In Kroskrity P (ed.) Regimes of language. Santa Fe, NM: School of American Research Press. 293–328.
Shuman A (1986). Storytelling rights. Cambridge: Cambridge University Press.
Silverstein M (1985). ‘Language and the culture of gender.’ In Mertz E & Parmentier R (eds.) Semiotic mediation. Orlando: Academic Press. 219–259.
Silverstein M (1996). ‘Monoglot ‘‘standard’’ in America.’ In Brenneis D & Macaulay R (eds.) The matrix of language. Boulder: Westview. 284–306.
Street B (1984). Literacy in theory and practice. Cambridge: Cambridge University Press.
Street B (ed.) (1993). Crosscultural approaches to literacy. Cambridge: Cambridge University Press.
Warner M (1990). The letters of the republic. Cambridge, MA: Harvard University Press.
Warschauer M (1999). Electronic literacies. Hillsdale, NJ: Lawrence Erlbaum Associates.
Wogan P (2004). Magical writing in Salasaca. Boulder: Westview.
Wortham S & Rymes B (eds.) (2002). Linguistic anthropology of education. Westport, CT: Praeger.

Literary Pragmatics
J L Mey, University of Southern Denmark, Odense, Denmark
© 2006 Elsevier Ltd. All rights reserved.

The efforts to relate the study of literary texts to the science of language have mainly taken two directions. In an earlier, quantitative approach, the emphasis was on how to characterize a text, based on the occurrence of certain elements (substantives, adjectives, and so on). The methods employed were mainly statistical. Later approaches focused on the qualitative aspect: how to account for certain characteristics (grammatical constructions, literary tropes, etc.) thought to be specific to types of text or, more generally, how to characterize a stretch of text in relation to what used to be called ‘extralinguistic’ purposes: persuading, arguing, questioning, etc. – matters that used to be taken care of in the discipline called rhetoric (see Rhetoric, Classical).

When talking about literary pragmatics, it is important to realize that pragmatics was not developed from inside linguistics but, rather, originated in related fields, such as philosophy, sociology, and the theory of interaction. The linguists were initially alerted to pragmatics through the aporias that arose within their strictly limited field of view (see Pragmatics: Overview). Similarly, literary studies did not ‘discover’ pragmatics by themselves; the approaches to the study of texts as related to the human users were developed simultaneously by philosophers, literary theoreticians, and pragmaticists. The purpose of these endeavors was to make sense of the fact that language does not always obey the strictures of the grammarians; indeed, the linguistic structure of a text has, in many cases, very little to do with the effects of a text, with what it ‘does,’ or with how a text is produced and consumed by the users. Applying the definition of pragmatics as ‘‘the study of the use of language in human communication, as determined by the conditions of society’’ (Mey, 2001: 6) to the case of literary communication, we may say that literary pragmatics is concerned with the user’s role in the societal production and consumption of texts.

The Linguistics of Texts

Early approaches to literary pragmatics (what used to be called ‘text linguistics’) took their point of departure in certain concepts that had been developed in linguistics mainly in order to cope with the needs of grammatical description. A text was thought of as a hierarchically structured complex of sentences, just as the sentence itself was considered a hierarchically structured unit of ‘phrases’ (noun phrases, verb phrases, etc.). A ‘text grammar’ was proposed in parallel to the sentential grammar developed by Chomsky and his school (see, e.g., van Dijk, 1972). Later efforts were spurred on by the achievements of philosophers and pragmaticists such as Austin and Searle, who developed a theory of ‘speech acts’ – that is, utterances that did something in addition to being merely ‘uttered’: the ‘illocutionary’ vs. the simple ‘locutionary’ aspect (see Speech Acts). The idea that utterances had a ‘performative’ and not just a ‘constative’ value (originally due to Austin, 1962) gave rise to a classification of speech acts into categories such as assertions, questions, orders, and apologies. Following on this, some suggested that we could look at a text as a gigantic ‘macro-speech act’ that had under it all sorts of individual acts, each expressed in some hierarchically ordered, linguistically recognizable form. This approach mostly failed, on several counts. First, it was not easy to explain what exactly the relationships were between the different acts making up the text. There was also the problem of the hierarchical constraints, which seemed unnatural, given the essentially linear, and often unpredictable, nature of textual ‘speech acting.’ However, the idea that sentences may be conjoined in deeper ways than just being strung together on the surface (‘concatenated’ in Chomskyan parlance) had taken hold.
Similarly, the notion that language ‘performed,’ did something, led to the valuable discovery that certain sentences were ‘doable’ or ‘speakable,’ and others were not (Banfield, 1982). Perhaps the most important insight into the nature of human language use as ‘doing things with words’ (Austin, 1962) was due, again, not to a linguist but to a philosopher, H. Paul Grice, who took the ideas developed by both Austin and Searle (1969) several steps further. In order to explain the regularity of human conversations and their mostly successful outcomes, Grice (1989: 26–31) postulated the principle

of cooperation, stipulating that every person’s contribution to a conversation should be commensurate to its purpose (including the aims and motives of the participants) (see Cooperative Principle and Grice, Herbert Paul (1913–1988)). Grice further suggested four conversational maxims regulating our cooperative handling of conversational information: making it sufficient (the maxim of ‘quantity’), true (‘quality’), relevant (‘relation’), and orderly (‘manner’). Any infringement of these maxims should be interpreted (assuming the general idea of conversation as cooperation) as implying an additional meaning: Breaking (‘flouting’) a maxim imparts a message that although not explicitly mentioned, nevertheless is understood. Grice’s famous example is that of a professor writing a recommendation for a student, consisting of a statement as to the student’s attendance in class and his correct English spelling – clearly irrelevant matters in the context – and thus implying that there must be a reason for the professor’s unwillingness to cooperate, viz. that the student does not deserve a ‘real’ recommendation (Grice, 1989: 30) (see Maxims and Flouting). Grice’s notion of ‘implicature,’ as it is called, is of the utmost importance with regard to explaining how authors and readers go about ‘cocreating’ the literary work, as discussed later (see Implicature). It has often been remarked that the essence of art, visual or literary, is in the way we omit things, rather than saying them outright or representing them pictorially. When a queen utters that she is not amused, we understand perfectly well what she is, and we hide to escape her wrath. 
When God asks Cain, ‘Where is Abel thy brother?’ Cain realizes (since God of course knows exactly where Abel is and who killed him) that this implies the question, ‘Why did you do this to your brother?’ However, Cain refuses to cooperate in this conversation and tries to deny the implicature that he should have been ‘his brother’s keeper.’ But in the next sentence, having ‘answered’ God’s question with the pseudo-cooperative utterance ‘I know not,’ he shows that he indeed has gotten the implied message: ‘Am I my brother’s keeper?’ (Gen. 4:9). In the following sections, I briefly indicate the particular instances in which pragmatics meets with various linguistic and literary approaches, as they are dealt with in the remainder of this article. The problems of text production and consumption are discussed first. Next, a brief discussion is provided of a concept that has gained increasing attention in the discussions during the past few decades: interactivity, understood as the active (specifically Gricean) collaboration between reader and author. Such a collaboration involves more than plain interaction, however: the relationship between the active partners is

‘dialectic,’ which means that neither the author nor the reader can claim exclusive possession of, or authority over, the text. Following this, I characterize this relationship as a ‘cocreative’ one: a text is not the exclusive work of the individual author but always presupposes the active collaboration of an audience, the readers. How this cocreation is orchestrated technically is the subject of the following sections, in which first some classical linguistic techniques of ‘signposting’ are discussed, and then the important pragmatic concept of ‘voice’ is introduced as that which brings the text to life. In particular, the ‘point of view’ embodied in the voice of a character helps us find our way through the textual maze; this ‘vocalization’ relies not only on signposting by means of linguistic devices but also on the techniques of cooperation and implicature (including flouting) that were mentioned previously. When the proper conditions for use of these techniques are not met, we may experience a phenomenon that I have termed ‘voice clashing’: voices speaking out of order, either by an author’s fiat or (more often) by authorial negligence. The final section wraps up the discussion by stressing the human engagement that is the necessary condition for successful text work.

Production and Consumption

The following is a common model of production and consumption in our society: a producer delivers a product to the market, and the consumer pays the market price, acquires the product, and starts consuming it. After the transaction is concluded, producer and consumer part ways, never to meet again unless in special cases (foreseen by the laws regulating trade in our society), such as inferior product quality or unsatisfactory handling of the financial aspects of the purchase. This relationship is purely linear and unidirectional; the deal, once consummated, cannot be reversed (barring special cases such as return or repossession of the product). One may be tempted to apply this simple model to the production and consumption of literary works: The author is the producer of some literary text, whereas the reader is a consumer who happens to be ‘in the market’ for a particular literary product. Once the book is bought, the reader is free to do whatever he or she wants to do with it: take it home and place it on the shelf in the living room, possibly read it, or maybe even throw it in the trash – or at somebody, literally or metaphorically. In reality, things happen not quite like that. Buying a book is not like acquiring a piece of kitchenware or furniture. One does not just bring a book back from the bookstore: one takes home an author, inviting him or her into the privacy of one’s quarters. The author, on the other hand, does not just make a living, producing reams of printed paper (granted, there are those that do), but has a message for the reader as a person. This is, eventually, why books are bought and sold: not because they are indispensable for our material existence, but because they represent a personal communication from an author to a potential readership – a communication that, in order to be successful, will have to follow certain rules.

Authors and Readers

The process of writing has been likened to a technique of seduction: a writer takes the readers by their hands, separates them from the drudgery of everyday life, and introduces them to a new world, of which the writer is the creator and main ‘authority’ (Mey, 1994: 162, 2000: 109). The readers will have to accept this seductive move and follow the author into the labyrinth of the latter’s choice in order to participate properly in the literary exercise. The readers take the narrative relay out of the hands of the author: ‘The author is dead, long live the reader,’ to vary Barthes (1977). Marie-Laure Ryan (2001) envisioned this reader participation along a twofold dimension: that of interactivity (in which the reader manipulates the text) and that of immersion (where the reader seamlessly identifies him- or herself with the text). In immersion mode, the reader is not just a spectator on the virtual scene: The ‘role of the reader’ is that of an ‘‘active participant in the process of creating the fictional space’’ (Mey, 1994: 155). As discussed later, the immersed reader is a ‘voice’ in the text; he or she is not only ‘present at the creation’ of the text but also to some extent its ‘creator’ (Barthes, 1977). In literary texts in particular, the success of the story depends not only on the author but also to a high degree on the reader. In the process of creating the text, the reader is created anew, reborn in the text’s image. This interactivity does not just happen on the level of the text: it involves a deeper layer, that of the self. 
What is created is not only the fictional space but also the reader in it, ‘lector in fabula.’ ‘This book changed my life’ is therefore not just a trite expression we employ to register an exceptional reading experience; such changes happen whenever we consume texts (including nonliterary genres, such as scientific and commercial prose, legal texts, etc., and ‘texts’ in a wider understanding of the term – theatrical and movie productions, the visual arts, and so on; Mey, 1994: 155). Updating our view on texts, we may even include here the virtual realities of the computerized world and its texts, as discussed by authors such as Gorayska and Mey (1996), Ryan (2001), and others.

Text Dialectics

A dialectic situation of interaction occurs when the interacting parties influence each other in such a way that the outcome of what one party does determines the other’s ability to operate. In language use, the pragmatics of interaction determines the game, whose very name is dialectics. When speaking or writing, we are always engaged in some communication (informing our partner about some event, apologizing for inflicted injuries or insults, promising services, telling a story, etc.). In this activity, we crucially depend on the other’s presence and cooperation not only for the legitimacy of our speech acts but also for their very viability. Conversely, our interactors depend crucially on who we think they are and on whether they think of us as good partners in interaction. (After all, one can only tell a story properly to listeners whose interests one shares or imagines one does.) The way we see ourselves and our partners, and how they see themselves and us, is essential to this dialectic process. The following section details how such representations come about and are managed.

Cocreativity

Pragmatically speaking, a text is the result of what Bakhtin (1994: 107) called “the meeting of two subjects.” The life of the text “always develops on the boundary between two consciousnesses, two subjects” (Bakhtin, 1994: 106; italics in original), the two consciousnesses being the author’s and the reader’s. The author is by definition conscious of his or her role in creating the world of letters, the ‘fictional space,’ mentioned previously. However, the reader’s consciousness is just as essential in cocreating this fictional universe. For Bakhtin, the reader is the (co-)creator of the text: It is in the dialogue between author and reader that the text, as a dialectic creation, emerges (see Bakhtin, Mikhail Mikhailovich (1895–1975)).

How do author and reader, these two ‘consciousnesses,’ navigate the fictional space? For a reader, it is not enough to identify with the author passively; the reader must consciously adopt the cocreator role, as it is assigned by the textual dialectics. Conversely, the author must consciously alert the reader to the signposts and other ‘indexes’ placed in the fictional space to enable the navigation process. In some older novels, mainly those written in the 18th and 19th centuries, the author often appears on the scene in person, apostrophizing the reader and telling him or her what to do, what to feel, what not to object to, which ‘disbeliefs to willfully suspend,’ and so on. The 19th-century British writer Anthony Trollope was a master of this ‘persuasion-cum-connivance,’ as when he told the readership that he was unable to expatiate more on certain characters of his story: the publisher, a Mr Longman, would not allow him a fourth volume, so he had to finish the third and last of the Barchester novels at page 477 – and, well, since we are already at page 396 . . . (“Oh, that Mr Longman would allow me a fourth!” Trollope, 1857/1994: 306). The curious and eagerly cocreative reader hurries to the last page of the novel to find that its number is indeed ‘477,’ just as the author had predicted. We cannot exclude the possibility that the reader may feel a bit taken in: the cocreative is morphing into the gullible. Cases such as these are the exception, and readers will normally do no more than smile at discovering their complicity in what is commonly understood as an authorial prank.

In other cases, the cocreativity that is needed to make the literary enterprise succeed, although less obvious, is (perhaps for that reason) considerably more effective. Notorious instances of successful ‘reader deception’ are found in the Argentine writer Julio Cortázar’s work, as in the novella ‘Historia con migalas’ (‘A story of spiders’; 1985). Here, the author consciously leads the reader down a ‘garden path’ of narration, along which the two female protagonists by default are assumed to be a male–female couple. Only in the story’s very last sentence do they literally remove their morphological protection, along with their seductive veils (see Mey, 1992; the trick is pulled off successfully only in the Spanish original). In the Cortázar story, reader seduction (involving the cocreation of a manipulated consciousness) is achieved without the reader’s awareness – a typical requirement of certain literary genres, such as the joke or, as in this case, the garden path story. In other (more normal?) cases, readers are guided through the fictional labyrinth by certain indications as to where the narrative ‘thread’ is leading them, which readerly pitfalls they have to avoid, where to proceed with caution or alternatively with boldness, and so on.

Signposting

In the Cortázar story, the reader is led astray by the (mis)use of linguistic means. It is as if the author had changed or removed the usual road signs – a technique one could call ‘deceptive signposting.’ The question is what these signposting techniques are, and how they are normally used to bring coherence to a text when no garden paths or jokes are involved. First, we have the time-honored concepts of ‘deixis,’ such as when articles and (personal) pronouns are employed to ‘point to’ a particular character. In particular, in anaphora, the relative pronoun is used to ‘point back’ (or, in cataphora, ‘forward’) to a character or object that is already mentioned, or is going to be mentioned, in the story. The common concept covering these phenomena is called ‘(phoric) reference’ (see Deixis and Anaphora: Pragmatic Approaches). In addition, we have means to indicate the time and place relations that are of importance in order to establish and promote the flow of the story. Time adverbs such as ‘today’ and place adverbs such as ‘abroad’ tell us when, and where, the story is taking place. Also, we have sentence adverbs that give a particular flavor to a larger stretch of discourse, sometimes even an entire paragraph (e.g., ‘regularly,’ ‘unfortunately,’ and ‘clearly,’ especially when placed sentence-initially) (see Parentheticals). As was the case in the Cortázar story, much of the correct understanding of a story is imparted through the use of gender-marked items, such as ‘she’ vs. ‘he,’ or by exploiting the difference between a male and a female form of, for example, an adjective. The latter technique is not always applicable in English; in the Cortázar case, the ‘dénouement’ comes when the unsuspecting reader finally is confronted with an unequivocal, female adjectival form: desnudas ‘naked’ (a Spanish fem. plur. since it refers to women; in the English translation, this point gets lost and the garden path leads nowhere). In addition to these linguistic techniques, pragmatics offers the reader great help.
There is Gricean implicature, mentioned previously; furthermore, the author has at his or her disposal various ways of representing speech or thought, either by directly quoting a character’s utterance (literally putting words in his or her mouth) or by indirectly reproducing what the character is thinking to him- or herself in ‘free indirect discourse,’ as in the following quote from Jane Austen (1810/1947: 191): ‘‘And now – what had she done or what had she omitted to do, to merit such a change?’’ Here, ‘she’ (Catherine, the novel’s heroine) is musing about her sudden change in fate (owing to the fact that, unbeknownst to her, the father of her lover has discovered that she is no rich prospect after all); however, we are never told explicitly who this ‘she’ is: being competent, cocreative readers, we will know. Characters are given ‘voices’ that we clearly and distinctly perceive as the characters’ and theirs alone. This notion (including the phenomenon of vocalization) is discussed next.

The Voices of the Text

Vocalization

‘Vocalization’ is a powerful way of creating and maintaining the fictional space with the willing help and indispensable assistance from the readership, and of ‘orchestrating’ the dialectics of cocreativity between author and reader. Taken by itself, the term may be translated as ‘giving a voice,’ ‘making vocal’ (or ‘heard,’ depending on the perspective). In the context of literary pragmatics, vocalization means ‘giving a voice to a character in the story’ – in other words, making the character speak. We are more or less familiar with the phenomenon from the simple fact of narrative dialogue. Whenever a conversation is included in the story, we hear the voices of the characters discussing current events or other matters of interest, such as how many kinds of love there are (compare Kitty and Anna’s conversation in Tolstoy’s Anna Karenina; 1889/1962: 155), or the advantages of married life as opposed to the single gentleman officer’s existence, as enthusiastically described by General Serpuchovskoy to Vronsky – how he got his hands freed when marriage lifted the ‘fardeau’ of everyday worries onto his shoulders (Tolstoy, 1889/1962: 350). In situations such as these, the attribution of voices is done in a straightforward manner, more or less as it happens in a play: the lines are put into the mouths of the characters, given voice through the unique assignment of a familiar role name, and are often preceded by what is called a ‘parenthetical,’ such as ‘he said,’ ‘she laughed,’ and ‘he cried’ (see Parentheticals).

Voice and Focus

Vocalization is an intricate process, inasmuch as it not only gives voice to a character in the strict sense of speaking one’s part, but also affords information about the character’s perspective or point of view. What the voice indicates is not just the character as such (by naming the person) but also the viewpoint from which the character sees the other characters, and indeed the world. In this wider sense, voices range over the entire fictional space they create: “Utterances belong to their speakers (or writers) only in the least interesting, purely physiological sense; but as successful communication, they always belong to (at least) two people, the speaker and his or her listener” (Morson and Emerson, 1990: 129). Vocalization always implies ‘focalization,’ a focusing on the characters’ placement in the literary universe (Mey, 2000: 148). In Bal’s (1985: 100) words, focalization is “the relation between the elements presented and the vision through which they are presented.” This vision and these relations are not open to direct inspection by the reader’s naked eye, inasmuch as they are necessarily mediated through the voice of the author; consequently, the reader may have trouble focalizing them properly.

The Pragmatics of Voice

In the absence of obvious signposts, such as names and parentheticals attached to the ‘physiological utterance’ (especially when we are dealing with an unspoken thought or an ‘unspeakable sentence’), we may be unsure whose voice we are hearing. This is where pragmatics comes to the rescue. In order to be speakable, a sentence, Banfield (1982) noted, must have a ‘speaking subject’ – not just a sentential subject, but one authoring the utterance that is placed in a context in which certain utterances are speakable by certain persons. Successful vocalization at the author end is matched by the reader’s successful revocalizing: the reader cocreates the part of the fictional universe in which the utterance is spoken and attributes the voices univocally to the focalizing characters, including the speaking subject.

When Voices Clash

Voices may sound in harmony, or they may clash. A voice that is not in accordance with what we, as readers, know about the speaking character will jar, not sound right; we do not feel it is the voice of the character (but perhaps the voice of the intrusive narrator trying to disguise him- or herself as a character or even, as in the case of Trollope, as the author). Other clashes are often referred to as ‘poetic license,’ such as when animals are attributed vocalizations that are not in keeping with their animal status. In Anna Karenina, we encounter quoted thoughts ascribed to the bird-dog Laska, who is irritated at Levin and his brother because they keep chatting while the birds fly by, one after the other, without the hunters so much as bothering to point their guns at them: ‘‘‘Look how they have time to make conversation – she thought – And the birds are coming . . . . In fact, here comes one. They’re going to bungle it . . .’’ Laska thought’ (Tolstoy 1889/1962: 185). In other cases, the reader is confused, such as when voices speak ‘out of order,’ having access to material that is strictly not accessible to the characters, given their background, or even false (Mey, 2000: Chap. 6). Such ‘clashes’ may even be caused intentionally, for example, in order to obtain a comic effect by letting characters adopt modes of speech that are not commensurate with the speech proper to the events or characters (such as when a director purposefully introduces modern colloquialisms and slang into a Shakespearian play).

Conclusion: The Role of Pragmatics

The user has been the guideline in our reflections on the ways readers and authors participate in the common endeavor of creating a literary text. The dialogue we engage in as authors and readers is a dialogue of users; the ‘dialectics of dialogue’ has been invoked to explain the users’ cocreative roles, as authors and readers, in establishing the textual object (e.g., a story). However, dialogue does not happen in a vacuum; it is a dialogue of social forces perceived not only in their static coexistence, but also as a dialogue of different times, epochs and days, a dialogue that is forever dying, living, being born: Coexistence and becoming are fused into an indissoluble, concrete multi-speeched [italics added] unity. (Bakhtin, 1992: 365)

The voices of the text are anchored in the plurality of discourse, in a ‘multispeeched’ mode; this multivocality represents the dialectic relations between different societal forces (see Discourse, Foucauldian Approach). If it is true that texts only come into existence as human texts through an actual engagement by a human user (as already stated by Roman Ingarden in 1931), then a pragmatic view of text, particularly literary text, is anchored in this user engagement. Conversely, the user is engaged only insofar as he or she is able to follow, and recreate, the text supplied by the author. Among the voices of the text, the reader, too, has one; this vocalization is subject to the same societal conditions that surround the author. The textual dialogue thus presupposes a wider context than that provided by the actual text. As we have seen, pragmatics offers a view on this wider, social context and explains how it interacts with author, texts, and readers.

See also: Bakhtin, Mikhail Mikhailovich (1895–1975); Cooperative Principle; Deixis and Anaphora: Pragmatic Approaches; Discourse, Foucauldian Approach; Grice, Herbert Paul (1913–1988); Implicature; Maxims and Flouting; Parentheticals; Pragmatics: Overview; Rhetoric, Classical; Speech Acts.

Bibliography

Austen J (1947). Northanger Abbey. London: Wingate. (Original work published 1810).
Austin J L (1962). How to do things with words. Oxford: Oxford University Press.
Bakhtin M M (1992). ‘Discourse in the novel.’ In Holquist M (ed.) The dialogic imagination: four essays by M. M. Bakhtin. Austin: University of Texas Press. 259–422.
Bakhtin M M (1994). Speech genres and other late essays. McGee V (trans.); Emerson C & Holquist M (eds.). Austin: University of Texas Press.
Bal M (1985). Narratology: introduction to the theory of narrative. Toronto: University of Toronto Press.
Banfield A (1982). Unspeakable sentences. Boston: Routledge.
Barthes R (1977). Image Music Text. London: Fontana.
Cortázar J (1985). ‘Historia con migalas.’ In Queremos tanto a Glenda y otros relatos. Madrid: Ediciones Alfaguara. 29–44.
Gorayska B & Mey J L (1996a). ‘Of minds and men.’ In Gorayska & Mey (eds.). 1–24.
Gorayska B & Mey J L (eds.) (1996b). Cognitive technology: in search of a humane interface. Advances in Psychology, vol. 113. Amsterdam: North Holland-Elsevier.
Grice H P (1989). Studies in the way of words. Cambridge, MA: Harvard University Press.
Ingarden R (1973). The literary work of art. Grabowicz G G (trans.). Evanston, IL: Northwestern University Press. (Original work published 1931).
Mey J L (1992). ‘Pragmatic gardens and their magic.’ Poetics 20(2), 233–245.
Mey J L (1994). ‘Edifying Archie or: how to fool the reader.’ In Parret H (ed.) Pretending to communicate. Berlin: de Gruyter. 154–172.
Mey J L (2000). When voices clash: a study in literary pragmatics. Berlin: Mouton de Gruyter.
Mey J L (2001). Pragmatics: an introduction (2nd edn.). Boston: Blackwell.
Morson G S & Emerson C (1990). Mikhail Bakhtin: The creation of a prosaics. Stanford, CA: Stanford University Press.
Ryan M-L (2001). Narrative as virtual reality: immersion and interactivity in literature and electronic media. Baltimore: Johns Hopkins University Press.
Searle J R (1969). Speech acts: an essay in the philosophy of language. Cambridge, UK: Cambridge University Press.
Tolstoy L N (1962). Anna Karenina (2 vols). Moscow: Izdatel’stvo Pravda. (Original work published 1889).
Trollope A (1994). Barchester Towers. Harmondsworth: Penguin. (Original work published 1857).
Van Dijk T (1972). Some aspects of text grammars. The Hague, The Netherlands: Mouton.

Literary Theory and Stylistics
K Green, Sheffield Hallam University, Sheffield, UK
© 2006 Elsevier Ltd. All rights reserved.

Stylistics and literary theory would appear to coincide neatly in a concern for the language of the given object, literature. In one model, stylistics can be seen as a hyponym to the superordinate theory of literature, that aspect of theory that explicitly deals with locating and describing the particular language of literature. Yet stylistics is often marginalized as a literary-critical approach and has not enjoyed an easy relationship with many of the dominant trends in literary theory. Indeed, the term stylistics has suggested to some critics a rather marginal activity having to do with the niceties of an author’s style; but it clearly is much more than that. In a broad historical context, stylistics has its roots in the elocutio of Aristotelian rhetorical studies. Crucially surviving from the rhetorical tradition, elocutio dealt with the appropriateness of the expression and the relevance of its stylistic choices. Thus, it is a focus on the language of the expression in a particular way to do with how the text means, rather than what it means. Here are the roots not only of stylistics, but also of formalist and structuralist analyses of the 20th century. Literary theory, in contrast, is a broad area of investigation covering linguistic, social, cultural, political, economic, hermeneutic, empirical, and

psychological issues surrounding the production and consumption of literary texts. Before the 20th century, the notion of style was a secondary product of the analysis of grammar and rhetoric. In the 20th century, it developed from elocutio to encompass almost anything to do with an explicit focus on the ‘language’ of the (normally literary) text; that is, its particular and manifest phonological, lexical, and grammatical features. Stylistics arose partly because of the need in literary criticism to work with a set of agreed-upon and defined terms for the analysis and description of a particular kind of language, the language of literature. Such a language would not be wholly derived from the study of rhetoric (though it would take from it where it felt it was necessary) but would be built upon modern linguistic analysis. Stylistics in the main has concentrated on literary texts (despite the offshoot and somewhat tautologous sounding ‘literary stylistics’), although the concept of style extends beyond the literary. With its focus on the language of the text, largely (but not exclusively) treated in a synchronic manner, stylistics has obvious affinities with a certain kind of literary criticism, particularly what came to be known as ‘practical’ criticism. This critical approach, growing out of the work of I. A. Richards in the 1920s, focused on the mechanics of the text, its immanent language features, and gave the pedagogy of English literature an enormous boost by freeing it from the


contextual constraints of history and biography and treating the text on its own terms. It was thus an important precursor of Anglo-American New Criticism. The text’s ‘terms’ were linguistic, both in detail and in use as metaphor. Stylistics is not to be confused with the New Criticism of the 1950s and 1960s, but it does share some methodological axioms, and the two approaches overlap in the 1960s. Modern stylistic analysis really began with the work of Charles Bally (1909) and Leo Spitzer (1948) in the first half of the 20th century. Bally’s stylistique saw literary texts as examples of particular language use, and his work prefigures later conceptions of register and mode (particularly in the work of M. A. K. Halliday and David Crystal). Bally was a pupil of Ferdinand de Saussure. Spitzer, however, saw ‘style’ as not so much a particular form of linguistic ‘character’ but as a manifestation of social or historical ‘expression.’ Spitzer’s work is essentially the cultural interpretation of style, and he suggests another important line of stylistic investigation (be it literary or nonliterary). His work highlights an ongoing tension in stylistics between formal description of linguistic features and broader cultural, psychological, or historical or interpretative aspects of the text. Put simply, the goal of stylistics is to describe the linguistic features of the object of analysis in such a way as to demonstrate their ‘‘functional significance for the interpretation of the text’’ (Wales, 2001: 438). Its relationship to literary theory is a complex one. In the 1960s, linguistics became the focus both of a certain movement in literary theory and of critical approaches that grew out of the close-reading methods of I. A. Richards and others. The way in which linguistics was used by literary theory and stylistics, however, was radically different. 
Stylistics in its anglicized form has tended to eschew the philosophical complexities and self-reflexive obsessions of literary theory and to use the practical methods of linguistics wherever it sees them as being relevant to a particular reading. Linguistic theory and practice is very much the handmaiden of stylistics, which eclectically draws on what are deemed appropriate terminology and discourse. The dominant linguistic theory throughout the 1960s and 1970s was undoubtedly that of Noam Chomsky and his followers, yet strikingly, stylistics almost ignored his work during this time. The Chomskyan influence, where it exists, is seen most clearly in the work of Ohmann (1964) and Levin (1962). Here, style is seen as a transformational option selected from a core form. Despite the central notion of ‘transformations,’ which could easily be related to notions of style (and is presented convincingly by Ohmann and Levin), Chomsky’s work was seen as too abstract for the text-driven concerns of the stylistician.

Stylistics found a more sympathetic grammar in the work of M. A. K. Halliday, whose systemic grammar allowed stylisticians to link grammatical description with literary effect in a way that was more transparent than in the Chomskyan paradigm. Halliday’s functional approach (Halliday called it systemic-functional) again uses the notion of language choices, but sees them as determined by the uses or functions that they are seen to serve. Language is thus socially constituted and meaning cannot be formulated in isolation (in opposition to the formalist approach). Literary stylisticians approved of this view because it allowed the notion of literary ‘style’ to be part of a larger, sociolinguistic network. In the 1970s and 1980s, stylistics drew on a number of different methodologies and approaches, most notably the Hallidayan functional model, feminism, pragmatics, and discourse analysis, these approaches being dominant and extremely influential. Indeed the most obvious and striking development in stylistics in the last 20 years has been its recognition of the notion of ‘discourse’ – although this was not especially new and had a precursor in Crystal and Davy (1969). There are some parallels in literary theory in that following the radical movements of structuralism and deconstruction in the 1960s and 1970s, a more historicist approach began to be seen. This was not simply an opposition to the formalisms of, for instance, deconstructive methodologies, for the New Historicism often incorporated deconstructive practice (particularly in the work of Stephen Greenblatt). Rather, it was a widening of the linguistic context and a breaking down of barriers between literary and nonliterary writing. Perhaps the most pervasive influence on stylistic analysis in recent years, then, has been that of discourse analysis, and there are again parallels with developments in literary theory.
Discourse analysis can signal many things, from the most text-centered description of language fragments to broad, Foucauldian speculations about power and language. Literary theory has drawn more on the work of Foucault than almost any other theorist of ‘discourse,’ and yet Anglo-American stylisticians remain suspicious of his writing and treat it with a degree of scepticism. What unites the Foucauldian approach with those such as the Hallidayan, however, is a belief in the contextual nature of language and a focus on language ‘beyond the sentence.’ Further, there is a belief, not always made manifest, that the (discourse) stylistician should be looking at ‘real’ language; that is, ‘naturally occurring’ discourse. Of course, literature provides us with a ready-made supply of ‘real’ discourse, although many discourse analysts reject this corpus, insisting that it is ‘deviant’ language.

However, the approach has been fruitful in the field of stylistics. Ronald Carter’s influential collection of stylistics essays Language and Literature (1982) was followed by the volume Language, Discourse and Literature: An Introductory Reader in Discourse Stylistics (1989). The shift in focus is clearly registered in essay topics: the first volume is dominated by grammatical considerations, despite the inclusion of Deirdre Burton’s influential ‘Through glass darkly: through dark glasses’ (a politico-stylistic analysis of Sylvia Plath’s The Bell Jar). The latter volume draws heavily on developments in pragmatics and discourse analysis, including applications of Politeness Theory, Speech Act Theory, and Gricean pragmatics. If we take discourse analysis in its broadest sense, then it is possible to also view stylistics, ‘‘as simply the variety of discourse analysis dealing with literary discourse’’ (Leech, 1983: 151). This may not lead anywhere, and rather sets stylistics adrift from other movements in linguistics. Stylisticians have tended to draw eclectically from developments in discourse analysis (as they had done from other developments in linguistics) and the related field of pragmatics. Speech Act Theory, Politeness Theory, and Critical Discourse Analysis are now the staples of stylistic approaches. Speech Act Theory has also been adopted by literary theorists with some success, though Derrida’s interpretation is quite different from Searle’s. Fairclough’s Critical Discourse Analysis (1989) has been extremely influential in encouraging a more politically orientated approach to stylistics. Used primarily in the analysis of media texts, it has proven fruitful in some aspects of literary analysis, reflecting the challenge (or threat) posed by the spread of media and cultural studies over traditional ‘English’ courses. 
Critical Discourse Analysis in a literary context is exemplified in the work of David Birch (1989), Sara Mills (1997) and Michael Toolan (1996), among others. It has its roots in Hallidayan systemic-functional linguistics (although Hodge and Kress, 1993, successfully employ a Chomskyan model for analysis) and is particularly concerned with the linguistic encoding of ideology in texts. Its main object is most naturally the language of the media, of institutions, and of government; but again, stylisticians are able to examine the ideologies evident in representations of these elements as well as taking the broader view of literature as an encoder of ideology. The integrationalist approach of Michael Toolan is influenced by the work of Roy Harris (1984 and passim) and rejects many of the fundamental tenets of Western linguistics while promoting a holistic view of language function. Fundamentally exciting and iconoclastic, its effect has yet to be fully registered in literary discourse analysis.

Literary theory in the second half of the 20th century was initially driven not by Chomsky or Halliday or Richards or F. R. Leavis, or even Spitzer or W. K. Wimsatt, but by the Swiss linguist Ferdinand de Saussure, whose Cours de Linguistique Générale (1917) exercised a profound influence on continental and Anglo-American literary criticism (and indeed criticism of other arts). Translations and reprints of the work increased dramatically through the 1950s, 1960s, and 1970s, as the Cours became the sine qua non of literary criticism. But the way in which Saussure was adopted by literary critics and cultural theorists was vastly different from the way that, say, Halliday was adopted by the stylisticians. Saussure offered no grammatical method and only the broadest semantic theory, and so was not adopted ‘practically’ in the same way. What he offered literary critics and theorists was a way of conceiving meaning relations, and he did so by positing a number of crucial, and by now extremely familiar, ideas:

(i) The distinction between la langue and la parole
(ii) The essential arbitrariness of the linguistic sign
(iii) The arbitrary and conventional nature of the sign
(iv) The distinction between the signifier and the signified
(v) Language is binary and contrastive.

These five observations together revolutionized literary and cultural criticism and were thoroughly used and abused throughout the middle and latter part of the 20th century. Aberrant readings and deliberate misreadings of Saussure pushed literary theory into an extreme self-reflexive period in the 1970s. At first, Saussure’s work was felt most keenly in literary structuralism, where content was bracketed off and structure became central. The literary text was seen to function like a sentence – a massive grammatical construction. Individual words (or signs) were mere fillers for essential structural functions and a sign’s meaning was a result of its relation to other signs (in particular one to which it stood in binary relation) and based on an arbitrary relation between the ‘acoustic sound-image’ (the signifier) and the concept associated with it (the signified). Thus, at a stroke, issues of a text’s ‘meaning’ were reduced to that of a text’s structure. This fact, coupled with misreadings of Saussure and extreme versions of the theory of the arbitrariness of the sign (contra Saussure), took criticism away from the delicate consideration of a text’s aesthetic value and toward a consideration of literary practice as sign.

This is only one half of the story, however, for another version of structuralism took the notion of binary and contrastive relations and utilized it to produce close and immensely detailed analyses of the meaning relations in individual texts (Riffaterre, 1966). Indeed, Riffaterre’s work, particularly the essay on Baudelaire’s Les Chats, became a model for a certain kind of structural-linguistic analysis. In this important work, many of the present-day issues and difficulties to do with the location and description of style are evident. Riffaterre draws on the Russian formalist school in his notion of literature as some form of stylistic defamiliarization, but he attacks the structuralism of Roman Jakobson and Claude Lévi-Strauss for its undiscriminating location of features (which may or may not have stylistic significance). Riffaterre is further concerned with the effects of the language upon its audience.

In the early 1970s, debates about the relationship between linguistic analysis and ‘traditional’ literary criticism raged, particularly in the United Kingdom. The most famous of these was the Fowler-Bateson debate (1971) (see Simpson, 2004 for a reprint of this argument). Roger Fowler took it as axiomatic that language analysis was at the heart of literary criticism, while F. W. Bateson, a literary critic, suggested that literature possessed an ‘ineradicable subjective core’ which was simply not amenable to linguistic analysis. Essentially, this was a claim for the unique status of literary language and a swipe at so-called ‘objective’ methods of analysis. But for a decade, from the mid-1970s to the mid-1980s, the linguistic critics held sway, and the two approaches to linguistic theory and practice outlined above came together in a peculiarly anglicized form. The work of Fowler drew on both the continental Saussurean tradition and the more pragmatic (not in the linguistic sense) and eclectic stylistic Anglo-American tradition.
In works such as Linguistics and the Novel (Fowler, 1977) and Linguistic Criticism (Fowler, 1986), one will find the work of Halliday, Chomsky, and Saussure linked with the practical consideration and analysis of a range of largely canonical literary texts. Traditional areas of investigation were given a new look, with some new terminology. In Fowler’s work, we find analyses of ‘mind style,’ of paradigmatic and syntagmatic relations, and of free indirect discourse – standard areas for later stylistic analysis. Fowler’s work drew on structuralist methodologies, if somewhat eclectically, but typically avoided Barthesian excesses.

In theory, stylistics offers common ground for those perennially separate groups of literature and language specialists; but the two domains have developed from such different origins that the relationship is never clear. In 1960, Roman Jakobson famously stated that ‘‘a linguist deaf to the poetic function of language and a literary scholar indifferent to linguistic problems and unconversant with linguistic methods are equally flagrant anachronisms’’ (1960: 377). One could extend David Lodge’s statement that ‘‘the novelist’s medium is language: whatever he does, qua novelist, he does in and through language’’ (Lodge, 1966: ix) to literature as a whole. Literature is fundamentally language, and the linguist is interested in language. This language can be conceived of as fundamentally different from ‘ordinary’ language, or as different only to the extent that it contains a greater concentration of features already present in other forms of language. But whether the difference is seen as one of degree or one of kind, the object of analysis is still language. What could be simpler?

Even with the arrival, or rather discovery, of Saussure by literary theorists, the relation between linguistics and literature was not made clearer. This is partly because linguists in the 20th century, following Humboldt and more recently Chomsky, tended to view language as an abstract system, whereas the principal domain of the literary critic was in the textual features of the text as artifact. Stylistics thus continues to see the text as the object of study, but instead of seeing it as part of a system, sees it more simply as a site where language is organized, and proceeds to utilize the methods of descriptive linguistics for a more focused and less subjective analysis of those textual features. For the stylistician, then, the text is both data (as a linguist) and artifact (as literary critic). This assumption of the literary artifact (and acceptance of it) has led to the stylistician’s object of study being, for the most part, simply the literary canon. Literary stylistics does not engage much in debates about the canon (on the whole) – it accepts it as its object of study and analysis. This is because it is not a coherent theory of literature as such, but a collection of possible methodologies.
As noted, stylistics often borrows from whatever methodology is prevalent in mainstream linguistics. Traffic the other way – stylistics influencing mainstream linguistics – is almost unheard of. The influence of Saussure brought literary criticism and linguistics both to treat the text synchronically (that is, without regard to time). Stylistics is still a predominantly synchronic approach which has found no parallel to the historicist developments in literary theory that have dominated since the late 1970s.

An important development in the broad field of linguistics in the late 20th and early 21st centuries is that of ‘cognitive linguistics.’ Cognitive Linguistics (as part of a broader ‘cognitive science’) is most notably associated with the work of Gilles Fauconnier, George Lakoff, Ronald Langacker, Mark Turner, and Eve Sweetser, among others. Much is made of the supposed antiobjectivist stance of the cognitivists, where language, thought, and conceptualization are seen to be embodied. Embodied experience is expressed through metaphors (hence the cognitivists’ obsession with this trope), and knowledge is constituted through and by what are known as conceptual metaphors.

Cognitive stylistics naturally takes its lead from cognitive psychology and linguistics, having a number of key assumptions. The first is to do with the ‘embodied mind,’ an attempt to heal the Cartesian split between mind and body. Mental activity is ‘embodied,’ in that physical and cultural functioning is interwoven and finds outlets in our use of metaphors and other tropes. Humans are said to conceptualize abstract concepts in concrete terms, where modes of reasoning are grounded in our physical being. Metaphors such as ‘success is up’ are said to demonstrate this. However, it is too easy to take this as a given, or simple cognitive disposition. Eve Sweetser has shown that such movements from the concrete to the abstract develop over a period of time. The movement, for instance, between the following two meanings of ‘see’, (i) ‘I see the cat’ and (ii) ‘I see what you mean’, shows a cognitive development in terms of historical semantic change. In its most enlightening form, particularly in the work of Sweetser, the cognitive approach can throw light on the development of semantic change and thus has a diachronic as well as synchronic aspect.

There are some within the discipline who wish to replace the term ‘stylistics’ with the term ‘cognitive poetics,’ but, in my view, this is a dangerous move. The strength of the eclectic approach offered by stylistics is also its weakness: its restless search for new applications and new theories to galvanize the movement can also be seen as the capricious appropriation of others’ work.
The cognitive poetics ‘revolution’ shows the discipline of stylistics at its best and at its worst – sifting through the mass of theories and applications for the most appropriate for stylistic analysis and producing genuine interdisciplinary analyses or misappropriating theories and rushing too quickly to apply them to texts. Catherine Emmott’s (1997) work has made important contributions to our understanding of certain procedures of text-mapping, drawing on recent cognitive and discourse theories. Emmott avoids the pitfalls of what might be called the ‘new’ affective stylistics by resisting the temptation to speculate further upon the aesthetic considerations of the text.

Cognitive poetics (sometimes given the term ‘cognitive stylistics’) has an uneasy relationship with literary theory, and indeed with traditional stylistics. We might, for instance, define ‘literariness’ in terms of ‘‘cognitive events triggered in the mind by linguistic stimuli’’ (Pilkington, 2000: 189). But how are we to locate and define those ‘cognitive events’ except by the description and analysis of the ‘linguistic stimuli’? It is not clear how this is different from saying that the text produces a certain effect because of its relevant linguistic features. Traditional literary concerns reappear with a new metalanguage. The focus, as in other cognitive approaches, is, then, on how the text works (or how readers work the text) rather than what the text means. The work of Stockwell (2002) and Semino and Culpeper (2002) outlines the major concerns of the ‘new’ stylistics and is usefully read alongside the more traditional stylistic work of Paul Simpson (2004). Jonathan Culler’s (1975) notion of ‘literary competence’ is the main precursor here (in a literary-critical context), but there are also links to the earlier work of Stanley Fish (1970) and Wolfgang Iser (1978). The ‘human mind’ is the focus of critical attention; but the human mind is a troubling, elusive thing whose functions and very being have to be inferred from human actions and processes. The principal difficulty is that as the analyses of individual texts proceed, the focus on the workings of the ‘human mind’ is diminished. Typically, as readings become more engaged with the manifest features of the literary text, claims about the ‘cognitive’ aspect of the analysis are more difficult to sustain.

An alternative to the cognitivists’ approach is presented in the work of David Miall (e.g., Miall and Kuiken, 1994 and passim) and others. Miall’s work grows out of the theories of both Samuel Taylor Coleridge and the 20th-century formalist Victor Shklovsky. Other influences include the Dutch discourse analyst Teun van Dijk.
Miall’s claim is that the literary text is a universe ‘‘whose laws are distinctive’’ (1994: 2), echoing a more traditional view of the literary artifact. Although he on occasions assumes a crude notion of what constitutes ‘ordinary language,’ he makes some important observations regarding the nature of the literary text. Most cognitive approaches describe a ‘resource-limited’ system in which cognitive structures economize comprehension by deleting irrelevant propositions, inferring relevant propositions and building macropropositions. But Miall suggests that literary texts reverse the economizing effects of schemata and other supposed cognitive features. Thus, Miall finds cognitive approaches inadequate and unable to account for the richness of literary language. At a stroke, he rejects a dominant approach and brings to the fore a traditional view of the literary artifact.

Closely allied to the burgeoning cognitive stylistics movement is the development of empirical stylistic analyses from the late 1980s. Empirical stylistics also draws on and is influenced by the growing field of corpus-based linguistic studies. It faces many of the same difficulties faced by the affective stylistics of Stanley Fish developed in the 1970s and the Rezeptionsästhetik of Hans Robert Jauss of the same time. Do we account for general dispositions of an ideal reader, or do we try to account for the readings of individuals? In all cases, what is the role of stylistic analysis? In such cases, the common assumption is that nothing is ‘there’ in the text until a reader construes it – a seemingly trivial point which nevertheless has potentially devastating consequences for a discipline based on the close analysis of manifest linguistic features. Studies such as Steen’s (1994) into readers’ processing of metaphor rely on conventional interpretation of informants’ responses and the sifting and evaluation of data. The corpus of literary texts is already there, as it were, as given.

After the tremendous developments and upheavals in literary theory from the 1960s to the beginning of the 1980s, criticism settled to a period of prolonged historicism, whether politicized or not. The dominant strains have been in historicized analysis, postcolonial and feminist (and postfeminist) work, including psychoanalytical approaches. To be sure, criticism could not sustain its relentless self-questioning forever, and there has been a certain amount of retreat from critical neuroses and a reassessment of earlier critical and literary issues – even such hoary old-timers as authorial intention. Certainly, literary theory turned away from the kinds of linguistic concerns that had been fundamental to its development in the 1960s through to the 1980s and focused more on extending and challenging the canon and on forging closer relationships with historical methodologies.
In some ways, language-study (in its broadest sense) has rarely been further from the domain of ‘mainstream’ literary criticism; and theoretical inputs and influences from mainstream linguistics (or from outside the mainstream) are rare. In stylistics, many of its traditional concerns are still being explored, despite the proclamations of the cognitive stylisticians. In journals, one will still find work on free indirect discourse, on ‘mind style,’ on metaphor and metonymy, on speech acts and pragmatics, on feminist stylistics, and on transitivity. This is partly due to the fact that a certain kind of stylistics (particularly evident in British universities) is essentially a pedagogical tool, and many of these topics and approaches are still the core of stylistics and ‘literary linguistics’ courses both in Europe and in the United States.

Stylistics in Europe has tended to engage with the interface of linguistic and literary theory more readily than the dominant British model (which is largely conservative in its eclecticism). European work – especially from The Netherlands, Germany, and Spain – has embraced movements both in linguistics and literary theory to produce challenging and often exciting readings of texts. European stylistics has tended to incorporate issues in philosophy, both ‘continental’ and ‘traditional,’ in its approaches and analyses much more readily than its more conservative counterpart. Here we are likely to find Derrida, Wittgenstein, Russell, Said, Gadamer, Kripke, and Spivak rubbing shoulders with Halliday, Chomsky, and Leech and Leavis, Greenblatt and Harris. Stylistics and literary theory are closely related in this eclectic tradition. This is partly due to a tendency in Europe to embrace European philosophy more readily than in Great Britain and the United States and a desire to integrate the empirical and the philosophical. Thus, the ‘great four’ of critical theory (and related fields) who so dominated and influenced cultural analysis in the 1960s, 1970s, and 1980s – Jacques Derrida, Roland Barthes, Jacques Lacan, and Michel Foucault – were assimilated into European cultural theory readily. Coupled with a tradition of stylistic analysis (as in the work of Teun van Dijk), this philosophical and cultural assimilation gave rise (and to a certain extent still gives rise) to fertile cross-disciplinary literary analysis.

See also: Approaches to Translation: Relevance Theory; Cognitive Linguistics; Critical Discourse Analysis; Relevance Theory; Stylistics; Stylistics, Cognitive.

Bibliography

Bally C (1909). Traité de stylistique française (2 vols). Heidelberg: Carl Winters.
Birch D (1989). Language, literature and critical practice: ways of analysing text. London: Routledge.
Burton D (1982). ‘Through glass darkly: through dark glasses.’ In Carter R (ed.). 195–216.
Carter R (ed.) (1982). Language and literature: an introductory reader in stylistics. London: Allen and Unwin.
Carter R & Simpson P (eds.) (1989). Language, discourse and literature: an introductory reader in discourse stylistics. London: Allen and Unwin.
Culler J (1975). Structuralist poetics. London: Routledge and Kegan Paul.
Crystal D & Davy D (1969). Investigating English style. London: Longman.
Emmott C (1997). Narrative comprehension. Oxford: Oxford University Press.
Fairclough N (1989). Language and power. London: Longman.
Fauconnier G (1994). Mental spaces: aspects of meaning construction in natural language. Cambridge: Cambridge University Press.
Fish S (1970). ‘Literature in the reader: affective stylistics.’ New Literary History 2, 123–162.
Foucault M (1986). The Foucault reader. Rabinow P (ed.). Harmondsworth: Penguin.
Fowler R (1977). Linguistics and the novel. London: Methuen.
Fowler R (1986). Linguistic criticism. Oxford: Oxford University Press.
Green K & Lebihan J (1996). Critical theory and practice: a coursebook. London: Routledge.
Greenblatt S (1980). Renaissance self-fashioning. Chicago: University of Chicago Press.
Halliday M A K (1971). ‘Linguistic function and literary style: an inquiry into the language of William Golding’s The Inheritors.’ In Chatman S (ed.) Literary style: a symposium. New York: Oxford University Press. 330–365.
Harris R (1984). The language myth. London: Duckworth.
Hodge R & Kress G (1993). Language as ideology (2nd edn.). London: Routledge.
Iser W (1978). The act of reading: a theory of aesthetic response. Baltimore: Johns Hopkins University Press.
Jakobson R (1960). ‘Closing statement: linguistics and poetics.’ In Sebeok T (ed.) Style and language. Cambridge, Mass: MIT Press. 350–377.
Jauss H R. Aesthetic experience and literary hermeneutics. Minneapolis: Minnesota University Press.
Leech G (1983). Principles of pragmatics. London: Longman.
Levin S (1962). Linguistic structures in poetry. The Hague: Mouton.
Lodge D (1966). Language of fiction. London: Routledge and Kegan Paul.
Miall D & Kuiken D (1994). ‘Beyond text theory: understanding literary response.’ http://ualberta.ca/~dmiall/reading/BEYOND_t.htm.
Mills S (1997). Discourse. London: Routledge.
Ohmann R (1964). ‘Generative grammars and the concept of literary style.’ Word 20, 423–439.
Pilkington A (2000). Poetic effects. Amsterdam: John Benjamins.
Richards I A (1929). Practical criticism. London: Kegan Paul.
Riffaterre M (1966). ‘Describing poetic structures: two approaches to Baudelaire’s Les Chats.’ In Babb H (ed.) (1972). Essays in stylistic analysis. New York: Harcourt, Brace Jovanovich. 362–394.
Saussure F de (1974). [1917] Course in general linguistics. London: Fontana.
Semino E (1997). Language and world creation in poems and other texts. London: Longman.
Semino E & Culpeper J (eds.) (2002). Cognitive stylistics: language and cognition in text analysis. Amsterdam: John Benjamins.
Shklovsky V (1917). ‘Art as technique.’ In Lemon L & Reis M (eds.) (1965). Russian formalist criticism. Lincoln: University of Nebraska Press. 5–24.
Simpson P (2004). Stylistics: a resource book for students. London: Routledge.
Spitzer L (1948). Linguistics and literary history: essays in stylistics. Princeton: Princeton University Press.
Steen G (1994). Understanding metaphor in literature: an empirical approach. London: Longman.
Stockwell P (2002). Cognitive poetics: an introduction. London: Routledge.
Sweetser E (1997). From etymology to pragmatics: metaphorical and cultural aspects of semantic structure. Cambridge: Cambridge University Press.
Toolan M (1996). Total speech: an integrational linguistic approach to language. London: Duke University Press.
Wales K (2001). A dictionary of stylistics (2nd edn.). London: Longman.

Literature and the Language of Literature
B Burton and R Carter, University of Nottingham, Nottingham, UK
© 2006 Elsevier Ltd. All rights reserved.

Producing Definitions

Definitions of literature and of literary language are socially and historically variable. The history of both terms has been constituted by differently positioned readers and writers framing different answers to questions concerning an appropriate definition. In this respect, definitions of literature and of literary language should not be considered as ‘ontological,’ establishing an essential, timeless property of what literature or literary language is, but as ‘functional’ – establishing specific and variable circumstances within which texts are designated as literary, and the ends to which these texts are and can be used (cf. Fowler, 1997; Frow, 2002). It should also be remembered that the circumstances surrounding the production and reception of definitions – who makes a definition, at what time, in what place, and for whom – play a significant part in determining the currency of those definitions. As part of a linguistic encyclopedia, this article also produces definitions from within a particular context, one that tends to have institutional authority insofar as it is sanctioned by universities and school examining boards.

Literature and the Language of Literature 267 Fauconnier G (1994). Mental spaces: aspects of meaning construction in natural language. Cambridge: Cambridge University Press. Fish S (1970). ‘Literature in the reader: affective stylistics.’ New Literary History 2, 123–162. Foucault M (1986). The Foucault reader. Rabinov P (ed.). Harmondsworth: Penguin. Fowler R (1977). Linguistics and the novel. London: Methuen. Fowler R (1986). Linguistic criticism. Oxford: Oxford University Press. Green K & Lebihan J (1996). Critical theory and practice: a coursebook. London: Routledge. Greenblatt S (1980). Renaissance self-fashioning. Chicago: University of Chicago Press. Halliday M A K (1971). ‘Linguistic function and literary style: an inquiry into the language of William Golding’s The Inheritors.’ In Chatman S (ed.) Literary style: a symposium. New York: Oxford University Press. 330–365. Harris R (1984). The language myth. London: Duckworth. Hodge R & Kress G (1993). Language as ideology (2nd edn.). London: Routledge. Iser W (1978). The act of reading: a theory of aesthetic response. Baltimore: Johns Hopkins University Press. Jakobson R (1960). ‘Closing statement: linguistics and poetics.’ In Sebeok T (ed.) Style and language. Cambridge, Mass: MIT Press. 350–377. Jauss H R. Aesthetic experience and literary hermeneutics. Minneapolis: Minnesota University Press. Leech G (1983). Principles of pragmatics. London: Longman. Levin S (1962). Linguistic structures in poetry. The Hague: Mouton. Lodge D (1966). Language of fiction. London: Routledge and Kegan Paul. Miall D & Kuiken D (1994). ‘Beyond text theory: understanding literary response.’ http://ualberta.ca/!dmiall/ reading/BEYOND_t.htm.

Mills S (1997). Discourse. London: Routledge. Ohmann R (1964). ‘Generative grammars and the concept of literary style.’ Word 20, 423–439. Pilkington A (2000). Poetic effects. Amsterdam: John Benjamins. Richards I A (1929). Practical criticism. London: Kegan Paul. Riffaterre M (1966). ‘Describing poetic structures: two approaches to Baudelaire’s Les Chats.’ In Babb H (ed.) (1972). Essays in stylistic analysis. New York: Harcourt, Brace Jovanovich. 362–394. Saussure F de (1974). [1917] Course in general linguistics. London: Fontana. Semino E (1997). Language and world creation in poems and other texts. London: Longman. Semino E & Culpeper J (eds.) (2002). Cognitive stylistics: language and cognition in text analysis. Amsterdam: John Benjamins. Shklovsky V (1917). ‘Art as technique.’ In Lemon L & Reis M (eds.) (1965). Russian formalist criticism. Lincoln: University of Nebraska Press. 5–24. Simpson P (2004). Stylistics: A resource book for students. London: Routledge. Spitzer L (1948). Linguistics and literary history: essays in stylistics. Princeton: Princeton University Press. Steen G (1994). Understanding metaphor in literature: an empirical approach. London: Longman. Stockwell P (2002). Cognitive poetics: an introduction. London: Routledge. Sweetser E (1997). From etymology to pragmatics: metaphorical and cultural aspects of semantic structure. Cambridge: Cambridge University Press. Toolan M (1996). Total speech: an integrational linguistic approach to language. London: Duke University Press. Wales K (2001). A dictionary of stylistics (2nd edn.). London: Longman.

Literature and the Language of Literature B Burton and R Carter, University of Nottingham, Nottingham, UK ! 2006 Elsevier Ltd. All rights reserved.

Producing Definitions Definitions of literature and of literary language are socially and historically variable. The history of both terms has been constituted by differently positioned readers and writers framing different answers to questions concerning an appropriate definition. In this respect, definitions of literature and of literary language should not be considered as ‘ontological,’ establishing an essential, timeless property of what

literature or literary language is, but as ‘functional’ – establishing specific and variable circumstances within which texts are designated as literary, and the ends to which these texts are and can be used (cf. Fowler, 1997; Frow, 2002). It should also be remembered that the circumstances surrounding the production and reception of definitions – who makes a definition, at what time, in what place, and for whom – play a significant part in determining the currency of those definitions. As part of a linguistic encyclopedia, this article also produces definitions from within a particular context, one that tends to have institutional authority insofar as it is sanctioned by universities and school examining boards.

Literature: A Brief History of Definitions In one sense, the definitions of literature and literary appear straightforward; the phrase ‘English literature’ occurs every day and is often used with apparent felicity. However, what kinds of texts are designated by the term ‘literature’ is the subject of ongoing debate. To understand what is at stake in these debates, it is useful to look at the history of the word. The following definitions are presented according to the order in which they appeared historically. However, this order should not be taken to suggest the wholesale replacement of one sense by another, as though the term ‘literature’ could be emptied out and replenished with meaning. Rather, many senses have remained and do remain in use, alongside other senses, complementing and conflicting with each other. The following summary derives from Williams (1983). Literature came into English in the late 14th century in the sense of acquaintance with ‘letters’ or books. To ‘have’ literature was to be schooled in polite or human learning. For example, Henry Bradshaw, writing in 1513, spoke with contempt for ‘‘The comyn people’’ who ‘‘without lytterature and good informacyon Ben lyke to Brute beestes.’’ Writing in 1605, Francis Bacon used the term to praise the learning of King James I: ‘‘There hath not beene . . . any King . . . so learned in all literature and erudition, diuine and humane.’’ The state of being well read was here associated with the normal adjective from this period – literate. From Bacon’s use of ‘literature,’ however, it seems that the condition of being well read was at times close to the objective noun – the books in which one was well read. The evaluative overtones of ‘literate’ (and of its antonym, ‘illiterate’) persist to this day and are regularly implicated in definitions of literature.
There remains a general agreement that reading literary texts plays an important part in literacy development and can be especially valuable in extending imaginative and intellectual development, aesthetic appreciation, and an understanding of how experiences of people in the past and present can be represented (cf. Showalter, 2002; Abbs, 2003). The sense of literature as a kind of knowledge seems to have persisted until the late 19th century. However, the blending of senses suggested by Bacon – both knowledge of books and books themselves – indicates that a general change in meaning took place in the mid-18th century. At that time, literature emerged in a newly objective sense, designating the activity or profession of a man of letters, as well as the whole body of books and writing, the realm of letters. In his Life of Cowley (1779), Johnson referred to ‘‘an author whose pregnancy of imagination and
elegance of language have deservedly set him high in the ranks of literature.’’ Johnson’s reference to ‘‘ranks of literature’’ pointed less to a hierarchy of types of writing than to levels of polite learning: ‘‘elegance of language.’’ At the same time, ‘literary’ no longer remained simply synonymous with ‘literate,’ but extended its meaning to describe the practice and profession of writing; the concepts ‘literary merit’ and ‘literary labor’ emerged during this period. As ‘literature’ came to designate a whole body of writing, so the sense of a corpus of works produced by a particular nation or in a particular period began to develop in the late 18th and early 19th centuries. The idea of a Nationallitteratur developed in Germany from the 1770s, whereas references to a national literature also emerged in collections of stories published during this period in France (Les Siècles de littérature française, 1772) and Italy (Storia della letteratura italiana, 1772). The concept of English literature seems to have followed these European uses. The identity and constitution of a national literature are fraught with sociocultural and political struggles. These struggles focus upon the concept of a canon, certain texts that are considered central to a tradition of literature within a culture. The existence of a canon leads to the institutionalization of texts. Certain texts are ‘set’ for study by examination boards, syllabus designers, and teachers teaching particular courses; in turn, these books are categorized by publishing houses as canonical or ‘classic’ texts, and the whole process even serves to define what is considered to be literature (cf. Hodge, 1990).
There is a certain degree of truth, therefore, in the statement of Roland Barthes that ‘‘literature is what gets taught.’’ Indeed, the notion of a canon of great literary texts is most problematized when national curricula are being developed, and definitions of what constitutes a national heritage are inevitably foregrounded. For example, the dominance of the canon by male writers and by a version of English writing confined to native British writers has been questioned for its partiality and may in turn be connected with the powerful positions held by white Anglo-Saxon men in educational establishments, on examination boards, and in the control of national curricula. Such circularity cannot help but influence how interpretations of the constitution of a national literary heritage are shaped (cf. Eagleton, 1996; Docherty, 2002). It is also necessary to recognize that canons are not immutable. The recent admission to the canon of English literature of the writer John Clare and the corresponding relative relegation of such 18th-century writers as Gray and Collins indicate how tastes change and evaluations shift as part of a process of canon formation, with
which definitions of what literature is and what it is for are inextricably bound. During the 19th century, the sense of a national literature developed alongside a second sense, which comprised a nexus of meanings – literature as art, as aesthetic object, and as the product of great imagination or creative genius. This nexus of meanings represents a narrowing of meaning from the earlier sense of literature – writing books and books themselves – toward specifically imaginative writing. Before this specialization of meaning, the domain of imaginative writing had principally been denoted by poetry. However, by the middle of the 17th century, this term ‘poetry’ was gradually confined to metrical composition. It is probable that this application of the term to verse, together with the increasing importance of prose forms such as the novel, made literature the most available general word. However, whereas poetry had designated skills of writing and speaking in the context of high imagination, literature, in its 19th-century sense, referred almost exclusively to writing. The post-18th-century senses of ‘literature’ and such associated terms as ‘literary’ still have effective currency in academic and popular uses. Nevertheless, ‘literature’ is subject to constant change; it is not universally the same everywhere and is, as a category of text, eminently negotiable. Definitions of literary language are similarly protean.

Literary Language Definitions of literary language necessarily entail theories of literature, regardless of whether these theories are explicitly announced or recognized as such. Two main camps can be identified, and these are grouped, rather loosely and at the risk of oversimplification, into what have variously been dubbed ‘intrinsic,’ ‘formalist,’ or ‘inherency’ models, on the one hand, and ‘extrinsic,’ ‘functionalist,’ or ‘sociocultural’ models, on the other (Carter, 1997, 2004). Inherency definitions are presented first not only because they are historically antecedent but also because their influence has been pervasive in the export of Russian formalism into American New Criticism and in its subsequent import into practical criticism in Britain. Foregrounding and Defamiliarization

Formalist definitions, especially those of the Russian formalists, were predicated on a division between poetic and practical language. Russian formalists sought to isolate by rigorous scientific means the specifically literary forms and properties of texts. Because there is no exclusively literary content, they argued, poetics should concern itself with the how,
rather than the what. Thus, early formalists, such as Shklovsky, Tynyanov, Eichenbaum, and Jakobson, gave special attention to the linguistic constituents of the literary medium – language – and drew on the new science of linguistics for their theoretical and descriptive apparatus. Their main theoretical position was that literary language is deviant language. It is a theory that has had considerable influence. According to deviation theory, literariness inheres in the degrees to which language use departs or deviates from expected patterns of language and thus defamiliarizes the reader. Literary language use is therefore different because it makes strange, disturbs, and upsets a routinized ‘normal’ view of things and thus generates new or renewed perceptions. For example, the phrase ‘a grief ago’ would be poetic by virtue of its departure from semantic selection restrictions that state that only temporal nouns, such as ‘week’ or ‘month,’ can occur in such a sequence. As a result, however, grief comes to be perceived as a temporal process. Deviation theory represents a definition of literary language that contains some interesting insights, but that on close inspection is theoretically underpowered. For example, consider the following four observations: 1. Norms are difficult to define, which in turn makes deviation difficult to measure. What is the norm? Do we not mean norms? Is the norm the standard language, the internally constituted norms created within a text, the norms of a genre, a particular writer’s style, or the norms created by a school of writers within a period? If it is the norms of the standard language, then what level of language is involved? Grammar, phonology, discourse, or semantics? Because the greatest advances in 20th-century linguistics have been in grammar and phonology, formalist poetics has tended to discuss literariness, in a rather limited way, in terms of grammatical and phonological deviations. 2. 
What is defamiliarizing in 1912 may not be that in 1922, when readers have acquired a different set of expectations. Despite the formalists’ claims for the ahistorical separateness of art and literature from other kinds of discourse, the concept of defamiliarization is predicated upon socially and historically specific reading conventions. 3. To equate deviation with literariness is to suggest that literary language cannot result from adherence to norms, which in certain literary periods was a prerequisite for the creation of patterns in literary language. 4. The distinction between literary and ‘normal’ or ‘poetic’ and ‘ordinary’ language can be
deconstructed by pointing out that deviation regularly occurs in such discourses as advertising, jokes, and newspaper headlines that are not institutionally connected with ‘literature.’ However, the idea of literary language as language that can result in renewal or in new ways of seeing the familiar cannot be as easily discounted as some of the above observations might suggest. Still, deviation theory needs greater theoretical and linguistic precision for the definition to hold, and it needs to be considered and tested alongside complementary definitions. Self-Referentiality

Another influential inherency definition is particularly associated with Roman Jakobson. Originally connected with the Russian formalists, Jakobson subsequently moved to the United States and in a famous paper (Jakobson, 1960) articulated a theory of poetic language that stressed the ‘self-referentiality’ of poetic language. In his account, literariness results when language draws attention to its own status as a sign and when consequently there is a focus on the message for its own sake. Jakobson’s notion has been clearly explained by Easthope (1983): The poetic function gets into the syntagmatic axis something which would normally stay outside in the paradigmatic axis: It does so by operating a choice in favor of something that repeats what is already in the syntagmatic axis, thus reinforcing it.

Thus, in the examples, I hate horrible Harry or I like Ike, the verbs hate and like are selected in favor of ‘loathe’ or ‘support’ because they establish a reinforcing phonoaesthetic patterning. The examples cited (the latter was Jakobson’s own and is a slogan in favor of the former American President Dwight Eisenhower, whose nickname was Ike) demonstrate that poeticality can inhere in such everyday language as political advertising slogans. Jakobson’s definition is, like definitions of deviation theory, founded on an assumed distinction between ‘poetic’ and ‘ordinary’ language. According to Jakobson, in nonliterary discourse, the word is a mere vehicle for that to which it refers. In literary discourse, the word or phrase is brought into a much more active and reinforcing relationship, serving, as it were, to echo, mime, or somehow represent what is signified, as well as to refer to it. This emphasis on the representational nature of literary discourse is valuable. However, it should be pointed out that Jakobson’s criteria work rather better for poetry than for prose and that he supplied no clear criteria for determining the ‘clines’ or degrees of poeticality
or ‘literariness’ in his examples. Like other formalist analysts, Jakobson also placed too great a stress upon the production of effects, neglecting in the process the recognition and reception of such effects. The reader or receiver (or listener) of the message and his or her sociocultural position tended to get left out of the account. Speech Act Theories

Accounts of literary language that attempt more boldly to underscore the role of the reader interacting in a sociocultural context with the sender of a verbal message are generally termed ‘speech act theories of literature.’ Speech acts are uses of language that, either directly or indirectly, commit the user or recipient to a particular ‘action.’ In applications of these theories to literary text study, one of the main proponents has been Richard Ohmann. Ohmann’s basic proposition was that the kinds of conditions that normally attach to speech acts, such as insulting, questioning, and promising, do not obtain in a literary context. Instead we have quasi- or mimetic speech acts. As Ohmann (1971) put it: Since the quasispeech acts of literature are not carrying the world’s business – describing, urging, contracting, etc. – the reader may well attend to them in a nonpragmatic way and thus allow them to realize their emotional potential.

The literary speech act is typically represented as a different kind of speech act. As Fish (1980) and Hillis-Miller (2001) have demonstrated, definitions of literary speech acts presuppose a distinction between two kinds of discourse: (1) ‘nonliterary,’ ‘nonfictional,’ or ‘serious’ discourse that in various ways hooks up with the real world, and (2) ‘literary,’ ‘fictional,’ or ‘nonserious’ discourse that operates with diminished responsibility to that world. The first discourse type is typically privileged and normalized by speech-act theorists and is presented as basic and prior, whereas the other is presented as derivative and dependent on ‘normal’ language use (cf. Fish, 1980). Hence, according to Ohmann’s formulation, literary speech acts involve (on the part of the reader) a suspension of the normal pragmatic functions that words may have in order for the reader to regard them as somehow representing or displaying the actions they would normally perform. In this sense, a literary speech act brings a world into being for its readers or listeners, but beyond that does nothing. Ohmann’s theory, like inherency models, suffers from an essentialist opposition between literary and nonliterary language that is not really borne out by careful consideration. Pratt (1977), for example, has demonstrated that nonfictional, nonpragmatic,
mimetic, or disinterested playful speech acts routinely occur outside what is called literature. Hypothesizing, telling white lies, pretending, playing the devil’s advocate, imagining, fantasizing, relating jokes or anecdotes, and even using illustrations to underscore a point in scholarly argument, are then, by Ohmann’s definition, literary. Ohmann’s theory also does not explain either the ‘literary’ status of certain travel writings or Orwell’s essays on the Spanish Civil War (and Orwell would have been extremely perturbed if people read those essays as merely pretended speech acts). Neither does it explain why detective novels, science fiction, and popular romances that are fictional are not literary nor why the prose works of Milton or Donne, which are nonfictional, are literary. There are limitations to the theory of speech acts. However, what work on speech-act theory has done is to pose the question of whether the literary speech act is different or whether all language can be used for literary purposes. It is a tradition that has led to the insertion of quotation marks around the word ‘literary.’

Literature as Discourse The definitions of literary language that have been considered so far tend to be intrinsic rather than extrinsic definitions; there has been a focus on ‘the language itself,’ rather than on the source of the definition or on the institutional or ideological context in which the definition is made. In a sense, to describe literature as discourse is to open yet another can of wormy definitions – what is meant by ‘discourse’? In a recent article, Steen (2004) explained that discourse analysis is a complex field of research in the social sciences that requires interdisciplinary and multidisciplinary cooperation among linguistics, psychology, sociology, anthropology, and poetics, to name a few. As a result, within ‘‘the linguistic view of discourse,’’ Steen (2004: 161) explained, discourse analysts ‘‘can examine discourse as language with special attention to the encoding of a message, or as a reflection of, or stimulus for, individual cognition and action, or as a reflection of or contribution to social interaction, or as the expression of a symbol in a culture.’’ In an influential study of literature as discourse, Fowler (1981) analyzed literature as a particular form of social interaction. According to Fowler, a literary text is not simply a formal structure with such properties as grammaticality, cohesion, and rhetorical patterning such as parallelism, chiasmus, metaphor, and so on but is also the medium of a situated interaction with a source and a recipient:

To treat literature as discourse is to see the text as mediating relationships between language-users: not only relationships of speech, but also of consciousness, ideology, role and class. The text ceases to be an object and becomes an action or process (Fowler, 1997: 77).

To examine literature as a social action or process, Fowler’s analysis drew upon the functional analyses associated with such linguists as Michael Halliday. In this way, the linguistic structures of a literary text are explained according to the communicative purposes they serve: ‘‘Once we start looking at literature as a part of social process then texts are opened to the same kinds of causal and functional interpretations as are found in the sociology of language generally’’ (Fowler, 1997: 78). From this analysis, literature emerges as a sociolinguistic fact – a set of values and functions assented to (even if not consciously recognized as such) by members of a community: ‘‘The values are neither universal, though they are subject to a small range of types of historical explanation, nor stable, though they change slowly’’ (Fowler, 1997: 78). Any attempt to define literature as discourse is liable to underplay both the role and the relative position of the reader vis-à-vis the text. However, recent research has revealed that strategies of engagement with a literary text may be described in two ways: as a cognitive and emotional process in the minds of readers, and as a socially and culturally mediated expertise with language (cf. Charlton et al., 2004; Kuiken et al., 2004). Reference made to discourse conventions in the following section is an attempt partly to reorient definitions of literature with respect to the operation of readers or, more specifically, reading communities that are located in particular social, political, and historical environments. Discourse Conventions

Analyses of literature as discourse must acknowledge that the production and reception of texts take place in relation to socially negotiated conventions. Among students and critics of literature in contemporary Western culture, certain conventions form a basis of taken-for-granted ‘common sense’ about literature. In an educational context, for example, specific conventions underpin reading literature in what is an institutionally acceptable or ‘competent’ way (cf. Culler, 1975). However, it must be stressed that these competencies are ‘conventional.’ Whereas this article attempts to define literariness with respect to discourse conventions, it need not follow that literary discourse is always produced with such functions in mind. Furthermore, whether or not a reader chooses to read a text in accordance with these conventions, as a literary text as it were, is one crucial determinant
of its literariness. Gerard Steen (1994), for example, has demonstrated that when a newspaper article is presented first in its original context (as a newspaper article) and then in a ‘perverse’ context (as part of a novel), readers are inclined to note metaphors that are prototypically journalistic in the first case and metaphors that are distinctively literary in the second. The discourse conventions of a given interpretive community inform the definitions of literature that are current within that community. Stockwell (2001), for example, has noted that fiction emerges as the prototypical form of literature in much of modern literary theory. So central is the concept of the fictional that many studies of literary theory emphasize imagination and alternativity without recalling that much literature is religious, autobiographical, or political or describes real journeys, satirizes real people, or recounts real events. The institutionally and discursively regulated forms of use to which texts are put are historically specific. However, as Tony Bennett (1990) noted, this is not the same as distinguishing between historically different forms of writing. Texts from any period can be retrospectively literarized by abstracting them from the different institutional and discursive forms regulating their initial use. In taking up these criteria, it must be emphasized that the production and reception of literary texts are historically specific, socially situated acts. Rather than designate particular linguistic features as intrinsically ‘literary,’ these criteria describe how features typically function as a part of contemporary literary discourse conventions.

The Literariness of Discourse Literature may usefully be considered not only as discourse but also as a discourse that shares characteristics with other discourses. Put another way, the conventions for the production and reception of literary discourse may be considered with respect to a typology of discourses; the more a discourse conforms to these conventions, the more literary it is considered to be. Accordingly, Carter (1997) proposed a ‘cline’ of literariness, along which a typology of discourses may be situated as more or less ‘literary.’ He suggested the following criteria. Medium Dependence

The notion of medium dependence means that, the more literary a text is, the less it will be dependent for its reading on another medium or media (see Carter, 1997 for a range of examples). This criterion is part of a tradition of attributing to literary texts the qualities of mimesis and fictionality. According to this tradition, literary texts generate a world of internal
reference and rely only upon their own capacity to project. However, this is not to suggest that literary texts cannot be determined by external or social or biographical influences. It should be borne in mind that the notion of medium independence is a particular preserve of critics and students of literature. An emphasis upon the relative medium independence of literary discourse should not detract from the ways in which readers mobilize literary texts for social and cultural purposes. Janice Radway (1989), for example, has examined criteria for ‘serious fiction’ that emerge among members of the Book-of-the-Month Club. Radway emphasized that, for what she called ‘the General Reader,’ the value of ‘serious fiction’ is ‘‘a function of its capacity to be used as a map which is, despite the status of its representation, a tool for enabling its reader to move about effectively in the world to which it refers.’’ For the editors of the Book-of-the-Month Club, books in the category of ‘serious fiction’ function for Club members ‘‘in a way very similar to many self-help manuals, advice books, and reference volumes.’’ Displaced Interaction

What is conventionally regarded as a ‘literary’ text is likely to be one in which the context-bound interaction between author and reader is more deeply embedded or displaced. However, although a more literary discourse is likely to be characterized by medium independence and displaced interaction, literary texts continue to participate in a medium of interpersonal exchange. Whereas literary texts may not be dependent on any particular medium and interaction between author and reader is embedded or displaced, readers can and do mobilize the communicative function of texts for social and cultural goals. Robert Hodge (1990), for example, has examined how literary texts, marked by the mode of unreality, are able to express alternative or oppositional meanings that otherwise might be suppressed from the public arena. Polysemy

A literary text conventionally will be read for polyvalent meanings and may be produced with such functions in mind (cf. Steen, 1994). One characteristic of the polysemic text is that its lexical items do not stop automatically at their first interpretation. Denotations are always potentially available for transformation into connotation; contents are never received for their own sake, but rather as a sign vehicle for something else. Polysemy is a regular feature of many discourse genres, including advertising. Several recent studies of the language of advertising, including Cook (2001), have identified such
characteristics as polysemy, which point to the literariness of this discourse genre. Polysemy is a significant communicative function that is promoted and valued in particular contexts and that operates according to specific reading strategies. However, to claim that polyvalent meanings are promoted does not necessarily mean that any or all interpretations will be accorded equal value or will meet with equal acceptance within a given interpretive community. Moreover, in certain instances, promoting polyvalent meanings may carry a serious social significance. Reregistration

The notion of reregistration means that no single word or stylistic feature will be barred from admission to a literary context. This is not to say that certain stylistic or lexical features are not regarded as more conventionally ‘literary’ than others. However, reregistration recognizes that the full, unrestricted resources of a language are open to exploitation for literary ends. As studies of novelistic discourse by the Russian theorist Mikhail Bakhtin (1984) suggested, not simply single words but also the characteristics of other discursive genres may be represented in a literary context. According to Bakhtin, the effects of reregistration, of appropriating the characteristics of another discourse as an object for representation, are often parodic or travestying. In certain instances, the parodic effects of reregistration have serious social consequences, as was illustrated in the case of the fatwa declared by members of Muslim communities against Salman Rushdie following the publication of his Satanic verses. In such instances, the parodic function of literary discourse may be interpreted as a threat to the monologic authority of hegemonic discourses (cf. Mufti, 1994).

Interaction of Levels: Semantic Density

A text that is perceived as resulting from the additive interaction of several superimposed codes and levels is recognized as more literary than a text in which there are fewer levels at work or in which they are present but do not interact as densely. A more literary text is typically perceived to exhibit a high degree of semantic density that results from an interactive patterning at the levels of syntax, lexis, phonology, and discourse. Discourse Patterning

Criteria for literariness discussed so far have focused mostly on effects at the sentence level. At the suprasentential level of discourse, effects can be located that can help differentiate further degrees of literariness.

Once more, these effects contribute to the semantic density of literary discourse, as patterns of clause and tense extend over a suprasentential level.

Literature and the Language of Literature: Some Conclusions Literary language is not special or different, in that any formal feature termed ‘literary’ can be found in other discourses. Yet, literary language is different from other language uses in that it functions differently. Some of the differences can be demarcated with reference to such criteria as medium dependence, reregistration, semantic density produced by the interaction of linguistic levels, displaced interaction, polysemy, and discourse patterning. What is prototypically literary will be a text that meets most of the above criteria. In contrast, a nonliterary text will meet none or few of these criteria; that is, it will be monosemic, medium dependent, project a direct interaction, contain no reregistration, and so on. The worst excesses of paradox and the essentialist dichotomies of an absolute division into literary/nonliterary or fictional/nonfictional can be avoided by positing a cline of literariness along which discourses can be arranged. The sociolinguistic and sociocultural context of the discourse is important. This article has attempted to indicate how certain of the functions of literary discourse may be mobilized by readers or even reading communities for social and cultural purposes. Literature and literary language may best be seen as socially and historically variable terms, with differently positioned readers and writers framing different answers to questions concerning an appropriate definition.

See also: Foregrounding; Style; Stylistics.

Bibliography Abbs P (2003). Against the flow: the arts, postmodern culture and education. London: Routledge Falmer. Bakhtin M M (1984). The dialogic imagination: four essays. Holquist M (ed.). Emerson C & Holquist M (trans.). Austin: University of Texas Press. Bennett T (1990). Outside literature. London: Routledge. Bissell E B (ed.) (2002). The question of literature. Manchester: Manchester University Press. Carter R (1997). Investigating English discourse. London: Routledge. Carter R (2004). Language and creativity: the art of common talk. London and New York: Routledge.

Charlton M, Pette C & Burbaum C (2004). ‘Reading strategies in everyday life: different ways of reading a novel which make a difference.’ Poetics Today 25(2), 241–263. Cook G (2001). The discourse of advertising (2nd edn.). London: Routledge. Culler J (1975). Structuralist poetics: structuralism, linguistics and the study of literature. London: Routledge and Kegan Paul. Docherty T (2002). ‘The question concerning literature.’ In Bissell (ed.). 126–141. Eagleton T (1996). Literary theory: an introduction (2nd edn.). Minneapolis, MN: University of Minnesota Press. Easthope A (1983). Poetry as discourse. London: Routledge. Fish S (1980). ‘How to do things with Austin and Searle: speech-act theory and literary criticism.’ In Fish S (ed.) Is there a text in this class? The authority of interpretive communities. Cambridge, MA: Harvard University Press. 197–245. Fowler R (1981). Literature as social discourse: the practice of linguistic criticism. London: Batsford. Fowler R (1997). ‘Literature as discourse.’ In Newton K M (ed.) Twentieth-century literary theory: a reader. New York: St Martin’s. 77–82. Frow J (2002). ‘Literature as regime (meditations on an emergence).’ In Bissell (ed.). 142–155. Hillis-Miller J (2001). Speech acts in literature. Stanford, CA: Stanford University Press. Hodge R (1990). Literature as discourse: textual strategies in English and history. Oxford: Blackwell Polity Press.

Jakobson R (1960). ‘Linguistics and poetics.’ In Sebeok T (ed.) Style in language. Cambridge, MA: MIT Press. 350–377. Kuiken D, Miall D S & Sikora S (2004). ‘Forms of self-implication in literary reading.’ Poetics Today 25(2), 171–203. Mufti A (1994). ‘Reading the Rushdie affair: ‘Islam,’ cultural politics, form.’ In Burt R (ed.) The administration of aesthetics: censorship, political criticism, and the public sphere. Minneapolis, MN: University of Minnesota Press. 307–339. Ohmann R (1971). ‘Speech, action and style.’ In Chatman S (ed.) Literary style: a symposium. Oxford: Oxford University Press. Pratt M L (1977). Toward a speech act theory of literary discourse. Bloomington, IN: Indiana University Press. Radway J (1989). ‘The Book-of-the-Month Club and the general reader: on the uses of ‘serious’ fiction.’ In Desan P, Ferguson P P & Griswold W (eds.) Literature and social practice. Chicago and London: University of Chicago Press. 154–176. Showalter E (2002). Teaching literature. Oxford: Blackwell. Steen G (1994). Understanding metaphor in literature. London: Longman. Steen G (2004). ‘Perspectives on discourse: the state of the art.’ Language and Literature 13(2), 161–179. Stockwell P (2001). Cognitive poetics. London: Routledge. Tambling J (1988). What is literary language? Milton Keynes: Open University Press. Williams R (1983). Keywords (2nd edn.). London: Fontana.

Literature: Empirical Studies
J Hakemulder, Utrecht University, Utrecht, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Introduction
The empirical study of literature (ESL) concerns all aspects of literary communication. Using methods of the social sciences, researchers test hypotheses concerning the attitudes and behavior of participants in processes ranging from the production of what are considered literary texts to their mediation and reception. One of the central aims of the field is to examine how literary communication differs from (or resembles) other forms of communication. It is an increasingly interdisciplinary domain, with literary scholars, sociologists, psycholinguists, cognitive and social psychologists, and anthropologists combining their insights and methodologies. In the past ESL has

first seen scholars moving from traditional disciplines of the humanities (e.g., stylistics, narratology, literary theory) into cognitive psychology; conversely, psychologists became curious about how results obtained in controlled experiments with artificial texts relate to responses to full-blown literary texts. Interdisciplinarity ideally enriches both sides of the equation, with, for instance, literary studies contributing refined taxonomies of narrative structures, and psycholinguistics contributing methodological rigor and insights into narrative processing, in joint efforts to establish characteristics of literary processing. In the early history of ESL we do not always see such fruitful interchange. Some of the work did not meet standards in either the humanities or the social sciences. However, a lot has changed. The wide range of research traditions, as well as the many different cultural and language barriers, may have hindered interdisciplinarity (Steen, 2003). However, cooperation is enhanced by the common


ground that researchers share. The use of empirical methods, for one, stimulates the exchange of findings. Second, most researchers agree on the epistemological foundations of their work. One basic idea of ESL may be traced back to Mukařovský’s (1974) distinction between ‘artefact’ (the signs on the paper) and the ‘aesthetic object.’ For an aesthetic object to be realized, someone must read the artefact, interpret it, and evaluate it. Consequently, to understand literature means to understand readers’ responses. And this requires empirical methods of investigation. Another assumption shared by most researchers in ESL is that it is possible to discover regularities in what people do with literature. Many theories in literary studies generate hypotheses about such regularities. Researchers in ESL agree that these claims need to be tested empirically.

Origins
Besides the growing awareness among literary scholars of the inherently empirical nature of some of the problems in literary studies, there are a number of other developments that have led to ESL (Andringa, 1998). One is the rise of reception aesthetics in Germany. We can distinguish two basic ideas in this movement. Both are relevant to ESL. First, in Wolfgang Iser’s hermeneutic approach, literary texts are interpreted from the perspective of the reader. It is important to note that ‘the reader’ here is a mere construct in the mind of the interpreter, and does not necessarily have anything to do with the empirical reader. A second idea in reception aesthetics concerns research in the history of literature, which, as Hans-Robert Jauss proposed, should focus more on readers’ responses. Developments in literary history are best understood, he claimed, by investigating audiences’ ‘horizon of expectation’ (their aesthetic and other norms) and the degree to which a given text meets (or does not meet) these expectations. It is a small step from there to doing actual empirical research. This step was not taken by reception aesthetics, however. A call for moving from readers-as-construct to empirical readers was made by the German psychologist Groeben (1977), who pointed out that many of the concepts in literary studies could easily be operationalized. In addition, he argued, this would make literary studies much more relevant to society at large. Only a few years later, Schmidt (1982) provided a theoretical basis for ESL, describing the literary system in contrast to other systems in society. He pointed out the differences in terms of communication conventions: the ‘aesthetic (or nonreferential) convention’ in contrast to the ‘fact convention’ used in other discourses; the ‘polyvalence

convention,’ the unwritten agreement between authors and readers that the text may well have more than one interpretation, as opposed to the ‘monovalence convention’ used in other systems. Another ESL strand comes from France, where mainly sociologically oriented researchers investigated the influence of extraliterary factors such as economy, politics, geography, and religion. Of central importance to the sociology of literature in France and elsewhere is the work of Bourdieu (1984). His concept of the literary field describes how literary evaluation is determined by sociological factors rather than properties of the aesthetic objects. All agents (artists, critics, art consumers) strive for symbolic value (social approval, status) to obtain, in the end, economic power. Lifestyle (which includes reading behavior) is conceived here as a means for social groups to distinguish themselves from others and to affirm group membership. Yet another origin of ESL can be found in psycholinguistic studies, mostly conducted in North America. A number of researchers focused on the processing of narratives, some of them looking back at the work of Bartlett. Of importance to this field of research is schema theory, schemas being memory structures abstracted from idiosyncratic experiences. A large body of research concentrates on how readers build ‘situation models,’ that is, cognitive representations of the events, actions, characters, themes, and authorial intentions (van Dijk and Kintsch, 1983). Another root of ESL can be found in America’s reader response movement. As a result of democratization in education (and elsewhere in society), educationalists shifted their attention from a top-down approach of teaching how to arrive at correct interpretations of literary texts to an interest in what literature meant to students personally. Important in this movement is the work by Rosenblatt (1938), who focused on the influence of literary socialization on reading. 
Some of the research in this field is of a psychoanalytic nature and looks at how readers search for identity themes in the literary texts they read (Holland, 1975).

Mapping the Field
The origin of the research methods applied in ESL may lead us to distinguish two general directions: the psychology and the sociology of literature. However, such labels do not adequately represent recent developments in ESL, nor the research objects themselves. As to the latter, the various agents and processes in literary communication are affected by both psychological and sociological factors. Reception research focuses on readers’ aesthetic evaluation, their reading


motivation, and the effect reading has on them. Although this field is dominated by psychological approaches, the processes involved are clearly also influenced by sociological factors (e.g., the institutional context in which readers read, and readers’ literary socialization). In the sociology of literature researchers seek to explain differences in reading behaviors of social groups, and how institutions such as literary criticism are governed by social factors (e.g., a desire for status or social approval). However, exclusively relying on sociological models to explain, for example, aesthetic preferences ignores too much research showing the effect of text properties. Hence the call of some researchers for an integrated approach, which does seem to be reflected in recent developments in ESL (Andringa, 1998). With overlapping interests, different methodologies and disciplines, and differences in the precise definition of the research object, the field has become too complex for a clear-cut distinction between the psychology and sociology of literature; therefore the present overview attempts an alternative categorization. First, research pertaining to production and distribution will be introduced. This work is mainly sociological, focusing, for instance, on economic conditions for literary production and on the influence publishers have on literary communication. Here we will also discuss studies exploring the market for literature: why do some people buy literary books, while others do not? In the next section, research pertaining to reception or processing will be presented, including a description of cognitive studies of narrative processing in general, as well as of research specifically pertaining to literary narratives and other literary genres. The final section of this alternative categorization gives an impression of post-processing research – studies of reader behavior after reading the text. 
Here we find research pertaining to canonization, reception in literary criticism, and the effects of reading literature on readers’ behavior, attitudes, norms, and values.
Production and Mediation

Strong evidence shows that socioeconomic factors influence developments in literature. Janssen (2001) reviews the research concerning the social-economic conditions for authors to do their work. Central to many of these studies is the term ‘gatekeepers’: agents in the literary field who are involved in selection (e.g., of which texts are actually published). Examples of research subjects are: the effect of censorship on the creative process; the effect mediators have on literary production, for instance, when looking at translations of Third World literature into modern Western languages; and factors influencing authors’

careers. These are important efforts that could help to understand processes studied by literary historians (e.g., canonization). One example is presented in Peterson (1985). He shows how factors like copyright, technology, industry structure, and market affected the rise and decline of one particular genre, the short story in the United States. Psychological studies of production processes seem somewhat underrepresented. Much work, it seems, still needs to be done to examine processes of literary creativity (Simonton, 1984). Research often involves self-reports of writers. Other research compares creative and less creative persons to determine the factors that covary with creativity, both in personality (e.g., willingness to delay a decision, tolerance for uncertainty), and in biology. Under this subheading we could also place quantitative (rather than psychoanalytic) studies concerning the relation between, on the one hand, authors’ personalities and events in their lives, and, on the other hand, the stylistic aspects of their texts (Janssen, 2001). Distribution research concerns the factors that determine what books people buy or borrow from the library. Most studies focus on socioeconomic factors in reading behavior, the lion’s share being inspired by Bourdieu’s (1984) theoretical framework. In his view, aesthetic preferences have a social function. Like other lifestyle components, they are ways to affirm group membership. In understanding the dynamics of the literary field, researchers look at status: the more status agents have accumulated, the more their evaluation will be considered legitimate and the more influence they will have on canonization. Also, in striving to obtain status, agents deny their interest in economic capital. Only this will result in symbolic capital, which will, in turn, lead to economic capital. 
These ideas have had a strong impact on what is called the institutional approach in ESL, which looks at the influence of publishers, literary criticism, and literary education on aspects of literary communication (Griswold, Janssen, and van Rees, 1999). Recent studies present a more complex picture of reading audiences than Bourdieu’s distinction theory predicts. Peterson’s (1992) account of cultural stratification reveals there is no empirical evidence for a clear distinction between elite and nonelite culture consumption patterns. Instead, what we see is an ‘omnivore’ audience that is involved in a wide range of cultural activities (including the elite arts), while a mainly low-educational ‘univore’ audience appreciates only a few nonelitist cultural activities. Other research shows that reading behavior can indeed be predicted by readers’ social networks, but also by their cultural competence, personality, behavioral beliefs about the rewards of reading literature,


norms related to literary reading, and the promotion of reading by parents, schools, etc. (Griswold, Janssen, and van Rees, 1999).
Historical Readers
An important subcategory of production and distribution research concerns the behavior of historical readers. These studies allow us to put current changes in reading audiences into perspective. Moreover, the history of reading may contribute to the contextualization of literary history. Literary history seems incomplete without knowledge of who the readers of the literary texts were, for whom the texts were written, what their expectations were, and how aesthetic norms changed through time and how such change affected reception. Research in this field is sometimes based on systematic content analysis of newspaper or magazine reviews. Some studies are based on sales records. One has to be careful interpreting these often incomplete data. Also, the data do not show whether the books were actually read. Taking these drawbacks into account, some interesting studies have been carried out about production and distribution in the past. For example, Kloek and Mijnhardt (1993) show that the so-called reading revolution in the 18th century is probably nothing more than a myth.
Reception

Under the heading of reception we will briefly review research on the processing of narratives in general and literary texts in particular, focusing on studies pertaining to processes that occur during and directly after reading. An impressive body of cognitive research on discourse processing contributes to our understanding of how narratives are processed (e.g., Emmott, 1997). Of course, not all narratives belong to the group of texts referred to as literature, and not all literature is written in the form of narratives. Nevertheless, research results have been successfully applied in literary studies (for instance in cognitive stylistics, Semino and Culpeper, 2002). Central in many studies is schema theory, which proposes that schemata are essential to narrative comprehension processes and recall. These insights have proven to be useful in understanding, for instance, the role of genres in literary processing. Other relevant examples of research issues are: the processing and accessibility in memory of information about characters; how readers keep track of speakers, of who said what, and who knows what; the causal inferences that readers make during reading; the conditions under which predictive inferences are made; inferences concerning the ‘aboutness’ or the theme of the text; the gender of the author;

his or her intentions (e.g., Magliano, Baggett, and Graesser, 1996; Louwerse and van Peer, 2002). Cognitive psychology dominates this line of research. However, a growing number of studies investigate the role of affect (e.g., the effect of empathy on narrative comprehension, or factors generating participatory responses; Gerrig, 1993). Important for ESL is the relation of findings in narrative processing research to the processing of literary texts. Many of the studies mentioned above focus on factors that facilitate comprehension, causal sequence reconstruction, understanding character goals, etc. Van den Broek, Rohleder, and Narváez (1996), for example, proposed a model for ‘successful reading of literary texts’ that suggests that much depends on the construction of a coherent mental representation. This emphasis may lead researchers to ignore what is possibly typical for processing literature, namely those aspects that obstruct readers’ understanding (see Foregrounding). Neither, of course, do the narrative models take response to poetry into account. Nevertheless, as van den Broek et al. suggested, the research does seem a useful starting point to examine deviations of literary narratives from more accommodating ones, thus discovering where literary processing differs from processing of other discourses. One complaint concerning the studies referred to above is that they often look at seemingly insignificant ‘miniature’ hypotheses. Consequently, the results seem to be irrelevant to the bigger issues that would grant ESL more relevance within the humanities (Ibsch, 1996). Zwaan (1993) is arguably an exemplary study for ESL in the sense that he informs debates central to literary studies. His findings point out that readers process literary texts in a different way than other text genres. Thinking that they are presented with a literary text, readers read more slowly and remember more of the surface structure. 
This is consistent with what some literary theorists predict (e.g., Shklovsky, Jakobson). Studies in narrative processing may not always seem of direct relevance to a better understanding of literary communication, but it should be noted that this is often not even the primary interest of the researchers, let alone that they claim to have solved the riddles of literary studies. Cognitive stylistics, however, explicitly aims at examining (literary) style and the effects it has on readers (e.g., Semino and Culpeper, 2002). While only a decade ago it may still have been feasible to summarize developments in this field, it is now not even possible to enumerate all the subjects of interest. Here we will therefore only look at a few examples. One is the field of metaphor research, especially those studies that pay attention


to the specific role of metaphors in literature (e.g., Steen, 1994; Hoorn, 1997). A second field pertains to the effect of foregrounding, a term used to describe where literary style deviates from daily language (for more empirical studies, see Foregrounding). A third group of studies examines the effect of sound on reader response. Two approaches can be distinguished here: first, those studies that focus directly on readers’ responses to sounds, either relative to context or not; second, studies based on computer-assisted content analysis (see, for a discussion, Miall, 2001). A fourth group of studies, the lion’s share of cognitive stylistics, can be labeled as psychonarratology, a term coined by Bortolussi and Dixon (2003). Methods of cognitive psychology are applied here to test empirical assumptions of narratology, the text-oriented study of narratives. Examples of research subjects are: the effect of narrative structure on feelings of surprise, suspense, and curiosity (Brewer and Lichtenstein, 1982; Vorderer, 1996); the effect of narrative perspective (Van Peer and Chatman, 2001); the processing of information about story characters; and how findings of social cognition research apply to perception of characters (Bortolussi and Dixon, 2003).
Differences in Reader Response
A central question in reception research is (or rather, should be) ‘How do differences in response to literature come about?’ Are there general patterns or regularities in what moves readers, for instance, to feel sympathy for fictional characters? This is mainly the domain of psychology. Miall and Kuiken (1995) produced a Literary Response Questionnaire (LRQ) that was shown to be a useful instrument to differentiate between readers’ attitudes toward literature. Studies under this subheading pertain to developmental differences, gender differences, and differences between expert and novice readers. 
In this last group, some studies reveal similarity in responses to textual features; others find that aesthetic norms and interpretation strategies depend on readers’ literary competence. To make research findings in this domain more comparable, future research should more carefully describe ‘reading experience’ or ‘literary competence’ on the one hand, and text features on the other hand. Some research focuses on cross-cultural differences. Generally these studies show mixed results: some do find differences between readers from different cultures, but the majority point to similarities. An important research tradition that may help understand differences in aesthetic evaluation can be found in experimental aesthetics (Berlyne, 1974). Berlyne’s influential theory about the processing of

aesthetic information conceptualizes art objects as psychological stimuli, some more complex, novel, or ambitious than others. The theory proposes that the more complex the stimulus, the more capacity is required from the perceiver (or reader) to process the stimulus. Exploring the relation between stimulus properties and aesthetic pleasure, the theory suggests an inverted U-curve (itself an instantiation of the Wundt curve): appreciation will increase with increasing complexity, but only up to a certain point; after that optimum, appreciation will decline. Research in this field is often experimental. It sometimes involves physiological measures (of arousal), the interpretation of which is not always unambiguous. However, the results have been replicated under many different circumstances, using a wide variety of measures and stimuli.
Post-processing

Post-processing research mainly pertains to literary criticism, but also to the effects that reading literature has on the reader. The first category is dominated by the institutional approach. Some researchers look at the social dynamics in literary criticism to examine how consensus is reached over the value of literary texts. Findings suggest that the relative status of critics affects their power over this evaluation process (Griswold, Janssen, and van Rees, 1999). The second category mainly consists of psychological and educational studies. Researchers have concentrated on the effects on norms, values, attitudes, and empathic ability (Hakemulder, 2000). Most of these studies do show effects, but few allow claims about the specific effects of literary communication. Some studies focus on literary education and its effect on literary socialization, on reading behavior at a later age, and on attitudinal effects of different educational approaches to literary education.

Current Trends and Debates
One of the important trends in the relatively short history of ESL is a shift from theoretical concerns (e.g., reflection on the epistemological foundations of the new discipline) to a predominantly research-driven approach, a development toward normal science in the Kuhnian sense (Steen, 2003). ESL has met with strong criticism from traditional hermeneutics, mainly focused on the superficiality of some of the research questions and the lack of external validity of laboratory experiments, but also fed by an indifference among scholars toward empirical validation and toward (naive) readers’ interpretations of complex literary texts (e.g., Sternberg, 2003).


Some of these objections were also raised within ESL itself, and some important efforts have been made to remedy these problems. For example, more and more studies work with literary texts rather than experimenter-generated ones, which has resulted in an increase in external validity. Careful manipulation of literary texts can sometimes compensate for the resulting loss of control. It may be advisable to conduct an evaluation of this method, after some time has passed, to see whether it is not systematically the manipulated version that has less effect (e.g., on aesthetic pleasure) than the originals. One problem with much of the available research in ESL is that the population under investigation is rather limited (often undergraduate psychology students at some American university). However, some look at other age groups, and others take a cross-cultural perspective on response to literature. This will eventually result in greater generalizability of the findings. ESL has seen a shift from radical standpoints (e.g., Schmidt’s radical constructivism) to more integrated models (e.g., Bortolussi and Dixon, 2003). One single approach clearly cannot explain all the phenomena in literary communication, and therefore researchers need to take many factors into account to generate more comprehensive models. Accounting for institutions, or more generally the context in which people read, is essential when explaining literary evaluation. However, psychological research focusing on the effects of text properties leaves no room for one-sided absolutism. Another development in ESL is a growing attention to the role of emotions in the reception of literary texts (Radway, 1984; Nell, 1988; Frijda and Schram, 1994). This seems to go hand in hand with the use of qualitative research methods. 
To assess spontaneous imagery and emotional responses, some researchers prefer free response, self-report, and think-aloud procedures rather than measures such as response time and closed questionnaires (e.g., Andringa, 2004). A trend we can detect in ESL is that more and more researchers do not look at literary communication in isolation anymore, but put it in a wider context of media communication. Konijn and Hoorn (2005), for instance, have developed a model for perceiving and experiencing fictional characters for both readers and spectators of movies. This is also reflected in the work of such researchers as Vorderer (1996) and Schreier (Schreier, Knobloch, and Wieler, 1998). ESL has seen boosts in its development every time researchers of different disciplines are brought together in joint projects, be it for conferences of the International Society for Empirical Studies of Literature and Media (Internationale Gesellschaft für

Empirische Literaturwissenschaft, IGEL), or of the Poetics and Linguistics Association (PALA), book publications, or thematic issues of journals such as Poetics and SPIEL. Hopefully the future will see more of these fusions of literary studies and the social sciences. See also: Discourse Processing; Emotion: Stylistic Approaches; Foregrounding; Humor in Language; Ingarden, Roman (1893–1970); Literary Pragmatics; Literary Theory and Stylistics; Literature and the Language of Literature; Metaphor: Psychological Aspects; Metaphor: Stylistic Approaches; Narrative: Cognitive Approaches; Reading Processes in Adults; Reading Processes in Children; Schema Theory: Stylistic Applications; Shklovsky, Viktor Borisovich (1893–1984); Stylistics; Stylistics, Cognitive; Thematics.

Bibliography

Andringa E (1998). ‘The empirical study of literature: its development and future.’ In Janssen S & Van Dijk N (eds.) The empirical study of literature and the media; current approaches and perspectives. Rotterdam: Waalwijk van Doorn. 12–23.
Andringa E (2004). ‘The interface between fiction and life: patterns of identification in reading autobiographies.’ Poetics Today 25(2), 205–240.
Berlyne D E (1974). Studies in the new experimental aesthetics. New York: Wiley.
Bortolussi M & Dixon P (2003). Psychonarratology: foundations for the empirical study of literary response. Cambridge: Cambridge University Press.
Bourdieu P (1984). Distinction: a social critique of the judgment of taste. London: Routledge.
Brewer W F & Lichtenstein E H (1982). ‘Stories are to entertain: a structural-affect theory of stories.’ Journal of Pragmatics 6, 473–486.
Emmott C (1997). Narrative comprehension: a discourse perspective. Oxford: Clarendon Press.
Frijda N & Schram D (1994). ‘Emotions and cultural products.’ Special issue of Poetics 23.
Gerrig R J (1993). Experiencing narrative worlds. New Haven: Yale University Press.
Griswold W, Janssen S & van Rees K (1999). ‘Conditions of cultural production and reception.’ Special issue of Poetics 26.
Groeben N (1977). Rezeptionsforschung als empirische Literaturwissenschaft. Kronberg: Gunter Narr.
Hakemulder J (2000). The moral laboratory: experiments examining the effects of reading literature on social perception and moral self-concept. Amsterdam: Benjamins.
Holland N (1975). 5 Readers reading. New Haven: Yale University Press.
Hoorn J (1997). ‘Electronic evidence for the anomaly theory of metaphor processing: a brief introduction.’ In Tötösy de Zepetnek S & Sywensky I (eds.) The systematic and empirical approach to literature and culture as theory and application. Edmonton: University of Alberta. 67–74.
Ibsch E (1996). ‘The strained relationship between the empiricist’s notion of validity and the hermeneutician’s notion of relevance.’ In Kreuz R & MacNealy M S (eds.). 23–33.
Janssen S (2001). ‘The empirical study of careers in literature and the arts.’ In Schram D & Steen G (eds.) The psychology and sociology of literature. Amsterdam: Benjamins. 323–358.
Kloek J J & Mijnhardt W W (1993). ‘The ability to select: the growth of the reading public and the problem of literary socialization in the eighteenth and nineteenth centuries.’ In Rigney A & Fokkema D W (eds.) Cultural participation: trends since the Middle Ages. Amsterdam: Benjamins. 51–62.
Konijn E & Hoorn J (2005). ‘Some like it bad: testing a model for perceiving and experiencing fictional characters.’ Media Psychology 7(2), 107–144.
Kreuz R & MacNealy M S (eds.) (1996). Empirical approaches to literature and aesthetics. Norwood, NJ: Ablex.
Louwerse M M & van Peer W (eds.) (2002). Thematics: interdisciplinary studies. Amsterdam: Benjamins.
Magliano J P, Baggett W B & Graesser A C (1996). ‘A taxonomy of inference categories that may be generated during the comprehension of literary texts.’ In Kreuz R & MacNealy M S (eds.). 201–220.
Miall D S (2001). ‘Sound of contrast: an empirical approach to phonemic iconicity.’ Poetics 29, 55–70.
Miall D S & Kuiken D (1995). ‘Aspects of literary response: a new questionnaire.’ Research in the teaching of English 29, 389–407.
Mukařovský J (1974). Kapitel aus der Ästhetik. Frankfurt: Suhrkamp.
Nell V (1988). Lost in a book: the psychology of reading for pleasure. New Haven: Yale University Press.
Peterson R A (1985). ‘Six constraints on the production of literary works.’ Poetics 14, 45–67.
Peterson R A (1992). ‘Understanding audience segmentation: from elite to omnivore and univore.’ Poetics 21, 243–258.
Radway J A (1984). Reading the romance: women, patriarchy, and popular literature. Chapel Hill: University of North Carolina Press.
Rosenblatt L M (1938). Literature as exploration. New York: Appleton.
Schmidt S J (1982). Foundations for the empirical study of literature. Hamburg: Buske.
Schreier M, Knobloch S & Wieler P (eds.) (1998). ‘Media, literature, and socialization.’ Special issue of SPIEL 17.
Semino E & Culpeper J (2002). Cognitive stylistics: language and cognition in text analysis. Amsterdam: Benjamins.
Simonton D K (1984). Genius, creativity, and leadership: historiometric inquiries. Cambridge, MA: Harvard University Press.
Steen G (1994). Understanding metaphor in literature: an empirical approach. London: Longman.
Steen G (2003). ‘A historical view of empirical poetics: trends and possibilities.’ Empirical Studies of the Arts 21(1), 51–67.
Sternberg M (2003). ‘Universals of narrative and their cognitivist fortunes.’ Poetics Today 24(2), 297–395.
van den Broek P, Rohleder L & Narváez D (1996). ‘Causal inferences in the comprehension of literary texts.’ In Kreuz R & MacNealy M S (eds.). 179–200.
van Dijk T A & Kintsch W (1983). Strategies of discourse comprehension. New York: Academic Press.
van Peer W & Chatman S (eds.) (2001). New perspectives on narrative perspective. Albany: State University of New York Press.
Vorderer P (1996). Suspense: conceptualizations, theoretical analysis, and empirical explorations. Mahwah, NJ: Erlbaum.
Zwaan R A (1993). Aspects of literary comprehension. Amsterdam: Benjamins.

Lithuania: Language Situation
M Ramonienė, Vilnius University, Vilnius, Lithuania
© 2006 Elsevier Ltd. All rights reserved.

The Republic of Lithuania is located on the eastern coast of the Baltic Sea. Lithuania borders Latvia, Belarus, Poland, and the Kaliningrad district of the Russian Federation. The country covers 65 300 km². Lithuania has a population of 3.48 million, of whom 66.9% reside in urban areas. According to the 2001 census, Lithuania is home to 115 ethnicities: Lithuanians make up 83.5%, Poles 6.7%, Russians 6.3%, and the others make up the remaining 3.5% of the population.

Lithuanian is the only state language and is used in all spheres of life. Its status as a state language is legitimized by the constitution of the Republic of Lithuania. Its public use is regulated by the Law on the State Language, adopted in 1995. Lithuanian is the mother tongue of the majority of the Lithuanian population. It is used as a second language by 356 000 non-Lithuanians, most of whom reside in southeastern Lithuania and a few cities. Lithuanian is one of the two Baltic languages of the Indo-European family (the other being Latvian). It is one of the oldest living Indo-European languages, and has retained many archaic linguistic features also characteristic of Latin and Sanskrit. The Lithuanian
alphabet has the Latin alphabet as its basis. It consists of 32 letters, some of which have certain diacritical marks to indicate special sounds. Lithuanian is an inflectional language. The noun has seven cases, and the verb has a particularly complex system of inflections. The stress is free.

Standard Lithuanian is about 100 years old, which makes it a relatively young language. It is used in all spheres of public life, education, and the media. In informal and semiformal contexts, regional and urban dialects are used. Two major dialects are distinguished: Aukštaičių (the Highlanders’) and Žemaičių (the Lowlanders’). Standard Lithuanian was formed on the basis of the Western Highlanders’ dialect.

Russian is widely known in Lithuania: 60% of the Lithuanian population use it as a second language. In Soviet times, before the restoration of independence in 1990, asymmetrical bilingualism was dominant: most Lithuanians could speak Russian, whereas most Russian speakers could not speak Lithuanian. Russian is the mother tongue of 89.2% of Russians. Many Belarusians, Ukrainians, Jews, Poles, and others also consider it their mother tongue. At present, many Russian speakers are learning Lithuanian and becoming bilingual.

Polish is used as the mother tongue by 80% of Poles, most of whom reside in the capital, Vilnius, and its region. It is a local spoken variety different from standard Polish. A spoken variety of the Belarusian vernacular is used in rural southeastern Lithuania. Ukrainian, Tatar, Yiddish (Western Yiddish), Latvian, German (Standard German), and Romani are used as home languages.

Lithuania’s major ethnic minorities have state-run schools with their ethnic language as the language of instruction and Lithuanian taught as a subject. Tertiary-level state education is offered in Lithuanian only. Ethnic minorities have TV and radio programs, newspapers, and magazines. The most common foreign languages known by people in Lithuania are English (17%), Polish (9%), German (8%), and French (2%).

See also: Balto-Slavic Languages; Bilingualism; Lithuanian; Minorities and Language.

Lithuanian
S Young, University of Maryland Baltimore County, Baltimore, MD, USA
© 2006 Elsevier Ltd. All rights reserved.

Lithuanian (lietùvių kalbà) is the native language of some 2.9 million speakers in the Republic of Lithuania. Together with Latvian, it forms East Baltic, the sole remaining branch of the Baltic family of Indo-European languages. There are two major dialects of Lithuanian, the more conservative and territorially greater Aukštaitic (aukštaičių tarmė) and the more innovating Žemaitic (žemaičių tarmė; Samogitian), spoken in the northwest quarter of Lithuania. The standard language is based on the speech of the southwest Aukštaitic region, bordering former East Prussia.

The Written Language

The Writing Tradition in East Prussia

Lithuanian is attested in written form from the 16th century, in three varieties of the Aukštaitic dialect; the earlier texts are chiefly translated and original religious literature. Book publication in Lithuanian began earliest in German East Prussia (which had a substantial Lithuanian population) in connection with the spread of the Reformation. The first work published in Lithuanian is a 1547 translation of a Lutheran
catechism by Martynas Mažvydas (Martinus Masvidius). The foreword begins with a personal appeal to the reader, Bralei seseris imkiet mani ir skaitikiet ‘Brothers, sisters, take me and read me’. The language reflects Mažvydas’s native south Žemaitic dialect, with Aukštaitic elements. Subsequent Lithuanian publications in East Prussia are written in an increasingly normalized variety of the local west Aukštaitic dialect, codified in Daniel Klein’s 1653 Grammatica Litvanica, the first grammar of Lithuanian.

The Writing Tradition in the Grand Duchy

In the Catholic Grand Duchy of Lithuania, two writing traditions took root, one based on the East Aukštaitic dialect of the capital, Vilnius, and the other representing the Central Aukštaitic dialect of the Kėdainiai area. The latter served as the medium for the earliest Lithuanian publications in the Grand Duchy, i.e., Mykalojus Daukša’s 1595 translation of Jacobus Ledisma’s popular Catholic catechism and his lengthy 1599 translation of Jakub Wujek’s collection of sermons, the Postilla Catholicka. Although these were translations, the language of these works is relatively natural and had considerable influence on the later cultivation of Lithuanian. Daukša’s works are also the first accented texts in Lithuanian, and as such are of particular importance for the study of the historical prosody of the language.

The National Standard Language

The increasing polonization of the Grand Duchy’s nobility and educated classes led in the 18th century to a decline in the Central and East Aukštaitic writing traditions (the latter eventually disappeared). The present-day standard language has its roots in the late 19th century, and is based on the dialect of the southern West Aukštaitic region, in which the speakers are traditionally called suvalkiečiai. Several factors stand out in the establishment of this dialect as the national standard: the prior literary tradition of the virtually identical Aukštaitic dialect of neighboring East Prussia; the authority of the 19th-century Lithuanian grammars of A. Schleicher and F. Kurschat, which described the same Prussian Lithuanian speech; and the normative influence of late 19th- to early 20th-century newspapers such as Aušra (The Dawn) and Varpas (The Bell), which had many writers and editors (in particular Jonas Jablonskis) who came from the southwest Aukštaitic dialect area.

Phonology

Prosodic Features

Standard Lithuanian has free stress, which may alternate between a stem and ending within a grammatical paradigm, as in dukrà (nominative), dùkrą (accusative) ‘daughter’; sakaũ ‘I say’, and sãko ‘he says’. There are four such stress patterns for nouns and two for verbs. Stressed long vowels and diphthongs (including sequences of vowel plus tautosyllabic resonant) distinguish two phonemic contour tones, traditionally referred to as acute (´) and circumflex (˜), as in šáuk! ‘shoot.IMP’ vs. šaũk! ‘shout.IMP’. Short stressed vowels are marked with a grave accent (`). The tones are conventionally indicated in dictionaries and linguistic works; otherwise, they are not represented. According to the norms of the standard language, acute tone (tvirtaprãdė príegaidė) is realized with a falling tonal contour, whereas circumflex tone (tvirtagãlė príegaidė) is level or rising. The tonal opposition is clearest on diphthongs; in the urban colloquial language, the distinction is becoming neutralized on long vowels. The phonetic realization of the two tones differs dialectally; in particular, the acute tone of northwest Žemaitic speech incorporates a glottal stop (laužtìnė príegaidė ‘broken tone’).

The Vowel System

Vowel length is distinctive in Lithuanian. The rather open short vowel phonemes /i [ɪ], u [ʊ], e [ɛ], a [a]/ (orthographically i, u, e, a) are inherited from Proto-Baltic and also result from an early Lithuanian shortening of final long vowels under acute tone (compare tà ‘this.NOM SG FEM’ with Latvian tā̃, having the Latvian reflex of acute). In addition, a short /ɔ/ (spelled o) is found in words of foreign origin.

The long vowel phonemes /iː, uː, eː, oː, æː, ɑː/ (orthographically y/į, ū/ų, ė, o, ę, ą) also have two sources. Inherited length is represented by the spellings y, ū, ė, o [< *ā], as in gývas (*gī-) ‘alive’, bū́ti (*bū-) ‘to be’, sė́ti (*sē-) ‘to sow’, and brólis (Latvian brãlis) ‘brother’, whereas į, ų, ę, and ą develop from sequences of vowel plus tautosyllabic n, when not before a stop (where they are preserved). Original V + n sequences were first replaced by long nasalized vowels, marked in the earlier texts by a hook under the corresponding vowel graph. These vowels were eventually denasalized, although the orthography still reflects the earlier practice, as in į̃ [iː] (< *in) ‘to, into’, sių̃sti [ˈsʲū̃stʲɪ] (< *siuñt-) ‘to send’, tę̃sti [ˈtʲæ̃ːstʲɪ] (< *teñs-) ‘to continue’, and žąsìs [žɑːˈsʲɪs] (< *žañs-) ‘goose’.

Both long and short a are fronted to /æː/ and /ɛ/, respectively, after a palatalized consonant or j, as in gìlią ‘deep.ACC SG FEM’ = gìlę ‘acorn.ACC SG FEM’, both [ˈgʲɪlʲæ̃ː]; and giliàs ‘deep.ACC PL FEM’ = gilès ‘acorns.ACC PL FEM’, both [gʲɪˈlʲɛs]. Short e and a are automatically lengthened under stress in most nonfinal syllables to [æː] and [ɑː], with concomitant circumflex tone. This phonetic vowel length is not indicated orthographically, e.g., ledas [ˈlæ̃ːdas] ‘ice’ and vakaras [ˈvãːkaras] ‘evening’ (compare Latvian ledus and vakars, having short e and a).

Also included in the inventory of Lithuanian vowels are the diphthongs ie and uo, which arose from East Baltic *ẹ̄ (< *ei) and *ọ̄ (< *ō). These diphthongs, which function as long vowels, begin with a high vowel and end with a lower, more central vowel ( , ), as in dienà (*dein-) ‘day’ and dúona (*dōn-) ‘bread’, phonetically [d ˈna] and [ˈd na].

The Consonant System

Lithuanian alone among the Baltic languages preserves distinct reflexes (ʃ and ʒ) of the Indo-European palatovelars; in Latvian and Old Prussian as well as in Slavic, these have merged with s and z, as in šuõ ‘dog’ and žemė ‘earth’ (Latvian suns and zeme). A characteristic feature of the Lithuanian consonant system is the phonemic opposition of palatalized and nonpalatalized consonants before back vowels (palatalization is automatic before front vowels). These palatalized consonants are the product of Baltic consonant + j sequences, which are still preserved in the case of labial stops in word-initial position, as in pjáuti [ˈpʲjǽutʲɪ] ‘to cut’ and bjaurùs [bʲjɛuˈrʊs] ‘ugly’. Earlier sequences of dental stop + j have developed into the affricates [tʃ] and [dʒ], orthographically č(i) and dž(i), as in čià (*tja) ‘here’ and mẽdžias (*medjas) ‘woods’. The distinctive palatalization of the remaining consonants is marked orthographically by a following i, as in siū́ti [ˈsʲū́ːtʲɪ] ‘to sew’ and liáudis [ˈlʲáudʲɪs] ‘people’.

Morphosyntactic Features

The Noun

Lithuanian has preserved the Baltic stem classes and their declensional endings rather well, giving the noun a relatively archaic appearance. Six case forms are distinguished in the singular and plural (a dual is attested for certain case forms in dialects and older texts): (1) nominative (for example, in the singular, the o-stem nãmas ‘house’, ā-stem rankà ‘hand’, i-stem akìs ‘eye’, and u-stem sūnùs ‘son’), (2) genitive (nãmo, rañkos, akiẽs, and sūnaũs), (3) dative (nãmui, rañkai, ãkiai, and sū́nui), (4) accusative (nãmą, rañką, ãkį, and sū́nų), (5) instrumental (namù, rankà, akimì, and sūnumì), and (6) locative (namè, rankojè, akyjè, and sūnujè). In addition, a special vocative form is used in the singular, as in Jõnai! (nominative Jõnas ‘John’) and Birùte! (nominative Birùtė).

The adnominal genitive is typically preposed, as in tėvo nãmas ‘of-father house’ (‘father’s house’) and lietùvių kalbà ‘of-Lithuanians language’ (‘Lithuanian language’). The genitive also occurs in partitive expressions, both positive, as in miškè yrà vilkų̃ ‘in-the-woods there-are wolves (genitive)’, and negative, as in nėrà žuviẽs ‘there-is-no fish (genitive)’. The locative case is used without a preposition, as in Vìlniuje ‘in Vilnius’; historically this represents an inessive, the remnant of a more complex system of local cases formed with postpositions. These cases included an adessive, illative, and allative, some of which (particularly the illative) are still found dialectally.

Nouns are marked for gender (masculine and feminine) through distinctive desinences and adjectival concord, as in gẽras tė́vas ‘good father’ and gerà mótina ‘good mother’. In an innovation shared with Latvian, Proto-Baltic neuter gender was lost; neuters typically became masculine, as in šiẽnas ‘hay’ (masc.) and Old Church Slavonic sěno (neut.). The category of definiteness is marked within adjectives by the historical affixation of a pronominal -j- element to the indefinite form, as in indefinite naũjas (masc.), naujà (fem.) ‘new’ vs. definite naujàsis (i.e., naujas + jis) (masc.), naujóji (fem.).

The Verb

The Lithuanian verb marks present, past, and future tense forms. The present tense has three conjugation patterns, illustrated by lìpti ‘to climb’ (first conjugation, stem in a), mylė́ti ‘to love’ (second conjugation, stem in i), and skaitýti ‘to read’ (third conjugation, stem in o): 1SG (aš) lipù, mýliu, skaitaũ; 1PL (mes) lìpame, mýlime, skaĩtome; 2SG (tu) lipì, mýli, skaitaĩ; 2PL (jūs) lìpate, mýlite, skaĩtote; 3SG/PL (jis/jie) lìpa, mýli, skaĩto. The past tense has two patterns, illustrated by lìpti ‘to climb’ (stem in o) and skaitýti ‘to read’ (stem in ė): 1SG (aš) lipaũ, skaičiaũ; 1PL (mes) lìpome, skaĩtėme; 2SG (tu) lipaĩ, skaiteĩ; 2PL (jūs) lìpote, skaĩtėte; and 3SG/PL (jis/jie) lìpo, skaĩtė. A frequentative past is formed by adding the suffix -dav- (plus o-stem endings) to the infinitive stem, as in jis skaitýdavo ‘he used to read, would read’ (skaitýti ‘to read’). The future is formed by adding -s- and the present-tense person endings to the infinitive stem, as with lìpti ‘to climb’: 1SG (aš) lìpsiu, 1PL (mes) lìpsime, 2SG (tu) lìpsi, 2PL (jūs) lìpsite, and 3SG/PL (jis, jie) lìps. As the various examples demonstrate, number is not marked in the third person, a characteristic feature of Baltic.

Lithuanian shows a fondness for participles and gerunds, in both colloquial and written styles. Among the more typical participles (which decline like adjectives) are the present active (rašą̃s, stem rãšant-, infinitive rašýti ‘to write’), past active ([pa]rãšęs, stem [pa]rãšius-), present passive (rãšomas), and past passive (parašýtas). The past active participle is used, together with a finite form of the verb ‘to be,’ to form a system of perfect tenses, as in aš esù (pa)rãšęs (masc.)/(pa)rãšiusi (fem.) ‘I have written’ and aš buvaũ (pa)rãšęs (masc.)/(pa)rãšiusi (fem.) ‘I had written’. Passive constructions are formed with the verb ‘to be’ and the corresponding passive participle, as in knygà bùvo rãšoma ‘the book was being written’ and knygà bùvo parašýta ‘the book was/had been written’. The language retains a reflex of an earlier dative absolute in gerundive constructions such as sáulei tẽkant ‘as the sun is/was rising’ (‘to-the-sun rising’).
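The two suffixing patterns described for the verb (future in -s- and frequentative past in -dav-) are regular enough to sketch in a few lines of Python. This is only an illustration under simplifying assumptions: accent marks are omitted, the helper names are hypothetical, the endings are copied from the lìpti paradigms quoted in the text, and any changes to the stem or accent that real verbs may show are ignored.

```python
# Person/number endings copied from the paradigms quoted in the text
# (accent marks omitted; "3" covers both singular and plural).
FUTURE_ENDINGS = {"1SG": "siu", "2SG": "si", "3": "s", "1PL": "sime", "2PL": "site"}
O_STEM_PAST_ENDINGS = {"1SG": "au", "2SG": "ai", "3": "o", "1PL": "ome", "2PL": "ote"}


def infinitive_stem(infinitive: str) -> str:
    """Strip the infinitive ending -ti (hypothetical helper)."""
    if not infinitive.endswith("ti"):
        raise ValueError(f"expected an infinitive in -ti, got {infinitive!r}")
    return infinitive[:-2]


def future(infinitive: str) -> dict:
    """Infinitive stem + -s- + person ending (no stem adjustments modeled)."""
    stem = infinitive_stem(infinitive)
    return {slot: stem + ending for slot, ending in FUTURE_ENDINGS.items()}


def frequentative_past(infinitive: str) -> dict:
    """Infinitive stem + -dav- + o-stem past ending."""
    stem = infinitive_stem(infinitive) + "dav"
    return {slot: stem + ending for slot, ending in O_STEM_PAST_ENDINGS.items()}


print(future("lipti"))
print(frequentative_past("skaityti")["3"])
```

Because the third person has a single entry, the sketch also reflects the absence of number marking in that person.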

Lexicon Lithuanian has long felt the lexical influence of neighboring Slavic languages. Early East Slavic borrowings into Lithuanian include tur˜ gus ‘market’ and krı`ksˇ tas ‘baptism’ (Old Russian t]rg] ‘market’ and krmst] ‘cross’). Among East Slavic borrowings from the time of the Grand Duchy are knyga` ‘book’ and bly˜ nas ‘pancake’ (Russian knı´ga and blin). A significant number of Polish borrowings began to appear in Lithuanian in the 17th and 18th centuries, among them arbata` ‘tea’ and cu`krus ‘sugar’ (Polish herbata and cukier). Since the late 19th century, language reformers have succeeded in replacing many earlier borrowings with native words known from dialects

and Old Lithuanian texts; for example, the native laı˜kas ‘time’ was normalized in place of cˇ e˙˜ sas (Russian cˇ as), and pasa´ ule˙ ‘world’, in place of svı´etas (Russian svet). During this time, a number of neologisms took root, such as akiniaı˜ ‘eyeglasses’ (akı`s ‘eye’) and mokykla` ‘school’ (mok- ‘teach, learn’ þ -ykl- ‘place where’). See also: Balto-Slavic Languages; Latvian; Lithuania: Language Situation; Lithuanian Lexicography.

Bibliography

Ambrazas V (ed.) (1997). Lithuanian grammar. Vilnius: Baltos Lankos.
Dini P (2000). Baltų kalbos: lyginamoji istorija. Vilnius: Mokslo ir Enciklopedijų Leidybos Institutas.
Kabelka J (1982). Baltų filologijos įvadas. Vilnius: Mokslas.
Mathiassen T (1996). A short grammar of Lithuanian. Columbus, OH: Slavica.
Palionis J (1979). Lietuvių literatūrinės kalbos istorija. Vilnius: Mokslas.
Senn A (1966). Handbuch der litauischen Sprache (vol. 1): Grammatik. Heidelberg: C. Winter.
Zinkevičius Z (1966). Lietuvių dialektologija. Vilnius: Mintis.
Zinkevičius Z (1984–1995). Lietuvių kalbos istorija (6 vols.). Vilnius: Mokslas.

Lithuanian Lexicography
A Veisbergs, Riga, Latvia
© 2006 Elsevier Ltd. All rights reserved.

The first trilingual Polish–Latin–Lithuanian dictionary was Konstantinas Sirvydas’ Dictionarium trium linguarum (1620), which has been republished several times (most recently as Pirmasis, 1979). This was followed by several bidirectional Lithuanian–German dictionaries: Haack (1739), aimed at theology students, and Ruigys (1747), which drew on the vernacular language. The latter was further improved in 1800 by Milkus’ Littauisch-Deutsches und Deutsch-Littauisches Woerterbuch. The living Lithuanian vernacular was covered more fully in a trilingual explanatory dictionary by Juška (1897). Further progress was made with a card index initiated by Kazimieras Būga, starting in 1902. The task was continued from 1930 by an editorial board under Juozas Balčikonis and has gone on to the present day (with several changes of editorial policy); the index holds 5 million entries, including spoken language, dialects, and various styles. This citation-based work resulted in a 20-volume dictionary of the Lithuanian language (Lietuvių kalbos žodynas, 1968–2002) covering the years 1547–2001. Supplements are being compiled and an electronic version is being prepared. The Modern Lithuanian Dictionary (Dabartinės, 2002) also has an online version. A new dictionary of the standard Lithuanian language, descriptive and normative and intended for the general public, is being compiled. There is a good modern dictionary of synonyms (Sinonimų, 2002) and a large new dictionary of phraseology (Frazeologijos, 2001), as well as a dictionary of collocations (in electronic form only) based on a corpus of the contemporary Lithuanian language (100 million running words). Several corpora exist; one is generally accessible. Though no dictionary has been based on them, there is a 1-million-word morphologically annotated corpus and three parallel corpora of the same size for

three different pairs of languages: Czech, English, German, and Lithuanian. A typical Lithuanian dictionary is first and foremost a bilingual dictionary. The contemporary bilingual scene offers a great number of dictionaries covering a great number of languages: apart from the traditional ones, also Japanese, Italian, Spanish, Latin (Kuzavinis, 1996), and Polish (Vaitkevičiūtė, 1979). English–Lithuanian dictionaries started as émigré products. The most important are a large Lithuanian–English dictionary (Piesarskas, 2002) and a large two-volume English–Lithuanian dictionary (Piesarskas, 2004). The German connection is best covered by Križinauskas (2001, 2004), French by Juškienė (1997) and Karsavina (1992). There is a multitude of terminological dictionaries and glossaries (e.g., computing, textiles, agriculture, roads, hunting, sports, engineering, law, etc.). Twenty-one terminological dictionaries were scanned and supplied with a query system, thus forming an online terminological database. These are mostly multilingual (usually Lithuanian, Russian, or English) and unfortunately have only a small circulation. Parliament has an online EUROVOC dictionary, which, as stipulated by law, collects a record of all technical terminology. Other sources include dictionaries of verbal associations, nicknames, proper names, first names, place names, hydronyms, and mistakes; a reverse dictionary; dictionaries of several dialects, of accentuation, and of spelling and punctuation; a frequency dictionary (based on a 1-million-word corpus, annotated morphologically); and several contemporary dictionaries of foreign words. There is, however, no modern dictionary of etymology and no slang dictionary (a dictionary of curse words is under preparation). Summing up, it can be said that the Lithuanian dictionary scene is vibrant and unexpectedly well developed for a small nation of 3.3 million people.

See also: Bilingualism; German; Lexicography: Overview; Lithuanian; Lithuania: Language Situation; Phraseology.

Bibliography

(2002). Dabartinės lietuvių kalbos žodynas.
(2001). Frazeologijos žodynas.
Haack (Hakas) (1730). Vocabolarium Lithvanico-Germanicum und Germanico-Lithvanicum.
Juška A (1897–1904). Litovskij slovarj A. Juškeviča s perevodom slov na russkij i polskij jaziki.
Juškienė A, Katilienė M & Kaziūnienė M (1997). Prancūzų-lietuvių kalbų žodynas.
Karsavina I & Kairiukštytė S (1992). Lietuvių-prancūzų kalbų žodynas.
Križinauskas J A (2001). Vokiečių-lietuvių, lietuvių-vokiečių kalbų žodynas.
Križinauskas J A (2004). Lietuvių-vokiečių kalbų žodynas.
Kuzavinis K (1996). Lotynų-lietuvių kalbų žodynas.
Lietuvių kalbos žodynas (1941–2002).
Lyberis A (2002). Sinonimų žodynas.
Milkus (1800). Littauisch-Deutsches und Deutsch-Littauisches Woerterbuch.
Piesarskas B (2004). Dvitomis anglų-lietuvių kalbų žodynas.
Piesarskas B & Svecevičius B (2002). Naujasis lietuvių-anglų kalbų žodynas.
(1979). Pirmasis lietuvių kalbos žodynas.
Ruigys (1747). Littauisch-Deutsches und Deutsch-Littauisches Lexicon.
Vaitkevičiūtė V (2001). Lenkų-lietuvių kalbų žodynas.

Relevant Websites

http://www.autoinfa.lt/webdic/ – Modern Lithuanian Dictionary (Dabartinės, 2002) online version.
http://www.donelaitis.vdu.ltu.
http://www.terminynas.lt – Online terminological database.
http://www3.lrs.lt/DPaieska.html – Online EUROVOC dictionary.

Liu Xie (ca. 500)
Y Gu, Chinese Academy of Social Sciences, Beijing, China
© 2006 Elsevier Ltd. All rights reserved.

Liu Xie (ca. 465–522 A.D.) was a native of what is now Lü County, Shandong Province. Tradition says that he was a descendant of the Lord of Qi, Liu Fei, who was a son of Liu Bang, the first emperor of the Former Han dynasty (206 B.C.–25 A.D.). It is, however, certain that his father, Liu Shang, was a ranking officer in the military who died when Liu Xie was still young. Liu Xie was reared in poverty by his mother. It was said that he was so poor that he never married. His bachelor life was perhaps also due to his 10 years of close association with a monk, who was not only

versed in Buddhism, but also in Confucian classics. Liu Xie was said to have taken part in editing Buddhist sutras in Dinglin Monastery. He is best known for his monumental work Wenxin diaolong (‘The literary mind and the carving of dragons’), which he completed around 501, when he was still little known. Tradition reports that he carried the manuscript and waited for the cart of Shen Yue (441–513 A.D.), the most influential literary critic of the time, to pass by. Eventually he succeeded, and with Shen Yue’s recommendation he soon established his reputation. Liu Xie was said to be a prolific writer, but only a few of his works survive.

In Chinese academic circles, Wenxin diaolong is a work not without controversy. The dispute is not over its academic quality, which is unanimously recognized as first class by any standards, but over its disciplinary classification. Literary critics claim that it is a milestone in literary criticism, whereas rhetoricians argue that it is primarily the first systematic treatment of rhetoric in Chinese history. The controversy is quite understandable, and actually to be expected, given that rhetoric and literary criticism are intrinsically related to each other. It is argued by some that the embryonic conception of Chinese rhetoric was detectable even in oracle bone inscriptions, but it remained fragmented for about a thousand years before Liu Xie’s work. From the point of view of rhetoric, he made a ground-breaking study of discourse genres, style, and many figures of speech. He also analyzed what would be labeled as issues of invention and disposition in Western rhetoric.

There is a rich literature of textual studies of the work. Present-day editions include Fan Wenlan’s Wenxin diaolong zhu, Yang Mingzhao’s Wenxin diaolong jiaozhu, and Zhou Zhenfu’s Wenxin diaolong zhushi. Vincent Yu-Chung Shih’s 1983 The literary mind and the carving of dragons is a bilingual English-Chinese version with annotations.

See also: China: Language Situation.

Bibliography

Chen G & Junheng W (1998). Zhongguo Xiucixue Tongshi. (‘A general history of Chinese rhetoric.’) Changchun: Jilin Education Press.
Fan W (1958). Wenxin diaolong zhu. (‘The literary mind and the carving of dragons.’) [Includes Fan’s commentary.] Beijing: People’s Literature Press.
Shih V Y (1983). The literary mind and the carving of dragons. Hong Kong: The Chinese University Press.
Yang M (1958). Wenxin diaolong jiaozhu. (‘The literary mind and the carving of dragons.’) [Includes Yang’s commentary.] Shanghai: Shanghai Literary Classics Publishing House.
Zhou Z (1981). Wenxin diaolong zhushi. (‘The literary mind and the carving of dragons.’) [Includes Zhou’s annotations.] Beijing: People’s Literature Press.

Loanword Phonology
E Broselow, Stony Brook University, Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

The term ‘loanword’ is used for words (and sometimes phrases) borrowed from one language into another. The pronunciation of borrowed words is affected by many factors, including the nature and degree of contact between the lending and the borrowing communities (Haugen, 1950; Weinreich, 1953), the prestige of the languages in contact, the time course of borrowing, and the medium of transmission. Loanword pronunciation patterns may range from preservation of all or most features of the original (most often when borrowers are fluent in the foreign language or are members of a group of experts using specialized vocabulary) to fairly drastic alterations (when the new words become fully nativized or integrated into the borrowing language). ‘Ear borrowings’ tend to reflect pronunciation more faithfully than orthography (e.g., the orthographic but unpronounced ‘t’ of ‘Christmas’ is not realized in Japanese [kurisumasu]), whereas ‘eye borrowings’ tend to reflect orthography (as in Hawaiian [hiːmeni] ‘hymn’). Although the pronunciation of borrowed words may be complicated by many extralinguistic factors, systematic patterns often emerge that provide potentially crucial insight into the grammar of the borrowing language as well as into deeper questions of the relationship between perception and production, the role of frequency of different structure types in the data available to language learners and users, and the extent to which the formation of a grammar is shaped by universal and possibly innate principles.

Loan Adaptation

One of the best studied aspects of loanword phonology is the process of altering borrowed words to satisfy the borrowers’ native language constraints on sounds, sound combinations, and prosodic organization of words. One obvious problem is the occurrence in foreign words of sounds that are lacking in the borrowing language. This situation frequently gives rise to the process of phoneme substitution, the replacement of foreign sounds by native language phonemes (the set of contrasting sounds that may be used to encode contrasts in meaning). Thus, borrowings from English, which employs more than 30 phonemes, into Hawaiian, with only 13 contrasting sounds, may exhibit extensive alteration; for example, both ‘Katherine’ and ‘gasoline’ emerge as [kakalina], with [r], [g], [θ], [s], [æ], and [e] all undergoing substitution (Elbert and Pukui, 1986). A more subtle problem arises when the foreign sound does occur in the borrowing language, but only in restricted contexts. For example, in Ganda, [l] never appears following a front vowel [i] or [e], whereas [r] occurs only after front vowels. A Ganda speaker can therefore pronounce both [r] and [l], but only in specific contexts, leading to the adaptation of English ‘railway’ as Ganda [lerwe] (Halle and Clements, 1993).

A second motivation for the alteration of borrowed words stems from the borrowing language’s phonotactics, or restrictions on combinations of sounds. The bisyllabic word ‘Christmas’ is realized with five syllables in Japanese ([kurisumasu]) and in Samoan ([kilisimasi]), reflecting the requirements of the borrowing languages that all or most consonants be followed by a vowel. Similarly, loss of the [r] in the Navajo pronunciation [keʃmiʃ] can be traced to the impossibility of consonant sequences at the beginning of a Navajo word or syllable (www.santas.net/howmerrychristmasissaid.htm).
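The two adaptation pressures just described, context-restricted sounds and phonotactic restrictions, can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not an analysis of any of these languages: segments are single ASCII letters, the input strings (`relwe`, `krismas`, `klismas`) are hypothetical simplified forms chosen to reproduce the examples in the text, and the epenthetic vowel is fixed per call rather than derived from the grammar.

```python
FRONT_VOWELS = set("ie")
VOWELS = set("aeiou")
LIQUIDS = set("lr")


def realize_liquids(word: str) -> str:
    """Ganda-style distribution: [r] after a front vowel, [l] elsewhere."""
    out = []
    for segment in word:
        if segment in LIQUIDS:
            after_front = bool(out) and out[-1] in FRONT_VOWELS
            segment = "r" if after_front else "l"
        out.append(segment)
    return "".join(out)


def epenthesize(word: str, vowel: str) -> str:
    """Insert `vowel` after every consonant not already followed by a vowel."""
    out = []
    for i, segment in enumerate(word):
        out.append(segment)
        is_consonant = segment not in VOWELS
        next_is_vowel = i + 1 < len(word) and word[i + 1] in VOWELS
        if is_consonant and not next_is_vowel:
            out.append(vowel)
    return "".join(out)


print(realize_liquids("relwe"))      # liquids redistributed by context
print(epenthesize("krismas", "u"))   # Japanese-like output
print(epenthesize("klismas", "i"))   # Samoan-like output
```

The same epenthesis function yields both the Japanese-like and the Samoan-like form once the inserted vowel is changed, which is exactly the kind of cross-language difference the article goes on to treat as unexplained.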
Restrictions on the shape and organization of words are additional factors motivating the alteration of loanwords. Many languages adhere to a minimal word size and will augment borrowed words that fall short. For example, ‘bomb’ is adapted in Selayarese as [boʔoŋ] (Broselow, 1999); the change of the final [m] to [ŋ] is dictated by the impossibility of word-final [m], but the addition of a vowel and glottal stop (resulting in [boʔoŋ] rather than *[boŋ]) is motivated by the Selayarese requirement that all words must contain at least two syllables. In addition, most languages use the word as the basis for characteristic patterns of stress or pitch, and loans are often adapted to conform to these patterns. For example, whereas the most prominent stress falls on the word-initial syllable in English ‘Christmas,’ the major stress of Hawaiian [kalikimáka] falls on the prefinal syllable, in accord with Hawaiian principles of stress placement. In borrowing from stress-based languages into languages in which words are characterized by lexically assigned tonal or pitch patterns, the original stress pattern may be reinterpreted as a pitch-based pattern. Thus, in Cantonese a high tone is generally assigned to the syllable carrying main stress in English, falling on the initial syllable of [fasan] ‘fashion’ and on the final syllable of [sytka] ‘cigar’ (Silverman, 1992).

Linguists have traditionally looked to loanword adaptation as a probe into the particular grammars of borrowing languages as well as into the general structure of phonological grammars. In particular, the alterations of borrowed words may provide evidence for phonological rules or constraints that cannot be motivated by native language alternations because the relevant structural types simply do not occur in the native language vocabulary. For example, in analyzing Nupe (Nupe-Nupe-Tako), in which labialized consonants occur only before rounded vowels or before [a], Hyman (1970) argued that the [a] following a labialized consonant is derived from underlying [ɔ], which triggers labialization on the preceding consonant before undergoing a rule of absolute neutralization. Since [ɔ] never appears on the surface in Nupe, the support for a rule transforming [ɔ] to [a] comes from adaptation of Yoruba words such as [kɔbɔ] ‘penny’ as Nupe [kʷabʷa] (with HL tone on both forms). More recently, as much phonological research has moved toward models of grammar consisting not of rules but of surface-oriented constraints, the lack of particular structures in both native and borrowed vocabulary can be seen as having the same source, a set of universal markedness constraints (Jacobs and Gussenhoven, 2000; Yip, 1993).

Problems in Loanword Phonology

Choice of Strategies for Loanword Adaptation

The motivation for changing the structure of borrowed words can often be found in the restrictions of the borrowing language, which define targets for adaptation. However, precisely because the native language often lacks forms that require repair, the native language grammar may not provide guidance in different ways of reaching that target (e.g., in choice of substitute for a foreign sound). What, then, determines borrowers’ choice of repair strategy? One possibility is that the means used to adapt loanwords is largely a function of the native language grammar (Jacobs and Gussenhoven, 2000; Paradis and LaCharité, 1997). However, it is often difficult to explain how the data of the borrowers’ native language could have led them to converge on a particular repair strategy when their native language lacks structures that require repair (Broselow, 1999; Peperkamp and Dupoux, 2003). In fact, the repairs used in loanwords are sometimes different from those used for similar native vocabulary. For example, in Malayalam single voiceless stops may not occur intervocalically. In native vocabulary, an intervocalic stop becomes voiced, while in Malayalee English an intervocalic stop is realized as a voiceless geminate, as in [beekkar] ‘baker’ (Mohanan and Mohanan, 2003).

If the strategies for loanword adaptation are not entirely a function of the borrowing language grammar, perhaps they reflect the emergence of universal preferences for particular types of structures. For example, both Shinohara (2004) and Kenstowicz and Sohn (2001) found evidence that the preferred position for the accented syllable in words borrowed into Japanese and into North Kyungsang Korean is on a heavy prefinal syllable, a familiar pattern in stress systems of the world. Similarly, particular types of repairs might be universally preferred; for example, Fleischhacker (2001) found that in many languages that prohibit consonant sequences at the beginning of a word, a vowel is inserted in loanwords before a sequence of [s]-stop but between the two consonants otherwise (compare Egyptian Arabic [iskiː] ‘ski’ and [bilastik] ‘plastic’ (Broselow, 1992) and Hindi [iskul] ‘school’ and [pɪliz] ‘please’ (Singh, 1985)). Fleischhacker argued that loan adaptation is guided by a universal ‘perceptual mapping’ that dictates what sorts of repairs result in the greatest perceptual similarity to the original (see also Kang, 2003; Kenstowicz, 2001, 2003). However, many cases of loanword adaptation are not obviously explainable as reflexes of universal cross-linguistic preferences.
For example, it is generally assumed that in phoneme substitution, the ‘closest’ native language phoneme is substituted for the foreign phoneme. However, languages may choose different segments as ‘closest’: the [s] in ‘Christmas’ is replaced by [k] in Hawaiian [kalikimaki] and by [h] in Maori [kirihimete], although both languages include [k] and [h] (but not [s]) in their inventories. Languages may also choose different strategies for bringing foreign words into line with native language allophonic restrictions. In Japanese [s] and [ʃ] are in complementary distribution, with [ʃ] occurring only before [i]. The impossible sequence [ʃe] is transformed in (nativized) borrowed words by changing the consonant, as in [sepa:do] ‘shepherd’ (Ito and Mester, 1995a, 1995b). In Nupe, however, which has a similar distribution of [s] and [ʃ], an illegal sequence is repaired by changing the vowel to [i], as in the rendering of Hausa [ʃu:gaba] (with LLH tone) ‘leader’ as Nupe [ʃi:gaba] (with LLM tone; Hyman, 1970). Similarly, although the insertion of vowels in the various realizations of ‘Christmas’ – Japanese [kurisumasu], Hawaiian [kalikimaki], Samoan [kilisimasi], and Maori [kirihimete] – is clearly motivated by the need to bring borrowed words into conformity with native language syllable structure requirements, it is not clear why the quality of the inserted vowels should differ from language to language. Nor is it clear why these languages should choose vowel insertion, whereas Navajo, for example, deletes a consonant in [keʃmiʃ]. Different borrowing languages may also choose different strategies for upholding native language word stress requirements. In Hawaiian, preferred stress patterns are implemented in borrowings by simply reassigning the main stress to the appropriate syllable of the loanword (as in [kalikimáki]), whereas speakers of Huave maintain the native language requirement that stress normally falls on the word-final syllable by deleting poststress material, so that Spanish [kardúmen] ‘flock’ is adapted as [kardúm] (Davidson and Noyer, 1996). Thus, borrowers of different language backgrounds may choose different repair strategies for similar foreign structures, even when both native languages appear to provide similar options and when the grammar of the native language appears to provide no guidance in choosing among options, raising the issue of the learnability of loan adaptation patterns (Broselow, 1999; Peperkamp, 2004). The principles determining the choice of repair constitute an important research problem for linguists, with proposals ranging from the claim that all loan adaptation is a function of misperception (Peperkamp and Dupoux, 2003) to proposals making reference to various factors: perception, production, and the nature of the data available to borrowers.
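
The cluster-dependent vowel insertion pattern attributed above to Fleischhacker (2001) can be made concrete with a small sketch. It is an illustration only, not an analysis proposed in the literature cited here: the function name `repair_initial_cluster`, the ASCII segment sets, the one-character-per-segment transcription, and the fixed inserted vowel [i] are all simplifying assumptions, and the sketch models only the site of insertion, not segment substitution or vowel quality.

```python
# Hypothetical, simplified segment classes (one ASCII character per segment).
STOPS = set("ptkbdg")
VOWELS = set("aeiou")


def repair_initial_cluster(word: str, vowel: str = "i") -> str:
    """Break up a word-initial two-consonant cluster by inserting `vowel`.

    The vowel goes before the cluster when it is [s] + stop,
    and between the two consonants otherwise.
    """
    # Words that do not begin with two consonants need no repair.
    if len(word) < 2 or word[0] in VOWELS or word[1] in VOWELS:
        return word
    first, second = word[0], word[1]
    if first == "s" and second in STOPS:
        return vowel + word              # insertion before the cluster
    return first + vowel + word[1:]      # insertion between the consonants


print(repair_initial_cluster("skul"))    # compare Hindi 'school'
print(repair_initial_cluster("pliz"))    # compare Hindi 'please'
```

With these assumptions, the two calls mirror the shape of the Hindi forms cited above; a fuller model would have to let the inserted vowel and the set of repairable clusters vary by borrowing language.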
Preservation of Foreign Features

A related question in loanword adaptation concerns why some features of borrowed words are changed to fit the native language constraints while others are maintained intact. Not all loanwords are fully adapted, and indeed over time the borrowing language may come to incorporate foreign structures, as in the English acceptance of [v] as an independent phoneme rather than the intervocalic reflex of [f] under the influence of French. In many languages, some sets of words retain foreign elements, resulting in what Ito and Mester (1995a, 1995b) called lexical strata: groups of words obeying distinct (but overlapping) sets of constraints. For example, in the core vocabulary of Japanese both [si] and [ti] sequences are impossible, and these sequences in the most nativized loanwords undergo palatalization: [ʃifu:do] ‘seafood’ and [tʃi:m] ‘team.’ In a more peripheral stratum, however, only the constraint against [si] is upheld, resulting in pronunciations such as [ʃitibanku] ‘Citibank.’ Furthermore, nonnativized vocabulary may contain both [si] and [ti] sequences. There are no words, however, like hypothetical *[sitʃibanku], where [si] is tolerated but [ti] is not. Thus, although both structures are equally foreign to core Japanese vocabulary, one is more readily accepted in borrowed words. The principles determining which foreign structures are more subject to preservation are yet to be understood. One family of proposals correlates the likelihood of preservation of foreign structures to the extent to which their individual articulatory components are part of the borrowing language system. Thus, Maddieson (1986) discussed evidence that a foreign phoneme is more likely to be incorporated into the borrowing language system if it fills some gap in the native language phoneme inventory; for example, [p] is more likely to be borrowed if the language already has a voiced labial stop [b] and if it has voiceless/voiced oppositions at other places of articulation ([t/d] and [k/g]). Similarly, Ussishkin and Wedel (2003) argued that medial consonant sequences are more likely to be tolerated in loanwords if the borrowing language allows word-final consonants (and therefore has consonant sequences across word boundaries). Thus, although neither Puluwat (Puluwatese) nor Tongan has consonant sequences within words, Puluwat does allow word-final consonants (and therefore consonant sequences across word boundaries), and in borrowed words, word-internal consonant sequences are much more likely to be preserved in Puluwat ([kapsajis] ‘capsize’) than in Tongan ([kakatisi] ‘cactus’).
Other factors that may be implicated include simple frequency of foreign structures and the acoustic similarities between the native and foreign structures; borrowers may identify certain foreign phonemes, for example, as variants of native language phonemes. Although the patterns of loanword adaptation raise numerous questions, it is clear that they provide an important testing ground for hypotheses concerning how language is learned and how language is used.

See also: Experimental Phonology; Phonology: Overview; Second Language Acquisition: Phonology.

Bibliography

Broselow E (1992). ‘Language transfer and universals in second language epenthesis.’ In Gass S & Selinker L (eds.) Language transfer and language learning. Amsterdam: Benjamins. 71–86. Broselow E (1999). ‘Stress, epenthesis, and segment transformation in Selayarese loans.’ In Chang S, Liaw L & Ruppenhofer J (eds.) Proceedings of the 25th annual conference of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society. 211–225. Broselow E (2005). ‘Loan contact phonology: richness of the stimulus, poverty of the base.’ In Proceedings of NELS 34. Amherst: GLSA. Davidson L & Noyer R (1996). ‘Loan phonology in Huave: nativization and the ranking of faithfulness constraints.’ In Agbayani B & Tang S W (eds.) Proceedings of WCCFL 15. Stanford, CA: CSLI. 65–79. Elbert S H & Pukui M K (1986). Hawaiian dictionary: Hawaiian–English, English–Hawaiian. Honolulu: University of Hawaii Press. Fleischhacker H (2001). ‘Cluster-dependent vowel insertion asymmetries.’ UCLA Working Papers in Linguistics: Papers in Phonology 5, 71–116. Halle M & Clements G N (1993). Problem book in phonology: A workbook for introductory courses in linguistics and in modern phonology. Cambridge, MA: MIT Press. Haugen E (1950). ‘The analysis of linguistic borrowing.’ Language 26, 210–231. Hyman L (1970). ‘The role of borrowing in the justification of phonological grammars.’ Studies in African Linguistics 1, 1–48. Ito J & Mester A (1995a). ‘Japanese phonology.’ In Goldsmith J (ed.) The handbook of phonological theory. Cambridge, MA: Blackwell. 817–838. Ito J & Mester A (1995b). ‘The core–periphery structure of the lexicon and constraints on reranking.’ In Beckman J, Walsh Dickey L & Urbanczyk S (eds.) UMOP 18: papers in optimality theory. Amherst: GLSA. 181–210. Jacobs H & Gussenhoven C (2000). ‘Loan phonology: perception, salience, the lexicon and OT.’ In Dekkers J, van der Leeuw J & van de Weijer J (eds.) Optimality theory: phonology, syntax, and acquisition. Oxford: Oxford University Press. 193–210. Kang Y J (2003).
‘Perceptual similarity in loanword adaptation: English postvocalic word-final stops in Korean.’ Phonology 20, 219–274. Kenstowicz M (2001). ‘The role of perception in loanword phonology.’ Linguistique Africaine 20. Kenstowicz M (2003). ‘Salience and similarity in loanword adaptation: a case study from Fijian.’ Rutgers Optimality Archive 609–0803. http://roa.rutgers.edu/index.php3. Kenstowicz M & Sohn H S (2001). ‘Accentual adaptation in North Kyungsang Korean.’ In Kenstowicz M (ed.) Ken Hale: a life in language. Cambridge: MIT Press. 239–270. Maddieson I (1986). ‘Borrowed sounds.’ In Fishman J, Tabouret-Keller A, Clyne M, Krishnamurti B & Abdulaziz M (eds.) The Fergusonian impact: vol. I: From phonology to society. Amsterdam: Mouton. 1–15. Mohanan K P & Mohanan T (2003). ‘Towards a theory of constraints in OT: emergence of the not-so-unmarked in Malayalee English.’ Rutgers Optimality Archive 601–0503. http://roa.rutgers.edu/index.php3.

Paradis C & LaCharité D (1997). ‘Preservation and minimality in loanword adaptation.’ Journal of Linguistics 33, 379–430. Peperkamp S (2004). ‘A psycholinguistic theory of loanword adaptation.’ In Proceedings of the 30th annual meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society. Peperkamp S & Dupoux E (2003). ‘Reinterpreting loanword adaptations: the role of perception.’ ICPHS 15, 367–370. Ross K (1996). Floating phonotactics: variability in infixation and reduplication of Tagalog loanwords. Unpublished M.A. thesis, UCLA. Shinohara S (2004). ‘Emergence of Universal Grammar in foreign word adaptations.’ In Kager R, Pater J & Zonneveld W (eds.) Constraints in phonological acquisition. Cambridge, UK: Cambridge University Press. 292–320. Silverman D (1992). ‘Multiple scansions in loanword phonology: evidence from Cantonese.’ Phonology 9, 289–328.

Singh R (1985). ‘Prosodic adaptation in interphonology.’ Lingua 67, 269–282. Ussishkin A & Wedel A (2003). ‘Gestural motor programs and the nature of phonotactic restrictions: evidence from loanword phonology.’ In Tsujimura M & Garding G (eds.) WCCFL 22 Proceedings. Somerville, MA: Cascadilla Press. 505–518. Weinreich U (1953). Languages in contact. The Hague, The Netherlands: Mouton. Yip M (1993). ‘Cantonese loanword phonology and Optimality Theory.’ Journal of East Asian Linguistics 2, 261–291.

Relevant Website

www.geocities.com/TheTropics/shores/6794/o-onloanwords.html

Localization
G Budin, University of Vienna, Vienna, Austria
© 2006 Elsevier Ltd. All rights reserved.

Defining and Distinguishing Key Terms

Localization refers to the processes of preparing and adapting products linguistically and culturally for their sale and use in a specific locale, i.e., in a language community in a country or region of the world. A ‘locale’ is a virtual rather than a physical location, where a group of people share certain cultural and linguistic conventions in a consistent way so that the localization industry is able to identify the locale and distinguish it from other, maybe neighboring, locales (e.g., British English vs. American English). Traditionally, the concept of localization is distinguished from the concept of internationalization, i.e., the process of designing products in a generic way so that they can be localized easily and quickly. The process of internationalization is thus a prerequisite to localization. A more comprehensive concept is that of globalization. Besides its generic socioeconomic and political meaning, the term refers to the overall strategic planning level of ‘going global’ with one’s own products. On the operational level, it refers to the process of internationalization and subsequent localization and translation work. Due to the worldwide process of economic globalization, the demand for localization, internationalization, and translation has been growing continuously over the past decades. Translation is a core process in the localization process because it focuses on crossing borderlines between languages and cultures and because it involves the comparison of cultural specificities of different locales, language communities, countries, etc., as well as the subsequent choice of communicative means in a target language or culture to adequately express the content of text available in a source language or culture.

What Is Localized and How Is It Done?

Although localization has been successfully practiced for a long time in almost all spheres of trade and industry, the term ‘localization’ has been used more recently mainly in the contexts of the computer industry and of Internet- and Web-based trade. The localization of software products and of websites reflects the growing economic importance of localization in world trade. For the purpose of software localization, the program code, as the nonlocalizable component, has to be separated from the user interface in general and from the individual cultural elements to be localized in particular. While localization engineering includes the efficient use of localization tools, localization management focuses on the translation and cultural adaptation of user manuals, online help, and other types of software documentation. Website localization has become a critical success factor for companies that use the Internet as a marketing instrument and that invest in e-commerce portals. The simultaneous launch of products in culturally and linguistically different markets (locales) requires the implementation of globalization strategies in general and of internationalization processes in particular.
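
The separation of nonlocalizable program code from localizable user interface text, as described above, can be illustrated with a minimal sketch. It is only one possible arrangement and does not reproduce any particular localization tool: the catalog `CATALOG`, the function `translate`, the locale codes, and the sample strings are all hypothetical names chosen for the illustration.

```python
# Hypothetical message catalog: the keys are stable identifiers used by the
# program code; the values are the localizable strings for each locale.
CATALOG = {
    "en-US": {"greeting": "Hello, {name}!", "color_label": "Color"},
    "en-GB": {"greeting": "Hello, {name}!", "color_label": "Colour"},
    "de-AT": {"greeting": "Hallo, {name}!", "color_label": "Farbe"},
}
DEFAULT_LOCALE = "en-US"  # assumed fallback locale


def translate(key: str, locale: str, **params: str) -> str:
    """Look up a user interface string for `locale`, falling back to the default."""
    messages = CATALOG.get(locale, CATALOG[DEFAULT_LOCALE])
    template = messages.get(key, CATALOG[DEFAULT_LOCALE][key])
    return template.format(**params)


# The program code refers only to keys, never to locale-specific text.
print(translate("color_label", "en-GB"))
print(translate("greeting", "de-AT", name="Ana"))
```

Because the program logic mentions only keys such as `greeting`, adding a new locale in this sketch means adding a catalog entry rather than changing code, which is the sense in which internationalization is a prerequisite to localization.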

Who Is Active in Localization?

The complex tasks described above are increasingly carried out by localization professionals who are specifically educated and trained for such activities. The localization industry has developed professional profiles and specific roles in project teams, with best practices, success stories, quality models, and industry standards, and with professional organizations, such as the Localization Industry Standards Association (LISA), that organize symposia and workshops and that publish newsletters and standards documents. Localization has also become a topic of research activities in language engineering, translation studies, computer science, and other relevant disciplines. Courses and education programs are increasingly offered by academic institutions and by professional organizations (e.g., the Localization Research Centre at Trinity College in Dublin, Ireland). Localization service providers include small local businesses and large companies that act as global players in this competitive industry. For further reading, please consult the bibliography, which contains references to textbooks such as Bernd Esselink’s guide to software localization, as well as the list of Web references.

Bibliography

Anobile M (ed.) (2001). Localization industry primer. Geneva: Localization Industry Standards Association. Cadieux P & Esselink B (2002). ‘Feeling GILTy: defining the terms globalization, internationalization, localization and translation.’ Language International 14(3), 22–25. Cronin M (2003). Translation and globalization. London and New York: Routledge. Esselink B (2000). A practical guide to localization. Amsterdam/Philadelphia: John Benjamins Publishing Company. Savourel Y (2001). XML internationalization and localization. Indianapolis: SAMS Publishing. Somers H (2003). Computers and translation: a translator’s guide. Amsterdam/Philadelphia: John Benjamins Publishing Company. Sprung R C (ed.) (2000). Translating into success: cutting-edge strategies for going multilingual in a global age. Amsterdam/Philadelphia: John Benjamins Publishing Company.

Relevant Websites

http://www.gala-global.org – GALA: The Globalization and Localization Association.
http://www.lisa.org – Localization Industry Standards Association.
http://www.localisation.ie – Localisation Research Centre, Ireland; includes the International Journal of Localisation.
http://www.localizationworld.com – annual conference on localization.
http://www.multilingual.com – Multilingual Computing & Technology (journal).
http://www.xliff.org – XML Localization Interchange File Format.

Locke, John (1632–1704)
V Raby, Université de Reims-Champagne-Ardenne, Reims, France
© 2006 Elsevier Ltd. All rights reserved.

John Locke was born on August 28, 1632 in Wrington, Somerset, to a Puritan family. He studied at Westminster School and in Oxford, where he was appointed as a college tutor in 1661. During these years, his interest in medicine led him to the study of natural philosophy. Beginning in 1667, he served as secretary, physician, and political advisor to Lord Ashley, future Earl of Shaftesbury. He was elected a Fellow of the Royal Society. Locke came to France in 1673, after Shaftesbury’s resignation as lord chancellor. There he worked on the study of philosophy, and met and read Cartesians and Gassendists. He returned to England in 1679 and became involved in politics, until his departure for Holland as a political refugee in 1683. He came back to England after the reign of James II in 1689 and spent his later years preparing his publications, the Essay, along with works about political theory, monetary problems, education, and theology. He died on October 28, 1704 in Oates, Essex.

His contribution to linguistics is mainly located in Book III of An essay concerning human understanding (1690), titled ‘Of words.’ The general purpose of the Essay was to build an empirical theory of knowledge, assuming that there were no ‘innate ideas,’ and that all types of ‘ideas’ originated from sensation and reflection. This led to the distinction between names for simple ideas, and names for mixed modes and substances. The significance of names for simple ideas could be secured by acts of ostension toward a single perception, whereas the significance of names for mixed modes and substances relies only on their definition. Locke insisted that, even if this could not be improved upon, we must always remember that the meaning of words resulted from our own ‘voluntary imposition.’ Words were signs of our internal conceptions and stood as ‘marks’ for ideas. This hypothesis was not new, but Locke added that this conception might differ from one individual to the other, because the meaning of words depended entirely on personal knowledge. This was particularly significant when it came to general terms. General terms did not refer to anything but were necessary to represent abstract, general ideas. They resulted from the activity of understanding, which ‘sorts’ particular things together according to their similarities and ‘ranks’ them under general names. Consequently, the question was how one could be sure to transmit, by means of conventional linguistic signs, the ‘determinate ideas’ required for knowledge and reasoning. This analysis led Locke to diagnose ‘imperfections’ in the use of language, and to give remedies that every careful speaker should use to avoid most of these defects. The idea that language may condition thoughts was further defined by Condillac. It has also been expanded upon in the field of language acquisition and the study of relationships between language and culture. Book III of the Essay deals mostly with the semantics of common names, but Locke also introduced, in Book IV, 21, a definition of sémeiôtikè as the ‘doctrine of signs’ and as the third kind of science, after physica and practica.

See also: Condillac, Etienne Bonnot de (1714–1780).

Bibliography

Aarsleff H (1982). From Locke to Saussure: essays on the study of language and intellectual history. Minneapolis: University of Minnesota Press. Ashworth E J (1984). ‘Locke on language.’ Canadian Journal of Philosophy 14, 45–73. Balibar E (1998). John Locke. Identité et différence: l’invention de la conscience. Paris: Seuil. Buzzetti D (1982). ‘Locke e la discussione sugli universali.’ In Buzzetti D & Ferriani M (eds.) La grammatica del pensiero. Logica, linguaggio e conoscenza nell’età dell’illuminismo. Bologna: Il Mulino. Formigari L (1985). ‘Le ‘‘Way of ideas’’ et le langage moral.’ Histoire Épistémologie Langage VII–2, 15–33. Formigari L (1993). Signs, science and politics: philosophy of language in Europe 1700–1830. Amsterdam/Philadelphia: John Benjamins. Hassler G (2003). ‘Scepticism and semantic theory from Locke to Du Marsais.’ In Paganini G (ed.) The return of scepticism from Hobbes and Descartes to Bayle. Dordrecht: Kluwer Academic Publishers. 343–361. Kretzmann N (1968). ‘The main thesis of Locke’s semantic theory.’ Philosophical Review 77, 175–196. Land S K (1986). The philosophy of language in Britain. New York: AMS Press. Latraverse F (1998). ‘Locke et le retournement sémantique.’ Sémiotique 14, 19–30. Locke J (1689). Letter on toleration. London: A. Churchill. Locke J (1690a). An essay concerning human understanding. London. In Nidditch P H (ed.) (1975). Oxford; French tr. by Coste P (1700). Essai philosophique concernant l’entendement humain. Amsterdam: H. Schelte: repr. (1972) Paris: E. Naert. Locke J (1690b). Two treatises of government. London: A. Churchill. Locke J (1693). Some thoughts concerning education. London: A. Churchill. Locke J (1975). The Clarendon edition of the works of John Locke. Nidditch P H, Yolton J W et al. (eds.) (30 vols). Oxford: Oxford University Press. Ott W R (2004). Locke’s philosophy of language. Cambridge: Cambridge University Press. Schreyer R (1992). ‘The definition of definition in Locke’s Essay concerning human understanding.’ Lexicographica 8, 26–51. Yolton J W (1965). ‘Locke and the seventeenth-century logic of ideas.’ Journal of the History of Ideas 16, 431–452.

Lodwick, Francis (1619–1694)
C Neis, University of Potsdam, Potsdam, Germany
© 2006 Elsevier Ltd. All rights reserved.

Francis Lodwick (1619–1694) was baptized on August 8, 1619 in London, where he was also buried on January 5, 1694. Lodwick [Lodowyck] made important contributions in the domains of universal languages and phonetics. His father was a refugee Flemish merchant, while his mother was a descendant of French Huguenots. During his lifetime Lodwick was engaged in trade between England and the continent and was a highly influential member of the Anglo-Dutch community in London. Most details about Lodwick’s life are reported in the diary of Robert Hooke (1635–1703). In 1681, Lodwick was elected a member of the Royal Society. His writings attest to his interests in different fields, e.g., economy, politics, religion, and linguistics. His upbringing in a Protestant refugee community might have aroused his interest in linguistic issues, especially his desire to invent a universal language suitable for international communication. Lodwick was probably the first man to publish a book on a universal written character and language, A common writing, which appeared in 1647. Lodwick’s universal language project was influenced by Francis Bacon (1561–1626) and John Wilkins (1614–1672), who advocated the creation of a universal written ‘character’ that would make communication possible between speakers of different languages. In his brief A common writing, Lodwick presented a universal language that contains a collection of words. These words are represented by non-verbal characters similar to arithmetic symbols that are usable in every language. Lodwick also suggested a dictionary containing primordial words and their derivations that can be gathered by the use of standardized diacritics. A second and more fruitful attempt at the creation of a universal character is his The ground-work, or foundation laid, (or so intended) for the framing of a new perfect language: and an universall or common writing. And presented to the consideration of the learned, by a well-willer to learning (1652). In this work, Lodwick gave a more sophisticated solution to the problem of universal language as he tried to lay the foundation for a perfect new language suitable for the representation of the real order of nature. He suggested a scientific nomenclature with reference to the qualities of things. The main part of the text, however, is devoted to the project of a grammar that would reduce the multiplicity of rules in natural languages. One of his most striking solutions is the introduction of a universal word-order. Lodwick’s third suggestion for a universal language was published in 1686: An essay towards an universal alphabet. Although Lodwick’s attempts to create a universal language were never realized, his writings had a deep impact on Wilkins, who acknowledged his debt to Lodwick in his Essay towards a real character, and a philosophical language (1668). Most of Lodwick’s other writings, such as his earliest shorthand system for Dutch on phonetic principles, remained in manuscript form. Several manuscripts were published in the Royal Society’s Philosophical Transactions, e.g., his scheme for a phonetic alphabet usable in every language. Lodwick may be characterized as an original thinker and one of the leading figures of linguistic discussions in the 17th century.

See also: Bacon, Francis, Lord Verulam (1561–1626); Phonetics: Overview; Universal Language Schemes in the 17th Century; Wilkins, John (1614–1672).

Bibliography

Borst A (1995). Der Turmbau von Babel. Geschichte der Meinungen über Ursprung und Vielfalt der Sprachen und Völker (4 Bde.). München: Deutscher Taschenbuch Verlag.
Eco U (1993). La ricerca della lingua perfetta. Roma/Bari: Laterza.
Funke O (1959). ‘On the sources of John Wilkins’ philosophical language (1668).’ English Studies 40, 208–214.
Salmon V (1972). The works of Francis Lodwick. A study of his writings in the intellectual context of the seventeenth century. London: Longman.

Logic and Language: Philosophical Aspects G Callaghan, Wilfrid Laurier University, Waterloo, Ontario, Canada G Lavers, University of Western Ontario, London, Ontario, Canada © 2006 Elsevier Ltd. All rights reserved.

Introduction

Theories of meaning and methods of linguistic analysis are key items on the agenda of contemporary analytic philosophy. Philosophical interest in language gained substantial impetus from developments in logic that took place in the latter half of the nineteenth century. It was at this time that the early modern conception of logic as an informal ‘art of thinking’ gave way to the contemporary conception of logic as a formally rigorous, symbolic discipline involving, inter alia, a mathematically precise approach to deductive inference. The systems of symbolic logic that emerged in the later stages of the nineteenth century were fruitfully applied in the logical regimentation of mathematical theories – analysis and arithmetic in particular – and logical analysis became the cornerstone of a general philosophical methodology for a number of influential figures in the first half of the twentieth century. Though the operative conception of logical analysis did not in every case treat language (or ‘natural language’) as the proper object of investigation, close connections between logic and language were stressed by virtually every proponent of the methodology. Our aim in this entry is to discuss these connections as they appear in the work of some of the eminent precursors and purveyors of the analytic tradition.

The Mathematicization of Logic: Leibniz and Boole

Early modern philosophers were typically antipathetic to the formal approach to logic embodied in Aristotle’s doctrine of the syllogism and its Scholastic variants. Among the major figures of the early modern period, Gottfried Wilhelm Leibniz (1646–1716) is distinguished both for his respect for the Aristotelian tradition in logic and for his general emphasis on the importance of formal methods. Leibniz applauded Aristotle for being ‘‘the first to write actually mathematically outside of mathematics’’ (1696: 465). However, it was Leibniz’s own works, rather than those of Aristotle or of contemporary Aristotelians, that in the period did most to advance the conception of logic as a kind of generalized mathematics.

Leibniz’s logical work consists of a number of manuscripts, unpublished in his lifetime, in which he undertakes the construction of a logical calculus. In virtually all of these works, Leibniz represents judgments and logical laws in a quasi-arithmetical or algebraic notation and he assimilates processes of inference to known methods of calculation with numbers (e.g., by substitution of equals). Leibniz’s motivation for this approach stemmed from his early project of a lingua characteristica universalis – a symbolic language geared to the logically perspicuous representation of content in all fields of human knowledge. According to Leibniz, the content of any judgment consists in the composition of the concepts arrayed in the judgment as subject and predicate. A judgment is true when the predicate concept is ‘contained in,’ or partially constitutive of, the subject concept. For example, the truth of the judgment that all men are rational consists in the fact that the concept rational is contained in the concept man, as per the traditional definition of ‘man’ as ‘rational animal.’ (For obvious reasons, Leibniz’s conception of truth posed difficulties when it came to accounting for contingently true judgments, and the task of providing an account of contingency compatible with his conception of truth was one to which Leibniz devoted considerable philosophical attention.) All complex concepts can be parsed as conjunctions of concepts of lower orders of complexity down to the level of simple concepts that cannot be further analyzed. Leibniz’s various schemes for a universal characteristic were predicated on the idea that containment relations among concepts could be made arithmetically tractable given an appropriate assignment of ‘characteristic numbers’ to concepts. 
For instance, in one such scheme Leibniz proposed that the relationship between complex concepts and their simple constituents be represented in terms of the relationship between whole numbers and their prime factors, thus capturing the unique composition of any complex from its primitive components (1679, 1679/1686). In this and similar ways, Leibniz sought to provide a basis for the view that inference, and the evaluation of truth more generally, could be carried out algorithmically – that is, as a mere process of calculation – by familiar arithmetical means. By the 1680s Leibniz had grown pessimistic about the prospect of completing the project of the universal characteristic, and he turned his energies to the more confined task of devising an abstract logical calculus. Leibniz worked on a number of different versions of his logical calculus through the 1680s and 1690s. In each case he explained how standard propositional forms could be expressed in a quasi-algebraic

notation. He also laid down logically primitive laws pertaining to his formulas (what he called ‘propositions true in themselves’) and abstractly specified valid inferential transformations, usually in terms of a definition of ‘sameness’ in conjunction with principles pertaining to the substitutability of identicals. Though Leibniz’s efforts at constructing a logical calculus were hampered by his view that all judgments ultimately reduce to a simple subject-predicate form – thus excluding primitive relational judgments – his emphasis on formal explicitness and mathematically exact symbolism stands as an anticipation of the main lines of development in formal logic in the following centuries. Leibniz’s mathematical approach to logic made little impression until the late nineteenth century, when his manuscripts were finally collected and published. By that time, however, the mathematical approach had gained momentum independently, largely on the basis of the work of George Boole (1815–1864). Boole shared with Leibniz the aim of devising an algebraic means of expressing relationships among terms figuring in propositions. However, Boole differed from Leibniz in treating the extensions of concepts (or classes), rather than concepts construed as attributes or ‘intensions,’ as the relevant propositional constituents. In The laws of thought (1854), Boole presented his class logic, which he called ‘the logic of primary propositions,’ as the first and fundamental division of his system. In the second part of the same work, Boole adapted the calculus of classes to a special interpretation that allows for the representation of logically compound propositions, or ‘secondary propositions,’ thereby unifying (after a fashion) the calculus of classes with a version of modern propositional calculus. 
Boole’s central idea is that an algebra of logic arises as an interpretive variant of standard numerical algebra when the latter is modified by a single principle that is naturally suggested by the logical interpretation of the symbolism. In Boole’s class logic, letters (or ‘literal symbols’) are interpreted as standing for classes of things determined by some common attribute, with ‘1’ standing for the universe class and ‘0’ standing for the null class. Multiplication, addition, and subtraction operators are treated as standing for the operations of intersection, disjoint union, and difference (or ‘exception’) of classes, respectively. Primary propositions are then expressed as equations with appropriately formed class terms standing on either side of the identity sign. On the basis of this class-theoretic interpretation of the symbolism, Boole maintained that the logical calculus differs from ordinary numerical algebra only with respect

to the characteristically logical law that, for any class x, xx = x (x intersect x is x), which holds generally for class-theoretic intersection but which holds for numerical multiplication only for x = 0 and x = 1. Having emphasized this difference, Boole observed that the laws and transformations of numerical algebra will be identical to those of an algebra of logic when the numerical values of literal symbols in the former are restricted to 0 and 1. Boole appealed to this formal analogy between the numerical and logical algebras in justifying his approach to inference, which he presented as a process of solving sets of simultaneous equations for unknowns by standard algebraic methods. In The laws of thought, Boole transformed the calculus of classes into a serviceable propositional calculus by interpreting his literal symbols over ‘portions of time’ during which elementary propositions are true, thus adapting the notation and methods designed for dealing with class relationships to the propositional case. Boole’s appeal to portions of time reflected a somewhat puzzling endeavor to assimilate or reduce propositional logic to the kind of term logic embodied in his class calculus, and the artificiality of this approach was not lost on subsequent logicians both within and without the algebraic tradition. However, peculiarities of interpretation notwithstanding, Boole can be credited with the first systematic formulation of propositional logic and a commensurate expansion of the scope of formal logic in general. Moreover, his suggestion that propositional logic, class logic, and numerical algebra (suitably restricted) arise as interpretive variants of a single algebraic system anticipates subsequent developments in abstract algebra and (perhaps only dimly) modern model-theoretic methods in logic.
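Boole's observation about the law xx = x can be checked on a small case. The following Python sketch is only an illustration under stated assumptions: the three-element universe and all of the names in it are hypothetical, and intersection of Python sets stands in for Boole's logical multiplication.

```python
from itertools import combinations

# Hypothetical universe of discourse: Boole's '1' is the universe class,
# and his '0' is the null class.
UNIVERSE = frozenset({"a", "b", "c"})
NULL = frozenset()


def all_classes(universe):
    """Return every subclass of the universe (its power set)."""
    items = sorted(universe)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


# Logical reading: 'multiplication' is intersection, so xx = x for every class.
idempotent_for_classes = all(x & x == x for x in all_classes(UNIVERSE))

# Numerical reading: x * x == x holds only for 0 and 1 (checked on a sample range).
idempotent_numbers = [x for x in range(-5, 6) if x * x == x]

# With values restricted to 0 and 1, ordinary multiplication behaves like
# conjunction and 1 - x behaves like negation.
conjunction_table = {(p, q): p * q for p in (0, 1) for q in (0, 1)}
negation_table = {p: 1 - p for p in (0, 1)}
```

On this toy universe the idempotence check succeeds for all eight classes, while among the sampled integers only 0 and 1 satisfy the numerical equation, which is the restriction Boole relied on.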
The contributions of Leibniz and Boole constitute beginnings in the fruitful deployment of artificial languages in the analysis of propositional content and the systematization of deductive inference. However, despite their considerable accomplishments, neither Leibniz nor Boole can be credited with bringing logic to its current state of maturity. They produced no inroads in the logic of relations, and the use of quantifiers for the expression of generality is entirely foreign to their work. These shortcomings were addressed by later logicians working in the algebraic tradition (e.g., Peirce and Schröder), but the significance of their resolution for the development of modern formal logic and its philosophical offshoots will be better appreciated if we adopt a somewhat different perspective on the historical interplay between logic and mathematics.
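The prime-factor version of Leibniz's characteristic numbers described in this section lends itself to a small illustration. The Python sketch below assumes a hypothetical assignment of primes to three simple concepts. It covers only the idea that concept containment becomes divisibility and sets aside any further refinements in Leibniz's own schemes.

```python
# Hypothetical assignment of prime 'characteristic numbers' to simple concepts.
SIMPLE = {"animal": 2, "rational": 3, "winged": 5}


def characteristic(*simple_concepts):
    """Number of a complex concept: the product of its simple constituents."""
    number = 1
    for concept in simple_concepts:
        number *= SIMPLE[concept]
    return number


def contains(subject_number, predicate_number):
    """The predicate is contained in the subject when its number divides the subject's."""
    return subject_number % predicate_number == 0


# 'Man' defined, as in the traditional definition, as 'rational animal'.
man = characteristic("animal", "rational")

all_men_are_rational = contains(man, SIMPLE["rational"])
all_men_are_winged = contains(man, SIMPLE["winged"])
```

Because every whole number has a unique prime factorization, the number assigned to `man` records exactly which simple concepts compose it, and evaluating a judgment reduces to a division check.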

Logic and Language in Frege

For the better part of his career, the philosopher-mathematician Gottlob Frege (1848–1925) devoted his energies to establishing the ‘logicist’ thesis that arithmetical truth and reasoning are founded upon purely logical principles. At an early stage in his efforts, Frege realized that existing systems of logic were inadequate for carrying out the analysis of content necessary for establishing arithmetic’s logical character. His Begriffsschrift (or ‘concept notation’) (1879) was intended to address this deficiency. The logical system that Frege presented in the Begriffsschrift and later refined in Part I of his Grundgesetze der Arithmetik (1893) constitutes the greatest single contribution in formal logic since the time of Aristotle. The most distinctive aspects of Frege’s logic are (1) the use of variables and quantifiers in the expression of generality; (2) the assimilation of predicates and relational expressions to mathematical expressions for functions; (3) the incorporation of both propositional logic and the logic of relations within (second-order) quantificational logic; and (4) the notion of a formal system – i.e., of a system comprising a syntactically rigid language along with explicit axioms and inference rules that together determine what is to count as a proof in the system. Frege’s approach to generality is based on his analysis of predication in terms of function and argument. In arithmetic, a term such as 7 + 5 can be viewed as dividing into function and argument in different ways. For instance, it can be treated as dividing into the function ( ) + 5 with 7 as argument, or as dividing into the function 7 + ( ) with 5 as argument, or as dividing into the binary function ( ) + [ ] with 7 and 5 (in that order) as arguments. Frege’s approach to predication assimilates the analysis of sentences to this feature of the analysis of arithmetical expressions.
For example, a simple sentence such as ‘John loves Mary’ can be regarded as predicating the (linguistic) function ‘( ) loves Mary’ of the singular term ‘John,’ or the function ‘John loves ( )’ of the singular term ‘Mary,’ or the relational function ‘( ) loves [ ]’ of ‘John’ and ‘Mary’ (in that order). In the Begriffsschrift, Frege remarked that, for simple sentences like this, the analysis into function and argument makes no difference to the ‘conceptual content’ that the sentence expresses. However, the possibility of analyzing a sentence in these ways is nevertheless crucial to logic, since only on this basis do we recognize logical relationships between generalizations and their instances. Adopting a standard arithmetical practice, Frege makes use of variables as a means of expressing generality. For example, by putting the variable ‘x’ in the argument-place of ‘Mary’ in our example, we arrive at the statement ‘John loves x,’ which is the Begriffsschrift equivalent of the colloquial generalization ‘John loves all things.’ The inference from this generalization to ‘John loves Mary’ now requires that we regard ‘Mary’ as argument to the function ‘John loves ( ),’ since only so is ‘John loves Mary’ recognizable as an instance of the generalization. Other function-argument analyses become salient in connection with other generalizations to which the statement relates as an instance (e.g., ‘x loves Mary’). In the system of the Begriffsschrift, the above-described use of variables suffices to express generality in a limited variety of sentential contexts. However, Frege’s broader treatment of generality involves a second crucial component, namely, the use of quantifiers – i.e., the variable binding operators ‘∀x’ (read: ‘every x’) and ‘∃x’ (read: ‘some x’) – as a means of indicating the scope of the generality associated with a variable. (Our discussion here prescinds from the peculiarities of Frege’s now obsolete notation as well as his convention of treating existential quantification in terms of universal quantification and negation – i.e., his treatment of ‘∃x . . .’ as ‘¬∀x¬ . . .’). One of the many ways in which the use of quantifiers has proven important to logic concerns the expression of multiply general statements, for which no adequate treatment existed prior to the Begriffsschrift.
Consider, for example, the relational generalization ‘Everyone loves someone.’ This statement is ambiguous between the following two readings: (1) ‘There is some (at least one) person that is loved by all,’ and (2) ‘Every person is such as to love some (at least one) person.’ The use of quantifiers resolves this ambiguity by requiring that expressions of generality in multiply general statements be ordered so as to reflect scope. The first reading of the statement is expressed by the existentially quantified sentence ‘∃y∀x xLy,’ where the scope of the universal quantifier falls within that of the existential quantifier. (For convenience, we assume here that the variables ‘x’ and ‘y’ are restricted to a domain of persons.) By contrast, the second reading is given by the sentence ‘∀x∃y xLy,’ where the universal quantifier has wide scope with respect to the existential quantifier. Since the Begriffsschrift’s formation rules ensure that the scope of a quantifier will be properly reflected in any sentence in which it occurs, an ambiguous sentence such as the one we started with cannot even be formulated in the language. Scope considerations apply in essentially the same way to sentential operators (e.g., the negation sign ‘¬’ and the conditional sign ‘→’) in the logic of the Begriffsschrift. For instance, the ambiguity of the sentence ‘Every dog is not vicious’ results from the fact that, as stated, the sentence does not determine the scope of the negation sign with respect to that of the universal quantifier. On one reading, the scope of the negation sign falls within that of the quantifier, i.e., ‘∀x(Dx → ¬Vx),’ the statement thus affirming that anything that is a dog is not vicious (or, colloquially expressed: ‘No dogs are vicious’). By contrast, when the statement is read as giving wide scope to the negation sign, i.e., ‘¬∀x(Dx → Vx),’ it becomes the denial of the generalization that all dogs are vicious (i.e., ‘It is not the case that all dogs are vicious’). As the above examples begin to suggest, Frege’s technique of ordering operators according to scope provides the basis for his incorporation of both propositional logic and the logic of relations within quantificational logic.

Frege’s philosophical interest in language extended beyond his characterization of the formal mechanisms of a logically perspicuous language such as the Begriffsschrift. In the classic paper ‘On Sense and Reference’ (1892), Frege presented a framework for a general theory of meaning applicable to both natural languages and formal languages. The core of the doctrine consists in the contention that any adequate account of the meaning of a linguistic expression must recognize two distinct, but related, semantic components. First, there is the expression’s reference, i.e., its denotative relation to a referent (or denoted entity).
Second, there is the expression’s sense, which Frege characterized as a particular manner in which the expression’s referent is cognitively presented to the language user. Frege motivated this distinction by drawing attention to sentences in which the meaning of a singular term is apparently not exhausted by its correlation with a referent. For example, if the meaning of a singular term were to consist only in what it refers to, then the true, but non-trivial, identity statement ‘The evening star is the morning star’ could not differ in meaning from the trivially true identity statement ‘The evening star is the evening star.’ Since the latter statement results from the former by substituting co-referential singular terms, any view that equates meaning with reference will necessarily fail to register any difference in meaning between the two statements. But the two sentences clearly do differ in meaning, since ‘The evening star is the morning star’ is not a trivial identity, but an informative identity – indeed, one that expresses the

content of a genuine astronomical discovery – whereas ‘The evening star is the evening star’ is plainly trivial. Frege accounts for this by suggesting that while ‘the evening star’ and ‘the morning star’ have a common referent, they express different senses. A language user therefore grasps the common referent differently in connection with each of the two expressions, and this in turn accounts for the difference in ‘cognitive value’ between the two identity statements. Frege applies the notion of sense to similar effect in addressing puzzles concerning the meaning of singular terms in so-called ‘intensional contexts,’ for example, belief reports.

In subsequent writings, Frege extended the sense-reference distinction beyond the category of singular terms (which, on Frege’s account, refer specifically to ‘objects’), to all categories of linguistic expression, including monadic and polyadic predicates (which refer to ‘concepts’ and ‘relations,’ respectively), and complete sentences. In the case of sentences, Frege identified as referents the two truth-values, ‘the true’ and ‘the false,’ and he characterized these as special ‘logical objects.’ A sentence’s sense is, by contrast, the ‘thought’ it expresses, where the thought is understood as a compositional product of the senses of the sentence’s linguistic subcomponents. As strained as this extension of the theory may appear, particularly with respect to reference, it brings to light two important features of Frege’s approach to meaning. First, it reflects his insistence that the theory of reference should comprise, inter alia, a theory of semantic value – that is, a systematic account of how the semantic values of complex expressions (which, in the case of sentences, are truth-values) are determined on the basis of the semantic values of their subordinate constituents.
Second, it reflects an endeavor to integrate the theory of semantic value with a plausible general account of linguistic understanding (as given by the theory of sense). Seen in light of these general ambitions, Frege’s theory of sense and reference proposed an agenda that any comprehensive approach to the theory of meaning must in one way or another respect – a point that is amply borne out by subsequent developments in analytic philosophy of language.
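The scope distinctions discussed earlier in this section can be made concrete by evaluating each reading over a small finite domain. The Python sketch below is an illustration only: the people, the animals, and the relations among them are hypothetical, and Python's `any` and `all` stand in for the existential and universal quantifiers.

```python
# Hypothetical domain of persons and a 'loves' relation forming a cycle.
PEOPLE = ["ann", "bob", "cy"]
LOVES = {("ann", "bob"), ("bob", "cy"), ("cy", "ann")}


def loves(x, y):
    """xLy: x loves y."""
    return (x, y) in LOVES


# Reading (1), ∃y∀x xLy: some one person is loved by everyone.
someone_loved_by_all = any(all(loves(x, y) for x in PEOPLE) for y in PEOPLE)

# Reading (2), ∀x∃y xLy: each person loves at least one person.
everyone_loves_someone = all(any(loves(x, y) for y in PEOPLE) for x in PEOPLE)

# Hypothetical domain for 'Every dog is not vicious'.
ANIMALS = ["rex", "fido", "tom"]
DOGS = {"rex", "fido"}
VICIOUS = {"rex"}

# Narrow-scope negation, ∀x(Dx → ¬Vx): no dogs are vicious.
no_dogs_vicious = all((x not in DOGS) or (x not in VICIOUS) for x in ANIMALS)

# Wide-scope negation, ¬∀x(Dx → Vx): not all dogs are vicious.
not_all_dogs_vicious = not all((x not in DOGS) or (x in VICIOUS) for x in ANIMALS)
```

In this toy model the two readings of each sentence come apart: reading (2) of ‘Everyone loves someone’ holds while reading (1) fails, and the wide-scope negation reading of ‘Every dog is not vicious’ holds while the narrow-scope reading fails.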

Russell: Definite Descriptions and Logical Atomism

The idea that logical analysis forms the basis of a general philosophical method is central to the philosophy of Bertrand Russell (1872–1970). It is especially prominent in the works Russell produced over the first quarter of the twentieth century. In this period, Russell developed and defended the doctrine of

‘logical atomism,’ which grew out of his attempt to establish a version of logicism in the philosophy of mathematics, and came to encompass a wide variety of semantic, metaphysical, and epistemological ambitions. The common thread in Russell’s approach to these matters consists in his emphasis on logical analysis as a method for clarifying the ontological structure of the world and the epistemological basis of our knowledge of it. As Russell put it, ‘‘the atom I wish to arrive at is the atom of logical analysis, not physical analysis’’ (1918: 37). Bound up with the notion of a logical atom, understood as a basic residue of logical analysis, is the notion of logical structure itself. Our aim in this section is to illuminate Russell’s conception of logical structure, or ‘logical form,’ as it emerges in his theory of linguistic meaning and in his broader atomism. Russell’s theory of ‘definite descriptions,’ first articulated in his classic paper ‘On denoting’ (1905), paradigmatically illustrates Russell’s methodological reliance on logical analysis in addressing questions about meaning. The argument of the paper involves, among other things, the defense of a principle that Russell regarded as fundamental to the account of linguistic understanding. In The problems of philosophy, Russell gave the following succinct statement of the principle: ‘‘Every proposition which we can understand must be composed wholly of constituents with which we are acquainted’’ (1910: 32). At the time of ‘On denoting,’ Russell meant by a ‘proposition,’ roughly, the state of affairs that is expressed by an indicative sentence, whether or not that state of affairs actually obtains. A proposition’s ‘constituents’ are the real-world entities that figure in the state of affairs (or would figure in it, were the state of affairs to obtain). So understood, a proposition is not a linguistic entity, even in the attenuated sense of a Fregean thought. 
A proposition is, rather, a structured entity that comprises various nonlinguistic components of the world. What characterizes Russell’s principle of acquaintance as a principle of linguistic understanding, then, is not the linguistic nature of propositions, but the correlativity of propositions with the indicative sentences of a language. For Russell, understanding any such sentence requires direct experiential acquaintance with the non-linguistic constituents comprised in the proposition it expresses. In ‘On denoting’ Russell addressed problems that the principle of acquaintance ostensibly confronts in connection with ‘denoting phrases’ – i.e., phrases of the form ‘some x,’ ‘every x,’ and especially ‘the x’ (i.e., so-called ‘definite descriptions’). Consider the statement: ‘The author of ‘‘On denoting’’ was a pacifist.’ Since Russell’s principle requires acquaintance with a proposition’s constituents as a condition for linguistic

understanding, it would seem that only those personally acquainted with the author of ‘On denoting’ (i.e., with Russell himself) are in a position to understand the sentence. However, this highly counterintuitive consequence only arises on the assumption that the denoting phrase ‘the author of ‘‘On denoting’’’ functions as a genuine singular term, one that singles out Russell as a constituent of the corresponding proposition. Russell’s account of definite descriptions challenged this assumption by arguing that the characterization of definite descriptions as singular terms arises from a mistaken account of the logical form of sentences containing them. According to this mistaken analysis, the sentence ‘The author of ‘‘On denoting’’ was a pacifist’ is an instance of the simple subject-predicate form Ps, where s indicates the occurrence of a singular term and P the occurrence of a predicate. Russell maintained that sentences containing definite descriptions have a far richer logical structure than this account would suggest. On Russell’s analysis, the statement ‘The author of ‘‘On denoting’’ was a pacifist’ is not a simple subject-predicate statement but has the form, rather, of a multiply quantified statement:

∃x((x authored ‘On denoting’ & ∀y(y authored ‘On denoting’ → y = x)) & x was a pacifist)

On this analysis, the statement says: there is an x such that (1) x authored ‘On denoting,’ (2) for any y, if y authored ‘On denoting,’ then y = x (this clause serving to ensure the uniqueness implied by the use of the definite article) and (3) x was a pacifist. So construed, the only nonlogical components of the sentence are the descriptive predicates ‘( ) authored ‘‘On denoting’’’ and ‘( ) was a pacifist,’ with no trace remaining of the putative singular term ‘the author of ‘‘On denoting’’’.
Therefore, beyond an implicit understanding of the mechanisms of quantification and the logical relation of identity, acquaintance with the referents of these descriptive predicates suffices for understanding the sentence. The sentence still manages to be about Russell since he, and he alone, satisfies the descriptive predicates (or ‘propositional functions,’ in Russell’s terminology) contained in the sentence. However, it no longer singles out Russell as a constituent of the corresponding proposition, thus dispensing with the worry that the principle of acquaintance would require personal acquaintance with Russell as a condition for understanding what the sentence means. The theory of definite descriptions vividly conveys the sense in which, for Russell, the surface grammar of natural language is inadequate as a guide to the analysis of logical form. Indeed, Russell maintained that many of the metaphysical and epistemological perplexities of traditional philosophy were a direct

result of conflating the grammatical forms of natural language sentences with logical forms of the propositions we manage to express in natural language. In this connection, it is important to recognize that, for Russell, logical form is not a purely linguistic notion. We have already taken note of the fact that Russell’s early philosophy treats propositions as structured complexes that array various non-linguistic components of reality. Though Russell ultimately abandoned his early theory of propositions, he never abandoned the view that reality itself exhibits varieties of structure to which the details of a suitably perspicuous logical language must answer. In his lectures on The philosophy of logical atomism (1918), this view takes the form of a doctrine of ‘facts,’ where facts are understood as real-world complexes of individual objects and the properties and relations predicable of them. On Russell’s characterization, a fact is a kind of complex that is inherently apt to determine a corresponding indicative statement as true or false – that is, true when the statement affirms the fact, and false when it denies it. The kernel of Russell’s atomism consists in the view that the content of any statement (of whatever order of complexity) is ultimately analyzable in terms of the constellation of logically primitive facts that determine the statement as true or false. Russell’s inventory of such facts includes ‘atomic facts,’ in which properties and relations are predicated of metaphysically ‘simple’ entities, and ‘general facts,’ which are facts concerning all or some of a particular category of entity. Atomic facts correspond to the atomic sentences, and general facts to the quantified sentences, of a logically regimented language. All other sentences are ‘molecular’ in the sense that they are compounds built up from atomic sentences and quantified sentences by the application of logical connectives such as ‘not,’ ‘and,’ ‘or,’ and ‘if . . . then . . . 
.’ Though molecular sentences assert neither atomic nor general facts, their truth or falsity is nevertheless dependent upon such facts in the sense that a molecular sentence will be determined as true or false as a function of the truth or falsity of its nonmolecular subsentences. For example, if ‘p’ and ‘q’ are atomic sentences, then the disjunction ‘p or q’ will be true just in case one or both of ‘p’ and ‘q’ are true, where the truth or falsity of these subsentences is determined directly by the atomic facts to which they relate. Russell’s metaphysical view of logical form – that is, his view that logical structure is an inherent characteristic of the facts that the real world ultimately comprises – is nicely expressed in a comment from his Introduction to mathematical philosophy. There Russell maintains that “logic is concerned with the real world just as truly as zoology, though with its more abstract and general features” (1919: 169). At least part of Russell’s motivation for this ‘substantive’ conception of logic consists in his abiding conviction that the structure of language (or of an ideal language, at any rate) and the structure of the world must in some way coincide if there is to be any prospect of expressing our knowledge of the world by linguistic means. The task of giving a detailed account of this community of form between language and world was one that Russell wrestled with many times over, but which he ultimately left to the talents of his most gifted student, Ludwig Wittgenstein – the second great exponent of the philosophy of logical atomism.

Wittgenstein on Logic and Language

Of the figures we are discussing, Wittgenstein (1889–1951) arguably addressed the question of the relation between logic and language most extensively. His earliest major work, the Tractatus logico-philosophicus, devotes much attention to this problem. It is on this early work that we will focus here. In this work he praised Russell for discovering that the apparent logical form of a proposition need not be its real logical form. He supplemented Russell’s view with the claim that the real form of a proposition is a picture of a state of affairs in the world. Propositions, according to the Tractatus, are pictures of facts. The structure of a proposition mirrors the structure of the fact it represents. What a fact and the proposition that describes it have in common is their form. “The picture, however, cannot represent its form of representation; it shows it forth” (2.172). Here we see the important Tractarian distinction between saying and showing. The statement ‘it is now raining’ says something about the world. The statement ‘it is either the case that it is now raining or it is not the case that it is now raining’ says nothing about the world. It does, however, show the logical relations between facts. If something can be shown it cannot be said (4.1212). It follows that nothing concerning the logical relations between facts can be said. According to the Tractatus “the world is the totality of facts, not of things” (1.1). That is to say, the world is not completely described by a list of all the objects that it contains. Rather, a complete description of the world would consist of all true sentences. Facts can either be atomic or compound, and correspondingly there are two types of propositions. Atomic facts are the most basic type of fact, and all atomic facts are independent of one another. Likewise, any possible set of atomic propositions could be true at the same time.
This does not hold generally, as p and ¬p (it is not the case that p) cannot both be true at the same time. Compound propositions are built up by truth functions on atomic propositions. Any operator, including the logical operators (and, or, not, . . .), that takes sentences as arguments and assigns a truth value to the compound expression based only on the truth value of the arguments, is called a truth functional operator. For instance, ‘or’ designates a truth function with two argument places: the truth value of the sentence ‘p or q’ depends only on the truth value of the sentences ‘p’ and ‘q’. On the other hand, in the sentence ‘Julius Caesar conquered Gaul before Rome fell to barbarians,’ ‘. . . before . . .’ designates a function that takes sentences as arguments, but it is not truth functional since we need to know more than the truth value of the arguments to determine the truth value of the compound. Wittgenstein observed that all propositions are either atomic or built up by truth functions on atomic propositions. Because of this all propositions can be expressed as a truth function on a set of atomic propositions. Statements such as all of those of the form ‘p or ¬p’ are tautologies: they are true no matter what the truth value of the constituents. We can know for certain that a tautology is true, but this is only because tautologies are true independently of which atomic facts turn out to be true (and because all sentences are truth functions of atomic sentences). We cannot say that the world has a certain logical structure; this can only be shown. It is tautologies that show the logical syntax of language, but tautologies say nothing. “Logical propositions describe the scaffolding of the world, or rather they present it. They ‘treat’ of nothing” (6.124). Concerning sentences of natural language, Wittgenstein thought that no serious reconstruction is necessary. “In fact, all the propositions of our everyday language, just as they stand, are in perfect logical order” (5.5563).
Furthermore, Wittgenstein thought that what he says about the logical structure of language must already be known by anyone who can understand the language. “If we know on purely logical grounds that there must be elementary propositions, then everyone who understands propositions in their unanalysed form must know it” (5.5562). Remember that Wittgenstein, following Russell, distinguished between the apparent logical form of a proposition and its real logical form. The logical form of natural languages is extremely complex and shrouded in conventions. “Man possesses the capacity of constructing languages, in which every sense can be expressed, without having an idea of how and what each word means – just as one speaks without knowing how the single sounds are produced. Colloquial language is part of the human organism and is no less complicated than it. From it it is humanly impossible to gather immediately the logic of language” (4.002).
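
The notions of a truth-functional operator and of a tautology discussed above can be illustrated with a small truth-table check. This is only a sketch: the function name `is_tautology` and the use of Python booleans to stand for truth values are assumptions made for the illustration, not anything found in the Tractatus.

```python
from itertools import product

def is_tautology(formula, variables):
    """Return True if the formula is true under every assignment of truth values."""
    return all(
        formula(**dict(zip(variables, values)))
        for values in product([True, False], repeat=len(variables))
    )

# 'p or not p': true whatever the truth value of p, so it says nothing about the world.
excluded_middle = lambda p: p or not p

# 'p or q': truth functional, but its truth depends on which atomic sentences are true.
disjunction = lambda p, q: p or q

print(is_tautology(excluded_middle, ["p"]))   # True
print(is_tautology(disjunction, ["p", "q"]))  # False
```

A connective such as ‘. . . before . . .’ could not be written as a function of this kind, since its value is not fixed by the truth values of its arguments alone.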

While ordinary use of natural language is in perfect logical order, philosophy arises from the abuse of natural language. Wittgenstein thinks that philosophy is nonsense because it attempts to state what cannot be said. “Most propositions and questions, that have been written about philosophical matters, are not false, but senseless” (4.003). The view that philosophy as standardly practiced is senseless, and that a radically new approach to philosophy must therefore be developed, had a profound influence on a group of philosophers who held weekly meetings in Vienna – the Vienna Circle.

Carnap and the Vienna Circle

Rudolf Carnap (1891–1970) is generally regarded as the most influential member of the Vienna Circle. This group (often called the logical positivists or logical empiricists) studied Wittgenstein’s Tractatus carefully and much of what they wrote was inspired by or was a reaction to this work. Logical positivism is often thought of as being characterized by its commitment to verificationism. In its strictest form, verificationism is the view that the meaning of a sentence consists in the method of its verification – that is, in the epistemic conditions under which the statement would properly be acknowledged as true. In a less strict form, it is the view that the meaning of a sentence consists of what would count as evidence for or against it. There was much debate in the Circle as to what form the verificationist principle should take. There was in the Circle very little objection (Gödel being the notable exception) to the thesis that there are two different kinds of statements – empirical (synthetic) and logico-mathematical (analytic) statements. Concerning empirical statements, their meaning is given by what would amount to a verification (or confirmation on a less strict view) of the statement or its negation. Concerning logico-mathematical statements, the Circle was much influenced by Wittgenstein’s view that tautologies are a priori truths – truths that are knowable independently of experience because they say nothing concerning the state of the empirical world. What Wittgenstein counted as a logical truth (a tautology) was not sufficiently broad to include all of mathematics. Since mathematical truths are not empirical assertions, members of the Vienna Circle thought they should have the same status as other logical truths. Carnap undertook to broaden the definition of logical truth so as to include all mathematical statements. To do this Carnap had to answer the question of what makes something a logical truth.
Carnap’s answer to this question involved the adoption of a strong form of conventionalism, which he expressed in terms of his famous ‘principle of tolerance’: “In logic, there are no morals. Everyone is at liberty to build up his own logic, i.e. his own form of language, as he wishes. All that is asked of him is that, if he wishes to discuss it, he must state his methods clearly, and give syntactic rules instead of philosophical arguments” (The logical syntax of language, §17). This principle states that logical truth is a matter of convention. Which statements are treated as belonging to the set of analytic statements is a matter of pragmatic decision, provided that the set can be clearly defined. There is, for Carnap, no logical structure of the world that is either rightly or wrongly captured by our choice of logic. Logical relationships between sentences are a matter of stipulation on our part. However, by classifying logical statements as analytic, and therefore independent of empirical circumstances, Carnap preserves the Wittgensteinian idea that logical truths say nothing about the world. Carnap later summarized his position on logico-mathematical truth by claiming that analytic statements are true in virtue of their meaning. Carnap’s principle of tolerance was inspired by the debates concerning the foundations of mathematics in the 1920s. One party to this debate was the intuitionists, who did not believe that we have grounds to assert a mathematical sentence of the form ‘p or ¬p,’ unless we have a proof of either p or of ¬p. According to classical logic, the sentence ‘p or ¬p’ is a tautology; it therefore stands in no need of prior justification. Intuitionists therefore needed to abandon classical logic in favor of a logic that would not count all instances of the law of the excluded middle (p or ¬p) as valid. Carnap saw both classical and intuitionistic logic as well motivated, and saw nothing that could decide between the two. He therefore saw the decision of which logic to adopt as a matter of choice.
Further developments in logic amplified the differences between Carnap and Wittgenstein. Gödel’s famous incompleteness theorems made use of a technique that has since become known as Gödel numbering. By Gödel numbering we assign code numbers to expressions of the language. Through this coding technique, a language capable of expressing arithmetical properties becomes a device for discussing certain syntactic properties of any language system. Carnap saw this as a refutation of Wittgenstein’s idea that the logical syntax of language is inexpressible. In fact, one of the general goals of Carnap’s The logical syntax of language was to show that it is possible to deal in a clear systematic manner with the syntactic properties of any language. Recall that for Wittgenstein, we cannot say anything concerning the logical syntax of language.
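
The idea of assigning code numbers to expressions can be sketched with the familiar prime-power coding, in which the n-th symbol of an expression becomes the exponent of the n-th prime. The symbol table `CODES` below is a hypothetical assignment chosen only for the illustration (any fixed one-to-one assignment of positive integers would serve), and the sketch makes no claim to reproduce Gödel’s own coding in detail.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short expressions)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

# Hypothetical symbol codes for a tiny arithmetical language (an assumption).
CODES = {"0": 1, "S": 2, "=": 3, "(": 4, ")": 5, "+": 6}
SYMBOLS = {code: sym for sym, code in CODES.items()}

def godel_number(expression):
    """Encode an expression as the product of prime(i) ** code(i-th symbol)."""
    number = 1
    for prime, symbol in zip(primes(), expression):
        number *= prime ** CODES[symbol]
    return number

def decode(number):
    """Recover the expression by reading off the exponents of successive primes."""
    symbols = []
    for prime in primes():
        if number == 1:
            break
        exponent = 0
        while number % prime == 0:
            number //= prime
            exponent += 1
        symbols.append(SYMBOLS[exponent])  # KeyError if number is not a valid code
    return "".join(symbols)

n = godel_number("S0=S0")
print(n)          # 808500
print(decode(n))  # S0=S0
```

Because the expression can be recovered from the number, a syntactic property of expressions (for example, ‘begins with S’) corresponds to an arithmetical property of their code numbers, which is the sense in which arithmetic can discuss syntax.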

Carnap’s logical tolerance led him to assert that even statements of the form (∃x)Px, which assert the existence of an object with the property P, might be true by stipulation. That we could stipulate an object into existence seemed odd to many philosophers. In order to address this worry, Carnap formulated a distinction between internal and external questions of existence. In the language system of arithmetic it is provable that (∃x) (7 < x < 9). Relative to this language, the question of the existence of numbers is trivial. But when someone asks whether numbers exist they do not mean to be asking the question in such a manner that it is answerable by appeal to the standards of proof and disproof that prevail in arithmetic. Rather, they mean to ask if the numbers really exist in some absolute sense. Carnap viewed such ‘external’ questions as unanswerable given that they remove the question from a context in which there are clear standards for addressing it, without embedding it in another context where there are any such standards. But the coherence of the language system that includes, for instance, the numbers does not depend on a positive answer to the external question of the existence of numbers. In this way, not only the logical structure of the world but its ontology as well becomes a matter of convention.

Quine: the Thesis of Gradualism

W. V. O. Quine (1908–2000) began his philosophical career as a self-described disciple of Carnap’s. However, from his earliest interaction with Carnap, Quine questioned Carnap’s strict division between analytic and synthetic sentences. This early reaction to Carnap’s work grew into a major break between the two philosophers. Recall that the analytic/synthetic distinction divides sentences (respectively) into those that are accepted by stipulation and are true in virtue of meaning, and those that concern the world and are capable of being empirically confirmed. Quine thought that this difference in kind ought to be replaced with a difference of degree. This is known as the thesis of gradualism. For Quine our knowledge forms a structure like a web. The nodes of the web are sentences and the links between nodes are entailment relations. Only at the periphery are our decisions to accept or reject sentences directly influenced by experience. Decisions over sentences closer to the center are of an increasingly ‘theoretical’ character, with accepted logical and mathematical statements forming the most central class. The ordering of sentences from periphery to interior is based on how willing we would be to abandon a sentence when revising our beliefs in light of new evidence. For sentences like ‘this table is red’ we can easily imagine a set of experiences that would lead us to abandon it. By contrast, it is far more difficult to imagine the experiences that would lead us to abandon ‘2 + 2 = 4.’ Abandoning this statement would entail a far more radical change in our overall belief system. However, for Quine, the difference is one of degree, rather than kind. No sentence, mathematical, logical or otherwise, is ultimately immune from revision in light of experience. Physics, for instance, in departing from Euclidean geometry has abandoned sentences such as ‘between any two points exactly one straight line can be drawn,’ once believed to be on the firmest foundation. We have seen that, for Carnap, logico-mathematical truths are not responsible to any aspect of the world. We are perfectly free to accept any set of sentences to count as analytic. The way the world is affects only the practical utility of our choices of analytic statements; it does not affect the theoretical legitimacy of those choices. (However, Carnap is far more interested in giving reconstructions of existing notions than in constructing arbitrary systems.) For Quine, on the other hand, logical and mathematical truths are on a par with highly theoretical statements of physics. It may turn out that by abandoning classical logic or altering our mathematics we will be able to formulate simpler scientific theories. Since simplicity is one of the norms of theory choice, it may be that our best scientific theory does not conform to the laws of classical logic. Carnap’s principle of tolerance suggests that logical truths are true by virtue of the meanings assigned to the logical vocabulary. Quine rejects this view and sees logical truths as subject to the same standards of acceptance as any other scientific claim.
Since there are certain sets of experiences that would lead us to reject what we now regard as a logical truth, Quine maintained that we could no longer hold, as Wittgenstein did, that logical truths are true independently of how things happen to be in the empirical world. Logical truths therefore lose their special status and become statements on a par with other scientific claims. They are true because they are part of our best description of the world.

See also: Analytic Philosophy; Analytic/Synthetic, Necessary/Contingent, and a priori/a posteriori; A Priori Knowledge: Linguistic Aspects; Logical Consequence; Propositions; Semantic Value; Sense and Reference: Philosophical Aspects.

Bibliography

Boole G (1854). The laws of thought. [Reprinted New York: Dover, 1958.]
Carnap R (1937). The logical syntax of language. Amethe Smeaton (trans.). New Jersey: Littlefield Adams, 1959.
Carnap R (1950). ‘Empiricism, semantics and ontology.’ In Sarkar S (ed.). 1996.
Frege G (1879). ‘Begriffsschrift: a formula language modeled upon that of arithmetic, for pure thought.’ Bauer-Mengelberg S (trans.). In van Heijenoort Jean (ed.) From Frege to Gödel: a source book in mathematical logic. Cambridge, MA: Harvard University Press, 1976.
Frege G (1892). ‘On sense and reference.’ Black M (trans.). In Black M & Geach P T (eds.) Translations from the philosophical writings of Gottlob Frege, 2nd edn. Oxford: Basil Blackwell, 1960.
Frege G (1893). Grundgesetze der Arithmetik (vol. 1). Jena: Verlag Hermann Pohle. [Partially translated as Furth M, The basic laws of arithmetic by University of California Press, Berkeley 1964.]
Leibniz G W (1696). ‘Letter to Gabriel Wagner on the value of logic.’ Loemker D L (ed. & trans.). In Philosophical papers and letters, 2nd edn. Dordrecht & Boston: Reidel Publishing, 1976.
Leibniz G W (1679). ‘Elements of a calculus.’ In Parkinson G H R (ed. & trans.). 1966.
Leibniz G W (1679/1686). ‘A specimen of the universal calculus.’ In Parkinson G H R (ed. & trans.). 1966.
Parkinson G H R (ed. & trans.) (1966). Logical papers. Oxford: Clarendon Press.
Quine W V (1951). ‘Two dogmas of empiricism.’ In Sarkar S (ed.). 1996.
Quine W V (1969). ‘Epistemology naturalized.’ In Ontological relativity and other essays. New York: Columbia University Press.
Russell B (1905). ‘On denoting.’ [Reprinted in Marsh R C (ed.) Logic and knowledge. London: Unwin, 1988.]
Russell B (1912). The problems of philosophy. [Reprinted London: Oxford University Press, 1986.]
Russell B (1918). The philosophy of logical atomism. Pears D (ed.). La Salle, Illinois: Open Court, 1972.
Russell B (1919). Introduction to mathematical philosophy. New York: Clarion, 1971.
Sarkar S (ed.) (1996). Science and philosophy in the twentieth century v. 5. New York: Garland.
Wittgenstein L (1918). Tractatus logico-philosophicus. Ogden C K (trans.). London: RKP, 1988.


Logical and Linguistic Notation J Lawler, University of Michigan, Ann Arbor, MI, USA © 2006 Elsevier Ltd. All rights reserved.

Notation is a conventional written system for encoding a formal axiomatic system. Notation governs
• The rules for assignment of written symbols to elements of the axiomatic system
• The writing and interpretation rules for well-formed formulae in the axiomatic system
• The derived writing and interpretation rules for representing transformations of formulae, in accordance with the rules of deduction in the axiomatic system.

All formal systems impose notational conventions on the forms. Just as in natural language, to some extent such conventions are matters of style and politics, even defining group affiliation. Thus, notational conventions display sociolinguistic variation; alternate conventions are often in competing use, though there is usually substantial agreement on a ‘classical’ core notation taught to neophytes. This article is about notational conventions in formal logic, which is, in the view of most mathematicians, the branch of mathematics most concerned with many questions that arise in natural language, e.g., questions of meaning, syntax, predication, well-formedness, and (for our purposes the most important) detailed, precise specification. (Logicians, by contrast, tend to think of mathematics as a branch of logic; both metaphors are correct, in the appropriate formal axiomatic system.) Specification is the purpose of notation, both in mathematics and in science, but such precise conventions are unavoidably context sensitive. Thus, the use of logical notation is different in logic and in linguistics. Bochenski (1948; English translation 1960) is still the best short introduction to logical notation. Almost all logical notation is modern, dating from the last century and a half. However, there is some prior work that deserves comment here, because logic, alone of all mathematical fields, was widely studied and significantly developed in the European Middle Ages. The principal concern of medieval logicians was the syllogism.
By the time of the Renaissance, there was an extensive and thorough account of syllogistic. One of its major achievements was the development of systematic names for the modes of the syllogism. These names (as conventionally grouped, Barbara, Celarent, Darii, Ferio, Barbarix, Feraxo; Cesare, Festino, Camestres, Baroco, Camestrop, Cesarox; Darapti, Disamis, Datisi, Felapton, Bocardo, Ferison; Bramantip, Camenes, Dimaris, Fesapo, Fresison, Camenop) constitute the first real notational convention in logic. The names are mnemonic, designed to be chanted, like Pāṇini’s rules. The three vowels in each name are drawn from the letters A, E, I, O, which mark the vertices of the Square of Opposition, indicating the proposition type (respectively, Universal Affirmative, Universal Negative, Existential Affirmative, and Existential Negative) of each of the three propositions of the syllogism. The letters s, p, m, and c also have specific meanings in these mnemonics, summarizing relevant logical properties of each type, thus serving the notational goal of detailed, precise specification. Syllogistic is largely of historical interest in modern logic, but its concerns, terminology, and notation continued to be used and understood until well into the development of modern logic. In modern logic and mathematics, notation is a necessary part of a calculus, one of a number of special sets of formalized concepts and techniques for manipulating them. The metaphor refers to the origins of classical calculation, which was performed with pebbles (Lat calculus) on a counting table or abacus. This metaphor licenses notational practice with formal systems, which is to
• Encode: represent parts (hopefully, natural parts) of a quantity, concept, or truth with symbols, then
• Calculate: push those symbols around in conventionally accepted fashions, hoping thereby to
• Decode: find in the changed symbolic patterns representations of previously unknown quantities, concepts, or truths.

Calculi are an invention of the 17th century; the best known are Leibniz’s integral calculus and Newton’s differential calculus, which (together) are understood as the default meaning of calculus in modern English. There are many calculi in modern mathematics, some of which exist in name only, such as Leibniz’s putative calculus ratiocinator.
In symbolic logic, which is the closest thing to what Leibniz called for, the two most important calculi, each with its own notational conventions, are Propositional Calculus and Predicate Calculus, both of which were originally intended straightforwardly (as the titles of their original publications show) as tools for the representation of human thought. Language did not enter into the picture at first, except as a transparent expression of thought.
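
The vowel convention in the syllogistic mood names described above lends itself to a small decoding sketch. The function name `decode_mood` is hypothetical, and the sketch reads only the vowels; it makes no attempt to interpret the consonants s, p, m, and c.

```python
# Proposition types marked by the vowels, as listed in the text.
PROPOSITION_TYPES = {
    "a": "Universal Affirmative",
    "e": "Universal Negative",
    "i": "Existential Affirmative",
    "o": "Existential Negative",
}

def decode_mood(name):
    """Return the proposition types of the three propositions named by a mood."""
    vowels = [ch for ch in name.lower() if ch in PROPOSITION_TYPES]
    if len(vowels) != 3:
        raise ValueError(f"{name!r} does not contain exactly three of A, E, I, O")
    return [PROPOSITION_TYPES[v] for v in vowels]

print(decode_mood("Barbara"))
# ['Universal Affirmative', 'Universal Affirmative', 'Universal Affirmative']
print(decode_mood("Ferio"))
# ['Universal Negative', 'Existential Affirmative', 'Existential Negative']
```

In the traditional reading the three types are given in the order major premise, minor premise, conclusion, so Barbara names a syllogism built entirely from universal affirmatives.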


Propositional Calculus

Symbolic logic, and its notation, originated in the works of George Boole (1815–1864), of which Boole (1854) is the best known. Boole’s intention was to produce an algebraic account of propositions as combined via what we have come to call Boolean connectors, principally (logical) and, or, not, equivalent, and implies, which then achieved the dual status of both English words that are prominent in logical discussion and mathematically defined functors. Boole first made explicit the alternation between logical ‘and’ and ‘or’ enshrined in DeMorgan’s Laws, comparing them (respectively) to multiplication and addition in algebra; thus he represented x and y as xy, whereas x or y was x + y; he used numbers throughout, using 1 − x to represent not x, for instance. This notational convention, and the system based on it, is often called Boolean Algebra (though technically it is a complemented distributive lattice, not an algebra). Boole’s logic did not use quantifiers per se; instead, he dealt with the quantification inherent in syllogistic by using the traditional letters A, E, I, O. Propositional calculus, the calculus of arbitrary whole propositions without regard to their predicates or arguments, uses two major notations. One, usually called ‘Classical’ or ‘Standard,’ exists in numerous individual variations and is usually the one taught to students; the other, called ‘Polish,’ ‘Łukasiewicz,’ or ‘Prefix,’ is standardized and in widespread use. Classical notation for propositional calculus uses lowercase letters for propositions (traditionally p, q, r, s) and special symbols for their connectives. The two truth values of a proposition are usually either T and F, or 1 and 0. In ternary logics, T / F / # is more common than numeric codes, because arithmetic systems like 1 / 0 / −1 or 1 / ½ / 0 make implicit combinatorial claims. The Classical special symbols for functors include those shown in Table 1.
In each case, the first symbol is the most widely accepted. In addition to functors, propositional logic also contains symbols for pragmatic connectives used in proofs, such as entailment, which usually uses a single arrow (→), and assertion, which uses a variety of symbols, including ⊢.

Table 1 Classical special symbols for functors

not p                     ¬p, -p, ~p, p̄
p and q                   p ∧ q, p · q, p & q, pq
p or q                    p ∨ q
p is equivalent to q      p ≡ q, p ↔ q, p ⇔ q
p implies q               p ⊃ q, p → q, p ⇒ q

There are also parentheses, because grouping of formulae can introduce significant ambiguity, which is anathema in logic. In extended use, parentheses were found to be burdensome, because balancing them was a frequent source of avoidable error. To combat this, Whitehead and Russell, in their monumental Principia mathematica (1910–13), developed a special parenthesis-free notation to augment their Classical formulae, based on using groups of 1, 2, 3, . . . dots to separate propositions. This version is rarely seen today. Polish notation was developed and popularized by Jan Łukasiewicz (1878–1956) in the early 1920s as a by-product of his development of ternary logic, for which he also invented the truth table. In this notation, propositions are again represented by lowercase letters, but functors are uppercase letters placed immediately before their argument(s): not p is Np, p and q is Kpq, p or q is Apq, p implies q is Cpq, and p is equivalent to q is Epq. Because functors form valid propositions, these can be nested indefinitely without recourse to parentheses; for instance, DeMorgan’s Laws, which are stated in classical notation as ¬(p ∧ q) ≡ ¬p ∨ ¬q and ¬(p ∨ q) ≡ ¬p ∧ ¬q, are stated in Polish notation respectively as ENKpqANpNq and ENApqKNpNq. Because the prefixal position of the Polish functors is arbitrary, a postfixal variant, called Reverse Polish Notation, or RPN (linguists note that it should be called ‘Japanese Notation,’ because it acts like a standard SOV language), is equally valid and is widely used in computing circles, because it turns out to be ideally adapted to performing calculations using a pushdown stack. In RPN, DeMorgan’s laws are stated as pqKNpNqNAE and pqANpNqNKE. Modal Logic, an extension of propositional calculus into modality, introduces two more common notational symbols, ◇p for p is possibly true (in Polish notation Mp, for Möglich), and □p for p is necessarily true (Polish Lp, for Logisch). DeMorgan’s Laws for modal logic (where □ is associated with ∧ and ◇ with ∨) can thus be stated ¬□p ≡ ◇¬p (Polish ENLpMNp) and ¬◇p ≡ □¬p (Polish ENMpLNp).
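
The remark that parenthesis-free notation suits a pushdown stack can be made concrete with a short evaluator. This is a sketch under stated assumptions: the functor letters N, K, A, C, E are those given in the text, while the function names `evaluate` and `is_tautology`, the restriction to single lowercase proposition letters, and the omission of the modal functors M and L are choices made for the illustration. The formula strings write DeMorgan’s Laws with N applied to the whole conjunction or disjunction.

```python
from itertools import product

UNARY = {"N": lambda a: not a}            # not
BINARY = {
    "K": lambda a, b: a and b,            # and
    "A": lambda a, b: a or b,             # or
    "C": lambda a, b: (not a) or b,       # implies
    "E": lambda a, b: a == b,             # is equivalent to
}

def evaluate(formula, env, postfix=False):
    """Evaluate a Polish (prefix) or Reverse Polish (postfix) formula on a stack."""
    # A prefix formula read right to left can be handled by the same stack loop.
    symbols = formula if postfix else reversed(formula)
    stack = []
    for sym in symbols:
        if sym in UNARY:
            stack.append(UNARY[sym](stack.pop()))
        elif sym in BINARY:
            first, second = stack.pop(), stack.pop()
            left, right = (second, first) if postfix else (first, second)
            stack.append(BINARY[sym](left, right))
        else:
            stack.append(env[sym])        # a lowercase proposition letter
    if len(stack) != 1:
        raise ValueError(f"ill-formed formula: {formula}")
    return stack[0]

def is_tautology(formula, postfix=False):
    """Check the formula under every assignment to its proposition letters."""
    letters = sorted({ch for ch in formula if ch.islower()})
    return all(
        evaluate(formula, dict(zip(letters, values)), postfix)
        for values in product([True, False], repeat=len(letters))
    )

print(is_tautology("ENKpqANpNq"))                # True
print(is_tautology("ENApqKNpNq"))                # True
print(is_tautology("pqKNpNqNAE", postfix=True))  # True
print(is_tautology("Cpq"))                       # False
```

No grouping information is needed at any point: each functor simply consumes the values already on the stack, which is why the notation dispenses with parentheses.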

Predicate Calculus

Quantified Predicate Calculus (both first- and second-order) was first axiomatized and used notationally by Gottlob Frege (1848–1925) in 1879, a quarter-century after Boole. In predicate calculus, the atomic proposition of propositional calculus is split into predicate and argument(s), allowing far more representation of actual natural language phenomena. To represent predication, Frege introduced the now-standard functional notation, widely used in mathematics. In this notation, an atomic proposition p could now be seen to consist of a predicate (typically using uppercase letters) operating on arguments expressed by following parenthesized variables, in the same way as a mathematical function like f(x) = x²; e.g., TALL (x) = X is tall, SEE (x, y) = X sees Y, and GIVE (x, y, z) = X gives Y to Z. In particular, quantifiers were separated by Frege for the first time from their traditional Aristotelian A, E, I, O notation. Quantifiers in natural language are specialized words that often involve special syntax; normally they appear in construction with some noun, which they are said to bind. However, their syntax varies widely, and quantifier ambiguities are frequent. Modern logic admits what McCawley (1993) calls “the logicians’ favorite quantifiers”: the existential quantifier, ∃x, pronounced ‘for some x’ or ‘there exists an x,’ and the universal quantifier, ∀x, pronounced ‘for all/every/each x.’ The x’s in each case are dummy variables; they do no more than indicate which variable in the proposition following is to be considered bound by the quantifier. Quantifiers are rigidly controlled in the formulae in order to avoid ambiguity (and indeed to allow natural ambiguities to be explicated). They are placed before the formula containing the variable they bind, and their relative placement serves to denote the concept of scope, which is highly relevant to the three natural language elements represented in logic by operators (i.e., quantification, negation, and modality), all of which govern scope phenomena like Negative Polarity. Thus, the two readings of the ambiguous sentence ‘A boy beat every girl at tennis’ are represented by (∃x) (∀y) BEAT (x, y) and (∀y) (∃x) BEAT (x, y).
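
The difference between the two scope readings can be checked against a small finite model. Everything in the sketch below is hypothetical: the names in `BOYS` and `GIRLS`, the two sets of match results, and the decision to restrict x to boys and y to girls (the formulae above leave that restriction implicit).

```python
# Hypothetical finite model: two boys, two girls, and records of who beat whom.
BOYS = {"Al", "Bo"}
GIRLS = {"Cy", "Di"}

def some_boy_beat_every_girl(beat):
    """(∃x)(∀y) BEAT(x, y): one particular boy beat all the girls."""
    return any(all((x, y) in beat for y in GIRLS) for x in BOYS)

def every_girl_beaten_by_some_boy(beat):
    """(∀y)(∃x) BEAT(x, y): each girl lost to some boy, possibly a different one."""
    return all(any((x, y) in beat for x in BOYS) for y in GIRLS)

split_wins = {("Al", "Cy"), ("Bo", "Di")}    # each boy beat a different girl
one_champion = {("Al", "Cy"), ("Al", "Di")}  # one boy beat both girls

for name, beat in [("split_wins", split_wins), ("one_champion", one_champion)]:
    print(name, some_boy_beat_every_girl(beat), every_girl_beaten_by_some_boy(beat))
# split_wins False True
# one_champion True True
```

The first model separates the readings: the wide-scope universal reading holds while the wide-scope existential reading fails, so the order of the quantifiers carries real information.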

Naturally, there are variations in quantifier notation as well: a formula like DeMorgan’s Laws for quantifiers, which can be written ENΠxφxΣxNφx and ENΣxφxΠxNφx [Πx is (∀x) and Σx is (∃x)] in Polish notation, comes out as ¬(∀x) φ(x) ≡ (∃x) ¬φ(x) and ¬(∃x) φ(x) ≡ (∀x) ¬φ(x) in Classical notation, which also admits a simple parenthesized variable (x) instead of (∀x), and also one with a circumflex (ŷ) instead of (∃y), in the appropriate position. The use of parentheses, colons, brackets, and other punctuation with quantifiers is inconsistent and follows individual style, which is usually oriented toward scope delimitation.

See also: Assertion; Cladistics; Meaning: Overview of Philosophical Theories; Negation; Number; Panini; Predication; Propositional and Predicate Logic: Linguistic Aspects; Quantifiers: Semantics; Scope and Binding: Semantic Aspects.

Bibliography

Bochenski I M, O P (1948). Précis de logique mathématique. Bussum, Netherlands: F. G. Kroonder. English translation (1960) by Otto Bird, ‘A précis of mathematical logic.’ Dordrecht: Reidel.
Boole G (1854). An investigation of the laws of thought, on which are founded the mathematical theories of logic and probabilities. London: Walton and Maberley.
Frege G (1879). Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: L. Nebert.
McCawley J D (1993). Everything that linguists have always wanted to know about logic (but were ashamed to ask) (2nd edn.). Chicago: University of Chicago Press.

Logical Consequence P Blanchette, University of Notre Dame, Notre Dame, IN, USA © 2006 Elsevier Ltd. All rights reserved.

Fundamentals

Logical consequence is the relation that holds between the premises and conclusion of an argument when the conclusion follows from the premises, and does so for purely logical reasons. When a conclusion is a logical consequence of premises, the truth of those premises suffices to guarantee the truth of the conclusion. To clarify, we’ll look at some examples.

When we reason that (A1) Socrates is mortal

follows from (A2) Socrates is human

and (A3) All humans are mortal,

we need not appeal to any known facts about Socrates, about humanity, or about mortality. These specifics are irrelevant to the fact that (A1) follows from (A2) and (A3), which shows that the sense of ‘following-from’ involved here is the purely logical

sense. That is, (A1) is a logical consequence of (A2) and (A3). By contrast, when we reason that (B1) There are mammals in the ocean

follows from (B2) There are dolphins in the ocean

we must appeal to facts peculiar to the nature of mammals and of dolphins. We appeal, specifically, to the fact that dolphins are mammals. In this case, although there is a sense in which (B1) ‘follows from’ (B2), (B1) does not follow logically from (B2). It follows, one might say, biologically, because an appeal to biological facts is needed to get from (B2) to (B1). Nevertheless, the fact that (B1) follows in this extra-logical way from (B2) is because of the relation of logical consequence. Specifically, it is because of the fact that (B1) is a logical consequence of (B2) together with (B3) All dolphins are mammals.

That (B1) is a logical consequence of (B2) and (B3) can be seen by noting that it follows from them independently of the specific nature of the objects, properties, and relations mentioned in these statements. In general, all cases of ‘following from’ are due, in this way, to the relation of logical consequence. If a conclusion follows from some collection of premises, this is because that conclusion is a logical consequence of the premises, together perhaps with various ancillary claims that are presupposed in the given context. Logical consequence is therefore a ubiquitous relation: all of our reasoning turns on recognizing (or attempting to recognize) relations of logical consequence, and virtually all of the important connections between theories, claims, predictions, and so on, are in large part due to logical consequence. Furthermore, whenever we say that a given argument is valid or that it is invalid, or that a particular set of claims is consistent or inconsistent, we are employing the notion of logical consequence: a valid argument is one the conclusion of which is a logical consequence of its premises, whereas a consistent set of claims is a collection that has no contradiction as a logical consequence. Because the central logical notions of validity, consistency, etc., are definable in terms of logical consequence, the investigation of the nature of logical consequence is at the same time the investigation of the nature of the logical properties and relations in general.
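The contrast between the two arguments can be illustrated by searching for counterexamples: interpretations that make the premises true and the conclusion false. The sketch below is an illustration under stated assumptions, not a general decision procedure. It reinterprets three predicate letters and one name over domains of up to three elements; all function and variable names are hypothetical. Finding a counterexample does show that a conclusion is not a logical consequence of the premises, whereas failing to find one on small domains is only suggestive.

```python
from itertools import product


def subsets(domain):
    """Yield every subset of `domain` as a frozenset."""
    for bits in product([False, True], repeat=len(domain)):
        yield frozenset(d for d, keep in zip(domain, bits) if keep)


def interpretations(domain):
    """Yield (P, Q, R, c): extensions for three predicates and a referent for one name."""
    for P in subsets(domain):
        for Q in subsets(domain):
            for R in subsets(domain):
                for c in domain:
                    yield P, Q, R, c


def find_countermodel(premises, conclusion, max_size=3):
    """Search small domains for true premises together with a false conclusion."""
    for n in range(1, max_size + 1):
        for interp in interpretations(list(range(n))):
            if all(p(*interp) for p in premises) and not conclusion(*interp):
                return interp
    return None


# Argument A: P = human, Q = mortal, c = Socrates (R is unused).
a1 = lambda P, Q, R, c: c in Q        # (A1) Socrates is mortal
a2 = lambda P, Q, R, c: c in P        # (A2) Socrates is human
a3 = lambda P, Q, R, c: P <= Q        # (A3) All humans are mortal

# Argument B: P = dolphin, Q = mammal, R = in the ocean (c is unused).
b1 = lambda P, Q, R, c: bool(Q & R)   # (B1) There are mammals in the ocean
b2 = lambda P, Q, R, c: bool(P & R)   # (B2) There are dolphins in the ocean
b3 = lambda P, Q, R, c: P <= Q        # (B3) All dolphins are mammals

print(find_countermodel([a2, a3], a1))          # None
print(find_countermodel([b2], b1) is not None)  # True
print(find_countermodel([b2, b3], b1))          # None
```

The search finds a reinterpretation on which (B2) is true and (B1) false, which is why (B1) does not follow logically from (B2) alone; once (B3) is added as a premise, no such reinterpretation turns up.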

The Formal Study of Logical Consequence

The modern investigation of logical consequence is closely connected to the discipline of formal logic. Formal logic is the study of formal (i.e., syntactically specified) languages, and of various philosophically and mathematically significant properties and relations definable in terms of such languages. Of particular significance for the study of logical consequence are two kinds of relations definable on the formulas of a formal language, the relations of proof-theoretic consequence and of model-theoretic consequence.

Given a formal language, a relation of proof-theoretic consequence is defined via the rigid specification of those sequences of formulas that are to count as proofs. Typically, the specification is given by designating specific formulas as axioms, and designating some rules of inference by means of which formulas are provable one from another. Both axioms and rules of inference are specified entirely syntactically. A proof is then a series of formulas each of which is either taken as a premise, or is an axiom, or is obtained from previous formulas in the series via a rule of inference. A formula φ is a proof-theoretic consequence of a set S of formulas if and only if there is a proof the premises of which are among the members of S, and the conclusion of which is φ.

Model-theoretic consequence, by contrast, is defined in terms of a range of interpretations (or models) of the formal language in question. The vocabulary of the language is divided into the ‘logical’ terms (typically, analogues of the English-language ‘and,’ ‘or,’ ‘not,’ ‘if. . .then,’ and ‘for all’), the meaning of which is taken as unchanging, and the ‘non-logical’ terms (typically analogues of natural-language predicates and singular terms); an interpretation is an assignment of objects and sets of objects to the non-logical terms. In the standard case, the formulas are taken to have a truth-value (i.e., to be either true or false) on each such interpretation. A formula φ is then a model-theoretic consequence of a set S of formulas if and only if there is no interpretation on which each member of S is true while φ is false.

The connection between these defined relations and logical consequence arises when the formulas in question are taken to stand as representatives of natural-language sentences or the claims they express. Given such a representation-relationship, the relations of proof-theoretic and of model-theoretic consequence are typically designed so as to mirror, to some extent, the relation of logical consequence. Thus, the idea behind a standard design of a relation of proof-theoretic consequence is that it count as axioms only those formulas representing ‘logical truths’ (e.g., ‘Either 5 is even or 5 is not even’), and that its rules of inference similarly mirror logical principles. In such a case, a formula φ will be a proof-theoretic consequence of a set S of formulas only if the kind of ordinary sentence represented by φ is indeed a logical consequence of the ordinary sentences represented by the members of S. This does not ensure that the relation of proof-theoretic consequence exhausts the relation of logical consequence, for two reasons: first, the formal language in question may not contain representatives of all ordinary sentences; second, the proof system may not be rich enough to reflect all of the instances of logical consequence amongst even those ordinary sentences that are represented in the language. The system of proof-theoretic consequence will, however, have the virtue of being well defined and tractable.

Similar remarks apply to the relation of model-theoretic consequence: in a well-designed formal language, the relation of model-theoretic consequence will mirror, in important ways, the relation of logical consequence. The intention in designing such a system is, typically, that φ will be a model-theoretic consequence of S if and only if the kind of ordinary sentence represented by φ is a logical consequence of those represented by the members of S.

Given a particular language together with its proof-theoretic and model-theoretic consequence relations, the question arises whether those relations are coextensive: whether, that is, φ is a proof-theoretic consequence of S if and only if φ is a model-theoretic consequence of S. In some cases, the answer is ‘yes,’ and, in some, ‘no.’ Each direction of the inclusion is a separate, significant issue: when every proof-theoretic consequence of each set of formulas is also a model-theoretic consequence of that set, the system is said to be sound, and when every model-theoretic consequence of each set of formulas is also a proof-theoretic consequence of that set, the system is said to be complete.
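The two defined relations, and the gap that can open between them, can be made concrete for a small propositional language. The sketch below rests on several assumptions that are not in the article: formulas are encoded as nested tuples, interpretations are truth-value assignments to atoms, and the ‘proof system’ is a deliberately weak toy with no axioms and modus ponens as its only rule.

```python
from itertools import product

# Assumed encoding: an atom is a string; compound formulas are tuples
# ("not", A) or ("imp", A, B).


def atoms(f):
    """Collect the atoms occurring in formula f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(part) for part in f[1:]))


def true_on(f, v):
    """Truth value of f on interpretation v (a dict from atoms to booleans)."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not true_on(f[1], v)
    if f[0] == "imp":
        return (not true_on(f[1], v)) or true_on(f[2], v)
    raise ValueError(f"unknown connective: {f[0]}")


def model_consequence(premises, conclusion):
    """True if no interpretation makes every premise true and the conclusion false."""
    names = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for values in product([False, True], repeat=len(names)):
        v = dict(zip(names, values))
        if all(true_on(p, v) for p in premises) and not true_on(conclusion, v):
            return False
    return True


def is_proof(lines, premises):
    """Each line must be a premise or follow from two earlier lines by modus ponens."""
    for i, f in enumerate(lines):
        earlier = lines[:i]
        by_modus_ponens = any(("imp", a, f) in earlier for a in earlier)
        if f not in premises and not by_modus_ponens:
            return False
    return True


s1 = [("imp", "p", "q"), "p"]
print(is_proof([("imp", "p", "q"), "p", "q"], s1))  # True
print(model_consequence(s1, "q"))                   # True

s2 = [("imp", "p", "q"), ("not", "q")]
print(model_consequence(s2, ("not", "p")))          # True
print(is_proof(s2 + [("not", "p")], s2))            # False
```

The toy system is sound, since modus ponens never leads from truths to a falsehood, but it is not complete: ‘not p’ is a model-theoretic consequence of the second premise set, yet modus ponens alone yields nothing new from those premises.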
The soundness of a system is typically a straightforward matter, following immediately from the design of the proof-theoretic system; completeness is typically a considerably more significant issue. The most important system of logic, that of classical first-order logic, was proven by Kurt Gödel in 1930 to be complete; this is the celebrated ‘completeness theorem for first-order logic.’ First-order logic is, in various ways, the ‘strongest’ complete system (see Enderton, 1972).

Formal systems, i.e., formal languages together with proof-theoretic or model-theoretic consequence relations, differ from each other in a number of ways. Most important for the purposes of the study of logical consequence are the following two differences: (1) Proof-theoretic relations differ over the axioms and rules of inference they include, and hence over the instances of logical consequence that they represent. Some such differences arise just because the languages of some such systems are expressively weaker than others, so that principles contained in one simply cannot be expressed in the other. More interesting are differences motivated by differing views of logical consequence itself. Thus, for example, ‘classical’ logic differs from intuitionist logic in including the principle of excluded middle, the principle guaranteeing the truth of all statements of the form p-or-not-p. As the proponent of intuitionist logic sees it, this principle is not universally accurate, and hence should not be included in a system of logic. (2) Model-theoretic relations differ in a number of small ways, including the specifics of the definition of interpretation, and of the definition of truth-on-an-interpretation. More important, the model-theoretic consequence relations for different systems differ when the formal languages in question are importantly structurally different. Thus, for example, standard second-order logic has a richer model-theoretic consequence relation than does first-order logic, and there are natural-language arguments whose second-order representation yields a conclusion that is a model-theoretic consequence of its premises, but whose first-order representation does not (see van Dalen, 2001; Shapiro, 1991).

The question of the extent to which each such system gives an accurate characterization of logical consequence is of central philosophical concern. With respect to the relations of proof-theoretic consequence, debate turns on the accuracy of specific axioms and rules of inference. With respect to relations of model-theoretic consequence, the significant debate is rather over the question of the extent to which model-theoretic consequence relations in general (or, perhaps, that relation as applied to classical first-order logic) offer an analysis of the ordinary, non-formal relation of logical consequence. If logical consequence is in some sense ‘essentially’ the relation of truth-preservation across interpretations, then model-theoretic consequence has a privileged position as simply a tidied-up version of the core relation of logical consequence. If, by contrast, the relation of truth-preservation across interpretations is simply another sometimes-accurate, sometimes-inaccurate means of representing the extension of the relation of logical consequence, then model-theoretic consequence has no immediate claim to accuracy (see Etchemendy, 1990).

General Philosophical Concerns

In addition to questions surrounding its appropriate formal representation, the investigation of logical consequence includes questions concerning the nature of the relation itself.
One important cluster of such questions concerns the relata of the relation. Here we want to know whether the items between which logical consequence holds are, say, the sentences of ordinary language, or the non-linguistic propositions expressed by such sentences, or something else altogether. Although logical consequence is perhaps most straightforwardly viewed as a relation between sentences, one reason to reject this idea is that sentences, at least when thought of as syntactic entities (strings of letters and spaces), seem the wrong kinds of things to bear that relation to one another. Because any given sentence so understood could, under different circumstances, have had a quite different meaning, and would thereby have borne different logical relationships to other sentences, it is arguable that the sentence itself is not the primary bearer of this relation but is, rather, just a means of expression of the primary bearer. This line of reasoning motivates the view of non-linguistic propositions, the kinds of things expressed by (utterances of) fully interpreted sentences, as the relata of logical consequence. The central reason for rejecting this proposal, though, is skepticism about the existence of such things as nonlinguistic propositions. A third option is to take the relata of the logical consequence relation to be sentences-in-use, essentially pairs of sentences and meaning-conferring practices (see Cartwright, 1987; Strawson, 1957; Quine, 1970). The second, related collection of questions concerning logical consequence arises from the inquiry into what makes one thing a logical consequence of others. Here, we are looking for an explanation or an analysis of logical consequence in terms of other, more well-understood notions. One potential answer is that logical consequence is to be explained in terms of the meanings of various specific parts of our vocabulary, specifically in terms of the meanings of the ‘logical’ words and phrases (see above). 
A second, not necessarily competing, account is that logical consequence is due to the form, or overall grammatical structure, of the sentences and arguments in question. A third type of answer, mentioned above, is that logical consequence is best explained in terms of model-theoretic consequence. Various of the accounts of logical consequence have been criticized on grounds of circularity: to say that φ’s being a logical consequence of S is due to some other relation between φ and S is, arguably, to say that the claim that φ is a logical consequence of S is itself a logical consequence of the purported explanans. If this charge of circularity is accurate, it is arguable that all such explanations of the nature of logical consequence will be found to be circular, with the result that this relation must be taken to be ‘primitive,’ not capable of reduction to anything else. Part of the debate here will turn on what one takes the nature of explanation to be, and on whether explanation requires reduction (see Quine, 1936).

In short: although it generally is agreed that some claims are logical consequences of others, there is scope for important disagreement about (a) which specific claims are in fact logical consequences of which others, (b) how to construe the notion of ‘claim’ involved here, and (c) how to give a correct account of the nature of the relation of logical consequence. Because of the connections between these issues and general positions in the philosophy of logic, philosophy of mathematics, and philosophy of language, one’s preferred answers to the questions noted here will turn in large part on one’s position with respect to a host of surrounding topics.

See also: Inference: Abduction, Induction, Deduction; Logic and Language: Philosophical Aspects; Logical Form in Linguistics; Propositions; Propositional and Predicate Logic: Linguistic Aspects.

Bibliography

Blanchette P A (2000). ‘Models and modality.’ Synthese 124(1), 45–72.
Blanchette P A (2001). ‘Logical consequence.’ In Goble L (ed.) The Blackwell guide to philosophical logic. Malden, MA/Oxford: Blackwell Publishers. 115–135.
Cartwright R (1987). ‘Propositions.’ In Butler R J (ed.) Analytical philosophy, 1st series. Oxford: Blackwell. Reprinted in Cartwright R. Philosophical essays. Cambridge and London: MIT Press, 1987. 33–53.
Enderton H (1972). A mathematical introduction to logic. Orlando, FL: Academic Press.
Etchemendy J (1990). The concept of logical consequence. Cambridge, MA: Harvard University Press. Reprinted 1999, Stanford: CSLI Publications.
Goble L (ed.) (2001). The Blackwell guide to philosophical logic. Malden, MA/Oxford: Blackwell Publishers.
Quine W V O (1936). ‘Truth by convention.’ In Lee O H (ed.) Philosophical essays for A. N. Whitehead. New York: Longmans. Reprinted in Quine W V O. The ways of paradox and other essays. Cambridge, MA/London: Harvard University Press, 1976. 77–106.
Quine W V O (1970). Philosophy of logic. Englewood, NJ: Prentice Hall.
Shapiro S (1991). Foundations without foundationalism: a case for second-order logic. Oxford: Oxford University Press.
Strawson P F (1957). ‘Propositions, concepts, and logical truths,’ Philosophical Quarterly 7. Reprinted in Strawson P F. Logico-Linguistic Papers. London: Methuen & Co. 1971. 116–129.
Tarski A (1936). ‘On the concept of logical consequence,’ translation of ‘O pojęciu wynikania logicznego.’ In Przegląd Filozoficzny, 39, 58–68. English translation in Logic, semantics, metamathematics (2nd edn.). Woodger J H (trans.) & Corcoran J (ed.). Indianapolis: Hackett Publishing Company, 1983. 409–420.
van Dalen D (2001). ‘Intuitionistic logic.’ In Goble L (ed.) The Blackwell guide to philosophical logic. Malden, MA/Oxford: Blackwell Publishers. 224–257.

Logical Form in Linguistics D Blair, University of Western Ontario, Canada © 2006 Elsevier Ltd. All rights reserved.

To describe the logical form of some claim is to describe its logically significant properties and structure, showing its connection to other claims via what it entails and what entails it. Given the variety of claims that philosophers have taken an interest in, it is not surprising that there are a large number of theories of logical form. But even if there is no shortage of theories aiming at the logical form of, e.g., propositional attitude sentences or counterfactual conditionals, surprisingly little attention has been given to the prior question of what logical form is to begin with. Just as importantly, it is not clear what it is that is supposed to have a logical form in the first instance. Is it, for example, a linguistic object like a sentence, or the utterance of a sentence, or something different from both of these, such as the proposition expressed by an utterance of a sentence?

The presence of logic within the notion of logical form may make one suspicious of paying too much attention to the details of natural language. Other kinds of items seem better suited to having logical forms. For example, propositions have whatever truth conditions they have essentially, whereas sentences do not: ‘snow is white’ might have meant that most colorless beverages lack sodium. Further, it is a notorious fact about natural language that it contains a good deal of vagueness and context sensitivity that is hard to capture within a theory of inference. Facts like these have made philosophers wary of placing too much emphasis on natural language sentences. At the very least, one would want to purge natural language of its logically problematic features before building upon it a theory of logical form. This was precisely the reaction of Frege (1952) and Russell (1919) to the defects of natural language. For them, one needed to formulate an ideal language free from the flaws of natural language in order to spell out the content of various claims. Only then could one think about constructing theories of logical form. Frege’s Begriffsschrift formulated an ideal language in which to conduct arithmetic and overcame some of the difficulties of explaining inferences involving multiple quantifiers that beset earlier logical theories.

But even if having a logically perspicuous representation of the propositional content of an assertion makes it easier to assess how well a theory accords with what is said about, e.g., the good or the propositional attitudes, there are serious questions concerning how such representations are related to the grammatical properties of a sentence. In the hands of Frege and Russell, one simply translated, as best one could, from natural language into an ideal language. These languages were specifically designed to expedite inference, and so no question arises about their logical forms. But until the last few decades, the kinds of structures required for the purposes of detailing the inferential properties of natural language sentences were thought to be quite remote from anything one might call ‘the grammar’ of a language. Indeed, one way of motivating talk of logical form was by showing the deficiencies of theories of meaning built upon generalizations of apparent grammatical form and function.

A number of developments in the 1960s and 1970s changed this picture. A growing number of philosophers became intrigued with the idea of constructing theories of meaning for natural languages directly. The idea that such a theory could be constructed systematically stems in large part from the work of Noam Chomsky in the 1950s and 1960s, showing how rigorous theories of grammatical structure were possible. In light of the success of Chomsky’s program, it was natural to wonder whether a semantic theory along the lines of his work in syntax could be constructed.

The classic picture of the grammatical structure of a sentence involves a series of levels of representation, the best-known version of which is the so-called ‘T-model.’ In this model, there are four ‘levels of representation’: D-structure, S-structure, LF, and then PF, or the phonological form of a sentence. Since the last item is a representation of a sentence’s phonological properties, I leave it aside. Each level is related to the one before via the application of a rule or set of rules. The conception of rules has changed over the years, but the underlying idea is that the syntactic structure of a sentence is built up, step by step, through a series of representations, each

having its own properties. Diagrammatically, what we have is the following:

The ‘S-structure’ or surface structure of a sentence is what corresponds, nearly enough, to the order of expressions as heard or written. ‘LF’ or logical form is a syntactic representation that is derived from the S-structure via a set of transformations, just as S-structures were derived from D-structures via transformations. Since only one level of representation seems to correspond to the overt form of a sentence, it follows that a good deal of syntactic structure remains hidden. The idea that unpronounced structure can be given a grammatical motivation is compelling. Consider the following pair of sentences:
(1) John kissed Mary
(2) Who did John kiss

The leftmost WH-phrase in (2) is intuitively related to the position of ‘Mary’ in (1). The grammar of English disguises this fact by requiring that unstressed WH-phrases in sentences like (2) be fronted. Were English different in this regard, the parallel would be more obvious. Interestingly, a good many languages allow for just this possibility while others require all WH-phrases to be placed at the left periphery of a sentence. A more perspicuous representation of English would abstract from these kinds of provincial eccentricities of surface form and expose, via a logically perspicuous notation, just these parallels. There is evidence that the grammatical structure of sentences like these in different languages is abstractly identical, i.e., that all WH-phrases are located at the edge of a clause at some level of representation. In some languages, like Russian, this is overtly true, even when there are several WH-phrases in the clause. In other cases, like Chinese, there is little or no movement to the edge of the clausal periphery (see Huang, 1982). The difference between the overt forms of WH-questions then doesn’t disguise just the logical or semantic structure of a sentence; it hides the grammatical structure as well. A more articulated version of (2) shows this abstract structure: (3)

The key idea is that movement of a WH-phrase may occur overtly, as in English, or ‘covertly,’ as in some cases of French. When the WH-phrase does move, however, what we end up with is (3). The movement of the WH-phrase to its position at the left edge of the clause leaves a record in the form of a ‘trace,’ notated above as ‘t.’ Structures like (3) resemble, in a rather striking way, the kinds of representations that one finds within first-order logic, in particular with respect to the relationship between a quantificational expression and a variable that it binds. Let’s look at this in more detail. It is now commonplace to use examples of scope ambiguities as evidence for the ambiguity of sentences, one to be sorted out in a semantic theory. Thus, a sentence like (4) is ambiguous depending upon whether or not one takes the quantificational phrase ‘every boy’ to have scope over the subject quantificational phrase ‘some girl’ or vice versa, i.e., (5a/b):

(4) Some girl danced with every boy
(5a) ∃x: girl(x) [∀y: boy(y) [danced(x,y)]]
(5b) ∀y: boy(y) [∃x: girl(x) [danced(x,y)]]
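That the two formulas in (5a) and (5b) really are distinct readings can be checked against a small model. The following is a minimal sketch, assuming a made-up domain of two girls and two boys; the names, the `danced` relation, and the function names are illustrative and not part of the original discussion.

```python
# Hypothetical toy model: two girls, two boys, and a "danced with" relation
# given as (girl, boy) pairs. Each boy danced with a different girl.
girls = {"Ann", "Bea"}
boys = {"Carl", "Dan"}
danced = {("Ann", "Carl"), ("Bea", "Dan")}


def reading_5a(girls, boys, danced):
    """(5a): there is a girl x such that every boy y danced with x."""
    return any(all((x, y) in danced for y in boys) for x in girls)


def reading_5b(girls, boys, danced):
    """(5b): for every boy y there is a girl x such that x danced with y."""
    return all(any((x, y) in danced for x in girls) for y in boys)


print(reading_5a(girls, boys, danced))  # False: no single girl danced with both boys
print(reading_5b(girls, boys, danced))  # True: each boy has some dance partner
```

In this model (5b) is true while (5a) is false, so the sentence in (4) has different truth conditions depending on which quantifier takes wider scope. Any model that makes (5a) true also makes (5b) true, but not conversely.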

The usual way of describing this difference is to say that in (5a), ‘some girl’ has scope over ‘every boy,’ while in (5b), the opposite relation holds. The scope of a quantifier is determined by looking at the material appearing to its right, i.e., the closest formula that does not contain the expression within the first-order translation. It turns out that one can define the relevant relation in syntactic terms as well, using the properties of phrase structure. To see this, consider the core syntactic relation of c-command. An expression a c-commands an expression b if and only if the first branching node dominating a dominates b and neither a nor b dominates the other.

What is important is that one can use this definition to say something about quantificational scope. Suppose we take quantificational expressions to move to positions from which they c-command their original position:

In this case, XP c-commands ZP and everything that is contained in the latter, including the trace of XP. Strikingly, when we look at what the structure of (4) is when this structure is explicit, we see the kind of structure required for the definition of scope:

(6) [S [QP Some girl]2 [S [QP Every boy]1 [S t2 [VP danced t1 ]]]]


For the reading of (4) where the scopes of the quantificational NPs are inverted relative to their surface order, ‘every boy’ is adjoined to a position from which it c-commands both ZP and the position to which ‘some girl’ has been adjoined:

(7) [S [QP Every boy]1 [S [QP Some girl]2 [S t2 [VP danced t1 ]]]]
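The definition of c-command and its application to the bracketed structures in (6) and (7) can be sketched in code. This is an illustrative sketch only: the `Node` class, the helper `build_lf`, and the node labels are assumptions introduced here, and the trees follow the bracketing shown above rather than any particular syntactic framework's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by label
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        # Record the parent link for every child passed in.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b, i.e. a is an ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """a c-commands b iff neither dominates the other and the first
    branching node dominating a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def build_lf(outer: str, inner: str) -> dict:
    """Build [S outer [S inner [S t2 [VP danced t1]]]] and return its nodes."""
    t1, t2 = Node("t1"), Node("t2")
    core = Node("S", [t2, Node("VP", [Node("danced"), t1])])
    outer_qp, inner_qp = Node(outer), Node(inner)
    root = Node("S", [outer_qp, Node("S", [inner_qp, core])])
    return {"root": root, outer: outer_qp, inner: inner_qp, "t1": t1, "t2": t2}


lf6 = build_lf("some girl", "every boy")   # structure (6)
lf7 = build_lf("every boy", "some girl")   # structure (7)

print(c_commands(lf6["some girl"], lf6["every boy"]))  # True
print(c_commands(lf6["every boy"], lf6["some girl"]))  # False
print(c_commands(lf7["every boy"], lf7["some girl"]))  # True
print(c_commands(lf7["every boy"], lf7["t1"]))         # True
```

In each structure the outer adjoined phrase c-commands the inner one but not vice versa, and each moved phrase c-commands its own trace, which mirrors the wide-scope and narrow-scope relations of (5a) and (5b).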

Both of these movements can be given more detailed defense; see May (1977). The structures that seem to be needed for semantics and that philosophers have thought were disguised by ordinary grammar really are hidden, although not quite in the way they thought. What is hidden is more syntactic structure. Of course, ‘LF’ is a syntactic level of representation and is not a semantic representation. This is not to suggest, however, that no gain has been made within theorizing about natural language by incorporating the LF hypothesis. For one could hold that the grammatical structures that are interpreted by the semantic theory are just those provided by a theory of grammar incorporating the LF hypothesis. There is no need to first regiment the formal structures of sentences into something to which semantic rules could then apply. What one finds in the idea of LF is the idea that natural languages already have enough structure to supply a lot of what is needed for the purposes of semantics. Further developments within syntactic theory have made the concept of logical form more prominent. Thus, Chomsky (1995) and others have proposed that the only level of grammatical representation is LF, although the role of LF is likely to change, just as it has in the past (see, e.g., Lasnik, 2001). Even so, it is apparent that progress has been made in joining together two bodies of thinking about language, one rooted in traditional philosophical problems about the representation of logic and inference and the other in more recent developments coming from linguistics. There are limits, however, to how much philosophical work a linguistic-based approach to logical form can do. Recall that one of the problems that has made many philosophers wary of paying too much attention to natural language concerned such things as the context sensitivity of certain aspects of natural language sentences. 
It is an open question just how to treat different kinds of context sensitivity within natural language, and whether revisions are needed to our conception of logical form in natural language in order to accommodate it. It is also true that a good number of philosophical projects targeting logical form are concerned with the conceptual analysis of certain notions, e.g., moral goodness, knowledge, etc. Indeed, one of the traditional roles of logical form within philosophy is to serve as scaffolding for just these sorts of projects. Doubts about the viability of conceptual analysis to one side, this is what has given weight to the claim that ‘ordinary language’ disguises the logically significant structure of our concepts. But if this is the role that logical form must play if it is to have a role within philosophy, then it is unclear whether the linguistic conception of logical form can wholly supplant the traditional view. The linguistic conception of logical form seemingly has little to do with conceptual analysis. And unless conceptual analysis takes the form of a grammatical analysis, it is unlikely that one can substitute grammatical analysis for the description of the logically significant aspects of our concepts. This is not to deny that a linguistics-based conception of logical form is an important, maybe even essential, part of understanding how to think about some aspects of logic and meaning. This is particularly clear with respect to the study of quantification. But there are many questions about the nature of logical form that need to be resolved before any particular view can be judged to be the most viable.

See also: Chomsky, Noam (b. 1928); Frege, Gottlob (1848–1925); Inference: Abduction, Induction, Deduction; Interpreted Logical Forms; Propositions; Quantifiers: Semantics.

Bibliography

Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky N (1977). ‘On WH movement.’ In Culicover P, Wasow T & Akmajian A (eds.) Readings in English transformational grammar. Waltham, MA: Ginn. 184–221.
Chomsky N (1995). The minimalist program. Cambridge, MA: MIT Press.
Davidson D (1967). ‘Truth and Meaning.’ Synthese 17, 304–323.
Frege G (1952). Translations from the philosophical writings of Gottlob Frege. Oxford: Blackwell.
Higginbotham J (1993). ‘Logical form and grammatical form.’ Philosophical Perspectives 7, 173–196.
Huang C T J (1982). ‘Move WH in a language without WH movement.’ Linguistic Review 1, 369–416.
Lasnik H (2001). ‘Derivation and representation in generative grammar.’ In Baltin M & Collins C (eds.) Handbook of contemporary syntactic theory. Oxford: Blackwell. 62–88.
Lepore E & Ludwig K (2002). ‘What is logical form?’ In Preyer G & Peters G (eds.). 54–90.
Ludlow P (2002). ‘LF and natural logic.’ In Preyer G & Peters G (eds.). 132–168.
May R (1977). ‘The grammar of quantification.’ Ph.D. diss., MIT.
Neale S (1994). ‘Logical Form and LF.’ In Otero C (ed.) Noam Chomsky: critical assessments. London: Routledge. 788–838.
Preyer G & Peters G (eds.) (2002). Logical form and language. Oxford: Oxford University Press.
Russell B (1919). Introduction to mathematical philosophy. London: George Allen and Unwin.
Williams E (1983). ‘Syntactic and Semantic Categories.’ Linguistics and Philosophy 6, 423–446.

Logophoric Pronouns
M von Roncador, Bayreuth University, Bayreuth, Germany
© 2006 Elsevier Ltd. All rights reserved.

Reported discourse can be seen as a procedure of integrating a secondary text into a primary text. Traditionally, two integration strategies are distinguished: direct discourse, reproducing or pretending to reproduce another text without adjusting its orientation to the primary text and speaker (John said, “You can’t stop me”), and indirect discourse, in which the secondary text is maximally adjusted to the primary speaker’s orientation (John affirmed that I/you/she/he could not stop him). It should be noted that the reference of the subject of the secondary clause, the reported addressee, cannot be determined in the direct discourse example, whereas in the indirect discourse example, the third-person object pronoun may or may not corefer with John. Over the last decades, attention has shifted to intermediate strategies of adjustment within reported discourse (cf. Roncador, 1988), one of which is the phenomenon of logophoricity, so named by Hagège (1974). Cross-linguistically, logophoric marking may be combined with direct discourse, as well as with indirect discourse orientation (see below). A logophoric marking device or logophor in the secondary text refers obligatorily to the reported speaker defined in the primary text. Instances with logophoric marking devices can thus be unambiguous where the English translation into indirect discourse is not:

Lobiri (Central Gur)
(1a) ?á_i sóré !?VB-!ná_i ?ínEB
     he say LOG-FUT come
     ‘he_i said he_i would come’
(1b) ?á_i sóré !?á-ná_j !?ínEB
     he say 3SG-FUT come
     ‘he_i said he_j would come’

Major questions concerning logophoricity are: (1) How is logophoricity expressed and what are its properties? (2) What is the status of logophorically marked reported discourse? and (3) What is the origin of logophoric marking and why are its most typical occurrences found in the savannah belt south of the Sahara?

Forms and Properties of Logophoric Marking

Logophoricity is typically expressed by a special pronominal form marking coreference with a subject of a speech act verb as in (1a), whereas the use of the ‘normal’ third person singular subject form (1b) will indicate disjoint reference (that someone else will come). There are exceptions in that logophoric marking may be expressed by a verbal suffix, possibly derived from a pronominal form, as is the case in Gokana (Hyman and Comrie, 1981); Curnow (2002) counts this as one case of his category of verbal logophoricity. Typological properties of logophoric marking have been formulated by, amongst others, Hyman and Comrie (1981), Roncador (1992), and Culy (1994). The following conventions and abbreviations will be used here: T0 will denote the primary text and S0 the primary speaker; t1 the secondary text, S1 the trigger of logophoric marking (in the most typical cases the reported speaker); LOG the logophorically marked participant in t1. The setting permitting logophoric marking (in languages with logophors) is called ‘logophoric context’.

Binding

Logophors in t1 obligatorily express coreference with a participant of the primary text T0. There exist different interpretations concerning the properties of ‘normal’ anaphoric devices in t1. In most cases the absence of logophoric marking in logophoric contexts will express disjoint reference to S1. Some authors claim that coreference of the ‘normal’ pronoun to S1 may be possible, but with the sense that it is the point of view of S0 which is expressed in t1, whereas the presence of the LOG indicates the point of view of S1 (for discussion, see Roncador 1988: 289–295).
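The basic binding pattern just described can be sketched as a small function. This is a deliberately simplified illustration, assuming only the majority pattern (a logophor corefers with the trigger, an unmarked third-person form is disjoint from it); the function name, the form labels `"LOG"` and `"3SG"`, and the participant names are hypothetical, and the minority view that a plain pronoun may corefer with the trigger under a shift in point of view is not modelled.

```python
from typing import Set


def possible_referents(form: str, trigger: str, participants: Set[str]) -> Set[str]:
    """Return the referents available to a third-person form in the secondary text.

    form:         "LOG" for a logophor, "3SG" for the unmarked pronoun (labels assumed here)
    trigger:      the trigger of logophoric marking, typically the reported speaker
    participants: all discourse participants that a third-person form could pick out
    """
    if form == "LOG":
        # A logophor obligatorily corefers with the trigger.
        return {trigger}
    if form == "3SG":
        # In most cases the unmarked pronoun signals disjoint reference.
        return set(participants) - {trigger}
    raise ValueError(f"unknown form: {form!r}")


people = {"John", "Bill"}
print(possible_referents("LOG", "John", people))  # {'John'}
print(possible_referents("3SG", "John", people))  # {'Bill'}
```

This corresponds to the contrast between (1a) and (1b): the logophoric form leaves only the reported speaker as a referent, while the ordinary pronoun excludes that speaker.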


It has been noted frequently that in most languages with logophoric marking, the binding domain cannot be defined syntactically (see Roncador, 2002 for textually bound logophors). Nevertheless, in some languages logophors may be syntactically bound (see Koopman and Sportiche, 1989 for a generative approach). In any case, logophoric marking will take place outside of S0, thus distinguishing it from long-distance anaphora and reflexives.

Properties of T0 and S1

T0 may introduce words, ideas, or emotional attitudes; see Culy (1994, 2002) for a scalar ranking of predicates introducing (triggering) LOG, including verbs of saying at the top and predicates expressing experiences or finality at the bottom. If there is a reported speaker in T0, then he will be the source S1; otherwise another subject of consciousness may also be S1. This correlates with well-known implicata in that the agent in T0 is more likely to trigger LOG than the experiencer (and consequently the patient), and a subject is more likely to trigger LOG than an (indirect) object, etc. For a number of Chadic languages, Frajzyngier (1985) reported the marking not only of the reported speaker in t1 but also of the reported addressee by a special form, sometimes called ‘addressee pronoun/logophor’. S1 is also subject to a hierarchy of person and number in that a third person is more likely to trigger logophoric marking than a second and a first person, and singular more likely than plural. In fact, different logophoric marking for all persons seems not to be attested, but syncretism of person is reported, e.g., in Gokana (Hyman and Comrie, 1981), though it is unusual for a first person S1. Quite frequently, there is syncretism of third and second person in logophoric contexts:

FOn (Kwa)
(2a) Kofi BO émí gba xwe émí-tcD Bayi
     Kofi say:that LOG build house LOG:POSS T/A
     ‘Kofi_i said that he_i had built his_i house’
(2b) à BO émí gba xwe émí-tcD Bayi
     2S say:that LOG build house LOG:POSS T/A
     ‘You said that you had built your house’

It is worth noting that in FOn, as in other languages with this sort of syncretism, second and third person marking are distinct outside of logophoric contexts (for discussion see Roncador, 1992). Concerning plurals, it is often not the identity of plural referents which is at issue, but a relation of inclusion in that a singular S1 may trigger a plural LOG (see Clements, 1975 for examples in Ewe). The reverse does not seem to exist.

The Status of Logophoric Constructions within Reported Discourse

By the choice of the logophoric form as such, a logophoric construction will be distinct from direct discourse. On the other hand, as Hagège (1974) and Clements (1975) noted, logophoric constructions in many languages permit elements otherwise reserved to direct discourse, such as exclamatives, vocatives, and second and even first person elements with direct discourse orientation. It has been proposed (Roncador, 1988) to treat logophors as devices to introduce a secondary subject of consciousness as in free indirect discourse (discours indirect libre), also called ‘represented speech and thought,’ the difference being that the subject of consciousness in logophoric constructions is not just reflecting but also communicating, as shown by the occurrence of vocatives or second persons:

Mundang (Adamawa) (Hagège, 1974)
(3) à fá mò ?Ī ZÌ nē˜
    3S say 2S see LOG PART
    ‘He asked (him): “Did you see me?”’

The Origin of Logophors

In cases where the logophoric device also appears in other contexts, such functions are usually more marked than the ‘normal’ elements expressing disjoint reference in logophoric contexts. Emphatics, nonsubject pronouns, possessive reflexives, etc., may be used as logophoric devices in appropriate contexts (for discussion, including counterexamples, see Roncador, 1992). The examples in (4) illustrate the possessive reflexive (4a) contrasting with the regular third person possessive (4b). In (4c), the possessive form may be interpreted as coreferring to the reported speaker (as a logophor) or to the subject of the clause (as reflexive possessive).

Bwamu (Central Gur):
(4a) à wEE mí múú
     3Sg:Sbj like Poss-Refl field
     ‘He likes his (own) field’
(4b) à wEE pēè múú
     3Sg:Sbj like 3Sg:Poss field
     ‘He_i likes his_j field’
(4c) yàǹ-zOǹ nEE mí híń˜ tOn̄ mí zūnù˜
     Yanzon say LOG TAM build LOG/Poss-Refl house
     ‘Yanzon_i said that he_i had built his_i house.’

Several authors have equated logophors with long-distance reflexives (Sells, 1987; Koster and Reuland, 1991; Stirling, 1993; Huang, 2002). Although there is some functional overlap (see Suzuki, 2002 for the Japanese reflexive zibun, which may be used in modern literature to represent the subject of consciousness in free indirect discourse), the main difference is that long-distance reflexives may also occur bound within a clause while logophors may not. In many cases (e.g., in Latin), long-distance reflexives have a wider function than rendering speech or thought. As Comrie (2004) noted, the property that the marked value indicates coreference finds its parallels in other reference-tracking devices such as ‘pro-drop,’ long-distance reflexives, etc. He acknowledged nevertheless that logophoricity (proper) is a unique areal characteristic of African languages not found elsewhere. Its distribution in an area north of the rain forest from the Gulf of Guinea to the Nile valley has been discussed by, amongst others, Roncador (1992), Dimmendaal (2001), and Güldemann (2003); the latter proposed the name ‘Macro-Sudan’ for this language area. While Roncador suggested different origins of logophoric devices through function borrowing, Dimmendaal took the occurrence of similar forms in Niger-Congo and Nilo-Saharan (see Boyeldieu, 2004 for details in a Nilo-Saharan subgroup) as proof for the Congo-Saharan hypothesis that the two stocks had a common origin. Güldemann examined three possibilities for the development of logophors: internal development, genealogical inheritance, and language contact. He suggested a connection between logophoricity and the frequency of nondirect speech in a given language. Ameka (2004) has argued for a connection between logophoricity and evidential marking on the one hand and the practice of triadic communication in African societies on the other. In the communication of a chief via a spokesperson to his audience, the intermediary will usually not use direct speech to render the words of the chief; it may be felt as presumptuous to assume the same orientation. On the other hand, the use of the ‘regular’ third person pronoun would put the chief on the same ‘level’ as other third persons. This scenario might find some corroboration in the fact that in some languages, such as Bijogo (Bidyogo) (G. Segerer, personal communication), the logophoric form might be traced to a noun denoting ‘owner, master’.

See also: Anaphora, Cataphora, Exophora, Logophoricity; Reported Speech: Pragmatic Aspects; Switch Reference.

Bibliography

Ameka F K (2004). ‘Grammar and cultural practices: the grammaticalization of triadic communication in West African languages.’ Journal of West African Languages 30, 5–28.
Boyeldieu P (2004). ‘Les pronoms logophoriques dans les langues d’Afrique centrale.’ In Ibriszimov D & Segerer G (eds.) Systèmes de marques personnelles en Afrique. Louvain, Belgium: Peeters. 11–22.
Clements G N (1975). ‘The logophoric pronoun in Ewe: its role in discourse.’ Journal of West African Languages 10, 141–177.
Comrie B (2004). ‘West African logophorics and the typology of reference-tracking.’ Journal of West African Languages 30, 41–52.
Culy C (1994). ‘Aspects of logophoric marking.’ Linguistics 32, 1055–1094.
Culy C (2002). ‘The logophoric hierarchy and variation in Dogon.’ In Güldemann & Roncador (eds.). 201–211.
Curnow T J (2002). ‘Three types of verbal logophoricity in African languages.’ Studies in African Linguistics 31, 1–25.
Dimmendaal G (2001). ‘Logophoric marking and represented speech in African languages as evidential hedging strategies.’ Australian Journal of Linguistics 21, 131–157.
Frajzyngier Z (1985). ‘Logophoric systems in Chadic.’ Journal of African Languages and Linguistics 7, 23–37.
Güldemann T (2003). ‘Logophoricity in Africa: an attempt to explain and evaluate the significance of its modern distribution.’ Sprachtypologie und Universalienforschung 56, 366–387.
Güldemann T & Roncador M von (eds.) (2002). Reported discourse: a meeting ground for different linguistic domains. Amsterdam: John Benjamins.
Güldemann T, Roncador M von & van der Wurff W (2002). ‘A comprehensive bibliography of reported discourse.’ In Güldemann & Roncador (eds.). 363–415.
Hagège C (1974). ‘Les pronoms logophoriques.’ Bulletin de la Société de Linguistique de Paris 69, 287–310.
Huang Y (2002). ‘Logophoric marking in East Asian languages.’ In Güldemann & Roncador (eds.). 213–226.
Hyman L M & Comrie B (1981). ‘Logophoric reference in Gokana.’ Journal of African Languages and Linguistics 3, 19–37.
Koopman H & Sportiche D (1989). ‘Pronouns, logical variables, and logophoricity in Abe.’ Linguistic Inquiry 20, 555–588.
Koster J & Reuland E (eds.) (1991). Long distance anaphora. Cambridge: Cambridge University Press.
Roncador M von (1988). Zwischen direkter und indirekter Rede: nichtwörtliche direkte Rede, erlebte Rede, logophorische Konstruktionen und Verwandtes. Tübingen: Max Niemeyer.
Roncador M von (1992). ‘Types of logophoric marking in African languages.’ Journal of African Languages and Linguistics 13, 163–182.
Roncador M von (2002). ‘Zur Bindung der Logophoren: Fallstudien aus Gursprachen.’ In Bublitz W, Roncador M von & Vater H (eds.) Typologie, Philologie und Sprachstruktur: Festschrift für Winfried Boeder. Frankfurt: Peter Lang. 171–190.
Sells P (1987). ‘Aspects of logophoricity.’ Linguistic Inquiry 18, 445–479.
Stirling L (1993). Switch-reference and discourse representation. Cambridge: Cambridge University Press.
Suzuki Y (2002). ‘The acceptance of “free indirect discourse”: a change in the representation of thought in Japanese.’ In Güldemann & Roncador (eds.). 111–122.

Lomonosov, Mikhail Vasilyevich (1711–1765)
S Archaimbault, Université Paris, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Mixail Vasilievich (Mikhail Vasilyevich) Lomonosov was born in 1711 in a small village of northern Russia (Kurostrov, Arkhangel’sk government). He was first educated by a local deacon and then, in 1730, joined the Slav-Greco-Latin Academy of Moscow, where he devoted himself to the comparative study of Slavonic and the Russian dialect. He turned out to be an outstanding student, and in 1735 he was accepted to the newly founded University of the Academy of Sciences and then sent to Marburg in 1736 to improve his knowledge of the exact sciences. There, he was taught by Christian Wolff, whose Principles of physics he later translated. He holds a prominent place in the pantheon of Russian scientists, and did so even during the Soviet period, when his modest origins helped build the myth of the self-made scientist. His complete works were edited in the 1950s, including his comments and drafts, which are of invaluable help to researchers. He was interested in every field of knowledge and worked successively as Physics Assistant and Chemistry Professor before starting an outstanding academic career, thanks to the solicitude of Ivan Shuvalov, the protégé of Empress Elisabeth Petrovna. He also wrote several books on physics, crystallography, and history, as well as important works devoted to the study of language. His Russian grammar (Rossijskaja grammatika, 1755), which was reedited several times, elaborates on the Slavonic grammar sketched in Zizanius or Smotrickij, but also describes new language facts. As the drafts show, he gathered a huge corpus, mainly lexical, and was influenced by the observation method of the natural sciences. Lomonosov’s ultimate aim was to fight the Slavonic/Russian diglossia by providing the Russian language – which he considered the first European language – with a grammar of usage, but he also wanted to describe a correct language that all social classes could share.
Consequently, he adapted the ‘three styles’ theory (high, low, medium), using a fair proportion of slavonicisms and elements of the spoken or popular language. He linked syntax and rhetoric, relating clause to period, the period being for him the syntactic unit. Therefore, his grammar and his rhetoric should be read as complementary: They serve the same purpose, namely, that the literary language not differ from the spoken language. Moreover, he developed his thought within the framework of General Grammar, shared by the European scientists of the time, which was a general philosophical conception of human language. He completed the grammatical terminology, integrating new terms; and in an important part devoted to phonetics, he strictly delimited grapheme and sound and finely described the point of articulation. He also wrote a great number of odes and tragedies, and an epistle on the usefulness of glass (Pis’mo o pol’ze stekla, 1752), where he used the ‘medium style’ to pay homage to his protector I. I. Shuvalov, who was at the time setting up a glass factory near Saint Petersburg. Even though he will not be remembered as a prominent prose writer or poet, he will remain an encyclopedic mind who devoted his life to the Promethean task of constituting a Russian science capable of rivaling European science. His name is attached to the founding of the University of Moscow (1755), where all subject matters and disciplines had to be taught in Russian. He died in 1765 in Saint Petersburg.


Lomonosov, Mikhail Vasilyevich (1711–1765) 315 Sells P (1987). ‘Aspects of logophoricity.’ Linguistic Inquiry 18, 445–479. Stirling L (1993). Switch-reference and discourse representation. Cambridge: Cambridge University Press.

Suzuki Y (2002). ‘The acceptance of ‘‘free indirect discourse’’: a change in the representation of thought in Japanese.’ In Gu¨ldemann & Roncador (eds.). 111–122.

Lomonosov, Mikhail Vasilyevich (1711–1765) S Archaimbault, Universite´ Paris, Paris, France ! 2006 Elsevier Ltd. All rights reserved.

Mixail Vasilievich (Mikhail Vasilyevich) Lomonosov was born in 1711 in a small village of northern Russia (Kurostrov, Arkhangel’sk government). He was first educated by a local deacon and then, in 1730, joined the Slav-Greco-Latin Academy of Moscow, where he devoted himself to the comparative study of Slavonic and the Russian dialect. He turned out to be an outstanding student, and in 1735 he was accepted to the newly founded University of the Academy of Sciences and then sent to Marburg in 1736 to improve his knowledge of exact sciences. There, he was taught by Christian Wolff, whose Principles of physics he translated later on. He stands in good place in the pantheon of Russian scientists, even during the Soviet period, when his modest origins helped build the myth of the self-made scientist. His complete works were edited in the 1950s, including his comments and drafts, which are of invaluable help to researchers. He was interested in every field of knowledge and worked successively as Physics Assistant and Chemistry Professor before starting an outstanding academic career, thanks to the solicitude of Ivan Shulavov, the prote´ge´ of Empress Elisabeth Petrovna. He also wrote several books on physics, crystallography, and history and important works devoted to the study of language. His Russian grammar (Rossijskaja grammatika, 1755), which was reedited several times, elaborates on the Slavonic grammar sketched in Zizanius or Smotrickij, but also describes new language facts. As the drafts show, he gathered a huge corpus, mainly lexical, and was influenced by the observation method of the natural sciences. Lomonosov’s ultimate aim was to fight the Slavonic/Russian diglossia by providing the Russian language – which he considered as the first European language – with a grammar of usage, but he also wanted to describe a correct language that all social classes could share. 
Consequently, he adapted the ‘three styles’ theory (high, low, medium), using a fair proportion of slavonicisms and elements of the spoken or popular language. He linked syntax and rhetoric, relating clause to period, the period being for him the syntactic unit. Therefore, his grammar and his rhetoric should be read as complementary: they serve the same purpose, namely, that the literary language should not differ from the spoken language. Moreover, he developed his thought within the framework of General Grammar, shared by the European scientists of the time, which was a general philosophical conception of human language. He completed the grammatical terminology, integrating new terms, and in an important part devoted to phonetics he strictly distinguished grapheme from sound and finely described the point of articulation.

He also wrote a great number of odes and tragedies, and an epistle on the usefulness of glass (Pis’mo o pol’ze stekla, 1752), in which he used the ‘medium style’ to pay homage to his protector I. I. Shuvalov, who was at the time setting up a glass factory near Saint Petersburg. Even though he will not be remembered as a prominent prose writer or poet, he will remain an encyclopedic mind who devoted his life to the Promethean task of constituting a Russian science capable of rivaling European science. His name is attached to the founding of the University of Moscow (1755), where all subject matters and disciplines had to be taught in Russian. He died in 1765 in Saint Petersburg.


Long-Distance Dependencies
Y N Falk, The Hebrew University of Jerusalem, Jerusalem, Israel
© 2006 Elsevier Ltd. All rights reserved.

Basic Properties

The family of long-distance dependency constructions has been given various names in the literature. In addition to ‘long-distance dependencies,’ these constructions have also been called ‘extraction,’ ‘unbounded dependencies,’ ‘wh dependencies’ (or ‘wh movement’), ‘Ā dependencies’ (or ‘Ā movement’), ‘syntactic binding,’ ‘operator movement,’ and ‘constituent control.’ Long-distance dependency (LDD) constructions can be split into two subgroups; the constructions in Subgroup (1) are sometimes called ‘strong’ LDD constructions, and those in Subgroup (2) are sometimes called ‘weak’ constructions.

Subgroup (1) (strong) constructions are exemplified by the following:
1a. Wh questions: Which book does the student think the teacher said the librarian put on the shelf?
1b. Exclamatives: What a book the student thinks the teacher said the librarian put on the shelf!
1c. Wh relative clauses: The book [which the student thinks the teacher said the librarian put on the shelf].
1d. Pseudoclefts: [What the student thinks the teacher said the librarian put on the shelf] is the book.
1e. Topicalization: This book, the student thinks the teacher said the librarian put on the shelf.

Subgroup (2) (weak) constructions are exemplified by the following:
2a. Non-wh relative clauses: The book [(that) the student thinks the teacher said the librarian put on the shelf].
2b. Cleft: It is the book [that the student thinks the teacher said the librarian put on the shelf].
2c. ‘Tough movement’: This book is easy [to think that the teacher said the librarian put on the shelf].
2d. Purpose infinitive: I bought the book [for the librarian to put on the shelf].

Examining an example of a strong LDD construction, such as the main clause interrogative (Subgroup (1a)), reveals the basic anatomy of these constructions. The crucial element is the fronted phrase which book. This phrase fulfills two distinct functions: it is both the focus of the main clause question and the object of the verb put. It occupies a special structural position associated with topical and focal elements (specifier of the complementizer phrase). The structural position normally held by the object of put appears to be missing. This missing position is often referred to as the ‘gap,’ and the fronted position is often referred to as the ‘filler.’ The element that appears in filler position and bears the functions associated with both filler and gap positions will be called the ‘extractee’ in this article.

LDD constructions, then, are filler–gap constructions. They are called long-distance (or unbounded) dependencies because there is no limit on the distance between the filler and the gap. This long-distance quality is one of the unusual features of this class of constructions; most syntactic constructions are bounded. The long-distance nature of these constructions has also led to the need to refer to parts of the constructions: the constituent containing the filler can be referred to as the ‘top’ of the construction, and the constituent containing the gap is the ‘bottom.’ The constituents in between are called the ‘middle’ (or ‘body’) of the construction. The top, middle, and bottom taken together constitute the ‘path.’ The extractee is said to be extracted along the path from the gap position to the filler position.

LDD constructions involve a close relation between the filler and the gap. This can be seen from the property of ‘strong connectivity’: syntactic requirements of the gap position (such as case marking) hold of the filler. This is hard to see in a language such as English, which has very little overt case marking, but can be clearly seen in the following example from Hebrew, wherein the topicalized noun phrase is marked with accusative case (in Hebrew, the prenominal particle et) because it corresponds to an object gap:

(1) Et  hasefer  haze,    hasafranit    lo  tasim    al hamadaf.
    ACC the.book the.this the.librarian not will.put on the.shelf
    ‘This book, the librarian won’t put on the shelf.’

Weak LDD constructions have essentially the same structure as strong ones have. The difference, as can be seen in the Subgroup (2) examples, is that they appear to lack a filler. Nevertheless, they display all the LDD phenomena discussed in the following section.

Long-Distance Dependency Phenomena

Islands

The most discussed phenomenon in LDD is ‘islands,’ which are structures that an LDD path cannot cross. Many of the island constraints were originally stated and named by Ross (1967). One island constraint, the coordinate structure constraint, prohibits extraction from one conjunct of a coordinate structure: (2a) The librarian [[put these books on the shelf] and [then took a break]]. (2b) *Which books did the librarian [[put on the shelf] and [then take a break]]? (3a) The librarian put [[books] and [magazines]] on the shelf. (3b) *What did the librarian put [__ and magazines] on the shelf?

On the other hand, ‘across-the-board’ (or ATB) extraction from both conjuncts is grammatical: (4a) The librarian [[recorded these books in the catalog] and [put these books on the shelf]]. (4b) Which books did the librarian [[record in the catalog] and [put on the shelf]]?
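
The coordinate structure constraint and its across-the-board exception can be restated as a simple check. The Python sketch below is only an illustration: the function name and the reduction of each conjunct to a boolean (does it contain a gap?) are assumptions made here, not part of any analysis cited in this article, and the sketch does not cover extraction of a whole conjunct, as in Example (3b).

```python
def coordination_licenses_extraction(conjunct_has_gap):
    """Return True if the pattern of gaps across the conjuncts is licit.

    conjunct_has_gap: one boolean per conjunct, saying whether that
    conjunct contains a gap bound by a filler outside the coordination.
    """
    # No gap in any conjunct: nothing is extracted, so nothing is violated.
    if not any(conjunct_has_gap):
        return True
    # Across-the-board extraction: every conjunct must contain a gap.
    return all(conjunct_has_gap)


# (2b): gap in the first conjunct only
print(coordination_licenses_extraction([True, False]))  # False
# (4b): gap in both conjuncts
print(coordination_licenses_extraction([True, True]))   # True
```

The check deliberately says nothing about where in each conjunct the gap sits; it only records that the gaps must be present in all conjuncts or in none.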

Another island constraint is the left branch condition, which prohibits the extraction of prehead (left) dependents in noun phrases and adjective phrases: (5a) The librarian put [my book] on the shelf. (5b) *Whose did the librarian put [__ book] on the shelf? (6a) That book is [extremely interesting]. (6b) *How is that book [__ interesting]?

Elements bearing certain grammatical functions constitute islands. For example, extraction is forbidden from subjects: (7a) [Books about wayward librarians] shocked me. (7b) *Who did [books about __] shock you?

Adjuncts are also islands: (8a) I screamed [after the librarian put that book on the shelf].

(8b) *Which book did you scream [after the librarian put on the shelf]?

However, some speakers find the adjunct condition to be weaker than the subject condition. Some island constraints produce violations that can either be relatively acceptable or totally unacceptable, depending on the circumstances. One such constraint is the complex noun phrase (NP) constraint, which prohibits extraction from a clause within a noun phrase. Extraction from a complement clause (Examples (9a) and (9b)) is not as bad as extraction from a relative clause (Examples (10a) and (10b)), because the latter combines the complex NP constraint with the adjunct constraint: (9a) I heard [a rumor [that the librarian put that book on the shelf]]. (9b) ?*Which book did you hear [a rumor [that the librarian put on the shelf]]? (10a) I saw [the librarian [who put that book on the shelf]]. (10b) *Which book did you see [the librarian [who put on the shelf]]?

Another such island constraint is the wh island constraint, which rules out extractions from wh clauses (clauses introduced by wh elements). Extraction from nonfinite wh clauses (as in Examples (11a) and (11b)) is not totally unacceptable, but extraction from finite wh clauses (as in Examples (12a) and (12b)) is much worse: (11a) The librarian wondered [whether to put that book on the shelf]. (11b) ?Which book did the librarian wonder [whether to put on the shelf]? (12a) I wondered [whether the librarian put that book on the shelf]. (12b) *Which book did you wonder [whether the librarian put on the shelf]?

Another ungrammatical LDD involves the extraction of a subject following a complementizer; this is known as the that-trace effect. Extraction of a subject is grammatical in the absence of the complementizer, and extraction of nonsubjects is not sensitive to the presence of a complementizer: (13a) I think [the librarian put the book on the shelf]. (13b) Who do you think [__ put the book on the shelf]? (13c) What do you think [the librarian put on the shelf]? (14a) I think [that the librarian put the book on the shelf]. (14b) *Who do you think [that __ put the book on the shelf]? (14c) What do you think [that the librarian put on the shelf]?

Pied-Piping

Sometimes the filler involves more than just the element that is being focused on or topicalized. Often this is done as a way of circumventing island constraints. So, whereas Example (15a) is ungrammatical (ruled out by the left branch condition), it is possible to front the larger NP, as in Example (15b), even though it is only the possessor of the book that is being questioned: (15a) *Whose did the librarian put [__ book] on the shelf? (15b) Whose book did the librarian put on the shelf?

This phenomenon was dubbed ‘pied-piping’ by Ross (1967). Pied-piping is sometimes a stylistic option; for example, instead of extracting the object of a preposition and leaving the preposition stranded, which is grammatical but stigmatized in certain styles of English, the prepositional phrase can be pied-piped: (16a) Which shelf did the librarian put the book [on __]? (16b) On [which shelf] did the librarian put the book __?

Crossover

LDD constructions interact with pronominal coreference. The position (or grammatical function) of the gap influences coreferential possibilities. Roughly speaking, if the gap is structurally lower than the pronoun, coreference is impossible even though the filler position occupied by the extractee is higher: (17a) Who_i did you say thinks the librarian hates him_i? (17b) *Who_i did you say he_i thinks the librarian hates __?

The ungrammaticality of Example (17b) has been named crossover because, in a transformational account, the extractee crosses over a coreferential NP. That is to say, in this example, who crosses over the coreferential he. A weaker crossover effect obtains if the pronoun is embedded: (18a) Who_i did you say thinks the librarian hates [his_i students]? (18b) ?*Who_i did you say [his_i students] think the librarian hates __?

The effect in Example (17b) is called strong crossover, and that in Example (18b) is weak crossover.

Parasitic Gaps

A phenomenon that has received much attention in the literature on LDDs since the 1980s is illustrated by the following example: (19) Which books did the librarian put __ on the shelf before reading __?

In this sentence, there is one filler, but two gaps. The filler corresponds to both of these gaps. The existence of two gaps corresponding to a single filler is strange; more strange is that the second gap, being embedded within an adjunct, would normally be ungrammatical (because of the adjunct condition), and thus can be thought of as relying on the first gap for its existence. The second gap is said to be parasitic on the first (true) gap, and this construction is referred to as the ‘parasitic gap’ construction. Despite a rather large literature on parasitic gaps, many open questions remain. Even the connection with ordinary LDD constructions is called into question by the fact that parasitic gaps are possible in heavy NP shift constructions, which do not have the same properties as LDDs have: (20) The librarian put on the shelf without reading first [a very expensive edition of Chomsky’s classic book Syntactic structures].

Other non-LDD constructions in other languages, such as scrambling and object clitic constructions, have also been observed to license parasitic gaps. Culicover (2001) provided a good overview of the literature on the subject.

Path Phenomena

In some languages, clauses on the path of an LDD exhibit morphosyntactic properties that distinguish them from ordinary clauses. A survey of such cases is provided by Zaenen (1983). These include complementizer selection in Irish, a tonal phenomenon (downstep deletion) in Kikuyu (Gikuyu), a ‘stylistic’ subject–verb inversion in French, and the nonuse of the pleonastic element það in Icelandic. The following examples from Irish illustrate this: the normal complementizer is goN, but the complementizer on a path is aL (the capitals N and L indicate phonological nasalization or lenition of the following consonant). The complementizer aL is used in all and only clauses on the path.

(21) Deir siad goN  síleann an  t-athair goN  bpósfaidh  Síle   é.
     say  they that thinks  the father   that will.marry Sheila him
     ‘They say that the father thinks Sheila will marry him.’

(22) An  fear [aL   deir siad [aL   shíleann an  t-athair [aL   phósfaidh  Síle   __]]].
     the man   that say  they  that thinks   the father    that will.marry Sheila
     ‘The man that they say the father thinks Sheila will marry.’

(23) An  fear [aL   shíl [goN  mbeadh   sé ann]].
     the man   that said  that would.be he there
     ‘The man that said that he would be there.’
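
The Irish pattern (aL in all and only the clauses on the path) can be made concrete with a small Python sketch. The function name, the numbering of clauses from the outermost inward, and the assumption that the outermost clause is the top of the dependency are illustrative choices made here, not part of the analyses surveyed by Zaenen (1983).

```python
def irish_complementizers(num_clauses, gap_clause=None):
    """Choose a complementizer for each nested clause, outermost first.

    num_clauses: number of nested clauses introduced by a complementizer.
    gap_clause: 0-based index (outermost clause = 0) of the clause that
        contains the gap, or None if there is no long-distance dependency.
    """
    complementizers = []
    for depth in range(num_clauses):
        # A clause is on the path if it lies between the top and the gap.
        on_path = gap_clause is not None and depth <= gap_clause
        complementizers.append("aL" if on_path else "goN")
    return complementizers


# (21): no extraction, two embedded clauses
print(irish_complementizers(2))                # ['goN', 'goN']
# (22): gap in the innermost of three clauses
print(irish_complementizers(3, gap_clause=2))  # ['aL', 'aL', 'aL']
# (23): gap in the outermost of two clauses
print(irish_complementizers(2, gap_clause=0))  # ['aL', 'goN']
```

The choice is made clause by clause, using only whether that clause lies on the path, which is why the pattern bears on the syntactic relevance of the middle of the construction.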

Path phenomena are important in understanding LDD constructions because they show that the middle of the construction has syntactic relevance. The relevance of the middle of the construction is not surprising. As noted earlier in this article, the long-distance nature of LDD constructions is surprising: syntax usually operates on more local domains. The necessity of a clause-by-clause (or constituent-by-constituent) theory of LDDs is therefore expected. All of the contemporary theoretical analyses discussed in the following sections have a local component to them, and are thus compatible with the existence of path phenomena.

Reconstruction Effects

Another property of LDDs is what has been referred to as reconstruction effects. Reconstruction refers to the extractee, which occupies filler position, having semantic properties that make it appear as if it is in the position of the gap. For example, note the reflexive herself in the following example: (24) Which book about herself_i does he think the teacher said the librarian_i put on the shelf?

The reflexive here is part of the filler. A reflexive (roughly) needs to have its antecedent higher in the structure and within its clause, but in this case the antecedent is lower in the structure. What makes this sentence grammatical is that the antecedent is in the correct position for the reflexive if the reflexive is seen as part of the gap. Like strong connectivity, reconstruction effects are an indication of the close relationship between the filler and gap. They suggest, as is embodied in all contemporary standard analyses of LDDs, that the extractee is associated with both filler and gap positions/functions.

Resumptive Pronouns

Many languages allow or require a pronoun in place of the gap in certain types of LDD constructions. This pronoun is called a ‘resumptive pronoun.’ For example, in Hebrew relative clauses, when the extractee is the object of a verb, a resumptive pronoun is optional; when the extractee is the object of a preposition, a resumptive pronoun is obligatory:

(25a) Hasefer  še   hasafranit    sama al hamadaf.
      the.book that the.librarian put  on the.shelf
      ‘The book that the librarian put on the shelf.’

(25b) Hasefer  še   hasafranit    sama oto al hamadaf.
      the.book that the.librarian put  it  on the.shelf
      ‘The book that the librarian put (it) on the shelf.’

(26a) *Hamadaf   še   hasafranit    sama al et  hasefer.
       the.shelf that the.librarian put  on ACC the.book
      ‘The shelf that the librarian put the book on.’

(26b) Hamadaf   še   hasafranit    sama al-av et  hasefer.
      the.shelf that the.librarian put  on-it ACC the.book
      ‘The shelf that the librarian put the book on (it).’

Resumptive pronouns are usually immune from island constraints and do not normally trigger path phenomena. They may or may not display crossover effects or license parasitic gaps. Thus, although they are LDD constructions, they differ from gap-containing constructions. Often, resumptive pronouns are used as a way of circumventing island constraints. For many speakers of English, resumptive pronoun constructions, although not entirely grammatical, seem better than island violations with gaps: (27a) *The book that I saw the librarian who put on the shelf? (27b) ??The book that I saw the librarian who put it on the shelf? (28a) *Who did you ask whether put the book on the shelf? (28b) ??Who did you ask whether she put the book on the shelf?

The analysis of LDD constructions with resumptive pronouns is more controversial than that of LDD constructions with gaps, and will not be discussed here.

Analysis

Discussions in the following sections outline the analysis of LDDs in the major current streams of syntactic theory. The important aspects of the analyses will be covered without involving theory-internal technicalities.

Transformational Grammar

The informal description of LDDs in the preceding sections, given in terms of extraction of the extractee from the gap position to the filler position, corresponds to the transformational analysis of these constructions. In the underlying (deep) structure, the extractee occupies the gap position, whereas in surface structure, it occupies filler position. In the earliest work in transformational grammar (such as that of Chomsky (1957)), it was taken for granted that LDD constructions were derived by transformation:

(29) Deep structure: the librarian PAST put [which book] on the shelf.
     Surface structure: [which book] did the librarian put on the shelf.

The deep structure location of the extractee in the position of the gap is natural, given the fact that the extractee fulfills selectional properties of the verb and is interpreted as an argument. Subsequent development of the theory has refined this view somewhat. For example, since the introduction by Chomsky (1973) of the idea that movement leaves a trace, it has been standard to hypothesize a surface trace (t) in the underlying position of the extractee: (30) [which book] did the librarian put t on the shelf

In some versions of the theory, this trace is co-indexed with the filler, and/or forms a syntactic object called a chain with it (see, for example, Chomsky (1981)). Since a chain is a single syntactic object, with a single set of syntactic properties, strong connectivity follows naturally in such a system. It has also been proposed (Chomsky, 1995, 2000) that movement is a copying operation, and the trace is therefore a copy of the filler: this accounts naturally for reconstruction effects, since the extractee occupies both structural positions. The trace has also been endowed with the semantic status of a variable at the level of logical form, bound by the quantifier-like wh operator (Chomsky, 1977). This variable, because it is semantically similar to a name, is said to have the binding-theoretic status of ordinary referential NPs. This accounts for strong crossover effects, because strong crossover involves the forbidden anaphoric binding of the variable by a pronoun (Chomsky, 1981). The filler position, being a nonargument (Ā) position, does not anaphorically bind the variable.

In the earliest transformational analyses, the extractee moved in strong constructions and was deleted in weak constructions. Unlike clause-bounded rules, such as passive, LDD rules were stated with a variable intended to cover an arbitrary string of constituents, thus capturing their long-distance nature. Islands were taken to be the result of constraints on the interpretation of these variables (Ross, 1967). A more uniform approach to (most) islands was developed in the form of the subjacency condition (Chomsky, 1973), which limited the distance of movement. Much subsequent work has focused on determining the exact nature of this limitation, but (roughly) the bounding nodes are a clausal node and NP. Apparently long-distance movement is taken to be derived by successive movements from specifier of complementizer phrase (CP) to specifier of CP: (31) Which book_i does the student think [t_i the teacher said [t_i that the librarian put t_i on the shelf]]?

Under this view, most islands follow from the lack of an appropriate intermediate landing site. (Certain other islands are the result of the empty category principle, a locality condition on the trace (Chomsky, 1981).) The subjacency-based approach to islands made the deletion analysis of weak LDD constructions untenable. Chomsky (1977) proposed that there is a single transformational rule ‘move wh’ that is responsible for all LDDs; in weak LDD constructions, the wh subsequently deletes (Chomsky, 1977) or is a phonologically empty null operator (Op) (Chomsky, 1981): (32) the book [Op (that) the student thinks [t the teacher said [t the librarian put t on the shelf]]]
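
The reasoning behind the subjacency-based account of islands can be illustrated with a rough Python sketch. It relies on assumptions that go beyond the text: it takes subjacency in its usual textbook form (a single movement step may cross at most one bounding node), uses S and NP as the bounding nodes, and treats whether as occupying the specifier of CP. The labels and the function name are invented for illustration, and the sketch makes no attempt to capture the graded judgments reported for Examples (9b) and (11b).

```python
# Assumed inventory of bounding nodes; the text says only that they are
# "roughly" a clausal node (written S here) and NP.
BOUNDING_NODES = {"S", "NP"}


def obeys_subjacency(path):
    """Check the nodes crossed on the way from the gap up to the filler.

    path: node labels listed bottom-up. "CP" is a clause edge whose
    specifier is free, so the moving phrase can stop there; "CP-filled"
    is a clause edge whose specifier is already occupied.
    """
    crossed = 0  # bounding nodes crossed since the last landing site
    for node in path:
        if node in BOUNDING_NODES:
            crossed += 1
            if crossed > 1:
                return False  # a single step spans two bounding nodes
        elif node == "CP":
            crossed = 0  # land in the free specifier; a new step begins
    return True


# (31): three clauses, each with a free specifier of CP
print(obeys_subjacency(["S", "CP", "S", "CP", "S", "CP"]))  # True
# (9b): a complement clause inside an NP
print(obeys_subjacency(["S", "CP", "NP", "S", "CP"]))       # False
# (12b): the lower specifier of CP is occupied by whether
print(obeys_subjacency(["S", "CP-filled", "S", "CP"]))      # False
```

In the two failing cases the moving phrase has no intermediate landing site between two bounding nodes, which is the sense in which islands follow from the lack of an appropriate landing site.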

Transformational theory has also been concerned with the motivation of wh movement. For example, Chomsky (1986) suggested that wh movement is necessary for the wh operator to assume the correct scope at logical form. Chomsky (1995) also attributed wh movement to the need to have features checked: the interrogative complementizer carries a wh feature that needs to be checked. Moving a wh element to its specifier allows the checking.

Phrase Structure Grammar

Since the class of LDD constructions is one of the types of constructions that has been generally accepted as showing that natural-language syntax cannot be described by phrase structure grammar, one of the earliest efforts in contemporary phrase structure grammar has been to show that this assumption is false. Thus the earliest work in generalized phrase structure grammar (GPSG) by Gazdar et al. (1984) dealt with these constructions, and the basic ideas developed in that work have been carried over into head-driven phrase structure grammar (HPSG) (Pollard and Sag, 1994). The basic idea in the GPSG/HPSG analysis is that the constituent containing the gap has a special ‘missing element’ feature. Informally, this is written using a slash: for example, a verb phrase (VP) with a missing NP belongs to the category VP/NP. As a result of this notation, the feature has come to be named SLASH. In the following sentence (Example (33)), the constituent headed by put is a VP/NP. The clausal constituent containing this VP/NP is S/NP, since it is missing the same NP as the VP it dominates: (33) [S Which book [S/NP does the student [VP/NP think [S/NP the teacher [VP/NP said [S/NP the librarian [VP/NP put [NP/NP e] on the shelf ]]]]]]]?
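
The way slash categories arise in Example (33) can be sketched in a few lines of Python. This is an informal illustration only: trees are plain nested lists, SLASH is reduced to a bare category label, and the names GAP and annotate are invented here. It models the simple percolation view, in which a mother inherits the missing element from any daughter, and is not an implementation of GPSG or HPSG.

```python
GAP = "e"  # placeholder word standing for the missing constituent


def annotate(tree, missing="NP"):
    """Relabel every node that dominates a gap as CATEGORY/missing.

    tree: a list [label, child, ...], where each child is either a
    string (a word or word sequence) or another tree of the same shape.
    """
    label, *children = tree
    new_children = []
    has_gap = False
    for child in children:
        if isinstance(child, list):
            child = annotate(child, missing)
            # A daughter labelled X/NP passes its missing element upward.
            has_gap = has_gap or child[0].endswith("/" + missing)
        elif child == GAP:
            has_gap = True
        new_children.append(child)
    new_label = f"{label}/{missing}" if has_gap else label
    return [new_label, *new_children]


clause = ["S", "the librarian", ["VP", "put", ["NP", GAP], "on the shelf"]]
print(annotate(clause))
# ['S/NP', 'the librarian', ['VP/NP', 'put', ['NP/NP', 'e'], 'on the shelf']]
```

At the top of the construction the filler combines with the S/NP clause and the result is a plain S, which is the step at which the missing element is supplied.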

A category such as S/NP is S with the additional feature [SLASH ⟨NP⟩]. More precisely, the elements of the value of SLASH are not category labels, but (part of) the feature structure of the filler. As a result of this feature sharing, the same element occupies the filler and gap positions simultaneously, without movement. This accounts for strong connectivity and reconstruction effects. The SLASH feature is said to be introduced at the bottom of the construction and discharged (or bound off) at the top. The passing of the SLASH feature up the tree has an unusual quality to it, as it appears not to be based on the concept of head. In much of the GPSG/HPSG literature, SLASH is said to belong to a special class of features, called nonlocal (or foot) features, which are not constrained to percolate between heads and phrases. A different account (Bouma et al., 2001; Ginzburg and Sag, 2000) proposed that the head amalgamates all the SLASH values of its dependents, so the SLASH feature on a phrase in the path is inherited from the head. This obviates the need for a special kind of feature.

In early GPSG/HPSG literature, the gap was viewed (in most cases) as an empty category (trace), as in transformational theory. However, what has become the standard view (originating in the work of Pollard and Sag (1994)) is one in which the SLASH feature originates not in a trace but rather as one of the features of the selecting head. Even in early work, traces did not appear in all LDDs: a distinction was drawn between subject extraction (which did not involve a trace) and nonsubject extraction (which did). In the work of Bouma et al. (2001), Sag (2001), and Ginzburg and Sag (2000), subject and nonsubject extraction (and adjunct extraction) have been unified. The issues involved here have been primarily theory-internal issues of the nature of the features involved. However, questions of similarities and differences between subject and nonsubject extraction have also been raised: proponents of the earlier view have cited differences between subject and nonsubject extraction, whereas proponents of the more recent view have cited similarities.

As is generally the case in HPSG, the properties of LDD constructions are a consequence of the properties of feature structures. Island constraints are due to constraints on feature combinations. The discharge of the SLASH feature is due to the properties of feature-based constructions. The lack of a filler in weak constructions is not a problem, as the construction can bind off the SLASH feature, even without a filler.

Lexical-Functional Grammar

Lexical-functional grammar (LFG) (Kaplan and Bresnan, 1982; Bresnan, 2001) models syntax as involving parallel representations of structure (known as constituent structure, or c-structure) and function (functional structure, or f-structure). At f-structure, the multifunctionality of elements in LDD constructions is represented directly. For example, as was shown in Example (33), the functional element corresponding to the determiner phrase which book fills both the function of FOCUS of the main clause (the clause headed by think) and OBJECT of the clause headed by put. Such a representation differs from other theories, particularly transformational theory, in not conceptualizing LDD constructions as involving the displacement of an element from its natural position. Instead, the element has two functions, and the c-structural position that it occupies is the canonical position of one of those two functions. Strong connectivity and reconstruction effects follow, since the phenomena in question are f-structure phenomena, and the extractee has both the filler function and the gap function simultaneously. Since LDD constructions are an f-structure phenomenon, the lack of an overt c-structure filler in weak LDD constructions is not a problem. The currently accepted view (Kaplan and Zaenen, 1989) is that the licensing of LDD constructions is purely a functional structure matter, and is not based on constituent structure properties. The licensing takes the form of a constraint schema identifying the values of two grammatical functions. The constraint schema, which represents an infinite number of specific constraints, has come to be referred to as ‘functional uncertainty.’ There are several reasons given for treating LDD constructions as being licensed at f-structure; for example, these constructions exhibit category mismatches:

(34a) That he might be wrong, he didn’t think of.
(34b) *That he might be wrong, he didn’t think.
(34c) *He didn’t think of that he might be wrong.
(34d) He didn’t think that he might be wrong.
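The idea that functional uncertainty is a schema standing for infinitely many specific constraints can be illustrated with a small sketch. It is illustrative only and rests on several assumptions that are not part of the LFG literature: f-structures are modeled as nested Python dicts, the uncertainty path is taken to have the shape COMP* OBJ (any number of COMP steps followed by OBJ), the toy f-structure for a sentence with think and put is invented for the purpose, and the function names are hypothetical.

```python
def uncertainty_paths(fstructure, body="COMP", bottom="OBJ"):
    """Yield (path, value) for every path of the shape body* bottom."""
    path = []
    current = fstructure
    while isinstance(current, dict):
        if bottom in current:
            yield tuple(path) + (bottom,), current[bottom]
        # Descend one more clause level along the body function.
        current = current.get(body)
        path.append(body)


def satisfies_uncertainty(fstructure, filler_function="FOCUS"):
    """True if the filler is the very same object as some gap-function value."""
    filler = fstructure.get(filler_function)
    if filler is None:
        return False
    return any(value is filler for _, value in uncertainty_paths(fstructure))


# Toy f-structure (an assumption): one element bears two functions.
which_book = {"PRED": "book", "SPEC": "which"}
f_structure = {
    "PRED": "think",
    "FOCUS": which_book,
    "SUBJ": {"PRED": "you"},
    "COMP": {
        "PRED": "put",
        "SUBJ": {"PRED": "I"},
        "OBJ": which_book,  # identical object, not a copy
    },
}

print(satisfies_uncertainty(f_structure))
```

Object identity (`is`) stands in for the sharing of a single f-structure element between the filler and gap functions; no c-structure position or trace is consulted at any point.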

Another argument is that islands are based on grammatical functions such as subject and adjunct, not on structural position. Kaplan and Zaenen (1989) gave arguments that, in general, islands are due to functional constraints on the middle of the construction. Although Kaplan and Bresnan’s (1982) original formulation hypothesized a trace in the position of the gap, this is not necessary under Kaplan and Zaenen’s functional uncertainty approach: a constraint associated with the position of the filler specifies the gap function. However, an alternative to Kaplan and Zaenen’s traceless c-structure has been proposed by Bresnan (1995). Bresnan licensed the multifunctionality of the extractee through bottom-up (‘inside-out’ in the terminology used in LFG) functional uncertainty: the licensing constraint is associated with the empty element and expresses identity with a grammaticized discourse function higher in the structure. Falk (2001) proposed that subject extraction be distinguished from nonsubject extraction: he analyzed subject extraction the way Kaplan and Zaenen did, using top-down (outside-in) licensing without trace, and nonsubject extraction using Bresnan’s empty category with bottom-up (inside-out) licensing. He suggested that certain differences between subjects and nonsubjects can be accounted for in this fashion. Falk’s analysis bears some resemblance to the early distinction in GPSG/HPSG between subject and nonsubject extraction.

Traces

As the descriptions of the analyses of LDD constructions in various theories demonstrate, one of the major areas of disagreement is the structural status of the gap position. Some researchers analyze all gap positions as containing an empty category (or trace), some analyze some gap positions (nonsubjects) as containing traces and other gap positions (subjects) as not containing them, and some hold that there is never an empty element in the position of the gap. To a degree, the question of the presence of traces is a theory-internal matter. In transformational theory, for example, traces are part of the general theory of movement. Conversely, for proponents of traceless accounts, the fact that traces are not overt elements makes them suspect as elements of constituent structure. Nevertheless, there have been attempts in the literature to find overt evidence for traces in LDD constructions. This suggests that at least some researchers consider the question of the existence of traces to be an empirical question. The following discussions review some of the better-known arguments that have appeared in the literature. Ironically, although nonderivational theories tend to be more skeptical about traces than derivational theories do, it was the original proposal that movement leaves a trace that provided the initial impetus for nontransformational theories of LDD constructions. Once traces were proposed, the original argument for deep structure and movement carried less weight: as long as there was a mechanism for associating the trace with the extractee, there was no need to posit an abstract structure in which the extractee occupies the position of the gap. The trace was there to serve whatever functions the deep structure position of the extractee was intended to serve, such as fulfilling selectional properties and having the clause-internal grammatical function. The best known argument for traces is the phenomenon of wanna contraction. In natural speech, the sequence want to is normally pronounced wanna:

(35) I want to put the book on the shelf → I wanna put the book on the shelf.

But if there is an LDD gap between want and to, this is impossible for most speakers of English:

(36) Who do you want to put the book on the shelf?
     *Who do you wanna put the book on the shelf?

This argument for traces was summarized by Jacobson (1982). However, other analyses of wanna contraction have been proposed. For example, the ability to contract can be defined in terms of subject sharing: it is possible when the subject of want is the same as the subject of to’s clause (Postal and Pullum, 1978). Alternatively, wanna may be a lexical item, with no contraction involved at the syntactic level (Pullum, 1997). The argument from wanna contraction is thus at best inconclusive. Jacobson (1982) cited several additional arguments for traces. One of them is based on the fact that the VP following to can normally be ellipted, but not immediately following a gap:

(37a) The students hope to put the books on the shelf.
(37b) *Who did you ask to put the books on the shelf?
(37c) Who did you ask not to put the books on the shelf?

Jacobson hypothesized that the constituent before to in a VP ellipsis construction must be stressed. Since trace cannot be stressed, the ellipsis is ungrammatical. Another argument for traces has been advanced by Bresnan (1995). She argued that weak crossover effects are due to syntactic rank (defined in terms of a hierarchy of grammatical functions) and linear order. She noted that, in the following examples, syntactic rank cannot be the cause, since obliques outrank predicative (Example (38a)) and clausal (Example (38b)) complements, and different obliques have the same rank (Example (38c)):

(38a) ?*To whom_i did Mary seem proud of him_i?
(38b) ??Who(m)_i did they explain how he_i should dress to?
(38c) ?*Everyone_i in the room, I talked about his_i coursework with.

The weak crossover effects in these sentences must be due to the pronoun preceding its intended antecedent, which is possible only if there is a trace in the gap position.

Pollard and Sag (1994) cited psycholinguistic evidence against the idea that traces can be shown to be in the position where the filler is reactivated. The evidence instead seems to indicate that reactivation occurs at the position of the selecting head rather than at the position of the gap. They noted that the sentence in Example (39a), in which the selecting head (in) follows the heavy NP object, is difficult to process, but the sentence in Example (39b), in which the selecting head (put) precedes the heavy NP, is not difficult to process, despite the fact that the putative trace would follow the NP:

(39a) Which box did you put the very large and beautifully decorated wedding cake bought from the expensive bakery in?
(39b) In which box did you put the very large and beautifully decorated wedding cake bought from the expensive bakery?

The evidence for the existence of traces in LDD constructions is inconclusive at this stage. More research is needed to determine whether there is empirical evidence for them. For now, it remains a theory-internal question. See also: Constituent Structure; Island Constraints; Relative Clauses; Transformational Grammar: Evolution.

Bibliography

Bouma G, Malouf R & Sag I A (2001). ‘Satisfying constraints on extraction and adjunction.’ Natural Language and Linguistic Theory 19, 1–65.
Bresnan J (1995). ‘Linear order, syntactic rank, and empty categories: on weak crossover.’ In Dalrymple M, Kaplan R M, Maxwell III J T & Zaenen A (eds.) Formal issues in lexical-functional grammar. Stanford, CA: CSLI Publications. 241–274.
Bresnan J (2001). Lexical-functional syntax. Oxford: Blackwell.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1973). ‘Conditions on transformations.’ In Anderson S R & Kiparsky P (eds.) A festschrift for Morris Halle. New York: Holt Rinehart and Winston. 232–286.
Chomsky N (1977). ‘On wh-movement.’ In Culicover P W, Wasow T & Akmajian A (eds.) Formal syntax. New York: Academic Press. 71–132.
Chomsky N (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky N (1986). Barriers. Cambridge, MA: MIT Press.
Chomsky N (1995). The minimalist program. Cambridge, MA: MIT Press.
Chomsky N (2000). ‘Minimalist inquiries: the framework.’ In Martin R, Michaels D & Uriagereka J (eds.) Step by step: essays on minimalist syntax in honor of Howard Lasnik. Cambridge, MA: MIT Press. 89–155.
Culicover P W (2001). ‘Parasitic gaps: a history.’ In Culicover P W & Postal P M (eds.) Parasitic gaps. Cambridge, MA: MIT Press. 3–68.
Falk Y N (2001). Lexical-functional grammar: an introduction to parallel constraint-based syntax. Stanford, CA: CSLI Publications.
Gazdar G, Klein E, Pullum G K & Sag I (1984). Generalized phrase structure grammar. Oxford: Basil Blackwell.
Ginzburg J & Sag I A (2000). Interrogative investigations: the form, meaning, and use of English interrogatives. Stanford, CA: CSLI Publications.
Jacobson P (1982). ‘Evidence for gap.’ In Jacobson P & Pullum G K (eds.) The nature of syntactic representation. Dordrecht: Reidel. 187–228.
Kaplan R M & Zaenen A (1989). ‘Long-distance dependencies, constituent structure, and functional uncertainty.’ In Baltin M R & Kroch A S (eds.) Alternative conceptions of phrase structure. Chicago: University of Chicago Press. 17–42.
Pollard C & Sag I A (1994). Head-driven phrase structure grammar. Stanford, CA: CSLI Publications.
Postal P M & Pullum G K (1978). ‘Traces and the description of English complementizer contraction.’ Linguistic Inquiry 9, 1–29.
Pullum G K (1997). ‘The morpholexical nature of English to contraction.’ Language 73, 79–102.
Radford A (1997). Syntax: a minimalist introduction. Cambridge: Cambridge University Press.
Ross J R (1967). ‘Constraints on variables in syntax.’ Ph.D. diss., MIT.
Zaenen A (1983). ‘On syntactic binding.’ Linguistic Inquiry 14, 469–504.

Long-Range Comparison: Methodological Disputes
L Campbell, University of Utah, Salt Lake City, UT, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

How are languages shown to be related to one another? Proposals of distant linguistic kinship such as Amerind, Nostratic, Eurasiatic, and Proto-World have received much attention in recent years, although these same proposals are rejected by a majority of practicing historical linguists. This has resulted in vigorous disputes about the methods for investigating remote relationships among languages as yet not known to be related. Some enthusiasts of long-range relationships, disappointed that proposed language connections they favor have not been accepted, have at times responded bitterly, for example charging that these rejections are just “clumsy and dishonest attempts to discredit deep reconstructions” (Shevoroshkin, 1989: 7), and that “very few [critics of long-range proposals] have ever bothered to examine the evidence first-hand. . . To really screw up classification you almost have to have a Ph.D. in historical linguistics” (Ruhlen, 1994: viii). The strong rhetoric is not all one-sided:

At a different level – which transcends scientific worth to such an extent that it is at the fringe of idiocy – there have in recent years been promulgated a number of farfetched ideas concerning ‘long-distance relationships’, such as ‘Nostratic’, ‘Sino-Caucasian’, and ‘Amerind’. (Dixon, 2002: 23)

This article explains these disputes.

Hypothesized Long-Range Relationships

The list in Table 1 of the better-known hypotheses that would group together languages that are not yet known to be related gives an idea of what is at issue. (None of the proposed genetic relationships in this list has been demonstrated yet, even though some are repeated frequently.)

Methods

Scholars agree that a successful demonstration of linguistic kinship depends on adequate methods, but disagree about what these methods are. Discussions of methodology therefore assume a central role in considerations of long-range comparisons. The methodological principles and criteria considered important for investigating proposals of distant genetic relationship are surveyed here.

In practice, the successful methods for establishing distant linguistic affinity have not been different from those used to establish any family relationship, close or distant. The comparative method has always been the primary tool. Because the methods for distant relationships are not different from those for more closely related languages, we encounter a continuum from established families (e.g., Indo-European, Finno-Ugric, Mayan, Bantu), to more distant but solidly demonstrated relationships (e.g., Uralic, Siouan-Catawban, Benue-Congo), to plausible but inconclusive hypotheses (e.g., Indo-Uralic, Proto-Australian, Macro-Mayan, Niger-Congo), to doubtful but not implausible ones (e.g., Altaic, Austro-Tai, Eskimo-Uralic, Nilo-Saharan), and on to virtually impossible proposals (e.g., Basque-SinoTibetan-NaDene, Indo-Pacific, Mayan-Turkic, Miwok-Uralic, Niger-Saharan). It is difficult on the basis of standard methods to segment this continuum so that plausible proposals based on legitimate procedures fall sharply on one side, distinguished from obviously unlikely hypotheses on the other side. This leads to disagreements, even among those who profess allegiance to the same methods. A firm understanding of methodology becomes crucial if supporters of fringe proposals can pretend to apply the same methods as those employed for more plausible ones. For this reason, careful evaluation of the evidence presented on behalf of any proposed distant linguistic relationship and of the methods employed is called for.

Throughout history, the criteria employed both in pronouncements about method and in actual practice for establishing language families consistently included evidence from three sources: basic vocabulary, grammatical evidence (especially morphological), and sound correspondences. Hoenigswald (1990: 119–120) summarized the points upon which 17th- and 18th-century linguistic scholars agreed:

There was . . . “a concept of the development of languages into dialects and of dialects into new independent languages” . . . and . . . “an insistence that not a few random items, but a large number of words from the basic vocabulary should form the basis of comparison” . . . the doctrine that ‘grammar’ is even more important than words; . . . the idea that for an etymology to be valid the differences in sound – or in ‘letters’ – must recur [emphasis added, LC].

These criteria figured prominently in nearly all demonstrations of language families in the past, making them important also to today’s practice. The methods and criteria generally thought necessary for reliable long-range comparison are surveyed in what follows. (See Campbell, 2003; Campbell, 1997a: 206–259 for details.)

Table 1 Proposals of distant genetic relationships among languages

Altaic (Turkic, Tungusic, Mongolian, sometimes Japanese, Korean)
Amerind (uniting all Native American language families, except Eskimo-Aleut and Na-Dene)
Dene-Sino Tibetan (Athabaskan [or Na-Dene] and Sino-Tibetan)
Austric (Austro-Asiatic with Austronesian)
Austro-Tai (Japanese-Austro-Thai)
Basque-Caucasian, Basque-Sino Tibetan-NaDene
Dravidian-Uralic
Eskimo and Indo-European
Eskimo-Uralic
Eurasiatic (Indo-European, Uralic, Eskimo-Aleut, Ainu, several others)
Hokan (grouping numerous American Indian families and isolates)
Indo-European and Afroasiatic, Indo-European and Semitic
Indo-Pacific (grouping the non-Austronesian languages of the Pacific: all Papuan families, Tasmanian, languages of the Andaman Islands)
Japanese-Austronesian
Khoisan (grouping most non-Bantu African click languages, an areal grouping, not a genetic one)
Macro-Siouan (Siouan, Iroquoian, Caddoan, sometimes Yuchi)
Maya-Chipayan (Mayan, Uru-Chipayan of Bolivia)
Na-Dene (Eyak-Athabaskan, Tlingit, Haida; Haida is highly disputed)
Niger-Kordofanian (Niger-Congo) (grouping Mande, Kru, Kwa, Benue-Congo [of which Bantu is a branch], Gur, Adamawa-Ubangi, Kordofanian, etc.)
Nilo-Saharan (most of the African languages not otherwise classified with one of Greenberg’s other three African macrofamilies)
Nostratic (Indo-European, Uralic, Altaic, Kartvelian, Dravidian, Afroasiatic; some add others)
Penutian (grouping numerous American Indian families and isolates)
Proto-Australian (all Australian families)
Proto-World (uniting all the world’s languages)
Ural-Altaic (Uralic and Altaic)

Lexical Comparison

Throughout history, word comparisons have been employed as evidence of language family relationship, but, given a small collection of likely-looking words, how can we determine whether they are really the residue of common origin and not due to chance or some other factor? Lexical comparisons by themselves are seldom convincing without additional support from other criteria.

Basic Vocabulary. Most scholars require that basic vocabulary be part of the supporting evidence for any distant genetic relationship. Basic vocabulary is generally understood to include terms for body parts, close kinship, frequently encountered aspects of the natural world (mountain, river, cloud), and low numbers. Basic vocabulary is generally resistant to borrowing, so comparisons involving basic vocabulary items are less likely to be due to diffusion and stand a better chance of being inherited from a common ancestor than other kinds of vocabulary. Still, basic vocabulary can also be borrowed – though infrequently – so that its role as a safeguard against borrowing is not foolproof.

Glottochronology. Glottochronology, now mostly abandoned, aimed at assigning dates to the split-up of related languages; it has been employed in long-range comparisons. It depends on basic, relatively culture-free vocabulary, but all its basic assumptions have been challenged (including the existence of culture-free vocabulary). Most tellingly, it does not find or test distant genetic relationships, but rather it assumes that the languages compared are related and proceeds to attach a date based on the number of core-vocabulary words that are considered similar among the languages compared. This, then, is no method for determining whether languages are related.

Multilateral Comparison. The best-known approach that relies on inspectional resemblances among words is Joseph Greenberg’s ‘multilateral (or mass) comparison.’ It is based on “looking at . . . many languages across a few words” rather than “at a few languages across many words” (Greenberg, 1987: 23). The lexical similarities determined by superficial visual inspection that are shared ‘across many languages’ alone are taken as sufficient evidence for genetic relationship. This approach stops where others begin, at assembling lexical similarities. These inspectional resemblances must be investigated to determine why they are similar, whether the similarity is due to inheritance from a common ancestor (genetic relationship), or to borrowing, accident, onomatopoeia, sound symbolism, or nursery formations – nongenetic factors. Since multilateral comparison does not do this, its results are controversial and rejected by most mainstream historical linguists. No technique that relies on inspectional similarities in vocabulary alone has proven adequate for establishing family relationships.

Sound Correspondences

Nearly all scholars consider regular sound correspondences strong evidence of genetic affinity. Correspondences do not necessarily involve similar sounds. The sounds that are equated in proposals of remote relationship are typically similar, often identical, although such identities are not so frequent among the daughter languages of well-established language families. The sound changes that lead to such nonidentical correspondences often make cognate words not apparent. These true but nonobvious cognates are missed by methods that seek only superficial resemblance, for example: French cinq/Russian pjatj/Armenian hing/English five (all derived by straightforward changes from original Indo-European *penkʷe- ‘five’); French boeuf/English cow (both from Proto-Indo-European *gʷou- ‘cow’). The words in these cognate sets are not visually similar, but they exhibit regular sound correspondences among the cognates. Though extremely important and valuable, the criterion of sound correspondences can be misapplied. Sometimes regularly corresponding sounds are found in loans. By Grimm’s law, real French-English cognates should exhibit the correspondence p : f, as in the cognates père/father, pied/foot, pour/for. However, French and English appear to exhibit the correspondence p : p in cases where English has borrowed from French or Latin, as in paternel/paternal, piédestal/pedestal, per/per. Since English has many such loans, examples of the bogus p : p sound correspondence are not hard to find. In comparing languages not yet known to be related, we must be cautious of the problem of seeming correspondences in undetected loans. Sound correspondences in basic vocabulary help, since basic vocabulary is borrowed only infrequently. Some nongenuine sound correspondences can come from accidentally similar words. Languages share some vocabulary by sheer accident, for example: Proto-Je *niw ‘new’/English new; Kaqchikel mes ‘mess’/English mess; Maori kuri ‘dog’/English cur; Lake Miwok hóllu ‘hollow’/English hollow; Gbaya be ‘to be’/English be. Other unreal sound correspondences can come from wide semantic latitude in proposed cognates, when phonetically similar but semantically disparate forms are equated. For example, if we compare Pipil (Uto-Aztecan) teki ‘to cut’/Finnish (Uralic) teki ‘made’, tukat ‘spider’/tukat ‘hairs’, etc., we note a recurrence of a t : t and a k : k correspondence.
However, the phonetic correspondences in these word pairs are accidental – it is always possible to find phonetically similar words among languages if their meanings are ignored. With too much semantic leeway among compared forms, spurious correspondences such as the Pipil-Finnish t : t and k : k turn up. Unfortunately, wide semantic latitude is very common in cases of long-range comparison. Additional noninherited phonetic similarities crop up when onomatopoetic, sound-symbolic, and nursery forms are compared. A set of proposed cognates involving a combination of loans, chance enhanced by semantic latitude, onomatopoeia, and such factors can exhibit false sound correspondences.
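The way undetected loans can mimic a regular correspondence can be shown with a small sketch. It is a deliberately crude illustration, not a method from the comparative literature: it assumes that the first letter of each spelled word can stand in for its initial sound, and the function name `initial_correspondences` is hypothetical. The word pairs are the French/English examples cited above.

```python
from collections import Counter


def initial_correspondences(pairs):
    """Count how often each pairing of word-initial letters recurs."""
    return Counter((left[0], right[0]) for left, right in pairs)


# French/English pairs from the text: true cognates versus loans.
inherited = [("père", "father"), ("pied", "foot"), ("pour", "for")]
borrowed = [("paternel", "paternal"), ("piédestal", "pedestal"), ("per", "per")]

counts = initial_correspondences(inherited + borrowed)
for (french, english), n in sorted(counts.items()):
    print(f"{french} : {english} recurs {n} times")
```

In this toy data both p : f and p : p recur three times, so a recurrence count alone cannot tell the inherited pattern from the borrowed one; that distinction has to come from independent evidence about loans, semantics, and basic vocabulary.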

For this reason, some proposed remote relationships that purportedly are based on regular sound correspondences nevertheless fail to be convincing.

Grammatical Evidence

Scholars throughout linguistic history have considered morphological evidence important for establishing language families. Many favor ‘shared aberrancy’ (‘submerged features,’ ‘morphological peculiarities,’ ‘arbitrary associations’). For example, the Algonquian-Ritwan hypothesis, which groups Wiyot and Yurok (both of California) with the Algonquian family, was controversial, but morphological evidence such as the following comparison of Proto-Central-Algonquian (PCA) and Wiyot helped to confirm the relationship:

Proto-Central-Algonquian *ne + *ehkw = *netehkw- ‘my louse’
Wiyot du- + híkw = dutíkw ‘my louse’ (Teeter, 1964: 1029)

Proto-Central-Algonquian inserts -t- between a possessive pronominal prefix and a vowel-initial root, while Wiyot inserts -t- between possessive prefixes and a root beginning in hV (with the loss of the h in this process). There is no phonetic reason why t should be added in this environment; this is so unusual that it is not likely to be shared by borrowing or accident. Inheritance from a common ancestor that had this peculiarity is more likely, and this is confirmed by other evidence in these languages. Another often repeated example of shared aberrancy is the suppletive agreement between English good/better/best and German gut/besser/best; examples such as this are held to have probative value for showing that languages are related. Morphological correspondences of the ‘shared aberrancy’ type are an important source of evidence for distant genetic relationships.

Borrowing

Diffusion is a source of nongenetic similarity among languages that can complicate evidence for remote relationships. For example, the controversial ‘Chibchan-Paezan’ hypothesis (grouping several South American language families, part of ‘Amerind’) has the proposed cognate set ‘axe’ with words from only four of the many languages involved, but two of these are loans: Cuitlatec navaxo ‘knife’, from Spanish navajo ‘knife, razor’, and Tunebo baxita ‘machete’, from Spanish machete (Tunebo has nasal consonants only before nasal vowels, hence b substitutes for Spanish m) (Greenberg, 1987: 108). When two of the four pieces of evidence are borrowings, the putative ‘axe’ cognate must be abandoned. Examples such as this are not uncommon in proposals of distant genetic relationship.

Semantic Constraints

It is dangerous to present phonetically similar forms with different meanings as potential evidence of remote genetic relationship, assuming semantic shifts have taken place. Of course meaning can shift, but in hypotheses of remote relationship the assumed semantic shifts cannot be documented, and the greater the semantic latitude permitted in compared forms, the easier it is to find phonetically similar forms that have no historical connection (as in the Pipil-Finnish examples, above). When semantically nonequivalent forms are compared, the likelihood of chance phonetic similarity is greatly increased. Within families where the languages are known to be related, etymologies are not accepted unless an explicit account of any assumed semantic changes can be provided. The problem of excessive semantic permissiveness is one of the most common and most serious in long-range comparisons. For example, sets cited for Nostratic compare forms meaning ‘lip/mushroom/soft outgrowth’ and ‘grow up/become/tree/be’, and sets cited for the Amerind hypothesis compare ‘excrement/night/grass’ and ‘child/copulate/son/girl/boy/tender/bear/small’. It is for reasons such as this that these proposals of more remote linguistic relationship are disputed.

Onomatopoeia

Onomatopoetic words imitate the real-world sound associated with their meanings. They may be similar in different languages because they have independently approximated the sounds of nature, not because they share any common history. A way to reduce the sound-imitative problem is to omit from long-range comparisons any word that frequently has a similar imitative form cross-linguistically, for example ‘blow’, ‘breathe’, ‘suck’, ‘laugh’, ‘cough’, ‘sneeze’, ‘break/cut/chop/split’, ‘cricket’, ‘crow’ (bird names in general), ‘frog/toad’, ‘lungs’, ‘baby/infant’, ‘beat/hit/pound’, ‘call/shout’, ‘choke’, ‘cry’, ‘drip/drop’, ‘hiccough’, ‘kiss’, ‘shoot’, ‘snore’, ‘spit’, and ‘whistle’. Unfortunately, examples of onomatopoetic words are frequent in proposals of distant genetic relationships.

Nursery Forms

Nursery words (the ‘mama-nana-papa-dada-caca’ sort) should be avoided, since they typically share a high degree of cross-linguistic similarity that is not due to common ancestry. Nevertheless, examples of nursery words are frequent in cases of long-range comparison. The words involved are typically ‘mother’, ‘father’, ‘grandmother’, ‘grandfather’, and often ‘brother’, ‘sister’, ‘aunt’, and ‘uncle’, and have shapes like mama, nana, papa, baba, tata, dada. Jakobson (1962[1960]: 542–543) explained the cross-linguistic nongenetic similarity among nursery forms. Nursery words provide no reliable support for genetic relationship.

Short Forms and Unmatched Segments

How long proposed cognates are and the number of matched sounds within them are important, since the greater the number of matching sounds in a proposed cognate set, the less likely it is that accident accounts for the similarity. Monosyllabic words (CV, VC, V) are so short that their similarity to forms in other languages could also easily be due to chance. If only one or two sounds of longer forms are matched, chance may explain the similarity. Such comparisons are not persuasive.

Chance Similarities
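A rough calculation makes the role of chance concrete, both for short forms and for large pools of compared words. The sketch below is a back-of-the-envelope model under loudly stated assumptions that do not come from the literature cited here: every segment is drawn independently and with equal probability from an inventory of 20 sounds, and 500 word pairs are inspected. Real phoneme frequencies and phonotactics are far from uniform, so the numbers indicate only the direction of the effect, and the function names are hypothetical.

```python
def chance_match_probability(n_segments, inventory_size=20):
    """Chance that two unrelated forms agree in all n_segments positions,
    assuming independent, equally likely segments (a simplification)."""
    return (1.0 / inventory_size) ** n_segments


def at_least_one_chance_match(p_single, n_comparisons):
    """Chance of one or more accidental matches across n_comparisons pairs."""
    return 1.0 - (1.0 - p_single) ** n_comparisons


pool = 500  # hypothetical number of word pairs inspected
for n in (1, 2, 3, 4):
    p = chance_match_probability(n)
    print(n, p, at_least_one_chance_match(p, pool))
```

Under these assumptions the chance of an accidental match falls steeply with each additional matched segment and rises as the pool of candidate comparisons grows, for instance when the permitted meanings are widened.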

Chance (accident) is another possible explanation for similarities in compared languages, and it needs to be ruled out. The potential for accidental matching increases dramatically when one leaves the realm of basic vocabulary, when one increases the pool of words from which potential cognates are sought, and when one permits the semantics of compared forms to vary even slightly (Ringe, 1992: 5). Cases of similar but noncognate words are well-known, for example French feu and German Feuer ‘fire’, English much and Spanish mucho ‘much’. The phonetic similarity in these basic vocabulary items is an accidental convergence resulting from the sound changes that they have undergone, not from inheritance from any common word in the proto-language. That originally distinct forms in different languages can become similar due to sound changes is not surprising, since even within a single language originally distinct words can converge due to sound changes, for example, English lie/lie (from Proto-Germanic *ligjan ‘to lie, lay’/*leugan ‘to tell a lie’).

Sound–Meaning Isomorphism

Only comparisons which involve both sound and meaning together are permitted. Similarities in sound alone (for example, the presence of tones in compared languages) or in meaning alone (for example, grammatical gender in languages compared) are not reliable, since they can develop independently of genetic relationship, due to diffusion, accident, and typological tendencies.

Only Linguistic Evidence

Only linguistic information, no nonlinguistic consideration, is permitted as evidence of distant genetic relationship. Shared cultural traits, mythology, folklore, technologies, and gene pools must be eliminated from arguments for linguistic relationship. The wisdom of this is seen in the face of the many strange proposals based on nonlinguistic evidence. For example, some earlier African classifications proposed that Ari (Omotic) belongs to either Nilo-Saharan or Sudanic ‘because the Ari people are Negroes’ (‘racial’ evidence), that Moru and Madi belong to Sudanic because they are located in central Africa (geographical evidence), or that Fula is Hamitic because its speakers herd cattle, are Moslems (cultural evidence), and are tall and Caucasoid (physical attributes) (Fleming, 1987: 207). Clearly the language one speaks does not deterministically depend on one’s cultural and biological connections.

Erroneous Morphological Analysis

Where compared words are analyzed with more than one morpheme, it is necessary to show that the segmented morphemes in fact exist in the language. Unfortunately, unmotivated morphological divisions are frequent in proposals of remote relationship. Often, a morpheme boundary is inserted where none is justified, as for example in the arbitrarily segmented Tunebo baxi-ta ‘machete’ (the word is borrowed from Spanish machete and contains no morpheme boundary). This false segmentation makes the Tunebo word appear more similar to the other proposed cognates, Cabecar bak and Andaqui boxo-(ka) ‘axe’ (Greenberg, 1987: 108). Undetected morpheme divisions are also a problem. An example from the Amerind hypothesis compares Tzotzil ti?il ‘hole’ with Lake Miwok talokh ‘hole’, Atakapa tol ‘anus’, Totonac tan ‘buttocks’, Takelma telkan ‘buttocks’ (Greenberg, 1987: 152); however, the Tzotzil form is ti?-il, from ti? ‘mouth’ + -il ‘indefinite possessive suffix’, meaning ‘edge, border, lips, mouth’, but not ‘hole’. The appropriate comparison ti? ‘mouth’ bears no particular resemblance to the other forms with which it is compared.

Spurious Forms

Another problem is that of nonexistent or erroneous ‘data’ from ‘bookkeeping’ problems and ‘scribal’ errors. For example, for the Mayan-Mixe-Zoquean hypothesis (Brown and Witkowski, 1979), Mixe-Zoquean words meaning ‘shell’ were compared with K’iche’ (Mayan) sak’, said to mean ‘lobster’, actually ‘grasshopper’ – a misunderstanding of Spanish langosta, which in Guatemala (where K’iche’ is spoken) means ‘grasshopper’, but ‘lobster’ in other varieties of Spanish. A comparison of ‘shell’ and ‘grasshopper’ makes no sense. Errors of this sort can be serious; for example, in the Amerind hypothesis (Greenberg, 1987) none of the words given as Quapaw are in fact from Quapaw; all are from Biloxi and Ofo. Likewise, none of the words given as Proto-Mayan are from Proto-Mayan; they are instead from Proto-K’ichean.

Given the disputes about proposed distant genetic relationships, these methodological principles for long-range comparison are extremely important. Research on possible distant genetic relationships that does not conform to these methodological principles and cautions will remain inconclusive.
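The earlier point about chance, that accidental matches multiply as the pool of candidate words grows, can be illustrated with a small calculation. The Python sketch below is not Ringe's (1992) procedure; it is a simplified illustration that assumes each candidate form independently has the same probability of resembling a target word. The function names, the 5% per-form probability, and the 100-item word list are all hypothetical values chosen only for demonstration.

```python
# Illustrative sketch only: the probabilities used here are hypothetical,
# not values measured for any language pair.

def chance_of_some_match(p_single: float, candidates: int) -> float:
    """Probability that at least one of `candidates` independent forms
    resembles a given target word purely by accident."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if candidates < 0:
        raise ValueError("candidates must be non-negative")
    # Complement of "none of the candidates matches".
    return 1.0 - (1.0 - p_single) ** candidates


def expected_accidental_matches(p_single: float, candidates: int, meanings: int) -> float:
    """Expected number of meanings (out of `meanings`) for which at least
    one accidental look-alike turns up."""
    return meanings * chance_of_some_match(p_single, candidates)


if __name__ == "__main__":
    P_SINGLE = 0.05   # assumed chance that two unrelated forms look alike
    MEANINGS = 100    # assumed size of the compared word list
    for candidates in (1, 3, 10):
        expected = expected_accidental_matches(P_SINGLE, candidates, MEANINGS)
        print(f"{candidates:>2} candidate(s) per meaning: "
              f"about {expected:.1f} accidental matches")
```

Under these assumed figures, allowing ten semantically loose candidates per meaning instead of one raises the expected number of accidental look-alikes from 5 to about 40 in a 100-item list, which is why widening the search pool or relaxing the semantics weakens a proposal rather than strengthening it.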

Some Examples of Long-Range Proposals

It will be instructive to look briefly at some specific proposals to see why most mainstream historical linguists do not accept these hypotheses. (Space does not permit full evaluation, but references are given for more detail.)

Altaic

The Altaic hypothesis would group Turkic, Mongolian, and Tungusic; some versions also include Korean and Japanese. While ‘Altaic’ is repeated in encyclopedias, most leading ‘Altaicists’ have abandoned the hypothesis. The most serious problems are the extensive borrowing among the ‘Altaic’ languages, lack of convincing cognates, lack of basic vocabulary, extensive areal diffusion, problems with the putative sound correspondences, and reliance on typologically commonplace traits. The shared ‘Altaic’ traits include vowel harmony, relatively simple phoneme inventories, agglutination, suffixing, (S)OV word order (and postpositions), no verb ‘to have’ for possession, no articles or gender, and nonmain clauses in nonfinite (participial) constructions. However, these shared features are commonplace typological traits, and thus are not good evidence of genetic relationship because they can easily develop independently in unrelated languages. These ‘Altaic’ features are also areal traits, shared by a number of languages in surrounding regions, and thus perhaps due to diffusion. Similarities in the first and second person pronoun paradigms have impressed proponents of Altaic, although critics point out that pronouns are borrowed far more frequently than proponents acknowledge and that pronoun patterns of the type cited for Altaic are neither unusual nor unexpected cross-linguistically. In short, the evidence for genetic relationship has not been persuasive, which explains why so many reject the ‘Altaic’ hypothesis (Campbell and Poser, in press).

Nostratic

The Nostratic hypothesis as advanced in the 1960s by Illich-Svitych would group Indo-European, Uralic, Altaic, Kartvelian, Dravidian, and Hamito-Semitic [later Afroasiatic], though other versions of the hypothesis would include various other languages. The sheer number of languages and many proposed cognates involved might make it seem difficult to evaluate Nostratic. Nevertheless, assessment is possible. With respect to the many putative cognate sets, assessment can concentrate on those cases considered the strongest by proponents of Nostratic (those of Dolgopolsky, 1986 and Kaiser and Shevoroshkin, 1988). Campbell (1998) shows that these strongest cases do not hold up and that the weaker sets are not persuasive (see below). We can easily determine to what extent the proposed reconstructions correspond to typological expectations, whether the proposed cognates are permissive in semantic associations, and when onomatopoeia, forms too short to rule out chance, nursery forms, and the like are involved. Illich-Svitych’s version of Nostratic exhibits the following methodological problems. (See Campbell, 1998, 1999 for details.)

1. Descriptive forms. Illich-Svitych is forthright in labeling 26 of his 378 forms as ‘descriptive,’ meaning onomatopoetic, affective, or sound-symbolic, i.e., 7% of the total. There are 16 additional onomatopoetic, affective, or sound-symbolic forms, not so labeled, for a total of 42 forms, or approximately 11%.
2. Questionable cognates. Illich-Svitych himself indicates that 57 of the 378 sets are questionable (15%), signaled with a question mark. However, this number should be greatly increased, since in numerous forms Illich-Svitych signals problems in other ways, with slanted lines (/ /) for things not conforming to expectation, with question marks, and with upper-case letters in reconstructions to indicate uncertainties or ambiguities.
3. Sets with only two families represented. One of Illich-Svitych’s criteria was that only cognate sets with representatives from at least three of the six ‘Nostratic’ families would be considered as supportive. Nevertheless, 134 of the 378 sets involve forms from only two families (35%), questionable by Illich-Svitych’s own criteria.
4. Noncorresponding sound correspondences. Frequently, the forms presented as evidence of Nostratic do not exhibit the proposed sound correspondences, i.e., they have sounds at odds with those that would be required according to the Nostratic correspondence sets. Campbell (1998), looking mainly at stops and only at the Indo-European and Uralic data, found 25 sets that did not follow the proposed Nostratic correspondences. There is another way in which Illich-Svitych’s putative sound correspondences are not consistent with the standard comparative method. Several of the putative Nostratic sounds are not reflected by regular sound correspondences in the languages. For example, “in Kartv[elian] and Indo-European, the reflexes of Nostratic [**]p are found to be unstable” (Illich-Svitych, 1990: 168). Nostratic forms beginning in **p reveal that both the Indo-European and the Kartvelian forms arbitrarily begin with either *p or *b, but this is not regular sound change and is not sanctioned by the comparative method. Similarly, glottalization in Afroasiatic is said to occur “sporadically under other conditions still not clear” (Illich-Svitych, 1990: 168). In the correspondence sets, several of the languages are listed with multiple reflexes of a single Nostratic sound, but with no explanation of conditions under which the distinct reflexes might appear.
5. Short forms. Of Illich-Svitych’s 378 forms, 57 (15%) involve short forms (CV, VC, C, or V), incapable of ruling out chance as an alternative explanation.
6. Semantically nonequivalent forms. Some 55 cases (16%) involve comparisons of forms in the different languages that are fairly distinct semantically.
7. Diffused forms. Given the history of central Eurasia, with much language contact, it is not at all surprising that some forms turn out to be borrowed. Several of the Nostratic cognates have words which have been identified by others as loans, including: ‘sister-in-law’, ‘water’, ‘do’, ‘give’, ‘carry’, ‘lead’, ‘to do’/‘put’, ‘husband’s sister’, to which we can add the following as probable or possible loans: ‘conifer, branch, point’, ‘thorn’; ‘poplar’; ‘practice witchcraft’; ‘deer’; ‘vessel’; ‘birch’; ‘bird cherry’; ‘honey’, ‘mead’; ‘poplar’.
8. Typological problems. Nostratic as traditionally reconstructed is typologically flawed. Counter to expectations, few Nostratic roots contain two voiceless stops; glottalized stops are considerably more frequent than their plain counterparts; Nostratic affricates change to a cluster of fricative + stop in Indo-European.
9. Evaluation of the strongest lexical sets. An examination of the Nostratic sets held by proponents to be the strongest reveals serious problems with most. These include Dolgopolsky’s (1986) 15 most stable lexemes. Most are questionable in one way or another according to the standard criteria for assessing proposals of remote linguistic kinship. In the Nostratic sets representing Dolgopolsky’s 15 most stable glosses, four have problems with phonological correspondences; five involve excessive semantic difference among the putative cognates; four have representatives in only two of the putative Nostratic families; two involve problems of morphological analysis; Illich-Svitych himself listed one as doubtful; and finally, one reflects the tendency to rely too heavily on Finnish when not supported by the historical evidence. All but two are challenged, and for these two the relevant forms needed for evaluation are not present. (See Campbell, 1998 for details.)

These ‘strong’ cases are certainly not sufficiently robust to encourage faith in the proposed genetic relationship. Once again, it is for reasons of this sort that most historical linguists reject Nostratic.

Amerind

Greenberg (1987) proposed that all Native American languages, except Na-Dene and Eskimo-Aleut languages, belong to a single macro-family, Amerind, based on multilateral comparison (see above). Amerind is rejected by virtually all specialists in Native American languages and by the vast majority of historical linguists. Specialists maintain that valid methods do not at present permit reduction of Native American languages to fewer than about 150 independent language families and isolates. Amerind has been heavily criticized on various grounds. There are exceedingly many errors in Greenberg’s data: “the number of erroneous forms probably exceeds that of the correct forms” (Adelaar, 1989: 253). Where Greenberg stops – after assembling superficial similarities and declaring them due to common ancestry – is where other linguists begin. Since such similarities can be due to chance, borrowing, onomatopoeia, sound symbolism, nursery words (the mama, papa, nana, dada, caca sort), misanalysis, and much more, for a plausible proposal of remote linguistic relationship, one must attempt to eliminate all other possible explanations, leaving a shared common ancestor as the most likely. Greenberg made no attempt to eliminate these other explanations, and the similarities he amassed appear to be due mostly to accident and a combination of these other factors: “I find no evidence whatsoever that [Greenberg’s] putative cognate sets . . . represent anything other than chance similarities” (Ringe, 1996: 152). In various instances, Greenberg compared arbitrary segments of words, equated words with very different meanings (for example, ‘excrement/night/grass’), misidentified many languages, failed to analyze the morphology of some words and falsely analyzed that of others, neglected regular sound correspondences, failed to eliminate loanwords, and misinterpreted well-established findings. The Amerind ‘etymologies’ proposed are often limited to a very few languages of the many involved. (For details and examples, see Adelaar, 1989; Berman, 1992; Campbell, 1988, 1997a; Kimball, 1992; McMahon and McMahon, 1995; Poser, 1992; Rankin, 1992; Ringe, 1992, 1996.) Finnish, Japanese, Basque, and other randomly chosen languages fit Greenberg’s Amerind data as well as or better than any of the American Indian languages do. Greenberg’s method has proven incapable of distinguishing implausible relationships from Amerind generally (Campbell, 1988, 1997b). In short, it is with good reason that Amerind has been rejected.

Bibliography

Adelaar W F H (1989). ‘Review of Language in the Americas, by Joseph H. Greenberg.’ Lingua 78, 249–255.
Berman H (1992). ‘A comment on the Yurok and Kalapuya data in Greenberg’s Language in the Americas.’ International Journal of American Linguistics 58, 230–233.
Brown C H & Witkowski S R (1979). ‘Aspects of the phonological history of Mayan-Zoquean.’ International Journal of American Linguistics 45, 34–47.
Campbell L (1988). ‘Review of Language in the Americas, by Joseph H. Greenberg.’ Language 64, 591–615.
Campbell L (1994). ‘The American Indian classification controversy: an insider’s view.’ Mother Tongue 23, 41–55.
Campbell L (1997a). American Indian languages: the historical linguistics of Native America. Oxford: Oxford University Press.
Campbell L (1997b). ‘Genetic classification, typology, areal linguistics, language endangerment, and languages of the north Pacific rim.’ In Miyaoka et al. (eds.) Languages of the North Pacific Rim, vol. 2. Kyoto: Kyoto University. 179–242.
Campbell L (1998). ‘Nostratic: a personal assessment.’ In Joseph B & Salmons J (eds.) Nostratic: sifting the evidence. Amsterdam: John Benjamins. 107–152.
Campbell L (1999). ‘Nostratic and linguistic palaeontology in methodological perspective.’ In Renfrew C & Nettle D (eds.) Nostratic: evaluating a linguistic macrofamily. Cambridge: The McDonald Institute for Archaeological Research. 179–230.
Campbell L (2003). ‘How to show languages are related: methods for distant genetic relationship.’ In Joseph B & Janda R (eds.) Handbook of historical linguistics. Oxford: Blackwell. 262–282.
Campbell L & Poser W (in press). How to show languages are related.
Dixon R M W (2002). Australian languages. Cambridge: Cambridge University Press.
Dolgopolsky A (1986). ‘A probabilistic hypothesis concerning the oldest relationships among the language families in northern Eurasia.’ In Shevoroshkin V V & Markey T L (eds.) Typology, relationship, and time: a collection of papers on language change and relationship by Soviet linguists. Ann Arbor: Karoma Publishers. 27–50.
Fleming H C (1987). ‘Review article: towards a definitive classification of the world’s languages (review of A guide to the world’s languages, by Merritt Ruhlen).’ Diachronica 4, 159–223.
Greenberg J H (1987). Language in the Americas. Stanford: Stanford University Press.
Hoenigswald H M (1990). ‘Is the “comparative” method general or family specific?’ In Baldi P (ed.) Linguistic change and reconstruction methodology. Berlin: Mouton de Gruyter. 375–383.
Illich-Svitych V M (1990). ‘The Nostratic reconstructions of V. Illich-Svitych, translated and arranged by Mark Kaiser.’ In Shevoroshkin V (ed.) Proto-languages and proto-cultures: materials from the first international interdisciplinary symposium on language and prehistory. Bochum: Brockmeyer. 138–167.
Jakobson R (1960). ‘Why “mama” and “papa”?’ In Kaplan B & Wapner S (eds.) Perspectives in psychological theory. New York: International Universities Press. 21–29. (Reprinted 1962 in Roman Jakobson selected writings, vol. 1: phonological studies. The Hague: Mouton. 538–545.)
Janhunen J (1989). ‘Any chances for long-range comparisons in North Asia?’ Mother Tongue 6, 28–30.
Kaiser M & Shevoroshkin V (1988). ‘Nostratic.’ Annual Review of Anthropology 17, 309–330.
Kimball G (1992). ‘A critique of Muskogean, “Gulf,” and Yukian material in Language in the Americas.’ International Journal of American Linguistics 58, 447–501.
McMahon A & McMahon R (1995). ‘Linguistics, genetics and archaeology: internal and external evidence in the Amerind controversy.’ Transactions of the Philological Society 93, 125–225.
Poser W J (1992). ‘The Salinan and Yurumanguí data in Language in the Americas.’ International Journal of American Linguistics 58, 202–229.
Rankin R L (1992). ‘Review of Language in the Americas, by Joseph Greenberg.’ International Journal of American Linguistics 58, 324–350.
Ringe D A Jr (1992). ‘On calculating the factor of chance in language comparison.’ Transactions of the American Philosophical Society 82(1), 1–110.
Ringe D A Jr (1996). ‘The mathematics of “Amerind.”’ Diachronica 13, 135–154.
Ruhlen M (1994). On the origin of languages: studies in linguistic taxonomy. Stanford: Stanford University Press.
Shevoroshkin V (1989). ‘A symposium on the deep reconstruction of languages and cultures.’ In Shevoroshkin V (ed.) Reconstructing languages and cultures: materials from the first international interdisciplinary symposium on language and prehistory. Bochum: Brockmeyer. 6–8.
Teeter K V (1964). ‘Algonquian languages and genetic relationship.’ In Proceedings of the 9th International Congress of Linguists. The Hague: Mouton. 1026–1033.

Loth, Joseph (1847–1934)
P-Y Lambert, The Sorbonne, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Born in 1847 near Guéméné-sur-Scorff, Loth was a native speaker of Breton Vannetais. After he obtained his Baccalauréat in Vannes ‘Grand-Séminaire,’ he chose to remain a layman and became a teacher. Some time after the war of 1870, he came to Paris to resume his studies. After earning a diploma at the École des Hautes Études and writing a thesis ‘ès lettres’ (1884), he was given the post of Lecturer in Greek in Rennes University – where he soon became a professor (1889) and then dean of the arts faculty. In 1886 he founded a scholarly journal, Annales de Bretagne, and he lectured on Celtic languages from 1903 on. He succeeded D’Arbois de Jubainville as Professor of Celtic in the Collège de France (1910–1930). He was also a member of the Académie des Inscriptions et Belles-Lettres, and director of the journal Revue Celtique when he died in 1934 at the age of 87. His life was extremely productive; he left an important corpus of work and had a formative influence on many pupils.

At the École des Hautes Études, his thesis concerned the British (Brittonic) colonization of Brittany from the 5th to the 7th century: his general theory was mainly in agreement with the ideas of modern historians, such as Arthur de La Borderie: the Bretons were immigrants coming from Great Britain into Armorica. His diploma was a dictionary collecting all the evidence provided by Old Welsh, Old Cornish, and Old Breton glosses. Joseph Loth wanted to make known to every Breton the close kinship of their language with that of the Welsh (he himself had a Welsh wife). For that, he had to teach the history of Breton and he published a selection of Breton texts from different periods, the Chrestomathie bretonne. With another book on the Latin borrowings in the Brittonic dialects (1892), he became the foremost authority on the phonetic history of British Celtic. He published a splendid translation of the Welsh prose tales, Mabinogion (1913), and a study of Welsh metrics, particularly on the archaic poem of the Black Book of Carmarthen. He also studied aspects of Middle Welsh morphology and syntax (the use of the verbal particle rhy, for example). Unfortunately, his contribution to Welsh studies also includes a bitter criticism of the Welsh grammar of John Morris-Jones (Bangor), which created some resentment against him among certain scholars.

He had a sound as well as profound knowledge of the living Celtic languages, in addition to his own Bas-Vannetais dialect, specializing particularly in Celtic lexicography and etymology. This scholar of another age, though less productive than Whitley Stokes (see Stokes, Whitley (1830–1909)), gives a genuine insight into the Celtic cultures of the past: his philological studies and his etymologies were a means of reviving the past, by discovering the real meaning of older texts and recovering the lost unity of the Celtic languages. As a linguist, he felt obliged to communicate his results to the historians and the archaeologists: he thoroughly studied such semantic fields as names of cereal crops, horses, and sanctuaries. He took a great interest in the newly found Gaulish inscriptions of Coligny (1904) and La Graufesenque (1924). Towards the end of his life, alas, he was very active in defense of the fake inscriptions of Glozel.

See also: Stokes, Whitley (1830–1909).

Bibliography

Loth J (1884). Vocabulaire vieux-breton . . . contenant toutes les gloses en vieux-breton, gallois, cornique, armoricain, connues. Paris: Vieweg (Reprinted, 1970).
Loth J (1892). Les mots latins dans les langues brittoniques (gallois, armoricain, cornique). Paris: Bouillon.
Loth J (1913). Les Mabinogion et autres romans gallois tirés du Livre rouge de Hergest et du Livre blanc de Rhydderch (2nd edn.) (2 vols). Paris: Fontemoing.
Vendryes J (1934). ‘Obituary of Joseph Loth.’ Revue Celtique 45, i–vi.

Louisiana Creole
T Klingler, Tulane University, New Orleans, LA, USA
© 2006 Elsevier Ltd. All rights reserved.

This French-lexifier creole is spoken by an estimated 10 000–20 000 persons (reliable figures are not available) residing mainly in southwestern Louisiana. Most speakers live along or near the Bayou Teche, especially in the parishes of St. Landry, St. Martin, and Lafayette, but there are also pockets of speakers in several other parishes. Although it is commonly associated with African Americans and Creoles of color, Louisiana Creole (LC) is also the first language of many European Americans. The language has long coexisted with regional varieties of French, often referred to collectively as Cajun, and it is at least in part the continued influence of these varieties that explains why LC is structurally less distant from French than are the French-lexifier creoles of the Caribbean. LC shares a number of important features with Haitian Creole (e.g., the progressive marker ape, the verb gen ‘to have,’ and the possessive particle ke`n/ tche`n), and some linguists maintain that LC had as its origin a creole or pre-creole language imported to Louisiana from the French colony of Saint-Domingue before it became the free republic of Haiti. However, evidence that LC’s development predated the significant population migration from Saint-Domingue to

Louisiana in the early 19th century casts doubt on this claim and strengthens the possibility that LC is indigenous to the region. Today, the future of LC remains uncertain, since most fluent speakers are now elderly and the language is not being passed on to younger generations. LC varies considerably according to region, ethnic group, and social context. The linguistic situation in Louisiana is often said to form a speech continuum, with the type of LC that is furthest removed from French constituting the basilectal pole, and Cajun or, depending on the model used, Referential French constituting the acrolectal pole. Any given utterance, however, may display a greater or lesser quantity of French-like or Creole-like features, such that it may best be assigned to the broad mesolectal range lying between the two poles of the continuum. Like the other French-lexifier creoles, LC features definite articles that are postposed to the noun (tab-la ‘the table,’ chyen-ye ‘the dogs’); a personal pronoun system in which all of the pronouns, regardless of function, are derived from the tonic pronouns of French (1 sg. mo < moi, 2 sg. to < toi, 3 sg. li < lui, etc.); and a verbal system that shows very little inflectional morphology but relies instead on a series of markers placed before the verb to express notions of tense, mood, and aspect. The most important of these are the anterior marker te; the progressive

332 Loth, Joseph (1847–1934)

Welsh metrics, particularly on the archaic poem of the Black Book of Carmarthen. He also studied aspects of Middle Welsh morphology and syntax (the use of the verbal particle rhy, for example). Unfortunately, his contribution to Welsh studies also includes a bitter criticism of the Welsh grammar of John Morris-Jones (Bangor), which created some resentment against him among certain scholars. He had a sound as well as profound knowledge of the living Celtic languages, in addition to his own Bas-Vannetais dialect, specializing particularly in Celtic lexicography and etymology. This scholar of another age, though less productive than Whitley Stokes (see Stokes, Whitley (1830–1909)), gives a genuine insight into the Celtic cultures of the past: his philological studies and his etymologies were a means of reviving the past, by discovering the real meaning of older texts and recovering the lost unity of the Celtic languages. As a linguist, he felt obliged to communicate his results to the historians and the archaeologists: he thoroughly studied such

semantic fields as names of cereal crops, horses, and sanctuaries. He took a great interest in the newly found Gaulish inscriptions of Coligny (1904) and La Graufesenque (1924). Towards the end of his life, alas, he was very active in defense of the fake inscriptions of Glozel. See also: Stokes, Whitley (1830–1909).

Bibliography Loth J (1884). Vocabulaire vieux-breton . . . contenant toutes les gloses en vieux-brteton, gallois, cornique, armoricain, connues. Paris: Vieweg (Reprinted, 1970). Loth J (1892). Les mots latins dans les langues brittoniques (gallois, armoricain, cornique). Paris: Bouillon. Loth J (1913). Les Mabinogion et autres romans gallois tire´s du Livre rouge de Hergest et du Livre blanc de Rhydderch (2nd edn.) (2 vols). Paris: Fontemoing. Vendryes J (1934). ‘Obituary of Joseph Loth.’ Revue Celtique 45, i–vi.

Louisiana Creole T Klingler, Tulane University, New Orleans, LA, USA ! 2006 Elsevier Ltd. All rights reserved.

This French-lexifier creole is spoken by an estimated 10 000–20 000 persons (reliable figures are not available) residing mainly in southwestern Louisiana. Most speakers live along or near the Bayou Teche, especially in the parishes of St. Landry, St. Martin, and Lafayette, but there are also pockets of speakers in several other parishes. Although it is commonly associated with African Americans and Creoles of color, Louisiana Creole (LC) is also the first language of many European Americans. The language has long coexisted with regional varieties of French, often referred to collectively as Cajun, and it is at least in part the continued influence of these varieties that explains why LC is structurally less distant from French than are the French-lexifier creoles of the Caribbean. LC shares a number of important features with Haitian Creole (e.g., the progressive marker ape, the verb gen ‘to have,’ and the possessive particle ke`n/ tche`n), and some linguists maintain that LC had as its origin a creole or pre-creole language imported to Louisiana from the French colony of Saint-Domingue before it became the free republic of Haiti. However, evidence that LC’s development predated the significant population migration from Saint-Domingue to

Louisiana in the early 19th century casts doubt on this claim and strengthens the possibility that LC is indigenous to the region. Today, the future of LC remains uncertain, since most fluent speakers are now elderly and the language is not being passed on to younger generations. LC varies considerably according to region, ethnic group, and social context. The linguistic situation in Louisiana is often said to form a speech continuum, with the type of LC that is furthest removed from French constituting the basilectal pole, and Cajun or, depending on the model used, Referential French constituting the acrolectal pole. Any given utterance, however, may display a greater or lesser quantity of French-like or Creole-like features, such that it may best be assigned to the broad mesolectal range lying between the two poles of the continuum. Like the other French-lexifier creoles, LC features definite articles that are postposed to the noun (tab-la ‘the table,’ chyen-ye ‘the dogs’); a personal pronoun system in which all of the pronouns, regardless of function, are derived from the tonic pronouns of French (1 sg. mo < moi, 2 sg. to < toi, 3 sg. li < lui, etc.); and a verbal system that shows very little inflectional morphology but relies instead on a series of markers placed before the verb to express notions of tense, mood, and aspect. The most important of these are the anterior marker te; the progressive

marker ape (e in Pointe Coupee Parish); the future marker a, va, or ale; and the conditional marker se: Ye te ka lir ‘They could read’; Lavach-la ape kòmanse dòn dule ‘The cow is beginning to give milk’; Vou pa kwa l a chinen? ‘Don’t you think he’ll win?’ See also: Endangered Languages; Language Maintenance and Shift; Language/Dialect Contact; Minorities and Language; United States of America: Language Situation; Variation in Pidgins and Creoles.

Language Maps (Appendix 1): Maps 47, 48.

Bibliography Klingler T A (2003). If I could turn my tongue like that: the creole language of Pointe Coupee Parish, Louisiana. Baton Rouge: Louisiana State University Press. Neumann I (1985). Le créole de Breaux Bridge, Louisiane. Hamburg: Buske.

Lounsbury, Floyd Glenn (1914–1998) W L Chafe, University of California at Santa Barbara, CA, USA © 2006 Elsevier Ltd. All rights reserved.

Floyd Glenn Lounsbury was born in Stevens Point, Wisconsin on April 25, 1914 and died on May 14, 1998 in New Haven, Connecticut. He attended the University of Wisconsin at Madison, majoring in mathematics, and while still an undergraduate he was recruited by the linguist Morris Swadesh to assist in the Oneida Language and Folklore Project funded by the Works Progress Administration (WPA). The goal of that project was to teach young Oneidas to write their language, and then to ask them to record folktales told by older members of the Oneida community. When Swadesh left Wisconsin in 1939, Lounsbury became director of the project. Using an Oneida orthography that he and Swadesh had developed, he oversaw its use in the collection of a variety of Oneida texts, which not only were of value to the Oneida community but also provided Lounsbury with material for an analysis of Oneida phonology and grammar. After WPA funding ended in the summer of 1940, he entered the University of Wisconsin M.A. program and planned to submit a thesis on Oneida phonology. World War II intervened, however, and he was not awarded the M.A. until 1946. During the war Lounsbury served in the U.S. Army Air Force in Brazil as a meteorologist. At the war’s end he turned from meteorology to anthropology, enrolling as a graduate student at Yale University and completing a dissertation on Oneida verb morphology. When he received the Ph.D. in 1949 he was immediately hired by the Yale Department of Anthropology, where he taught from 1949 until his retirement in 1979. During most of that period he was affiliated with the Department of Linguistics as well. He was a Fellow of the Center for Advanced Study in the Behavioral Sciences (1963–1964), and

was on two occasions a Senior Research Scholar at Dumbarton Oaks in Washington (1973–1974 and 1977–1978). He was elected to the National Academy of Sciences in 1969, the American Academy of Arts and Sciences in 1976, and the American Philosophical Society in 1987. In 1971 he received the Wilbur Cross Medal from the Yale Graduate School of Arts and Sciences, the school’s highest honor to one of its alumni. He was awarded an honorary LL.D. by the University of Pennsylvania in 1987, and was chosen to deliver the Distinguished Lecture at the annual meeting of the American Anthropological Association in 1990. A full bibliography of his work is available in Chafe (1998). Lounsbury contributed importantly to three distinct fields of research and teaching: the recording and analysis of Iroquoian languages, the study of kinship systems, and the decipherment of Mayan hieroglyphs. In all three of these areas, in addition to his own substantial contributions, he was able to set much of the agenda that was subsequently followed by other researchers, while he recruited a succession of new scholars who continued and refined his work. The extent of Lounsbury’s influence on Iroquoian linguistic scholarship is difficult to exaggerate. When his dissertation was published in 1953 under the title Oneida verb morphology, it became the standard on which future studies of Iroquoian languages were based. In it he described the intricate structure of Oneida verbs in a way that had never previously been done for any Iroquoian language, setting forth, for example, the entire inventory of pronominal prefixes and their alternating forms, accounting for aspectual distinctions, and illuminating numerous other areas of Iroquoian linguistic structure. He himself worked to a greater or lesser extent with most of the extant languages in that family, not only with Oneida but also extensively with Mohawk, Cayuga, Tuscarora, and Cherokee. He had a profound

influence on the career paths of a number of others whose work with Iroquoian languages followed his example. Although he had been trained in the tradition of American structural linguistics established by Leonard Bloomfield and others, Lounsbury had a special interest in semantics, an area that had been neglected within that tradition. He recognized the potential offered by the kinship vocabularies of diverse languages as a way of extending structuralist methods to meanings. In an elegant fashion he analyzed kinship systems in terms of semantic ‘components,’ showing that a small number of components and rules for their combination could account for a variety of structurally different systems. Between 1956 and 1968 he published five seminal articles on kinship analysis involving the Pawnee, Seneca, Crow, and Omaha (Omaha-Ponca) languages, the language of the Trobriand Islands, and early Latin. His writings attracted widespread attention, as linguists and anthropologists saw in them evidence that at least some areas of meaning could be as systematically analyzed as phonology and grammar. It would be irresponsible today to undertake a study of a kinship system without taking Lounsbury’s work into account. By the 1950s Lounsbury had developed an interest in Mayan hieroglyphs, and he gradually devoted more and more time to them. The nature of Mayan writing was poorly understood when he began, and he became a principal advocate of phonetic interpretation of the glyphs, establishing himself as a pioneer in Mayan decipherment. During the 1970s and 1980s his publications were instrumental in converting Mayan hieroglyph studies into a major field of research. 
He was able to set forth clear principles for the discovery and demonstration of the values of the glyphs, but even while the principles of phonological representation were being established and a significant selection of Mayan texts being read, he turned his attention to grammatical analysis of those texts, demonstrating the dependence of hieroglyph studies

on a detailed understanding of Mayan linguistics. Most Mayan decipherment was based on the analysis of calendrical patterns of a sort that appealed to his mathematical bent. He continued to work on Mayan mathematics throughout the 1980s, extending his work to the investigation of astronomical knowledge, its calendrical structure, and cultural uses reflected in hieroglyphic texts. After an extended period during which he devoted much of his time first to kinship and then to Mayan hieroglyphs, in the 1990s Lounsbury returned to Iroquoian research, working on, among other things, the interpretation of an important Oneida text that he had recorded earlier. In the last years of his life he continued to participate actively in conferences on both Iroquoian and Mayan topics. See also: Iroquoian Languages; Mayan Languages; Mesoamerica: Scripts; Swadesh, Morris (1909–1967).

Bibliography Campisi J & Hauptman L M (1981). ‘Talking back: The Oneida language and folklore project, 1938–1941.’ Proceedings of the American Philosophical Society 125, 441–448. Chafe W (1998). ‘Floyd Glenn Lounsbury.’ Newsletter of the Society for the Study of the Indigenous Languages of the Americas 17(2), 2–4. Chafe W & Justeson J S (1999). ‘Floyd Glenn Lounsbury.’ Language 75, 563–566. Lounsbury F G (1953). Oneida verb morphology. Yale University Publications in Anthropology, vol. 48. New Haven: Yale University Press. Lounsbury F G (1956). ‘A semantic analysis of the Pawnee kinship usage.’ Language 32, 158–194. Lounsbury F G (1978). ‘Maya numeration, computation, and calendrical astronomy.’ In Gillispie C C (ed.) Dictionary of scientific biography, vol. 15. New York: Charles Scribner’s Sons. 759–818. Lounsbury F G & Gick B (2000). The Oneida creation story, as told by Demus Elm and Harvey Antone. Lincoln: University of Nebraska Press.

Lowth, Robert (1710–1787) K Navest, Leiden, The Netherlands © 2006 Elsevier Ltd. All rights reserved.

It is for his Short introduction to English grammar, published anonymously in 1762 by Andrew Millar and Robert and James Dodsley, that Lowth is best known among linguists today. It is a normative

grammar, and Lowth’s approach is evident in the numerous footnotes in which he quotes “the best Authors” breaking the rules he is describing. Because of its popularity, Lowth’s grammar was reprinted at least 30 times during the author’s lifetime and 17 times after his death. It was even translated into German.

Lowth was born in Winchester on November 27, 1710. In 1722 he was admitted as a scholar at Winchester College. In 1729 Lowth entered New College, Oxford, where he was Professor of Poetry from 1741 until 1752. In 1752, Lowth married Mary Jackson. They had seven children, of whom only two outlived their father. In 1753, Lowth’s Latin lectures were published as De sacra poesi Hebraeorum praelectiones. Five years later, Lowth published The life of William of Wykeham, a biography of the founder of Winchester College and New College. Lowth’s knowledge of Hebrew led him to translate the book of Isaiah in 1778. In 1766, Lowth became Bishop of St. David’s, and only a few months later Bishop of Oxford. In 1777, he was made Bishop of London, which he remained until his death. Lowth originally wrote his grammar for his eldest son Thomas Henry (1753–1778), because he felt that there was a need for an English grammar that could prepare children for the study of Latin. A short introduction to English grammar contains several references to Lowth’s son in the form of examples such as “Thomas’s book” and “Thomas is loved by me.” Lowth appears to have begun the grammar in the autumn of 1757 when his son was only 4 years old. When he told his friend, the Chancellor of the Exchequer, Henry Bilson Legge (1708–1764), about his project, Legge asked for a copy of the grammar for his own son, leading Lowth to publish his grammar. Lowth may have intended his grammar as a children’s grammar, but many people considered it too difficult for young learners. Daniel Fenning (1771) admired

Lowth’s section on syntax in which Lowth divides the sentence into twelve types of ‘phrase,’ but thought that the grammar was more suitable “for Men of Letters than for Youth at school.” It was probably Lowth’s publisher Robert Dodsley (1703–1764), who had also been behind the publication of Johnson’s Dictionary (1755), who decided to make Lowth’s grammar available to a wider public.

See also: Johnson, Samuel (1709–1784).

Bibliography Alston R C (1965). A bibliography of the English language from the invention of printing to the year 1800, vol. 1: English grammars written in English. Leeds: E. J. Arnold and Son. Hepworth B (1978). Robert Lowth. Boston: Twayne Publishers. Lowth R (1762). A short introduction to English grammar. London: J. Hughs for A. Millar, and for R. & J. Dodsley (Facs.-repr. with a notice by Robert C. Alston, Menston: The Scholar Press, 1968.). Stammerjohann H (ed.) (1996). Lexicon grammaticorum. Tübingen: Niemeyer. Tieken-Boon van Ostade I (2000). ‘Robert Dodsley and the genesis of Lowth’s Short introduction to English grammar.’ Historiographia Linguistica 27, 21–36. Tieken-Boon van Ostade I (2003). ‘“Tom’s grammar”: the genesis of Lowth’s Short introduction to English grammar revisited.’ In Francis Austin & Chris Stray (eds.). Festschrift for Ian Michael. 36–45.

Lu Fa-yan (fl. 600 A.D.) B-C Kwok, Hong Kong, China © 2006 Elsevier Ltd. All rights reserved.

Lu Fayan is generally remembered as the compiler of the Chinese rhyme dictionary Qieyun. We do not know much about his life. The biography of his father Lu Shuan (ca. 539–591) is included in Sui Shu (The book of the Sui dynasty), in which we learn that the Lu family was native to Linzhang in northern China. Qieyun was revised and extended by scholars of the Tang (618–907) and the Song (960–1279) dynasties, and eventually several versions were produced. Of these, Guangyun (dated 1007–1008) has gained greatest popularity. We generally regard the phonological system reflected in these dictionaries as Middle Chinese or Ancient Chinese.

The original version of Qieyun is no longer extant, but its preface, written by Lu himself in 601, is still available. That text, slightly more than 400 words long, describes the background of the compilation and the peculiarities of the dialects of the time; more importantly, it offers some hints about the nature of Qieyun. Speaking of the meeting with eight colleagues 20 years earlier that had led to the writing of the book, Lu wrote, “We discussed the right and the wrong of South and North, and the prevailing and the obsolete of past and present” (adapted from Malmqvist, 1968). By consulting various contemporary documents, especially the Yanshi jiaxun (The family instructions of the Yan clan), we are confident in claiming that the terms ‘South’ and ‘North’ that appear in the preface refer to the cities of Jinling (today Nanjing) and

Luoyang, respectively. The phonological systems of the dialects of these two cities were important bases for Lu to establish the framework of Qieyun. It is noteworthy that Lu had mentioned the names of the eight other people involved in the earlier discussions of the rhyme dictionary. Three of them came from Jinling, and the remainder from Ye, a city not far away from Luoyang. Lu also placed great importance on earlier sources. It is evident that Lu tried to integrate these materials into his system, while rhyming distinctions in any one of them were mostly maintained. Obviously, Qieyun is composite in nature rather than simply a rhyme dictionary based on a single dialect at a certain time. Those who wish to reconstruct the phonetic values of Qieyun should first identify its heterogeneous elements. The aim of the compilation of Qieyun, according to the preface, was to promote poetry writing. The language embodied in the dictionary, therefore, cannot be treated as representative of real speech. Lu’s main concern was with rhyming practice, not linguistic analysis of the materials he was dealing with. On the other hand, there is not even a single word telling us that Lu had an intention to establish standards of pronunciation through the compilation of the book. This point is often misinterpreted by modern scholars.

By and large, Qieyun is of paramount importance for understanding the development of the Chinese languages: it bridges the gap between Old Chinese and Modern Chinese. This is Lu’s contribution to a better knowledge of Ancient Chinese phonology. See also: Chinese; Wang Li (1900–1986).

Bibliography Baxter W H (1992). A handbook of Old Chinese phonology. Berlin: Mouton de Gruyter. Coblin W S (1996). ‘Marginalia on two translations of the Qieyun preface.’ Journal of Chinese Linguistics 24(1), 85–97. Malmqvist G (1968). ‘Chou Tsu-mo on the Ch’ieh-yün.’ Bulletin of the Museum of Far Eastern Antiquities 40, 33–78. Norman J (1988). Chinese. Cambridge: Cambridge University Press. Ting Pang-hsin (1995). Chongjian hanyu zhonggu yinxi de yixie xiangfa. Zhongguo Yuwen 6, 414–419. Zhou Z (1966). ‘Qieyun de xingzhi he tade yinxi jichu.’ In Zhou Z (ed.) Wenxue ji, vol. 2. Beijing: Zhonghua shuju. 434–473.

Luganda F Katamba, Lancaster University, Lancaster, UK © 2006 Elsevier Ltd. All rights reserved.

Location and Genetic Affiliation Luganda (Ganda), a Bantu language of Uganda, is the mother tongue of 3 015 980 speakers (a little more than 16% of the population of Uganda); with an additional 1 million second language speakers, Luganda is the most widely spoken language in Uganda. It

belongs to the Narrow Bantu subgroup of the Bantu sub-branch of the Benue-Congo branch of Niger-Congo. It is classified as Zone J15 in Guthrie’s classification system for Bantu.

Basic Phonology and Orthography The Luganda orthography is essentially phonemic. The consonants and vowel phonemes are listed in Tables 1 and 2, respectively. IPA symbols corresponding to the

Table 1 Consonants

Consonant        Bilabial   Labiodental   Alveolar   Palatal    Velar      Labiovelar
Stops
  Voiceless      p                        t          c          k
  Voiced         b                        d          j [ɟ]      g
Fricatives
  Voiceless                 f             s
  Voiced                    v             z
Nasals           m                        n          ny [ɲ]     ng [ŋ]
Approximants                              lᵃ         y [j]                 w

ᵃ Though [l] and [r] are allophones of the phoneme /l/, they are represented by separate letters in the orthography. The letter ‘r’ is used after front vowels and ‘l’ is used elsewhere.

336 Lu Fa-yan (fl. 600 A.D.)

Luoyang, respectively. The phonological systems of the dialects of these two cities were important bases for Lu to establish the framework of Qieyun. It is noteworthy that Lu had mentioned the names of the eight other people involved in the earlier discussions of the rhyme dictionary. Three of them came from Jinling, and the remainder from Ye, a city not far away from Luoyang. Lu also placed great importance on earlier sources. It is evident that Lu tried to integrate these materials into his system, while rhyming distinctions in any one of them were mostly maintained. Obviously, Qieyun is composite in nature rather than simply a rhyme dictionary based on a single dialect at a certain time. Those who wish to reconstruct the phonetic values of Qieyun should first identify its heterogeneous elements. The aim of the compilation of Qieyun, according to the preface, was to promote poetry writing. The language embodied in the dictionary, therefore, cannot be treated as representative of real speech. Lu’s main concern was with rhyming practice, not linguistic analysis of the materials he was dealing with. On the other hand, there is not even a single word telling us that Lu had an intention to establish standards of pronunciation through the compilation of the book. This point is often misinterpreted by modern scholars.

By and large, in understanding the development of Chinese languages, Qieyun is of paramount importance: it bridged the gap between Old Chinese and the Modern Chinese. It is Lu’s contribution to better knowledge of Ancient Chinese phonology. See also: Chinese; Wang Li (1900–1986).

Bibliography Baxter W H (1992). A handbook of Old Chinese phonology. Berlin: Mouton de Gruyter. Coblin W S (1996). ‘Marginalia on two translations of the Qieyun preface.’ Journal of Chinese Linguistics 24(1), 85–97. Malmqvist G (1968). ‘Chou Tsu-mo on the Ch’ieh-yu¨n.’ Bulletin of the Museum of Far Eastern Antiquities 40, 33–78. Norman J (1988). Chinese. Cambridge: Cambridge University Press. Ting Pang-hsin (1995). Chongjian hanyu zhonggu yinxi de yixie xiangfa. Zhongguo Yuwen 6, 414–419. Zhou Z (1966). ‘Qieyun de xingzhi he tade yinxi jichu.’ In Zhou Z (ed.) Wenxue ji, vol. 2. Beijing: Zhonghua shuju. 434–473.

Luganda F Katamba, Lancaster University, Lancaster, UK ! 2006 Elsevier Ltd. All rights reserved.

Location and Genetic Affiliation Luganda (Ganda), a Bantu language of Uganda, is the mother tongue of 3 015 980 speakers (a little more than 16% of the population of Uganda); with an additional 1 million second language speakers, Luganda is the most widely spoken language in Uganda. It

belongs to the Narrow Bantu subgroup of the Bantu sub-branch of the Benue-Congo branch of NigerCongo. It is classified as Zone J15 in Guthrie’s classification system for Bantu.

Basic Phonology and Orthography The Luganda orthography is essentially phonemic. The consonants and vowel phonemes are listed in Tables 1 and 2, respectively. IPA symbols corresponding to the

Table 1 Consonants Consonant

Stops Voiceless Voiced Fricatives Voiceless Voiced Nasals Approximants

Bilabial

Labiodental

p b f v m

Alveolar

Palatal

Velar

t d

c j [ð]

k g

s z n la

ny [ J] y (j)

ng [N]

Labiovelar

w

a Though [l] and [r] are allophones of the phoneme /l/, they are represented by separate letters in the orthography. The letter ‘r’ is used after front vowels and ‘l’ is used elsewhere.

Luganda 337

standard letters are shown in brackets for palatal and velar consonants. As seen in Example (1), gemination (indicated by double letters) is phonemic for both consonants (C) and vowels (V). The typical syllable is CV or CVV. The only consonant clusters allowed are NC (i.e., nasal þ consonant), as in nkola, and for consonant plus glide, as in mukwano ‘friendship’: (1) ogula nkola

‘we buy’ ‘I work’

oggula nkoola

‘you open’ ‘I weed’

(L L L L L) (L H L) (L H H) (L HL)

bira girl ‘this big forest’ PP

ki-

CP7

‘comb’ ‘to become cool’ ‘to lend money’ ‘gourds’

kiCP7

no this

e-

ki-

PP

CP7

nene big

As seen in Example (6), verbs, compared to nouns, are morphologically more complex (NEG, negation; SM, subject marker; TM, tense marker; FUT, future; OM, object marker; ES, extension suffix; APPL, applicative; FV, final vowel): (6) te- ba- li-

Tone is phonemic. However, it is not marked in the standard orthography, and that convention is followed here except when tone is being discussed. On the surface, there is a contrast between low (L) tone, which is unmarked, high (H) (") tone, and falling (HL) (^) tone. There are no rising tones. (2) ki sa ni ri zo ku wo´ la ku wo´ la´ bi ta´ a`

(5b) e-

deet- era bring ES (APPL) FV not they future it me bring for FV ‘They will not bring it for me.’ NEG SM

ki- n-

TM(FUT) OM1 OM2

There is also an elaborate tense/aspect system, with distinct negative and positive paradigms: (7) Tense/aspect Past: Near past: Immediate past: Present: Near future: General future:

Positive nnalaba nnalabye ndabye ndaba nnaasoma ndiraba

Negative saalaba saalabye sirabye siraba siisome siriraba

Basic Morphology Luganda has rich, agglutinative morphology. The typical noun has the following structure (PP ¼ preprefix; CP ¼ class prefix): (3)

PP

CP

ROOT

o-

mu-

wala

‘girl’

The 21 noun classes, each one of them marked by a prefix, are normally paired for singular and plural, as in Example (4): (4) Singular:

Plural:

CLASS

PP

CP

STEM

1 7 2 8

oeae-

mukibabi-

wala bira wala bira

‘girl’ ‘forest’ ‘girls’ ‘forests’

The noun class numbering system is standardized for all Bantu languages. There is concord in a noun phrase between a noun and any dependent adjectives and determiners: (5a) o-

mu- wala CP1 girl ‘this big girl’ PP

oCP1

no this

oPP

muCP1

nene big

Table 2 Vowels Vowel

Front

High Mid Low

i e

Central

Back and round

u o a
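The slot structure of the verb in Example (6) can be sketched in code. The following Python snippet is only an illustration, not part of the original description: the slot labels and morphemes are those of Example (6), while the names SLOTS, EXAMPLE_6, and build_verb are hypothetical. It joins morphemes with hyphens and does not model any phonological adjustment at morpheme boundaries.

```python
# Illustrative sketch: slot order and morphemes are taken from Example (6);
# SLOTS, EXAMPLE_6 and build_verb are hypothetical names.
SLOTS = ["NEG", "SM", "TM", "OM1", "OM2", "ROOT", "ES", "FV"]

EXAMPLE_6 = {
    "NEG": ("te", "not"),
    "SM": ("ba", "they"),
    "TM": ("li", "future"),
    "OM1": ("ki", "it"),
    "OM2": ("n", "me"),
    "ROOT": ("deet", "bring"),
    "ES": ("er", "for"),
    "FV": ("a", "FV"),
}


def build_verb(morphemes):
    """Return (segmented form, slot labels, gloss) for the filled slots,
    always in the fixed left-to-right slot order."""
    filled = [slot for slot in SLOTS if slot in morphemes]
    form = "-".join(morphemes[slot][0] for slot in filled)
    labels = "-".join(filled)
    gloss = "-".join(morphemes[slot][1] for slot in filled)
    return form, labels, gloss


form, labels, gloss = build_verb(EXAMPLE_6)
print(form)    # te-ba-li-ki-n-deet-er-a
print(labels)  # NEG-SM-TM-OM1-OM2-ROOT-ES-FV
print(gloss)   # not-they-future-it-me-bring-for-FV
```

Because the slot order is fixed in SLOTS, the order in which morphemes are supplied does not matter; a dictionary that omits a slot simply yields a shorter segmented form.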

Basic Syntax

The basic word order in unmarked declarative sentences is subject-verb-object:

(8a) a-ba-wala     ba-a-lab-a        e-m-bwa
     PP-CP2-girl   CP2-PAST-see-FV   PP-CP9-dog
     ‘The girls saw the dog.’

(8b) e-m-bwa      y-a-lab-a         a-ba-wala
     PP-CP9-dog   CP9-PAST-see-FV   PP-CP2-girl
     ‘The dog saw the girls.’
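The agreement between the subject noun and the verb prefix in Examples (8a) and (8b) can be sketched as a small lookup. This Python snippet is a minimal illustration under stated assumptions: the segmented forms are copied from the two examples, while the names NOUNS, SUBJECT_PREFIX, and svo_clause are hypothetical and cover only the two noun classes attested there.

```python
# Sketch only: forms are taken from Examples (8a) and (8b); the names
# NOUNS, SUBJECT_PREFIX and svo_clause are hypothetical.
NOUNS = {
    "girls": {"form": "a-ba-wala", "noun_class": 2},
    "dog": {"form": "e-m-bwa", "noun_class": 9},
}

# Verb prefix that agrees with the class of the subject noun
# (glossed CP2 and CP9 in the examples).
SUBJECT_PREFIX = {2: "ba", 9: "y"}


def svo_clause(subject, obj, tense="a", root="lab", final_vowel="a"):
    """Build a hyphen-segmented subject-verb-object clause."""
    subj = NOUNS[subject]
    verb = "-".join([SUBJECT_PREFIX[subj["noun_class"]], tense, root, final_vowel])
    return " ".join([subj["form"], verb, NOUNS[obj]["form"]])


print(svo_clause("girls", "dog"))  # a-ba-wala ba-a-lab-a e-m-bwa
print(svo_clause("dog", "girls"))  # e-m-bwa y-a-lab-a a-ba-wala
```

Swapping the subject and object changes both the word order and the verb prefix, which is the pattern the two examples contrast.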

Typically, the head precedes its modifier: e.g., in noun phrases, nouns precede determiners and adjectives (see Examples (5a) and (5b)).

See also: Bantu Languages; Tone: Phonology; Uganda: Language Situation.

Bibliography

Ashton E O, Muliira E M K, Ndawula E G M & Tucker A N (1954). A Luganda grammar. London: Longmans, Green and Co.
Cole D T (1967). Some features of Ganda linguistic structure. Johannesburg: Witwatersrand University Press.
Hyman L M & Katamba F (1993). ‘A new approach to tone in Luganda.’ Language 69, 34–67.


Luhmann, Niklas (1927–1998)
A Sacknovitz, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.

Niklas Luhmann, founder of sociological systems theory, is known within linguistics for the central role of communication within his theory. Within Luhmann’s framework, modern society becomes a complex system of communications, with communication serving as a system’s basic element and a means of maintaining its distinction from its environment. For Luhmann, communication is a synthesis of information (selection from various referential meanings), utterance (selection from various intentional acts), and understanding (observations of difference between utterance and information).

Luhmann was born December 8, 1927 in Lüneburg, Germany. At age 17, he served in the Luftwaffe and was captured by Americans. After graduating from the University of Freiburg in 1949, he worked in public administration at Lüneburg Administrative Court, developing a file-card system, then at the Ministry of Education of Lower Saxony, working on war reparations. His free time was spent reading Descartes and Kant, among others. In 1960, he married Ursula von Walter, with whom he had three children. In 1961, Luhmann left public administration to study the sociology of Talcott Parsons, a leading systems theorist, at Harvard. Developing his own systems theory, Luhmann taught at the Hochschule für Verwaltungswissenschaften in Speyer from 1962 to 1965, before being offered a position at the Sozialforschungsstelle of the University of Münster. In 1966, after two of his earlier works were accepted as thesis and habilitation, Luhmann earned the title of professor. He first lectured at the University of Frankfurt and was appointed full professor of sociology at the newly founded University of Bielefeld, where he served until 1993.

Luhmann, a controversial figure in sociology, particularly for his antihumanistic views, defined a system as a boundary between itself and its complex exterior, or environment.
Communication enacted the boundary by selecting and processing a finite amount of information based on meaning (Sinn), the unity of difference between actuality and potentiality, thereby reducing complexity. Communication also allowed systems to continually reproduce themselves from previously selected elements via a process termed ‘autopoiesis,’ a term meaning self-creation that was borrowed from cognitive biologists Humberto Maturana and Francisco Varela. Social systems are autopoietically closed in that their operation is not determined from the outside. Instead, environmental resources must first be appropriated before contributing to the system’s reproduction. Communication based on a unique binary code (e.g., legal/illegal) about the system and the environment allows such contact. Social and conscious systems, which were for Luhmann distinct systems, each work according to their own code. Language, one interface between these two distinct systems, is a medium that involves verbal or nonverbal signs that regulate the difference between information and utterance via a process of symbolic generalization.

Luhmann detailed his theory in more than three dozen books, dealing with such diverse subjects as mass media, law, and love. Social systems (1995) was considered to be the introduction to his theory. His two-volume magnum opus Die Gesellschaft der Gesellschaft appeared in 1997. Also of note was Theorie der Gesellschaft oder Sozialtechnologie. Was leistet die Systemforschung? (‘Theory of society or social technology: what does systems research accomplish?’ Habermas and Luhmann, 1971), a joint publication with critical theorist Jürgen Habermas of the Frankfurt School of Sociology, which sold more than 35 000 copies within a few years. In addition, works such as Rasch (2000) and Harrison (1995) provided introductions to Luhmann’s theory.

Luhmann died November 6, 1998, but not before adding substantially to the fields of linguistics and sociology, particularly as studied in Germany, Japan, and Eastern Europe. Luhmann is also remembered for his appearance as a character in Paul Wühr’s Das falsche Buch, together with Ulrich Sonnemann, Johann Georg Hamann, and Richard Buckminster Fuller.

See also: Habermas, Jürgen (b. 1929).

Bibliography

Habermas J & Luhmann N (1971). Theorie der Gesellschaft oder Sozialtechnologie. Was leistet die Systemforschung? Frankfurt am Main: Suhrkamp.
Harrison P R (1995). ‘Niklas Luhmann and the theory of social systems.’ In Roberts D (ed.) Reconstructing theory: Gadamer, Habermas, Luhmann. Australia: Melbourne University Press.
Luhmann N (1995). Social systems. Stanford, California: Stanford University Press. [1995 book was translated by John Bednarz Jr, with Dirk Baecker.]
Luhmann N (1997). Die Gesellschaft der Gesellschaft. Frankfurt am Main: Suhrkamp.
Rasch W (2000). Niklas Luhmann’s modernity: the paradoxes of differentiation. Stanford, California: Stanford University Press.
Wühr P (1983). Das falsche Buch. München: Hanser.

Luick, Karl (1865–1935)
W Viereck, University of Bamberg, Bamberg, Germany
© 2006 Elsevier Ltd. All rights reserved.

Karl Luick was born in Floridsdorf near Vienna in 1865. He studied English, French, and German at the University of Vienna, where he received his Ph.D. in 1888. His Habilitation (D.Litt.) followed only two years later. In 1893, he became the first Professor of English Philology at the University of Graz. He remained there until 1908, when he returned to Vienna. From 1925 to 1926 he was Rektor (President) of the University of Vienna; in 1935 he retired from his chair, dying in the same year.

Luick published widely both in linguistics and in literature, as was customary then (his bibliography in Kastovsky et al., 1988 comprises 223 items). Although he also wrote a Deutsche Lautlehre (1904) that came out in three editions, he is best known for his diachronic work on the English language. His major works are Untersuchungen zur englischen Lautgeschichte (1896), Studien zur englischen Lautgeschichte (1903), and Historische Grammatik der englischen Sprache (1914–1940), of which pages 797–1149 on consonantism were edited by Friedrich Wild and Herbert Koziol and published posthumously. Luick’s historical grammar is a presentation of the history of English sounds from Indo-European times down to the 19th century. He had originally also planned volumes on the history of word forms and on the history of constructions, but these remained unwritten.

As a Neogrammarian at heart, he took satisfaction in the regularity with which sound laws work. However, in his book of 1896, he had already begun formulating structuralist thoughts, recognizing the functional load of a linguistic element, the essence of the phoneme, and the concept of minimal pairs.

The structuralist thought of the ‘sound system,’ with its stress on the proximity or distance of a single sound to or from its neighboring sounds, led Luick to a reinterpretation of the English Great Vowel Shift that operated on long vowels from the 15th to the 17th century, namely, as a ‘push-chain’ or ‘displacement’ theory, as opposed to the ‘drag-chain’ theory advocated by Jespersen in 1909. Luick also proposed a new explanation for the lengthening of Middle English short high vowels in open syllables, according to which they were first lowered and then lengthened. It must be stressed that Luick never started from a uniform, idealized pronunciation but always took the whole spectrum of dialectal variation into consideration. In fact, he was one of the first to have done this.

See also: Jespersen, Otto (1860–1943); Paul, Hermann (1846–1921).

Bibliography

Jespersen O (1909). A modern English grammar on historical principles: vol. 1. Heidelberg: Winter.
Kastovsky D et al. (eds.) (1988). Luick revisited. Tübingen: Narr.
Luick K (1896). Untersuchungen zur englischen Lautgeschichte. Strasburg: Trübner.
Luick K ([1903] 1964). Studien zur englischen Lautgeschichte. New York and London: Johnson Reprint Corporation.
Luick K ([1904] 1996). Deutsche Lautlehre: mit besonderer Berücksichtigung der Sprechweise Wiens und der österreichischen Alpenländer (Rpt. of the 3rd edn.). Wien: Pädagog. Verlag.
Luick K ([1914–1921, 1929–1940] 1964). Historische Grammatik der englischen Sprache (2 vols). Oxford: Blackwell/Stuttgart: Tauchnitz.
Wild F (ed.) (1925). Neusprachliche Studien: Festgabe Karl Luick zu seinem Sechzigsten Geburtstage dargebracht von Freunden und Schülern. Marburg: Elwert.


Lukas, Johannes (1901–1980)
E Shay, University of Colorado, Boulder, CO, USA
© 2006 Elsevier Ltd. All rights reserved.

Johannes Lukas was born on October 7, 1901, in Fischern bei Karlsbad (Karlovy Vary), Bohemia. He attended the University of Vienna, studying Egyptology and African and Oriental languages and culture, while also studying piano at the Vienna Conservatory. Among his professors was Egyptologist and Africanist Wilhelm Czermak. Lukas obtained a doctorate in African languages in 1925. After three years as assistant at the Vienna Museum für Völkerkunde (Ethnology), he spent two years as a tutor at Al-Azhar University. Here he became acquainted with many students from Central African countries, some of whom became Lukas’s informants in the first scientific studies of their native languages. In 1932–1933, as a research fellow of the International African Institute (London), Lukas conducted fieldwork on Kanuri (Saharan), and in 1937 published the seminal grammatical work on that language.

After a brief stay at the School of Oriental and African Studies (SOAS) in London, he was accepted in 1934 into the Department of African Studies at the University of Hamburg. Here he taught under Africanist Carl Meinhof, while conducting his postdoctoral (Habilitation) studies and beginning to put into writing his findings on the Saharan and Chadic languages of the Lake Chad area. Lukas’s linguistic studies were interrupted by his wartime service, which he spent in Berlin at the Auslandshochschule (Foreign University). He resumed teaching in Hamburg and in 1954 succeeded August Klingenheben as Director of the Department of African Studies (now the Institut für Afrikanistik und Äthiopistik). He held this position until his retirement in 1970.

The languages of the Lake Chad area were Lukas’s primary research interest throughout his career. An early project was to compile and analyze linguistic data collected by 19th-century traveler Gustav Nachtigal. In 1936, he proposed the Chado-Hamitic (now Chadic, with some alterations) language family as a member of the Hamito-Semitic (Afroasiatic) phylum, and in 1951 he proposed the existence of the East Saharan family (now Saharan). From 1952 to 1973, Lukas devoted lengthy field trips to conducting the first modern linguistic studies of many languages of the Chadic and Saharan families. Among these are Kanuri (Central Kanuri) and Tubu (Tedaga), both Saharan, and the Chadic languages Bole (Bolanci), Bade, Musgu, Logone (Lagwan), Mokolo, Buduma, and Giziga (North and South). On these he wrote numerous groundbreaking descriptive and comparative works. In the 1960s, he brought particular attention to tone in African languages, a field about which much remains to be discovered. Lukas continued to study and write on African languages well after his retirement in 1970. He died on August 4, 1980, in Hamburg.

See also: Afroasiatic Languages; Chadic Languages; Meinhof, Carl Friedrich Michael (1857–1944); Nilo-Saharan Languages.

Bibliography

Jungraithmayr H & Möhlig W J G (eds.) (1983). Lexikon der Afrikanistik. Berlin: Reimer. 149–150.
Meyer-Bahlburg H (1980). ‘Johannes Lukas, 7. Oktober 1901–4. August 1980.’ Afrika und Übersee 63, 161–169.

Luo
G J Dimmendaal, University of Cologne, Cologne, Germany
© 2006 Elsevier Ltd. All rights reserved.

One of the major Nilotic (Nilo-Saharan) languages in terms of number of speakers, Luo (also known as Nilotic Kavirondo) is spoken by approximately 3.5 million people, mainly in western Kenya, northern Tanzania, and eastern Uganda. Together with Acholi, Adhola, Alur, Kumam, and Lango, Luo forms the Southern Lwoo cluster within the Western Nilotic branch of Nilotic. The Luo orthography was developed at the beginning of the 20th century. There is a growing body of literature in this Nilotic language, which is also used in the educational system in Kenya. Luo is among the few Nilotic languages that have also been studied in detail by native speakers, e.g., Okoth-Okombo (1982) and Omondi (1982). One of the pioneers of African linguistics, Archibald Tucker, also produced a grammar of Luo, published posthumously as Tucker (1994).

As shown in these studies, Luo has a classical two-tone system with downdrift, downstep, as well as upstep. As is common in a wide range of languages ranging from Senegal to Ethiopia, it also has vowel harmony based on the position of the tongue root.

Luo appears to have retained relatively few prototypical Nilotic features at the morphosyntactic level, presumably as a result of contact with Niger-Congo languages at different periods in time. One stratum, which seems to have affected all Southern Lwoo languages, appears to be due to contact with Ubanguian (Niger-Congo) languages. Another, more recent stratum resulted from intensive contact between Luo and neighboring Bantu (Niger-Congo) languages (cf. Rottland and Okoth-Okombo, 1986; Dimmendaal, 2001; and Storch, 2003). One manifestation of the intensive lexical and structural borrowing from Bantu is the development of noun class prefixes in Luo. In addition to borrowed prefixes, one finds prefixes that developed from nominal roots, as in dhó-lúô ‘the Luo language’ (from dhok ‘mouth’); jà-lúô/jò-lúô ‘Luo person (sg/pl)’ from jal (sg), jol (pl) ‘guest, stranger’.

The common constituent order in Luo is SVO. Other members of the Lwoo cluster, such as Anywa or Päri, allow for OVS order and they inflect postverbal subjects with (ergative) case. Luo does not have case marking. Consequently, although VS order may be used in Luo to express presentative focus with intransitive predicates, the postverbal subject is not inflected for case. Compared again to Anywa and Päri, Luo has a reduced system of verbal derivation, using prepositions instead to modify the valency of verbs. On the other hand, Luo developed tense marking on the verb, parallel to neighboring Bantu languages.

See also: Kenya: Language Situation; Nilo-Saharan Languages; Uganda: Language Situation.

Bibliography

Dimmendaal G J (2001). ‘Language shift and morphological convergence in the Nilotic area.’ Sprache und Geschichte in Afrika 16/17, 83–124.
Okoth-Okombo D (1982). Dholuo morphophonemics in a generative framework. Language and Dialect Atlas of Kenya, Supplement 2. Berlin: Dietrich Reimer.
Omondi L N (1982). The major syntactic structures of Dholuo. Language and Dialect Atlas of Kenya, Supplement 1. Berlin: Dietrich Reimer.
Rottland F & Okoth-Okombo D (1986). ‘The Suba of Kenya: a case of growing ethnicity with receding language competence.’ Afrikanistische Arbeitspapiere 7, 115–126.
Storch A (2003). ‘Dynamics of interacting populations: language contact in the Lwoo languages of Bahr El-Ghazal.’ Studies in African Linguistics 32(1), 63–93.
Tucker A N with Creider C A (1994). A grammar of Kenya Luo (Dholuo). In Bender M L, Rottland F & Cyffer N (eds.) Nilo-Saharan Linguistic Analyses and Documentation, 8. Cologne: Rüdiger Köppe.

Luria, Aleksandr Romanovich (1902–1977)
S A Romashko, Moscow, Russia
© 2006 Elsevier Ltd. All rights reserved.

Born in Kazan, Luria studied medicine and social sciences at the University of Kazan. His early interest in psychology was stimulated mainly by German neo-Kantian philosophy and philosophical hermeneutics (Dilthey) and psychoanalysis (in 1922, he founded a psychoanalytic association in Kazan), as well as by the reflexology of V. Bexterev. In 1923 he was invited to join the activities of the Institute of Psychology in Moscow. In 1924 Luria met L. S. Vygotskiĭ; he worked with him and under his decisive influence for the next 10 years, until Vygotskiĭ’s death in 1934.

With Vygotskiĭ, Luria took part in the founding of the so-called ‘cultural-historical school’ in psychology. Referring to ideas of Marxist philosophy, Vygotskiĭ and his school tried to establish objective foundations of the human mind and considered the sociocultural environment as a necessary condition of human psychic activity. The role of language in human phylogeny and ontogeny was crucial for Vygotskiĭ: Language (as a universal tool) was considered to be a main condition of human behavior and thinking. The ideas of the cultural-historical school were elaborated in a critical dialogue with the ideas of K. Bühler and J. Piaget.

One of Luria’s first projects was the examination of children’s linguistic reactions (Luria, 1927). In a collaborative work with Vygotskiĭ, Luria discussed the behavioral patterns of a ‘primitive’ mind (child, uneducated person) (Luria and Vygotskiĭ, 1930). Political repression (Luria had to abandon his fieldwork in Central Asia) and Vygotskiĭ’s early death interrupted the development of cultural-historical psychology. Luria then tended to work in more ‘neutral’ fields: clinical research in neurophysiology (he even obtained an additional degree in medicine in the 1930s) and

Luria, Aleksandr Romanovich (1902–1977) 341

common in a wide range of languages ranging from Senegal to Ethiopia, it also has vowel harmony based on the position of the tongue root. Luo appears to have retained relatively few prototypical Nilotic features at the morphosyntactic level, presumably as a result of contact with Niger-Congo languages at different periods in time. One stratum, which seems to have affected all Southern Lwoo languages, appears to be due to contact with Ubanguian (Niger-Congo) languages. Another, more recent stratum resulted from intensive contact between Luo and neighboring Bantu (Niger-Congo) languages (cf. Rottland and Okoth-Okombo, 1986; Dimmendaal, 2001; and Storch, 2003). One manifestation of the intensive lexical and structural borrowing from Bantu is the development of noun class prefixes in Luo. In addition to borrowed prefixes, one finds prefixes that developed from nominal roots, as in dhó-lúô ‘the Luo language’ (from dhok ‘mouth’); jà-lúô/jò-lúô ‘Luo person (sg/pl)’ from jal (sg), jol (pl) ‘guest, stranger’.

The common constituent order in Luo is SVO. Other members of the Lwoo cluster, such as Anywa or Päri, allow for OVS order and they inflect postverbal subjects with (ergative) case. Luo does not have case marking. Consequently, although VS order may be used in Luo to express presentative focus with intransitive predicates, the postverbal subject is not inflected for case. Compared again to Anywa and Päri, Luo has a reduced system of verbal derivation, using prepositions instead to modify the valency of verbs. On the other hand, Luo developed tense marking on the verb, parallel to neighboring Bantu languages.

See also: Kenya: Language Situation; Nilo-Saharan Languages; Uganda: Language Situation.

Bibliography
Dimmendaal G J (2001). ‘Language shift and morphological convergence in the Nilotic area.’ Sprache und Geschichte in Afrika 16/17, 83–124.
Okoth-Okombo D (1982). Dholuo morphophonemics in a generative framework. Language and Dialect Atlas of Kenya, Supplement 2. Berlin: Dietrich Reimer.
Omondi L N (1982). The major syntactic structures of Dholuo. Language and Dialect Atlas of Kenya, Supplement 1. Berlin: Dietrich Reimer.
Rottland F & Okoth-Okombo D (1986). ‘The Suba of Kenya: a case of growing ethnicity with receding language competence.’ Afrikanistische Arbeitspapiere 7, 115–126.
Storch A (2003). ‘Dynamics of interacting populations: language contact in the Lwoo languages of Bahr El-Ghazal.’ Studies in African Linguistics 32(1), 63–93.
Tucker A N with Creider C A (1994). A grammar of Kenya Luo (Dholuo). In Bender M L, Rottland F & Cyffer N (eds.) Nilo-Saharan Linguistic Analyses and Documentation, 8. Cologne: Rüdiger Köppe.

Luria, Aleksandr Romanovich (1902–1977)
S A Romashko, Moscow, Russia
© 2006 Elsevier Ltd. All rights reserved.

Born in Kazan, Luria studied medicine and social sciences at the University of Kazan. His early interest in psychology was stimulated mainly by German neo-Kantian philosophy and philosophical hermeneutics (Dilthey) and psychoanalysis (in 1922, he founded a psychoanalytic association in Kazan), as well as by the reflexology of V. Bexterev. In 1923 he was invited to join the activities of the Institute of Psychology in Moscow. In 1924 Luria met L. S. Vygotskiĭ; he worked with him and under his decisive influence for the next 10 years, until Vygotskiĭ’s death in 1934. With Vygotskiĭ, Luria took part in the founding of the so-called ‘cultural-historical school’ in psychology. Referring to ideas of Marxist philosophy, Vygotskiĭ and his school tried to establish objective foundations of the human mind and considered the sociocultural environment as a necessary condition of human psychic activity. The role of language in human phylogeny and ontogeny was crucial for Vygotskiĭ: Language (as a universal tool) was considered to be a main condition of human behavior and thinking. The ideas of the cultural-historical school were elaborated in a critical dialogue with the ideas of K. Bühler and J. Piaget.

One of Luria’s first projects was the examination of children’s linguistic reactions (Luria, 1927). In a collaborative work with Vygotskiĭ, Luria discussed the behavioral patterns of a ‘primitive’ mind (child, uneducated person) (Luria and Vygotskiĭ, 1930). Political repression (Luria had to abandon his fieldwork in Central Asia) and Vygotskiĭ’s early death interrupted the development of cultural-historical psychology. Luria then tended to work in more ‘neutral’ fields: clinical research in neurophysiology (he even obtained an additional degree in medicine in the 1930s) and education. Still pursuing his interest in language and linguistic behavior, Luria provided a neurophysiological conception of aphasia. This pioneer work was intensified in the context of World War II, when Luria was treating wounded persons with brain injuries and various kinds of speech disorders (Luria, 1947). This experience became the foundation for Luria’s ‘neuropsychology,’ a discipline combining neurophysiological data with psychological functional analysis.

After a hard period of postwar repressions, Luria was able to resume his investigations in the post-Stalinist years. He continued his clinical and educational activities, working at the same time at Moscow State University. Developing the view of language as a tool and regulator (Luria, 1961), he provided the first outlines of neurolinguistics (Luria, 1975). Luria combined in his neurolinguistic research the structuralist ideas of R. Jakobson and N. Troubetzkoy (paradigmatics and syntagmatics, distinctive features) and generative linguistics with the methods of neuropsychology and neurophysiology.

See also: Aphasia Syndromes; Bühler, Karl (1879–1963); Jakobson, Roman (1896–1982); Language Development: Overview; Psycholinguistics: Overview; Trubetskoy, Nikolai Sergeievich, Prince (1890–1938); Vygotskii, Lev Semenovich (1896–1934).

Bibliography
Akhutina T et al. (eds.) (2004). A. R. Luria and contemporary psychology. New York: Nova Science Publishers.
Luria A R (1927). Rechevye reaktsii rebenka. Kazan: Poligrafshkola im. Lunacharskogo.
Luria A R (1947). Travmaticheskaia afazia. Moskva: Upravlenie delami Soveta ministrov SSSR. [English edn.: The traumatic aphasia. The Hague, Paris: Mouton 1970.]
Luria A R (1961). The role of speech in the regulation of normal and abnormal behaviour. Oxford: Pergamon Press.
Luria A R (1975). Osnovnye problemy neĭrolinguistiki. Moskva: Izdatel’stvo Moskovskogo Universiteta. [English edn.: Basic problems of neurolinguistics. The Hague: Mouton 1976.]
Luria A R (1979). The making of mind: a personal account of soviet psychology. Cambridge: Harvard University Press. [Russian edn.: Etapy proĭdennogo puti: nauchnaia avtobiografia (2nd edn.). Moskva: Izdatel’stvo Moskovskogo Universiteta 2001.]
Luria A R & Vygotskiĭ L S (1930). Etiudy po istorii povedenia. Moskva: Leningrad. [English edn.: Ape, primitive man, and child: essays in the history of behavior. New York: Harvester Wheatsheaf 1992.]

Luther, Martin (1483–1546)
D J Collins, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.

Evaluating Martin Luther’s (Figure 1) role in the development of German numbers among the recurrent challenges undertaken by language historians. Among 18th- and 19th-century academics, there was a firm consensus in most quarters that Luther was indeed the ‘Father of the German language.’ Such a judgment was colored by confessional interests that promoted the notion of Standard German as ‘a Protestant language,’ a label which, depending on the context, could be either praise or vilification. More recently, the issue has been gradually de-confessionalized. While the assertion of his linguistic paternity of Standard German is untenable, what and how Luther wrote still earns him a place among the most important masters of the German language. His writings – in particular his translation of the Bible – had tremendous influence on how German was spoken and written. This article will touch on three areas: (1) his vernacular literary output, most importantly his translation of the New Testament; (2) his vernacular, its sources and impact; and (3) a summary evaluation of the relation of Luther’s German to contemporaneous German languages.

Figure 1 Martin Luther.

Luther was educated at schools in Magdeburg and Eisenach and studied the arts at the University of Erfurt. He lectured first on moral philosophy and later on sacred scripture at the University of Wittenberg. A short list of Luther’s vernacular writings includes theological treatises, biblical commentaries, sermons, polemical tracts, correspondence, hymn texts, and Table Talk – transcribed, revised, and published conversations between Luther and his dinner guests, distributed as devotional reading. It also includes his translations of books of the Bible. His facility in Greek and Hebrew, acquired through private study and tutorial, was commendable by the standards of the day. His translation of the New Testament occurred with the assistance of Philip Melanchthon, a professor of Greek at Wittenberg, and others, using Erasmus’s Greek-Latin edition (1516), the so-called textus receptus. This German New Testament was published in 1522. Translation of the Old Testament proceeded at a slower pace. The Pentateuch was published in 1523; and the Psalter, in 1524. It was not until 1534 that Luther published an entire Bible (including a translation of the Greek Old Testament deuterocanonical literature).

Luther’s interest in translating the Bible into the vernacular from the sacred languages points immediately to the scholarly movement that provided him with the tools to do just that: Renaissance humanism. This movement originated in Italy but spread to Germany in the last quarter of the 15th century. Humanism had significant impact on the study of languages, ancient and vernacular. Although many Renaissance humanists sided with Luther during the religious controversies of the 16th century, Luther himself cannot be judged a humanist. However, his efforts at translating the Bible into the vernacular, among other projects, made him their beneficiary and sometime ally inasmuch as both he and they were attracted to Hebrew and Greek. Both were also critical of the medieval Latin used in theology and of the accuracy of the Vulgate, the canonical Latin translation of the Bible in use at the time. Because the humanists had for some decades developed and encouraged the study of Greek and Hebrew (see Reuchlin, Johann (1455–1522)), Luther had the tools at his disposal to learn the languages, to critique the Vulgate itself, and to develop a new understanding of why and how to read the Bible that would be a central tenet of the Reformation in Germany. The principle that church authority and Christian truth were founded sola scriptura, on the Bible alone, made Holy Writ’s availability through a vernacular translation not just a matter of convenience, but of life and death.

But this would be a translation into what sort of German? Luther’s significance for the history of the German language rests in largest part on the German into which he translated the Bible. His education across various dialectal divides in Germany inspired him to use the most widely understandable German he could find. He began with a German from close to home, namely that used in the Meissen court of the electors of Saxony. Meissen is not far from where Luther was born and reared; the Saxon elector Frederick was among Luther’s staunchest princely supporters. Nonetheless, this language could only serve as Luther’s starting point in that it was a language developed by lawyers and scribes for their professional work and was thus not entirely suitable for capturing the breadth and flavor of biblical and theological expression. Luther was fortunately able to turn to the vernacular theological and mystical texts of several late medieval German writers, who, beginning with Meister Eckhart (ca. 1260–1327/28), had contributed to making the German vernacular more theologically sophisticated. None of this reliance, however, should distract one from Luther’s own genius. He was a master of the German language; his own writing and translating are marked with passion and precision. At the same time, he could not transcend all linguistic divisions in the empire: for example, into the 17th century, Luther’s Bible was itself translated into Low German for Protestant populations in the north that found the Saxon tendencies of the vocabulary foreign. His translating did occasion some controversy, in part due to the day’s religious disputes, but also touching upon serious questions concerning the translator’s art.
Accused of both clumsy literalism and tendentious paraphrasing, Luther responded to his critics in 1530 with a treatise on translating (Ein Sendbrief vom Dolmetschen). In it Luther argued that good scriptural translating must be inspired by faith itself. The translator must adhere to the original texts, but without depriving the vernacular of its natural cadence and melody. These principles of translation helped him explain his insertion of the word allein into Romans 3:28, resulting in the human person’s theological justification “by faith alone.” They also explained why in the angelic salutation (Luke 1:28) he used Holdselige in reference to Mary instead of voll Gnade, the latter being a more common German rendering at the time, but one too Latin-bound (gratia plena) for Luther’s taste. In fact, he admitted a real preference for the even more homely, idiomatic translation du liebe Maria: had the angel greeted her in German, that is what he would have said.

In determining the historical significance of Luther’s Bible for the German language, one historian has recently emphasized that what was translated was more important than by whom. In the Christian West, the vernacular Bible – as no other text until very recently, read and listened to daily by members of the Christian faith – could not but have had influence on the languages in which it appeared, as for example the King James Version has had on English. Cautious recent scholars stress that much of Luther’s German, however, was established by the time he used it in his translation, and other aspects of modern Standard German developed after the mid-16th century.

It can be stated axiomatically that the press, becoming as it did so widely used by the forces of Protestantism and Catholicism in the 16th century and beyond, was the technology most responsible for the transformation of all early modern Christianity. A full exposition of printing’s implications for the history of language is beyond the scope of this article. However, suffice it here to note the obvious: the printed vernacular Bible, to the extent that it fixed a popular text and allowed for its wide dissemination, provides the historian of language with a standard against which to judge the malleability and the resistance of language speakers to the standardization of their language. In the case of the Luther Bible and the development of a standardized German, comparison between the 1522 and 1545 editions of the New Testament is particularly revealing. For example, it shows that orthography is an area of the written language on which Luther had no wide or lasting influence. This is also an area which printers themselves took to changing as they set the type and varied spelling and punctuation as they saw fit.
Another kind of limit to the appropriation of Luther’s German can be seen in the resistance to the unaccented terminal e in some German plurals and dative singulars, characteristic of the Saxon dialect and used in the Luther Bible. The so-called Luther-e was long resisted in the German-speaking Catholic South, where it had not been common before the Reformation and where it was resisted from the 16th century onward for confessional reasons. Allied with the southern Germans, the Alemannic speakers in southwestern Germany and the Swiss Confederation resisted many aspects of Luther’s German, but in these more Protestant regions, clearly without the confessional overtones. And today for reasons quite removed from the earlier ecclesiastical ones, the Luther-e is gradually disappearing from both spoken and written German everywhere.

In conclusion, Luther’s influence on the development of early high German should not be underestimated. Luther had an unparalleled ear for the German language and was a vigorous author. But it should also be understood, in contradistinction to the enthusiasm for his linguistic paternity on the part of many 19th-century linguists, that most of the language characteristics of the Luther Bible had precedent before 1522. Furthermore, early high German was still to develop into the literary new high German of the 17th century and later. In this regard, Werner Besch, a distinguished historian of the German language, has suggested a simile with which to understand where Luther belongs in the pantheon of German writers: Besch uses the image of the medieval retable, in which the patron is painted modestly into the corner of a tableau filled with the artistic representation of some laudable religious event or theological truth: the subject of this painting is the vernacular Bible; the modest patron, Luther. Besch notes that this is an understanding of Luther’s role in the development of the vernacular that Luther himself would surely accept.

See also: Reuchlin, Johann (1455–1522); Translation: History.

Bibliography
Luther M (1883–). Werke: kritische Gesamtausgabe. Weimar: Böhlau.
Luther M (1955–1986). Pelikan J & Lehmann H T (eds.) Works. St. Louis: Concordia Publishing House.
Bach H (1974–1985). Handbuch der Luthersprache: Laut- und Formenlehre in Luthers Wittenberger Drucken bis 1545 (2 vols). Copenhagen: G. E. C. Gad.
Besch W (1999). Die Rolle Luthers in der deutschen Sprachgeschichte. Heidelberg: Universitätsverlag C. Winter.
Bluhm H (1984). Martin Luther, creative translator. St. Louis: Concordia Publishing House.
Bluhm H (1987). Studies in Luther/Luther Studien. Bern: Peter Lang.
Gelhaus H (1989). Der Streit um Luthers Bibelverdeutschung im 16. und 17. Jahrhundert. Tübingen: Max Niemeyer.
Kettmann G (1967). ‘Die kursächsische Kanzleisprache in der Lutherzeit.’ PBB: Paul-Braunes Beiträge (Halle) 89, 121–129.
Schildt J (ed.) (1984). Luthers Sprachschaffen: Gesellschaftliche Grundlagen, Geschichtliche Wirkungen (3 vols). Berlin: Akademie der Wissenschaften der DDR – Zentralinstitut für Sprachwissenschaft.
Schwarz W (1970). Principles and problems of biblical translation: some Reformation controversies and their background. Cambridge: Cambridge University Press.
Sonderegger S (1998). ‘Geschichte deutschsprachiger Bibelübersetzung in Grundzügen.’ In Besch W et al. (eds.) Sprachgeschichte, vol. 1: ein Handbuch zur Geschichte der deutschen Sprache und ihrer Erforschung, 2nd edn. Berlin: de Gruyter. 229–284.
Wells C J (1993). ‘Orthography as legitimation: “Luther’s” Bible orthography and Frankfurt Bibles of the 1560s and 70s.’ In Flood J L et al. (eds.) ‘Das unsichtbare Band der Sprach’: Studies in German language and linguistic history. Stuttgart: Hans-Dieter Heinz.
Wolf H (ed.) (1996). Luthers Deutsch: Sprachliche Leistung und Wirkung. Frankfurt: Peter Lang.
Young C & Gloning T (2004). A history of the German language through texts. London: Routledge.

Luxembourg: Language Situation
H Baetens Beardsmore, Université Libre de Bruxelles, Brussels, Belgium
© 2006 Elsevier Ltd. All rights reserved.

The Grand Duchy of Luxembourg, a country of 2586 km² and with a population in 2004 of 462 690 inhabitants, is a unique case in Europe, where the totality of the indigenous population is to some extent trilingual. The languages spoken are Luxembourgish (Luxembourgeois), German, and French. Approximately 30% of the population consists of immigrants, the majority of Portuguese or Italian background.

Luxembourgish is the symbol of national identity and is the only cultural feature that clearly distinguishes the citizens from those of neighboring countries. Since 1938 naturalization has been made dependent on its knowledge. Luxembourgish is a Germanic language related to Low German (Low Saxon). To some it is a dialect and has been classified as a partially standardized language, given its restricted literary tradition, or as a neglected minority mother tongue. It includes many loanwords from German and French, lacks vocabulary and styles to handle technological concepts, and given that it is primarily an oral mode of expression, has fairly elementary syntax with few subordinate clauses and restricted tense, mood, and case variations. It was made a compulsory subject in elementary education in 1912, with an official spelling and textbook, and was declared the national language in 1983, but its use, though widespread, is limited.

French and German, together with Luxembourgish, are official languages and all three are vehicles of instruction for the whole school population at different stages of the education system. The system is intended to bring everyone in the country to some level of trilingual proficiency. In compulsory nursery education at ages four and five, Luxembourgish is used exclusively. This is partially intended to help anchor national identity and also to integrate immigrant children. In primary school, German gradually replaces Luxembourgish. German is taught as a subject in Grade 1 of primary school and is the first language of reading and writing. By the end of primary school the transition must be made to the exclusive use of German for all subjects except language lessons in Luxembourgish and French. The third language, French, is introduced as a subject in Grade 2 of primary education.

The education system operates on the principle of introducing children to schooling in the majority language, Luxembourgish, followed by a related but distinct L2 (German) as a subject, but not a medium of instruction for nonlanguage subjects, in Grade 1, prior to a gradual elimination of the L1 in favor of the L2 as medium of instruction. The L3 (French) is introduced as a subject in primary education, in preparation for its use as a medium in secondary education. Mathematics is taught entirely through French in secondary schools and, depending on the orientation, French may increase as a medium for nonlanguage subjects as schooling progresses, while the use of German decreases.

This complex system results in a working knowledge of three languages for the entire population. Proficiency depends on the type of education followed, those in general education having higher proficiency in French through more contact hours with the language than those in technical or vocational education. Luxembourgish is the language most used (but not exclusively) by all social categories for oral exchanges in private life, but in written private communication the order of preference is German, French, Luxembourgish. In the work sphere the order of preference for oral communication is Luxembourgish, French, German, but for written communication the order is French, followed by German, with no Luxembourgish.

In dealings with the administration, citizens have the right to use any of the three languages and the administration is expected to reply in whichever of the three is preferred by the citizen “where possible.” Since the legal system operates entirely in French it is not always possible to use another language.

See also: Education in a Multilingual Society; French; German; Language Education Policy in Europe; Luxembourgish; Multilingualism: Pragmatic Aspects.

Bibliography
Davis K A (1994). Language planning in multilingual contexts: policies, communities and schools in Luxembourg. Amsterdam/Philadelphia: John Benjamins.
Hoffman C (1998). ‘Luxembourg and the European schools.’ In Cenoz J C & Genesee F (eds.) Beyond bilingualism: multilingualism and multilingual education. Clevedon, England: Multilingual Matters. 143–174.
Lebrun N & Baetens Beardsmore H (1993). ‘Trilingual education in the Grand Duchy of Luxembourg.’ In Baetens Beardsmore H (ed.) European models of bilingual education. Clevedon, England: Multilingual Matters. 101–120.
Ministère de l’Éducation Nationale, de la Formation Professionnelle et des Sports (2003). Les Chiffres clés de l’éducation nationale. Luxembourg: Script.
Website of the Luxembourg government on official language policy: http://www.gouvernement.lu/tout_savoir/population_langues/situling.html

Luxembourgish
G Newton, University of Sheffield, Sheffield, UK
© 2006 Elsevier Ltd. All rights reserved.

Luxembourgish (Lëtzebuergesch), genetically related to German, is traditionally grouped with the West Moselle Franconian dialects. However, early Salic Frankish influence and later close attachment to the Low Countries, France, and Spain have allowed it to develop an identity separate from that of the neighboring dialects in Germany. Earliest documents from the area date from the 9th century, with modern literary forms beginning in the 1820s. Various orthographies exist, including the strictly phonemic Lezebuurjer Ortografi (1946). Little used, this remained official until replaced in 1975 by the system of the Luxemburger Wörterbuch. Modifications to this were introduced in 1999. In 1939, Luxembourg naturalization was made dependent on knowledge of the language. In 1984, Lëtzebuergesch was legally acknowledged as the national language of the Grand Duchy.

Syntactically, Lëtzebuergesch is similar to German, although case loss has reduced the possibilities of object-verb-subject (OVS) ordering. Parataxis predominates, though hypotaxis is frequent (subject-object-verb (SOV) ordering). In morphology, nominative and accusative have fallen together, assuming accusative form. Third-person pronouns show northern /h/ (NHG = New High German): hien, hatt, hinen NHG er, es, ihnen ‘he, it, them’. Noun plurals are most commonly in , though other patterns occur. There is, however, no plural in . The pronouns mir NHG wir ‘we’ and dir NHG ihr/Sie ‘you (plural and polite)’ arise from false division of verbal endings. NHG uns ‘us’ appears as äis/eis (koine) and ons (Luxembourg city). Adjectival comparison is chiefly with méi NHG mehr ‘more’, though occasional synthetic forms occur. In compound nouns, a linking -s- is frequently present, e.g., Plastikstut NHG Plastiktasche ‘plastic bag’, Autosdier NHG Autotür ‘car door’.

Tenses comprise present (ech gesinn NHG ich sehe), perfect (ech hu gesinn NHG ich habe gesehen), and pluperfect (ech hat gesinn NHG ich hatte gesehen); the future can be periphrastic (ech wäerd gesinn NHG ich werde sehen), but is mainly a function of the present tense (ech kommen iwwermar NHG ich komme übermorgen). Some indicative (ech gesouch NHG ich sah) and subjunctive (ech geséich NHG ich sähe) preterites also occur, though these are more frequent in the north (Oesling); pluperfect subjunctives (ech hätt gesinn NHG ich hätte gesehen) occur frequently. The auxiliary verb ginn NHG geben = werden ‘give = become’ is used to form analytical conditionals (ech géif gesinn NHG ich würde sehen) and passives (ech gouf gesinn NHG ich wurde gesehen ‘I was seen’). Present tense first-person singulars inflect (ech sangen NHG ich singe).

Consonants are in voiced/voiceless opposition, with final neutralization. The High German sound shift is incomplete (dat NHG das/dass ‘that’, op NHG auf ‘on’, Pond NHG Pfund ‘pound’, Korf NHG Korb ‘basket’, Dall NHG Tal ‘valley’) and intervocalic /g/ is often absent, e.g., Won, Vull NHG Wagen, Vogel ‘wagon, bird’. Some velarization of dental nasals (zéng, brong NHG zehn, braun ‘ten, brown’) is also present, though this is stronger in the north of Luxembourg, which also has velarized plosives, e.g., Lekt, néck (koine Leit, net) NHG Leute, nicht ‘people, not’. Medial and final /s/ combinations are liable to palatalization (Meeschter NHG Meister ‘master’), more strongly in the southwest (Fënschter NHG Fenster ‘window’). Final /n/ is ‘mobile’ (Eifler Regel) and is retained only before a following vowel, /h/ or a dental (den Dag, but de Mann NHG der/den Tag, der/den Mann), or at juncture. Middle High German (MHG)
î, û, iu (îs ‘ice’, trîben ‘drive’, hûs ‘house’, liute ‘people’, hiulen ‘howl’) appear as /ɛ:i/ or /ɑi/ (Äis NHG Eis, dreiwen NHG treiben), /ɑ:o/ (Haus NHG Haus), /ɑi/ or /ɑo/ (Leit NHG Leute, haulen NHG heulen), and ie, uo, üe (brief ‘letter’, fuoz ‘foot’, vüeze ‘feet’) appear as /ei/ (Bréif NHG Brief), /ou/ (Fouss NHG Fuß), /ei/ (Féiss NHG Füße). MHG ei, ou (vleisch ‘flesh’, boum ‘tree’) have the reflexes /e:/ (Fleesch NHG Fleisch), /a:/ (Bam NHG Baum). However, all of these examples may also be subject to allophonic variation, and shortened forms are common. A shift of West Germanic /i/ to /a/ is also a strong characteristic of the language: Wand NHG Wind ‘wind’. Derounding (Läffel, fënnef NHG Löffel, fünf ‘spoon, five’) and lowering (domm NHG dumm ‘stupid’) are also found, as are elements of ‘correption’ (an abrupt rise and fall of vowel pitch; only vestigially present in Luxembourgish, e.g., stäif/steif NHG steif ‘stiff’) and ‘circumflexion’ (a rise and fall of vowel pitch, accompanied by up to three times normal length), e.g., den Hals (nom./acc.) NHG der/den Hals ‘neck’ (not circumflected); dem Haals (dat.) NHG dem Hals(e) (circumflected). Another element is the Schwebelaut (a lengthening of consonants), which occasionally marks a difference in meaning, e.g., voll (short /l/, MHG vol) NHG voll ‘full’, voll (long /l:/, MHG volle) NHG voll ‘drunk’.

Sample Text
Bei äis goufe vun 1825 [uechtzénghonnertfënnefanzwanzeg] bis haut verschidde Schreifweise gebraucht, déi all hiert Guddes haten.
/bɑi ɛ:is ˈgoufe fon ˈuextseŋˌhɔnertˌfenefɑnˈtsvɑntseç bɪs hɑot fɛrˈʃɪde ˈʃrɑifvɑize geˈbrɑuxt dei ɑl hi:rt godez ˈha:ten/
NHG: Bei uns wurden von achtzehnhundertfünfundzwanzig bis heute verschiedene Schreibweisen gebraucht, die alle ihr Gutes hatten.
‘In Luxembourg from 1825 up to today various spelling systems were used, which all had their good points’.

See also: Germanic Languages; Luxembourg: Language Situation.

Bibliography
Berg G (1993). ‘Mir wëlle bleiwe, wat mir sin’: soziolinguistische und sprachtypologische Betrachtungen über die luxemburgischen Mehrsprachigkeit. Tübingen: Niemeyer.
Braun J (1999). Eis Sprooch richteg schreiwen. Bartreng: Rapidpress.
Bruch R (1953). Grundlegung einer Geschichte des Luxemburgischen. Luxembourg: P. Linden.
Bruch R (1954). Das Luxemburgische im westfränkischen Kreis. Luxembourg: P. Linden.
Bruch R (1955). Précis populaire de grammaire luxembourgeois. Luxembourg: P. Linden.
Derrmann-Loutsch L (ed.) (2003). Deutsch Luxemburgisches Wörterbuch. Luxembourg: Saint Paul.
Gilles P (1999). Dialektausgleich im Lëtzebuergeschen: zur phonetisch-phonologischen Fokussierung einer Nationalsprache. Tübingen: Niemeyer.
Hoffmann F (1979). Sprachen in Luxemburg: sprachwissenschaftliche und literarhistorische Beschreibung einer Triglossie-Situation. Luxembourg: Institut Grand-Ducal.
Keller R E (1961). ‘Luxembourgish.’ In Keller R E (ed.) German dialects. Manchester: Manchester University Press. 248–298.
Luxemburger Wörterbuchkommission (1950–1977). Luxemburger Wörterbuch (5 vols). Luxembourg: Linden. [Reissued in 1995 as Lëtzebuerger Dixionär (2 vols).]
Newton G (1990). ‘Central Franconian.’ In Russ C V J (ed.) The dialects of modern German. London: Routledge. 136–209.
Newton G (ed.) (1996). Luxembourg and Lëtzebuergesch: language and communication at the crossroads of Europe. Oxford: Clarendon Press.
Schanen F (2004). Parlons luxembourgeois. Paris: L’Harmattan.
Schmitt L F (ed.) (1963). Luxemburgischer Sprachatlas. Marburg: Elwert.

Luzzatto, Samuel David (1800–1865)
A Gianto, Pontifical Biblical Institute, Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Samuel David Luzzatto (often abbreviated as Shadal) was born in Trieste, Italy, on August 22, 1800, and died in Padua on September 30, 1865. He was educated in the best Talmudic traditions by Abraham Eliezer ha-Levi, the Chief Rabbi of Trieste. His father, himself an accomplished Talmudist, taught him Hebrew at home. Luzzatto’s precocious talents were noticed early in his childhood. At the age of 11 he undertook to write a Hebrew grammar in Italian, translated into Hebrew
the life of Æsop, and wrote exegetical notes on the Pentateuch. After the death of his mother in 1814, Luzzatto had to do household work in addition to helping his father in his work. But this did not prevent him from pursuing his intellectual and creative interests. By 1815 he had amassed a collection of 37 poems that became the basis of his poetry collection Kinnor Na’im (1825). He completed a full treatise on the Hebrew vowels, Ma’amar ha-Nikkud, in 1817, and the following year he started to write his Torah Nidreshet, a philosophical-theological exposition of the Torah in 24 essays, which was translated into Italian and published in 1879 by M. Coen-Porto. In 1829 he was appointed professor at the rabbinical college of Padua.

Among his most important works in Hebrew and Aramaic philology, besides those noted above, are Prolegomena ad una grammatica ragionata della lingua ebraica (Padua, 1836); Grammatica della lingua ebraica (Padua, 1853–1869); Elementi grammaticali del caldeo biblico e del dialetto talmudico (Padua, 1865; German transl. by Krüger, Breslau, 1873; English transl. by Goldammer, New York, 1876); Ma’amar bi-Yesodei ha-Dikduk, a treatise on Hebrew grammar (Vienna, 1865); and Ketavim Ivriyyim, from 1913, the unfinished edition of his Hebrew writings. Luzzatto was the first Jewish scholar to give importance to Syriac and Samaritan in understanding the Aramaic translation (Targum) of the Hebrew Bible. His legacy includes a large number of exegetical works, both in Hebrew and in Italian.

His son collected 89 of his letters, published as Peninei Shadal (The pearls of Samuel David Luzzatto). These letters include essays of bibliographical interest (numbers i.–xxii.); liturgical-bibliographical subjects (xxiii.–xxxi.); Biblical-exegetical letters (xxxii.–lii.), including a commentary on Ecclesiastes and a letter on Samaritan writing; other exegetical letters (liii.–lxii.); grammatical discussions (lxiii.–lxx.); historical notes (lxxi.–lxxvii.); philosophical letters (lxxviii.–lxxxii.), including letters on dreams and on Aristotelian philosophy; and theological discussions (lxxxiii.–lxxxix.).

See also: Hebrew, Israeli; Ibn Ezra, Abraham (ca. 1089–1164); Ibn Janah (ca. 990–1050); Saadya Gaon (882–942); Semitic Languages.

Bibliography
Luzzatto S D (1882). Autobiografia. Padua: Dr. J. Luzzatto.
Margolies M B (1979). Samuel David Luzzatto, traditionalist scholar. New York: Ktav.
Morais H S (1880). Eminent Israelites of the nineteenth century. Philadelphia: E. Stern. 211–217.

Lying, Honesty, and Promising
D Owens, University of Sheffield, Sheffield, UK
© 2006 Elsevier Ltd. All rights reserved.

Promising and asserting are both speech acts, and as such they are regulated by practical norms as well as linguistic norms (e.g., norms of etiquette). Here we shall be concerned with the moral norms that govern these speech acts.

Informational Theories
Many philosophers take the view that the morality of an act – its rightness or wrongness – is a function of the harms and benefits this act brings to the agent and to others. These philosophers also hold that the main way promises and assertions affect the interests of human beings is when they serve as sources of information. So, they conclude, both promises and assertions are morally significant principally because, and in so far as, they purport to offer information. Let’s call this the ‘informational’ view of the morality of promise and assertion.

On this view, morality censures an unfulfilled promise or a false assertion because these deeds can harm others by giving them false information. We are all obliged to take due care not to lead others to form false beliefs, at least where this might be harmful to them (Scanlon, 1998: 300). This obligation means that we must not set out to deceive people by making them insincere promises or telling them things that we know to be false. But it also means that we mustn’t change our minds about what we promised we were going to do (without good reason) or make an assertion without adequate evidence. Someone who accepts a promise standardly forms the expectation that the promisor will perform, and they may rely on this expectation to their detriment. Someone who believes an assertion is similarly exposed, if this assertion turns out to be false. I’ll deal
with informational theories of promising first and then move on to assertion.

Information theorists of promissory obligation fall into two categories. First, there are those (sometimes called ‘expectation theorists’) who argue that we are all under an obligation not to mislead others about how we shall behave in the future and that this obligation is why we ought not to make them promises that we do not fulfill (Scanlon, 1998: Chap. 7; Thomson, 1990: Chap. 12). Second, there are those who argue that we are obliged to fulfill our promises only where there is an up-and-running practice of fulfilling one’s promises: prior to this, there is no promissory obligation. For such ‘practice theorists,’ promissory obligation is conventional (Hume, 1978: Book III, Part II, Section V; Anscombe, 1981; Prichard, 1968).

Practice theories differ from expectation theories in their account of how the obligation to keep a promise arises from our interest in having correct information about how other people are going to behave. According to the practice theorist, we can create expectations of performance in our audience by uttering words like “I promise” only where there is an actual practice of making such utterances true, i.e., where people have come to feel some obligation to make them true. So one can’t explain the moral significance of this utterance simply by reference to the expectations it creates. Still, the practice theorist agrees with the expectation theorist that we are obliged to maintain the practice of promise-making where it exists, because this practice serves our information interest and thereby aids the coordination of behavior.

Turning now to assertion, there is much controversy among philosophers of language about the extent to which language in general, and assertion in particular, involve social convention. For example, Davidson maintains that “there is no known, agreed upon, publicly recognizable convention for making assertions” (Davidson, 1984: 270), and he thinks the same is true of promising. On the other hand, Fried urges that the promisor has “intentionally invoked a convention whose function it is to give grounds – moral grounds – for another to expect the promised performance” and “to abuse that confidence is like . . . lying: the abuse of a shared social institution that is intended to invoke the bonds of trust” (Fried, 1980: 16). Thus, there is a division among information theorists of the morality of assertion parallel to that between expectation and practice theorists of promissory obligation.

It has long been debated whether there is any morally significant difference between lying to someone and deceiving them in a more oblique fashion (e.g., by way of false implicatures, deliberate ambiguity, or by leaving misleading evidence around, etc.). And this debate reflects a genuine ambivalence in our everyday attitudes. Where we feel entitled to deceive others – for the sake of their health for instance – many of us are still inclined to go to the trouble of trying to avoid telling a direct lie. On the other hand, where such deception is wrong, the wrong is seldom thought to be mitigated just because a direct lie was avoided. There is a tradition of thought, however, according to which lying is always wrong, but we are sometimes permitted to deceive in other ways (Aquinas, 1966: II-II 110 a3; MacIntyre, 1995: 309–318). But many contemporary writers have expressed doubts about whether the manner of the deception could by itself make a serious moral difference (Sidgwick, 1981: 354–355; Williams, 2002: 100–110). An expectation theorist who maintains that the wrong of lying is only the wrong of deception will share these doubts (Scanlon, 1998: 320). On the other hand, a practice theorist of the morality of assertion may allow that, in addition to any harm he does to the person he deceives, the liar is abusing and thereby undermining a valuable social practice (Kant, 1991: 612), namely the use of language to convey information.

Noninformational Theories
Until now, we have been assuming that what makes an act, including a speech act, wrong is, in the end, some harm that it does to those it wrongs. There are many moral theorists who reject this assumption, and it is open to them to propound noninformational theories of what is wrong with a lie or a broken promise. Rather than attempt a comprehensive classification of noninformational theories, I shall consider one such theory of promising and one such theory of lying, both taken from Kant.

Take promising. Kant locates the moral significance of promising not in the information interests it serves but rather in the fact that it grants the promisee a certain moral authority over the promisor: it entitles the promisee to require the promisor to perform and thus deprives the promisor of a certain moral freedom (Kant, 1996: 57–61). If I promise you a lift home, I am obliged to give you a lift unless you release me from this promise. This line of thought was central to classical theories of promissory obligation (Hobbes, 1991: Chap. 2) and has found an echo in some contemporary writing (Hart, 1967: 60). For these authors, a breach of promise wrongs the promisee, whether or not it also harms them by inducing false expectations in them, because it flouts the moral authority that the promisee has acquired over the promisor.


On this view, informational theories miss what is distinctive about promising. There are many ways of influencing people’s expectations and of co-ordinating your behavior with others (Raz, 1977: 215–216). For example, one can predict that one will do something or even express a sincere intention to do something, while making it clear that one is not promising. To promise to do it is to express the intention to undertake an obligation to do it (Searle, 1969: 60), an obligation that mere expressions of intention or predictions do not bring down on the speaker, however firm or confident they may be. If I predict, on excellent evidence, that I shall be going in your direction because the police will be towing my car in that direction, I have not committed myself to making it true that I shall be going in your direction when that prediction threatens to be falsified. Turning now to lying, Kant makes a firm distinction between the wrong one does in deceiving someone and the wrong one does by lying (Kant, 1996: 182–184). One can lie without deceiving (e.g., when one knows one won’t be believed) and one can deceive without lying. In deceiving someone you may wrong them but, according to Kant, when you lie, the person you wrong is yourself. The liar violates a duty to himself (though this violation need not involve harming himself). In explaining the nature of this wrong, Kant follows thinkers like Aquinas in attributing a natural teleology to speech: ‘‘communication of one’s thoughts to someone through words that yet (intentionally) contain the contrary of what the speaker thinks on the subject is an end that is directly opposed to the natural purposiveness of the speaker’s capacity to communicate his thoughts’’ (Kant, 1996: 182).

In lying, one violates a duty to oneself by abusing one’s own faculties, by using oneself ‘‘as a mere means (a speaking machine)’’ (Kant, 1996: 183). It may be possible to capture Kant’s basic idea here without reference to natural teleology or duties to self if we adopt a certain view of assertion. One can distinguish two currently influential theories of assertion. In the first, inspired by Grice, asserting that p is a matter of uttering something with the intention of thereby getting your audience to believe that p, by means of their recognition of that very intention (Grice, 1989). If something like this statement is correct, the moral significance of an utterance qua assertion must lie solely in the effect it is trying to achieve, i.e., in the effect that the assertion has on the beliefs of others. In the second view of assertion, asserting that p is more like promising that p. In promising, someone intentionally undertakes an obligation to perform: undertaking such obligations is what promising consists in. Similarly, to assert a certain proposition is, on this view, to intentionally undertake an obligation to ensure that one asserts only what is true (Dummett, 1973: 299–302) and, perhaps, to defend one’s assertions as true, should they be challenged (Brandom, 1983). Putting oneself under such obligations is what assertion consists in. Once the second view is in play, we can say what is wrong about lying without making reference to the effect that the lie has on others. A liar is in the wrong not because he is wronging someone but because he knows that he is taking on obligations he cannot discharge. In this way, lying differs from deception that wrongs the deceived when it harms their interests in some way. Deception is an offence against others while lying is an offence against truth. Provided morality is not solely concerned with harm, this offence may be counted as a moral wrong.

Bibliography

Aquinas T (1966). Summa theologiae. Gilby T (ed.). Cambridge: Blackfriars.
Anscombe E (1981). Ethics, religion and politics. Oxford: Blackwell.
Brandom R (1983). ‘Assertion.’ Noûs 17(4), 637–650.
Davidson D (1984). Essays on truth and interpretation. Oxford: Oxford University Press.
Dummett M (1973). Frege: philosophy of language. London: Duckworth.
Fried C (1981). Contract as promise. Cambridge, MA: Harvard University Press.
Grice P (1989). Studies in the way of words. Cambridge, MA: Harvard University Press.
Hart H (1967). ‘Are there any natural rights?’ In Quinton A (ed.) Political philosophy. Oxford: Oxford University Press. 53–66.
Hobbes T (1991). De cive. Gert B (ed.). Indianapolis: Hackett.
Hume D (1978). Treatise on human nature. Oxford: Oxford University Press.
Kant I (1991). Practical philosophy. Gregor M (ed.). Cambridge: Cambridge University Press.
Kant I (1996). The metaphysics of morals. Gregor M (ed.). Cambridge: Cambridge University Press.
MacIntyre A (1995). ‘Truthfulness, lies and moral philosophers.’ In Tanner Lectures on Human Values 16. Salt Lake City: Utah University Press. 309–361.
Prichard H (1968). Moral obligation and duty and interest. Oxford: Oxford University Press.
Raz J (1977). ‘Promises and obligations.’ In Hacker P & Raz J (eds.) Law, morality and society. Oxford: Oxford University Press. 210–228.
Searle J (1969). Speech acts. Cambridge: Cambridge University Press.
Scanlon T (1998). What we owe to each other. Cambridge, MA: Harvard University Press.
Sidgwick H (1981). The methods of ethics. Indianapolis: Hackett.
Thomson J (1990). The realm of rights. Cambridge, MA: Harvard University Press.
Williams B (2002). Truth and truthfulness. Princeton: Princeton University Press.

Lyons, John, Sir (b. 1932)
G Brown, University of Cambridge, Cambridge, UK
© 2006 Elsevier Ltd. All rights reserved.

John Lyons’ first degree was in classics at Cambridge (1953), where he returned in 1956 to research on the semantics of Plato’s vocabulary for knowledge and understanding, supervised by W. S. Allen. Allen introduced him to Chomsky’s Syntactic structures, which provided a technical framework in which he could formulate the structural relationships among his selected words. In 1957, he moved to a lectureship at the School of Oriental and African Studies (SOAS), where his local Ph.D. supervisor was R. H. Robins. His views on linguistics and on the philosophy of language were also strongly influenced by C. E. Bazell, then Head of Department. His thesis, published in 1963, ‘‘was probably the first work in semantics . . . to be based overtly on the principles of generative grammar’’ (Lyons, 1995a:233). Lyons spent a year (1960–1961) working on a machine-translation project directed by F. W. Householder at Indiana, where he could supplement his knowledge of the history of linguistics, culminating in European studies in the 19th and early 20th centuries. There he experienced the very different U.S. tradition of linguistic study in a major center for linguistics and anthropology. After his year in the United States, Lyons lectured at Cambridge until 1964, when he was appointed to the Chair of General Linguistics in Edinburgh. Lyons, a 32-year-old professor, with a cropped American haircut, remarkable intellectual and physical energy, and a dazzling French wife, caused comment in the ancient Scottish university. An inspiring teacher, he remodeled his department and its postgraduate program, attracting a galaxy of student talent, as he did subsequently in Sussex; a remarkable number of his students went on to chairs in linguistics. 
Meanwhile, he wrote the two substantial books that cemented his reputation: Introduction to theoretical linguistics (1968) and Semantics (1977), and, as founding editor of the Journal of Linguistics, established it as a leading international journal by judicious choice of initial contributors.

In 1976, Lyons moved to a new Chair of Linguistics at the University of Sussex, where he produced a stream of publications, notably Language, meaning and context (1981). During this period he was involved in the British Academy, the Social Sciences Research Council, and the European Science Foundation. In 1981, appointed Pro-Vice-Chancellor at a moment when university funding was savagely reduced, he was immediately involved in planning the elimination or reduction in size of several departments, a bruising and distressing experience. In 1984, he was elected Master of Trinity Hall, Cambridge. Within months of his return to Cambridge, cancer was diagnosed and a series of courses of chemotherapy and radiotherapy began, lasting for years. Despite the debilitating effects of the intermittently progressing disease and its treatment, the requirements of college administration, and new demands for fund raising brought about by funding reductions, Lyons brought out several papers and three books: a 3rd edition of Chomsky (1991a, with major revisions and a new chapter); Natural language and universal grammar (1991b, collected earlier papers, with copious updated notes); and Linguistic semantics (1995b, a more detailed and technical version of Lyons, 1981). His personal history was published in 2002. He retired in autumn of 2000. Lyons’ approach to semantics is stated clearly: ‘‘I take a very broad view of semantics: for me . . . linguistic semantics is the study of all the different kinds of meaning that are systematically encoded in natural languages. . .. I include within linguistic semantics a good deal of what many . . . would classify as pragmatics’’ (1981:9). In part, his eclecticism may be attributed to a rare breadth and depth of knowledge of classical linguistics, combined with 19th and 20th century European, as well as U.S., linguistic traditions, but he did not confine his reading, thinking, or research to ‘linguistics’ narrowly conceived. 
He contributed to psycholinguistic research (cf. Lyons and Wales, 1966; Lyons, 1981, 1996) and to stylistics (1980). His work is informed by anthropology, as well as by the philosophy of language and of mind. Repeatedly, he urged formal semantics to expand its interests beyond a narrow ‘intellectualist’ concern

with the truth of propositions, to include the expression of subjectivity in the use of deixis, aspect, tense, and modality (and to consider not only alethic but also epistemic and deontic modality) (Lyons, 1977, 1981, 1995a,b). Benveniste’s (1966) concern with the expression of subjectivity in language is consistently echoed by Lyons (1977, 1981, 1982, 1987, 1995a,b). Hence, he is conscious that features such as intonation and voice quality may systematically affect utterance meaning, expressing the speaker’s subjective stance toward the content of the proposition (Lyons, 1977, 1987, 1995b). Detailed attention to deixis, particularly as the way the young child arrives at the notion of reference, permeates his work (1975, 1977, 1978, 1981, 1982, 1983, 1991, 1995b).

By virtue of his publications, his influential teaching, and his support for linguistics both nationally and internationally, Lyons held for 40 years a unique position in British linguistics. His contribution was formally recognized in 1987 when he was knighted ‘for services to linguistics.’

See also: Benveniste, Emile (1902–1976); Chomsky, Noam (b. 1928); Householder, Fred W., Jr. (1913–1994); Robins, Robert Henry (1921–2000).

Bibliography

Benveniste E (1966). Problèmes de linguistique générale. Paris: Éditions Gallimard.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Lyons J (1963). Structural semantics. Oxford: Blackwell.
Lyons J & Wales R J (eds.) (1966). Psycholinguistics papers. Edinburgh: Edinburgh University Press.
Lyons J (1968). Introduction to theoretical linguistics. Cambridge: Cambridge University Press.
Lyons J (1975). ‘Deixis as the source of reference.’ In Keenan E L (ed.) Formal semantics of natural languages. Cambridge: Cambridge University Press. 61–83.
Lyons J (1977). Semantics: vols. 1 and 2. Cambridge: Cambridge University Press.
Lyons J (1978). ‘Deixis and anaphora.’ In Myers T (ed.) The development of conversation and discourse. Edinburgh: Edinburgh University Press. 88–103.
Lyons J (1980). ‘Pronouns of address in Anna Karenina: the stylistics of bilingualism and the impossibility of translation.’ In Leech G & Greenbaum S (eds.) Studies in English linguistics. London: Longman. 234–249.
Lyons J (1981). Language, meaning and context. London: Collins/Fontana.
Lyons J (1982). ‘Deixis and subjectivity: Loquor, ergo sum?’ In Jarvella R J & Klein W (eds.) Speech, place and action. London and New York: John Wiley. 101–124.
Lyons J, Coates R, Deuchar M & Gazdar G (eds.) (1987). New horizons in linguistics, 2. Harmondsworth: Penguin.
Lyons J (1991a). Chomsky (3rd, extended, edn.). London: HarperCollins.
Lyons J (1991b). Natural language and universal grammar: essays in linguistic theory. Cambridge: Cambridge University Press.
Lyons J (1995a). ‘Grammar and meaning.’ In Palmer F R (ed.). 221–249.
Lyons J (1995b). Linguistic semantics: an introduction. Cambridge: Cambridge University Press.
Lyons J (1996). ‘On competence and performance and related notions.’ In Brown G, Malmkjaer K & Williams J (eds.) Performance and competence in second language acquisition. Cambridge: Cambridge University Press. 10–32.
Lyons J (2002). ‘John Lyons.’ In Brown K & Law V (eds.) Linguistics in Britain: personal histories. Publications of the Philological Society, 36. Oxford: Blackwell. 170–199.
Palmer F R (ed.) (1995). Grammar and meaning: essays in honour of Sir John Lyons. Cambridge: Cambridge University Press.

M

Ma, Jianzhong (1844–1900)
Y Gu, The Chinese Academy of Social Sciences, Beijing, China
© 2006 Elsevier Ltd. All rights reserved.

Jianzhong Ma (马建忠), born on February 9, 1845 in the present-day Jiangsu Province, was a grammarian as well as a diplomat. He attended a missionary school in Shanghai and became a polyglot in Latin, Greek, English, and French. In August 1876 he went to study in Paris, France, majoring in diplomacy and law. In March 1880 he returned to China after completing his doctorate. He worked for the State Council and was subsequently assigned several important diplomatic missions abroad. In 1885, he worked in Shanghai, attending to post-Sino-French war business. Due to policy differences with his superiors, he resigned from the civil service post and dedicated himself to academic research. He died of a fever in 1900 in Shanghai. His best known work is Mr Ma’s grammar.

In his lifetime, Ma was a very active reformist. His everlasting fame, however, lies in his academic work known as Mashi Wentong (马氏文通) in Chinese, which can be translated as ‘Mr Ma’s grammar.’ It is generally recognized in linguistic circles that his grammar was the first systematic treatment of Chinese grammar. Some even argue that it was the first grammar of Chinese ever written.

Mashi Wentong, originally entitled Wentong, was first published in 1898 by Shanghai Commercial Press. It was an immediate success. In 1954, Zhang Xichen edited an annotated version and renamed the work Mr Ma’s grammar with commentaries; it was published by China Publishing House. The punctuated version published by Commercial Press in 1983 bore the well-known title Mr Ma’s grammar. The latest version was edited by Lü Shuxiang and Wang Haifen, who called their edited work Mr Ma’s grammar with a reading guide, and was published by Shanghai Education Press in 1986.

It is now generally held that, although the Chinese grammatical phenomena had been consciously analyzed in pre-Mr Ma’s grammar eras, the examination was sporadic and fragmented. It was Mr Ma’s grammar that gave the first comprehensive treatment of Chinese grammatical phenomena. The author was versed both in archaic Chinese and in classic Chinese philology. His grammar, however, was as a whole not an integration of all the previous studies with which he was familiar. Rather, he was more influenced by Western grammars than by similar works in Chinese. Some commentators argue that Mr Ma’s grammar drew substantial insights from French grammars, such as General and rational grammar: The Port-Royal grammar, by Antoine Arnauld and Claude Lancelot, edited and translated by Jacques Rieux and Bernard E. Rollin, published in 1975 by Mouton. So in essence, Mr Ma’s grammar should be regarded as a successful attempt at applying Western grammatical theories to the Chinese linguistic context.

The language data Ma used for illustrations were from archaic written Chinese texts. The body of work consists of 10 chapters, the bulk of which analyzes parts of speech. Words are first divided into two major classes: shi zi (i.e., lexical words) and xu zi (i.e., functional words). Shi zi and xu zi are further classified into five and four subclasses, respectively; the former are (1) noun, (2) pronoun, (3) verb, (4) static, and (5) adjective and the latter are (1) preposition, (2) connective, (3) particle, and (4) interjective.

More than 100 years have passed since Mr Ma’s grammar first appeared, a time long enough to assess its contribution. The traditional Chinese philology (called xiao xue, 小学) was threefold: wen zi (i.e., orthography), yin yun (i.e., phonology), and xun gu (i.e., lexical semantics). There was no place for grammar. Mr Ma’s grammar enriches the traditional philology with a fourth component. To many Chinese grammarians, it was quite revolutionary, and it even updated the traditional discipline to the level of modern linguistics. Some of its critics, on the other hand, accused its author of over-Westernizing Chinese grammar. The Western notions of parts of speech and the syntactic categories such as subject, predicate, object, etc., should be scrutinized to see if they were appropriate for Chinese. To Ma, ‘‘Every country has its own grammar, and all grammars are more or less similar. They differ from one another in sounds and writing scripts’’ (1983: Introduction).

See also: Arnauld, Antoine (1612–1694); China: Language Situation; Lancelot, Claude (1615–1695); Port-Royal Tradition of Grammar.

Bibliography

Jiang W (1995). Papers on Mr Ma’s grammar. Shijiazhuang: Hebei Education Press.
Ma J (1983). Mr Ma’s grammar. Beijing: Commercial Press.
Wang H (1991). Mr Ma’s grammar and Chinese grammar. Hefei: Anhui Education Press.
Xu G (1991). Xu Guozhang on language. Beijing: Foreign Language Teaching and Research Press.

Macedonia: Language Situation
V A Friedman, University of Chicago, Chicago, IL, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The Republic of Macedonia (Republika Makedonija) is bordered by Bulgaria on the east, Greece on the south, Albania on the west, and Kosovo (administered under UNSCR 1244 as of this writing, 30 October 2004) and Serbia on the north. Although efforts to create a Macedonian literary language and an independent Macedonian state date from the 19th century, it was only on August 2, 1944 that Macedonian was declared the official language of the People’s Republic of Macedonia (later the Socialist Republic of Macedonia) as one of the constitutive republics of the former Yugoslavia. The Macedonian parliament adopted its own constitution, thereby declaring independence, on November 17, 1991.

Language and the Law

The original preamble of the constitution defined Macedonia as the national state of the Macedonian people (makedonskiot narod) with full civil equality for the Albanians, Turks, Vlahs, Roms, and other nationalities (nacionalnosti). On November 16, 2001, the preamble was replaced with Amendment IV, which states that the citizens of the Republic of Macedonia, the Macedonian people, as well as citizens living within its borders who are part of the Albanian, Turkish, Vlah, Serbian, Romani, Bosniac, and other peoples, constitute the Republic of Macedonia. Articles 7, 48, and 54, and Amendments V and VIII guarantee minority language rights in administration, education, culture, and the judiciary. A key difference between the articles and the amendments is that while the articles guaranteed official use of minority languages in districts with a ‘‘majority or significant number’’ of minority language speakers, the amendments specify a significant number as 20%.

Language and Identity

In the two post-independence Macedonian censuses (1994 and 2002), six languages were in official use: Macedonian, Albanian, Turkish, Romani, Serbian (Serbo-Croatian), and Aromanian. The translation of census documents into Romani and Aromanian in 1994 represented the first such official use of these languages anywhere. Table 1, compiled from figures published by the Bureau of Statistics of the Republic of Macedonia (Zavod za Statistika Republika Makedonija), gives the figures for declared ethnicity in 1994 and 2002 and declared mother tongue in 1994. Almost all non-Macedonians speak Macedonian as a second language. The category Muslim, dating from the Yugoslav period, was used by some Slavic-speaking Muslims who considered Macedonian, Serbian, and Croatian nationalities to be Christian. With the break-up of Yugoslavia, Bosniac activists claimed all Slavic-speaking Muslims as Bosniac. The figure in the 2002 census signals an acceptance of this identity by some Macedonian-speaking Muslims. Some Macedonian speakers declared Turkish or Serbian nationality on the basis of religious feeling (Muslim and Serbian Orthodox, respectively). The distinction between Muslim as an ethnic identity and as a religious identity is illustrated by the fact that in the 1994 census, a small number declared Muslim nationality and Catholic religion. Egyptians (in Macedonian, Egipḱani or Ǵupci) are almost all Albanian-speaking Muslims (1856) or Macedonian-speaking Christians (961) whose ethnonym is cognate with the English Gypsy. These are descendants of long-settled, formerly Romani-speaking populations, but members of this group claim Egyptian descent. Judezmo (Judeo-Spanish) was spoken in Macedonia until March 11,

1943, when 7200 Jews were deported by the Nazis and their collaborators to the Treblinka death camp. Only 2% survived, and most of the survivors went to Israel, but a very small Jewish community was reestablished on December 26, 1944 and lives in Macedonia today.

Table 1 Population of the Republic of Macedonia by declared nationality (1994, 2002) and mother tongue (1994)

              1994         %       1994-MT      %       2002         %
Macedonian    1 295 964    66.5    1 332 983    68.5    1 297 981    64.18
Albanian      441 104      22.9    431 363      22.3    509 083      25.17
Turkish       78 019       4.0     64 665       3.3     77 959       3.85
Romani        43 707       2.3     35 120       1.8     53 879       2.66
Vlah          8601         0.4     7036         0.4     9695         0.48
Serb          40 228       2.0     /            /       35 939       1.78
Muslim        15 418       0.8     *            *       –            –
Bosniac       6829         0.4     /            /       17 018       0.84
Egyptian      3080         0.2     *            *       –            –
Bulgarian     1682         0.1     1448         0.1     –            –
Greek         368          0.0     –            –       –            –
Yugoslav      595          0.0     /            /       –            –
Serbo-Croat   *            *       35 095       1.8     *            *
Other         9797         0.4     38 222       1.8     20 993       1.04
Total         1 945 932    100     1 945 932    100     2 022 547    100

* Category does not apply. – Figure unavailable. / Figure pooled with ‘Serbo-Croat’.

Language in Education and the Media

Prior to 1991, as today, Albanian and Turkish had primary and secondary education, post-secondary teacher training, and academic departments at the University of Skopje. During the 1980s, support for Albanian- and Turkish-language education was curtailed, and the Albanian teachers’ college was closed in 1986. In 1995, Albanian educational activists organized a controversial Albanian-language university in Tetovo. In 2001, with support from the international community, Southeast European University (SEEU) was opened in Tetovo with teaching in Albanian, Macedonian, and English. The percentage of Macedonian-speaking students rose from 1% when school began in 2001 to 26% in the fall of 2003. There are a few elementary school classes offering Romani and Aromanian. The government-supported Albanian and Macedonian newspapers are both daily; the Turkish one is triweekly. There are private publications of various size and distribution in all of the minority languages. In 1989, only Turkish and Albanian were represented on Macedonian public television: each had 130 minutes a week. In 1991, TV programming in Romani and Aromanian was begun at 30 minutes a week. By 2000, the figures for minority-language programming on national public television (MTV 2) were 400 minutes a week for Albanian, 370 minutes a week for Turkish, and 60 minutes a week each for Romani, Aromanian, and Serbian (Antena No. 133, supplement to Dnevnik, October 27, 2000). Of 57 private local TV licenses given out in 1998 (after the passing of the Communications Act of 1997), 21 were for stations using minority languages, and of 80 private radio licenses, 26 had all or most programming in minority languages.

Macedonian in Neighboring Countries

Macedonian-speaking minorities living in neighboring countries have almost no linguistic rights. The one exception is a group of Christian Macedonian-speaking villages near Lake Prespa in southeastern Albania, which have primary education in Macedonian through grade 4. Radio programming in Macedonian is also permitted in Albania, as are cultural organizations. Macedonian activists in Greece and Bulgaria have organized cultural associations and published leaflets and newspapers, but they have been subject to government harassment.

See also: Balto-Slavic Languages; Education in a Multilingual Society; Ethnolinguistic Vitality; Identity in Sociocultural Anthropology and Language; Identity: Second Language; Language Policy in Multilingual Educational Contexts; Media and Language: Overview; Minorities and Language; Minority Languages: Education; Politics and Language: Overview; Slavic Languages; Standardization.

Language Maps (Appendix 1): Map 138.

Bibliography

Friedman V A (1985). ‘The sociolinguistics of Literary Macedonian.’ International Journal of the Sociology of Language 52, 31–57.
Friedman V A (1998). ‘The implementation of Standard Macedonian: Problems and results.’ International Journal of the Sociology of Language 131, 31–57.
Friedman V A (1999). ‘The Romani language in the Republic of Macedonia: Status, usage, and sociolinguistic perspectives.’ Acta Linguistica Hungarica 46(3–4), 317–339.
Friedman V A (2001). ‘The Vlah minority in Macedonia: Language, identity, dialectology, and standardization.’ In Nuorluoto J, Leiwo M & Halla-aho J (eds.) Selected Papers in Slavic, Baltic, and Balkan Studies (= Slavica Helsingiensia 21). Helsinki: University of Helsinki. 26–50.
Friedman V A (2003). ‘Language in Macedonia as an identity construction site.’ In Joseph B, DeStefano J, Jacobs N & Lehiste I (eds.) When languages collide: Sociocultural and geopolitical implications of language conflict and language coexistence. Columbus: Ohio State University. 257–295.
Friedman V A (2004). ‘Language planning and status in the Republic of Macedonia and in Kosovo.’ In Bugarski R & Hawkesworth C (eds.) Language in the Former Yugoslav Lands. Bloomington, IN: Slavica. 197–231.
Lunt H (1984). ‘Some sociolinguistic aspects of Macedonian and Bulgarian.’ In Stolz B, Titunik I & Doležel L (eds.) Language and literary theory (Papers in Slavic Philology, 5). Ann Arbor, MI: University of Michigan. 83–132.
Lunt H (1986). ‘On Macedonian Language and Nationalism.’ Slavic Review 45(4), 729–734.

Macedonian

V A Friedman, University of Chicago, Chicago, IL, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Modern Macedonian (makedonski in Macedonian) is a South Slavic language (Slavic, Indo-European). It is not to be confused with Ancient Macedonian, an Indo-European language of uncertain (but not Slavic) affiliation, whose most famous speaker was Alexander the Great. Macedonian is closest to Bulgarian and Serbian. Macedonian is descended from the dialects of Slavic speakers who settled in the Balkan peninsula during the 6th and 7th centuries C.E. The oldest attested Slavic language, Old Church Slavonic, was based on dialects spoken around Salonica, in what is today Greek Macedonia.

As it came to be defined in the 19th century, geographic Macedonia is the region bounded by Mount Olympus, the Pindus range, Mounts Shar and Osogovo, the western Rhodopes, the lower course of the river Mesta (Greek Nestos), and the Aegean Sea. Many languages are spoken in this region, but it is the Slavic dialects to which the glossonym Macedonian is applied. The region was part of the Ottoman Empire from the late 15th century until 1912 and was partitioned among Greece, Serbia, and Bulgaria (with a western strip of villages going to Albania) by the Treaty of Bucharest in 1913. The modern Republic of Macedonia, in which Macedonian is the official language, corresponds roughly to the southern part of the territory ceded to Serbia plus the Strumica valley. The population is 2 022 547 (2002 census).

Outside the Republic, Macedonian is spoken by ethnic minorities in Albania, Bulgaria, Greece, and Kosovo as well as by émigré communities elsewhere. Greece does not recognize the existence of its ethnic minorities, Bulgaria insists that all Macedonians are really Bulgarians, Albania refused to include questions about language and ethnicity in its last census (2001), and there has not been an uncontested statistical exercise in Kosovo since 1981, so official figures on Macedonian speakers outside the republic are unavailable; estimates range up to 700 000.

History

Modern Macedonian literary activity began in the early 19th century among intellectuals attempting to write their Slavic vernacular instead of Church Slavonic. Two centers of Balkan Slavic literacy arose, one in what is now northeastern Bulgaria, the other in what is now southwestern Macedonia. In the early 19th century, all these intellectuals called their language Bulgarian, but a struggle emerged between those who favored northeast Bulgarian dialects and those who favored western Macedonian dialects as the basis for what would become the standard language. Northeast Bulgarian became the basis of standard Bulgarian, and Macedonian intellectuals began to work for a separate Macedonian literary language. The earliest known published statement of a separate Macedonian linguistic identity was by Gjorgji Pulevski in 1875, but evidence of the beginnings of separatism can be dated to a letter from the teacher Nikola Filipov of Bansko to the Bulgarian philologist Najden Gerov in 1848 expressing dissatisfaction with the use of eastern Bulgarian in literature and textbooks (Friedman, 2000: 183) and to attacks in the Bulgarian-language press of the 1850s on works using Macedonian dialects (Friedman, 2000: 180). The first coherent plan for a Macedonian standard language was published by Krste Misirkov in 1903.

After World War I, Macedonian was treated as a dialect of Serbian in Serbia and of Bulgarian in Bulgaria and was ruthlessly suppressed in Greece. Writers began publishing Macedonian works in Serbian and Bulgarian periodicals, where such pieces were treated as dialect literature, but some linguists outside the Balkans treated Macedonian as a separate language. On August 2, 1944, Macedonian became the official language of what was then the People’s Republic of Macedonia. Bulgaria recognized both the Macedonian language and its own Macedonian minority from 1946 to 1948. From 1948 to the 1960s, some Bulgarian linguists continued to recognize Macedonian as a separate Slavic language. When Macedonia declared independence from Yugoslavia in 1991, Bulgaria immediately recognized the state, but not the nationality or the language. In February 1999, the Bulgarian government officially recognized the Macedonian standard language.

Dialects

Macedonian dialects are divided by a major bundle of isoglosses running from northwest to southeast along the River Vardar, swerving southwest at the confluence of the Vardar and the Crna and continuing down the Crna and into Greece southeast of Florina (Lerin in Macedonian), then bifurcating north of Kastoria (Kostur in Macedonian) so that the remaining Macedonian-speaking villages in Greece and Albania form a transitional zone. The western region is characterized by a relatively homogeneous central area and five groups of peripheral dialects centered on towns around the western periphery. The eastern zone has six dialect groups with no regional center. Standard Macedonian is based on the west-central dialects, with elements from other dialects.

Orthography and Phonology

Macedonian is written in the Cyrillic alphabet, following the principle of one letter per sound, as in Serbian Cyrillic. Macedonian has three distinctive letters – ќ, ѓ, ѕ – representing the voiceless and voiced dorsopalatal stops and the voiced dental affricate, respectively. Macedonian Cyrillic љ is, according to the standard (Koneski, 1967: 115), used to represent clear /l/ before consonants, before back vowels, and word-finally, where it can contrast with velar /ł/, e.g., бела [beła] ‘white’ (F) versus беља [bela] ‘trouble’. The contrast is neutralized before front vowels, where only clear /l/ is prescribed. Some educated speakers pronounce љ as palatal [ʎ], influenced by the Serbian pronunciation of this letter and the fact that the same reflex occurs in the Skopje town dialect.

Standard Macedonian has a five-vowel system (a, e, i, o, u), and most dialects outside the west-central area also have schwa, but of different origins in various regions. There is no letter to represent schwa in Macedonian Cyrillic; when it is necessary to do so, an apostrophe is prescribed. The western Macedonian dialects and the standard are characterized by fixed antepenultimate stress, e.g., vodéničar ‘miller’, vodeníčari ‘millers’, vodeničárite ‘the millers’.
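
The fixed antepenultimate stress rule can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: syllables are counted by the five standard vowels in a Latin transliteration, words of fewer than three syllables are stressed on their first syllable, and syllabic r, loanword exceptions, and clitic groups are ignored. The function name `antepenultimate_stress` is hypothetical and introduced here purely for illustration.

```python
import unicodedata

VOWELS = "aeiou"          # the five-vowel system of the standard language
ACUTE = "\u0301"          # combining acute accent used to mark stress


def antepenultimate_stress(word: str) -> str:
    """Mark stress on the third vowel from the end of a transliterated word.

    Assumption: every vowel letter is one syllable nucleus. Words with
    fewer than three vowels get stress on their first vowel.
    """
    positions = [i for i, ch in enumerate(word) if ch.lower() in VOWELS]
    if not positions:
        return word  # nothing to stress (e.g., an abbreviation)
    target = positions[-3] if len(positions) >= 3 else positions[0]
    stressed = word[: target + 1] + ACUTE + word[target + 1 :]
    # Compose vowel + accent into a single character where possible.
    return unicodedata.normalize("NFC", stressed)


if __name__ == "__main__":
    for form in ("vodeničar", "vodeničari", "vodeničarite"):
        print(antepenultimate_stress(form))
```

Applied to the three forms cited above, the sketch moves the accent one syllable to the right each time a syllable is added, which is the behavior the examples are meant to show.
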

Morphology, Syntax, and Lexicon

Macedonian has masculine, feminine, and neuter genders. Aside from plurals and pronouns, the only remnants of Slavic substantival inflection in Macedonian are the masculine and feminine vocative, which are becoming obsolete; oblique forms for masculine proper names and a few kinship terms and other masculine animates, all facultative; and a quantitative plural for inanimate nouns, which is used only sporadically, except in a few common expressions. Macedonian has a three-way opposition in the postposed definite article – -t- ‘neutral’, -v- ‘proximal’, -n- ‘distal’ – although these meanings can be based on speaker attitude as well as physical distance. The example in (1) is illustrative.

(1) raki-vče-to          ḱe   mu       go      dade-š            na  prijatel-ov         od    naš-a-na            vo  frizer-ov
    brandy-DIM-DEF.NEUT  FUT  him.DAT  it-ACC  give-2.sing.PRES  to  friend-DEF.MASC.PX  from  our-FEM-FEM.DEF.DS  in  freezer-DEF.MASC.PX
    ‘Give the little [glass of] brandy to our friend here, from that [brandy] of ours, in the freezer here.’

The article attaches to the end of the first nominal in the noun phrase, i.e., not adverbs:

(2) ne   mnogu  po-star-i-te         deca
    not  much   COMP-old-PL-DEF.PL   children
    ‘the children that are not much older’

    edna  od    mnogu-te     naš-i   zadač-i
    one   from  many-DEF.PL  our-PL  problems-PL
    ‘one of our many problems’


The Macedonian verb has both aorist/imperfect and perfective/imperfective aspectual oppositions, but imperfective aorists are now obsolete. Perfective presents and imperfects occur only after one of eight modal particles, although perfective presents can also be used in negative questions. Macedonian also developed a new perfect series using the auxiliary ima ‘have’ and an invariant neuter verbal adjective. The synthetic pasts are marked for speaker confirmation, while the descendant of the Common Slavic perfect, using the old resultative participle in -l (no longer a true participle, since it cannot be used attributively), is not marked for speaker confirmation and is therefore used when the speaker cannot or will not vouch for the truth of the statement, e.g., because it was reported: Toj beše vo Moskva ‘He was in Moscow’ (I saw him or accept the fact as established). Toj bil vo Moskva ‘He was in Moscow’ (I heard it but was not there myself, do not vouch for it, or do not believe it [nuance depending on context]).

The verbal l-form is also used in the inherited Slavic pluperfect (with the auxiliary ‘be’ in the imperfect) and the inherited conditional (after invariant modal particle bi). The new pluperfect is formed with the imperfect of ‘have’ and the neuter verbal adjective. The new conditional uses the invariant future marker ḱe plus the imperfect (perfective or imperfective) of the main verb. The bi-conditional tends to be used for hypothetical apodoses and the ḱe conditional for irrealis.

The following are distinctively Macedonian lexical items: saka ‘want, like, love’, bara ‘seek’, zboruva ‘speak’, zbor ‘word’, deka ‘that (relativizer)’, vaka ‘in this manner’, olku ‘this many’.

See also: Balkans as a Linguistic Area; Balto-Slavic Languages; Bulgarian; Church Slavonic; Clitics; Demonstratives; Dialect Chains; Diminutives and Augmentatives; Evidentiality in Grammar; Future Tense and Future Time Reference; Identity and Language; Language Change and Language Contact; Macedonia: Language Situation; Mood and Modality in Grammar; Old Church Slavonic; Perfectives, Imperfectives, and Progressives; Perfects, Resultatives, and Experientials; Standardization; Tense, Mood, Aspect: Overview; Tense; Word Stress.

Bibliography

Apostolski M et al. (1969). Istorija na makedonskiot narod II. [History of the Macedonian people.] Skopje: Institut za Narodna Istorija.
Friedman V A (2000). ‘The Modern Macedonian standard language and its relation to Modern Macedonian identity.’ In Roudometoff V (ed.) The Macedonian question: culture, historiography, politics. Boulder, CO: East European Monographs. 173–206.
Friedman V A (2002). Macedonian. (Languages of the world/materials, 117). Munich: LinCom Europa.
Koneski B (1967). Gramatika na makedonskiot literaturen jazik. [Grammar of the Macedonian literary language.] Skopje: Kultura.
Koneski B (1983). Macedonian historical phonology (Shevelov G [ed.] Historical Phonology of the Slavic languages, Bd. 12). Heidelberg: Carl Winter.
Kramer C (1986). Analytic modality in Macedonian. (Sagner O [ed.] Slavistische Beiträge Bd. 198). München: Otto Sagner.
Kramer C (2003). Macedonian: A course for beginning and intermediate students. Madison: University of Wisconsin.
Lekov I (1968). Kratka sravnitelno-istoričeska i tipologičeska gramatika na slavjanskite ezici. [A short comparative-historical and typological grammar of the Slavic languages.] Sofia: Bulgarian Academy of Sciences.
Lunt H (1952). A grammar of the Macedonian literary language. Skopje: Državno knigoizdatelstvo.
Vidoeski B (2004). The Dialects of Macedonian. Bloomington, IN: Slavica.

Machine Readable Corpora

S Bernardini, University of Bologna, Bologna, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article surveys the state of the art in corpus-aided translation research, teaching, and practice. The 1990s saw a surge of interest in these areas as corpora became more easily accessible, corpus linguistics established itself as a central approach to the study of language, and translation/interpreting gained prominence as core subjects in academic curricula and learned discussions. The focus of this article is specifically on translation, and it distinguishes between descriptive (including theoretical) aspects (Descriptive Translation Studies, or DTS for short) and applied (didactic and professional) aspects (Applied Translation Studies, or ATS for short). In so doing, we follow, albeit very superficially, Holmes’s (1988) general taxonomy of translation studies.


Corpus-Based DTS

Precursors and Initiators

In the 1990s, the corpus approach came to be recognized as one of the major research paradigms in descriptive translation studies. Mona Baker (1993) argued that the discipline was at a turning point: the availability of electronic corpora made it possible to think of research directions beyond the simple comparison of an original and its translation, with its by then discredited reliance, implicit or explicit, on the notion of ‘equivalence’ (see Approaches to Translation, Linguistic; Translational Equivalence). This brought linguistic approaches to translation studies more in line with the so-called cultural turn in translation studies (see Cultural, Colonialism and Gender Oriented Approaches to Translation) (Lefevere and Bassnett, 1990).

While Baker’s paper has generally been regarded as the main inspirer of the corpus-based approach, several precursors and co-founders should also be mentioned; first and foremost, Gideon Toury. Toury’s (1980, 1995) target-oriented approach to translation research – which viewed translation as an act that concerns primarily the culture in which a text is translated – had important implications that were taken on by the corpus approach, namely, the downplaying of the source text as the authority against which to evaluate a translation, and the concurrent enlargement of the scope of research to include aspects of the sociocultural environment in which a text is (to be) translated. Much current research in this area seeks textual evidence of the existence of “translation norms” in Toury’s sense, i.e., of those sociocultural constraints regulating the behavior of professional translators, which affect the ‘acceptability’ of any given translation at any given point in time.

Similarly, the search for translation universals is grounded in Toury’s notion of “laws of translational behavior.” One such law, the “law of growing standardization,” for instance, tentatively states that “in translation, textual relations obtaining in the original are often modified [. . .] in favour of [more] habitual options offered by a target repertoire” (Toury, 1995: 268). Kenny (2001) investigated this hypothesis, using a corpus of original German literary texts translated into English, and reference corpora of the languages in question. She found that ‘unusual,’ creative collocations indeed tended to be ‘normalized’ in translation, i.e., in as many as 44% of cases in her corpus, more ‘normal’ words and expressions were chosen to translate unusual, creative ones.

On the empirical side, precursors to this body of research were Vanderauwera (1985) and Gellerstam (1986). Vanderauwera studied Dutch fiction translated into English between the late 1950s and 1980. In her corpus of around 70 novels (analyzed manually and on paper), she found evidence of normalization at different levels: translators tended to increase the fluency and coherence of their texts, making them more modern and more readable for their target audience and concealing their foreign origin.

Gellerstam (1986) used an innovative methodology involving original novels in Swedish and novels translated from English into Swedish. Corpora of this type, often referred to as ‘monolingual comparable corpora,’ have been very influential, as we see in the section ‘Monolingual Comparable Corpora’ (MCC). Gellerstam’s aim was to find empirical evidence of translationese, i.e., that the Swedish used in translation from English was a special variety of the language heavily influenced by the English of the originals. An area in which translations are clearly influenced by the conventions of their source texts (or the genres the latter belong to) is reporting clauses, where, he found, sequences like direct speech – subject – verb of saying – adverbial were much more common in translated than in original Swedish fiction (cf., e.g., the English “‘I’ll hold it for you,’ he said with insulting kindness”).

The corpus-based approach to translation studies was thus largely characterized, particularly in its early days, by a target-oriented approach whereby the target text, language, and culture had priority over the corresponding source dimensions. At approximately the same time as Baker (1993), though in a separate line of research, Stig Johansson and colleagues (1994) at the University of Oslo were also creating resources and paradigms that would boost corpus research in translation considerably, with an eye toward the source as well as the target dimensions. In 1994, a research project was launched to build a bidirectional parallel corpus, i.e., a corpus consisting of originals and translations in two languages (the English Norwegian Parallel Corpus, Johansson and Hofland, 1994). While the main impetus for this project had come from contrastive linguistics, this design turned out to be also well-suited to translation studies.

DTS: Major Corpora and Research Results

In this section, we look at several approaches to corpus-based DTS and discuss the major new hypotheses and insights they have contributed to the field. For the sake of clarity, studies are grouped according to the typologies of corpus resources they employed. The reader should be aware, however, that this is no more than a useful artifice of presentation that

360 Machine Readable Corpora

Figure 1 A selection from the English Comparable Corpus (fiction component, Laviosa, 1998).

should not hide the underlying unity of intents and goals as well as the many more or less local differences in corpus design and research methodology. Monolingual Comparable Corpora (MCC) A monolingual comparable corpus typically consists of two subcorpora of texts in the same language. These are (ideally) similar in all regards except the conditions of production: one subcorpus contains original texts, the other translations. An example of this type of corpus is the English Comparable Corpus (Laviosa-Braithwaite, 1996), developed at UMIST (Manchester). This corpus contains two subcorpora, both in English: one of translated fiction, biographies, newspaper text, and in-flight magazines, the other of comparable original texts in these various subdomains. Figure 1 exemplifies its structure. Somewhat counterintuitively for the reader who is used to thinking of translation research as an investigation of the relationship holding between a ‘source’ text (ST) in language A and its translation, or ‘target’ text (TT), in language B, a monolingual comparable corpus does not include the translations’ source texts. This follows from the principle, put forward by Baker in 1995, that translation research should undergo a shift of focus, ‘‘a shift away from comparing either ST with TT or language A with language B to comparing text production per se with translation’’ (Baker, 1995: 233). The rationale for such a shift

is that a comparison of originals and translations along these lines might highlight linguistic patterns in translated texts that differ from those in originals and that are unlikely to result from individual preferences, differences between two language systems, and so on. Several studies have searched for evidence confirming pre-existing hypotheses on ‘‘universal’’ tendencies or ‘‘norms/laws’’ of translation behavior (Toury, 1995) (see Translation Universals). Others, taking a more ‘corpus-driven’ approach (TogniniBonelli, 2001), have attempted to identify patterns and to interpret them in terms of (old and) new hypotheses. In what follows, we briefly discuss work related to three such hypotheses, concerning, respectively, simplification, explicitation (or, more precisely, explicitness, [Schmied and Scha¨ ffler, 1996]), and (un)typicality in translation. Translated Texts Are ‘Simpler’ Than Originals (Simplification Hypothesis) MCC studies in which this hypothesis – which predates corpus approaches (Blum-Kulka and Levenston, 1983) – was investigated are Laviosa-Braithwaite (1996) and Laviosa (1998). Since the direct observation of shifts between a ST and a TT is not feasible within this approach, this article identifies three features that can be easily searched for and compared across different subcorpora of the English Comparable Corpus (ECC). These

Machine Readable Corpora 361

are sentence length (the shorter the sentences, the easier the text), information load (the fewer the lexical words, the easier the text), and lexical variety (the less varied the lexis used, the easier the text). Analyses of the fiction and newspaper subcomponents of the ECC appeared to confirm that translated texts indeed had a lower information load and lower lexical variety than comparable originals. However, while translated newspaper texts in the ECC also featured a significantly lower mean sentence length than their original counterparts, the opposite was true of fiction texts.

Translated Texts Are More ‘Explicit’ Than Originals (Explicitness Hypothesis)

This hypothesis also predates the corpus approach (cf. Blum-Kulka, 1986, on cohesive explicitness in translation). Again making use of the ECC, Olohan and Baker (2000) and Olohan (2001) compared the frequency of (a selection of) optional syntactic elements in originals and translations, the rationale being that the optional explicitation of grammatical relations was likely to result in a more explicit text. Their results highlighted unusually high frequencies of several optional elements in translations as compared with originals. To take just one example from Olohan (2001), translated texts in this corpus clearly favored the pattern ‘PROMISE + that’ over ‘PROMISE + zero’ (that is present in 89 out of the total 131 cases), whereas the opposite was true of originals (89 ‘PROMISE + zero’ out of the total 135 cases). Along similar lines, Puurtinen (2004) analyzed the use of explicit clausal connectives (conjunctions, pronouns, adverbs) in a corpus of Finnish children’s literature (part of the Corpus of Translated Finnish [Mauranen, 2004]). The corpus contains originals in Finnish and translations from English. It would follow from the explicitness hypothesis that translations should have had a higher frequency of these elements than comparable originals. 
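The three simplification indicators named in this section (mean sentence length, information load, and lexical variety) can be illustrated with a minimal Python sketch. It is not the procedure used in the ECC studies: the tokenizer, the sentence splitter, the tiny `FUNCTION_WORDS` list, and the use of a plain type-token ratio for lexical variety are all simplifying assumptions made here for illustration.

```python
import re

# Hypothetical, deliberately tiny list of English grammatical (function) words.
# A real study would rely on a full stop list or a part-of-speech tagger.
FUNCTION_WORDS = {"the", "a", "an", "on", "in", "of", "and", "to", "is", "was"}


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break on runs of '.', '!' or '?'."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def simplification_profile(text):
    """Compute the three indicators for one text (assumes a non-empty text)."""
    sentences = split_sentences(text)
    tokens = tokenize(text)
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        # sentence length: tokens per sentence
        "mean_sentence_length": len(tokens) / len(sentences),
        # information load: share of lexical (non-function) words
        "lexical_density": len(lexical) / len(tokens),
        # lexical variety: distinct word forms per token (plain type-token ratio)
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


if __name__ == "__main__":
    print(simplification_profile("The cat sat on the mat. The dog barked."))
```

Running the same function over a translated and a non-translated subcorpus would give two profiles to compare. Because a plain type-token ratio falls as texts get longer, any real comparison would need texts of similar length or a length-standardized measure.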
However, this was not always the case in Puurtinen’s data, and results were inconclusive.

Translated Texts Show ‘Unusual’ Patternings of Typical and Untypical Target Language Features with Respect to Originals ([Un]typicality Hypothesis)

This hypothesis differs from the previous ones in that it was first developed and investigated within the corpus approach. In this sense, it is truly corpus-driven. It consists of two complementary parts. On the one hand, it predicts that translations will contain patterns that are untypical of the target language (a claim that goes back to Gellerstam (1986) on translationese). On the other, it suggests that typical features of the target language that lack obvious linguistic

equivalents in the source language will be underrepresented in translation. Work on this hypothesis was carried out by Mauranen (2000), Tirkkonen-Condit (2004), and Eskola (2004), using the Corpus of Translated Finnish. Mauranen (2000: 137) found that a word like toisaalta (meaning approximately “on the other hand”), which she described as “highly target-language specific,” was much more common in originals than in translations. Similarly, Tirkkonen-Condit (2004) investigated two clitics and a set of Finnish verbs that lack obvious lexicalized equivalents in Indo-European languages. She found that all these elements were underrepresented in translation and showed unusual grammatical and collocational patternings. While those studies were situated toward the lexical end of the lexicogrammatical continuum, Eskola (2004) investigated typical and untypical syntactic structures in a narrative prose subcorpus of the Corpus of Translated Finnish. Findings from this study confirmed those from previous ones with regard to target-language-specific constructions (underrepresented and showing unusual patterning in translation) and further provided evidence of translationese (overrepresentation of a source-language typical structure).

Bi- and Multilingual Parallel Corpora (Including Reference Corpora)

Bi- and multilingual parallel corpora are collections of originals in language A and their translations in language B (C, D, etc.). 
While not wholly innovative as a resource for translation research, these corpora differ from single pairs of translated texts inasmuch as (a) they are searchable by means of computerized techniques of analysis, (b) they allow the analyst to abstract away from single texts in an attempt to unveil regularities across a ‘representative’ body of data (e.g., German literary fiction translated into English; Kenny, 2001), and (c) they may avail themselves of reference corpora of the source and target languages in order to set their analyses against the background of general language use. The aim is, once again, to arrive at generalizations about translation strategies and translation universals/norms/laws. The parallel corpus methodology is sketched in Figure 2. Issues that have been investigated using this methodology include sanitization/normalization (the hypothesis that translated texts are less creative overall than their source texts) and explicitation/implicitation (the hypothesis that translated texts are more/less explicit overall than their source texts).

Translated Texts Are Less ‘Creative’ Than Their Source Texts (Normalization/Sanitization Hypothesis)

This hypothesis was investigated by Kenny in


Figure 2 The parallel corpus methodology (exemplificatory data from Danielsson, 2001).

her study based on Gepcolt, a corpus of 14 contemporary German novels and their translations into English. Her aim was to find out whether translators “typically draw on more conventional target language resources to replace unconventional [. . .] lexical features in source language texts” (Kenny, 2001: 111). By isolating instances of lexical creativity in the source corpus with the help of a source-language reference corpus (the Mannheim corpora of the Institut für deutsche Sprache, Mannheim) and evaluating the originality of the corresponding target solutions with the help of a target-language reference corpus (the British National Corpus), Kenny was able to find evidence of a tendency toward normalization in translation, though she admitted that this was by no means a universal, constant feature.

Translated Texts Are More/Less ‘Explicit’ Than Their Source Texts (Explicitation Hypothesis)

In order to investigate this hypothesis, which is complementary to the ‘explicitness’ hypothesis already discussed, researchers have compared source and target texts, looking for evidence of ‘explicitating shifts,’ i.e., shifts leading to “a rise in the level of cohesive explicitness in the TL text” (Blum-Kulka, 1986: 19). Thus, Øverås (1998) analyzed the first 50 sentences of 20 English novel extracts and their Norwegian translations (from the English Norwegian Parallel Corpus). She looked for such shifts at different rank levels (from word level to clause level) and also recorded instances of implicitation shifts, i.e., changes that would result in a more implicit text, thus disconfirming the explicitation hypothesis. This study found confirmation of the explicitation hypothesis (347 explicitating shifts), but the author pointed out that implicitation was also attested (149 implicitation shifts).

Bidirectional Corpora and Mixed Designs

As can be seen from the discussion in the previous section,

the insights offered by parallel corpora are often complementary to those arrived at through monolingual comparable corpora. Monolingual comparable corpora have provided a new angle from which to look at translated text. But, in the words of Toury (2004: 17), “it is one thing to say that certain regularities were found in translation, and something quite different – to claim that the observed regularities were there because it is translation.” MCC cannot shed any light on the reasons behind the observed patternings. For this reason, a number of studies have taken advantage of bidirectional, or reciprocal, corpora. These corpora are basically a combination of two parallel corpora assembled according to the same design. Thus, for instance, the pioneering English Norwegian Parallel Corpus used by Øverås (1998) has the structure shown in Figure 3. Subcomponents A-B and C-D form parallel corpora of English (ST) → Norwegian (TT) and Norwegian (ST) → English (TT), respectively. Since the two corpora were designed so as to be comparable, subcomponents A ↔ D and C ↔ B constitute monolingual comparable corpora of English (O and T) and Norwegian (O and T), respectively. Other combinations are also possible, but they are less relevant to DTS. This corpus design is of particular interest to translation research for two reasons. First, it allows comparable studies of translation shifts to be carried out in two directions. The study by Øverås (1998) actually consisted of two parts, i.e., the analysis of the first 50 sentences of 20 English novel extracts and their Norwegian translation was complemented with an analysis of 20 Norwegian novel extracts translated into English. 
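The four-part bidirectional design just described can be made concrete with a small data-structure sketch. The labels A to D follow the description above; the `Subcorpus` class and the `pairing_type` function are hypothetical names introduced here, and classifying a pairing purely by language and status is an assumption that holds only for this four-part design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subcorpus:
    label: str
    language: str
    status: str  # "original" or "translation"


# The four subcomponents of a bidirectional English/Norwegian corpus.
A = Subcorpus("A", "English", "original")
B = Subcorpus("B", "Norwegian", "translation")  # translations of A
C = Subcorpus("C", "Norwegian", "original")
D = Subcorpus("D", "English", "translation")    # translations of C


def pairing_type(x, y):
    """Name the kind of comparison that a pair of subcomponents supports."""
    same_language = x.language == y.language
    mixed_status = x.status != y.status
    if mixed_status and not same_language:
        return "parallel"                # ST and TT: A-B, C-D
    if mixed_status and same_language:
        return "monolingual comparable"  # O and T in one language: A-D, C-B
    if not same_language and x.status == "original":
        return "bilingual comparable"    # originals in two languages: A-C
    return "other"


if __name__ == "__main__":
    for x, y in [(A, B), (C, D), (A, D), (C, B), (A, C), (B, D)]:
        print(x.label, y.label, pairing_type(x, y))
```

The sketch only makes explicit that a single collection yields both parallel and monolingual comparable pairings, which is what lets findings from one kind of comparison be checked against the other.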
Øverås found that the tendency to explicitate/implicitate differed depending on the translation direction: 347 instances of explicitation and 149 instances of implicitation were attested in translation from English into Norwegian, versus 248 instances of explicitation and 76 instances of implicitation in translation from

Norwegian into English (on language- and direction-specific shifts, see Teich, 2003; also the section ‘Recent Developments and Ways Forward’). Second, this corpus design made it possible to check hypotheses made on the basis of MCC against the relevant source texts. Thus, for instance, Pápai (2004) studied explicitation in a small corpus of literary and technical writing that consisted of three subcomponents: (a) originals in English, (b) their translations into Hungarian, and (c) comparable Hungarian originals. Her combined analysis confirmed that explicitness/explicitation is indeed a characteristic feature both of the translation process from English into Hungarian and of the resulting translation product. A similar corpus design was adopted by Puurtinen (1998) in her study of readability and ideology in Finnish original and translated children’s literature. Here, the English source texts served as a source of explanation for the unusual patternings observed in translated Finnish with respect to original Finnish. Other bidirectional corpora currently available are, among others, Compara for English/Portuguese (Frankenberg-Garcia and Santos, 2003), the English Swedish Parallel Corpus (ESPC, Altenberg and Aijmer, 2000), and the English German Translation Corpus (Schmied, 2002).

Limits, Recent Developments, and Ways Forward

This summary discussion of corpus-based DTS may give the reader an idea of the vitality of the field. In the 1990s, corpus studies revived interest in the linguistic approach to translation research and produced a host of hypotheses about the nature of translation that will keep researchers occupied for years to come. In more recent years, however, several studies have also pointed to a number of limits and possible pitfalls of this body of work. At the same time, more linguistically informed analyses and new research directions have also been undertaken, in an attempt to both deepen and widen the paradigm. These developments give us high hopes for the future of corpus-based DTS.

Figure 3 Structure of a bidirectional corpus (the English-Norwegian Parallel Corpus, Johansson and Hofland, 1994).

Limits

A first point worth mentioning with regard to the limits of the corpus-based methodology in translation studies concerns the difficulty of maximizing the comparability of the various subcomponents and at the same time ensuring that the corpus is representative of a larger, socioculturally plausible set (e.g., ‘children’s fiction,’ ‘popular science articles’). These issues apply to corpus linguistics in general but are crucial for translation research, where the variables to be taken into account when designing the corpus are often doubled with respect to monolingual corpus studies and where the corpora used are typically smaller. Bernardini and Zanettin (2004) provided a more thorough discussion of this point. Another issue that has been the target of some criticism is the tendency for some studies to draw conclusions based solely on frequency counts of very shallow textual features that can be easily and automatically harvested. Mauranen (2002) discussed differences in the patterns of use of certain Finnish connectors in her monolingual comparable corpus, which may be interpretable in terms of a tendency for translators to adhere to local source text stimuli and/or of a complementary tendency not to adopt ‘domesticating’ strategies at the level of discourse. These differences are rather subtle and would escape a simple frequency count (i.e., the connectors themselves are neither under- nor over-used in translated vs. original texts but are simply used differently). A similar criticism was voiced by Teich (2003), who pointed out that methods that rely on simple automatic counts of features (e.g., type-token ratio, ratio of grammatical to lexical words, average sentence length) essentially operate at the level of words. She suggested that “the gap between word counts and [. . .] explanations lying in the specific cognitive processes involved in translation, is simply too wide. Categories of intermediate levels of linguistic abstraction, such as grammatical or semantic categories, would need to be applied in corpus analysis in order to even have a chance of explanation of the variation found in translation” (Teich, 2003: 22–23). One last criticism of this approach is possibly some overenthusiasm. Several studies that set out to provide only descriptive evidence about translation did not resist the temptation to draw conclusions not warranted by such evidence. Reflections on the


theoretical and methodological implications of this body of work have recently started to appear. These have suggested that claims about translation ‘universals’ may be better recast in terms of ‘laws’ and ‘norms’ (Toury, 1995) or ‘memes’ (Chesterman, 1997), given the very partial data on which they are by necessity based and given the difficulties of isolating variables and identifying causes for observed regularities. Toury (2004) warned against the risk of inferring universals directly from regularities that can be spotted automatically while stressing that the identification of causes for observed patterns and behaviors is a weak spot of the corpus approach. In conclusion, it would seem that if insights into translation universals are to emerge from corpus-based research, they are likely to do so from an accumulation of local studies that are restricted in focus and that control as far as possible the explosion of variables that almost by definition characterizes translation. The inclusion in the design of a parallel component is also virtually unavoidable if one is to go beyond mere hypotheses about the motivations behind observed patterns.

Recent Developments and Ways Forward

Some recent studies have attempted to go beyond the research designs described in this article, constructing innovative corpus resources and testing alternative methodologies of analysis. One promising new resource that was recently completed at the University of Oslo is a “multiple translation corpus” (Figure 4). This corpus consists of two English originals (a short story and [part of] an academic paper) and 10 translations of these STs into Norwegian. The translations were commissioned from established professionals who worked independently and handed in a draft and a polished version. The 10 translations and their respective source texts are

aligned, thus making it possible to observe the range of variation and the effect of individual strategies, abstracting away from or else contrasting literary vs. technical translation. There is no denying that the resulting product is an artificial construct that does not resemble anything that could exist in reality. This drawback is unavoidable if one wishes to analyze translations that are (a) contemporary, (b) produced independently of one another, and (c) technical as well as literary. An alternative to this design is the multitarget collection of Hans Christian Andersen’s short stories and their English translations, discussed, e.g., in Malmkjær (2003). These translations, spanning more than a century, were likely to have been influenced by previous translations of the same texts. Thus, while they provide a rich source of insights about translation in a diachronic perspective, they are less than ideal when the goal is to control variables as tightly as possible in order to isolate alternative translation strategies and regularities. Moving on from innovative corpus designs to methodologies, recent attempts have been made to combine existing approaches and/or adapt those from neighboring areas. Kujamäki (2004), for instance, combined classic elicitation tests and corpus-based studies of student translations to shed light on the likely cause of the underrepresentation of target-language-unique items in translation. Learners were asked to back-translate an English or German translation of a text, originally written in Finnish, that contained Finnish culture-specific words (referring to snowdrifts and road conditions) that are not lexicalized in English and German. When translating, the students tended to overlook these equivalents. When prompted by a cloze test in their native language, however, a majority produced them. Thus, by combining cloze tests and translation learner

Figure 4 Structure of the Oslo multiple translation corpus (Johansson, 2003).


corpora (see the section ‘Translation Learner Corpora’), Kujamäki was able to provide evidence linking Tirkkonen-Condit’s unique items hypothesis (2004) and Toury’s (1995) law of interference, which states in rather general terms that a target text surface form is likely to be influenced by its source text surface form. By drawing on a rich combination of theories and practices – a model of cross-linguistic variation based on systemic-functional linguistics, a contrastive typology of the English and German grammatical systems, bilingual parallel (ST ↔ TT), monolingual comparable (O ↔ T), and bilingual comparable (SL-O ↔ TL-O) analyses of texts – Teich (2003) developed a comprehensive approach to corpus-based translation research. Her aim was to shed light on two apparently conflicting hypotheses: (a) that the source language will ‘interfere’ with the target text formulation (SL shining through) and (b) that a translation will tend to conform to the properties of the target language (TL normalization). The corpus used for this purpose is a bidirectional collection of German originals with their English translations, and English originals with their German translations (along the lines of the ENPC, see Figure 3). Furthermore, it is register controlled, containing only popular-scientific writing (by experts addressing nonexperts). Teich found that both shining through and normalization could be observed in translation and that English TTs tended to have more normalization, whereas German TTs were characterized by more shining through. The reasons behind these findings were hypothesized to be related to the differences between the two language systems involved:

Where the target language has more options in a particular grammatical system, it can afford to let the source language shine through. Where the target language has fewer options [. . .] it has to compensate; where the same compensatory means is used frequently, we encounter TL normalization. (Teich, 2003: 219)

On the methodological side, Teich concluded that an account of normalization could not be provided on the basis of an analysis of monolingual comparable corpora alone; a contrastive analysis of original comparable texts in two or more languages and an account of their grammatical systems were necessary first steps in order to define a common ground for comparison. Summing up, there are signs within corpus-aided DTS that the discipline is coming to terms with the complexity of its research object, taking stock of previous studies and work in related disciplines in

order to develop more theoretically and methodologically sound approaches.

Corpus-Based ATS: Translation Teaching and Translation Practice

Moving on from translation theory/description to practice, in this section we survey applications of corpora to the teaching of translation (in educational settings), as well as to the practice of translation (in professional settings). Long before the advent of electronic corpora, translators and translator trainees collected reference texts, in their source and target languages, representative of the subject area/genre of a translation task. The translators read these texts, often called ‘parallel texts,’ from beginning to end to familiarize themselves with a topic or to search for terms and their translations in authentic contexts of use. Nowadays, bilingual comparable corpora and ‘concordancers’ (search tools that display a given term or phrase in context) provide a shortcut to sidestep this laborious preparatory work. Large amounts of texts can be searched for terms, expressions, and concepts while a translation is being carried out, allowing even nonexpert translators to produce texts that are terminologically appropriate and that conform to the conventions of the target discourse community. Thanks to these new resources (and in general to the much greater availability of electronic texts), translators can accept a wider range of jobs, including translation into a foreign language (L2). It is therefore no surprise that corpus use became popular in translator education in the 1990s, with more and more teachers requiring learners to build specialized corpora for their (technical) translation assignments and including corpus construction and use as one of the components of their translation syllabus, on a par with, say, dictionary use and documentation. There is no denying that these developments have been favored by the ease with which individual teachers and learners have been able to assemble their own corpora off the Internet. 
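To make the notion of a concordancer concrete, here is a minimal keyword-in-context (KWIC) sketch in Python. It illustrates the general technique only and does not reproduce any particular tool: the function names `kwic` and `format_kwic`, the `width` parameter, and the two sample sentences are all invented for illustration.

```python
import re


def kwic(texts, pattern, width=30):
    """Return (left context, keyword, right context) for each regex match."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for text in texts:
        for match in regex.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            hits.append((left, match.group(0), right))
    return hits


def format_kwic(hits, width=30):
    """Align the hits in columns so the keywords line up vertically."""
    return [f"{left:>{width}} {key} {right}" for left, key, right in hits]


# Invented sample texts standing in for a small specialized corpus.
SAMPLE = [
    "Rose Cottage sleeps 4 in two bedrooms.",
    "The barn sleeps 2-5 and has a shared pool.",
]

if __name__ == "__main__":
    for line in format_kwic(kwic(SAMPLE, r"\bsleeps\b")):
        print(line)
```

Because the query is a regular expression, a pattern such as `r"\b(?:sleep(?:s|ing)?|slept)\b"` would retrieve several forms of a verb at once, at the cost of more lines to read through.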
A number of other uses of corpus tools in the translation classroom have also seen the light, particularly in the area of research into translator trainees’ production. Surprisingly, on the other hand, the response of the translation industry has been rather more tepid. Corpus use has yet to win general acceptance among professionals, and this is all the more puzzling if one considers the huge success of other technological aids, which in many respects are not wholly dissimilar – namely, translation memories within computer-aided translation (CAT) tools.


Figure 5 An excerpt from a farmhouse vacation online advertisement.

Corpora in the Translation Classroom: Comparable and Parallel Designs

The types of corpora that have been used most often as tools for translation teaching follow either the ‘bilingual comparable’ or the ‘bilingual parallel’ design. To these we should add monolingual corpora of the target language, which can assist in the identification of the correct terminology and style/register for a translation. Corpora following this design, however, have a wide range of applications besides translation (e.g., text analysis and LSP teaching), for which reason this article does not specifically focus upon them (but see Figure 8 and Figure 9). As just described, bilingual comparable corpora (or ‘comparable corpora’ for short) are collections of ‘similar’ original texts in two languages (dealing with the same topic, belonging to the same genre, produced in the same period, etc.), while bilingual parallel corpora (‘parallel corpora’) are collections of originals and their translations. Clearly, these types of corpora have different strengths and weaknesses and there are different technical requirements for their analysis. They are accordingly discussed separately.

Comparable Corpora

Let us take a concrete example. The description of a vacation farmhouse typically has, after an evocative description of the resort and its beauties, a section with details of the accommodation and the facilities on offer, often structured as a list, as in Figure 5. Suppose we had to translate this text into Italian. While the context may suggest an interpretation, I would expect the very reduced form of the first statement (‘sleeps 2–5’) to be confusing for some translator trainees. This use of the verb sleep to mean ‘have a given number of beds’ is indeed attested in dictionaries and has a reasonable number of occurrences in the British National Corpus (25 for the word form sleeps). Yet sifting through the evidence may be time-consuming and confusing and may give

Figure 6 Selected concordance lines for sleeps in a corpus of farmhouse vacation description web pages.

little or no indication of how typical this use is of the text typology to be translated. A specialized corpus is in this case much more helpful in a learning setting. A search for sleeps in an automatically constructed corpus of English farmhouse vacation descriptions (about 1 million words from web pages) yields the concordance lines in Figure 6 (selection). Reading these lines, a learner may get a clearer idea of the meaning of sleeps in her or his text, including the fact that the word itself is at the core of a larger frame, quite frequently used in this genre, that takes the form shown in Figure 7. Clearly, a more accurate search strategy would involve looking for other forms of the verb lemma SLEEP in this context, and we might wish to point this out to our learners. Yet the evidence available is probably enough for this rather trivial point. We are now left with the problem of finding a contextually appropriate translation for this lexical item. A comparable corpus of Italian farmhouse vacation (Agriturismo) descriptions may help. While usually not providing direct translation equivalents, comparable original texts in the target language are likely to express similar notions (the number of guests a vacation house can accommodate probably being a core piece of information these texts are expected to


Figure 7 The sleeps frame.

Figure 8 Selected concordance lines for ‘persone’ in a corpus of Agriturismo web pages.

convey). In order to find out how this notion is expressed in Italian, we can start by looking for words that are likely to belong to an equivalent lexical item (a method described in detail in Tognini-Bonelli and Manca, 2002). Clearly, house or cottage would be a poor candidate in this case, as they may appear in many other patterns, and numbers are difficult to search for as a class in a raw text corpus. An alternative would be searching for an Italian equivalent for people, i.e., ‘persone’. The concordance lines in Figure 8 are taken from an automatically constructed corpus of Italian Agriturismo descriptions (about 600 000 words from web pages). A possible equivalent for ‘sleeps 2-5’ might thus be hypothesized to be ‘per 2-5 persone’ (literally ‘for 2-5 people’). This search strategy would, however, limit our chances of discovering other expressions that are more distant from the English but possibly more current in Italian. An alternative strategy might be to search for possible equivalents of the title ‘details of

the house,’ hypothesizing that the expression we are trying to identify would occur at a similar position in the text in both English and Italian. A search for ‘dettagli’ (‘details’) gives no relevant results, but a search for ‘caratteristiche’ (‘characteristics’) yields the example shown in Figure 9. The results of a subsequent search for ‘posti’ (‘places’) are shown in Figure 10. We now have an alternative translation for ‘sleeps 2-5’, namely ‘da 2 a 5 posti letto’ (literally ‘from 2 to 5 bed places’), which seems more natural and certainly appears more frequently in the corpus than our first solution (‘per 2-5 persone’). Comparable corpora may help trainee translators in their understanding of the meaning and structure of a source text, by drawing attention to both subject matter and genre regularities (and irregularities) across comparable exemplars (Aston, 1999); they also assist a nonexpert translator in the production of a target text that not only is factually correct but also is appropriate to the target genre conventions in terms of register and information structure (even when translating into the L2; see Zanettin (1998) on classroom activities based on comparable corpora).

Parallel Corpora

As mentioned in this article, parallel corpora are, at their simplest, collections of originals in one language aligned to their translations in another. Other types of parallel corpora exist, e.g., those consisting of different translations of the same text in the same language (see the section ‘Recent Developments and Ways Forward’). The process of alignment consists in the establishment of bonds between units in the texts (usually – for our purposes, at least – at the level of the sentence or of the paragraph) such that a search in one language retrieves the searched word or expression in context, as well as the corresponding unit (sentence or paragraph) in the other language. 
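The retrieval behavior just described, in which a hit in one language brings back the aligned unit in the other, can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions: the sentence pairs are invented, the alignment is taken as given and strictly one-to-one, and `parallel_concordance` is a hypothetical name rather than a function of any existing concordancer.

```python
import re

# Invented sentence-aligned pairs (English source, Italian target).
ALIGNED = [
    ("The apartment sleeps 4.", "L'appartamento dispone di 4 posti letto."),
    ("Breakfast is served in the garden.", "La colazione viene servita in giardino."),
    ("The farmhouse sleeps 2-5.", "Il casale ha da 2 a 5 posti letto."),
]


def parallel_concordance(pairs, query):
    """Return every (source, target) pair whose source unit contains `query`."""
    regex = re.compile(rf"\b{re.escape(query)}\b", re.IGNORECASE)
    return [(src, tgt) for src, tgt in pairs if regex.search(src)]


if __name__ == "__main__":
    for src, tgt in parallel_concordance(ALIGNED, "sleeps"):
        print(f"{src}\n    {tgt}")
```

Aligned text does not always pair one sentence with exactly one sentence, so a fuller representation would let each bond link groups of units on either side.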
In order to use a parallel corpus, it is first necessary to align it; this can be done either manually or by using ad hoc software programs, some of which are freely available on the


Figure 9 An example from the corpus.

Figure 10 Selected concordance lines for ‘posti ’ in a corpus of Agriturismo web pages.

Web. The corpus can then be searched by means of a parallel concordancer (which may also include its own aligner). Let us consider an example. The selected concordance lines for the word paper (English-Italian) in Figure 11 were obtained using the parallel concordancer ParaConc (Barlow, 1995) on a subset of a freely distributed corpus of EU Parliamentary Proceedings, which comes already aligned and is available in 11 European languages (the corpus was developed by Philip Köhn at the University of Southern California). The search for equivalents was facilitated by the software, which suggests possible candidates, but a degree of post-editing was necessary to arrive at the output shown here. The search word and its translation equivalent are underlined and boldfaced, while relevant collocates are in boldface only. As can be seen, a parallel concordance gives fast and easy access to a range of possible equivalents for

a given word or expression. By showing it in context, the concordance provides information about the more or less set phrases in which it occurs and about the collocations and colligations it forms with other words in texts. Normally, the concordance would also provide an indication of the word or expression’s frequency, i.e., whether it appears a few times and/or only in a small number of texts or whether it has wider currency; for obvious reasons of space, this point cannot be exemplified in this article. From the perspective of terminology, it is important, for instance, to make sure that ‘white paper’ is translated as ‘libro bianco’ (literally ‘white book’), the official label for this type of document, and not as, say, ‘white document’, which would be a more plausible candidate, since a working paper is ‘un documento di lavoro’ in Italian, and a policy paper, ‘un documento programmatico’. From the perspective of register and style, it may be interesting for learners to note that the expression ‘on paper’ has been translated once as ‘sulla carta’, which is a literal translation, once with the adverb ‘formalmente’ (formally), and once with the semifixed expression ‘RIMANERE lettera morta’ (REMAIN a dead letter), depending on the context. While evidence from parallel corpora has to be taken with a grain of salt – especially if we are not sure of the quality of the translations in the corpus and/or our learners have yet to develop the necessary critical skills – these corpora clearly make a very useful tool. This is not only because the search for equivalents is far less complex than is the case with comparable corpora but also because they contain problem-solution pairs that are unique to the translation situation and that no comparable corpus could feature. Pearson (2003) illustrated this point with examples of classroom exercises, using a corpus of Scientific American articles and French translations published in the periodical Pour la Science. 
Translating the affiliation of researchers whose work was reported on in these articles is a problem that learners have to solve and for which comparable corpora are of little assistance. A parallel corpus, on the other hand, provides numerous examples. These suggest that when translating the name of a foreign university, translators tend to opt for a name that indicates where the university is, rather than its official name (e.g., ‘State University of New York at Albany’ becomes ‘L’Université d’Albany’).

Figure 11 Parallel concordance lines for the word paper and its Italian translations (EU Parliamentary proceedings).

Other Teaching Applications

Translation Learner Corpora

The translation learner corpus, an extension of the parallel paradigm motivated by a growing interest in learner corpora (Granger et al., 2002), is typically a collection of subcorpora, each of which contains multiple translations into the same language of a given text. Like multilingual parallel corpora, translation learner corpora are aligned in order to be searchable. While superficially similar, the two resources are used differently, the emphasis being, in the case of learner corpora, not so much on the observation of professional strategies as on peer-to-peer awareness-raising tasks concerning one’s own translation decisions. The example in Figure 12 is taken from a home assignment for students of translation at the School for Translators and Interpreters of the University

of Bologna, Italy. The source text is an article, ‘Sustainability,’ from Wikipedia, the free online encyclopedia. Translator trainees sometimes have difficulties thinking of alternative formulations after they have come up with one solution. A corpus of student translations may help in this instance, allowing them to compare their choices with those of their fellow students (and possibly of the teacher or of a professional) and encouraging them to reformulate more and entertain different hypotheses before making a choice. From the teacher’s perspective, this corpus may assist when evaluating a student’s work: the relative merits of different choices can be compared, and their appropriateness can more easily be shown to be a gradient, going from optimal to unacceptable, rather than a right/wrong question. Furthermore, it may give indications of weak spots shared by large sections of a class, suggesting topics for remedial units.

Figure 12 Multiple student translations into Italian of the English expression ‘X is too many people.’

This type of corpus has a number of advantages: it recycles existing material, copyright is relatively easy to obtain, and it can be of some use in the translation classroom from early on. The corpus from which the example in Figure 12 is taken, for instance, contains, at the time of writing, only 10 translations of a single text, produced by learners as a home assignment in their first course in translation from English into Italian. Assembling such a corpus may take no more than 2 weeks, and it can then be expanded in several directions: a longitudinal component can be incorporated whereby translations by the same students are added in subsequent years; conditions can be varied, e.g., collecting exam papers on comparable texts, or home assignments on different ones; a system of annotation may also be devised, possibly including an error-tagging system (whereby errors are classified in

some way and marked in the corpus, to facilitate retrieval of similar instances). Bowker and Bennison (2003) discussed translation learner corpora in detail, made suggestions about ways of automating the process of corpus construction, and gave examples of possible applications.

Building Corpora

A number of researchers have pointed out that not only using a corpus but also building one may be a valid learning experience for future language professionals. This is both because constructing ‘disposable’ specialized corpora (Varantola, 2003) and managing them are perceived to be valuable professional skills in their own right and because a self-constructed corpus is easier to use and to evaluate (Aston, 1999; Maia, 2000). Varantola (2003) discussed a series of Web-based corpus-construction activities carried out by translation students working in groups on specific projects. She pointed out that the problem was no longer text availability, as was the case before the advent of the World Wide Web, but rather “the relevance, adequacy, reliability and the analysis of this context-sensitive corpus material” (Varantola, 2003: 57). Thus, learners have to develop strategies for sifting through the texts available on the Web to find appropriate, authoritative, congruent texts (style-, register- and content-wise) with respect to a given translation task (note that a careful preliminary ST analysis is thus also required).

Professional Applications

In the previous section, corpus construction and use were introduced as activities that have both a didactic, awareness-raising value and a professionalizing function. Many translation teachers nowadays see corpus management as one of the documentation skills a professional translator requires. As mentioned in the section ‘Corpus-Based ATS: Translation Teaching and Translation Practice,’ this belief is only partly shared by the translation profession itself, where corpus use still seems to be a niche activity.

In May 2004, I posted a message to the Lantra-L mailing list asking professional translators whether they knew or used corpora and concordancing software (I explained what I meant by these two terms) and whether they felt the Web was a more practical alternative these days. Lantra-L is one of the largest mailing lists for discussion of translation and interpreting issues, with, on average, 150 to 200 postings a day. Only five people replied. Their positions can be summarized as follows: (a) I have been using corpora (of English, German, Dutch, French, Czech) (two respondents); (b) I have been using translation memories with more advanced search facilities (one respondent); (c) I have been using corpora, though as a language teacher, not a translator (one respondent); (d) I have been using parallel texts (one respondent). In general, there was agreement that the Web, while useful, does not provide search features flexible enough for the needs of language professionals.

These findings support those of Bowker (2004), who surveyed translation-related job ads and professional association literature in Canada. She found that job ads rarely focused on the ability to use specialized translation software in the first place, and corpus use was never referred to; as for newsletters of professional associations, they tended to focus mainly on translation memories as tools for translators, with few if any references to corpora and concordancers.

In this section we briefly review several areas of relevance to the translation profession in which corpora are (or could be) playing a role.

Computer-Aided Translation (CAT) Systems

Computer-aided translation systems typically include a translation memory manager, i.e., a computer program that creates a database of aligned ST-TT unit pairs as a translation is being carried out. As new ST units are added to the work environment, the translation memory manager compares them with those available in memory and, in case of partial or total matches, presents the user with the corresponding TT unit. Completed translations can also be entered post hoc into the translation memory; for this purpose, the package usually includes aligners, which speed up the process of creating matches between units smaller than the text (normally sentences). The resulting product is superficially very similar to an aligned parallel text, but as we look closer, differences appear.

First, a translation memory is normally searched by a computer program, rather than the human end user. Ideally, it should contain units that correspond as closely as possible to each other, because any macroscopic differences (e.g., additions, deletions, gaps) between the ST and the TT are actually detrimental to achieving high levels of match between the two aligned versions. This is not so in parallel corpora, where these differences are meaningful.

Second, each unit pair is considered in isolation from the texts to which it belongs. Units may be ordered alphabetically or chronologically, and a user can decide to import only selected units rather than whole texts. Thus, textual features that are above- or cross-sentence level (e.g., aspects of text cohesion or instances of compensation) are impossible to observe. Through parallel corpora and concordancers, on the other hand, users can typically access both one-line concordances and larger contexts (even whole texts) and move between these two levels of analysis.
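The lookup step performed by a translation memory manager, as described at the start of this section, can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the memory contents are invented, the function name and the 0.75 threshold are hypothetical, and the character-based similarity ratio from Python's standard difflib module stands in for whatever matching measure a given commercial tool actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: stored ST unit -> stored TT unit.
MEMORY = {
    "Press the power button to start the device.":
        "Premere il pulsante di accensione per avviare il dispositivo.",
    "Remove the battery before cleaning.":
        "Rimuovere la batteria prima della pulizia.",
}


def lookup(memory, new_unit, threshold=0.75):
    """Return (score, stored ST unit, stored TT unit) for the best match.

    A score of 1.0 corresponds to a total match; scores between the
    threshold and 1.0 correspond to partial (fuzzy) matches. None is
    returned when nothing in memory is similar enough.
    """
    best = None
    for source, target in memory.items():
        # Assumption: a plain character-level ratio, case-insensitive.
        score = SequenceMatcher(None, new_unit.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None


if __name__ == "__main__":
    print(lookup(MEMORY, "Press the power button to stop the device."))
    print(lookup(MEMORY, "The meeting is adjourned."))
```

Note that the lookup treats each stored unit in isolation, which is exactly the property that distinguishes a translation memory from a parallel corpus in the discussion here.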
Third, translation memories are by their very nature ‘prescriptive’ tools that should include only gold standard translations, i.e., translations provided by the commissioner or agency for a specific task, typically previous versions of the same text that need to be updated. Parallel corpora, on the other hand, follow from a more ‘descriptive’ principle whereby it is suggested that translators may find it useful to check the strategies of fellow translators or acquaint themselves with a given topic or field. It does not follow, however, that they should necessarily adopt the solutions they find in the corpus. Parallel corpora are therefore more adaptable to different tasks, as long as the end user is aware of their structure, contents, and limitations.

A new generation of tools currently being introduced into the market addresses the problem of integrating facilities for searching corpora with standard translation memory technology. These ‘hybrid’ tools (Bowker, 2004) combine the traditional translation memory approach to automatic search and retrieval of perfect and fuzzy matches with the corpus searching approach. Thus, it is possible for the user to select or type in a word or phrase and retrieve for analysis the matching pairs that contain it. While the search possibilities are still rather crude, these systems would appear to go in the right direction of integrating different approaches to the reuse of existing materials. There is general agreement in the profession that translation memories are useful only for a very restricted set of tasks (basically updates of existing translations) or for very repetitive texts. While corpora take more time to compile and consult, they may be useful in a much wider range of situations, thus justifying a mixed approach.

Terminology Extraction and Management

The production of glossaries and terminological databases is an integral part of the translation profession, as well as an area of inquiry and systematization for terminologists. Specialized corpora are a major source of terms, definitions, and context(s) of use, and more and more tools are being developed that assist the translator or terminologist in sifting through corpus evidence in an attempt to identify relevant terms in corpora and produce a profile for them. In order to extract single- and multiword terms semiautomatically, many different strategies can be adopted. Here we shall limit discussion to two relatively straightforward approaches, one that relies more on statistics and word distributions, the other on linguistic principles.
First, a raw-text (untagged) corpus can be searched for words or sequences that are either very frequent or more frequent than average (‘average frequency’ normally being defined via statistical comparisons with a reference corpus) and/or that are not function words (in the latter case a so-called stop-word list is used that removes function words from the output). A search through a corpus of press releases in the domain of computer hardware, using Wordsmith Tools Keyword facilities (Scott, 1996), and a reference corpus of EU parliamentary proceedings retrieves, among others, the 15 bigram terms in Table 1.

Table 1 Bigram terms in a corpus of hardware press releases (a)

 1  HIGH-PERFORMANCE
 2  DIGITAL SIGNAL
 3  FAMILY OF
 4  REAL-TIME
 5  RANGE OF
 6  PROCESSOR-BASED
 7  MICROCHIP TECHNOLOGY
 8  IDEAL FOR
 9  CO-PROCESSORS
10  ARE AVAILABLE
11  THE CPCI
12  DEVICES ARE
13  PENTIUM III
14  SOLUTION FOR
15  EMBEDDED COMPUTING

(a) Note that in this case hyphens are treated as word separators.

The second strategy requires that the corpus be part-of-speech (POS) tagged, by means of an ad hoc software program known as a POS tagger (van Halteren, 1999). In this case, it is possible to search for sequences of parts of speech that follow a given pattern that the terminologist has identified. A list of such patterns for medical English might include those in Table 2.

Once the terms have been identified in one language, their equivalents have to be found for the other. Parallel and comparable corpora may be of help, as suggested elsewhere in this article. They may also, however, assist in a further way, by facilitating the retrieval of additional information about a term, i.e., a definition and one or more examples of use. While the second step is rather straightforward, at least from the technical point of view, the first is much less so. Automatic routines that retrieve term definitions from unannotated text are not easy to devise. Pearson (2000) suggested heuristics that could facilitate the task, e.g., searching for expressions like ‘is a kind of’ or ‘i.e.’. The corpus used also plays a role, as communication by experts addressing nonexperts might be expected to be richer in terms of definitions than communication by experts addressing other experts. Thus, for these purposes textbooks are probably a better source than research articles.

While corpora are unlikely to replace expert advice, they can go a long way in providing the knowledge required by terminologists and translators to produce databases that take into account the actual use to which terms are put and the phraseology associated with them in real texts. It is likely that more and more CAT tools will include terminology extraction tools that tag texts on the fly, retrieve terms according to user-defined term patterns, provide free-text search facilities for corpus analysis, and facilitate the integration of this information in the compilation of terminological databases.
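The first, statistical strategy described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not a reconstruction of the Wordsmith Tools Keyword facility: the stop-word list is a tiny hypothetical one, the function names are invented, and the comparison with the reference corpus uses a log-likelihood score, a common choice for this kind of comparison that may or may not match what a given tool computes. As in Table 1, hyphens are treated as word separators, and a bigram is discarded only if it consists entirely of function words.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "of", "for", "are", "a", "is", "in", "and", "to"}


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit,
    so that hyphens act as word separators."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bigrams(tokens):
    """Return all pairs of adjacent tokens."""
    return list(zip(tokens, tokens[1:]))


def log_likelihood(a, b, c, d):
    """Log-likelihood score for an item occurring a times in a study
    corpus of size c and b times in a reference corpus of size d."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    score = 0.0
    if a > 0:
        score += a * math.log(a / e1)
    if b > 0:
        score += b * math.log(b / e2)
    return 2 * score


def key_bigrams(study_text, reference_text, top=5):
    """Rank bigrams that are over-represented in the study corpus."""
    study = Counter(bigrams(tokenize(study_text)))
    ref = Counter(bigrams(tokenize(reference_text)))
    c, d = sum(study.values()), sum(ref.values())
    if c == 0 or d == 0:
        raise ValueError("both corpora need at least two tokens")
    scored = []
    for gram, a in study.items():
        if all(word in STOP_WORDS for word in gram):
            continue  # drop bigrams made up only of function words
        b = ref.get(gram, 0)
        if a / c > b / d:  # relatively more frequent in the study corpus
            scored.append((log_likelihood(a, b, c, d), " ".join(gram)))
    scored.sort(reverse=True)
    return [gram for _, gram in scored[:top]]
```

On realistic data the two inputs would be a specialized corpus and a much larger reference corpus; with very small inputs many candidates receive identical scores, so the ranking among them is arbitrary.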

Table 2 Examples of POS patterns corresponding to English medical terms

Pattern                  Examples
Noun                     bradycardia, amoxicillin, PSG
Verb                     excise, obtund
Adjective                distal, craniofacial
Noun-Noun                eye movement, ear oximeter
Noun-Noun-Noun           coronary heart disease, oxygen desaturation index
Noun-Preposition-Noun    night of monitoring, effect of smoking
Adjective-Noun           acoustic rhinometry, circadian rhythm
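The multiword patterns in Table 2 can be matched against a tagged text with a very small routine. The sketch below assumes that tagging has already been done: the input is a hand-tagged list of (word, tag) pairs, the tag labels NOUN, ADJ, and PREP are hypothetical simplifications rather than the tagset of any particular POS tagger, and the function name is invented for illustration.

```python
# Multiword patterns from Table 2, written with hypothetical tag labels.
PATTERNS = [
    ("NOUN", "NOUN", "NOUN"),
    ("NOUN", "PREP", "NOUN"),
    ("NOUN", "NOUN"),
    ("ADJ", "NOUN"),
]


def match_terms(tagged, patterns=PATTERNS):
    """Return (candidate term, pattern name) for every position in the
    tagged text where the tag sequence matches one of the patterns."""
    found = []
    for start in range(len(tagged)):
        for pattern in patterns:
            window = tagged[start:start + len(pattern)]
            if len(window) == len(pattern) and all(
                tag == wanted for (_, tag), wanted in zip(window, pattern)
            ):
                term = " ".join(word for word, _ in window)
                found.append((term, "-".join(pattern)))
    return found


if __name__ == "__main__":
    # Hand-tagged toy sentence; a real workflow would use a POS tagger.
    sentence = [
        ("the", "DET"), ("oxygen", "NOUN"), ("desaturation", "NOUN"),
        ("index", "NOUN"), ("rose", "VERB"), ("after", "PREP"),
        ("a", "DET"), ("night", "NOUN"), ("of", "PREP"),
        ("monitoring", "NOUN"),
    ]
    for term, pattern in match_terms(sentence):
        print(pattern, term)
```

The routine also returns shorter candidates nested inside longer ones (here ‘oxygen desaturation’ and ‘desaturation index’ inside ‘oxygen desaturation index’); deciding which of these are genuine terms remains a task for the terminologist.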

Machine Translation Applications

The role of corpora in the development of machine translation systems is related only indirectly to the professional concerns of translators, for which reason it will be mentioned here only in passing. In recent years the main thrust of machine translation research has been to integrate ‘example-based’ and statistical methods with more mainstream, ‘rule-based’ methods. This new approach relies on large databases of aligned parallel text. In this regard, it follows the same recycling principles as translation memories, but it goes beyond them in attempting to propose a novel translation rather than simply retrieving existing matching strings. At present, the idea of combining machine translation, translation memory, and corpus searching approaches via hybrid, interactive systems is considered to be one of the ways ahead in this area.

The World Wide Web

Thanks to the WWW, translators today have at their fingertips a massive amount of knowledge and textual evidence that can be searched more easily than ever before. An important question to ask therefore is whether corpora are still a valid tool or whether they have become anachronistic.

The advantages of corpora over the Web used as a corpus are numerous. To mention but the first that come to mind, existing Web search engines have not been designed with linguistic applications in mind. Thus, at least for the time being, they offer very crude search facilities. For instance, the world’s favorite Web search engine, Google, does not distinguish certain spelling variants; therefore, one cannot search for ‘thought-provoking’ and ‘thought provoking’ as separate queries. Similarly, specifying a window within which to search for two or more words may be quite demanding. More importantly, the sheer size and diversity of the Web and its changing nature have a number of inevitable consequences. For instance, POS annotation is unlikely to ever be an option (this means, for example, that one cannot search for a given word only when it is a noun). Also, relevant examples from a given domain are typically mixed with irrelevant ones from other domains. Sifting through the evidence may become a lengthy process that one is dangerously tempted to cut short as soon as a reasonable hypothesis has formed on the basis of the first few hits returned.

The Web, however, has a series of big advantages over corpora: it is mostly free, is readily available, is familiar, is constantly updated, is very large, and requires no time to design and construct. The last point especially is no small thing when it comes to hard-pressed translators.

Attempts have been made to combine the best of two worlds, i.e., either make the Web more like a corpus or speed up corpus construction by using the Web. Several ‘Web as corpus’ tools have been designed that facilitate linguistically oriented searches (Fletcher, 2004), while experiments have been done in an attempt to automate as far as possible the search and download of texts from the Web (including parallel texts; Almeida et al., 2002) so as to reduce the time and effort required to build a corpus, and thus hopefully encourage translators to do so. Baroni and Bernardini (2004) described an algorithm and a suite of freely available Perl scripts that have been developed specifically with professional and trainee translators in mind. These tools (a) find keywords in a text (by comparison with a reference corpus), (b) automatically search the Web for combinations of these words, (c) download texts, (d) convert them from HTML to plain text format, (e) discard duplicates, and (f) extract single- and multiword terms from the newly constructed corpus. This procedure can be iterated as many times as required, and various parameters can be tuned by the user.
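Three of the steps listed above, namely combining keywords into queries (b), converting HTML to plain text (d), and discarding duplicates (e), can be illustrated offline with the Python standard library. The sketch below is not a port of the Perl scripts described by Baroni and Bernardini (2004): all function names and default values are hypothetical, the Web search and download steps are omitted, and duplicate detection is limited to texts that are identical after lowercasing.

```python
import hashlib
import itertools
import random
from html.parser import HTMLParser


def build_queries(seeds, size=3, count=5, rng=None):
    """Step (b): combine seed keywords into a sample of search queries."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    combos = list(itertools.combinations(sorted(seeds), size))
    rng.shuffle(combos)
    return [" ".join(combo) for combo in combos[:count]]


class _TextExtractor(HTMLParser):
    """Collect text content, skipping script and style elements."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Step (d): strip markup and normalize whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def discard_duplicates(texts):
    """Step (e): keep the first copy of each text (case-insensitive)."""
    seen, unique = set(), []
    for text in texts:
        digest = hashlib.sha1(text.lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique
```

In a full pipeline the queries would be sent to a search engine, the returned pages downloaded and passed through the conversion and filtering steps, and the resulting corpus used for term extraction, after which the whole cycle could be repeated with the newly extracted terms as seeds.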

Conclusion

In this article I review three major areas at the interface between corpus linguistics and translation, namely, corpus-based approaches to (descriptive/theoretical) translation studies, to translation teaching/learning, and to translation practice. As regards translation studies, I suggest that this research methodology, with its many branches and corpus typologies, is yielding valuable insights concerning translation strategies, norms/laws, and universals. More recently, attempts have been made to go beyond somewhat simplistic early assumptions and methods, to take stock of the new data, and to reflect on their implications. Moving on to the applied perspective, I provide several examples of possible uses of corpora (parallel, comparable, learner) in the translation classroom and point out that the activity of building a corpus may in itself make a valid exercise in text analysis and documentation. Finally, I consider the role currently played by corpus tools in the translation professions and show it to be still rather marginal, despite an obvious potential, a growing interest, and promising new applications.

See also: Approaches to Translation, Linguistic; Cultural, Colonialism and Gender Oriented Approaches to Translation; Translation Universals; Translational Equivalence.

Bibliography

Almeida J, Simões A M & Castro J A (2002). ‘Grabbing parallel corpora from the Web.’ SEPLN 29, 13–20.
Altenberg B & Aijmer K (2000). ‘The English-Swedish parallel corpus: a resource for contrastive research and translation studies.’ In Mair C & Hundt M (eds.) Corpus linguistics and linguistic theory (ICAME 20). Amsterdam and Atlanta: Rodopi. 15–33.
Aston G (1999). ‘Corpus use and learning to translate.’ Textus 12(2), 289–314.
Baker M (1993). ‘Corpus linguistics and translation studies: implications and applications.’ In Baker M, Francis G & Tognini-Bonelli E (eds.) Text and technology: in honour of John Sinclair. Amsterdam: Benjamins. 223–250.
Baker M (1995). ‘Corpora in translation studies: an overview and some suggestions for future research.’ Target 7(2), 223–243.
Barlow M (1995). A guide to ParaConc. Houston, TX: Athelstan.
Baroni M & Bernardini S (2004). ‘BootCaT: bootstrapping corpora and terms from the Web.’ In Proceedings of LREC 2004. 1313–1316.
Bernardini S & Zanettin F (2004). ‘When is a universal not a universal? some limits of current corpus-based methodologies for the investigation of translation universals.’ In Mauranen A & Kujamäki P (eds.). 51–62.
Blum-Kulka S (1986). ‘Shifts of cohesion and coherence in translation.’ In House J & Blum-Kulka S (eds.) Interlingual and intercultural communication: discourse and cognition in translation and second language acquisition studies. Tübingen: Gunter Narr. 17–35.
Blum-Kulka S & Levenston E A (1983). ‘Universals of lexical simplification.’ In Færch C & Kasper G (eds.) Strategies in interlanguage communication. London: Longman. 119–139.
Bowker L (2004). ‘Corpus resources for translators: academic luxury or professional necessity?’ TradTerm 10, 213–247.
Bowker L & Bennison P (2003). ‘Student translation archive and student translation tracking system: design, development and application.’ In Zanettin F et al. (eds.). 103–117.

Bowker L & Pearson J (2002). Working with specialized language: a practical guide to using corpora. London and New York: Routledge.
Chesterman A (1997). Memes of translation: the spread of ideas in translation theory. Amsterdam: Benjamins.
Danielsson P (2001). The automatic identification of meaningful units in language. Gothenburg: Gothenburg University.
Eskola S (2004). ‘Untypical frequencies in translated language: a corpus-based study on a literary corpus of translated and non-translated Finnish.’ In Mauranen A & Kujamäki P (eds.). 83–99.
Fletcher W (2004). ‘Facilitating the compilation and dissemination of ad-hoc Web corpora.’ In Aston G, Bernardini S & Stewart D (eds.) Corpora and language learners. Amsterdam: Benjamins. 275–302.
Frankenberg-Garcia A & Santos D (2003). ‘Introducing COMPARA: the Portuguese-English parallel corpus.’ In Zanettin F et al. (eds.). 71–87.
Gellerstam M (1986). ‘Translationese in Swedish novels translated from English.’ In Wollin L & Lindquist H (eds.) Translation studies in Scandinavia. Lund: CWK Gleerup. 88–95.
Granger S, Hung J & Petch-Tyson S (eds.) (2002). Computer learner corpora, second language acquisition and foreign language teaching. Amsterdam: Benjamins.
Holmes J S (1988, 1972). ‘The name and nature of translation studies.’ In Translated! papers in literary translation and translation studies. Amsterdam: Rodopi. 67–80.
Johansson S (2003). ‘Reflections on corpora and their uses in cross-linguistic research.’ In Zanettin F et al. (eds.). 135–144.
Johansson S & Hofland K (1994). ‘Towards an English-Norwegian parallel corpus.’ In Fries U, Tottie G & Schneider P (eds.) Creating and using English language corpora. Amsterdam: Rodopi. 25–37.
Kenny D (2001). Lexis and creativity in translation. Manchester: St. Jerome.
Kujamäki P (2004). ‘What happens to “unique items” in learners’ translations? “Theories” and “concepts” as a challenge for novices’ views on “good translation”.’ In Mauranen A & Kujamäki P (eds.). 187–204.
Laviosa S (1998). ‘Core patterns of lexical use in a comparable corpus of English narrative prose.’ Meta 43(4), 557–570.
Laviosa-Braithwaite S (1996). The English Comparable Corpus (ECC): a resource and a methodology for the empirical study of translation. Unpublished Ph.D. diss., UMIST.
Lefevere A & Bassnett S (1990). ‘Proust’s grandmother and the thousand and one nights: the “cultural turn” in translation studies.’ In Bassnett S & Lefevere A (eds.) Translation, history and culture. London: Pinter. 1–13.
Maia B (2000). ‘Making corpora: a learning process.’ In Bernardini S & Zanettin F (eds.) I corpora nella didattica della traduzione (Corpus use and learning to translate). Bologna: CLUEB. 47–60.

Malmkjær K (2003). ‘On a pseudo-subversive use of corpora in translator training.’ In Zanettin F et al. (eds.). 119–134.
Mauranen A (2000). ‘Strange strings in translated language: a study on corpora.’ In Olohan M (ed.) Intercultural faultlines. Manchester: St. Jerome. 119–141.
Mauranen A (2002). ‘Where’s cultural adaptation?’ Intralinea 5.
Mauranen A (2004). ‘Corpora, universals and interference.’ In Mauranen A & Kujamäki P (eds.). 65–82.
Mauranen A & Kujamäki P (eds.) (2004). Translation universals: do they exist? Amsterdam: Benjamins.
Olohan M (2001). ‘Spelling out the optionals in translation: a corpus study.’ UCREL Technical Papers 13, 423–432.
Olohan M & Baker M (2000). ‘Reporting that in translated English: evidence for subconscious processes of explicitation?’ Across Languages and Cultures 1(2), 141–158.
Øverås L (1998). ‘In search of the third code: an investigation of norms in literary translation.’ Meta 43(4), 571–588.
Pápai V (2004). ‘Explicitation: a universal of translated text.’ In Mauranen A & Kujamäki P (eds.). 143–164.
Pearson J (2000). Terms in context. Amsterdam: Benjamins.
Pearson J (2003). ‘Using parallel texts in the translator training environment.’ In Zanettin F et al. (eds.). 15–24.
Puurtinen T (1998). ‘Syntax, readability and ideology in children’s literature.’ Meta 43(4), 524–533.
Puurtinen T (2003). ‘Genre-specific features of translationese? Linguistic differences between translated and non-translated Finnish children’s literature.’ Literary and Linguistic Computing 18(4), 389–406.
Puurtinen T (2004). ‘Explicitation of clausal relations: a corpus-based analysis of clause connectives in translated and non-translated Finnish children’s literature.’ In Mauranen A & Kujamäki P (eds.). 165–176.
Schmied J (2002). ‘A translation corpus as a resource for translators: the case of English and German prepositions.’ In Maia B, Haller J & Ulrych M (eds.) Training the language services provider for the new millennium. Porto: Universidade do Porto. 251–269.

Schmied J & Hildegard S (1996). ‘Explicitness as a universal feature of translation.’ In Ljung M (ed.) Corpus-based studies in English: papers from ICAME 17. Amsterdam/Atlanta: Rodopi. 21–36.
Scott M (1996). Wordsmith Tools 3.0. Oxford: Oxford University Press.
Teich E (2003). Cross-linguistic variation in system and text. The Hague: Mouton de Gruyter.
Tirkkonen-Condit S (2004). ‘Unique items – over- or underrepresented in translated language?’ In Mauranen A & Kujamäki P (eds.). 177–184.
Tognini-Bonelli E (2001). Corpus linguistics at work. Amsterdam: Benjamins.
Tognini-Bonelli E & Manca E (2002). ‘Welcoming children, pets and guests: a problem of nonequivalence in the languages of “agriturismo” and “farmhouse holidays”.’ Textus 15(2), 317–334.
Toury G (1980). In search of a theory of translation. Tel Aviv: Tel Aviv University.
Toury G (1995). Descriptive translation studies and beyond. Amsterdam: Benjamins.
Toury G (2004). ‘Probabilistic explanations in translation studies: welcome as they are, would they qualify as universals?’ In Mauranen A & Kujamäki P (eds.). 15–32.
Vanderauwera R (1985). Dutch novels translated into English: the transformation of a ‘minority’ literature. Amsterdam: Rodopi.
Van Halteren H (ed.) (1999). Syntactic Wordclass Tagging. Dordrecht: Kluwer.
Varantola K (2003). ‘Translators and disposable corpora.’ In Zanettin F et al. (eds.). 55–70.
Zanettin F (1998). ‘Bilingual comparable corpora and the training of translators.’ Meta 43(4), 616–630.
Zanettin F, Bernardini S & Stewart D (eds.) (2003). Corpora in translator education. Manchester: St. Jerome.

Relevant Websites

http://www.wikipedia.org.

Machine Translation: History

J Hutchins, Norwich, UK
© 2006 Elsevier Ltd. All rights reserved.

Precursors and Pioneers, 1933–1954

Although we might trace the origins of ideas related to machine translation (MT) to 17th-century speculations about universal languages and mechanical dictionaries, it was not until the 20th century that the first practical suggestions could be made, in 1933 with two patents issued in France and Russia to Georges Artsrouni and Petr Trojanskij, respectively. Artsrouni’s patent was for a general-purpose machine that could also function as a mechanical multilingual dictionary. Trojanskij’s patent, also basically for a mechanical dictionary, went further with detailed proposals for coding and interpreting grammatical functions using ‘universal’ (Esperanto-based) symbols in a multilingual translation device. Neither of these precursors was known to Andrew Booth (a British crystallographer) and Warren Weaver

Machine Translation: History 375 Malmkjær K (2003). ‘On a pseudo-subversive use of corpora in translator training.’ In Zanettin F et al. (eds.). 119–134. Mauranen A (2000). ‘Strange strings in translated language: a study on corpora.’ In Olohan M (ed.) Intercultural faultlines. Manchester: St. Jerome. 119–141. Mauranen A (2002). ‘Where’s cultural adaptation?’ Intralinea 5. Mauranen A (2004). ‘Corpora, universals and interference.’ In Mauranen A & Kujama¨ki P (eds.). 65–82. Mauranen A & Kujama¨ki P (eds.) (2004). Translation universals: do they exist? Amsterdam: Benjamins. Olohan M (2001). ‘Spelling out the optionals in translation: a corpus study.’ UCREL Technical Papers 13, 423–432. Olohan M & Baker M (2000). ‘Reporting that in translated English: evidence for subconscious processes of explicitation?’ Across Languages and Cultures 1(2), 141–158. Øvera˚s L (1998). ‘In search of the third code: an investigation of norms in literary translation.’ Meta 43(4), 571–588. Pa´pai V (2004). ‘Explicitation: a universal of translated text.’ In Mauranen A & Kujama¨kin P (eds.). 143–164. Pearson J (2000). Terms in context. Amsterdam: Benjamins. Pearson J (2003). ‘Using parallel texts in the translator training environment.’ In Zanettin et al. (eds.). 15–24. Puurtinen T (1998). ‘Syntax, readability and ideology in children’s literature.’ Meta 43(4), 524–533. Puurtinen T (2003). ‘Genre-specific features of translationese? Linguistic differences between translated and non-translated Finnish children’s literature.’ Literary and Linguistic Computing 18(4), 389–406. Puurtinen T (2004). ‘Explicitation of clausal relations: a corpus-based analysis of clause connectives in translated and non-translated Finnish children’s literature.’ In Mauranen A & Kujama¨ki P (eds.). 165–176. Schmied J (2002). ‘A translation corpus as a resource for translators: the case of English and German prepositions.’ In Maia B, Haller J & Ulrych M (eds.) 
Training the language services provider for the new millennium. Porto: Universidade do Porto. 251–269.

Schmied J & Hildegard S (1996). ‘Explicitness as a universal feature of translation.’ In Ljung M (ed.) Corpus-based studies in English: papers from ICAME 17. Amsterdam/ Atlanta: Rodopi. 21–36. Scott M (1996). Wordsmith Tools 3.0. Oxford: Oxford University Press. Teich E (2003). Cross-linguistic variation in system and text. The Hague: Mouton de Gruyter. Tirkkonen-Condit S (2004). ‘Unique items – over- or underrepresented in translated language?’ In Mauranen A & Kujama¨ki P (eds.). 177–184. Tognini-Bonelli E (2001). Corpus linguistics at work. Amsterdam: Benjamins. Tognini-Bonelli E & Manca E (2002). ‘Welcoming children, pets and guests: a problem of nonequivalence in the languages of ‘‘agriturismo’’ and ‘‘farmhouse holidays’’.’ Textus 15(2), 317–334. Toury G (1980). In search of a theory of translation. Tel Aviv: Tel Aviv University. Toury G (1995). Descriptive translation studies and beyond. Amsterdam: Benjamins. Toury G (2004). ‘Probabilistic explanations in translation studies: welcome as they are, would they qualify as universals?’ In Mauranen A & Kujama¨ki P (eds.). 15–32. Vanderauwera R (1985). Dutch novels translated into English: the transformation of a ‘minority’ literature. Amsterdam: Rodopi. Van Halteren H (ed.) (1999). Syntactic Wordclass Tagging. Dordrecht: Kluwer. Varantola K (2003). ‘Translators and disposable corpora.’ In Zanettin F et al. (eds.). 55–70. Zanettin F (1998). ‘Bilingual comparable corpora and the training of translators.’ Meta 43(4), 616–630. Zanettin F, Bernardini S & Stewart D (eds.) (2003). Corpora in translator education. Manchester: St. Jerome.

Relevant Websites http://www.wikipedia.org.

Machine Translation: History J Hutchins, Norwich, UK © 2006 Elsevier Ltd. All rights reserved.

Precursors and Pioneers, 1933–1954

Although we might trace the origins of ideas related to machine translation (MT) to 17th-century speculations about universal languages and mechanical dictionaries, it was not until the 20th century that the first practical suggestions could be made, in 1933 with two patents issued in France and Russia to Georges Artsrouni and Petr Trojanskij, respectively. Artsrouni’s patent was for a general-purpose machine that could also function as a mechanical multilingual dictionary. Trojanskij’s patent, also basically for a mechanical dictionary, went further with detailed proposals for coding and interpreting grammatical functions using ‘universal’ (Esperanto-based) symbols in a multilingual translation device.

Neither of these precursors was known to Andrew Booth (a British crystallographer) and Warren Weaver (a director at the Rockefeller Foundation) when they met in 1946 and 1947 and discussed tentative ideas for using the newly invented computers for translating natural languages. In July 1949, a memorandum by Weaver, which stimulated the start of MT research in the United States, suggested various methods based on his knowledge of cryptography, statistics, information theory, logic, and language universals (Hutchins, 1997).

In June 1952, the first MT conference was convened at MIT by Yehoshua Bar-Hillel, who had been appointed to survey the field. It was already clear that full automation of good-quality translation was a virtual impossibility and that human intervention either before or after computer processes (known from the beginning as pre- and post-editing, respectively) would be essential. Ideas were put forward for normalizing input texts, for microglossaries to reduce ambiguity problems, and for some kind of syntactic structure analysis. Suggestions for future activity were made; in particular, Léon Dostert from Georgetown University argued for a public demonstration of the feasibility of MT. Accordingly, he collaborated with IBM on a simple MT system demonstrated in New York on January 7, 1954, with a great deal of media attention. A carefully selected sample of Russian sentences was translated into English using a very restricted vocabulary of 250 words and just six grammar rules. Although the system had little scientific value, its output was sufficiently impressive to stimulate large-scale funding in the United States and to inspire the initiation of MT projects elsewhere, notably in the USSR.

High Expectations, 1954–1966

When MT research began, there was little help to be had from current linguistics. As a consequence, in the 1950s and 1960s, researchers tended to polarize between empirical trial-and-error approaches, often using statistical methods to discover grammatical and lexical regularities that could be applied computationally, and theoretical approaches involving fundamental linguistic research and indeed the beginnings of what was later called computational linguistics (see Computational Linguistics: History).

This decade saw the beginnings of three basic approaches to MT (see Machine Translation: Overview): (1) the direct translation model, in which programming rules were developed for the translation specifically from one source language (SL) into one particular target language (TL) with a minimal amount of analysis and syntactic reorganization and in which problems of homonymy and ambiguity were simplified by providing single TL equivalents for SL words to cover most senses; (2) the interlingua model (see Machine Translation: Interlingual Methods), in which translation was into and from some kind of language-neutral representation and that involved complex syntactic and semantic analysis; and (3) the transfer model, in which SL texts were analyzed into disambiguated SL representations and then converted into equivalent TL representations, from which output text was generated.

The direct translation approach was epitomized by research by Erwin Reifler (at the University of Washington, Seattle) and by Gilbert King (at IBM). Large bilingual dictionaries were compiled for a photoscopic store (a purpose-built memory device), in which lexicographic information was used not only for selecting lexical equivalents but also for solving grammatical problems without the use of syntactic analysis. A Russian-English system was installed for the U.S. Air Force, producing translations until the early 1970s – the output was crude and barely intelligible, but appeared to satisfy basic information needs of users.

Dictionary development was also the focus of research at a number of other centers. Anthony Oettinger’s group (Harvard University) compiled a massive Russian-English dictionary, both to serve as an aid for translators (a forerunner of the now-common computer-based dictionary aids) and to produce crude word-for-word translations. Sydney Lamb (at the University of California, Berkeley) concentrated on developing maximally efficient dictionary routines and a linguistic theory appropriate for MT – his theory of stratificational grammar.
The system developed at Georgetown University, for many years the largest research group in the United States, was also essentially direct translation in approach, although incorporating various levels of structure analysis: morphological, syntagmatic (noun-adjective agreement, verb government, etc.), and syntactic (subjects and predicates, clause relationships, etc.). Systems installed by Euratom in 1963 and by the U.S. Atomic Energy Commission in 1964 continued in regular use until the late 1970s. A direct translation approach similar to the Georgetown system was pursued at the Institute of Precision Mechanics (under D.Y. Panov), but with less practical success, primarily because of a lack of adequate computer facilities.

Research on interlinguas was almost wholly theoretical. The Cambridge Language Research Unit (Margaret Masterman) investigated a prototype interlingua to produce crude (almost word-for-word) translations, which would be refined by means of the semantic networks of a thesaurus (using mathematical lattice theory). Silvio Ceccato (Milan University) developed an interlingua based on cognitive processes, involving the conceptual analysis of words and their possible correlations in texts – a forerunner of the neural networks of later years. Igor A. Mel’čuk (Institute of Linguistics, Moscow) worked on the linguistic foundations of an interlingua, his stratificational dependency (meaning-text) model of language (see Mel’čuk, Igor Aleksandrovič (b. 1932)). Nikolaj Andreev (Leningrad State University) conceived an interlingua as composed of those features statistically most common in a large number of languages.

The transfer approach was epitomized by the research at MIT led by Victor Yngve on syntactic analysis and production, mainly for German and English. Despite Chomsky’s association with the group for a short time, transformational grammar (see Generative Grammar) had little influence – indeed Chomskyan linguistics had little impact on any MT group in this period. Apart from MIT, the most explicit concentration on syntactic issues was at Harvard (after 1959), where research focused on the predictive syntactic analyzer (originally developed at the National Bureau of Standards under Ida Rhodes), a system for the identification of permissible sequences of grammatical categories (nouns, verbs, adjectives, etc.) and the probabilistic prediction of following categories. However, the results were often unsatisfactory, primarily because of the enforced selection at every stage of the ‘most probable’ prediction. (Nevertheless, an improved version, the Multiple-Path Predictive Analyzer, led later to William Woods’s familiar Augmented Transition Network parser.)

Computer facilities were frequently inadequate, and much effort was devoted to improving basic hardware (paper tapes, magnetic media, access speeds, etc.) and to devising programming tools suitable for language processing (e.g., COMIT developed at MIT).
Some groups were inevitably forced to concentrate on theoretical issues, particularly in Europe and the Soviet Union. For political and military reasons, nearly all U.S. research was for Russian-English translation, and most Soviet research focused on English-Russian systems, although the multilingual policy of the Soviet Union inspired research there on a much wider range of languages than elsewhere. By the mid-1960s MT research groups had been established in many countries throughout the world, including most European countries (Hungary, Czechoslovakia, Bulgaria, Belgium, Germany, France, etc.), China, Mexico, and Japan. Many of these were short-lived; an exception was the project that began in 1960 at Grenoble University.

Throughout this period, research on MT became an umbrella for much contemporary work in structural and formal linguistics (particularly in the Soviet Union), semiotics, logical semantics, mathematical linguistics, quantitative linguistics, and nearly all of what is now called computational linguistics and language engineering – and in the Soviet Union with close ties with cybernetics and information theory (Léon, 1997).

The ALPAC Report and Its Consequences

In the 1950s, optimism was high; developments in computing and in formal linguistics, particularly in the area of syntax, seemed to promise great improvements in quality; there were many predictions of fully automatic systems operating within a few years. However, disillusion grew as the complexity of the linguistic problems became more apparent and research seemed to face an apparently insuperable semantic barrier. In an influential survey, Bar-Hillel (1960) criticized the prevailing assumption that the goal of MT research should be the creation of fully automatic high-quality translation (FAHQT) systems producing results indistinguishable from those of human translators. He argued that it was not merely unrealistic, given the current state of linguistic knowledge and computer systems, but impossible in principle.

Although the first working systems (from IBM and Georgetown) were showing that poor-quality translations could be useful in many circumstances, there was growing disappointment at the slow development of good-quality MT. In 1964, the government sponsors of MT in the United States (mainly military and intelligence agencies) asked the National Science Foundation to set up the Automatic Language Processing Advisory Committee (ALPAC) to examine the situation. It concluded that MT was slower, less accurate, and twice as expensive as human translation and that ‘‘there is no immediate or predictable prospect of useful machine translation’’ (ALPAC, 1966: 32). It saw no need for further investment in MT research; instead, it recommended the development of machine aids for translators, such as automatic dictionaries, and the continued support of basic research in computational linguistics.
Paradoxically, ALPAC rejected MT because it fell well short of human quality (i.e., it required post-editing) even though human translations are invariably revised before publication and even though the sponsoring bodies were primarily interested in information gathering and analysis, in which lower quality would be acceptable. The influence of ALPAC was profound, bringing virtually to an end MT research in the United States for over a decade and indirectly bringing to an end much MT research elsewhere. As a consequence, MT was to be no longer the leading area of research in computers and natural language.

The Quiet Decade, 1967–1976

Research did not stop completely, however. Even in the United States, groups continued for a few more years at the University of Texas and at Wayne State University. But there was a change of direction. In the United States, activity had concentrated on English translations of Russian scientific and technical materials. In Canada and Europe, the needs were quite different. The Canadian government’s bicultural policy created a demand for English-French (and to a lesser extent French-English) translation beyond the capacity of the translation profession. The demand was even greater within the offices of the European Community for translations of documents from and into all the EC languages.

At Montreal, research began in 1970 on a syntactic transfer system for English-French translation. The TAUM project (Traduction Automatique de l’Université de Montréal) had two major achievements: first, the Q-system formalism for manipulating linguistic strings and trees (later developed as the Prolog programming language) and, second, the Météo system designed specifically for translating the restricted vocabulary and limited syntax (sublanguage) of meteorological reports, which were publicly broadcast from 1976.

Other groups continued with essentially interlingua approaches. In the Soviet Union, Mel’čuk continued his research on a meaning-text model for application in MT. At Grenoble University, Bernard Vauquois developed a pivot language for translating Russian mathematics and physics texts into French – not, however, a full interlingua because it represented only the logical properties of syntactic relationships; there were no interlingual representations for lexical items, which were translated by a bilingual transfer mechanism. A similar model was adopted at the University of Texas (under Winfred Lehmann) in its METAL system for German and English: Sentences were analyzed into ‘normal forms’, that is, interlingual semantic propositional dependency structures with no interlingual lexical elements.

By the mid-1970s, however, the Grenoble and Texas groups recognized major problems with their interlingua approaches: the rigidity of the levels of analysis (failure at any one stage meant failure to produce any output at all), the inefficiency of parsers (too many partial analyses that had to be filtered out), and in particular loss of information about surface forms of SL input that might be used to guide selection of TL forms and construction of acceptable TL sentence structures. To many at the time it seemed that the less ambitious transfer approach offered better prospects.

The Revival, 1976–1989

In the decade after ALPAC, more MT systems came into operational use and attracted public attention. Most significant were the first Systran installations. Developed by Peter Toma, its oldest version is the Russian-English system at the USAF Foreign Technology Division (Dayton, Ohio) installed in 1970. The Commission of the European Communities installed its first Systran system in 1976 and now has versions for most languages of the European Community (now European Union). Over the years, the original (basically, direct translation) design has been greatly modified, with increased modularity and greater compatibility of the analysis and synthesis components of different versions, permitting cost reductions when developing new language pairs. During the 1980s and subsequently, Systran was also installed at many major companies (e.g., General Motors of Canada, Dornier, and Aérospatiale).

From the early 1980s until the mid-1990s, Systran’s main commercial rival was the Logos Corporation. After experience with an English-Vietnamese system for translating aircraft manuals during the 1970s, Logos developed a successful German-English system that first appeared on the market in 1982; during the 1980s, other language pairs were also developed. Also in the 1980s a commercial METAL German-English system appeared, the research at the University of Texas now being funded by Siemens of Munich (Germany). The system was no longer interlingua-based but was now essentially a transfer-based approach. Other language pairs were later marketed for Dutch, French, and Spanish, as well as for English and German.

Research also revived in the late 1970s and early 1980s.
In contrast to the focus in the late 1960s and early 1970s on interlingua approaches, this was characterized by the widespread adoption of the three-stage transfer-based approach, predominantly syntax-oriented and founded on the formalization of lexical and grammatical rules influenced by the linguistic theories of the time. One major exemplar was the system developed at Grenoble (GETA, Groupe d’Etudes pour la Traduction Automatique), the influential Ariane system. Regarded as the paradigm of the second-generation linguistics-based transfer systems, Ariane influenced projects throughout the world in the 1980s. Of particular note were its flexibility and modularity and its algorithms for manipulating tree representations that incorporated many different levels and types of representation (dependency, phrase structure, and logical). Similar in conception were the multilingual transfer system at Saarbrücken (called SUSY), incorporating a heterogeneity of techniques (phrase structure rules, transformational rules, case grammar and valency frames, dependency grammar, the use of statistical data, etc.), and the Mu system developed at the University of Kyoto under Makoto Nagao, where the most prominent features were the use of case grammar analysis and dependency tree representations. (In 1986, the research prototype was converted to an operational system for use by the Japanese Information Center for Science and Technology for the translation of abstracts.)

One of the best-known projects of the 1980s was the Eurotra project of the European Community, intended as an advanced multilingual transfer system for translation among all the EC languages. Like GETA-Ariane and SUSY, the design combined lexical, logico-syntactic, and semantic information in multilevel interfaces at a high degree of abstractness. No facilities for human assistance during translation processes were envisaged, and problems of the lexicon were also neglected; at the end of the 1980s with no operational system in prospect, the project ended, having, however, achieved its secondary aim of stimulating cross-national research in computational linguistics.

From the mid-1980s, there was a revival of interest in interlingua systems, motivated in part by contemporary research in artificial intelligence and cognitive linguistics. Two major projects were based in the Netherlands.
The DLT (Distributed Language Translation) system in Utrecht (under Toon Witkam) was intended as a multilingual interactive system operating over computer networks, in which each terminal was to be a translating machine from and into one language only, using a modified form of Esperanto as an interlingua. The project made a significant effort in the construction of large lexical databases and proposed the use of a Bilingual Knowledge Bank from a corpus of (human) translated texts (anticipating later example-based approaches). The Rosetta project at Eindhoven (under Jan Landsbergen) explored the use of Montague grammar (see Montague Semantics) – semantic interlingual representations were derived from the interpretation of the derivation trees of syntactic representations – and the feasibility of reversible grammars for analysis and generation. Reversibility subsequently became a feature of many MT projects.

In the latter half of the 1980s, Japan witnessed a substantial increase in activity. Most of the computer companies (Fujitsu, Toshiba, Hitachi, etc.) began to invest large sums into areas that government and industry saw as fundamental to the coming fifth generation of the information society – this included MT. The research, initially greatly influenced by the Mu project at Kyoto University, showed a wide variety of approaches. Although transfer systems predominated, there were also interlingua systems, such as the PIVOT system at NEC and the Japanese-funded multilingual multinational project, with participants from China, Indonesia, Malaysia, and Thailand. There was also considerable commercial activity and most of the computer companies marketed translation software, mainly for the Japanese-English and English-Japanese markets. Many of these systems were low-level direct translation or transfer systems limited to morphological and syntactic analysis and often with little attempt made to automatically resolve lexical ambiguities. Often restricted to specific subject fields (computer science and information technology were popular choices), they relied on substantial human assistance at both the preparatory (pre-editing) and the revision (post-editing) stages.

During the 1980s, many research projects were also established in Korea, in Taiwan, in mainland China, and in Southeast Asia, particularly in Malaysia.

There was also an increase in activity in the Soviet Union, where from 1976 research was concentrated at the All-Union Center for Translation in Moscow. Systems for English-Russian and German-Russian translation were developed based on the direct translation approach, but there was also work under the direction of Yurij Apres’jan, based on Mel’čuk’s meaning-text model, leading to systems for French-Russian and English-Russian.
Apart from this group, however, most activity in the Soviet Union focused on the production of relatively low-level operational systems, often involving the use of statistical analyses (influenced by the Speech Statistics group under Rajmund Piotrovskij at Leningrad State University).

During this period, many researchers believed that quality improvements would come from language research in the context of artificial intelligence (AI; see Natural Language Understanding, Automatic), such as the investigations by Yorick Wilks on preference semantics and semantic templates and by Roger Schank on expert systems and knowledge-based text understanding. The Japanese investment in artificial intelligence projects also had a major impact and may well have stimulated the revival of government funding in the United States. A number of projects applied knowledge-based approaches in Japan, in Europe, and particularly in North America. In the United States, the most important was at Carnegie-Mellon University (Pittsburgh) under Jaime Carbonell and Sergei Nirenburg, which experimented with a number of knowledge-based MT systems (Nirenburg et al., 1992).

For syntactic processing, there was a trend toward the adoption of unification and constraint-based formalisms (e.g., Lexical-Functional Grammar, Head-Driven Phrase Structure Grammar, Categorial Grammar, etc.). Complex multilevel representations and large sets of transformation and mapping rules were replaced by lexicalist approaches: monostratal representations, a restricted set of abstract rules, and constraints incorporated into specific lexical entries.

As far as practical use was concerned, one method of improving quality was seen to be the ‘control’ of vocabulary and syntax. Systems such as Systran, Logos, and METAL were in principle designed for general application, although in practice their dictionaries are adapted for particular subject domains by corporation purchasers. The Xerox Corporation went further by restricting, or controlling, input texts, so that transfer and generation were simpler and there was less need for post-editing. They have been followed by many others in subsequent years – indeed, the design of controlled languages is itself an active area of research (see Controlled Languages). There have been several companies that design controlled language systems for specific clients (e.g., LANT (later Xplanation) and ESTeam). The major example since the early 1980s has been the Smart Corporation (New York), with customers such as Citicorp, Ford, and the Canadian Department of Employment and Immigration.

Restriction to particular subject areas does not necessarily mean controlled input. Special-purpose systems can be designed for particular environments, allowing for less complication in lexicons.
The main examples here are the successful Spanish-English (SPANAM) and English-Spanish (ENGSPAN) systems developed during the 1970s and 1980s at the Pan American Health Organization in Washington (under Muriel Vasconcellos).

Developments since 1989

The dominant framework for MT research until the end of the 1980s was essentially based on linguistic rules for syntactic analysis, for lexical transfer, for morphology, and so forth. Since 1989, however, this dominance has been broken by the emergence of new methods and strategies loosely called corpus-based methods.

Most dramatic has been the revival of statistics-based approaches – seen as a return to the empiricism of the first decade as opposed to the rationalism of the later rule-based methods. The two sides, empiricists and rationalists, were first directly confronted at a conference in Montreal in June 1992 (TMI, 1992). With the success of their stochastic techniques in speech recognition, a group at IBM (Yorktown Heights, New York) developed a statistical MT (SMT) model, Candide (Brown et al., 1990). Its distinctive feature was that no linguistic rules were applied; statistical methods were virtually the sole means of analysis and generation. Candide was applied to the corpus of French and English texts in the reports of Canadian parliamentary debates. The first step was the alignment of SL and TL sentences and words that are potentially translation equivalents. Translation itself was achieved through a translation model (frequency statistics derived from previously aligned SL and TL words and phrases) and a language model of TL word sequences.

During the 1990s, statistical methods (see Language Processing: Statistical Methods) have been the main focus of many research groups, for example, the improvement of bilingual text and word alignment techniques and statistical methods for extracting lexical and terminological data from bilingual corpora. Some have concentrated on the development of purely statistics-based systems, statistical MT (e.g., at the universities of Southern California, Aachen, and Hong Kong), and others have investigated the integration of statistical and traditional rule-based approaches (e.g., at Carnegie-Mellon University; the National Tsing-Hua University, Taiwan; and Microsoft Corporation).

The second major corpus-based approach, benefiting likewise from improved rapid access to large databanks of text corpora, has been what is known as the example-based (or memory-based) approach.
Although first proposed in 1981 by Makoto Nagao, it was only toward the end of the 1980s that experiments began, initially in some Japanese groups and during the DLT project. The underlying hypothesis is that translation often involves the finding (or recalling) of analogous examples, that is, how a particular expression or similar phrase has been translated before. Example-based MT (EBMT) involves the matching of input phrases against equivalent or similar SL phrases in a corpus of parallel bilingual texts (aligned either statistically or by traditional rule-based methods), the extraction of TL equivalent or closely matching phrases (using, e.g., semantic networks or lexical frequencies), and – the most difficult process – the ‘re-combination’ of selected TL phrases to produce fluent and grammatical output.
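
The three EBMT steps just described (matching, extraction, and recombination) can be illustrated with a deliberately small Python sketch. It is not drawn from any of the systems named in this article: the English-Italian phrase pairs, the function names, the word-overlap similarity measure (taken from Python's standard difflib module), and the 0.5 threshold are all assumptions chosen for illustration, and recombination is reduced to plain concatenation, which is exactly the step that real systems find hardest.

```python
from difflib import SequenceMatcher

# Hypothetical toy "parallel corpus" of aligned SL/TL phrase pairs,
# invented for this illustration (English source, Italian target).
EXAMPLES = [
    ("the meeting", "la riunione"),
    ("starts at noon", "inizia a mezzogiorno"),
    ("the train", "il treno"),
    ("leaves at nine", "parte alle nove"),
]


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1] between two phrases."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def best_example(phrase: str, examples=EXAMPLES):
    """Matching step: return (score, sl, tl) for the closest stored SL phrase."""
    return max((similarity(phrase, sl), sl, tl) for sl, tl in examples)


def translate(phrases, threshold: float = 0.5) -> str:
    """Match each input phrase, extract its TL side, and recombine naively."""
    output = []
    for phrase in phrases:
        score, _sl, tl = best_example(phrase)
        # Extraction step: reuse the stored TL phrase if the match is close
        # enough; otherwise mark the phrase as untranslated.
        output.append(tl if score >= threshold else f"<{phrase}>")
    # Recombination step: simple concatenation (an assumption of this sketch).
    return " ".join(output)


print(translate(["the train", "starts at noon"]))  # il treno inizia a mezzogiorno
```

In this sketch an input such as "the old train" would still match the stored phrase "the train" and come out as "il treno", silently dropping the adjective; that kind of gap is one reason the text singles out recombination into fluent and grammatical output as the most difficult process.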


Example-based MT research is pursued actively at a number of Japanese and U.S. centers (Carl and Way, 2003).

Although the main innovations since 1990 have been corpus-based approaches, rule-based research has continued with projects on both transfer and interlingua systems. Examples of transfer systems are the PaTrans system for Danish/English translation of patents (based on Eurotra research) and the LMT system of Michael McCord at IBM. Interlingua-based projects are the CATALYST knowledge-based system at Carnegie-Mellon University for the Caterpillar company and the special-purpose systems (DIPLOMAT) developed for military operations, also at Carnegie-Mellon; the ULTRA system at New Mexico State University (Sergei Nirenburg); the UNITRAN system based on the linguistic theory of Principles and Parameters (see Principles and Parameters Framework of Generative Grammar) (Bonnie J. Dorr, University of Maryland); the Pangloss project, a collaborative project involving the universities of Southern California, New Mexico State, and Carnegie-Mellon; and the Universal Networking Language project at the Institute of Advanced Studies of the United Nations University (Tokyo), involving groups in some 15 countries.

Another new departure for MT research in the 1990s has been the growing interest in spoken language translation, with the challenge of combining speech recognition and the interpretation and production of conversation and dialog. The first long-standing group was established in 1986 at ATR Interpreting Telecommunications Research Laboratories (Nara, Japan), which has been developing a system for telephone registrations at international conferences and for telephone booking of hotel accommodation (Kurematsu and Morimoto, 1996). Slightly later came the JANUS project (under Alex Waibel) in a consortium of Carnegie-Mellon University, the University of Karlsruhe, and ATR.
The JANUS project has also focused on travel planning, but the system is designed to be readily expandable to other domains (Levin et al., 2000). Both projects continue. A third shorter-lived group, by SRI (Cambridge, UK) as part of its Core Language project, investigated Swedish-English translation via quasi-logical forms (Rayner et al., 2000); and on a larger scale there was a fourth project, Verbmobil (directed by Wolfgang Wahlster), funded from 1993 until 2000 by the German government. Its aim was to develop a transportable aid for face-to-face English-language commercial negotiations by Germans and Japanese not speaking fluent English. Although the basic goal was not achieved, efficient methodologies for dialog and speech translation were developed and the project established top-class research groups in German universities as a valuable by-product (Wahlster, 2000).

Evaluation of MT systems emerged as a major focus of research activity, now recognized as crucial for progress in MT research itself. Since the early 1990s, there have been numerous workshops dedicated specifically to problems of evaluation (often attached to MT conferences), and in particular there has been the series of Language Resources and Evaluation Conferences (LREC). Methodologies developed by the Japan Electronic Industry Development Association (JEIDA) and the Expert Advisory Group on Language Engineering Standards (EAGLES) and for the evaluation of ARPA-supported projects have been particularly influential.

The use of MT systems expanded greatly in the 1990s, particularly in commercial agencies, government services, and multinational companies, where translations are produced on a large scale, primarily of technical documentation. This was and remains the major market for the mainframe systems (Systran, Logos, METAL, and ATLAS), now usually on client-server configurations. Already in 1995, it was estimated that over 300 million words a year were being translated by such companies.

The most significant practical development for human translators has been the appearance in the early 1990s of the first translation workstations, which combine various machine aids (see Machine-aided Translation, Methods): multilingual word processing, OCR facilities, terminology management software, facilities for concordancing (facilities that translators had become familiar with in the 1980s), and in particular translation memories. The historical origins are described in Hutchins (1998).

Although MT systems for personal computers began to be marketed in the 1980s (e.g., the systems from Weidner, Globalink, and Toshiba), there has been a great expansion since 1990.
The increasing computational power and storage capacities of personal computers make these commercial systems the equal of previous mainframe systems of the 1980s and earlier – and, in many cases, more powerful. However, there has not been a matching improvement in translation quality. Nearly all are based on older transfer-based (or even direct translation) models; few have substantial and well-founded dictionaries; and most attempt to function as general-purpose systems, although most vendors do offer specialist dictionaries. In nearly all cases, systems are sold in three basic versions: systems for large corporations (enterprise systems), usually running


on client-server configurations; systems intended for independent translators (professional systems); and systems for nontranslators (home use).

The Internet has had a major impact since the mid-1990s. First, there has been the appearance of MT software products specifically for translating Web pages and electronic mail messages. Second, beginning in the mid-1990s, many MT vendors have provided Internet-based online translation services for translation on demand – pioneered by Systran on the French Minitel network during the 1980s, by CompuServe in 1995 based on the Transcend system, and then shortly afterward by AltaVista (the Babelfish service using Systran). There are now numerous other online services, some offering post-editing by human translators (revisers), at extra cost, but in most cases presenting unrevised results. Hence, translation quality is often poor, inevitably given the colloquial nature of many source texts, but these services are undoubtedly filling a significant (and apparently widely acceptable) demand for immediate rough translations for information purposes. MT is now reaching a mass market.

Further Reading

The general history of MT is covered by Hutchins (1986), updated by Hutchins (1988, 1993). Basic sources for the early period are Locke and Booth (1955), Edmundson (1961), Booth (1967), Rozencvejg (1974), Henisz-Dostert et al. (1979), Bruderer (1982), and Hutchins (2000). For the 1970s and 1980s, there are good descriptions of the main systems in Nirenburg (1987), King (1987), and Slocum (1988). For systems developed during the 1990s, sources include Dorr et al. (1999), Somers (2003), the journal Machine Translation, the biennial Machine Translation Summit conferences, workshops and other conferences for MT, conferences on language resources and evaluation and on computational linguistics, and the Machine Translation Archive.

See also: Chomsky, Noam (b. 1928); Computational Linguistics: History; Controlled Languages; Generative Grammar; Language Processing: Statistical Methods; Machine Translation: Interlingual Methods; Machine Translation: Overview; Machine-Aided Translation: Methods; Mel’čuk, Igor Aleksandrovic (b. 1932); Montague Semantics; Natural Language Understanding, Automatic; Parsing and Grammar Description, Corpus-Based; Principles and Parameters Framework of Generative Grammar; Speech Acts and Artificial Intelligence Planning Theory.

Bibliography

ALPAC (1966). Language and machines: computers in translation and linguistics. Report by the Automatic Language Processing Advisory Committee, Division of Behavioral Sciences, National Academy of Sciences, National Research Council. Washington, DC: National Academy of Sciences, National Research Council.
Bar-Hillel Y (1960). ‘The present status of automatic translation of languages.’ Advances in Computers 1, 91–163.
Booth A D (ed.) (1967). Machine translation. Amsterdam: North-Holland.
Brown P F, Cocke J, Della Pietra S A et al. (1990). ‘A statistical approach to machine translation.’ Computational Linguistics 16(2), 79–85.
Bruderer H E (ed.) (1982). Automatische Sprachübersetzung. Darmstadt: Wissenschaftliche Buch-Gesellschaft.
Carl M & Way A (eds.) (2003). Recent advances in example-based machine translation. Dordrecht: Kluwer Academic Publishers.
Dorr B J, Jordan P W & Benoit J W (1999). ‘A survey of current paradigms in machine translation.’ Advances in Computers 49, 1–68.
Edmundson H P (ed.) (1961). Proceedings of the National Symposium on Machine Translation. London: Prentice-Hall.
Henisz-Dostert B, Macdonald R R & Zarechnak M (1979). Machine translation. The Hague: Mouton.
Hutchins W J (1986). Machine translation: past, present, future. Chichester, UK: Ellis Horwood/New York: John Wiley.
Hutchins W J (1988). ‘Recent developments in machine translation: a review of the last five years.’ In Maxwell D et al. (eds.) New directions in machine translation. Dordrecht: Foris. 9–63.
Hutchins W J (1993). ‘Latest developments in machine translation technology: beginning a new era in MT research.’ In MT Summit IV: international cooperation for global communication. Tokyo: AAMT. 11–34.
Hutchins W J (1997). ‘From first conception to first demonstration: the nascent years of machine translation, 1947–1954. A chronology.’ Machine Translation 12(3), 195–252.
Hutchins W J (1998). ‘The origins of the translator’s workstation.’ Machine Translation 13(4), 287–307.
Hutchins W J (ed.) (2000). Early years in machine translation: memoirs and biographies of pioneers. Amsterdam/Philadelphia: John Benjamins.
King M (ed.) (1987). Machine translation today: the state of the art. Edinburgh, UK: Edinburgh University Press.
Kurematsu A & Morimoto T (1996). Automatic speech translation: fundamental technology for future cross-language communications. Amsterdam: Gordon and Breach.
Léon J (1997). ‘Les premières machines à traduire (1948–1960) et la filiation cybernétique.’ Bulag 22, 9–33.
Levin L, Lavie A, Woszczyna M, Gates D et al. (2000). ‘The JANUS-III translation system: speech-to-speech translation in multiple domains.’ Machine Translation 15(1–2), 3–25.
Locke W N & Booth A D (eds.) (1955). Machine translation of languages: fourteen essays. Cambridge, MA: MIT Technology Press.
Nirenburg S (ed.) (1987). Machine translation: theoretical and methodological issues. Cambridge, UK: Cambridge University Press.
Nirenburg S, Carbonell J, Tomita M & Goodman K (1992). Machine translation: a knowledge-based approach. San Mateo, CA: Morgan Kaufmann.
Rayner M, Carter D, Bouillon P et al. (2000). The spoken language translator. Cambridge, UK: Cambridge University Press.
Rozencvejg V J (ed.) (1974). Machine translation and applied linguistics (2 vols). Frankfurt: Athenaion Verlag [Also published as: Essays on lexical semantics (2 vols). Stockholm: Skriptor.].
Slocum J (ed.) (1988). Machine translation systems. Cambridge, UK: Cambridge University Press.
Somers H L (2003). ‘Machine translation: latest developments.’ In Mitkov R (ed.) The Oxford handbook of computational linguistics. Oxford: Oxford University Press. 512–528.
TMI (1992). Quatrième Colloque international sur les aspects théoriques et méthodologiques de la traduction automatique. Proceedings of the 4th International Conference on Theoretical and Methodological Issues in Machine Translation: Empiricist vs. rationalist methods in MT. Actes du colloque. Montréal, Canada: CCRIT-CWARC.
Wahlster W (ed.) (2000). Verbmobil: foundations of speech-to-speech translation. Berlin: Springer.

Relevant Websites

http://www.aamt.info – Asia-Pacific Association for Machine Translation, and its conferences.
http://www.aclweb.org – Association for Computational Linguistics and its conference and publication archive.
http://www.amtaweb.org – Association for Machine Translation in the Americas, and its conferences.
http://www.eamt.org – European Association for Machine Translation, and its conferences.
http://www.elra.info – Conferences on Language Resources and Evaluation.
http://www.mt-archive.info – Machine Translation Archive.

Machine Translation: Interlingual Methods
B Dorr, UMIACS, College Park, MD, USA
E Hovy, University of Southern California, Los Angeles, CA, USA
L Levin, Carnegie Mellon University, Pittsburgh, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction As described in the article on Machine Translation: Overview, machine translation (MT) methodologies are commonly categorized as direct, transfer, and interlingual. The methodologies differ in the depth of analysis of the source language and the extent to which they attempt to reach a language-independent representation of meaning or intent between the source and target languages. Interlingual MT typically involves the deepest analysis of the source language. Figure 1 – the Vauquois triangle (Vauquois, 1968) – illustrates these levels of analysis. Starting with the shallowest level at the bottom, direct transfer is made at the word level. Moving upward through syntactic and semantic transfer approaches, the translation occurs on representations of the source sentence structure and meaning, respectively. Finally, at the

interlingual level, the notion of transfer is replaced with a single underlying representation – the interlingua – that represents both the source and target texts simultaneously. Moving up the triangle reduces the amount of work required to traverse the gap between languages, at the cost of increasing the required amount of analysis (to convert the source input into a suitable pretransfer representation) and synthesis (to convert the posttransfer representation into the final target surface form). For example, at the base of the triangle, languages can differ significantly in word order, requiring many permutations to achieve a good translation. However, a syntactic dependency structure expressing the source text may be converted more easily into a dependency structure for the target equivalent because the grammatical relations (subject, object, modifier) may be shared despite word order differences. Going further, a semantic representation (interlingua) for the source language may totally abstract away from the syntax of the language, so that it can be used as the basis for the target language sentence without change. Comparing the effort required to move up and down the sides of the triangle to the effort to perform transfer, interlingual MT may be more desirable in


Figure 1 The Vauquois triangle for MT.

some situations than in others. Because in principle an interlingual representation of a sentence contains sufficient information to allow generation in any language, the more (and the more different) target languages there are, the more valuable an interlingua becomes. To translate from one source into N target languages, one needs (1 + N) steps using an interlingua compared to N steps of transfer (one to each target). But to translate pairwise among all the languages, one needs only 2N steps using an interlingua compared to about N² with transfer – a significant reduction for the former. In addition, since in theory it is not necessary to consider the properties of any other language during the analysis of the source language or generation of the target language, each analyzer and generator can be built independently by a monolingual development team. Each system developer only needs to be familiar with his/her language and the interlingua.

Another advantage of the interlingua approach is that interlingual representations can be used by NLP systems for other multilingual applications, such as cross-lingual information retrieval, summarization, and question answering (see Figure 2). For example, it is a basic assumption of the Semantic Web that webpages will contain not only source text but also some interlingual representations thereof, against which queries issued in other languages and translated into the interlingua can be matched, and from which various target-language versions of the webpages can be generated. In all of these applications, there is a reduction in computation over approaches that tailor the underlying representation to the idiosyncrasies of each of the input/output languages. Without an interlingual representation, all

these multilingual applications require the insertion of a translation step at least once and often in two different places. Although interlinguas are a topic of recurring interest, only one interlingual MT system has ever been made operational in a commercial setting – KANT (Nyberg and Mitamura, 2000) – and only a handful have actually been taken beyond the stage of a research prototype. Interesting research prototypes are Pangloss (Frederking et al., 1994), CICC, NESPOLE!, and ChinMT (Habash et al., 2003).
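The module-count argument made earlier in this Introduction can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "about N²" transfer steps means one directed transfer module per ordered language pair, i.e. N(N - 1).

```python
from typing import Tuple


def transfer_modules(n_languages: int) -> int:
    """Directed transfer modules for pairwise translation among n languages.

    Assumption: one module per ordered (source, target) pair, N * (N - 1),
    which the text rounds to "about N squared".
    """
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages: int) -> int:
    """One analyzer plus one generator per language, 2N in total."""
    return 2 * n_languages


def one_source_many_targets(n_targets: int) -> Tuple[int, int]:
    """(interlingua steps, transfer steps) for one source and n targets."""
    return 1 + n_targets, n_targets


if __name__ == "__main__":
    for n in (2, 3, 5, 10, 20):
        print(n, transfer_modules(n), interlingua_modules(n))
```

Under this counting the two approaches cost the same at N = 3, and the interlingua needs fewer modules for every larger N; for a single source with few targets, transfer remains cheaper by one step.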

Interlingua Definition and Components

An interlingua is a system for representing the meanings and communicative intentions of language. It can be defined as a triple (S, N, L) where:

• S is a collection of representation symbols, often defined in an ontology, where each symbol denotes a particular aspect of meaning or intention (sometimes individually, and sometimes in concert with others according to specific rules of combination).
• N is a notation, within which symbols can be composed into meanings. The rules governing notational well-formedness reflect the compositional derivation of complex meaning out of ‘atomic’ symbols, an operation that is basic to the theory of the interlingua.
• L is a lexicon, namely a collection of words of a human language such as English, in which each lexical element is associated directly or indirectly with one or more symbols from S. Interlingual MT systems typically include one lexicon for each language.
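As a rough illustration of the triple (S, N, L), the following sketch models the three components as plain data structures. Everything in it is an assumption made for the example: the class and field names, the tiny symbol inventory, the two relation names, and the reduction of the notation N to a set of permitted relations plus a recursive well-formedness check. It is not taken from any particular interlingual system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Symbol:
    """One representation symbol from S, optionally linked to a more general one."""
    name: str
    parent: Optional[str] = None


@dataclass
class Interlingua:
    symbols: Dict[str, Symbol]   # S: the symbol inventory
    relations: Set[str]          # N (simplified): relations a frame may use
    # L: one lexicon per language, mapping each word to one or more symbols
    lexicons: Dict[str, Dict[str, Set[str]]] = field(default_factory=dict)

    def ancestors(self, name: str) -> List[str]:
        """Follow parent links upward through the symbol taxonomy."""
        chain = []
        parent = self.symbols[name].parent
        while parent is not None:
            chain.append(parent)
            parent = self.symbols[parent].parent
        return chain

    def is_well_formed(self, frame: dict) -> bool:
        """A frame needs a known head symbol and only permitted relations."""
        if frame.get("head") not in self.symbols:
            return False
        for relation, value in frame.get("args", {}).items():
            if relation not in self.relations:
                return False
            if isinstance(value, dict) and not self.is_well_formed(value):
                return False
        return True


# Hypothetical toy inventory, for illustration only.
toy = Interlingua(
    symbols={
        name: Symbol(name, parent)
        for name, parent in [
            ("THING", None), ("EVENT", "THING"), ("BREAK", "EVENT"),
            ("OBJECT", "THING"), ("VASE", "OBJECT"),
        ]
    },
    relations={"agent", "theme"},
    lexicons={
        "en": {"break": {"BREAK"}, "vase": {"VASE"}},
        "es": {"romper": {"BREAK"}, "jarrón": {"VASE"}},
    },
)
```

With this toy instance, `toy.ancestors("VASE")` walks up to `OBJECT` and `THING`, and a frame such as `{"head": "BREAK", "args": {"theme": {"head": "VASE"}}}` passes the well-formedness check, while a frame using an undeclared relation does not.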


Figure 2 Use of Interlingua in multiple applications.

An interlingua instance is the representation of the meaning of a given fragment of text, such as a clause, sentence, or document. Such an instance is often written as a list of interconnected nested frame structures, where each proposition in the frame represents some atomic component of the total meaning. Details and examples of each of these components follow.

Representation Symbols

Typically, an interlingua comprises several kinds of symbols to represent meaning. The largest set can be thought of as the conceptual primitives; rather like the open-class words in a human language, these symbols stand for specific types of objects, events, relations, qualities, etc. Other, smaller, sets of symbols are defined to represent specific fields of meaning, and usually derive from a logical theory about the nature of some phenomenon. For example, the linguistic system of tense can be related to a theory of time, and time can be represented in an interlingua according to a highly formalized subsystem (Reichenbach, 1947; Allen, 1984). Other typical subfields of meaning represent space, causality, the epistemic status of events (actual, hypothetical, desired, etc.), etc. These symbols are often arranged as taxonomies in which each node stands for a symbol, and information stored at higher-level symbols is inherited downwards and shared by lower ones. The contents and structure of the taxonomy thereby embody, to some degree, the interlingua designer’s conceptualization of the world, making the taxonomy an ontology in the classical sense. Although ontologies are as old as Aristotle and are most commonly used in artificial intelligence systems to support complex reasoning, interlingua ontologies form a distinct type: they are generally large (comprising several thousands of symbols), contain relatively little information per symbol, and what information is contained is primarily

devoted to interlingua instance composition or linguistic behavior instead of to inference.

It is not uncommon for an interlingual MT system to contain both an upper-level, very general, ontology and then one or more specific domain-oriented ones. The upper ontology contains notions that are shared over all domains in common language; the lower ones encode distinct subworlds, such as finances, sports, chemistry, etc. Usually, the higher-level symbols represent conceptual and linguistic abstractions for which there are no words, and the lower-level ones more concrete meanings for which words exist in the various languages’ lexicons. (For example, the Penman Upper Model contains the nodes NonDecomposableObject and DecomposableObject to separate mass and count nouns.) One advantage of domain partitioning is ambiguity avoidance: the term bond in the financial domain has only one meaning, and in the chemistry domain another, enabling the MT system to proceed more expeditiously in each domain.

Ontologies developed for MT include ONTOS (ONTOS, 1989), SENSUS (Knight and Luk, 1994), and Mikrokosmos/OntoSem (Nirenburg and Raskin, 2004). Ontologies developed and used for language technology applications in general include WordNet (Fellbaum, 1998), the Penman Upper Model (Bateman et al., 1989), and Omega (Philpot et al., 2003). Omega can be browsed using the DINO browser.

Notation

The notation is the vehicle by which the symbols’ individual shades of meaning are assembled into a complex meaning. The notation is usually instantiated as a network of propositions represented as a set of nested frames, where each proposition employs the symbols of the interlingua, composed according to the specifications of the interlingua in general and of the symbols in particular. Typically, a frame has a frame header, which may include a frame identifier, and one or more propositions, each being a relation–value pair that links the frame header to the value via the relation. Figure 3 provides an example from the KANT system, representing the meaning of If the error persists, service is required. The frame headers – each marked with an asterisk (*) – of the two clauses are BE-PREDICATE and QUALIFYING-EVENT. BE-PREDICATE has two arguments, an attribute and a theme. Each of these is headed by another frame, REQUIRED and SERVICE, respectively. The QUALIFYING-EVENT has a PERSIST event whose theme is ERROR. In some sophisticated interlinguas, the notation contains separate zones for different kinds of meaning (Nirenburg and Raskin, 2004); typically a zone for world semantics (the conceptual content of the text), a zone for interpersonal semantics (information in the text reflecting the writer, reader, their relationship, etc., which often affects the style of the text rather than the content), and a zone for metatextual information (medium, such as spoken or written; genre, such as telegram, letter, report, article; situation, such as anonymous posting, personal delivery, etc.).

Figure 3 KANT representation of If the error persists, service is required.

Lexicon

An interlingua lexicon includes information about the nature and behavior of each word in the language. For example, events and actions (usually expressed as verbs) include information about their preferred arguments (agents, patients, instruments, etc.). In some interlinguas, this information may reflect the verbal predilections of one language more than another; for example, I swim across the river is expressed in Spanish as I cross the river swimmingly. Should the interlingual representation be anchored on swim or cross? The choice rests with the interlingua symbol set designer. To the degree such asymmetries in the interlingua prefer one language over another, it is said to deviate from true language neutrality. A representation system reflecting one language closely is often called shallow semantics.

Within a chosen representation system, the concepts on which events are anchored are called predicates and the participants in the event are called arguments, following the formalism of the logical representations used in artificial intelligence systems. Predicate-argument structure refers to the combination of an event concept and its participants; a given predicate is said to have a certain number of potential participants, its valency. For example, the verb load has a valency of 3: the person doing the loading, the item that is loaded, and the place where the item is loaded.

Semantic roles – often called thematic roles – are by far the most common approach to representing the arguments of a predicate semantically. However, the numerous variant theories display little agreement even on terminology. A small set of examples is shown in Table 1. A number of interlingua researchers have used semantic roles for interlingual MT (Dorr, 2001; Nyberg and Mitamura, 2000). More details are given in ‘Interlinguas in Machine Translation’ below.

Table 1 Examples of semantic roles
Role | Definition | Example
AGENT | An Agent should have the features of volition (able to make a conscious choice), sentience (having perception), causation (able to bring about an effect) and independent existence (existence not resulting from the action) | John broke the vase
THEME | The Theme is causally affected, or is in a state or changes state, or is in a location or changes location, or comes into or out of existence | John broke the vase
INSTR | The Instrument has causation but no volition. Typically, an instrument appears with an agent and can be paraphrased with ‘using’ | John broke the vase with a hammer
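Predicate-argument structure and valency, as described in this section, can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, the role labels AGENT, THEME, and INSTR follow Table 1, and GOAL is an assumed label for the place argument of load, which the text describes but does not name.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Predicate:
    """An event concept together with the semantic roles it can take."""
    name: str
    roles: Tuple[str, ...]

    @property
    def valency(self) -> int:
        # Valency is the number of potential participants.
        return len(self.roles)


def bind(predicate: Predicate, **fillers: str) -> Dict[str, str]:
    """Build a predicate-argument structure, rejecting roles the predicate lacks."""
    unknown = set(fillers) - set(predicate.roles)
    if unknown:
        raise ValueError(f"{predicate.name} has no role(s): {sorted(unknown)}")
    return {"predicate": predicate.name, **fillers}


# load: the loader, the item loaded, and the place (GOAL is an assumed label).
LOAD = Predicate("LOAD", ("AGENT", "THEME", "GOAL"))
# break: roles as in the Table 1 examples.
BREAK = Predicate("BREAK", ("AGENT", "THEME", "INSTR"))

# "John broke the vase with a hammer"
structure = bind(BREAK, AGENT="John", THEME="vase", INSTR="hammer")
```

Not every role has to be filled: `bind(BREAK, AGENT="John", THEME="vase")` corresponds to John broke the vase, while a role the predicate does not declare raises an error.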

Issues in Interlingua

The notion of an interlingua appeals to many, but building one is a complex undertaking. In this section we examine the issues faced by designers of interlinguas and interlingual MT systems.

Problems with Representing Meaning

Probably the central problem of interlingua design is the complexity of meaning. A great deal has been written about interlinguas, but no clear methodology exists for determining exactly how one should build a true language-neutral meaning representation, if such a thing is possible at all (Hovy and Nirenburg, 1992). It is always possible to add more detail to a meaning representation, but in order to implement an


MT system, the details must end at some point. To date no adequate criteria have been found for deciding when to stop refining the meaning representation, although some preliminary attempts have been made in the NESPOLE! project (Levin et al., 2003) and in the IAMTC project (see ‘Interlingual Annotation of Multilingual Text Corpora’ below). A basic design choice is granularity: the number of interlingual representation primitives. The parsimonious approach, exemplified by Conceptual Dependency (Schank and Abelson, 1977), declares that a small number of primitives are enough to compositionally represent all actions. This poses a daunting problem of meaning assembly that has never been seriously attempted. In contrast, the profligate approach, called ontological promiscuity (Hobbs, 1985), essentially allows a representation symbol for every shade of meaning (and certainly one for each lexical item). This poses a problem of representing the essential relatedness of notions such as buy and sell, come and go, etc. The ideal seems to have been to aim somewhere in between, seeking conceptual depth and coverage simultaneously. Many researchers (Nirenburg and Raskin, 2004) develop a deep semantic analysis that requires extensive world knowledge; the performance of deep semantic analysis (if required) depends on the (so far unproven) feasibility of representing, collecting, and efficiently storing large amounts of world and domain knowledge. This problem consumes extensive efforts in the broader field of artificial intelligence (Lenat, 1995). We present an example. What, principally, are the primitive concepts of the meaning representation for eat? Do we also need more specific primitives like ‘eat-politely’ and ‘eat-like-a-pig’? This distinction is required to distinguish between the verbs essen and fressen in German. In general, two strategies are possible. 
One is to adopt arbitrarily the conceptualizations of one language, and specify the variations of all others in terms thereof; the other is to multiply out all the distinctions found in any language. In the latter case one will obtain two interlingual items representing eat (because of German) and two for the object fish (because of the distinction between pez and pescado in Spanish). The situation worsens; in Japanese, the translation of the verb wear depends on where the object is worn, e.g., head or hands. Ontologies greatly support the profligate approach, because they allow one to concisely represent systematic relationships between groups of concepts. However, building an ontology remains a problem. For example, the WordNet-based component of the Omega ontology (Philpot et al., 2003) mentioned above contains 110 000 nodes and often provides

too many indistinguishable alternatives, whereas the Mikrokosmos-based component of Omega contains only 6000 concepts and does not offer all the concepts needed to represent the full meaning of a word. Thus the word extremely contains four concepts in WordNet-based Omega, and each sense is hard to distinguish from the others: (1) to a high degree or extent, favorably or with much respect; (2) to an extreme degree; (3) to an extreme degree, super; and (4) to an extreme degree or extent, exceedingly. On the other hand, the Mikrokosmos-based part of Omega does not contain even one concept for the word extremely.

Another issue raised with respect to interlinguas is that, because this representation is purportedly independent of the syntax of the source text, the target text generated reads more like a paraphrase than a strict translation. That is, the style and emphasis of the original text are lost. However, this is not so much a failure of the interlingua as its incompleteness, caused by a lack of understanding of the discourse and pragmatics required to recognize and appropriately reproduce style and emphasis. In fact, in some cases it may be an advantage to ignore the author’s style. Moreover, many have argued that, outside the field of artistic texts (poetry and fiction), preservation of the syntactic form of the source text in translation is completely superfluous (Goodman and Nirenburg, 1991; Whitelock, 1989). For example, the passive voice constructions in the two languages may not convey identical meanings.

Taken overall, the current state of the art seems to confirm that it is possible to produce interlinguas that are reliably adequate between language groups (e.g., Japanese and Western European) for specialized domains only.

Divergences

An important problem addressed by interlingua approaches is that of structural differences between languages – language divergences – e.g., English fear vs. Spanish tener miedo de. Some examples from Dorr (1993) are:

• Categorial divergence: the translation of words in one language into words that have different parts of speech in another language. For example, to be jealous – tener celos (‘to have jealousy’).
• Conflational divergence: the translation of two or more words in one language into one word in another language. Examples include to kick – dar una patada (‘give a kick’).
• Structural divergence: the realization of verb arguments in different syntactic configurations in different languages. For example, to enter the house – entrar en la casa (‘enter in the house’).
• Head-swapping divergence: the inversion of a structural dominance relation between two semantically equivalent words when translating from one language to another. For example, to run in – entrar corriendo (‘enter running’).
• Thematic divergence: the realization of verb arguments in syntactic configurations that reflect different thematic to syntactic mapping orders. For example, I like grapes – me gustan uvas (‘to-me please grapes’).

Many divergences are caused by differences in language typology. For example, many verb-serializing languages express the benefactive (e.g., write a letter for me) in a serial verb construction (e.g., write letter give me). Some types of meaning are particularly susceptible to divergences. In English, sentences expressing the speech act of suggesting (How about going to the conference? Why not go to the conference?) cannot be translated literally into most other languages. Divergences are also common in expressions of modality. For example, the expression of deontic modality in you had better go in English can be expressed in Japanese roughly as Itta hoo ga ii, literally ‘go(past form) way/option/alternative subj-marker good’ or ‘(the) option (of) going (is) good.’ Some authors have argued that divergences may be the norm rather than the exception (Levin and Nirenburg, 1994).

Resolution of cross-language divergences is an area where the differences in MT architecture are most crucial. Many MT approaches resolve such divergences by means of construction-specific rules that map from the predicate-argument structure of one language into that of another. The interlingua approach to MT takes advantage of the compositionality of basic units of meaning to resolve divergences. For example, the conflational divergence above is resolved by mapping English kick into two components, the motional component (movement of the leg) and the manner (a kicking motion) before translating into a language like Spanish.
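The compositional treatment of the kick divergence can be sketched as follows. This is a toy illustration, not the mechanism of any system cited here: the component labels (LEG-MOVEMENT, KICKING), the lexicon layout, and the function names are all assumptions. The point is only that each language maps its own surface forms to shared meaning components, so no English-to-Spanish rule is written anywhere.

```python
from typing import Dict, Tuple

# A meaning is a sorted tuple of (component, value) pairs (hypothetical labels).
Meaning = Tuple[Tuple[str, str], ...]

KICK: Meaning = (("manner", "KICKING"), ("motion", "LEG-MOVEMENT"))

# One monolingual lexicon per language; neither refers to the other language.
LEXICONS: Dict[str, Dict[str, Meaning]] = {
    "en": {"kick": KICK},
    "es": {"dar una patada": KICK},
}


def analyze(phrase: str, language: str) -> Meaning:
    """Source-side step: surface form to interlingual components."""
    return LEXICONS[language][phrase]


def generate(meaning: Meaning, language: str) -> str:
    """Target-side step: find a surface form expressing the same components."""
    for phrase, candidate in LEXICONS[language].items():
        if candidate == meaning:
            return phrase
    raise KeyError(f"no {language} expression for {meaning}")


def translate(phrase: str, source: str, target: str) -> str:
    return generate(analyze(phrase, source), target)
```

Here `translate("kick", "en", "es")` yields the multiword Spanish expression and the reverse call recovers the single English verb, because both directions go through the same pair of components.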

Interlinguas in Machine Translation

A typical interlingual system is illustrated schematically in Figure 4. Each language requires an analyzer and a synthesizer. The analyzer takes as input a source language sentence and produces as output an interlingual representation of the meaning. The synthesizer takes an interlingual representation of meaning as input and produces one or more sentences with that meaning. In theory, it is not necessary to consider the properties of another language during the analysis of the source language or generation of the target language. To translate from language L1 to L2, L1’s analyzer produces an interlingual representation and L2’s synthesizer generates an L2 sentence with the same meaning. Below we illustrate several representative examples of interlingual representations used by developers of interlingual MT systems.

Pangloss

The Pangloss project (Frederking et al., 1994) started as an ambitious attempt to build rich interlingual expressions using humans to augment system analysis. As shown in Figure 5, the representation includes a set of frames representing semantic components (each headed by a unique identifier such as %proposition_5) and a separate frame with aspectual information (see %aspect_5 at bottom) representing duration, telicity, etc. Some modifiers are treated as scalars and represented by numerical values; the phrase active expansion is represented in %expand_1 with an intensity of 0.75 (out of 1.0). Note also that all implicit arguments (for instance, the agent of %expand_1) are explicitly included.

Figure 4 Interlingual MT system architecture.

Figure 5 Pangloss interlingual representation of The Sezon Group will pursue an active overseas expansion policy by means of the tie-up with SAS.
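The frame organization described for Pangloss can be pictured roughly as follows. This is a hedged sketch and does not reproduce Figure 5 or the actual Pangloss notation: only the identifiers %proposition_5, %expand_1, and %aspect_5 and the intensity value 0.75 come from the text, while the slot names, the %company_1 frame, the aspect values, and the choice of the Sezon Group as the explicit agent are assumptions made for illustration.

```python
# Hypothetical sketch: only the three frame identifiers named in the text and
# the 0.75 intensity are taken from the article; everything else is assumed.
frames = {
    "%proposition_5": {"components": ["%expand_1"], "aspect": "%aspect_5"},
    "%expand_1": {"agent": "%company_1", "intensity": 0.75},  # agent made explicit
    "%company_1": {"name": "Sezon Group"},
    "%aspect_5": {"duration": "prolonged", "telic": False},
}


def fillers(frame_id: str) -> dict:
    """Return a frame's slots, replacing frame identifiers by the frames they name."""
    resolved = {}
    for slot, value in frames[frame_id].items():
        if isinstance(value, str) and value in frames:
            resolved[slot] = frames[value]
        else:
            resolved[slot] = value
    return resolved


print(fillers("%expand_1"))
```

Storing each component under its own identifier lets several frames point at the same participant, which is one simple way to make an implicit argument explicit.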


The focus of the Mikrokosmos project – more recently dubbed OntoSem (Nirenburg and Raskin, 2004) – is to produce semantically rich text-meaning representations (TMRs) of unrestricted text that can be used in a wide variety of applications, including as an interlingua for MT. These representations provide the basis for addressing some of the most difficult problems of NLP, such as disambiguation and all aspects of reference resolution, from reconstructing elliptical utterances to linking textual referents to their real-world ‘anchors’ in a fact repository. TMRs (OntoSem’s interlingua expressions) use a language-independent metalanguage compatible with that used to represent the underlying static knowledge resources: the ontology and the ontologically linked lexicons.

A sample TMR for the input He asked the UN to authorize the war is shown in Figure 6. (Capital letters indicate ontological concepts; the indices represent numbered instances of ontological concepts in the world model built up during this run of the system.) This says that the word ask instantiates the 69th instance of the concept REQUEST-ACTION, whose agent is HUMAN-72 (the instantiation of he, which was resolved as Colin Powell using reference resolution procedures), whose beneficiary is ORGANIZATION-71 (the instantiation of UN, which was resolved to UNITED-NATIONS using reference resolution procedures), and whose theme is ACCEPT-70 (the instantiation of authorize, whose theme is WAR-73, the semantic representation of the meaning of the word war). One goal of recent work in the OntoSem environment has been to create TMRs for large amounts of text, populate a fact repository using a subset of information from the TMRs, and then use the fact repository as a language-independent search space for applications such as question answering and knowledge extraction.
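The structure of the sample TMR, as described in the prose, can be sketched as a small graph of concept instances. The instance names and role labels below are taken from the description above; the dictionary encoding and the two helper functions are assumptions for illustration and are not OntoSem's own format.

```python
# Concept instances and roles as described in the text; the encoding is assumed.
tmr = {
    "REQUEST-ACTION-69": {
        "agent": "HUMAN-72",
        "beneficiary": "ORGANIZATION-71",
        "theme": "ACCEPT-70",
    },
    "ACCEPT-70": {"theme": "WAR-73"},
    "HUMAN-72": {},
    "ORGANIZATION-71": {},
    "WAR-73": {},
}


def concept_of(instance: str) -> str:
    """Strip the numeric instance index, e.g. 'WAR-73' -> 'WAR'."""
    return instance.rsplit("-", 1)[0]


def describe(instance: str, depth: int = 0) -> list[str]:
    """Return an indented outline of an instance and its role fillers."""
    lines = ["  " * depth + instance]
    for role, filler in tmr[instance].items():
        lines.append("  " * (depth + 1) + role + ":")
        lines.extend(describe(filler, depth + 2))
    return lines


print("\n".join(describe("REQUEST-ACTION-69")))
```

Because every filler is itself an indexed instance, the same HUMAN-72 or ORGANIZATION-71 could be referred to from other events in the text, which is what makes such a structure usable as a store of facts.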

Figure 6 OntoSem interlingual representation of He asked the UN to authorize the war.

Figure 7 Japangloss interlingual representation of It is possible that you must eat chicken or You might have to eat chicken.

JapanGloss

The interlingua notation developed for the Japangloss MT system and the Nitrogen generator (Knight and Langkilde, 2000) used symbols from the SENSUS ontology (Knight and Luk, 1994), one of the precursors of Omega. In this notation, frame identifiers are symbols like h1, and SENSUS symbols are delimited by bars. In contrast to many other interlinguas, modality predicates (e.g., likelihood and necessity) are represented as frame predicates, in the same way as ordinary actions and events. Thus in the example given in Figure 7, which represents It is possible that you must eat chicken (equivalently, You might have to eat chicken), e4 is the eating of the chicken by you, which by h2 is obligatory, which in turn by h1 is possible.

KANT

KANT is the only interlingual MT system that has ever been made operational in a commercial setting. The KANT system (Nyberg and Mitamura, 2000) is a knowledge-based, interlingual machine translation system. KANT is designed for translation of technical documents written in Controlled English to multiple target languages (see Controlled Languages). The KANT Analyzer produces an interlingua expression for each sentence in the input document; an example appeared earlier in Figure 3. This interlingua is mapped into an appropriate target sentence by the KANT Generator. For each target language there is a separate lexicon and grammar. The KANT system was integrated with the ClearCheck document checking interface (built by Carnegie Group) and deployed in the Caterpillar document workflow during the mid-1990s. The work for Caterpillar involved development of Caterpillar Technical English (CTE), a corresponding KANT Analyzer, and KANT Generators for French, Spanish, and German. The system delivered to Caterpillar represents the first large-scale deployment of controlled language checking integrated with machine translation. The interlingua used in the KANT system is based on research on the generation of additional target languages, such as Portuguese, Italian, Russian, Chinese, and Turkish.

Interlingual Systems for MT of Spoken Language

The interlingua approach to machine translation has been implemented in several demos and prototypes for translation of spoken language. MT for spoken language begins with speech recognition. The output of the speech recognizer is then passed to the source language analysis module of the MT system. In addition to the problems faced by MT for text, MT for spoken language must deal with disfluencies in speech and imperfect output from a speech recognizer. For this reason, most spoken language MT systems are restricted to task-oriented domains such as travel planning or doctor–patient interviews.

Interlinguas for spoken, task-oriented dialogue typically focus on the dialogue act that the speaker intends to accomplish with the utterance. Examples of dialogue acts are suggesting, accepting, and rejecting a time or price. In interlinguas for spoken language, less emphasis is placed on predicate-argument structure. The emphasis on speaker intent means that the same interlingual representation will be used for sentences that have very different syntactic structures. For example, the following sentences all carry out the dialogue act of giving information about the price of a room. The concept of costing is expressed by the verb (cost) in the first sentence, and the subject (price) in the second sentence. In the third sentence, the concept of costing is implicit in the predicate nominal (one hundred dollars).

The room costs one hundred dollars per night.
The price of the room is one hundred dollars per night.
The room is one hundred dollars per night.
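The point that speaker intent, rather than syntax, determines the representation can be shown with a toy analyzer. This is only a sketch under strong assumptions: the frame layout, the label names, and the keyword-and-pattern matching are invented for illustration and do not correspond to the interlingua of JANUS, NESPOLE!, or any other system named in this article.

```python
import re

# Hypothetical miniature lexicon of spoken number phrases.
NUMBER_WORDS = {"one hundred": 100, "two hundred": 200}
PRICE_PATTERN = re.compile(r"(one hundred|two hundred) dollars per night")


def analyze(sentence: str) -> dict:
    """Map a sentence to a dialogue-act frame, ignoring its syntactic shape."""
    text = sentence.lower()
    match = PRICE_PATTERN.search(text)
    if match and "room" in text:
        return {
            "dialogue_act": "give-information",
            "topic": "price",
            "object": "room",
            "amount": NUMBER_WORDS[match.group(1)],
            "currency": "dollar",
            "per": "night",
        }
    return {"dialogue_act": "unknown"}


sentences = [
    "The room costs one hundred dollars per night.",
    "The price of the room is one hundred dollars per night.",
    "The room is one hundred dollars per night.",
]
frames = [analyze(s) for s in sentences]
print(frames[0])
print(all(frame == frames[0] for frame in frames))
```

All three sentences yield the same frame even though the notion of costing is carried by a verb, a subject noun, and a bare predicate nominal respectively.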

The JANUS system of the early 1990s was the earliest spoken language MT system to use an interlingua (Levin et al., 2000). JANUS was part of the C-STAR consortium (Consortium for Speech Translation Advanced Research), many of whose members adopted the interlingua approach for an international demo in 1999. Other interlingual speech translation systems include Enthusiast, CCLINC, NESPOLE!, Speechalator, Carnegie Mellon University’s Thai speech translation system (Schultz et al., 2004), and FAME. Figure 8 provides an example from the NESPOLE! project, in which both sentences are represented by the given interlingua instance. The NESPOLE! interlingua is based on an annotated corpus of transcribed dialogues in English, German, Italian, and Japanese. It has also been applied to Chinese, Spanish, and French. Its precursor, the C-STAR interlingua, has also been applied to Korean.

Figure 8 Two sentences and corresponding interlingua instance from the NESPOLE! project.

Universal Networking Language

The Universal Networking Language (UNL) is a formal language designed to support automatic multilingual information exchange (Martins et al., 2000). It is intended to be a cross-linguistic semantic representation of sentence meaning consisting of concepts (e.g., ‘cat,’ ‘sit,’ ‘on,’ or ‘mat’), concept relations (e.g., ‘agent,’ ‘place,’ or ‘object’), and concept predicates (e.g., ‘past’ or ‘definite’). The UNL syntax supports the representation of a hypergraph whose nodes represent Universal Words and whose arcs represent Relation Labels. An example is shown in Figure 9 for the sentence The cat sat on the mat. Several semantic relationships (synonymy, antonymy, hyponymy, hypernymy, meronymy, etc.) hold between Universal Words; together these make up the UNL ontology.

Lexical Conceptual Structure

Lexical Conceptual Structure (LCS) is an interlingual representation used in a Chinese-English machine translation (MT) system called ChinMT (Habash et al., 2003). It has also been used for many other MT language pairs (e.g., Spanish and Arabic) and for other natural language applications (e.g., cross-language information retrieval). The LCS-based approach focuses on the types of divergences described above in ‘Divergences.’ Consider, for example, the case of a conflational divergence between Arabic and English:

Arabic: [...]
Gloss: ‘The-reporter sent email to Al-Jazeera.’
English: The reporter emailed Al-Jazeera.

Figure 9 UNL representation of The cat sat on the mat.


The LCS representation for this example is shown in Figure 10, glossed as ‘The reporter caused the email to go to Al-Jazeera in a sending manner.’ Here, the primary components of meaning are the top-level conceptual nodes cause and go. These are taken together with their arguments, each identified by a semantic role (agent, theme, and goal), and a modifier (manner), send+ingly.

Approximate Interlingua

One response to the MT divergence problem (discussed in ‘Divergences’) is the use of an approximate interlingua (Habash et al., 2003). In this approach, the depth of knowledge-based systems is approximated by tapping into the richness of resources in one language (often English), and this information is used to map the source-language (SL) input to the target-language (TL) output. The focus of the approximate-interlingua approach is to address the types of divergences covered by the LCS-based approach, but with fewer knowledge-intensive components. Thus, a key feature of an approximate interlingua is the coupling of basic argument-structure information with some, but not all, components of the LCS representation. Only the top-level primitives and semantic roles are retained. This new representation provides the basis for generating multiple sentences that are statistically pared down so that the most likely sentence is generated according to the constraints of the TL.

Consider, for example, the conflational divergence between Arabic and English given above (‘Lexical Conceptual Structure’). Figure 11 illustrates the approximate-interlingua approach to translation for this example. The top-level conceptual nodes are first checked for possible matches. Following this, unmatched thematic roles are checked for conflatability, i.e., cases where semantic roles are absorbed into other predicate positions. As long as there is a relation between the conflated argument (EMAIL_N) and the new predicate node (EMAIL_V), part of speech is disregarded.

Figure 10 LCS representation of The reporter emailed Al-Jazeera.
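The relation between the full LCS and its approximate counterpart can be sketched as follows. This is an illustrative reading of the prose rather than the representation used in ChinMT: the LCSNode class, the "event" slot name, the attachment of the manner modifier to go, and the lemma-equality test for conflatability (with the noun and verb written here as EMAIL_N and EMAIL_V) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class LCSNode:
    primitive: str                                   # e.g. "cause", "go"
    roles: dict = field(default_factory=dict)        # semantic role -> filler
    modifiers: dict = field(default_factory=dict)    # e.g. manner


# 'The reporter caused the email to go to Al-Jazeera in a sending manner.'
lcs = LCSNode(
    "cause",
    roles={
        "agent": "reporter",
        "event": LCSNode(  # hypothetical slot name for the embedded go-event
            "go",
            roles={"theme": "email", "goal": "Al-Jazeera"},
            modifiers={"manner": "send+ingly"},
        ),
    },
)


def approximate(node: LCSNode) -> LCSNode:
    """Keep only primitives and semantic roles; drop modifiers such as manner."""
    kept = {}
    for role, filler in node.roles.items():
        kept[role] = approximate(filler) if isinstance(filler, LCSNode) else filler
    return LCSNode(node.primitive, roles=kept)


def conflatable(argument: str, predicate: str) -> bool:
    """Assumed test: the lemmas match once the part-of-speech suffix is ignored."""
    return argument.rsplit("_", 1)[0] == predicate.rsplit("_", 1)[0]


approx = approximate(lcs)
print(approx.roles["event"].roles)        # theme and goal are retained
print(approx.roles["event"].modifiers)    # the manner modifier is gone
print(conflatable("EMAIL_N", "EMAIL_V"))  # True
```

The stripped structure keeps enough information (cause, go, agent, theme, goal) to check whether the theme of the Arabic sentence can be absorbed into the English verb.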

Annotating Text with Interlingual Information

The success of corpus-based language technology over the past decade has shown the value of systems that automatically learn their processing from large collections of annotated examples. Although no one has yet created an interlingua-annotated corpus to parallel the 1 million words plus syntax trees of the Penn Treebank (Kingsbury et al., 2002) (see Treebanks and Tagsets), several efforts to annotate important parts of an interlingua are underway. Principally, these efforts focus on verbs and their arguments. We list these and then describe one initiative – IAMTC – in more detail to illustrate the issues involved in annotation.

Semantic Annotation Initiatives

WordNet (Fellbaum, 1998) (see WordNet(s)) provides a terminology taxonomy for English containing over 100 000 terms. Several ontology-building efforts have used this resource as a starting point. Focusing on the creation of wordnets for other languages, the Global WordNet Association lists EuroWordNet, GermaNet, BalkaNet, and many others. Term taxonomizing and ontologizing efforts include the Chinese HowNet and the Mimida multilingual semantic network.

Focusing on verbs alone, the FrameNet project (Baker et al., 1998) is classifying all verbs into groups according to the case roles (thematic roles) they support. The SALSA project parallels FrameNet, working on German verbs. Other FrameNet-related projects are available for Japanese and Spanish. The PropBank project resembles FrameNet in that it focuses on verbs, but it does not employ a fixed set of case roles, preferring instead a more neutral set of labels with no overall semantics. VerbNet is an associated effort to assign FrameNet-like case roles to verbs. There is a list that combines VerbNet and FrameNet. The NomBank Project closely parallels PropBank, but focuses on nouns (such as nominalized verbs and relational nouns) with argument structure.

The Interlingual Annotation of Multilingual Text Corpora (IAMTC) project is an ambitious attempt to investigate interlingual semantics by annotating and comparing semantic phenomena across six languages. Bilingual corpora have been prepared that pair English texts with corresponding texts in Japanese, Spanish, Arabic, Hindi, French, and Korean; annotators are assigned to each text in pairs to select semantic representation symbols from the Omega ontology (Philpot et al., 2003) for all nouns, verbs, adjectives, and adverbs. We describe this project in more detail below.

Figure 11 Approximate interlingua for English-Arabic example.

Interlingual Annotation of Multilingual Text Corpora

The IAMTC project has the following goals:

• Development of an interlingual representation framework based on a careful study of text corpora in six languages and their translations into English.
• Development of a methodology for accurately and consistently assigning such representations to texts across languages and across annotators.
• Annotation of a corpus of six multilingual parallel subcorpora, using the agreed-upon interlingual representation.
• Development of semantic annotation tools that serve to facilitate more rapid annotation of texts.
• Design of new metrics and evaluations for the interlingual representations, in order to evaluate the degree of annotator agreement and the granularity of meaning representation.

The IAMTC project is radically different from those annotation projects that have focused on morphology, syntax, or even certain types of semantic content (e.g., for word sense disambiguation). It is most similar to PropBank (Kingsbury et al., 2002) and FrameNet (Baker et al., 1998). However, IAMTC places an emphasis on:

1. a more abstract level of mark-up (interpretation);
2. the assignment of a well-defined meaning representation to concrete texts;
3. issues of a community-wide consistent and accurate annotation of meaning.

The data set consists of six bilingual parallel corpora. Each corpus is made up of 125 source language news articles along with three independently produced translations into English. (The source news articles for each individual language corpus are different from the source articles in the other language corpora.) The source languages are Japanese, Korean, Hindi, Arabic, French, and Spanish. Typically, each article contains between 300 and 400 words (or the equivalent), and thus each corpus, counting the source articles and their three translations, has between 150 000 and 200 000 words. The Spanish, French, and Japanese corpora are based on DARPA’s 1994 MT evaluation data. The Arabic corpus is based on LDC’s Multiple Translation Arabic, Part 1.

The interlingual representation comprises three levels and incorporates knowledge sources such as the Omega ontology (Philpot et al., 2003) and thematic roles (Dorr, 2001). The three levels of representation are referred to as IL0, IL1, and IL2. The aim is to perform the annotation process incrementally, with each level of representation incorporating additional semantic features and removing existing syntactic ones. IL2 is intended as the interlingual level that abstracts away from (most) syntactic idiosyncrasies of the source language. IL0 and IL1 are intermediate representations that are useful stepping stones for annotating at the next level.

Issues in Interlingual Annotation

A preliminary investigation of intercoder agreement on multiple annotations shows that annotators improve as they gain experience with the process, which raises intercoder agreement (Mitamura et al., 2004). Two assumptions may be made regarding the training of novice annotators in order to improve intercoder agreement. One is that novice annotators may make inconsistent annotations within the same text, but these may be reconciled through a process of intra-annotator consistency checking, in which annotators go over their results to find any inconsistencies within the text. Another assumption is that, if two annotators at the same site discuss their annotation results after their annotation tasks are completed, their judgments may be reconciled through a process of inter-annotator checking, in which each annotator votes, they discuss the differences, and then vote again.

From an MT perspective, issues include evaluating consistency in the use of the annotation language, given that any source text can result in multiple, different, legitimate translations (Farwell and Helmreich, 2003). Along these lines, there is the problem of annotating texts for translation without including in the annotations inferences resulting from the source text. The IAMTC effort described above is the only initiative, to date, that addresses issues of this type in large-scale annotation of data for use in interlingual MT.

See also: Controlled Languages; Machine Translation: Overview; Treebanks and Tagsets; WordNet(s).

Bibliography

Allen J F (1984). ‘Towards a general theory of action and time.’ Artificial Intelligence 23(2), 123–164.
Baker C, Fillmore C J & Lowe J B (1998). ‘The Berkeley FrameNet Project.’ In Proceedings of the 17th International Conference on Computational Linguistics, Montreal. 86–90.
Bateman J A, Kasper R T, Moore J D & Whitney R A (1989). ‘A general organization of knowledge for natural language processing: the Penman Upper Model.’ Unpublished research report, USC/Information Sciences Institute, Marina del Rey, CA. A version of this paper appears in 1990 as: Upper Modeling: A Level of Semantics for Natural Language Processing. In Proceedings of the 5th International Workshop on Language Generation. Pittsburgh, PA.
Dorr B J (1993). Machine translation: a view from the lexicon. Cambridge: MIT Press.
Dorr B J (2001). ‘LCS verb database, online software database of lexical conceptual structures and documentation.’ University of Maryland. http://www.umiacs.umd.edu/~bonnie/LCS_Database_Documentation.html.
Farwell D & Helmreich S (2003). ‘Pragmatics-based translation and MT evaluation.’ In Proceedings of Towards Systematizing MT Evaluation. Workshop at the International Machine Translation Summit IX, New Orleans, LA. 21–28.
Fellbaum C (ed.) (1998). WordNet: an electronic lexical database. Cambridge: MIT Press.
Frederking R, Nirenburg S, Farwell D, Helmreich S, Hovy E H, Knight K, Beale S, Domashnev C, Attardo D, Grannes D & Brown R (1994). ‘The Pangloss Mark III Machine Translation System.’ In Proceedings of the 1st AMTA Conference. Columbia, MD.
Goodman K & Nirenburg S (eds.) (1991). The KBMT project: a case study in knowledge-based machine translation. San Mateo: Morgan Kaufmann.
Habash N, Dorr B J & Traum D (2003). ‘Hybrid natural language generation from lexical conceptual structures.’ Machine Translation 18(2), 81–128.
Hajič J, Vidová-Hladká B & Pajas P (2001). ‘The Prague Dependency Treebank: annotation structure and support.’ In Proceedings of the IRCS Workshop on Linguistic Databases. University of Pennsylvania, Philadelphia, PA. 105–114.
Hobbs J R (1985). ‘Ontological promiscuity.’ In Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics (ACL). 61–69.

Hovy E H & Nirenburg S (1992). ‘Approximating an interlingua in a principled way.’ In Proceedings of the DARPA Speech and Natural Language Workshop. Arden House, NY.
Kingsbury P, Palmer M & Marcus M (2002). ‘Adding predicate argument structure to the Penn TreeBank.’ In Proceedings of the Human Language Technology Conference (HLT 2002), 252–256.
Kipper K & Palmer M (2000). ‘Representation of actions as an interlingua.’ In Proceedings of the Third AMTA SIG-IL Workshop on Interlinguas and Interlingual Approaches. Seattle, WA.
Knight K & Langkilde I (2000). ‘Preserving ambiguities in generation via automata intersection.’ In Proceedings of the 17th National Conference on Artificial Intelligence, Austin, TX. 697–702.
Knight K & Luk S K (1994). ‘Building a large-scale knowledge base for machine translation.’ In Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA. 773–778.
Lenat D B (1995). ‘CYC: a large-scale investment in knowledge infrastructure.’ Communications of the ACM 38(11), 32–38.
Levin B (1993). English verb classes and alternations: a preliminary investigation. Chicago: University of Chicago Press.
Levin L & Nirenburg S (1994). ‘Construction-based MT lexicons.’ In Zampolli A, Calzolari N & Palmer M (eds.) Current issues in computational linguistics: in honour of Don Walker. Pisa: Giardini editori e stampatori and Kluwer publishers. 321–338.
Levin L, Lavie A, Woszczyna M, Gates D, Gavaldà M, Koll D & Waibel A (2000). ‘The Janus III translation system.’ Machine Translation 15, 3–25.
Levin L, Langley C, Lavie A, Gates D, Wallace D & Peterson K (2003). ‘Domain specific speech acts for spoken language translation.’ In Proceedings of the 4th SIGdial Workshop on Discourse and Dialogue. Sapporo, Japan. 208–217.
Martins T, Machado Rino L H, Volpe Nunes M G, Montilha G & Osvaldo Novais O (2000). ‘An interlingua aiming at communication on the web: how language-independent can it be?’ In Proceedings of Workshop on Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP. Workshop at ANLP-NAACL. Seattle, WA. 208–217.
Mitamura T, Miller K J, Dorr B J et al. (2004). ‘Semantic annotation of multilingual text corpora.’ Proceedings of the Workshop on Semantic Labelling for NLP Tasks, Lisbon.
Nirenburg S & Raskin V (2004). Ontological Semantics. Cambridge: MIT Press.
Nyberg E & Mitamura T (2000). ‘The KANTOO machine translation environment.’ In White J S (ed.) Envisioning machine translation in the information future. 4th Conference of the Association for Machine Translation in the Americas (AMTA 2000). Lecture Notes in Artificial Intelligence, Vol. 1934. Berlin: Springer Verlag.

Philpot A, Fleischman M & Hovy E H (2003). ‘Semiautomatic construction of a general purpose ontology.’ In Proceedings of the International Lisp Conference. New York.
Reichenbach H (1947). Elements of symbolic logic. London: Collier Macmillan.
Schank R C & Abelson R P (1977). Scripts, plans, goals, and understanding: an enquiry into human knowledge structures. Hillsdale, NJ: Lawrence Erlbaum.
Schultz T, Alexander D, Black A W, Peterson K, Suebvisai S & Waibel A (2004). ‘A Thai speech translation system for medical dialogs.’ In Proceedings of the Conference on Human Language Technologies (HLT-NAACL). Boston, MA. Companion Volume 34–35.
Vauquois B (1968). ‘A survey of formal grammars and algorithms for recognition and transformation in machine translation.’ In Proceedings of the IFIP Congress-6. 254–260.
Whitelock P (1989). ‘Why transfer and interlingua approaches to MT are both wrong: a position paper.’ In Proceedings of the MT Workshop: Into the 90’s. Manchester, England.

Relevant Websites

http://www.cicc.or.jp – CICC website.
http://nespole.itc.it – NESPOLE! website.
http://www.umiacs.umd.edu – UMIACS website.
http://www.isi.edu.
http://www.lti.cs.cmu.edu.
http://blombos.isi.edu – DINO browser.
http://www-2.cs.cmu.edu – Enthusiast and Speechalator.
http://www.ll.mit.edu – CCLINC.
http://isl.ira.uka.de – FAME.
http://www.cogsci.princeton.edu – WordNet.
http://www.globalwordnet.org – Global WordNet Association.
http://www.illc.uva.nl – EuroWordNet.
http://www.sfs.nphil.uni-tuebingen.de – GermaNet.
http://www.ceid.upatras.gr – BalkaNet.
http://www.keenage.com – Chinese HowNet.
http://www.gittens.nl – Mimida multilingual semantic network.
http://www.icsi.berkeley.edu – FrameNet project.
http://www.coli.uni-sb.de – SALSA project.
http://www.nak.ics.keio.ac.jp – FrameNet project for Japanese.
http://gemini.uab.es – FrameNet project for Spanish.
http://www.cis.upenn.edu – PropBank project.
http://www.cis.upenn.edu – VerbNet.
http://www.cis.upenn.edu – combination of VerbNet and FrameNet.
http://nlp.cs.nyu.edu – The NomBank Project.
http://aitc.aitcnet.org – IAMTC project.

Machine-Aided Translation: Methods
E Macklovitch, University of Montreal, Montreal, Quebec, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The focus of this article is on machine-aided translation (or MAT), with heavy stress on the word aided, and we shall begin by distinguishing MAT from machine translation (or MT) pure and simple. Both, of course, seek to automate the translation process through the use of computers, and in both humans generally have an important role to play. In MT, however, the initiative in the translation process is given over to the machine, and the aim is to automate this process completely, eliminating the human’s contribution as far as possible. In MAT, on the other hand, the initiative in the translation process remains with the human translator, and the role of the machine is to assist the translator in performing certain tasks – normally, those that can be automated with a good degree of confidence and reliability. The fact that MT systems often do not succeed in automatically producing a high-quality translation – where high quality is indeed a requirement – and that a human must subsequently intervene to post-edit or otherwise improve the machine’s raw output should not cause us to lose sight of the fundamental distinction between MT and MAT. Whereas MT ultimately seeks to dispense with the human translator, MAT proceeds from the double recognition that, for high-quality translation at least, the contribution of a human translator is generally indispensable, and furthermore, that this situation is not likely to change for the foreseeable future.

Why is this? Quite simply because high-quality translation routinely requires a level of understanding that extends well beyond the literal wording of a source text to encompass unpredictable amounts of real-world knowledge, as well as the capacity to reason over that knowledge. Despite the undeniable progress recently achieved by the new empirical methods in machine translation, such knowledge and reasoning capabilities remain, by and large, beyond the ken of today’s computers. Hence, where high quality is a sine qua non (and not just information scanning,

Macklovitch E & Russell G (2002). ‘What’s been forgotten in translation memory.’ In White J (ed.) Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas. Berlin: Springer, LNAI 1934. 137–146.
Melby A (1982). ‘Multi-level translation aids in a distributed system.’ In Horecky J (ed.) Proceedings of the Ninth International Conference on Computational Linguistics (COLING 82). Amsterdam: North-Holland. 215–220.
Mihalcea R & Pedersen T (eds.) (2003). ‘Building and using parallel texts: data driven machine translation and beyond.’ HLT-NAACL 2003 Workshop. Edmonton, Alberta, Canada.
Simard M, Foster G & Perrault F (1993). ‘TransSearch: a bilingual concordance tool.’ Technical report, Centre for Information Technology Innovation. Laval, Canada. Available online at: http://rali.iro.umontreal.ca.
Véronis J (ed.) (2000). Parallel text processing. Dordrecht: Kluwer Academic Publishers.

Machine Translation: Overview
P Isabelle and G Foster, National Research Council of Canada, Gatineau, Quebec, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The term machine translation (MT) is used to refer to any process in which a machine performs a translation operation between two ordinary human languages: the source language (SL) and the target language (TL). This is the sense that appears in the following proposition:

Machine translation is cheaper than human translation but less reliable.

The term can also designate the product of such a process, as in:

Reading machine translations is not very pleasant but it is a convenient way to get the gist of foreign-language Web pages.

Finally, machine translation can also refer to the study of methods and techniques that render machines capable of producing (better) translations, as in the title of the present article. The term applies equally well to written or spoken language, but there is a tendency to prefer the terms speech translation or speech-to-speech translation for referring specifically to machine translation of spoken language.

Translation is a very effective way of helping people communicate across linguistic barriers. Unfortunately, human translation is expensive enough that it cannot constitute a practical solution to the everyday needs of ordinary people. As the price of human translation is unlikely to fall substantially, machine translation constitutes our best hope of making translation affordable for all.

The idea of using machines to translate is very old, but it was only around 1950, with the advent of digital computers, that serious work could really start. The initial enthusiasm led many to believe that good-quality machine translation was just around the corner, but they were wrong. After some 50 years of research, we can affirm that, barring an unexpected breakthrough, machines will not be able to compete with human translators in the foreseeable future. This prediction applies not only to difficult material such as literary works, but also to all but the simplest and most repetitive texts (e.g., weather reports). The present article explains why this is so. We start from a simplistic conception of machine translation and show where it breaks down. Then we show how computational linguists tried to fix the problems through increasingly elaborate approaches. But these more elaborate approaches turn out to raise their own additional problems.

Why Machine Translation Is a Difficult Problem

Let’s assume a very naïve theory: translating between human languages is just a matter of looking up the words of the source text in a bilingual dictionary. The next 12 subsections examine the many ways in which that simplistic theory breaks down and show why translation requires: (a) a fine-grained understanding of the source text; (b) contextual knowledge that makes it possible to fill information gaps between the SL and the TL; and (c) a detailed knowledge of the grammar of the TL.

Segmenting Texts into Words

Before a dictionary search can be performed, the text needs to be segmented into a sequence of individual words, but ambiguities arise about where exactly word boundaries lie. This problem is most acute in languages that do not place any whitespace between words, like Chinese. We get a good feel for the problem if we remove the spaces from an English text. The string in (1) then gives rise to several alternative segmentations:

(1) Themanywaystowelcome . . .
→ The many ways to welcome . . .
→ The many ways towel come . . .
→ Them any ways to welcome . . .
→ Them any ways towel come . . .
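The ambiguity in (1) can be reproduced with a few lines of code. The following is only a minimal sketch: the function name `segmentations` and the tiny `LEXICON` are assumptions made for illustration, and a real system would use a full dictionary and some way of ranking the alternatives.

```python
def segmentations(text, lexicon):
    """Return every way of splitting `text` into words found in `lexicon`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words at all
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in lexicon:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([prefix] + rest)
    return results


# Hypothetical toy lexicon, just large enough for example (1).
LEXICON = {"the", "them", "many", "any", "ways", "to", "towel", "come", "welcome"}

for words in segmentations("themanywaystowelcome", LEXICON):
    print(" ".join(words))
```

With this toy lexicon the procedure finds the four readings listed in (1). Dictionary lookup alone gives no reason to prefer one of them, which is why a further selection mechanism is needed.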

Segmentation ambiguity is less severe in Western languages, but it still exists. For example, the English string ‘in.’ can be segmented either as a single word (the abbreviation of the noun inch) or as two words (the preposition in followed by a full stop). Since different segmentations lead to different translations (many of them gibberish), the dictionary lookup procedure needs to be supplemented with a mechanism (maybe based on grammar and semantics) that can select the one segmentation intended by the author of the text.

Dictionary Words versus Word Forms: Morphology

Ordinary dictionaries tend to abstract away from the morphological processes of inflection (see → sees, seeing, saw, seen), derivation (convention → conventional, conventionalize, etc.), and composition (twenty-story, engine-driven) (see Morphology: Overview). As a result, many of the word forms that we observe in a text need to be subjected to morphological processing before they can be looked up in the dictionary. At least in the case of morphologically rich languages, MT systems need to incorporate morphological processing. But such processing often gives rise to morphological ambiguity. For example, the English form saw can be interpreted both as a singular noun and as the past form of a verb. The word form repair can belong to the basic verb repair with the meaning ‘to fix,’ but it can also be taken as a derivation on the basic verb pair with the meaning ‘to pair again.’ And the German compound Staubecken can be segmented either as the sequence Stau + becken, in which case it means something like ‘reservoir,’ or as the sequence Staub + ecken, in which case it would mean something like ‘dusty corners.’ These ambiguity problems are similar to those described in the subsection ‘Resolving Lexical Ambiguity in the Source Language’ below.

Detecting Idioms

Some dictionary words are realized as several text words. For example, in the sentence John gave up the game, the string gave up needs to be considered as encoding a single idiomatic lexical unit because its meaning and translation (in French, the simple word abandonner) cannot be predicted from the components give and up (see Lexicography: Overview; Idioms). Moreover, the elements of some idiomatic expressions can be separated by other words, as in the following example:

(2) Some are retaining their old customs but others gave them up a long time ago.

But the same two words can also appear together without forming an idiomatic expression. Sentence (3), for example, does not contain any instance of the idiom give up; rather, it is the sequence up to that receives an idiomatic interpretation (in French, jusqu’à):

(3) James would give (Max) up to 100 dollars for a good copy of that book.
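A rough sketch of why examples (2) and (3) are hard for a purely pattern-based matcher is given below. The function `idiom_candidates`, the `GIVE_FORMS` set, and the `max_gap` parameter are all hypothetical names introduced for illustration; the code only proposes candidates and flags a competing reading, it does not decide between them.

```python
# Inflected forms of "give"; a real system would use morphological analysis.
GIVE_FORMS = {"give", "gives", "gave", "given", "giving"}


def idiom_candidates(tokens, max_gap=3):
    """Find possible instances of the discontinuous idiom 'give ... up'.

    Returns (verb_index, particle_index, competing) triples, where
    `competing` is True when 'up' is directly followed by 'to', so that
    the rival idiom 'up to' is also a possible reading.
    """
    found = []
    for i, tok in enumerate(tokens):
        if tok.lower() not in GIVE_FORMS:
            continue
        # Allow at most `max_gap` intervening words between verb and particle.
        for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
            if tokens[j].lower() == "up":
                competing = j + 1 < len(tokens) and tokens[j + 1].lower() == "to"
                found.append((i, j, competing))
                break
    return found


s2 = "Some are retaining their old customs but others gave them up a long time ago".split()
s3 = "James would give Max up to 100 dollars for a good copy of that book".split()
print(idiom_candidates(s2))  # candidate with no competing reading
print(idiom_candidates(s3))  # candidate flagged as competing with 'up to'
```

The matcher fires on both sentences. Only in (3) does it signal a competing analysis, and choosing the right one there still requires grammatical and semantic information that this sketch does not have.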

Thus, an MT system needs to recognize idiomatic expressions, but here too the recognition process faces ambiguity problems.

Identifying Collocations

The meaning and appropriate translation of a given word is often closely dependent upon the words it co-occurs with, that is, on its collocations (see Collocations). For example, the adverb in dead serious and the adjective in heavy smoker are used to express intensification rather than to express the basic meanings of these words. Since such collocations are strongly language-dependent, MT systems need to account for them: the correct French translations are not mort sérieux and fumeur lourd, but rather sérieux comme un pape and gros fumeur.

Dealing with Unknown Words

Even when we disregard the morphological processes noted in the section ‘Dictionary Words versus Word Forms: Morphology’, the vocabulary of natural languages is not a closed list. We often meet new proper names in the newspaper, new common nouns in a scientific magazine, etc. A human translator is usually able to make good guesses about the meaning of a novel word, thanks to her understanding of the context. Using various resources, she will most often be able to find a definition and an appropriate translation. In the worst case, she may have to leave the word itself untranslated but the translation of the rest of the sentence is unlikely to be much affected. But for an MT system, unknown words are a tough challenge. They tend to jeopardize the system’s ability to understand the structure of the surrounding sentence or phrase.

Resolving Lexical Ambiguity in the Source Language

Even assuming a fixed segmentation and a fixed morphological interpretation, text words often match several different dictionary entries. One reason for this multiplicity is part-of-speech ambiguity. The English word light, for instance, can be used as a noun, an adjective, or a verb, giving rise to a different translation in each case (in French, lumière, léger, or allumer). Thus an MT system needs to decide on the part of speech associated with each particular instance of a word. But even when their parts of speech are known, words still exhibit word-sense ambiguity. Consider for example the verb replace; it can mean ‘change for a different one’ (in French, remplacer) or ‘put back into place’ (in French, remettre en place). In this case, the grammatical context can help determine what meaning is intended: when the verb appears with the subcategorization frame (that is, pattern of complements) X replaces Y with Z, it tends to have the first meaning:

(4) They replaced their old Subaru with a new Volvo.

However, explicit grammatical cues are not always available. Take the word pen for example, which can either designate a writing instrument or some kind of enclosure:

(5) Max took his nicest pen and wrote a poem for his mother. → Max sortit sa plus belle plume et écrivit un poème à sa mère.
(6) Jim placed his little brother in his pen and turned on the TV. → Jim plaça son petit frère dans son parc et alluma la télé.

Nouns like pen do not induce any specific subcategorization frames on their environment. So, how can an MT system decide what translation is appropriate in each case? That ambiguity of the word pen was used by Y. Bar-Hillel (1960) to argue against the feasibility of fully automatic high-quality machine translation. Linguistic knowledge (lexical, morphological, grammatical, etc.) is clearly not sufficient for choosing the right translations in examples (5) and (6). In such cases, human translators make use of their knowledge of what situations are likely in the real world: people are unlikely to place other people in a writing instrument or to use an enclosure for the purpose of writing. Since Bar-Hillel couldn’t see any way to endow machines with such knowledge, he concluded that high-quality MT was impossible. As we will see, other opinions emerged later on.

Resolving Coreference Ambiguity in the Source Language

A different kind of ambiguity has to do with words like pronouns, whose interpretation is strongly dependent on that of a coreferent phrase. Consider the following example: (7) Squirrels like peanuts. They are good for them.

The second sentence is ambiguous as to which pronoun (they, them) refers to which noun (squirrels, peanuts). Each interpretation leads to a different translation:

(8a) Les écureuils aiment les cacahuètes. Elles sont bonnes pour eux.
(8b) Les écureuils aiment les cacahuètes. Ils sont bons pour elles.
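The way the French forms in (8a) and (8b) follow mechanically from the choice of antecedents can be sketched as follows. This is an illustrative toy, not a description of any actual system: the function `translate_second_sentence` and the small lookup tables are assumptions that cover only this one sentence pattern with plural nouns.

```python
# Grammatical gender of the two French nouns (both are plural here).
GENDER = {"écureuils": "m", "cacahuètes": "f"}

SUBJECT = {"m": "Ils", "f": "Elles"}      # subject pronoun, plural
DISJUNCTIVE = {"m": "eux", "f": "elles"}  # pronoun after a preposition, plural
GOOD = {"m": "bons", "f": "bonnes"}       # adjective agreeing with the subject


def translate_second_sentence(they, them):
    """Translate 'They are good for them.' given the French antecedents
    chosen for 'they' and 'them'."""
    g_subj = GENDER[they]
    g_obj = GENDER[them]
    return f"{SUBJECT[g_subj]} sont {GOOD[g_subj]} pour {DISJUNCTIVE[g_obj]}."


# Reading (8a): peanuts are good for squirrels.
print(translate_second_sentence(they="cacahuètes", them="écureuils"))
# Reading (8b): squirrels are good for peanuts.
print(translate_second_sentence(they="écureuils", them="cacahuètes"))
```

Once the antecedents are fixed, the rest is routine agreement. The hard part, which the sketch leaves entirely to the caller, is deciding which antecedent each pronoun has.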

The first translation conveys the more likely interpretation: peanuts are good for squirrels. Deciding on the right interpretation and translation again hinges upon knowledge of what situations are likely in the real world: people are more likely to claim that peanuts are good for squirrels than to claim that squirrels are good for peanuts.

Structural Divergences

It is often impossible to translate a word without making adjustments to the surrounding grammatical structure, because of various types of translation divergences (Dorr, 1990). Consider the following examples in French/English translation:

(9) Dès son lever, Max mange un croissant. As soon as he gets up, Max eats a croissant.
(10) Anne faillit s’étrangler. Anne almost choked.
(11) Judy swam across the river. Judy traversa la rivière à la nage.

The French noun phrase son lever needs to be translated as the clause he gets up because of a lexical gap in English: there is no noun equivalent to lever. The French verb faillir is translated by the English adverb almost, with the result that the main verb of the translation (choked) does not correspond to the main verb of the source sentence. Finally, the English manner-of-movement verb swim gets translated as a plain movement verb (traverser) plus a manner adverbial (à la nage). Structural divergences are very common even between languages that are as closely related as French and English. As a result, translation equivalences often need to be stated at the level of syntactic patterns rather than at the level of single words.

Word Insertion or Deletion

Contrary to what a naïve dictionary-based model would appear to predict, some words of the SL text can be left untranslated and some words can appear in the TL text that have no counterpart in the source. The difference can be in ‘empty’ function words whose presence is required by the grammar of the SL/TL, as for example the t-il particle that shows up in the French interrogative:

(12) Has the president seen the results?
(13) Le président a-t-il vu les résultats?

But in other cases, the word to be inserted does carry some meaning of its own. Consider for example the translation of English nominal compounds in a language which, like French, is not so fond of compounding:

(14) oil tank → réservoir à huile
(15) steel tank → réservoir en acier

The choice between the French prepositions à and en turns out to depend on the implicit semantic relation between the two nouns. The content-to-container relation is usually marked by à, while the material-to-artifact relation is rather marked by en. Consequently, an MT system needs to figure out what semantic relation binds the terms of a nominal compound. Another example is the difficult problem of inserting articles in Chinese-to-English translation (there are no articles in Chinese).

Language-dependent Explicitation Constraints

Languages differ a lot on what they force their speakers to make explicit. Consider for example the difference between languages that have a grammatical gender system (like French) and those that do not (like English). In translating sentence (16) into French, the need arises to choose between two possibilities:

(16) Jack has invited one of his students for dinner.
→ Jack a invité un de ses étudiants à dîner.
→ Jack a invité une de ses étudiantes à dîner.

The first translation is for a male student, the second for a female student. French makes it impossible (or at least awkward) not to disclose the gender of the student. The relevant information is just not available anywhere in sentence (16). It may be available from the rest of the text, but extracting it will generally require a capability to reason about the situations described there. Grammatically marked politeness forms provide another example of explicitation constraints. For example, when a human translator translates an English dialogue into French, person A’s you to person B will be translated as tu or vous depending on the translator’s assessment of the social and personal relationship between A and B. A more extreme example will be found in the elaborate system of honorifics of the Japanese language. Differences in lexical structures can also force the translator into making extra information explicit. Consider for example the problem of translating the following sentence into French:

(17) He decided to cut Milou’s hair.

French has no word that means exactly the same as hair. Rather, it forces users to choose between poils (for animal hair and for human body hair) and cheveux (for human head hair). Again, the relevant information is not available locally and a human translator will resort to her global understanding of the situations described in the text. For example, the text might make it clear that Milou is a person but say nothing more about what kind of hair is to be cut. In that case, the translator might well choose cheveux on the basis of her knowledge that it is more customary for people to have their head hair than their body hair cut by other people.

Target-language Grammar

The choice of a correct TL equivalent often depends on the grammatical context of the TL. An obvious example is the fact that the translation of the English determiner the as la, le, les, etc., depends on the French noun modified by that determiner. We saw a similar case in discussing the possible translations for utterance (7): in the more likely interpretation of that utterance, the pronoun they gets translated as elles because its antecedent in the French text is the feminine plural noun cacahuètes. Another example of the importance of the grammatical context in the TL is the interdependence between the choice of the translations for a verb and for the elements that appear within its syntactic frame. For example, the English verb know translates into French as connaître or savoir depending on whether the direct object in French is a noun phrase or a clause. There are cases where both translations are possible:

(18) John knows that Mary is angry.
→ John sait que Mary est en colère.
→ John connaît la colère de Mary.

Word Order

Another obvious failure of a dictionary-only approach is that it cannot account for the necessity to place words in a different order when going to a different language. In general, ordering rules apply not to single words but rather to whole phrases based on their grammatical function (see Word Order and Linearization). For example, in German subordinate clauses, the verb needs to be placed after its object:

An MT system clearly needs to understand the grammatical structure of the SL and TL sentences in order to produce the correct word order.

Rule-Based Approaches: Towards More Abstraction

Rule-based approaches to MT can be characterized according to the level of abstraction at which the passage from the SL to the TL is attempted. The space of possibilities is often schematized as the ‘Vauquois triangle,’ illustrated in Figure 1 (Vauquois, 1968). At the bottom of the triangle stands the direct approach in which the TL text is produced by applying translation rules directly to the SL text. In the transfer approach, translation rules are applied to abstract language-dependent intermediate representations (IR) that are connected to the source/target texts through analysis/generation rules. In the interlingua approach, the analysis and generation processes connect the source and target text with a shared language-independent meaning representation.

Direct Systems and Morphology-based Translation

The pure dictionary lookup approach discussed above amounts to an extreme form of the direct architecture. Even the earliest practitioners of MT realized that something more was needed. Rule-based morphological analysis was soon introduced in order to deal with inflections in morphologically rich languages. This analysis step remains a standard feature of modern MT systems, often implemented through the use of finite-state networks. It was also recognized from the start that additional rule systems are required: (a) for selecting among the various possible translations of each word; and (b) for reordering the words of the TL. In early systems such as the Georgetown-IBM prototype that gave rise in 1954 to the first public MT demo (Garvin, 1967), the rules could only operate on a strictly local context (e.g., a three-word window). But later systems were designed to handle a more general context. In so doing, they adopted at least in part the idea of a separate analysis step, thereby blurring the distinction between direct and transfer systems (this can be traced at least as far back as Zarechnak, 1959). Over time, the direct approach was refined to the point where it could give rise to commercial systems such as SYSTRAN and SPANAM, which remain to this day among the best existing MT systems. The better direct systems manage to tackle reasonably well many problems of text segmentation, morphology, idioms, lexical ambiguity, collocations, word insertion/deletion, and word order. But since they still rely on shallow and local grammatical analyses, they are unable to guarantee the grammatical coherence of their output. Moreover, in the absence of suitable semantic and pragmatic models, their treatment of ambiguities is based on various heuristics (that is, rules of thumb), including a rather massive use of phrases in the dictionary. When the heuristics break down, the target text often becomes semantically and/or pragmatically incoherent.

Transfer-based Systems

The transfer architecture was first proposed by Yngve (1957) and is illustrated in Figure 2. There are three important ideas in this architecture:

1. SL/TL equivalences are stated at the level of abstract representations of the texts;
2. The overall process is separated into three distinct steps: (a) analysis, which builds an abstract representation of the SL text; (b) transfer, which maps that SL representation onto an equivalent one for the TL; and (c) generation, which produces the TL sentence from the TL representation; and
3. Linguistic data (dictionaries, grammars) is separated from the algorithms (programs) that manipulate them. Rather than directly writing an ad hoc computer program for sentence analysis, one writes abstract (language-specific) grammar rules in a linguistically perspicuous metalanguage such as augmented context-free grammars, then a general-purpose (language-independent) parsing algorithm is used for applying the grammar rules to the input text. This modular architecture facilitates the reuse of the same algorithms and the same linguistic data in a variety of different tasks.

Figure 1 Vauquois triangle.

The nature of intermediate representations is variable, but most systems use something at least as abstract as sentence-level Chomskyan syntactic tree structures. An SL sentence such as The hungry lion saw the lamb might receive a representation like that of Figure 3. Rules of lexical transfer and rules of structural transfer are then applied to this SL tree representation to derive an equivalent one for the TL. The statement of correspondences is facilitated by the fact that many of the obstacles described in the section ‘Why Machine Translation Is a Difficult Problem’ have been removed: word segmentation is explicit, words are morphologically analyzed and their parts of speech are made explicit, the phrasal structure is shown, etc. Lexical transfer rules can refer to various features of the context. For example, the rules of Figure 4 translate the word light with reference to its part-of-speech assignment and the verb replace with reference to its subcategorization pattern (see discussion about sentence (4) above). It is also possible to handle many idiomatic expressions in a similar way: a subtree that matches the SL expression is mapped en bloc onto a semantically equivalent subtree of the TL. Structural transfer rules are used for handling grammatical differences such as word order.
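The kind of context-sensitive lexical transfer described for light and replace can be sketched in code. The sketch below is not a rendering of Figure 4: the `Node` class, the feature names, and the `"with-PP"` label are hypothetical, and treating every use of replace without a with-complement as ‘put back into place’ is a deliberate oversimplification of what the text describes as a tendency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A hypothetical, heavily simplified node of an SL analysis tree."""
    lemma: str
    pos: str                  # "noun", "adj", or "verb"
    complements: tuple = ()   # labels of the complements found by the parser


def transfer_lexical(node):
    """Pick a French lemma using features made explicit by SL analysis."""
    if node.lemma == "light":
        # The choice depends on the part of speech only.
        return {"noun": "lumière", "adj": "léger", "verb": "allumer"}[node.pos]
    if node.lemma == "replace" and node.pos == "verb":
        # Frame 'X replaces Y with Z' suggests 'change for a different one'.
        if "with-PP" in node.complements:
            return "remplacer"
        return "remettre en place"
    raise KeyError(f"no transfer rule for {node.lemma!r}")


print(transfer_lexical(Node("light", "adj")))
print(transfer_lexical(Node("replace", "verb", ("object", "with-PP"))))
```

The point of the design is that the rules never look at raw text: they consult only features (part of speech, complement pattern) that the analysis step has already computed.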

Figure 2 The transfer architecture.

The subsection ‘Target-language Grammar’ above noted that the statement of translation equivalences was often complicated by issues of TL grammar such as morpho-syntactic agreement. But the use of abstract intermediate representations alleviates this problem too by making it possible to state correspondences only at the level of canonical forms (e.g., THE → LE). The connection between the canonical form and its appropriate surface realization (e.g., le, la, les, l’) is handled by the language-specific grammar rules of the analysis and generation components. As suggested by the Vauquois triangle, intermediate representations can be made more abstract. For example, they can be akin to the deep structures of Chomsky’s transformational grammar and its descendants (HPSG, LFG, etc.; see Head-Driven Phrase Structure Grammar; Lexical Functional Grammar). With such an approach the two sentences of (19) will have similar deep representations in which their old Subaru is the deep object, with the result that the translation rule of Figure 4 will cover the case of the passive sentence as well:

(19) They replaced their old Subaru with a new Volvo.
Their old Subaru was replaced with a new Volvo.

Going further in the direction of a semantic transfer, intermediate structures can also feature a normalized lexicon. For example, synonymous words can be mapped onto a unique representative of the class. Mel’čuk and Zolkovsky’s (1970) lexical functions are sometimes used as a general framework for lexical normalization. This framework provides a treatment for some language-specific collocations (cf. the subsection ‘Identifying Collocations’ above). For example, the notion of ‘intensification’ can be represented by an abstract semantic element Magn common to the intermediate representations of both languages (the creation and use of such elements constitutes a small step in the direction of the language-neutral interlinguas discussed below):


Figure 3 Syntax-based intermediate representation.

(20) Source text: heavy smoker → English IR: Magn(smoker) → French IR: Magn(fumeur) → French text: gros fumeur
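The chain in (20) can be mimicked with three small functions, one per step of the transfer architecture. This is a toy sketch under stated assumptions: the tables `MAGN_EN`, `MAGN_FR`, and `EN_TO_FR` and the function names are invented for illustration, and the intermediate representation is reduced to a (function, base word) pair.

```python
# How each language realizes Magn (intensification) for a given base word.
MAGN_EN = {"smoker": "heavy {}", "serious": "dead {}"}
MAGN_FR = {"fumeur": "gros {}", "sérieux": "{} comme un pape"}

# Plain lexical transfer between base words.
EN_TO_FR = {"smoker": "fumeur", "serious": "sérieux"}


def analyse(text):
    """English text -> English IR, e.g. 'heavy smoker' -> ('Magn', 'smoker')."""
    for base, template in MAGN_EN.items():
        if text == template.format(base):
            return ("Magn", base)
    return (None, text)


def transfer(ir):
    """English IR -> French IR: only the base word changes, Magn is shared."""
    func, base = ir
    return (func, EN_TO_FR[base])


def generate(ir):
    """French IR -> French text, choosing the French-specific intensifier."""
    func, base = ir
    if func == "Magn":
        return MAGN_FR[base].format(base)
    return base


def translate(text):
    return generate(transfer(analyse(text)))


print(translate("heavy smoker"))
print(translate("dead serious"))
```

Because the intensifier is abstracted into Magn before transfer, the transfer step never has to state that heavy corresponds to gros in one context and that dead corresponds to comme un pape in another.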

Many transfer-based systems have been developed over the years. Some classical examples are:

• ARIANE (Vauquois and Boitet, 1985)
• TAUM-AVIATION (Isabelle and Bourbeau, 1985)
• EUROTRA (Allegranza et al., 1991)
• METAL (Slocum, 1987)
• LMC (McCord, 1989)

METAL and LMC have given rise to commercial systems: Lantworks for the first one and Linguatec and Websphere Translation Server for the second one. Transfer-based systems are in principle capable of attacking many problems that are out of the reach of direct systems. However, their much more ambitious analysis and generation processes increase the risk of processing errors. First, syntactic parsing tends to get into trouble in the face of highly complex sentences, ill-formed ones, and unknown words. A failed parsing process will then result in missing or unpredictable output. Second, syntactic ambiguity tends to lead to a proliferation of alternative analyses. The following sentence pairs illustrate ambiguities with respect to adjective attachment (either to the next noun or to the rest of the noun phrase) and prepositional phrase attachment (either to the noun phrase or to the verb phrase).

(21a) He bought liquid oxygen tanks.
(21b) He bought rusted oxygen tanks.
(22a) He killed the girl with the red dress.
(22b) He killed the girl with the revolver.

Transfer-based systems are better at uncovering such ambiguities (based on syntactic processing) than at resolving them (which requires semantic and knowledge-based processing). In other words, they tend to ask more semantic questions than they have answers for. This inability may be why, in spite of their theoretical advantages, they have not thus far managed to consistently outperform advanced direct systems like SYSTRAN.

Figure 4 Lexical transfer rules.

Interlingua and Knowledge-based Systems

An extensive survey of interlingual methods will be found in a companion article (see Machine Translation: Interlingual Methods). This article provides just an overview of some of the basic issues.

Language-independent Representations

In MT, the term ‘interlingua’ is used to designate any artificial language intended to serve as a language-independent intermediate representation between SL and TL texts. An interlingua normalizes the way meanings are expressed not only within a given language (as in deep transfer systems) but also across all languages. In the interlingual architecture, the analysis and generation components perform mappings between the source and target texts and their shared interlingua representation. Since no transfer is required, an interlingual system translates in all directions between n languages using only 2n components (n analyzers plus n generators). By contrast, the standard transfer architecture requires n analyzers and n generators, plus no fewer than n × (n − 1) pair-specific transfer components. For the 20 current official languages of the European Union, such a system would require 380 different transfer components! While this example is often invoked as a strong argument in favor of the interlingua architecture, it should be observed that in the transfer approach most pair-specific components can be eliminated by channeling all translations through one particular language-specific IR that is used as an arbitrary ‘pivot.’ This results in a double-transfer approach that we illustrate in Figure 5, for the example of French-to-German translation through an English-oriented pivot representation. This architecture only requires n analyzers, n − 1 X-to-pivot transfer modules, n − 1 pivot-to-Y transfer modules, and n generators.
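The component counts for the three architectures are easy to tabulate. The short sketch below simply evaluates the formulas stated in the text; the function name `component_counts` and the dictionary keys are assumptions made for illustration.

```python
def component_counts(n):
    """Number of modules needed to translate in all directions among n languages."""
    analyzers_and_generators = 2 * n
    return {
        # Interlingua: n analyzers + n generators, no transfer modules.
        "interlingua": analyzers_and_generators,
        # Standard transfer: one transfer module per ordered language pair.
        "transfer": analyzers_and_generators + n * (n - 1),
        # Double transfer: (n - 1) X-to-pivot plus (n - 1) pivot-to-Y modules.
        "double_transfer": analyzers_and_generators + 2 * (n - 1),
    }


counts = component_counts(20)
print(counts)
print("pair-specific transfer modules:", 20 * 19)
```

For n = 20 this gives 40 components for the interlingua design, 420 for standard transfer (of which 380 are transfer modules), and 78 for the double-transfer design, so the pivot removes most of the quadratic growth.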

Figure 5 Double-transfer approach.

Moreover, it can be argued that these remaining 2(n − 1) transfer modules have their hidden counterparts in any interlingua system. Each interlingua-oriented analyzer needs to perform both: (a) a source-language specific analysis such as syntactic parsing; and (b) a mapping from the SL vocabulary and structures to those of the interlingua. And each interlingua-oriented generation component also has such a double structure. Consequently, it can be argued (perhaps controversially) that the double-transfer and interlingua architectures are not really different. Whether or not we use the term ‘transfer’ to designate a map from language-specific structures onto an artificial interlingua is mostly a terminological issue. One potential advantage of a language-independent interlingua has to do with language-specific explicitation constraints (see the subsection ‘Language-dependent Explicitation Constraints’ above). The language-specific representations of the double-transfer approach will lack various pieces of information that are needed for generating texts in other languages. For example, an English-oriented representation will lack information about the gender of people referred to by common nouns (see example (16)). Interlinguas, by contrast, are meant to make explicit any piece of semantic information that is required in at least one of the TLs. However, summing over all language-specific requirements has at least two important drawbacks: (a) since languages often classify the same semantic objects along different axes, the consequent need for cross-classification may lead to severe inflation in the size of the required interlingua vocabulary; and (b) if an interlingua system comes to be used just for translating between a single pair of languages, a lot of its effort will be wasted on eliciting irrelevant information.

The transfer approach is not susceptible to the cross-classification problem; and it can be organized so as to perform ‘lazy explicitation’ by linking specific explicitation operations with specific transfer modules. For example, the search for indicators of the social relationships that determine the choice between tu and vous could be triggered by the English-to-French transfer.

Another attribute that is sometimes conferred upon interlinguas is that of ‘language neutrality,’ but it is not entirely clear what that means. For example, given structural divergences (cf. the subsection ‘Structural Divergences’ above) such as the following one between English and French:

(23) John ran across Paris. → John traversa Paris à la course.

Should the interlingua opt for a third (neutral) way of expressing the same content? What would be gained by doing that? In practice, language neutrality is seldom really maintained, because interlingua designers tend to make extensive use of the vocabulary and structures of one specific language as a starting point. Since the starting language is most often English, several interlinguas have been criticized for being merely ‘English with capital letters,’ which makes the similarity with the double-transfer approach discussed above even more obvious.

The Knowledge-based Approach

Transfer-based approaches have historically relied mostly on linguistic descriptions of the morphological, syntactic, and lexical structures of pairs of natural languages. The interlingua architecture, on the other hand, has often been coupled with a more knowledge-based approach. This approach emphasizes the representation and use of general knowledge as practiced in artificial intelligence (AI). The previous section ‘Why Machine Translation Is a Difficult Problem’ discussed several examples where a translation decision involves the use of extra-linguistic knowledge (e.g., disambiguating the meaning of the word pen in examples (5) and (6) or determining the reference of the pronouns in example (7)). In the 1970s and 1980s, researchers in the field of artificial intelligence set out to develop formalisms that would make it possible: (a) to encode the meaning of the text in a language-independent way; (b) to encode large amounts of extra-linguistic knowledge; and (c) to reason about that knowledge in order to bring out pieces of information needed for resolving particular translation problems. Many different representation formalisms were proposed based on semantic networks, on frames (Minsky, 1975), on various logics (Nilsson, 1991), etc.
Schank (1975) introduced a language called Conceptual Dependency in which natural language verbs were analyzed in terms of a small number of conceptual primitives that would help capture some kinds of inferences. Schank and Abelson (1977) introduced an interesting abstraction called a script, which is meant to capture institutionalized behavior patterns, such as the stereotyped sequence of events that is expected when eating in a restaurant: you walk in, a hostess takes you to a table where you sit, the waiter brings a menu, takes your order, etc. Now, consider the problem of translating the following sentence:

(24) He brought the check.

In isolation, the word check is ambiguous between: (a) a written order instructing a bank to pay money (in French, chèque); and (b) a slip indicating what amount is due (in French, l’addition). But when sentence (24) is part of a story about people eating in a restaurant and he is taken to refer to the waiter, meaning (b) will automatically be selected because the restaurant script creates an expectation that a waiter will be bringing a payment-requesting slip, but no expectation that he will be paying the clients. While such knowledge-based solutions appear intuitively plausible and appealing, they have proven impossible thus far to implement on a real-life scale. Common-sense reasoning is still very poorly understood and the amount of world knowledge required to support general-purpose intelligent reasoning is staggering. As a result, the hard-core knowledge-based approach has only produced toy MT systems. More recent work on knowledge-based MT has tended to de-emphasize the general-purpose reasoning capabilities and to focus more on the organization of the natural language concepts into an interlinguistic ontology (Nirenburg and Raskin, 2004). Examples of such systems include KBMT-89 (Nirenburg, 1989) and PANGLOSS (Nirenburg, 1994). The KANT system (Nyberg and Mitamura, 1996) is one of the rare knowledge-based systems that has been scaled up and put into real-life use. However, KANT is quite restricted because: (a) it is designed to work with controlled input; and (b) it translates from a single source language only (English), which appears to mitigate the significance of using an interlingua.

It appears safe to conclude that in rule-based MT, the trend towards more-abstract representations has not been very successful from the applications point of view. The processes needed to create such representations turn out to be difficult and error-prone enough to jeopardize the benefits of the approach.

Corpus-based Approaches

Overview

Corpus-based MT systems rely on translation rules that are learned automatically from examples of human translations. Very often these rules are statistical, for instance giving the probability that femme in French will translate into woman in English. This approach to automatic translation has become increasingly popular in recent years, especially among research systems. In a recent (2004) evaluation of MT performance conducted by the American National Institute of Standards and Technology (NIST), a large majority of the participating systems were corpus-based. The crucial resource for corpus-based MT is a parallel corpus: a collection of documents that exist in both the SL and the TL, on which the system can be trained and tested. Although much less abundant than monolingual corpora, such bilingual (and even multilingual) corpora are becoming available in increasing size for an increasing number of language pairs. Notable examples of parallel corpora are the Canadian Hansard (an English/French corpus of parliamentary proceedings) and the NIST evaluation corpus (Arabic/English and Chinese/English); both are available from the Linguistic Data Consortium. Parallel corpora typically come in the form of translated document pairs. A first step in virtually all learning approaches is sentence alignment, in which correspondences at the sentence level are made explicit. The output from this process is a sequence of segment pairs in which each half is a single sentence except in cases where no one-to-one sentence correspondence exists, as shown in Figure 6. Sentence alignment poses a chicken-and-egg problem: it is a necessary prerequisite to effective strategies for learning how to translate, but it requires some translation knowledge in order to be performed accurately. Fortunately, due to the coarse-grained nature of the problem, relatively crude methods such as comparing sentence lengths (Gale and Church, 1992) and matching similar-looking words across languages (Simard et al., 1992) yield good results.
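As a rough illustration of length-based alignment, the sketch below aligns two lists of sentences by dynamic programming over character lengths. It is not the algorithm of Gale and Church (1992), which uses a probabilistic cost; the absolute-length-difference cost, the fixed penalty for non-one-to-one segment pairs, the restriction to 1-1, 2-1 and 1-2 pairs, the toy sentences, and the function name align_by_length are all simplifying assumptions made here.

```python
def align_by_length(src, tgt, merge_penalty=3):
    """Align two sentence lists using only character lengths.

    Allowed segment pairs: 1-1, 2-1 and 1-2 (an assumed simplification).
    Cost of a pair = |len(source side) - len(target side)|, plus a small
    penalty when the pair is not one-to-one.
    """
    inf = float("inf")
    n, m = len(src), len(tgt)
    # cost[i][j] = best cost of aligning src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in ((1, 1), (2, 1), (1, 2)):
                if i + di > n or j + dj > m:
                    continue
                s_len = sum(len(s) for s in src[i:i + di])
                t_len = sum(len(t) for t in tgt[j:j + dj])
                penalty = 0 if (di, dj) == (1, 1) else merge_penalty
                c = cost[i][j] + abs(s_len - t_len) + penalty
                if c < cost[i + di][j + dj]:
                    cost[i + di][j + dj] = c
                    back[i + di][j + dj] = (i, j)
    if cost[n][m] == inf:
        raise ValueError("no alignment with the allowed pair types")
    # Follow the back-pointers from the end to recover the segment pairs
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    return pairs[::-1]


# Toy data (invented for this sketch)
english = ["Yes.", "I agree.", "The committee will meet again tomorrow morning."]
french = ["Oui, je suis d'accord.", "Le comité se réunira de nouveau demain matin."]
for en, fr in align_by_length(english, french):
    print(en, "<->", fr)
```

On this toy input the two short English sentences are grouped against the single French sentence, which is the kind of two-to-one pair the text describes.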
Alignment accuracies vary widely across different corpora, but are frequently above 95%. A straightforward way to translate using an aligned corpus would be to seek exact matches for source sentences, then output the corresponding translations. This exercise would guarantee a reasonable translation for any new sentence that was found verbatim in the corpus. However, for all but the shortest sentences, the odds of finding an exact match would be almost nil, even in an enormous parallel corpus (barring repeated texts, which are the basis for the translation memories described in the subsection ‘Example-based Machine Translation’). At the other end of the spectrum from a sentence-matching approach is a word-matching strategy in which individual source words are replaced by translations inferred


Figure 6 Example of non-one-to-one sentence alignments: in the top row, two English sentences translate to a single French sentence; in the middle row, one English sentence translates to two French sentences. The bottom row illustrates a regular one-to-one correspondence.

from the training corpus. This strategy would have all the problems described in the section ‘Why Machine Translation Is a Difficult Problem’ above – and more – but it would also have the benefit of producing at least some output for any source sentence. Real corpus-based systems are of course more sophisticated than this, but the sentence-to-word spectrum illustrates a fundamental tradeoff in their design: extracting more information from the training corpus (e.g., using sentence matching or some more refined technique) can lead to better performance, but carries the risk of placing too much emphasis on details that are specific to that corpus. This fundamental difficulty is in fact inherent in any attempt to learn from data, and it is known as the bias-variance tradeoff in the field of machine learning. It can be alleviated to some extent by using larger and more representative training materials, but even the largest available corpora encompass only a small fraction of the complete translation relation that must be learned. Compared to the methods described in the section ‘Rule-based Approaches: Towards More Abstraction’ above, corpus-based methods can be broadly characterized as more robust but less precise. This is because they tend to have simpler internal structure and because by definition they do well on frequent phenomena. In general, the result is that corpus-based systems exhibit less variation in performance across different source sentences. However, an important qualification to this often-cited robustness advantage is that it applies only within the domain of the training corpus. A system that has been trained on Canadian Hansard text will perform dismally on a corpus of car repair manuals, and vice versa. Such domain specificity is a double-edged sword: it is highly advantageous when training corpora are available for a domain of interest, often making it possible to develop an effective system with

less effort than it would take to adapt a rule-based system; but when training corpora are not available, it can be a major stumbling block. There are two main strains of corpus-based MT that are discussed in detail in the following sections. The first is example-based MT (EBMT), which uses the corpus as a source of direct translation examples. The second is statistical MT (SMT), which first derives a statistical model from the corpus, and then uses it to translate. Example-based Machine Translation

The central idea of EBMT is to translate new texts by reusing existing translations. The longer a text segment, the less ambiguous it is and the more likely it is that its translation is directly reusable. Commercially available translation memory systems implement a particularly simple form of EBMT as an aid for human translators. They memorize all source/target pairs of previously translated sentences and when a (quasi) identical sentence appears in a new source text, they automatically retrieve its previous translation. But accidental sentence repetition is not so common. Therefore, translation memories are mostly helpful for texts in which there are reasons to expect sentence repetition (e.g., a new version of a previously translated document). Since real EBMT systems try to provide translations for all sentences, their matching procedure needs to back off to progressively smaller segments (ultimately, the word or morpheme level) until a match is found. Meeting this requirement means that they must be able: (a) to align the translation examples at a subsentential level so as to extract the translation of the matched segments; and (b) to recombine the segments found in different sentences into a coherent translation. A simple example will help convey the flavor of EBMT and of some of the problems it faces. Suppose


our system is asked to translate sentence (25), and that its database of previous translations contains the three sentences of (26):
(25) Don’t leave before he sees you.
(26a) Don’t leave now. → Ne pars pas maintenant.
(26b) Eat before Max arrives. → Mangez avant que Max n’arrive.
(26c) He sees you every week. → Il te voit à chaque semaine.

If the system can align: (a) Don’t leave with Ne pars pas; (b) before with avant que; and (c) He sees you with Il te voit, then it can easily recombine these French segments into a mildly incorrect translation: (27) Ne pars pas avant que Il te voit.
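The recombination step in this example can be sketched as a greedy longest-match lookup in a table of aligned fragments. The fragment table below is written by hand from (26a)–(26c) and stands in for a real subsentential alignment procedure; the table, the greedy strategy, and the function name translate_by_fragments are assumptions made for illustration, not a description of any particular EBMT system.

```python
# Aligned fragments assumed to have been extracted from (26a)-(26c).
FRAGMENTS = {
    ("don't", "leave"): "Ne pars pas",
    ("before",): "avant que",
    ("he", "sees", "you"): "Il te voit",
}


def translate_by_fragments(sentence, fragments=FRAGMENTS):
    """Cover the input left to right with the longest known fragments."""
    words = sentence.rstrip(".").split()
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):          # try the longest span first
            key = tuple(w.lower() for w in words[i:j])
            if key in fragments:
                out.append(fragments[key])
                i = j
                break
        else:
            out.append(words[i])                    # unknown word: copy it through
            i += 1
    return " ".join(out) + "."


print(translate_by_fragments("Don't leave before he sees you."))
```

The output is exactly sentence (27): the fragments are concatenated as they were found, with no repair at the seams.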

The remaining problems are: (a) the word Il should not be capitalized; (b) que and il should be fused into qu’il; and (c) the conjunction avant que calls for the subjunctive form of the verb, namely voie. These problems result from ‘boundary friction’ that shows up when assembling together fragments extracted from different contexts. Thus, the EBMT method suggested in the above example needs to be considerably refined. There is considerable diversity among EBMT approaches, and the subsections below try to give an idea of the main trends. Abstraction level EBMT approaches replace handcrafted translation rules with correspondences that are automatically derived from a corpus of translation examples. The issue of the right abstraction level for such correspondences arises in the same way as in rule-based systems. Some EBMT systems, including the first EBMT proposal (Nagao, 1984) use a direct approach: the examples are stored as word strings and the source sentence is directly matched against their source side. But many other EBMT systems are more like transfer-based systems in that they perform the matching at the level of abstract syntactic structures. For example, Watanabe et al. (2003) store their examples as pairs of syntactic dependency trees whose nodes are explicitly aligned; sentences to be translated are then parsed and the corresponding dependency tree is matched against the source side of the examples. Intermediate approaches are sometimes used in which part-of-speech tagging and/or named-entity recognition is used as a substitute for full-fledged parsing. Example Matching Since most input sentences will not be found as such in the example database, a procedure must be defined for decomposing them into smaller elements for which matches can be

found. The nature of this procedure is closely related to the previous question of what representations are used. In the direct approach, one can try to match the various substrings of the input with those of the examples; in syntax-based approaches, one will rather match syntactic units. Since there may be several different ways to decompose an input sentence in terms of partial database matches, some metrics must be used to select the preferred way. One interesting feature of many EBMT approaches is the use of a criterion of lexical similarity. For example, suppose the database contains the following two examples:
(28) Max broke his glass. → Max a cassé son verre.
(29) John broke his promise. → John a rompu sa promesse.

Given the input sentence Steve broke his vow, a typical EBMT system would try to assess which one of glass and promise is closer to vow. Since promise is much closer, the system would choose (29) as the best translation model. Similarity is typically defined as the inverse of the distance in the word graph underlying a thesaurus (Nagao, 1984). Translation Retrieval Once a set of examples has been chosen that covers all the segments of the input sentence well enough, the translation of each matched segment must be extracted from the examples. In order to do that, it is necessary to align the source and target material in the matching and/or nonmatching parts of the examples. To take a simplistic example, suppose the database contains example (30) and we have input sentence (31).
(30) John likes to read novels. → John aime lire des romans.
(31) Mary likes to read poetry.

The unmatched portions of the example are John and novels. If we manage to align these unmatched parts with their respective TL counterparts John and des romans, we can then extract the correspondence X likes to read Y → X′ aime lire Y′. It is not necessarily required to explicitly align likes with aime and to read with lire. Various alignment methods are used by EBMT practitioners, ranging from dictionary-based heuristics to statistical methods comparable to those of SMT (see the subsection ‘Statistical Machine Translation’ below). Offline Processing The example database can be processed offline in order to facilitate example matching and translation retrieval.
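Extracting and reusing a correspondence of the form X likes to read Y → X′ aime lire Y′ can be sketched as follows. The sketch assumes that the unmatched source fragments and their target counterparts have already been aligned (they are supplied by hand here), uses plain string replacement, and writes the primed variables as X' and Y'. The function names and the French fillers for Mary and poetry are assumptions made for illustration.

```python
def extract_template(src, tgt, aligned_slots):
    """Replace aligned unmatched parts by variables on both sides.

    aligned_slots maps a source fragment to its target counterpart,
    e.g. {"John": "John", "novels": "des romans"} (assumed to be given).
    """
    names = iter("XYZ")                      # at most three slots in this sketch
    for s_frag, t_frag in aligned_slots.items():
        var = next(names)
        src = src.replace(s_frag, var, 1)
        tgt = tgt.replace(t_frag, var + "'", 1)
    return src, tgt


def apply_template(template, fillers):
    """Instantiate the target side; fillers maps variables to TL text."""
    _, tgt = template
    for var, text in fillers.items():
        tgt = tgt.replace(var + "'", text, 1)
    return tgt


template = extract_template(
    "John likes to read novels.",
    "John aime lire des romans.",
    {"John": "John", "novels": "des romans"},
)
print(template)
# Fillers for input (31); the French renderings are assumed, not from the text.
print(apply_template(template, {"X": "Mary", "Y": "de la poésie"}))
```

A real system would obtain the fillers by translating the unmatched input segments, and would match on tokens or syntactic units rather than raw substrings.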


When highly abstract representations are used, the examples are most often parsed and aligned offline for efficiency reasons. But in some approaches, translation templates are also extracted in advance from the database to be applied at runtime on input sentences. Suppose now that the database contains both (30) above and (32):
(32) Susan likes to read newspapers. → Susan aime lire les journaux.

Then clearly the same logic as above can be used to extract offline the equivalence X likes to read Y → X′ aime lire Y′. When pushed to its extreme, this offline approach can be more or less equivalent to a transfer-based approach in which the transfer rules are produced by machine learning techniques rather than handcrafted.

Statistical Machine Translation

Statistical techniques have been considered for MT since the field’s inception, motivated by early successes in cryptography (Weaver, 1955). However, modern SMT began with the work of a group at IBM in the late 1980s, described chiefly in ‘The mathematics of machine translation: parameter estimation’ (Brown et al., 1993). This seminal article sets out a series of statistical models that have become known as the IBM models. These models continue to play a central role in SMT and have found many applications in other areas such as cross-language information retrieval and text summarization. The following subsections begin by looking at statistical methods with an overview of the classical IBM approach, then discuss the phrase-based methods that are the current state of the art, and finally touch on some future directions for the field. The Classical IBM Approach SMT systems work by trying to find the most probable target sentence for a given source sentence (texts are typically translated sentence by sentence). Conceptually, this means enumerating all possible word sequences t in the TL, computing the probability p(t|s) that each will be the best translation of the current source sentence s, and in the end generating the t that was assigned highest probability. Two problems need to be solved in order to do this: creating a model to give an estimate of p(t|s) for an arbitrary pair of source and target sentences; and designing an effective search procedure to find the best target sentence according to the model. Realistic search procedures must exploit the structure of the model in order to avoid having to test all possible target sentences, which is impractical.

For example, the sentence-matching model outlined above lends itself to a particularly efficient search algorithm: if the source sentence is found in the corpus, output its most frequent translation; otherwise, output nothing. Effective search procedures for the IBM models are considerably more complex than this and have been an active topic of research since the models first appeared. Because these procedures are similar to those used for phrase-based models, their description is deferred to the next section. The Source-channel Model As mentioned above, the role of the model in SMT is to estimate the probability p(t|s) for any source sentence s and target sentence t. The IBM models take a source-channel (or noisy-channel) approach to modeling this distribution. In this approach, which arose in cryptography, the source text is viewed as an encoded version of an original target text, and translation as the process of decoding it to retrieve the original text. Decoding is performed with the aid of two models: p(s|t), which models the encoding process; and p(t), which models the original text prior to encoding. In SMT, these models are known as the translation model and the language model respectively. The main advantage of using two models instead of just one for p(t|s) is modularity: the translation model can concentrate on the relation between s and t, and the language model on the structure of t, thus simplifying the design of p(s|t) compared to a single model for p(t|s). Also, a separate language model can be trained on unilingual target-language corpora, which are more abundant than parallel corpora. In technical terms, the source-channel model corresponds to a Bayesian expansion of p(t|s) into p(s|t) p(t)/p(s). Because p(s) is the same for every candidate t, it can be dropped when searching for the best translation. This leads to the following strategy for translation, which Brown et al. (1993) refer to as the fundamental equation of statistical MT:

$$t^* = \arg\max_{t} \; p(s \mid t)\, p(t) \qquad (1)$$

Here t* is the output translation, and the argmax operation represents the search procedure. The IBM translation models are described below. The language model used in the IBM work and almost all subsequent work is the trigram, originally developed for speech recognition (Jelinek et al., 1992). This model calculates sentence probability as a product of word probabilities, and the probability of each word as essentially the proportion of times it was observed to follow the two preceding words in the training corpus. Alignments The IBM translation models for p(s|t) are built around the key concept of alignments


Figure 7 Alternative IBM-style word alignments for the same sentence pair, assuming French is the SL. Each French word must be accounted for with a single, possibly empty, connection; English words can have any number of connections.

(Figure 7), which are explicit connections between words in the source and target sentences. (Note that these are word alignments, not to be confused with the sentence alignments described earlier.) Alignments determine the probability that a model assigns to a source sentence given a target sentence, in ways that vary among the individual IBM models. For computational reasons, alignments are asymmetric: source words can have at most one connection, but target words can have any number. Because an alignment is not an observable feature of a pair of source and target sentences, the models must assume that any alignment might have occurred, and calculate p(s|t) by summing over the probabilities of all of them. (This is the same as calculating the probability of getting 3 heads in 10 coin flips by summing over the probabilities of all ways in which exactly 3 heads can occur.) In technical terms, alignments are called hidden variables, and they are incorporated into the overall translation probability like this:

$$p(s \mid t) = \sum_{a} p(s, a \mid t), \qquad (2)$$

where a is a complete alignment for s. Parameter Estimation The fact that alignments are ‘hidden’ poses a problem for estimating model parameters, such as p(arbre|tree), that depend on them. If alignments were explicit, we could estimate such probabilities just by counting the number of times tree was connected to arbre in the training corpus and dividing by the total number of connections to tree. But in the absence of observable links, how can the required probabilities be estimated? The IBM models solve this problem with the Expectation Maximization (EM) algorithm (Dempster et al., 1977), which relies on the trick of replacing actual counts of observed events with expected counts derived from the model itself. Because we do not know which valid alignment actually occurred for a given sentence pair, we assume that they all did, but weight the counts derived from each by its probability according to the model. For example, if

an alignment a in which arbre connects to tree was assigned p(a|s,t) = 0.01, then we would increment the count of the pair arbre/tree by 0.01. This process is called the expectation-step or E-step of the algorithm. Once counts have been derived from all alignments in the training corpus, probabilities are obtained from them in exactly the same way as sketched above for standard counts with observed events; this process is called the maximization-step or M-step. It can be shown that if the sequence of E-step and M-step is carried out iteratively, the resulting parameters will converge to values that maximize the probability of the training corpus according to the model. This property is called maximum likelihood, and it is a standard and effective strategy for training probabilistic models. The IBM Models The IBM translation models for p(s|t) form a sequence, numbered from 1 to 5 in order of increasing complexity and increasingly accurate modeling of the term p(s,a|t) on the right-hand side of equation (2). The motivation for using a sequence of models is to facilitate training: the earlier models are easier to train, and the sequence is designed so that their parameters can be used to help train the later models. The remainder of this section briefly sketches all five models, omitting most of the technical detail from the original IBM paper. Model 1 (Figure 8) assumes that word connections are independent of each other and a priori equally likely, and that the probability of an aligned source word depends only on the target word it is connected to. To visualize these assumptions, it can be helpful to think of the process for generating s and a from t: first choose a length for s, then randomly choose a connection for each position in s, and finally fill in the source word in each position by picking a translation for the connected target word.
It is important to realize that this generative process corresponds only to one way that t can give rise to s; to calculate p(s|t), the model must take into account all possible ways, by summing over all alignments. This calculation turns out to be a product of individual source-word probabilities, each of which is an average over contributions p(w_s|w_t) from all words w_t in the target sentence. The parameters p(w_s|w_t) form a kind of probabilistic bilingual dictionary and are known collectively as the t-table. They are the backbone of all five IBM models. Model 2 is very similar to model 1, but with the addition of a family of parameters that permit it to distinguish among alignments based on the geometry of their connections. The idea is that alignments having connections between words in more or less the same relative position within s and t (like the leftmost alignment in Figure 7) should be more probable than


Figure 8 The model 1 generative process, assuming French is the SL: step 1 chooses a source length, step 2 chooses a connection for each source word, and step 3 fills in the source words with translations of the connected target words.

those with wildly divergent connections (like the rightmost alignment in Figure 7). Model 2 calculates p(s|t) the same way as model 1, except that the contributions from target words are weighted by position parameters, so they take the form p(w_s|w_t) × p(i|j,I,J), where p(i|j,I,J) is the weight assigned to a connection between target and source words in positions i and j respectively, within sentences of lengths I and J. The final three models in the IBM sequence are considerably more complex than the first two. Model 3 (Figure 9) is based on the following generative process: first pick a number of connections for each target word according to a fertility distribution p(f|t_i), where f gives the number of connections; then choose a source word for each connection as usual; and finally pick a position j for each source word according to a distortion distribution p(j|i,I,J), where I and J are target and source lengths. As distortion parameters are similar to model 2’s alignment parameters, the main new feature of model 3 is fertility, which captures the propensity of target words to translate to differing numbers of source words. For example, one would expect p(3|potato) to be higher than p(2|potato), reflecting the fact that potato usually translates into pomme de terre. Fertilities have no connection to the actual words selected as translations, so that model 3 would be happier with pomme pomme pomme/potato than with pomme de terre/potato if p(pomme|potato) were higher than the other translation probabilities involved in this expression. As there is no efficient way to calculate the expression for p(s|t) in model 3, the sum over all alignments in equation (2) is approximated by a sum over a small set of high-probability alignments found heuristically with the aid of model 2. Model 4 replaces model 3’s distortion parameters with more refined parameters intended to model the positions assigned to the sets of source words generated by each target word.
These parameters treat such sets compositionally, by breaking them into individual source/target connections, which depend upon previous ones. Calculations of alignment sums for model 4 are carried out with the assistance of model 3.
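To make the expected-count idea behind EM concrete, the following is a minimal sketch of t-table training for model 1 on a toy corpus. Because model 1 treats connections as independent, the weighting over whole alignments reduces to sharing each source word among the target words of its sentence in proportion to the current t-table. The sketch omits the empty connection and the sentence-length term; the corpus, the function name train_model1, and the iteration count are assumptions chosen for illustration rather than details from the IBM paper.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """pairs: list of (source_words, target_words).

    Returns a t-table as a dict: t[(ws, wt)] approximates p(ws|wt).
    """
    src_vocab = {ws for s, _ in pairs for ws in s}
    # Start from a uniform t-table
    t = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected number of ws-wt connections
        total = defaultdict(float)   # expected number of connections to wt
        # E-step: share each source word among the target words of its sentence
        for s, tg in pairs:
            for ws in s:
                norm = sum(t[(ws, wt)] for wt in tg)
                for wt in tg:
                    frac = t[(ws, wt)] / norm
                    count[(ws, wt)] += frac
                    total[wt] += frac
        # M-step: turn expected counts into probabilities
        t = defaultdict(float, {k: c / total[k[1]] for k, c in count.items()})
    return t


# Toy corpus with French as the SL (invented for this sketch)
corpus = [
    ("un arbre".split(), "a tree".split()),
    ("un chien".split(), "a dog".split()),
    ("arbre".split(), "tree".split()),
]
t = train_model1(corpus)
print(round(t[("arbre", "tree")], 3), round(t[("un", "tree")], 3))
```

Even on three sentence pairs, the probability mass for tree shifts towards arbre over the iterations, although no connection is ever observed directly.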

Figure 9 The model 3 generative process, assuming French is the SL: step 1 chooses a fertility for each target word, step 2 fills in the source words with translations of the connected target words, and step 3 chooses a position for each target word. The connections above the English sentence are empty ones.

Model 5 is similar to model 4; it merely corrects a technical deficiency in the latter by ensuring that the final distribution p(s|t) sums to 1 over all source sentences. Translation Difficulties It is instructive to consider how the IBM models, in conjunction with a trigram language model, can cope with some of the translation difficulties catalogued in the earlier section ‘Why Machine Translation Is a Difficult Problem’.
• Morphology. The IBM models are innocent of morphology, and have no idea, for instance, that tree and trees are more closely related than tree and elephant. The benefit of this lack of understanding is that they can learn the precise behavior of any words that occur frequently enough in the corpus, regardless of relationships hypothesized by a particular linguistic theory. The drawback is that they are completely unable to generalize along this dimension. For example, they are not able to infer that trees translates to arbres from knowledge that tree translates to arbre.
• Idioms and other multiword relations. The IBM models have no direct mechanism for coping with idioms, since the only explicit relations they capture are word-to-word (i.e., the t-table). Their main indirect mechanism is the fertility concept, which can encourage single target words to connect to multiple source words. Combined with distortion parameters to capture positioning of sets of source words, and the idea of summing over multiple alignments, this connection gives them a capability to model general many-to-many relations, albeit only a very weak one. The language model enhances this capability by enforcing correct ordering and potentially filling in gaps in multiword idioms.
• Source ambiguities. Ambiguities requiring extrasentential knowledge to resolve are out of reach for the IBM models, due to their assumption that sentences are independent units. For other types of ambiguities, the main strategy is just to choose the


most frequent translation. This choice by frequency is bolstered to some extent by potential interactions among words, such as when a source sentence containing light and shine induces a preference for lumière over léger as the translation of light due to a learned association between shine and lumière. The fertility parameters and the language model can also play an indirect role in resolving source ambiguities.
• Grammatical and word-order differences. Distortion parameters give model 4 a crude notion of word order that essentially relies on the positions of connected source word sets to determine the relative positions of consecutive target words. Even cruder, but with potentially wider scope, model 2’s alignment parameters would in theory allow it to capture gross systematic word-order differences such as between an SVO and an OVS language.

Phrase-based Translation The current state of the art in SMT is distinguished from the classical approach of the previous section mainly through the use of phrase-based models, which were first introduced by Och and Ney (2004). The description presented here is based on work by Koehn et al. (2003), who give a synthesis of Och’s method and related approaches by other researchers. After describing phrase-based models, two other key techniques of modern SMT are briefly discussed, namely dynamic-programming search algorithms and minimum-error-rate training. Koehn’s model is described by the generative process shown in Figure 10. First, the source sentence is split into ‘phrases,’ which are really just n-grams of various lengths. Next, each source phrase is translated into exactly one target phrase. Finally, the target phrases are re-ordered and concatenated to form the output sentence. This is a much more intuitively appealing process than the ones for the IBM models described in the previous section.
The initial step of splitting the source sentence into phrases is a plausible strategy for human translation, in which one selects the

portions of the source sentence that can be translated as independent units. For example, look up should be treated as a phrasal verb and translated as chercher in contexts like look up a word in the dictionary; but it can be treated compositionally as look followed by up in other contexts such as look up into the sky. Given this definition of splitting, IBM-style fertility parameters are unnecessary, because all phrases translate one-to-one. Furthermore, distortion parameters can be much simpler because they apply only to the reordering of phrases and not to their (fixed) contents. Each step of the generative process corresponds to a family of parameters in the phrase-based model. By far the most important is the phrase table, a collection of phrase pairs (s,t), along with the probability p(t|s) that s will translate into t. This is the phrase-based analog to the t-table in the IBM models. The other two kinds of parameters – ones that govern splitting and distortion – tend to be rudimentary in existing phrase models, and this article will not discuss them further. Phrase-Pair Induction Unlike the IBM models, Koehn’s phrase-based model does not have an intrinsic procedure for parameter estimation; it relies on an externally-specified phrase table. (This is not the case for all phrase-based models – a notable exception is the joint model proposed by Marcu and Wong (2002).) The typical way to generate a phrase table is to use the IBM models, first training them on some parallel corpus and then using them to produce word alignments for every sentence pair it contains. The quality of these alignments can be improved by training models in each direction – p(t|s) and p(s|t) – and using heuristics to combine the alignments generated by each. Interestingly, final quality is not very sensitive to the particular IBM model used. Once an alignment has been established for a sentence pair, the next step is to determine the phrase pairs that are consistent with it.
To ensure that extracted pairs are self-contained, only those candidates with no links to words outside the pair are considered valid. Figure 11 shows a sample alignment and the resulting phrase pairs satisfying this criterion.
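The consistency criterion can be sketched as follows. The sentence fragment and alignment links below are assumed for illustration (they are loosely modeled on, but not copied from, Figure 11), and the function name extract_phrase_pairs is hypothetical. For brevity the sketch requires every extracted pair to contain at least one link and does not extend phrases over unaligned boundary words.

```python
def extract_phrase_pairs(src, tgt, links, max_len=4):
    """Return the phrase pairs that are consistent with a word alignment.

    links is a set of (i, j) pairs connecting src[i] to tgt[j].
    A pair of spans is kept only if no link connects a word inside
    one span to a word outside the other.
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2]
            js = [j for (i, j) in links if i1 <= i <= i2]
            if not js:
                continue
            j1, j2 = min(js), max(js)
            if j2 - j1 + 1 > max_len:
                continue
            # Reject the candidate if the target span links outside the source span
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for (i, j) in links):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


src = "je suis très heureux".split()     # French as the SL
tgt = "i am pleased".split()
links = {(0, 0), (1, 1), (2, 2), (3, 2)}  # très and heureux both link to pleased
for pair in sorted(extract_phrase_pairs(src, tgt, links)):
    print(pair)
```

Note that très alone is never paired with pleased: that candidate is rejected because pleased also links to heureux, which lies outside the source span.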

Figure 10 The generative process for phrase-based models, assuming French is the SL: step 1 splits the source sentence into phrases, step 2 selects a translation for each phrase, and step 3 reorders the translations.


A maximum phrase length is typically imposed to prevent inflating the phrase table with entries that will have very little chance of being used. After phrase pairs have been extracted from a corpus, probability estimates are made from relative frequencies, i.e., p(t|s) = count(s,t)/count(s). Because different phrase pairs are treated as completely independent n-grams, these estimates will generally have the undesirable property of varying widely across related pairs. Different phrase-based variants address this smoothing problem in different ways – the most sophisticated is Och and Ney’s original alignment template approach, which relies on word classes – but none is entirely satisfactory. Search Recall that the search problem in statistical translation is concerned with finding the most likely translation hypothesis according to the model. The exponential number of possible hypotheses poses a very difficult problem; Knight (1999) shows that, for word-replacement models like the IBM series, the problem is NP-complete, which roughly means that it is impossible to solve exactly in any reasonable amount of time. Heuristics must therefore be used; fortunately, experimental evidence has shown that heuristics tend to work well for this problem (Germann et al., 2001). The best current algorithms for phrase-based models are adaptations of dynamic-programming beam search techniques widely used in speech recognition. These algorithms maintain a large list of partial translation hypotheses, each of which is a prefix of a complete translation. Hypotheses are extended by choosing some as-yet-untranslated source phrase and appending a translation obtained from the phrase table. When all hypotheses in the list are complete translations of the source sentence, the algorithm

Figure 11 Extracting phrase pairs from alignments. Assuming a maximum permissible phrase length of 4, the valid phrases for the alignment shown are: i/je, i am/je suis, am/suis, pleased/très heureux, present the petition/la présenter, the/la, the/l’, at/à, time/instant. Notice that the phrases extracted are quite robust to alignment errors, which tend to suppress potentially good pairs rather than create bad ones.
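As a rough illustration of the relative-frequency estimate p(t|s) = count(s, t)/count(s), the Python sketch below counts extracted phrase pairs and normalizes by the count of the source phrase. The function name `estimate_phrase_table` and the toy counts are hypothetical, invented for illustration; only the formula comes from the text, and the sketch assumes French is the SL.

```python
from collections import Counter


def estimate_phrase_table(extracted_pairs):
    """Relative-frequency estimates p(t|s) = count(s, t) / count(s).

    extracted_pairs is a list of (source_phrase, target_phrase) tuples,
    one tuple per occurrence extracted from the aligned corpus.
    """
    pair_counts = Counter(extracted_pairs)
    source_counts = Counter(source for source, _ in extracted_pairs)
    return {
        (source, target): count / source_counts[source]
        for (source, target), count in pair_counts.items()
    }


# Hypothetical extraction results; the counts are invented for illustration.
pairs = (
    [("heureux", "happy")] * 3
    + [("heureux", "pleased")] * 1
    + [("je suis", "i am")] * 2
)
table = estimate_phrase_table(pairs)

for (source, target), prob in sorted(table.items()):
    print(f"p({target!r} | {source!r}) = {prob:.2f}")
```

Because every pair is counted independently, related source phrases share no statistics, which is the smoothing problem noted in the text.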

outputs the most probable one and stops. Central to this process are two strategies used to keep the size of the hypothesis list from getting too large:

1. Hypotheses that are indistinguishable for the purposes of evaluating possible extensions are merged (the dynamic-programming part of the algorithm). Under typical assumptions, this equivalence property holds for any two candidates that have the same last two target words, cover the same set of source words, and whose most recently translated source phrases end at the same point. After a merge operation, only the hypothesis with the higher probability is retained.
2. Hypotheses that have low probabilities are discarded (this is called a beam search strategy, because such hypotheses are considered to fall outside of a beam that illuminates only the best candidates). Unlike the merging strategy, this one is a heuristic, and thus has the potential to cause search errors. To avoid unfairly penalizing longer hypotheses, comparisons are made only among candidates that cover the same number of source words.

Another important component of search algorithms is the mechanism for assigning scores to partial hypotheses. It is crucial that these be as accurate as possible, otherwise good candidates may get pruned away during the beam search. To improve accuracy, the algorithms try to guess the best way to complete each hypothesis, using strategies such as picking the best translation for each untranslated source phrase without regard for word order.

Log-linear Models and Minimum Error-rate Training

Recall that the source-channel approach combines contributions from a language model p(t) and a ‘backward’ translation model p(s|t) by multiplying them. This approach has solid theoretical motivations and is highly effective, but it is somewhat inflexible.
There may be other sources of information – for example a forward translation model p(t|s) – that we wish to incorporate into the overall probability calculation, but it is not obvious how this might be done. A well-grounded probabilistic framework for incorporating arbitrary sources of information is the log-linear or maximum entropy framework, which encodes information sources as feature functions f(s,t), and assigns them weights according to how valuable they are. The idea of using log-linear combinations instead of the standard source-channel combination is due to Och and Ney (2002), who showed that it could produce significantly more accurate translations, given an appropriate set of features. In


a second influential paper (2003), Och described a minimum error-rate strategy for choosing weights so as to directly minimize the errors in a system’s output, as measured by a general MT evaluation technique such as BLEU (Papineni et al., 2001). An interesting characteristic of Och’s minimum error-rate training algorithm is an n-best rescoring step, in which relatively small sets of high-probability translation hypotheses are generated explicitly by the search procedure and then reordered by the model. This permits features that would be too expensive to use during the initial search, for instance ones based on global properties of a given source/target sentence pair.

Translation Difficulties

It is somewhat difficult to precisely characterize the problems that might be encountered by a phrase-based model coupled with a minimum-error rescoring procedure, because this framework does not place strong limits on the kinds of knowledge that can be encoded as features. Also, unless the length of phrases is restricted, phrase-based models can in principle handle any phenomena for which there is sufficient evidence in the training corpus. However, their capabilities in this regard will be highly sporadic and will not extend to extra-sentential phenomena like coreference ambiguity and explicitation constraints. In practice, although the errors committed by phrase-based systems tend to cut across the spectrum of the translation difficulties listed in the previous section, we can make two general observations about their performance. First, because phrases are treated as independent n-grams, the model is highly subject to variations in surface form, such as morphology, changes in order, and insertions or deletions. This characteristic can lead to surprising fluctuations in which the system handles a fairly difficult example quite well, but breaks down completely on an apparently trivial variant of it.
An even more pronounced effect is that phrase-based output looks like it has been translated in small chunks. As Figure 12 shows,

it often consists of short segments that are good translations in themselves, but that lack syntactic and semantic coherence with the surrounding context.

Future Directions

Due to the successes of phrase-based models, much research effort in SMT is currently aimed at improving them. A major part of this effort has been concerned with attempting to remedy the fragmentary nature of phrase translations through the incorporation of syntax. To date, the most systematic attempt in this vein has been the Johns Hopkins 2003 SMT workshop (Och et al., 2003), in which a minimum-error rescoring approach was used to incorporate various syntactic features. Although many sophisticated techniques were explored, none turned out to give a truly significant improvement over a baseline phrase-based system. The reasons for this failure are not entirely clear, but the limited scope for improvement offered by n-best lists may have played a role.

Apart from syntax, other work directly aimed at improving phrase-based models includes new methods of extracting phrases – ones that improve on the rather ad hoc procedure sketched in the previous section – and new parameterizations for distortion. Research less directly based on phrase models is also very active. Among other topics, it includes investigations into the use of extra-sentential context, weighted finite state transducers, new grammar formalisms, semi-supervised learning of translation relations, and learning from comparable rather than parallel corpora.

See also: Bar-Hillel, Yehoshua (1915–1975); Collocations;

Corpus Approaches to Idiom; Head-Driven Phrase Structure Grammar; Idioms; Lexical Functional Grammar; Lexicography: Overview; Machine Translation: History; Machine Translation: Interlingual Methods; Morphology: Overview; Word Order and Linearization.
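The log-linear combination and the n-best rescoring step described in this article can be sketched in a few lines of Python. This is a minimal illustration, not Och and Ney's implementation: the feature names (`lm`, `tm_backward`, `tm_forward`), the feature values, and both weight settings are invented, and the weights are set by hand rather than by minimum error-rate training.

```python
def loglinear_score(features, weights):
    """Weighted sum of feature values for one hypothesis.

    With log-probabilities as features, this is the (unnormalized)
    logarithm of the log-linear model score.
    """
    return sum(weights[name] * value for name, value in features.items())


def rescore(nbest, weights):
    """Reorder an n-best list by log-linear score, best hypothesis first."""
    return sorted(
        nbest,
        key=lambda hyp: loglinear_score(hyp["features"], weights),
        reverse=True,
    )


# Hypothetical n-best list; every feature value is an invented log-probability.
nbest = [
    {"text": "i am very happy",
     "features": {"lm": -4.0, "tm_backward": -3.0, "tm_forward": -2.5}},
    {"text": "i am pleased",
     "features": {"lm": -3.5, "tm_backward": -3.6, "tm_forward": -2.0}},
]

# Weight 1 on log p(t) and log p(s|t), 0 elsewhere: the source-channel model.
source_channel = {"lm": 1.0, "tm_backward": 1.0, "tm_forward": 0.0}
# Hand-picked weights that also use the forward model p(t|s).
with_forward = {"lm": 1.0, "tm_backward": 0.5, "tm_forward": 1.0}

print(rescore(nbest, source_channel)[0]["text"])
print(rescore(nbest, with_forward)[0]["text"])
```

Setting the language-model and backward-model weights to 1 and all others to 0 recovers the source-channel product as a special case, which is why the log-linear framework can be seen as a generalization of it.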

Bibliography

Figure 12 Sample output from a vanilla phrase-based SMT system trained on Hansard text, illustrating incoherence among phrase translations. The French source sentence is on top, the output in the middle, and the English reference sentence on the bottom.

Allegranza V, Krauwer S & Steiner E (eds.) (1991). Eurotra. Special Issue on Machine Translation 6(2/3).
Bar-Hillel Y (1960). ‘The present status of automatic translation of languages.’ Advances in Computers 1, 91–163.
Brown P, Della Pietra S, Della Pietra V & Mercer R (1993). ‘The mathematics of machine translation: parameter estimation.’ Computational Linguistics 19(2), 263–312.
Dempster A P, Laird N M & Rubin D B (1977). ‘Maximum likelihood from incomplete data via the EM algorithm.’ Journal of the Royal Statistical Society 39(1), 1–38.
Dorr B J (1990). ‘Solving thematic divergences in machine translation.’ In Proceedings of the 28th annual conference of the Association for Computational Linguistics. University of Pittsburgh, Pittsburgh, PA. 127–134.
Gale W A & Church K W (1991). ‘A program for aligning sentences in bilingual corpora.’ In Proceedings of the 29th annual meeting of the Association for Computational Linguistics (ACL). Berkeley, California.
Garvin P (1967). ‘The Georgetown-IBM experiment of 1954: an evaluation in retrospect.’ In Papers in linguistics in honor of Léon Dostert. The Hague: Mouton. 46–56; reprinted in Garvin P (ed.) (1972). On machine translation. The Hague: Mouton.
Germann U, Jahr M, Knight K, Marcu D & Yamada K (2001). ‘Fast decoding and optimal decoding for machine translation.’ In Proceedings of the 39th annual meeting of the Association for Computational Linguistics (ACL). Toulouse.
Hutchins W J & Somers H L (1992). An introduction to machine translation. Academic Press.
Isabelle P & Bourbeau L (1985). ‘TAUM-AVIATION: its technical features and some experimental results.’ Computational Linguistics 11(1), 18–27.
Jelinek F, Mercer R L & Roukos S (1992). ‘Principles of lexical language modeling for speech recognition.’ In Furui S & Sondhi M M (eds.) Advances in speech signal processing. New York: Marcel Dekker. 651–699.
Knight K (1999). ‘Decoding complexity in word-replacement translation models.’ Computational Linguistics 25(4).
Koehn P, Och F J & Marcu D (2003). ‘Statistical phrase-based translation.’ In Hovy E (ed.) Proceedings of the human language technology conference of the North American chapter of the Association for Computational Linguistics. Edmonton, Alberta, Canada. 127–133.
Marcu D & Wong W (2002). ‘A phrase-based, joint probability model for statistical machine translation.’ In Proceedings of the 2002 conference on empirical methods in natural language processing (EMNLP). Philadelphia, PA.
McCord M (1989). ‘Design of LMT: a Prolog-based machine translation system.’ Computational Linguistics 15(1), 33–52.
Mel’čuk I & Zolkovsky A (1970). ‘Sur la synthèse sémantique.’ T. A. Informations 2, 1–85.
Menezes A & Richardson S (2001). ‘A best-first alignment algorithm for automatic extraction of transfer mappings from bilingual corpora.’ In Proceedings of the Workshop on Data-driven Machine Translation at 39th annual meeting of the Association for Computational Linguistics. Toulouse. 39–46.
Minsky M (1975). ‘A framework for representing knowledge.’ In Winston P H (ed.) The Psychology of Computer Vision. McGraw-Hill. Chap. 6.
Nagao M (1984). ‘A framework for mechanical translation between Japanese and English by the analogy principle.’ In Elithorn A & Banerji R (eds.) Artificial and human intelligence. Amsterdam: North-Holland. 173–180.
Nilsson N (1991). ‘Logic and artificial intelligence.’ AI 47, 31–56.
Nirenburg S (1989). ‘Knowledge-based machine translation.’ Machine Translation 4, 5–24.
Nirenburg S (ed.) (1994). ‘The PANGLOSS Mark III machine translation system.’ Joint technical report by NMSU CRL, USC ISI, and CMU CMT.
Nirenburg S & Raskin V (2004). Ontological semantics. Cambridge, MA: MIT Press.
Nyberg E & Mitamura T (1996). ‘Controlled language and knowledge-based machine translation: principles and practice.’ In Proceedings of the 1st international workshop on controlled language applications (CLAW ’96).
Och F J (2003). ‘Minimum error rate training for statistical machine translation.’ In Proceedings of the 41st annual meeting of the Association for Computational Linguistics (ACL). Sapporo.
Och F J & Ney H (2002). ‘Discriminative training and maximum entropy models for statistical machine translation.’ In Proceedings of the 40th annual meeting of the Association for Computational Linguistics (ACL). Philadelphia.
Och F & Ney H (2004). ‘The alignment template approach to statistical machine translation.’ Computational Linguistics 30(4).
Och F, Gildea D, Khudanpur S, Sarkar A, Yamada K, Fraser A, Kumar S, Shen L, Smith D, Eng K, Jain V, Jin Z & Radev D (2003). ‘Final report, syntax for statistical MT group.’ JHU workshop 2003. Technical report, The Center for Language and Speech Processing, The Johns Hopkins University.
Papineni K, Roukos S, Ward T & Zhu W-J (2001). ‘BLEU: a method for automatic evaluation of Machine Translation.’ Technical Report RC22176, IBM.
Schank R (1972). ‘Conceptual dependency: a theory of natural language understanding.’ Cognitive Psychology 3(4), 532–631.
Schank R & Abelson R (1977). Scripts, plans, goals, and understanding. Hillsdale, NJ: Lawrence Erlbaum.
Simard M, Foster G F & Isabelle P (1992). ‘Using cognates to align sentences in bilingual corpora.’ In Proceedings of the 4th conference on theoretical and methodological issues in machine translation (TMI). Montréal, Québec.
Slocum J (1987). ‘Metal: the LRC machine translation system.’ In King M (ed.) Machine translation today: the state of the art. Edinburgh University Press. 319–350.
Vauquois B (1968). ‘A survey of formal grammars and algorithms for recognition and transformation in machine translation.’ In IFIP, Congress-68. Edinburgh. 254–260.
Vauquois B & Boitet C (1985). ‘Automated translation at Grenoble University.’ Computational Linguistics 11(1), 28–36.
Watanabe H, Kurohashi S & Aramaki E (2003). ‘Finding translation patterns from paired source and target dependency structures.’ In Carl M & Way A (eds.) Recent advances in example-based machine translation. Dordrecht: Kluwer Academic Publishers. 397–420.
Weaver W (1955). ‘Translation.’ In Machine translation of languages. Cambridge, MA: MIT Press.
Yngve V (1957). ‘A framework for syntactic translation.’ Mechanical Translation 4(3), 59–65; reprinted in Nirenburg S, Somers H & Wilks Y (eds.) (2003). Readings in machine translation. MIT Press. 39–44.
Zarechnak M (1959). ‘Three levels of linguistic analysis.’ Journal of the Association for Computing Machinery 6(1), 24–32.

Relevant Website
http://www.ldc.upenn.edu – Linguistic Data Consortium.

Macro-Jê
E R Ribeiro, Museu Antropológico, Universidade Federal de Goiás, Goiás, Brazil
© 2006 Elsevier Ltd. All rights reserved.

The Macro-Jê stock comprises the Jê family and a number of possibly related language families, all of which are located in Brazil. Macro-Jê is arguably one of the lesser-known language groups of South America, its very existence as a genetic unit being still “a working hypothesis” (Rodrigues, 1999: 165). According to Rodrigues (1986, 1999), whose classification is the most widely accepted among researchers working on Brazilian languages, the ‘Macro-Jê hypothesis’ comprises 12 different language families: Jê, Kamakã, Maxakalí, Krenák, Purí, Karirí, Yatê, Karajá, Ofayé, Boróro, Guató, and Rikbaktsa. The existence of Jê as a language family has been recognized since early classifications of South American languages (Martius, 1867). ‘Jê’ is a Portuguese spelling for a Northern Jê collective morpheme ([je] in Apinajé, for instance) that occurs in the names of several Jê-speaking peoples. The term ‘Macro-Jê’ was coined by Mason (1950), replacing earlier labels, such as ‘Tapuya’ and ‘Tapuya-Jê.’

Comparative Evidence

Recent classifications (Rodrigues, 1986; Greenberg, 1987; Kaufman, 1994) differ as to the precise scope of Macro-Jê, although there is agreement on the inclusion of most of the families (Table 1). Greenberg and Kaufman included all the families listed above except Karirí (included only by Rodrigues). In addition, Greenberg included Chiquitano (also included by Kaufman), Jabutí, and Otí. Given the lack of comprehensive comparative studies, the Macro-Jê status of some of these families is still an open question. Although Guató is included in the stock by all of the aforementioned classifications, a case for its inclusion has yet to be made, beyond the superficial, inconclusive evidence presented so far (Rodrigues, 1986, 1999). On the other hand, a preliminary comparison has revealed compelling evidence for the inclusion of the Jabutí family into the Macro-Jê

Table 1 The Macro-Jê Hypothesis [a]

1. Jê
   †Jeikó
   Northern Jê: Panará, Suyá, Kayapó, Timbíra (Parkatêjê, Pykobjê, etc.), Apinajé
   Central Jê: Xavánte, Xerénte, †Akroá-Mirim, †Xakriabá
   Southern Jê: Kaingáng, Xokléng, †Ingaín
2. Kamakã: †Kamakã, †Mongoyó, †Menién, †Kotoxó, †Masakará
3. Maxakalí: Maxakalí, †Pataxó, †Kapoxó, †Monoxó, †Makoní, †Malalí
4. Krenák: Krenák (Botocudo, Borúm)
5. Purí (Coroado): †Coroado, †Purí, †Koropó
6. Ofayé: Ofayé
7. Rikbaktsá: Rikbaktsá
8. Boróro: Boróro, †Umutína, †Otúke
9. Karajá: Karajá (including four dialects, Southern Karajá, Northern Karajá, Javaé, and Xambioá)
10. Karirí: †Kipeá, †Dzubukuá, †Pedra Branca, †Sabuyá (included by Rodrigues but not Greenberg or Kaufman)
11. Jabutí: Djeoromitxí (Jabutí), Arikapú (included by Greenberg but not Rodrigues or Kaufman)
12. Yatê: Yatê
13. Guató: Guató
14. Chiquitano: Chiquitano (Besiro) (included by Greenberg and Kaufman, but not Rodrigues)
15. Otí: †Otí (Eo-Xavánte) (the inclusion of Otí, proposed only by Greenberg, is not substantiated by the available data)

[a] Extinct languages are indicated by †. Based on Greenberg, 1987; Rodrigues, 1986, 1999; Kaufman, 1994.

394 Machine Translation: Interlingual Methods
Philpot A, Fleischman M & Hovy E H (2003). ‘Semiautomatic construction of a general purpose ontology.’ In Proceedings of the International Lisp Conference. New York.
Reichenbach H (1947). Elements of symbolic logic. London: Collier Macmillan.
Schank R C & Abelson R P (1977). Scripts, plans, goals, and understanding: an enquiry into human knowledge structures. Hillsdale, NJ: Lawrence Erlbaum.
Schultz T, Alexander D, Black A W, Peterson K, Suebvisai S & Waibel A (2004). ‘A Thai speech translation system for medical dialogs.’ In Proceedings of the Conference on Human Language Technologies (HLT-NAACL). Boston, MA. Companion Volume 34–35.
Vauquois B (1968). ‘A survey of formal grammars and algorithms for recognition and transformation in machine translation.’ In Proceedings of the IFIP Congress-68. 254–260.
Whitelock P (1989). ‘Why transfer and interlingua approaches to MT are both wrong: a position paper.’ In Proceedings of the MT Workshop: Into the 90’s. Manchester, England.

Relevant Websites
http://www.cicc.or.jp – CICC website.
http://nespole.itc.it – NESPOLE! website.
http://www.umiacs.umd.edu – UMIACS website.
http://www.isi.edu.
http://www.lti.cs.cmu.edu.
http://blombos.isi.edu – DINO browser.
http://www-2.cs.cmu.edu – Enthusiast and Speechalator.
http://www.ll.mit.edu – CCLINC.
http://isl.ira.uka.de – FAME.
http://www.cogsci.princeton.edu – WordNet.
http://www.globalwordnet.org – Global WordNet Association.
http://www.illc.uva.nl – EuroWordNet.
http://www.sfs.nphil.uni-tuebingen.de – GermaNet.
http://www.ceid.upatras.gr – BalkaNet.
http://www.keenage.com – Chinese HowNet.
http://www.gittens.nl – Mimida multilingual semantic network.
http://www.icsi.berkeley.edu – FrameNet project.
http://www.coli.uni-sb.de – SALSA project.
http://www.nak.ics.keio.ac.jp – FrameNet project for Japanese.
http://gemini.uab.es – FrameNet project for Spanish.
http://www.cis.upenn.edu – PropBank project.
http://www.cis.upenn.edu – VerbNet.
http://www.cis.upenn.edu – combination of VerbNet and FrameNet.
http://nlp.cs.nyu.edu – The NomBank Project.
http://aitc.aitcnet.org – IAMTC project.

Machine-Aided Translation: Methods
E Macklovitch, University of Montreal, Montreal, Quebec, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The focus of this article is on machine-aided translation (or MAT), with heavy stress on the word aided, and we shall begin by distinguishing MAT from machine translation (or MT) pure and simple. Both, of course, seek to automate the translation process through the use of computers, and in both humans generally have an important role to play. In MT, however, the initiative in the translation process is given over to the machine, and the aim is to automate this process completely, eliminating the human’s contribution as far as possible. In MAT, on the other hand, the initiative in the translation process remains with the human translator, and the role of the machine is to assist the translator in performing certain tasks – normally, those that can be automated with a good degree of confidence and reliability. The fact that MT systems often do not succeed in

automatically producing a high-quality translation – where high quality is indeed a requirement – and that a human must subsequently intervene to postedit or otherwise improve the machine’s raw output should not cause us to lose sight of the fundamental distinction between MT and MAT. Whereas MT ultimately seeks to dispense with the human translator, MAT proceeds from the double recognition that, for high-quality translation at least, the contribution of a human translator is generally indispensable, and furthermore, that this situation is not likely to change for the foreseeable future. Why is this? Quite simply because high-quality translation routinely requires a level of understanding that extends well beyond the literal wording of a source text to encompass unpredictable amounts of real-world knowledge, as well as the capacity to reason over that knowledge. Despite the undeniable progress recently achieved by the new empirical methods in machine translation, such knowledge and reasoning capabilities remain, by and large, beyond the ken of today’s computers. Hence, where high quality is a sine qua non (and not just information scanning,


or gisting, as it is sometimes called), there will continue to be a need for a human in the translation process, in order to compensate for the machine’s limited understanding.

Before we go any further, a brief terminological digression. For the purposes of this article, machine-aided or machine-assisted translation and computer-aided or computer-assisted translation (CAT) are all taken to be synonymous. Another synonymous variant is machine-aided human translation (MAHT), which is often contrasted with human-aided machine translation (or HAMT), the latter referring to the manner in which humans are called on to pre-edit source texts or postedit the output of fully automatic MT systems.

A Little History

It is one thing to establish that humans are not about to be evicted from the translation process, at least not for the foreseeable future. But then what exactly will the human’s role be in this process? Or, put another way, what is the optimal division of labor between man and machine in the translation process? Yehoshua Bar-Hillel, who was the first full-time MT researcher, was also the first to demonstrate the theoretical infeasibility of fully automatic, high-quality translation – or FAHQT, as he called it – based on his famous “box in the pen” example. (See Bar-Hillel, 1960, particularly Appendix III; see also Machine Translation: Overview; Bar-Hillel, Yehoshua (1915–1975).) Regarding the optimal division of labor alluded to earlier, Bar-Hillel felt that it would be best for the human to intervene either before or after the translation operation proper, but not during it. He recognized, however, that this was an entirely empirical question, the answer to which could change as computers became more powerful and our understanding of natural language evolved.

In any case, the arguments that Bar-Hillel repeatedly advanced in the late 1950s and early 1960s largely fell on deaf ears. For the great majority of researchers working on the problem of translation automation in those years, FAHQT remained the only goal worth pursuing, and one that they were convinced could provide a practical solution to the growing demand for high-quality translation. Bar-Hillel had no problem with MT as a legitimate research goal; it was the second contention that he found wholly unrealistic. However, it was not until 1966, when the American government published its (in)famous ALPAC report, that MT researchers could finally be convinced that their projects would not help respond to the burgeoning worldwide demand for translation, and then only because government

funding for MT research all but dried up (see also Machine Translation: History). Insofar as machine-aided translation is concerned, it is probably true to say that it did not begin to attract researchers’ attention until 1980, when Martin Kay published his seminal paper ‘On the proper place of men and machines in language translation.’ (See also Kay, Martin (b. 1935).) In it, Kay refurbished Bar-Hillel’s original argument, to the effect that fully automatic machine translation, although providing an invaluable research matrix within which to study the workings of human language, had very little to offer in the way of practical solutions to the urgent and growing demands being placed on the overtaxed corps of professional translators. The reason, in Kay’s view, was quite simple: we cannot successfully automate what we do not fully understand. The designers and developers of fully automatic MT systems were attempting to mechanize an essentially linguistic operation (translation) at a time when the science of linguistics had not yet provided an adequate explanation of how language works. As a concrete alternative to fully automatic MT – and this was Kay’s truly original contribution – he advocated machine-aided human translation, or more precisely, a device he called the translator’s amanuensis, now more commonly referred to as a translator’s workstation. At the core of his workstation was a sophisticated text editor, complete with a mouse and a split-screen display, one pane being for the source text and the other for the target. (Remember that Kay was advancing this proposal in 1980, before the appearance of personal computers!) Indeed, Kay argued that a well-designed text editor is probably the single most important tool that translators can be provided with; it is certainly the workstation component they will use most intensively. 
In the next section of his paper, which is entitled ‘Translation Aids,’ Kay went on to propose a number of ancillary programs specifically designed for professional translators that could be grafted onto this text editor. For example: a shared bilingual lexicon to which users can add various levels of information; a source-text analysis program that flags terms or expressions that occur with higher than normal frequency; a keyword-in-context program; and a document retrieval program that would allow the translator to locate past texts that contain material similar to his/her current text. Many of these suggestions have since been embodied in commercial products, as we will show later. But even more important than the specific details of Kay’s amanuensis (at least for our purposes) is the general philosophy of his incremental approach to MAHT. “I want to advocate a view of the problem in which machines are gradually, almost imperceptibly, allowed to take over certain functions in the overall translation process. First, they will take over functions not essentially related to translation. Then, little by little, they will approach translation itself. The keynote will be modesty. At each stage, we will only do what we know we can do reliably. Little steps for little feet” (Kay, 1980: 226).

Before we begin to catalogue the various types of translation-support tools that have emerged since Kay first advanced his MAHT program, we should mention another historical antecedent, in the form of the large, dedicated terminology banks that first appeared in the early 1970s, such as Eurodicautom, which was launched by the European Commission in 1973, and Termium, which the Canadian government inaugurated in 1975. In both cases, the goal was to help standardize the technical terminology and official appellations used in large public administrations that were officially bilingual or, in the case of the European Commission, multilingual. In both cases as well, the professional translators in the employ of the EC or the Canadian Translation Bureau were among the first users of these computerized databases, and their particular needs had been carefully considered during the design and development phases. It therefore seems legitimate to consider these term banks as being among the earliest translation support tools.

Since the mid-1970s, both Eurodicautom and Termium have undergone numerous changes, one of the most important being that they are now accessible to users outside their host organizations. And, of course, many other term banks have since appeared, both in the public and private sectors. Among the characteristics that users most appreciate about these term banks, one is their sheer volume. Both Eurodicautom and Termium, for example, contain well over a million records, and hence many millions of terms.
Another is their reliability, as their records are normally created by bona fide terminologists, following well-defined and rigorous terminological practices. At the same time, this last characteristic represents something of a limitation for working translators, because they are not usually allowed to modify or contribute to the contents of these large, centralized repositories. Moreover, the normalization and standardization of new terminology is a slow, painstaking process and not always carried to a successful conclusion. Hence, in the real world of commercial translation, it often happens that for the same concept, different clients will insist on their own specific terminology. And finally, although these term banks may be enormous, they simply cannot cover all

domains with the same degree of detail, particularly in those high-tech domains where progress is now so rapid. Consequently, translators have a need for other kinds of terminological support. They need a means of recording, storing, and retrieving the results of their own terminological research and particular observations; a more flexible tool that would complement rather than rival the large centralized repositories. This is why for many years, even after the appearance of Eurodicautom and Termium, translators continued to record their terminological observations on file cards, which were sometimes printed up as domain-specific glossaries. It was only with the advent of the personal computer in the early 1980s that it became possible to envisage automating these manually produced glossaries in ways that would be affordable, while at the same time increasing their utility and efficiency for working translators.

Terminology Management Programs

Finding the correct lexical equivalent to source-language (SL) words is clearly a necessary (though obviously an insufficient) condition for translation. Hence, it is not altogether surprising that terminology management programs, intended to meet the needs mentioned in the previous paragraph, were among the earliest specialized translation aids to appear for the new generation of personal computers. The first such commercial application was probably Mercury/Termex (MTX), developed under the direction of Alan Melby of Brigham Young University. MTX was a memory-resident program that ran invisibly in the background until the translator called it to the screen by hitting a hot-key combination from within the word processor. When the MTX pop-up window appeared, the translator would then type in the term to look up, and the program would display the corresponding record, if one was present in its database. Using various hot-key combinations, the translator could also insert a target-language (TL) equivalent directly into a word-processing document and, of course, edit or add new records to the terminology database. Other administrative routines were also available for sharing personal glossaries by importing, exporting, and merging term data files.

There is now a plethora of terminology management programs that offer essentially the same basic functions as MTX, some as stand-alone products, others as components of larger translator workstations (see following). What are the characteristics that translators appreciate most about such programs? One obvious benefit, common to all computer-based lexical resources, is the fact that

Machine-Aided Translation: Methods 397

they allow for much easier and more flexible look-up than any printed volume or collection of file cards. Translators do not have to get up from their desk to query a term; in fact, they do not even have to remove their hands from the keyboard. Moreover, most term management programs now allow for queries, not just on the headword, but also on the content of other fields. Suppose, for example, that the user cannot remember a certain headword but knows that some other word appears elsewhere in the body of the entry. It is extremely unlikely that she would be able to find the entry in question in a bound dictionary, although this would pose no problem for a term management program that indexed the content of all its fields. And the same applies, a fortiori, to systems that permit searches with wildcard characters and variable word order. Moreover, most such systems allow the user to specify multiple search criteria, so that they can retrieve, for example, only those terms that belong to a particular domain, on records that were created after a given date, or by a particular person. In short, these term management programs bring to the domain of terminology many of the benefits of other types of computerized database management systems. Another major advantage of these systems, which was only fully realized when individual PCs were linked up in a local area network, is the ease with which they allow individual users to share the results of their terminological research. Once their machines are linked, it becomes possible for one user to immediately have access to new or modified records entered in the database by another user. Of course, this raises obvious problems of database management, such as how to ensure the integrity and the coherence of the database; but none of these problems are insurmountable, and users have generally found that the advantages of sharing their terminological resources far outweigh the costs. 
(We will return to consider other, more advanced features of recent term management programs later.)
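The kinds of query described above (a search over every field rather than the headword alone, wildcard characters, and several criteria combined) can be illustrated with a minimal sketch. The record layout, the field names, the `search` function, and the sample records are all assumptions made for the illustration; they do not reproduce the design of MTX or of any commercial product.

```python
import re
from dataclasses import dataclass
from datetime import date
from fnmatch import fnmatchcase


@dataclass
class TermRecord:
    # Hypothetical record layout for a personal glossary.
    headword: str
    equivalent: str
    domain: str
    note: str
    author: str
    created: date


def search(records, pattern=None, domain=None, created_after=None, author=None):
    """Return the records that satisfy every criterion that was supplied."""
    hits = []
    for rec in records:
        if pattern is not None:
            # Index the words of all text fields, not just the headword.
            text = " ".join((rec.headword, rec.equivalent, rec.note)).lower()
            words = re.findall(r"\w+", text)
            # fnmatchcase supports wildcard characters such as * and ?
            if not any(fnmatchcase(w, pattern.lower()) for w in words):
                continue
        if domain is not None and rec.domain != domain:
            continue
        if created_after is not None and rec.created <= created_after:
            continue
        if author is not None and rec.author != author:
            continue
        hits.append(rec)
    return hits


# Invented sample data, for illustration only.
RECORDS = [
    TermRecord("disk drive", "lecteur de disque", "computing",
               "also called a hard drive unit", "anne", date(1990, 5, 1)),
    TermRecord("file card", "fiche", "office",
               "used for manual glossaries", "marc", date(1988, 1, 15)),
    TermRecord("hot key", "touche rapide", "computing",
               "keyboard shortcut", "marc", date(1991, 3, 2)),
]
```

With these records, `search(RECORDS, pattern="gloss*")` finds the entry for file card through a word in its note field, even though the user has forgotten the headword, and `search(RECORDS, domain="computing", created_after=date(1990, 12, 31))` restricts the result to a single domain and creation period.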

Other Lexical Aids

The term management programs described in the previous section are primarily intended for a translator’s personal terminology, i.e., as a repository for the equivalents encountered or researched in the course of personal work. In addition to these programs, and to the bona fide term banks such as Termium and Eurodicautom, other lexical resources also form part of a translator’s basic tool kit, most notably standard bilingual dictionaries. Needless to say, these too stand to benefit from the increased power and flexibility that are afforded by electronic databases.

Interestingly, one of the first to realize this was Alan Melby, who also designed the first personal term management program. In the 1990s, Melby’s company obtained the rights from several well-known publishers to include the contents of certain of their bilingual dictionaries within the MTX product offering. As a result, MTX was able to provide its users with two levels of lexical assistance: when a term was queried that was not in the user’s personal glossary, the system would display the corresponding entry for that term in one of its standard bilingual dictionaries, albeit in MTX format. Somewhat surprisingly, it took the publishers of the best-known bilingual dictionaries some years to respond to the PC revolution and to release their own versions of their standard and still invaluable reference works. Translators also make use of other monolingual lexical resources, including dictionaries, spelling checkers, and even verb conjugation programs. Although these are undeniably useful, their usefulness is certainly not limited to translators. This is why we will have little to say about them here, preferring instead to concentrate on those tools that are designed specifically for translators and that support them in the central and inherently bilingual aspect of their work. In our discussion of the terminology management programs earlier, it was tacitly assumed that it is up to the translator to identify the terms that should be added to the system’s database. In the early 1990s, some researchers (notably Justeson and Katz, 1993) proposed some surprisingly simple techniques that would allow for the automatic identification of candidate terms in a given text. 
The techniques involved were part-of-speech tagging (see Part-of-Speech Tagging), followed by the extraction of those word sequences in the tagged text that correspond to well-established term patterns (e.g., adjective–noun–noun in English), and finally the merging and sorting of the results in order of descending frequency. This algorithm was said to work best on long technical texts, and then only for multiword terms; furthermore, the candidate terms always had to be vetted by a human translator or terminologist. Nevertheless, the results obtained, particularly at the top of the list, very often corresponded to legitimate terms and would therefore be of value to translators who wanted to research their terminology before beginning their translation. But again, these early term extraction programs were monolingual and left the job of locating the TL equivalents of the proposed SL terms to the human translator or terminologist. It was not long, however, before researchers moved to correct this deficiency. In 1994, Ido Dagan and Ken Church published a paper that provided the logical


extension to monolingual term extraction, in the form of a program (called Termight) that operated on parallel texts (i.e., texts that are mutual translations) in order to identify candidate terms along with a proposed translation. Actually, Church had been one of the pioneers in the development of automatic alignment algorithms that serve to create large bitextual corpora (i.e., parallel texts in which the translational correspondences are rendered formally explicit). In Termight, a word-level alignment program was used to calculate the proposed target language equivalents of the candidate terms, and according to the authors, the system was deployed with success at the AT&T Translation Services, leading to marked productivity increases in bilingual glossary construction. In recent years, there has been much activity in this area of automated term extraction. Monolingual term extraction programs are now available for a wide range of languages, and there are even some commercial products that, like Termight, propose translations for the candidate terms, e.g., TermFinder from Xerox (now marketed by Temis) and, more recently, Term Extract from Trados.
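The three steps of the monolingual extraction procedure described in this section (tagging, pattern filtering, and sorting by descending frequency) can be sketched as follows. This is only an illustration under stated assumptions: a tiny hand-made lexicon (`TOY_TAGS`) stands in for a real part-of-speech tagger, the pattern "one or more adjectives or nouns followed by a noun" is a simplification in the spirit of the adjective–noun–noun example, and the function name and sample text are invented.

```python
import re
from collections import Counter

# Hypothetical toy lexicon standing in for a real part-of-speech tagger:
# "A" = adjective, "N" = noun; any word not listed is tagged "X".
TOY_TAGS = {"hard": "A", "disk": "N", "drive": "N", "controller": "N"}

# Simplified term pattern (an assumption): adjectives/nouns ending in a noun.
TERM_PATTERN = re.compile(r"[AN]+N")


def candidate_terms(text, max_len=3):
    """Return multiword candidate terms sorted by descending frequency."""
    counts = Counter()
    # Process one sentence at a time so candidates never span a boundary.
    for sentence in re.split(r"[.!?;]", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        tags = [TOY_TAGS.get(w, "X") for w in words]
        for start in range(len(words)):
            # Only sequences of two or more words are considered.
            for end in range(start + 2, min(start + max_len, len(words)) + 1):
                if TERM_PATTERN.fullmatch("".join(tags[start:end])):
                    counts[" ".join(words[start:end])] += 1
    return counts.most_common()


SAMPLE = (
    "The hard disk drive failed. Replace the hard disk drive and check "
    "the disk controller. The disk controller logs errors. "
    "A disk controller is cheap."
)
CANDIDATES = candidate_terms(SAMPLE)
```

On the sample text, the most frequent candidate is the sequence "disk controller". As the discussion above stresses, such a list is only a set of candidates: it still has to be vetted by a translator or terminologist, and it says nothing about target-language equivalents.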

Repetitions Processing

The basic idea behind repetitions processing in the context of translation is as simple as it is appealing: a translator should never have to retranslate a given SL segment if an acceptable translation has already been provided for that segment. The job of the repetitions processing program is to determine which segments in a new text have already been translated and then to provide the translator with easy access to the previous translations of those repeated segments. This is usually done in the following manner: The repetitions processor first segments a new text to be translated into basic units; these are generally sentences, but may also include headings, list elements, or the contents of each cell in a table. It then takes each source unit in turn and conducts a search for it, as a simple character string, in a reference file (or bitextual database) made up of source and target language translation pairs. If a repetition of a source unit is found, the associated target segment is retrieved from the database and shown to the translator, who may or may not decide to incorporate it into his/her translation. (Alternatively, if the repetitions processor is operating in batch mode, the translation will simply be inserted into the provisional target text, or the source sentence may be flagged in some way so that it need not be retranslated.) This somewhat simplified description raises a number of important questions. The first concerns the

nature of the repeated segments: Are they always full sentences, or other complete textual units like headings or list elements? And the second concerns the nature of the bitextual database in which the processor searches for repetitions: How exactly is it constituted? As it turns out, the two questions are interrelated. The default processing unit in almost all commercial repetitions processing systems – which are commonly called translation memories (TM) – is the complete sentence, because this is the unit for which it is easiest to build up large-scale databases of past translations; and the usefulness of this kind of tool is obviously correlated with the size of its databases (see also Translation Memories). Most such systems operate interactively, as follows: If no match is found for a given sentence in the text being translated, the translator must furnish the target language equivalent for it, as would normally be done. When that translation is completed and the translator moves on to the next sentence in the text, the repetitions processor links and stores the previous source- and target-language pair in its current database, so that if the source sentence is repeated later in the text, the system will be able to retrieve the associated translation. Now, suppose that later in the text, a source sentence is encountered that repeats only part of the SL-TL pair that has just been stored; say, its verb phrase, i.e., the auxiliary, the main verb and the noun phrase that serves as its direct object. The repetitions processor might be able to locate the repeated source sequence in its database, although even this is far from obvious. (How does one know which subsentential sequence to search for without searching for them all, thereby opening the door to a combinatorial explosion?) But even if this problem could be resolved, how would the repetitions processor know which subsegment to retrieve from the associated target sentence? 
Some sort of linguistic analysis would have to be performed in order to determine just what part of that TL sentence corresponds to the translation of the SL verb phrase. Unfortunately, this cannot currently be done with a high degree of precision and reliability. Current automatic alignment technology is quite accurate when it comes to linking corresponding sentences in two texts that are mutual translations; see, e.g., Véronis (2000). However, performance drops substantially when automatic alignment is attempted below the sentence level; see, e.g., the results reported in Mihalcea and Pederson (2003). This is another reason why all commercial repetitions processors employ the full sentence as their default processing unit: by so doing, their search and retrieval can be fully automatic. (Notice, incidentally, that there is nothing to prevent subsentential segments from being manually identified and stored


alongside their translation in the database; but then the problem is how to scale up.) Given the current state of repetitions processing technology, a reasonable question to ask, therefore, is the following: How often do complete sentences reappear verbatim in general texts? Needless to say, the question does not admit of a pat answer, in the form of a simple figure that would apply across the board to all types of texts. However, from a study that we conducted on thousands of queries submitted to an online bitextual database containing 70 million words of Canadian parliamentary debates (see Macklovitch, 2000), the answer would seem to be: not very often. More precisely, what we discovered in this study was that as the queries submitted to this database (TSrali) lengthened, the likelihood of finding an exact match decreased proportionally; when the queries reached 14 words in length, there were no longer any exact matches or repetitions. (Similar findings are reported by Langlais and Simard, 2002, who found that less than 4% of 1260 complete sentences submitted to a much larger Hansard database were repeated verbatim.) Perhaps we tend to underestimate the degree to which natural language is creative, as Noam Chomsky argued long ago in Syntactic structures. Be that as it may, what these findings suggest for repetitions processing is that any system that searches for verbatim repetitions of complete sentences will be of very limited use, except on those types of texts that happen to display a high degree of repetition, such as updates, or certain types of technical manuals in which the same commands may appear over and over again. In terms of the general worldwide demand for translation, however, these surely represent only a small proportion of the texts that need to be translated. 
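The interactive, full-sentence mode of operation described earlier in this section can be made concrete with a minimal sketch. It assumes a naive sentence splitter, an in-memory dictionary in place of a real bitextual database, and a callback (`translate_by_hand`) that stands for the human translator; all of these names are hypothetical and are not taken from any commercial system.

```python
import re


def segment(text):
    """Naive splitter: cut after ., ! or ? followed by whitespace.

    Real systems also treat headings, list items and table cells as units.
    """
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


class ExactMatchMemory:
    """Source/target pairs keyed on the source sentence as a plain string."""

    def __init__(self):
        self._pairs = {}

    def lookup(self, source):
        return self._pairs.get(source)

    def store(self, source, target):
        self._pairs[source] = target


def translate_interactively(text, memory, translate_by_hand):
    """Reuse stored translations; otherwise ask the human and store the pair."""
    output, reused = [], 0
    for sentence in segment(text):
        target = memory.lookup(sentence)
        if target is None:
            target = translate_by_hand(sentence)   # no match: human translates
            memory.store(sentence, target)         # pair becomes reusable
        else:
            reused += 1                            # verbatim repetition found
        output.append(target)
    return output, reused


# Invented example: a dictionary plays the role of the human translator.
HUMAN = {"Press OK.": "Appuyez sur OK.", "Close the window.": "Fermez la fenêtre."}
CALLS = []


def ask_human(sentence):
    CALLS.append(sentence)
    return HUMAN[sentence]


OUTPUT, REUSED = translate_interactively(
    "Press OK. Close the window. Press OK.", ExactMatchMemory(), ask_human
)
```

In this example the human is consulted for only two of the three sentences. Because the lookup is a plain string comparison, a sentence that differs from a stored one by a single character is treated as new, which is precisely why the frequency of verbatim sentence repetition matters so much for this kind of tool.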
The developers of commercial translation memory software are certainly aware of this problem and in response have attempted to extend the applicability of their systems by introducing a number of features meant to attenuate the definition of what constitutes a repetition of the same sentence. Suppose, for example, that a sentence in the text being translated matches a SL sentence in the database except for the value of a proper name or a certain numerical expression, such as a date or an amount of money. A translator would normally want to see the previous translation for this SL sentence, particularly because the named entities it contains are often left untranslated. Some translation memory systems will in fact treat the two sentences as though they were identical and may even replace the values of the named entities in the previous translation with the appropriate values from the new source sentence. Indeed, most commercial TM systems now incorporate a notion

of ‘fuzzy matching,’ which allows them to accommodate this and other sorts of nonidentical but similar SL sentences, where the degree of similarity is expressed in terms of a numerical coefficient that the user may adjust in order to constrain the search for approximate matches. How exactly do these fuzzy matching algorithms work? It is difficult to say with certainty, because TM vendors do not generally provide a formal definition of this similarity coefficient. Hence, it is not at all obvious how the results of a 70% match will differ, say, from a 74% match or an 81% match. One suspects that some notion of edit distance is being employed; but when evaluating two SL strings, the matching algorithm may also take other factors into account, such as whether the translation has been produced by an MT system or by a human, or whether the database contains multiple equivalents for the same SL segment – not to mention the opaque effects of word order differences on the similarity coefficient. Combining such diverse factors into a single percentage may appear to make things simpler for the naïve user; but the unfortunate result is that those same users are left with a vague and ill-defined understanding of a parameter that is central to the system. As Hutchins (2003a) remarked: “most TM systems have difficulty with fuzzy matching – either too many irrelevant examples are extracted, or too many potentially useful examples are missed” (p. 11). Another weakness in many current TM systems, attributable to the structure of the underlying database, is that in these systems, the very notion of a document is lost. When a repetitions processor submits each successive sentence of a source document to the bitextual database, it does so blindly, as it were, without any trace of the context from which it was extracted. Moreover, the contents of the database are also stored as isolated sentences, with no indication of their place in the original document.
As every translator knows, however, it is not always possible to translate a sentence in isolation; the same sentence may have to be rendered differently in different documents, or even within the same document, as Bédard (1998) convincingly argues. It is not hard to come up with examples of phenomena that are simply not amenable to translation in isolation; cross-sentence anaphora is one obvious case, but there are many others. Skeptics may argue that such problems are relatively rare, but they are missing the point. In order to evaluate a translation retrieved from memory, translators routinely need to situate that target sentence in its larger context. Current TM systems offer no straightforward way of doing this because, unlike full document archiving systems, they archive isolated sentences.
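Since vendors do not publish their similarity formulas, the fuzzy matching discussed above can only be illustrated with a stand-in. The sketch below assumes a word-level similarity ratio computed by Python's standard `difflib.SequenceMatcher`, after numerical expressions have been replaced by a placeholder so that sentences differing only in a number count as identical. The threshold value, the function names, and the sample memory are assumptions, and the coefficient should not be read as the one used by any commercial TM system.

```python
import re
from difflib import SequenceMatcher

# Numerical expressions such as 500, 1,250 or 3.5 (a simplifying assumption).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def normalize(sentence):
    """Lower-case, replace numbers by a placeholder, split into words."""
    return NUMBER.sub("<NUM>", sentence.lower()).split()


def similarity(a, b):
    """Word-level similarity coefficient between 0.0 and 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def fuzzy_lookup(source, memory, threshold=0.7):
    """Return (score, stored source, stored target) hits, best first."""
    scored = [(similarity(source, src), src, tgt) for src, tgt in memory.items()]
    hits = [hit for hit in scored if hit[0] >= threshold]
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


# Invented sample memory, for illustration only.
MEMORY = {
    "The invoice is due on 12 March 2004.": "La facture est payable le 12 mars 2004.",
    "Close the window.": "Fermez la fenêtre.",
}
HITS = fuzzy_lookup("The invoice is due on 3 April 2005.", MEMORY)
```

Here the new sentence retrieves the stored invoice sentence with a coefficient of 0.875, while the unrelated sentence falls below the threshold. Even in this transparent toy version, the single figure conceals which words differ and why; a coefficient that also folds in the origin of the translation or the number of competing equivalents is correspondingly harder for a user to interpret.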


This may seem to be excessive criticism of current translation memory systems. After all, this technology is in wide use nowadays and is generally appreciated by translators and translation service managers, unlike fully automatic machine translation systems. If we have chosen to focus on some of the shortcomings of current TM systems, rather than extolling their virtues, it is mainly to highlight the fact that their applicability remains limited to a narrow range of texts that happen to exhibit a high degree of repetition: essentially, updated documents and certain types of technical manuals. In all other document types – in the vast majority of translation situations, in other words – the repetition of complete sentences is quite rare. Hence, the usefulness of repetitions processing technology is necessarily limited, at least until such time as these systems acquire the capability of handling repetition below the level of the complete sentence. See Macklovitch and Russell (2000) for further development of this argument. Another reason for our criticism of current TM systems stems from the quasi-hegemony they seem to enjoy in the field of machine-aided translation. For many people, the terms translation memory and computer-assisted translation are virtually synonymous. This is regrettable, in our view, because it tends to obscure the fact that repetitions processing is just one type of translation support tool, albeit a useful one for certain kinds of repetitive documents. Nevertheless, with time, there is no doubt that we will see the emergence of many other types of translation support tools, thereby relegating repetitions processors to their proper and modest place. Indeed, as we will show in the next section, there exist other ways of exploiting the very same bitextual databases on which repetitions processors are based.

Other Tools Based on Bitext

Existing translations contain more solutions to more translation problems than any other existing resource (Isabelle et al., 1993).

This assertion must have appeared quite audacious when it was first published back in 1993; but if one thinks about it for a moment, recalling the hundreds of millions of words that are translated every year, it just has to be true. The real challenge, of course, is how to render all the richness lying dormant in past translations readily and easily available to human users. Repetitions processing is one way of doing this, although it is certainly not the only way. For Pierre Isabelle and his colleagues in the machine-aided translation group at the CITI, the key to this challenge lay in what was then the novel concept of bitext, i.e., pairs of texts in which the translational

correspondences have been made formally explicit. In the early 1990s, progress in automatic alignment techniques had made it possible to create enormous bitextual databases that were accurately aligned at the sentence level. The question then was: How could these be exploited to help support human translators? One rather straightforward way of allowing translators and other users to benefit from databases of past translations is via an interactive bilingual concordancing tool, like the TransSearch system first developed at the CITI by Michel Simard (see Simard et al., 1993). From the user’s point of view, a concordancer is like a database querying system in which the queries correspond to various sorts of translation problems. Instead of asking colleagues if they’ve ever encountered a particular problem before and, if so, how they previously translated it, the user submits a query to the concordancer. The system responds by searching for the query in the database and displaying all the occurrences it finds, each in its full sentential context; and because this is a bitextual database, it can also display alongside each result the translation of that sentence in the target language. In that translation lies a potential solution to the original translation problem, which users may decide to recycle, as they see fit. For our present purposes, it’s interesting to note that the bitextual database that is queried via a concordancer like TransSearch has exactly the same structure as the databases that underlie commercial repetitions processing systems. (Indeed, some commercial TM packages also include an interactive concordancing facility that accesses the very same databases.) All that distinguishes the two is the manner in which they are queried. One way of expressing this difference is in terms of the trade-off that each effects between automation and flexibility. 
Repetitions processing provides a higher level of automation, because the system automatically submits each successive sentence in a new text to the bitextual database, thereby ensuring that no repeated sentences are overlooked. As we have seen, however, this automation comes at the expense of a certain rigidity; because only complete sentences are submitted, repetitions of segments below the sentence level will often be ignored. An interactive concordancer, on the other hand, offers greater flexibility in the units that can be submitted – these may be any arbitrary sequence, from a single word up to a complete sentence, including ellipses – provided the user takes the initiative of manually selecting and submitting the appropriate queries. But regardless of whether the queries are submitted manually (as with a concordancer) or automatically (as with a repetitions processor), the important point to emphasize is that it is the database


itself that constitutes the true translation memory. For this reason, it seems legitimate to propose a more generic or neutral definition of this term, viz., a computerized archive of past translations that is structured in such a way as to promote translation reuse. Nor is it reasonable to argue that one type of translation memory is a priori superior to the other, although each may lend itself better to various configurations of repetition. For texts that exhibit a high degree of full-sentence repetition, a fully automatic repetitions processor will tend to be more suitable; otherwise, an interactive concordancer will likely prove more useful to the translator. Isabelle et al. (1993) contend that the notion of bitext provides the foundation for a whole new generation of translation support tools. One interesting and novel proposal they have made is for a translation checker, which they call TransCheck; see Macklovitch (1995) for a more detailed description of the system. As its name suggests, a translation checker is meant to be employed somewhat like a spelling or a grammar checker, to which a user will submit texts in the hope of catching typos or certain flagrant errors of grammar. The errors detected by these tools, however, are all monolingual, whereas in the case of TransCheck, the errors the system seeks to flag are bilingual, i.e., errors of correspondence between the source text and a draft translation in a target language. In order to do this, a translation checker must first transform the two texts into a bitext in which the correspondences between segments (generally sentences) are made explicit. Once this is done, the system can then verify those bitextual regions to ensure that they conform to certain well-known properties of an acceptable translation.
The properties that are currently amenable to automatic verification tend to be rather simple and straightforward; basically, they involve formal and observable features in the two texts, rather than more abstract semantic properties. So a system like TransCheck will scan the generated bitext, one aligned region at a time, and verify that each contains certain obligatory equivalences while exhibiting no prohibited equivalences. An example of the obligatory sort would be the terminological equivalences provided in a client glossary or the correct transcription of various numerical expressions; and an example of the prohibited sort would be false cognates or other types of source language interference. Needless to say, such a system will overlook many veritable translation errors, and it may flag others erroneously; but then so do monolingual grammar and spelling checkers, and this doesn’t prevent them from being helpful. In any case, the first commercial products offering rudimentary translation checking capabilities have

just begun to appear on the market; e.g., the ErrorSpy system marketed by the German firm of D.O.G.
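The verification of obligatory and prohibited equivalences described above can be sketched in a few lines. The sketch assumes that the bitext is already available as a list of aligned sentence pairs; the glossary, the false-cognate list, the function names, and the sample sentences are invented for the illustration and make no claim to reflect how TransCheck or any commercial checker is implemented. Number checking is deliberately crude, since real systems must allow for differing number formats across languages.

```python
import re

# Hypothetical client glossary: obligatory source -> target equivalences.
GLOSSARY = {"hard disk": "disque dur"}
# Hypothetical list of prohibited pairings (English-French false cognates).
FALSE_COGNATES = {"library": "librairie"}
NUMBER = re.compile(r"\d+")


def check_region(source, target):
    """Return a list of problems found in one aligned region."""
    problems = []
    src, tgt = source.lower(), target.lower()
    # Obligatory: glossary terms in the source must appear in the target.
    for s_term, t_term in GLOSSARY.items():
        if s_term in src and t_term not in tgt:
            problems.append(f"missing glossary term: {s_term} -> {t_term}")
    # Obligatory: the same numbers must occur on both sides.
    if sorted(NUMBER.findall(source)) != sorted(NUMBER.findall(target)):
        problems.append("numbers differ")
    # Prohibited: a false cognate must not be used as the equivalent.
    for s_word, bad in FALSE_COGNATES.items():
        if s_word in src and bad in tgt:
            problems.append(f"possible false cognate: {s_word} -> {bad}")
    return problems


def check_bitext(bitext):
    """Map the index of each flagged region to its list of problems."""
    report = {}
    for index, (source, target) in enumerate(bitext):
        problems = check_region(source, target)
        if problems:
            report[index] = problems
    return report


# Invented draft translation, aligned at the sentence level.
BITEXT = [
    ("Format the hard disk.", "Formatez le disque dur."),
    ("The library opens at 9.", "La librairie ouvre à 10."),
    ("Replace the hard disk.", "Remplacez le lecteur."),
]
REPORT = check_bitext(BITEXT)
```

On this sample the first region passes, the second is flagged twice (a number mismatch and a suspected false cognate), and the third is flagged for a missing glossary term. Because the checks rely on simple string matching, such a sketch will both miss genuine errors and raise false alarms, as the text notes of checkers in general.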

Translator’s Workstations

A translator’s workstation (TWS), conceived most generally, is a computer-based environment that integrates a number of distinct programs, all of which are intended to assist the human translator in various aspects of work. As we saw earlier, Martin Kay may have been the first to advance the idea of a TWS, which he called the translator’s amanuensis. Alan Melby was another influential advocate of this approach; see Melby (1982). A detailed chronicle of the history of the TWS is provided by Hutchins (1998), who highlights some of the lesser-known figures who first proposed various tools that have subsequently become workstation components. The point that we want to emphasize here is that the TWS is indeed an approach, much more than it is a product, and as such it admits of numerous realizations. What all such realizations have in common, of course, is the basic tenet of machine-aided human translation, namely, that it is the human who remains at the center of the translation process. The workstation provides various support tools that are designed to make the translator more productive and ideally to relieve the translator of the more tedious tasks involved in translation. A brief word on this notion of integration, which is an essential feature of all TWSs. As mentioned, a workstation may comprise disparate applications that were not necessarily intended to function together. Hence, a fundamental requirement of a well-designed workstation is to allow the user to shift focus effortlessly from one component to another and to transfer textual data seamlessly from one application to another. To take one simple example, the user needs to constantly consult his/her terminological glossary while drafting a target text within the word processor and, if the glossary contains the desired TL equivalents, be able to directly insert those terms into the target text with a minimum of keystrokes or mouse clicks.
How this is done in practice will, of course, vary from one implementation to another; but in all workstations these kinds of ergonomic considerations are of paramount importance. Today, when almost all personal computers are equipped with operating systems that feature a graphic user interface, the difficulty being referred to here may not be obvious. However, in the first workstation development projects, like the one undertaken at the CITI in the late 1980s, the available interfaces were not nearly as user friendly, and the exchange of data between applications was often a very real problem,


due in large part to hardware memory limitations (see Macklovitch, 1989). Recall that in the incremental approach that Kay first advocated for his translator’s amanuensis, the central component was a specialized text editor; the other dedicated programs were to be gradually grafted onto this editor. Nowadays, more than two decades after Kay’s original proposal, there are an impressive number of TWSs available – although not all of them describe themselves as such – that offer a variety of translation support tools. Interestingly, almost all of the commercial workstation products are being marketed by the vendors of repetitions processing technology. Hence, it is not altogether surprising that in their promotional literature it is the ‘translation memory’ that is touted as the central, productivity-enhancing component. In actual fact, all these commercial packages include other programs that render them potentially useful to translators when they are working on texts that do not exhibit a sufficient degree of full-sentence repetition; otherwise, these packages would have limited commercial appeal. Foremost among these programs is terminology management software that allows translators to create, update, and share term glossaries. The more sophisticated commercial workstations may also include automatic alignment programs, term extraction programs that operate on the bitexts that these create from legacy translations, and programs that automatically identify terms in a source text that are found within the term glossary. One significant dividing point among the leading commercial workstations is how they interact with the major word processing packages (read MS-Word). Some TWSs do this by integrating their components directly into Word, via the addition of a toolbar menu, for example.
This is said to ease the learning curve for users who are already familiar with Word, while allowing them to access its many useful ancillary features, e.g., monolingual spelling and grammar checking. Trados, which is the current market leader in commercial CAT tools, has adopted this approach, as have some of its smaller competitors, such as LogiTerm. Other workstation packages furnish their own proprietary editors, presumably because this allows them greater autonomy and perhaps better component integration; but they also provide filters to facilitate the importing and exporting of text with the major word processing packages. This is the approach adopted by Atril’s Déjà-Vu and by Star group’s Transit product. These commercial workstation packages are now so widespread that one is tempted to say that they dominate the translation landscape. Currently, they all offer more or less the same range of functionalities, most of which have already been touched on in this

article. However, at least two other features do deserve to be mentioned, even though (strictly speaking) they may fall outside our scope. The first are source-text analysis programs, which provide statistics on a new text to be translated, most notably the degree of sentence repetition; these statistics are meant to allow for an informed decision on the cost-effectiveness of using repetitions processing on a particular text. The second is direct access to a fully automatic machine translation program. This figured prominently in Melby’s original proposal for a multilevel system of translation aids (see Melby, 1982), where full-scale MT was the third, or ultimate, level of automated assistance, just as it was in the historically related ALPS system described in Hutchins (1998). In the 1990s, several major vendors of translation memory technology sought to team up with the developers of commercial MT systems to provide what was vaunted as being the best of both worlds. In these integrated environments, so the argument went, a new input sentence for which no match was found in the translation memory could be sent off for automatic translation by the MT system, after which the translator need only postedit the machine output. (The sequencing of the two processes was significant, the underlying assumption being that past human translations are always preferable to those generated by machine.) There seems to be far less emphasis on such links to full MT systems today. Indeed, of the major CAT system vendors, SDL may be the only one that still offers this option, and even then quite discreetly. One may speculate on the reasons for this important change of strategy.
No doubt it had something to do with the quality of the raw translations produced by the large rule-based MT systems that were dominant in the last decade, a level of quality that was probably judged to be insufficient for publication purposes and incompatible with the exacting requirements of professional translators. Indeed, several of those large commercial MT systems have since disappeared or are now targeting very different markets.
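The sequencing described above (translation memory first, machine translation only when no match is found) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the 0.75 similarity threshold, the use of Python's standard `difflib` as the similarity measure, and the placeholder standing in for an MT system. It does not reproduce how any commercial package computes its matches.

```python
from difflib import SequenceMatcher


def best_tm_match(sentence, memory):
    """Return (similarity, source, target) for the closest stored source sentence."""
    best = (0.0, None, None)
    for source, target in memory.items():
        score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
        if score > best[0]:
            best = (score, source, target)
    return best


def placeholder_mt(sentence):
    # Hypothetical stand-in for a call to an external MT system.
    return "[raw MT output, to be post-edited] " + sentence


def propose_translation(sentence, memory, threshold=0.75, mt=placeholder_mt):
    """Prefer a past human translation; fall back to MT when no match is close enough."""
    score, _, target = best_tm_match(sentence, memory)
    if target is not None and score >= threshold:
        return {"origin": "TM", "score": score, "text": target}
    return {"origin": "MT", "score": score, "text": mt(sentence)}


# Toy translation memory (assumed data): source sentence -> stored human translation.
memory = {
    "Press the power button.": "Appuyez sur le bouton d'alimentation.",
    "Close the cover.": "Fermez le couvercle.",
}

for new_sentence in ("Press the power button.", "The weather is nice today."):
    proposal = propose_translation(new_sentence, memory)
    print(proposal["origin"], "->", proposal["text"])
```

The order of the two steps encodes the assumption stated in the text: the MT fallback is consulted only after the memory has failed to supply a sufficiently similar past translation.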

Conclusion

In the last few years, there has been much encouraging progress in the general field of translation automation. In terms of core technologies, the new statistical or data-driven methods have now made it possible to develop MT systems with a fraction of the human effort that was formerly required. Moreover, the quality of the translations produced by these corpus-based systems has been steadily improving, to the point that they can now rival that produced by the best rule-based systems. In terms of attitudes as well, there has been significant progress: progress in our understanding of which types of technology are most suitable for different kinds of translation demands. The general distinction between translation for assimilation purposes versus translation for dissemination or publication purposes is now widely accepted and with it, the recognition that fully automatic MT is often not appropriate for the latter type of translation demand, at least not given the current state of the technology. This is a most welcome development. For too many years, the “all or nothing syndrome,” as Alan Melby once called it – “the attitude that the machine must translate every sentence or it is not worth using a machine at all” (Melby, 1982: 215) – resulted in the misdirection of energies and resources among researchers and developers, as well as producing a high level of frustration among users, who felt that automated ‘solutions’ were being imposed on them that simply did not fit their needs. In the light of this inglorious history, it is altogether refreshing to read the following from someone as informed as John Hutchins: “MT systems are not suitable for use by professional translators, who do not like to have to correct the irritatingly ‘naïve’ mistakes made by computer programs. They prefer computer aids that are under their full control” (Hutchins, 2003b: 509).

Last but not least, it is also encouraging to see that new types of translation aids are finally being added to commercial workstations, beyond the now standard repetitions processing and glossary management programs. To illustrate with one simple example, several CAT packages now include a program that seeks to detect terms that have been translated inconsistently, i.e., not as the target equivalent specified in the term glossary. Small matter that just such a program was proposed nearly 10 years ago as part of the TransCheck system.
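A terminology consistency check of the kind just mentioned can be sketched in a few lines. The sketch below is only a rough illustration under stated assumptions: the function name and the sample data are invented, and terms are matched by plain case-insensitive substring search, which ignores inflection, word boundaries, and multiple permitted equivalents. It is not the method used by TransCheck or by any commercial package.

```python
def find_term_inconsistencies(segment_pairs, glossary):
    """Flag aligned segments where a glossary source term occurs in the source
    but its specified target equivalent is missing from the translation.

    segment_pairs: list of (source_segment, target_segment) tuples
    glossary: dict mapping source term -> required target term
    """
    issues = []
    for index, (source, target) in enumerate(segment_pairs):
        for source_term, target_term in glossary.items():
            in_source = source_term.lower() in source.lower()
            in_target = target_term.lower() in target.lower()
            if in_source and not in_target:
                issues.append((index, source_term, target_term))
    return issues


# Assumed sample data: one glossary entry and two aligned segments.
glossary = {"hard drive": "disque dur"}
segment_pairs = [
    ("Format the hard drive.", "Formatez le disque dur."),
    ("Remove the hard drive.", "Retirez le lecteur."),
]

for index, source_term, expected in find_term_inconsistencies(segment_pairs, glossary):
    print(f"segment {index}: '{source_term}' not rendered as '{expected}'")
```

In this toy data only the second segment is flagged, since its translation does not contain the glossary equivalent. A usable checker would at least need lemmatization on both sides, which is why the flagged items are best treated as candidates for a reviser to inspect rather than as confirmed errors.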
What’s important for working translators and revisers is that this kind of automated assistance, which will help relieve them of a particularly tedious aspect of their work, is now becoming available. Of course, checking for terminological consistency is but one small element in the overall job of quality assurance in translation. Again, small matter! As Kay so aptly stated so many years ago, in machine-aided translation, it’s little steps for little feet. After years of immobility, the good news is that the feet are finally moving, and moving, it would seem, in the right direction.

See also: Bar-Hillel, Yehoshua (1915–1975); Kay, Martin (b. 1935); Machine Translation: History; Machine Translation: Overview; Part-of-Speech Tagging; Terminology, Term Banks and Termbases for Translation; Translation Memories; Translation: Profession; Writers’ Aids.

Bibliography

ALPAC (1966). Language and machines: computers in translation and linguistics. Washington DC: Automatic Language Processing Advisory Committee, National Academy of Science.
Bar-Hillel Y (1960). ‘The present status of the automatic translation of languages.’ Reproduced in Nirenburg S, Somers H & Wilks Y (eds.). Readings in machine translation. Cambridge, MA: The MIT Press, 2003.
Bédard C (1998). ‘Les mémoires de traduction: une tendance lourde.’ Circuit 60, 25–26.
Dagan I & Church K (1994). ‘Termight: Identifying and translating technical terminology.’ In Proceedings of the European Chapter of the Association for Computational Linguistics (EACL-94): 34–40. Reproduced in: Machine Translation 12, 89–107.
Hutchins J (1998). ‘The origins of the translator’s workstation.’ Machine Translation 13(4), 287–307.
Hutchins J (2003a). ‘Machine translation and computer-based translation tools: What’s available and how it’s used.’ In Bravo J M (ed.) A new spectrum of translation studies. Valladolid: Univ. Valladolid.
Hutchins J (2003b). ‘Machine translation: general overview.’ In Mitkov R (ed.) The Oxford handbook of computational linguistics. Oxford: Oxford University Press.
Isabelle P et al. (1993). ‘Translation analysis and translation automation.’ In Proceedings of the Fifth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI–93). Kyoto, Japan. 201–217.
Justeson J & Katz S (1993). ‘Technical terminology: some linguistic properties and an algorithm for identification in text.’ Technical Report #RC 18906 (82591). IBM T. J. Watson Research Center, Yorktown Heights, New York. Reproduced in: Natural language engineering 1(1), 1995.
Kay M (1980). ‘The proper place of men and machines in language translation.’ Reproduced in Nirenburg S, Somers H & Wilks Y (eds.). Readings in machine translation. Cambridge, MA: The MIT Press, 2003.
Langlais P & Simard M (2002). ‘Merging example-based and statistical machine translation: an experiment.’ In Richardson S (ed.) Proceedings of the Fifth Conference of the Association for Machine Translation in the Americas. Berlin: Springer, LNAI 2499. 104–113.
Macklovitch E (1989). ‘An off-the-shelf workstation for translators.’ In Hammond D (ed.) Proceedings of the 30th Annual Conference of the American Translators’ Association. Washington, DC. 491–498.
Macklovitch E (1995). ‘TransCheck – or the automatic validation of human translations.’ In Proceedings of MT Summit V. Luxembourg. Available online at: http://rali.iro.umontreal.ca.
Macklovitch E (2000). ‘Two types of translation memory.’ In Proceedings of the Twenty-Second International Conference on Translating and the Computer. London: Aslib/IMI.
Macklovitch E & Russell G (2002). ‘What’s been forgotten in translation memory.’ In White J (ed.) Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas. Berlin: Springer, LNAI 1934. 137–146.
Melby A (1982). ‘Multi-level translation aids in a distributed system.’ In Horecky J (ed.) Proceedings of the Ninth International Conference on Computational Linguistics (COLING 82). Amsterdam: North-Holland. 215–220.
Mihalcea R & Pedersen T (eds.) (2003). ‘Building and using parallel texts: data driven machine translation and beyond.’ HLT-NAACL 2003 Workshop. Edmonton, Alberta, Canada.
Simard M, Foster G & Perrault F (1993). ‘TransSearch: a bilingual concordance tool.’ Technical report, Centre for Information Technology Innovation. Laval, Canada. Available online at: http://rali.iro.umontreal.ca.
Véronis J (ed.) (2000). Parallel text processing. Dordrecht: Kluwer Academic Publishers.

Machine Translation: Overview

P Isabelle and G Foster, National Research Council of Canada, Gatineau, Quebec, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The term machine translation (MT) is used to refer to any process in which a machine performs a translation operation between two ordinary human languages: the source language (SL) and the target language (TL). This is the sense that appears in the following proposition:

Machine translation is cheaper than human translation but less reliable.

The term can also designate the product of such a process, as in: Reading machine translations is not very pleasant but it is a convenient way to get the gist of foreign-language Web pages.

Finally, machine translation can also refer to the study of methods and techniques that render machines capable of producing (better) translations, as in the title of the present article. The term applies equally well to written or spoken language, but there is a tendency to prefer the terms speech translation or speech-to-speech translation for referring specifically to machine translation of spoken language. Translation is a very effective way of helping people communicate across the linguistic barriers. Unfortunately, human translation is expensive enough that it cannot constitute a practical solution to the everyday needs of ordinary people. As the price of human translation is unlikely to fall substantially, machine translation constitutes our best hope of making translation affordable for all.

The idea of using machines to translate is very old, but it was only around 1950, with the advent of digital computers, that serious work could really start. The initial enthusiasm led many to believe that good-quality machine translation was just around the corner, but they were wrong. After some 50 years of research, we can affirm that, barring an unexpected breakthrough, machines will not be able to compete with human translators in the foreseeable future. This prediction applies not only to difficult material such as literary works, but also to all but the very simplest and repetitive texts (e.g., weather reports). The present article explains why this is so. We start from a simplistic conception of machine translation and show where it breaks down. Then we show how computational linguists tried to fix the problems through increasingly elaborate approaches. But these more elaborate approaches turn out to raise their own additional problems.

Why Machine Translation Is a Difficult Problem

Let’s assume a very naïve theory: translating between human languages is just a matter of looking up the words of the source text in a bilingual dictionary. The next 12 subsections examine the many ways that simplistic theory breaks down and show why translation requires: (a) a fine-grained understanding of the source text; (b) contextual knowledge that makes it possible to fill information gaps between the SL and the TL; and (c) a detailed knowledge of the grammar of the TL.

Segmenting Texts into Words

Before a dictionary search can be performed, the text needs to be segmented into a sequence of individual

Weaver W (1955). ‘Translation.’ In Machine translation of languages. Cambridge, MA: MIT Press.
Yngve V (1957). ‘A Framework for Syntactic Translation.’ Mechanical Translation 4(3), 59–65; reprinted in Nirenburg S, Somers H & Wilks Y (eds.) (2003). Readings in machine translation. MIT Press. 39–44.
Zarechnak M (1959). ‘Three levels of linguistic analysis.’ Journal of the Association for Computing Machinery 6(1), 24–32.

Relevant Website

http://www.ldc.upenn.edu – Linguistic Data Consortium.

Macro-Jê

E R Ribeiro, Museu Antropológico, Universidade Federal de Goiás, Goiás, Brazil
© 2006 Elsevier Ltd. All rights reserved.

The Macro-Jê stock comprises the Jê family and a number of possibly related language families, all of which are located in Brazil. Macro-Jê is arguably one of the lesser-known language groups of South America, its very existence as a genetic unit being still “a working hypothesis” (Rodrigues, 1999: 165). According to Rodrigues (1986, 1999), whose classification is the most widely accepted among researchers working on Brazilian languages, the ‘Macro-Jê hypothesis’ comprises 12 different language families: Jê, Kamakã, Maxakalí, Krenák, Purí, Karirí, Yatê, Karajá, Ofayé, Boróro, Guató, and Rikbaktsá. The existence of Jê as a language family has been recognized since early classifications of South American languages (Martius, 1867). ‘Jê’ is a Portuguese spelling for a Northern Jê collective morpheme ([je] in Apinajé, for instance) that occurs in the names of several Jê-speaking peoples. The term ‘Macro-Jê’ was coined by Mason (1950), replacing earlier labels, such as ‘Tapuya’ and ‘Tapuya-Jê.’

Comparative Evidence

Recent classifications (Rodrigues, 1986; Greenberg, 1987; Kaufman, 1994) differ as to the precise scope of Macro-Jê, although there is agreement on the inclusion of most of the families (Table 1). Except for Karirí (included only by Rodrigues), Greenberg and Kaufman included all the families listed above. In addition, Greenberg included Chiquitano (also included by Kaufman), Jabutí, and Otí. Given the lack of comprehensive comparative studies, the Macro-Jê status of some of these families is still an open question. Although Guató is included in the stock by all of the aforementioned classifications, a case for its inclusion has yet to be made, beyond the superficial, inconclusive evidence presented so far (Rodrigues, 1986, 1999). On the other hand, a preliminary comparison has revealed compelling evidence for the inclusion of the Jabutí family into the Macro-Jê stock (Voort and Ribeiro, 2004), thus corroborating a hypothesis suggested in the 1930s by ethnographer Curt Nimuendaju (Nimuendaju, 2000: 219–221). Greenberg’s main piece of evidence for the inclusion of Chiquitano was the entire set of singular personal prefixes (Greenberg, 1987: 44), which are strikingly similar to the ones found in several Macro-Jê families; convincing lexical evidence, however, has not been presented thus far. As for Otí, a poorly documented language once spoken in southern Brazil, the meager available data do not support its inclusion in the Macro-Jê stock.

The only family-level reconstruction available is Davis (1966), for Proto-Jê. So far, lexical comparative evidence supporting the inclusion of individual families in the Macro-Jê stock has been presented for Kamakã (Loukotka, 1932), Maxakalí (Loukotka, 1931, 1939; Davis, 1968), Purí (Loukotka, 1937), Boróro (Guérios, 1939), Krenák (Loukotka, 1955; Seki, 2002), Karajá (Davis, 1968), Ofayé (Gudschinsky, 1971), Rikbaktsá (Boswood, 1973), and Jabutí (Voort and Ribeiro, 2004). In addition, some studies have shown very suggestive cases of morphological idiosyncrasies shared by Jê, Boróro, Maxakalí, Karirí, Karajá, and Ofayé (Rodrigues, 1992, 2000b). Thus, although the inclusion of many of the families into the Macro-Jê stock is being further corroborated by additional research, for others (namely Guató, Chiquitano, and Yatê) the hypothesis has yet to be systematically tested. The precise relationship among the suggested members of the stock also remains to be worked out.

Table 1 The Macro-Jê hypothesis (a)

1. Jê
   †Jeikó
   Northern Jê: Panará, Suyá, Kayapó, Timbíra (Parkatêjê, Pykobjê, etc.), Apinajé
   Central Jê: Xavánte, Xerénte, †Akroá-Mirim, †Xakriabá
   Southern Jê: Kaingáng, Xokléng, †Ingaín
2. Kamakã: †Kamakã, †Mongoyó, †Menién, †Kotoxó, †Masakará
3. Maxakalí: Maxakalí, †Pataxó, †Kapoxó, †Monoxó, †Makoní, †Malalí
4. Krenák: Krenák (Botocudo, Borúm)
5. Purí (Coroado): †Coroado, †Purí, †Koropó
6. Ofayé: Ofayé
7. Rikbaktsá: Rikbaktsá
8. Boróro: Boróro, †Umutína, †Otúke
9. Karajá: Karajá (including four dialects, Southern Karajá, Northern Karajá, Javaé, and Xambioá)
10. Karirí: †Kipeá, †Dzubukuá, †Pedra Branca, †Sabuyá (included by Rodrigues but not Greenberg or Kaufman)
11. Jabutí: Djeoromitxí (Jabutí), Arikapú (included by Greenberg but not Rodrigues or Kaufman)
12. Yatê: Yatê
13. Guató: Guató
14. Chiquitano: Chiquitano (Besiro) (included by Greenberg and Kaufman, but not Rodrigues)
15. Otí: †Otí (Eo-Xavánte) (the inclusion of Otí, proposed only by Greenberg, is not substantiated by the available data)

(a) Extinct languages are indicated by †. Based on Greenberg, 1987; Rodrigues, 1986, 1999; Kaufman, 1994.

Long-Range Affiliations

Greenberg (1987) suggested that Macro-Jê would be related to his Macro-Pano and Macro-Carib stocks, as part of a Jê-Pano-Carib branch of ‘Amerind.’ However, as Rodrigues (2000a) pointed out, Greenberg’s purported evidence does not withstand careful examination. Rodrigues (1985, 2000a) proposed instead a relationship between Tupí, Carib, and Macro-Jê, noting grammatical and lexical similarities among the three language groups (especially between Carib and Tupí). Davis (1968) also mentioned a few lexical similarities between Proto-Jê and Proto-Tupí. Although the evidence presented so far suggests that Rodrigues’s proposal is more plausible than Greenberg’s, any hypothesis of distant genetic relationship at such a level must be considered with caution. Considering that the precise boundaries of Macro-Jê are still uncertain, much more research at the family and stock levels needs to be conducted before such long-range classifications can be proposed on solid scientific grounds.

Location

All Macro-Jê languages are spoken in Brazilian territory, although in the past Otúke (Boróro) and Ingaín (Southern Jê), both now extinct, were spoken in Bolivia and Argentina, respectively. Chiquitano, listed as a Macro-Jê language by Greenberg (1987) and Kaufman (1994), is also spoken in Bolivia, as well as in Mato Grosso, Brazil. Although the Jabutí languages and Rikbaktsá are spoken in the southern fringes of the Amazon (Rondônia and northern Mato Grosso, respectively), the overall distribution of Macro-Jê languages is typically non-Amazonian. Yatê, Krenák, and Maxakalí languages are spoken in eastern Brazil, the same having been the case of Purí, Kamakã, and Karirí (all now extinct). Central and Northern Jê tribes, as well as the Boróro and the Ofayé, traditionally occupy the savanna areas of central Brazil. The southernmost Macro-Jê languages are those belonging to the southern branch of the Jê family, spreading from São Paulo to Rio Grande do Sul. Karajá is spoken along the Araguaia River, in central Brazil. The traditional Guató territory is the Paraguay River, near the Bolivian border.

Since several purported Macro-Jê languages were spoken in eastern Brazil, a number of them became extinct early on, under the impact of European colonization. Yatê is a remarkable exception, being the only surviving indigenous language in the Brazilian northeast. Whereas Guató, Rikbaktsá, Karajá, Krenák, and Ofayé are all single-member families (Table 1), the Jê family has a relatively large number of members, for most of which a fair amount of descriptive material is now becoming available (mostly as graduate theses and dissertations in Brazilian universities). Ofayé has around a dozen speakers, although it is mistakenly listed as extinct by some sources (including earlier editions of Ethnologue). Boróro and Maxakalí are the only surviving languages of their respective families. All the languages of the Kamakã, Purí, and Karirí families are now extinct. While documentation on Kamakã and Purí languages consists only of brief wordlists, the Karirí languages Kipeá and Dzubukuá were documented in catechisms (Mamiani, 1698; Bernardo de Nantes, 1709; respectively) and, for Kipeá, a grammar (Mamiani, 1699) – the only published grammar of a non-Tupí language from colonial Brazil. Thus, among the extinct Macro-Jê families, Karirí is the only one for which detailed grammatical information is available. Many of the languages included in the Macro-Jê stock are seriously endangered (Guató, Ofayé, Krenák, and Arikapú are especially so).

Characteristics

When compared with languages of other lowland South American families (such as Carib and Tupí-Guaraní), Macro-Jê languages typically present larger vowel inventories. For instance, Davis (1966) reconstructed, for Proto-Jê, a system of nine oral and six nasal vowels, as well as 11 consonants. Syllabic patterns are rather simple, obstruent clusters being uncommon. Stress is generally predictable. Phonologically contrastive tone oppositions occur in Yatê and Guató (Palácio, 2004). Processes such as nasal spreading and vowel harmony are generally absent. An exception is Karajá, which presents advanced tongue root vowel harmony, a rare phenomenon among South American languages (Ribeiro, 2002a). Another remarkable feature of Karajá is the existence of systematic differences between male and female speech. Female speech is more conservative, male speech being characterized, in general, by the deletion of a velar stop occurring in the corresponding female speech form (as a result of consonant deletion, vowel assimilation and fusion may also occur). This is a very productive process, applying even to loanwords (Table 2).

Table 2 Female versus male speech distinctions in Karajá. Columns: female speech, male speech, gloss. Items glossed: ‘wood’, ‘I’, ‘armadillo’, ‘sand’, ‘night’, ‘river’, ‘3rd person pronoun’, ‘to buy’ (from Portuguese comprar), ‘coffee’ (from Portuguese café), ‘firearm’ (from Língua Geral). (The phonetic forms are not recoverable from this copy.)

Most Macro-Jê languages have a relatively simple morphology. In most languages (including those of the Jabutí, Karirí, Krenák, Jê, Ofayé, and Maxakalí families), productive inflectional morphology is limited to person marking, the same paradigms being generally shared by nouns, verbs, and adpositions alike. Tense and aspect distinctions are generally conveyed by particles and auxiliaries rather than by inflections (with few apparent exceptions, such as Yatê; cf. Costa, 2004). Noun incorporation is rare, having been reported for a few Northern Jê languages, such as Panará (which also presents postposition incorporation; cf. Dourado, 2002). In languages with a more robust morphology, such as Karajá, Guató, and Yatê, inflectional morphology tends to be more complex with verbs than with nouns. In Karajá, for example, the verb form includes subject-agreement, voice (transitive, passive, and antipassive), and directional markers (‘thither’ versus ‘hither’), which can be used with evidential purposes (Ribeiro, 2002b); on the other hand, the only category for which nouns inflect is possession (as in most Macro-Jê languages).

The majority of the purported Macro-Jê languages are verb final, with postpositions instead of prepositions and possessor-possessed order in genitive constructions (the exceptions being Guató, Chiquitano, and Karirí). Macro-Jê languages seemingly lack the adjective as an independent part of speech, with adjectival meanings being expressed by nouns or descriptive verbs. Oliveira (2003) offered an in-depth discussion of the properties displayed by ‘descriptives’ in a particular Macro-Jê language, Apinajé, illustrating well the issues involved in determining part-of-speech membership in languages in which most inflectional properties tend to be shared by nouns, verbs, and adpositions. In attributive constructions, descriptives follow the word they modify.

Languages such as Maxakalí, Karirí, and Panará are described as being predominantly ergative. In addition, a number of Jê languages are described as presenting an ergative split of some sort. That is the case of Xokléng (Urban, 1985) and Northern Jê languages such as Kayapó (Silva and Salanova, 2000) and Apinajé. Among the latter, however, ergativity seems to be rather epiphenomenal, being found only in constructions involving nominalized verbs (such as relative clauses; cf. Oliveira, 2003). Syntactic ergativity is rarely found in Macro-Jê, with the exception of Karirí, in which all grammatical criteria (verb inflection, relativization, switch-reference, word order) point to the absolutive argument (S/O) as being the syntactic pivot (Larsen, 1984).

Further Reading

For information on the main literature on Macro-Jê languages, including an overview of their phonological and grammatical characteristics and a short list of possible Macro-Jê cognate sets, see Rodrigues (1999). Proceedings of recent conferences (the ‘Encontros Macro-Jê,’ which have been taking place periodically since 2001) help to provide an updated picture of Macro-Jê scholarship; the proceedings of the first two meetings were published as Santos and Pontes (2002) and D’Angelis (2004), respectively. Population figures for all Macro-Jê groups (including those now monolingual in Portuguese) can be found in Ricardo (2001).

See also: Brazil: Language Situation; Cariban Languages; Tupian Languages.

Bibliography

Bernardo de Nantes R P Fr. (1709). Katecismo indico da lingua Kariris. Lisbon: Valentim da Costa. [Facsimile reproduction by Platzmann J (ed.). Catecismo da lingua Kariris. Leipzig: Teubner, 1896.]
Boswood J (1973). ‘Evidências para a inclusão do Aripaktsá no filo Macro-Jê.’ In Bridgeman L I (ed.) Série Lingüística, 1. Brasília: SIL. 67–78.
Costa J da (2004). ‘Morfologia do verbo em Yaathe.’ In D’Angelis (org.). 149–161.
D’Angelis W da R (org.) (2004). LIAMES (Línguas Indígenas Americanas) 4. Special Issue, Proceedings of the 2nd ‘Encontro de Pesquisadores de Línguas Jê e Macro-Jê,’ Campinas, May 2002.
Davis I (1966). ‘Comparative Jê phonology.’ Estudos Lingüísticos: Revista Brasileira de Lingüística Teórica e Aplicada 1, 2.10–24.
Davis I (1968). ‘Some Macro-Jê relationships.’ International Journal of American Linguistics 34, 42–47.
Dourado L (2002). ‘Construções aplicativas em Panará.’ DELTA 18, 203–231.
Greenberg J H (1987). Language in the Americas. Stanford: Stanford University Press.
Gudschinsky S C (1971). ‘Ofayé-Xavante, a Jê language.’ In Gudschinsky S C (ed.) Estudos sôbre línguas e culturas indígenas. Brasília: SIL. 1–16.
Guérios R F M (1939). ‘O nexo lingüístico Bororo-Merrime-Caiapó.’ Revista do Círculo de Estudos Bandeirantes 2, 61–74.
Kaufman T (1994). ‘The native languages of South America.’ In Asher R E & Moseley C (eds.) Atlas of the world’s languages. London: Routledge. 46–76; 14–25 (maps).
Larsen T W (1984). ‘Case marking and subjecthood in Kipeá, Kirirí.’ Berkeley Linguistics Society Proceedings 10, 189–205.
Loukotka C (1931). ‘La familia lingüística Mašakali.’ Revista del Instituto de Etnología de la Universidad Nacional de Tucumán 2, 21–47.
Loukotka C (1932). ‘La familia lingüística Kamakan del Brasil.’ Revista del Instituto de Etnología de la Universidad Nacional de Tucumán 2, 493–524.
Loukotka C (1937). ‘La familia lingüística Coroado.’ Journal de la Société des Américanistes de Paris n.s. 29, 157–214.
Loukotka C (1939). ‘A língua dos Patachos.’ Revista do Arquivo Municipal de São Paulo 55, 5–15.
Loukotka C (1955). ‘Les indiens Botocudo et leur langue.’ Lingua Posnaniensis 5, 112–135.
Mamiani L V (1698). Catecismo da doutrina Christãa na lingua brasilica da nação Kiriri. Lisbon: Miguel Deslandes. [Facsimile: Garcia R (ed.). Catecismo Kiriri. Rio de Janeiro: Imprensa Nacional, 1942.]
Mamiani L V (1699). Arte de grammatica da lingua brasilica da naçam Kiriri. Lisbon: Miguel Deslandes. [2nd edn.: Rio de Janeiro: Bibliotheca Nacional, 1877.]
Martius K F P von (1867). Beiträge zur Ethnographie und Sprachenkunde Amerikas zumal Brasiliens (2 vols). Leipzig: Friedrich Fleischer.
Mason J A (1950). ‘The languages of South American Indians.’ In Steward J H (ed.) Handbook of South American Indians, vol. 6. Washington, DC: Smithsonian Institution. 157–317.
Nimuendaju C (2000). Cartas do Sertão, de Curt Nimuendajú para Carlos Estevão de Oliveira. Lisbon: Museu Nacional de Etnologia/Assírio & Alvim.
Oliveira C C de (2003). ‘Lexical categories and the status of descriptives in Apinajé.’ International Journal of American Linguistics 69, 243–274.
Palácio A (2004). ‘Alguns aspectos da língua Guató.’ In D’Angelis (org.). 163–170.
Ribeiro E R (2002a). ‘Directionality in vowel harmony: the case of Karajá.’ Berkeley Linguistics Society Proceedings 28, 475–485.
Ribeiro E R (2002b). ‘Direction in Karajá.’ In Fernández Z E & Ciscomani R M O (eds.) Memorias, VI Encuentro de Lingüística en el Noroeste, vol. 3. Hermosillo: Editorial UniSon. 39–58.
Ricardo C A (ed.) (2001). Povos indígenas no Brasil: 1996/2000. São Paulo: Instituto Socioambiental.
Rodrigues A (1985). ‘Evidence for Tupí-Carib relationships.’ In Klein H E & Stark L R (eds.) South American Indian languages: retrospect and prospect. Austin: University of Texas Press. 371–404.
Rodrigues A (1986). Línguas brasileiras: para o conhecimento das línguas indígenas. São Paulo: Edições Loyola.
Rodrigues A (1992). ‘Um marcador Macro-Jê de posse alienável.’ In Anais da 44a. Reunião Anual da SBPC. São Paulo: SBPC. 386.
Rodrigues A (1999). ‘Macro-Jê.’ In Dixon R M W & Aikhenvald A (eds.) The Amazonian languages. Cambridge: Cambridge University Press. 164–206.
Rodrigues A (2000a). ‘Ge-Pano-Carib versus Jê-Tupí-Karib: sobre relaciones lingüísticas prehistóricas en Sudamérica.’ In Miranda L (ed.) Actas del I Congreso de Lenguas Indígenas de Sudamérica (tomo I). Lima: Universidad Ricardo Palma. 95–104.
Rodrigues A (2000b). ‘Flexão relacional no tronco Macro-Jê.’ Boletim da Associação Brasileira de Lingüística 25, 219–231.
Santos L dos & Pontes I (eds.) (2002). Línguas Jê: estudos vários. Londrina: Editora UEL.
Seki L (2002). ‘O Krenak (Botocudo/Borum) e as línguas Jê.’ In Santos & Pontes (eds.). 15–40.
Silva M A R & Salanova A (2000). ‘Verbo y ergatividad escindida en Mebêngôkre.’ In Voort H van der & Kerke S van der (eds.) Indigenous Languages of Lowland South America. Leiden: University of Leiden. 225–242.
Urban G (1985). ‘Ergativity and accusativity in Shokleng (Gê).’ International Journal of American Linguistics 51, 164–187.
Voort H van der & Ribeiro E R (2004). ‘The westernmost branch of Macro-Jê.’ Paper presented at the 2nd Workshop ‘Linguistic Prehistory in South America,’ September 4–10. University of Oregon, Department of Linguistics.

Macrostructure

M M Louwerse and A C Graesser, University of Memphis, Memphis, TN, USA
© 2006 Elsevier Ltd. All rights reserved.

Macrostructures are structures that organize texts globally, just as microstructures organize them locally. Given that texts are not just concatenations of sentences, texts need to be structured both locally (connections between clauses and sentences) and globally (larger fragments of discourse, e.g., paragraphs). Syntactic rules, the meaning of the words of the sentence, and general heuristics of discourse form the microstructure of a text. These microstructures organize sets of interrelated propositional representations of the phrases, clauses, and sentences of the text. Macrorules translate these sequences of propositions into a smaller set of more general propositions by deleting propositions that are less important for the overall meaning of the text, by generalizing propositions into supersets and by constructing new text units that replace the meaning of the old set. Macrorules operate recursively, so that macrostructures that are formed by macrorules may be subject to another cycle of macrorules, further generalizing the gist of the text. Macrostructures are therefore abstract semantic descriptions of the semantic content of the text, similar to the text’s global meaning and theme and providing global coherence.

The term ‘macrostructure’ for global principles of text organization was first proposed by Bierwisch in 1965 for narrative structures in literary texts. In 1968, Harris discussed a similar idea of global text structuring. In that same year the Morphology of the folktale by Vladimir Propp was translated from Russian (original 1928). Propp argued that Russian fairy tales share a particular narrative structure. Around that same time, narratologists such as Greimas (1966), Bremond (1964), Labov and Waletzky (1967), Lévi-Strauss (1960), and Todorov (1971) proposed similar narrative grammars. According to these grammars, stories are like sentences in that their narrative structures are structures analogous to syntactic structures. The development of the concept of macrostructures should be seen against the background of the developments of narrative structures.

‘Macrostructures’ became an established term in text linguistics after being further developed by Van Dijk (1972). Van Dijk’s text linguistic approach was very much based on theoretical linguistics and Chomsky’s (1957, 1965) generative-transformational grammar. Chomsky (a student of Harris) argued that sentences have a recursive capacity. Each sentence has a deep structure that is interpreted by the semantic component of the grammar. Syntactic transformations relate the deep structure to the surface structure of the sentence. Some of the elementary transformations consist of adjoining, moving, deleting, and copying constituents. Van Dijk argued that text grammars also have these deep and surface structures. The equivalent to the sentential surface structures are microstructures; the equivalent to the sentential deep structures are macrostructures. As with the sentential surface structures, microstructures have underlying rules to represent the underlying semantic representation of the sentences. As with the sentential deep structures, macrostructures have an abstract semantic character and are specified by macrosemantic rules operating on the microstructures. Although the concept of macrostructures remained the same, the direct link to sentence structures faded in later work (Van Dijk, 1977, 1980).

Around the time of Van Dijk’s introduction of text grammars in (text) linguistics and poetics, Kintsch (1974) argued that cognitive psychology should not focus on isolated sentences only, but must focus on texts. Kintsch proposed that the representation of texts in memory is a network of interrelated propositions. These propositions are units of meaning roughly corresponding to phrases or clauses. A proposition consists of a predicate that modifies one or more arguments.
A sentence like The teacher explained the concept to the students can be represented propositionally as (EXPLAIN, TEACHER, CONCEPT, STUDENT). It contains one predicate (EXPLAIN) and three arguments (TEACHER, CONCEPT, STUDENT). Arguments are generally the nouns of a clause, but can also be prepositional phrases and

van der (eds.) Indigenous Languages of Lowland South America. Leiden: University of Leiden. 225–242.
Urban G (1985). ‘Ergativity and accusativity in Shokleng (Gê).’ International Journal of American Linguistics 51, 164–187.
Voort H van der & Ribeiro E R (2004). ‘The westernmost branch of Macro-Jê.’ Paper presented at the 2nd Workshop ‘Linguistic Prehistory in South America,’ September 4–10. University of Oregon, Department of Linguistics.

Macrostructure M M Louwerse and A C Graesser, University of Memphis, Memphis, TN, USA © 2006 Elsevier Ltd. All rights reserved.

Macrostructures are structures that organize texts globally, just as microstructures organize them locally. Because texts are not mere concatenations of sentences, they need to be structured both locally (connections between clauses and sentences) and globally (larger fragments of discourse, e.g., paragraphs). Syntactic rules, the meanings of the words in a sentence, and general heuristics of discourse form the microstructure of a text. These microstructures organize sets of interrelated propositional representations of the phrases, clauses, and sentences of the text. Macrorules translate these sequences of propositions into a smaller set of more general propositions by deleting propositions that are less important for the overall meaning of the text, by generalizing propositions into supersets, and by constructing new text units that replace the meaning of the old set. Macrorules operate recursively, so that macrostructures formed by macrorules may be subject to another cycle of macrorules, further generalizing the gist of the text. Macrostructures are therefore abstract descriptions of the semantic content of the text, similar to the text’s global meaning and theme, and they provide global coherence.

The term ‘macrostructure’ for global principles of text organization was first proposed by Bierwisch in 1965 for narrative structures in literary texts. In 1968, Harris discussed a similar idea of global text structuring. In that same year, Vladimir Propp’s Morphology of the folktale (original 1928) was translated from Russian. Propp argued that Russian fairy tales share a particular narrative structure. Around the same time, narratologists such as Greimas (1966), Bremond (1964), Labov and Waletzky (1967), Lévi-Strauss (1960), and Todorov (1971) proposed similar narrative grammars. According to these grammars, stories are like sentences in that their narrative structures are analogous to syntactic structures. The development of the concept of macrostructures should be seen against the background of these developments in the study of narrative structures.

‘Macrostructures’ became an established term in text linguistics after being further developed by Van Dijk (1972). Van Dijk’s text linguistic approach was very much based on theoretical linguistics and Chomsky’s (1957, 1965) generative-transformational grammar. Chomsky (a student of Harris) argued that sentences have a recursive capacity. Each sentence has a deep structure that is interpreted by the semantic component of the grammar. Syntactic transformations relate the deep structure to the surface structure of the sentence. Some of the elementary transformations consist of adjoining, moving, deleting, and copying constituents. Van Dijk argued that text grammars also have these deep and surface structures. The equivalents of the sentential surface structures are microstructures; the equivalents of the sentential deep structures are macrostructures. As with the sentential surface structures, microstructures are governed by rules that express the underlying semantic representation of the sentences. As with the sentential deep structures, macrostructures have an abstract semantic character and are specified by macrosemantic rules operating on the microstructures. Although the concept of macrostructures remained the same, the direct link to sentence structures faded in later work (Van Dijk, 1977, 1980).

Around the time of Van Dijk’s introduction of text grammars in (text) linguistics and poetics, Kintsch (1974) argued that cognitive psychology should not focus on isolated sentences only, but must focus on texts. Kintsch proposed that the representation of texts in memory is a network of interrelated propositions. These propositions are units of meaning roughly corresponding to phrases or clauses. A proposition consists of a predicate that modifies one or more arguments.
A sentence like ‘The teacher explained the concept to the students’ can be represented propositionally as (EXPLAIN, TEACHER, CONCEPT, STUDENT). It contains one predicate (EXPLAIN) and three arguments (TEACHER, CONCEPT, STUDENT). Arguments are generally the nouns of a clause, but can also be prepositional phrases and even references to other propositions. For instance, the second and third propositions in a text conjoined by the connective ‘or’ can be represented as (OR, 2, 3).

Kintsch’s psychological ideas on texts and representations of meaning in memory and Van Dijk’s linguistic ideas on text grammars gave an impetus to research on discourse comprehension, first resulting in Kintsch and Van Dijk’s (1978) influential model of text comprehension. The goal of the model is to explain coherence in text. According to the model, the aim of text comprehension is the formation of micro- and macrostructures. In the comprehension process, meaningful text units are transformed into propositions. Take, for instance, the following extract of a text and its propositional representation:

A number of interesting soccer matches between European teams took place in the early summer days of 2004.

(1) (NUMBER, MATCHES)
(2) (INTERESTING, MATCHES)
(3) (SOCCER, MATCHES)
(4) (BETWEEN, MATCHES, EUROPEAN TEAMS)
(5) (TIME: IN, MATCHES, SUMMER)
(6) (EARLY, SUMMER)
(7) (TIME: IN, SUMMER, 2004)
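The predicate-and-argument notation introduced above can be mirrored in a small data structure. The following Python sketch is only an illustration: the class name Proposition and its field names are assumptions made here and are not part of the notation used by Kintsch or Van Dijk. An argument is either a concept label or the number of another proposition, which covers cases such as (OR, 2, 3).

```python
from dataclasses import dataclass
from typing import Tuple, Union

# An argument is a concept label (str) or the number of another proposition (int).
Argument = Union[str, int]


@dataclass(frozen=True)
class Proposition:
    """Hypothetical container for one proposition: a predicate plus its arguments."""
    predicate: str
    arguments: Tuple[Argument, ...]

    def __str__(self) -> str:
        # Render in the parenthesized notation used in the article.
        parts = [self.predicate] + [str(arg) for arg in self.arguments]
        return "(" + ", ".join(parts) + ")"


# 'The teacher explained the concept to the students'
explain = Proposition("EXPLAIN", ("TEACHER", "CONCEPT", "STUDENT"))
# Propositions 2 and 3 conjoined by the connective 'or'
either = Proposition("OR", (2, 3))

print(explain)  # (EXPLAIN, TEACHER, CONCEPT, STUDENT)
print(either)   # (OR, 2, 3)
```

Keeping the arguments in an ordered tuple preserves the order in which they appear in the notation, while the integer arguments make references to other propositions explicit.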

Coherence is achieved by an overlap of the arguments in these propositions. The comprehender links the arguments of propositions in order to form a coherence graph. In Figure 1, Proposition 4 is linked to Propositions 1, 2, 3, and 5 by the argument MATCHES. Similarly, Propositions 6 and 7 are linked to 5 by the argument SUMMER. Figure 1 is the representation of the microstructure of the text. However, there are working memory constraints to linking propositions. According to the model, only a limited number of propositions can stay in one processing cycle. The processing cycle is a temporal period during which propositions are linked. Propositions that stay in multiple processing cycles, and stay longer in working memory, are more memorable. Propositions enter working memory when they are selected by a leading-edge strategy, which favors recent propositions and/or those higher in the hierarchy. Imagine that the processing cycle could hold only four propositions. In this example, according to the leading-edge strategy, Propositions 4, 5, 7, and 3 would move on to the next processing cycle.

Figure 1 Example of coherence graph of text extract.

The macrorules deletion (those propositions whose deletions do not change the meaning of the text are deleted), generalization (sequences of propositions that can be replaced by supersets are generalized), and construction (sequences of propositions that can be replaced by a single proposition are constructed into a single proposition) are carried out on the microstructure to obtain the macrostructure. The above extract is too short to provide a complete analysis, but one can imagine the following macrorules being applied to the microstructure:

Deletion: (INTERESTING, MATCHES) & (BETWEEN, MATCHES, EUROPEAN TEAMS) → (BETWEEN, MATCHES, EUROPEAN TEAMS)
Generalization: (TIME: IN, SUMMER, 2004) → (IN, 2004)
Construction: (INTERESTING, MATCHES) & (BETWEEN, MATCHES, EUROPEAN TEAMS) → (PLAY, EUROPEAN TEAMS, SOCCER)
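The argument-overlap criterion behind the coherence graph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of the Kintsch and Van Dijk (1978) model: the tuple layout and the function names overlap_links and neighbours are invented here, and the sketch links every pair of propositions that shares an argument. It therefore produces more links than the hierarchical graph described for Figure 1 (Propositions 1 and 2, for example, also share MATCHES), and it ignores processing cycles and the leading-edge strategy.

```python
from itertools import combinations

# The seven propositions of the soccer extract, keyed by their number in the text.
# Each value is (predicate, arguments); this layout is an assumption of the sketch.
PROPOSITIONS = {
    1: ("NUMBER", ("MATCHES",)),
    2: ("INTERESTING", ("MATCHES",)),
    3: ("SOCCER", ("MATCHES",)),
    4: ("BETWEEN", ("MATCHES", "EUROPEAN TEAMS")),
    5: ("TIME: IN", ("MATCHES", "SUMMER")),
    6: ("EARLY", ("SUMMER",)),
    7: ("TIME: IN", ("SUMMER", "2004")),
}


def overlap_links(propositions):
    """Map each pair of proposition numbers to the arguments the pair shares."""
    links = {}
    for (i, (_, args_i)), (j, (_, args_j)) in combinations(sorted(propositions.items()), 2):
        shared = set(args_i) & set(args_j)
        if shared:
            links[frozenset((i, j))] = shared
    return links


def neighbours(links, index):
    """Return the numbers of all propositions linked to the given proposition."""
    return {other for pair in links if index in pair for other in pair if other != index}


links = overlap_links(PROPOSITIONS)
print(sorted(neighbours(links, 4)))  # [1, 2, 3, 5], all via MATCHES
print(sorted(neighbours(links, 6)))  # [5, 7], both via SUMMER
```

The output for Proposition 4 matches the links named in the text, and Propositions 6 and 7 connect to 5 through SUMMER. Turning this flat set of links into a hierarchy would require choosing a superordinate proposition, which the sketch does not attempt.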

The long-term recall of the text is based on the propositions in the macrostructure of the text. In the above example, a macroproposition like (PLAY, EUROPEAN TEAMS, SOCCER) is a good approximation of the gist of the text, could be used as the title for the text, and is likely to be remembered best.

Microstructures and macrostructures as discussed thus far can be seen as rules and structures of text grammars. Van Dijk and Kintsch (1983) added to these abstract semantic representations of the text the flexibility and fallibility of the language user, thereby moving away from the text itself to knowledge–text interaction. One of the problems with the Kintsch and Van Dijk (1978) model and related theories is that text representation is based entirely on the text. The role of the reader’s knowledge and of the strategies the reader uses in the comprehension process is limited. In addition to a multileveled propositional text base, Van Dijk and Kintsch argue for a situation model that incorporates the reader’s interpretations – both correct and incorrect – of the text. The situation model can, for instance, explain individual differences in interpretations of texts, differences in the meaning of translations, and the fallibility and flexibility of memory. It also allows the text to be grounded. In the formation of macrostructures, sociocultural contexts (for instance, cultural information regarding the situation and the type of interaction) play an important role. In addition to the textual macrostructures, Van Dijk and Kintsch (1983) therefore proposed pragmatic superstructures, schematic structures similar to the rhetorical structure of the text. Examples of such superstructures are narrative schemas and argumentative schemas. Knowing the superstructure of a text allows the reader to anticipate the global organization (macrostructure) of the text.

Kintsch’s construction–integration model (1988, 1998) continued and extended the Kintsch and Van Dijk (1978) and Van Dijk and Kintsch (1983) process model. In the construction–integration model, the process from text to mental representation consists of two phases. In the construction phase, an approximate mental model is locally constructed based on the text and the reader’s background knowledge. In the integration phase, this tentative mental model stabilizes by filtering out irrelevant and redundant information. An activation process spreads around the network of propositions, boosting strong links between propositions and dampening weak links in order to obtain a well-structured mental representation. The representation resulting from the integration process is then stored in long-term memory. As with the Kintsch and Van Dijk (1978) and Van Dijk and Kintsch (1983) models, the distinction between microstructure and macrostructure is orthogonal to the distinction between text base and situation model. The text yields a text base. The text base is integrated with the background knowledge of the reader and yields the situation model. This means that two global features characterize the mental representation of a text: its macrostructure and the situational and world knowledge.

Various studies have shown that the macrostructure can significantly predict recall and summarization results. Those propositions that end up in the macrostructure are remembered better than propositions in the microstructure (Graesser, 1981; Kintsch and Keenan, 1973; Kintsch et al., 1975; McKoon and Ratcliff, 1980; Meyer, 1975).

Because macrostructures are higher level properties of sequences of propositions, it is hard to identify linguistic manifestations of macropropositions and their macrostructures at the surface level of the text. However, some indicators of macropropositions can be identified. Titles of texts, summaries, and topical sentences, often at the beginning or end of a paragraph, generally indicate macropropositions. Furthermore, given the properties of coherence graphs, clauses that tend to be conjoined by connectives are good candidates for macropropositions, as are sentences that contain demonstratives and pronouns.
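The integration phase of the construction–integration model described above, in which activation spreads through the network of propositions until it settles, can be illustrated with a toy computation. The Python sketch below is written loosely in the spirit of that description and is not Kintsch’s implementation: the function name integrate, the hand-set link weights, the scaling by the maximum value, and the stopping rule are all assumptions chosen for illustration.

```python
def integrate(weights, max_cycles=100, tolerance=1e-6):
    """Spread activation over a small network until the values stop changing.

    weights[a][b] is an assumed, hand-set link strength between nodes a and b.
    Returns a dict of activation values scaled so that the largest is 1.0.
    """
    nodes = sorted(weights)
    activation = {node: 1.0 for node in nodes}
    for _ in range(max_cycles):
        # Each node collects activation from every node it is linked to.
        updated = {
            node: max(0.0, sum(weights[node].get(other, 0.0) * activation[other]
                               for other in nodes))
            for node in nodes
        }
        peak = max(updated.values())
        if peak == 0.0:
            return updated
        # Rescale so that values stay comparable from cycle to cycle.
        updated = {node: value / peak for node, value in updated.items()}
        change = max(abs(updated[node] - activation[node]) for node in nodes)
        activation = updated
        if change < tolerance:
            break
    return activation


# Toy network: P1 and P2 are strongly linked; P3 is only weakly attached to P2.
WEIGHTS = {
    "P1": {"P1": 1.0, "P2": 0.9},
    "P2": {"P1": 0.9, "P2": 1.0, "P3": 0.1},
    "P3": {"P2": 0.1, "P3": 1.0},
}

final = integrate(WEIGHTS)
for node in sorted(final, key=final.get, reverse=True):
    print(node, round(final[node], 2))
```

In this toy network the weakly attached node P3 ends with a much lower activation than P1 and P2, which is the kind of dampening of weak links that the integration phase is meant to achieve.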

Propositional macrostructures cannot be computed automatically from the text. One of the reasons for this is that the macrorules that can be applied to the microstructure remain underspecified. Kintsch (1998, 2002) has, however, shown how macrostructures can be derived from the text computationally using latent semantic analysis. The meaning of a sentence is represented as a vector in a high-dimensional semantic space. Those vectors that relate most to the overall text (and have the highest typicality scores) can be identified as macropropositions.

See also: Coherence: Psycholinguistic Approach; Computer-Supported Writing; Discourse Processing; Latent Semantic Analysis; Narrative: Linguistic and Structural Theories; Propositions; Psycholinguistics: Overview; Reading Processes in Adults; Rhetorical Structure Theory; Speech Synthesis: Perception and Comprehension; Text and Text Analysis; Thematics.

Bibliography
Bierwisch M (1965). ‘Poetik und Linguistik.’ In Kreuzer H & Gunzenhäuser R (eds.) Mathematik und Dichtung. Munich: Nymphenburger. 46–66.
Bremond C (1964). ‘Le message narratif.’ Communications 4, 4–32.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Graesser A C (1981). Prose comprehension beyond the word. New York: Springer Verlag.
Greimas A (1966). Sémantique structurale: Recherches de méthode. Paris: Larousse.
Harris Z S (1968). Mathematical structures of language: Interscience tracts in pure and applied mathematics 21. New York: John Wiley & Sons.
Kintsch W (1974). The representation of meaning in memory. Hillsdale, NJ: Erlbaum.
Kintsch W (1988). ‘The role of knowledge in discourse comprehension: A construction–integration model.’ Psychological Review 95, 163–182.
Kintsch W (1998). Comprehension: A paradigm for cognition. Cambridge: Cambridge University Press.
Kintsch W (2002). ‘On the notions of theme and topic in psychological process models of text comprehension.’ In Louwerse M & van Peer W (eds.) Thematics: Interdisciplinary studies. Amsterdam: Benjamins. 157–170.
Kintsch W & Keenan J M (1973). ‘Reading rate and retention as a function of the number of propositions in the base structure of sentences.’ Cognitive Psychology 5, 257–274.
Kintsch W & Van Dijk T A (1978). ‘Toward a model of text comprehension and production.’ Psychological Review 85, 363–394.
Kintsch W, Kozminsky E, Streby W, McKoon G & Keenan J M (1975). ‘Comprehension and recall of text as a function of content variables.’ Journal of Verbal Learning and Verbal Behavior 14, 196–214.
Labov W & Waletzky J (1967). ‘Narrative analysis: Oral versions of personal experience.’ In Helm J (ed.) Essays on the verbal and visual arts. Seattle: University of Washington Press. 12–44.
Lévi-Strauss C (1960). ‘La structure et la forme.’ Cahiers de l’Institut de Science Économique Appliquée 99, 3–36.
McKoon G & Ratcliff R (1980). ‘Priming in item recognition: The organization of propositions in memory for text.’ Journal of Verbal Learning and Verbal Behavior 19, 369–386.
Meyer B J F (1975). Organization of prose and its effect on memory. Amsterdam: North Holland.
Propp V (1968). Morphology of the folktale. Austin: University of Texas Press. [Original work published 1928.]
Todorov T (1971). Poétique de la prose. Paris: Seuil.
Van Dijk T A (1972). Some aspects of text grammars: A study in theoretical linguistics and poetics. The Hague: Mouton.
Van Dijk T A (1977). Text and context: Explorations in the semantics and pragmatics of discourse. London: Longman.
Van Dijk T A (1980). Macrostructures. Hillsdale, NJ: Erlbaum.
Van Dijk T A & Kintsch W (1983). Strategies of discourse comprehension. New York: Academic Press.

Madang Languages A Pawley†, Australian National University, Canberra, ACT, Australia © 2006 Elsevier Ltd. All rights reserved.

† Deceased

The Madang group, containing about 100 languages, is the largest well-defined branch of the Trans New Guinea (TNG) family, which dominates the large island of New Guinea (see Trans New Guinea Languages). The Madang subgroup occupies the central two-thirds of Madang Province in north central Papua New Guinea (see Figure 1). In the east the group’s immediate neighbors are languages of the Finisterre–Huon branch of TNG. In the high mountain valleys to the south lie the Goroka, Chimbu–Wahgi, and Engan branches and to the west are unrelated languages, members of the Lower Sepik–Ramu family. The most important innovations defining the Madang subgroup are the replacement of the Proto-TNG independent pronouns *na ‘1SG,’ *nga ‘2SG,’ and *ya ‘3SG’ by Proto-Madang *ya, *na and *nu, respectively (Ross, 2000). The Madang group probably broke up more than 5000 years ago, after diverging from its TNG relatives in the central highlands. This rough estimate of time depth is based chiefly on lexicostatistical agreements between languages belonging to different primary branches within Madang, which are of the order of 5–15%, lower than those between the major branches of Indo-European. The whole of Madang Province has an area smaller than the Netherlands but contains some 150 languages. Most members of the Madang group have between 500 and 2000 speakers and none has more than about 20 000. This extreme linguistic fragmentation reflects both the considerable time depth of the Madang subgroup and the fact that until the colonial era political units in New Guinea seldom exceeded a few hundred people.

The first written records of Madang languages were made in the 1870s but to this day most of the languages are documented only by word lists and sketchy grammatical notes (Z’graggen (1975b) gives a history of research and Carrington (1996) contains a near-exhaustive bibliography). The best-documented languages are probably Amele (Roberts, 1981, 1987, 1991), Kalam (Pawley, 1966, 1987, 1993; Lane, 1991; Pawley et al., 2000; Pawley and Bulmer, in press), and Kobon (Davies, 1980, 1981, 1985). There are detailed grammars of several other languages including Anamuxra (a.k.a. Ikundun or Anamgura) (Ingram, in press), Tauya (MacDonald, 1990) and Usan (Reesink, 1987).

Much of the published comparative work on Madang languages is due to John Z’graggen (1971, 1975a, 1975b, 1980a, 1980b, 1980c, 1980d). He posited a ‘Madang–Adelbert Range subphylum’ of 98 languages which corresponds closely to the Madang group as defined here, except that Kalam and Kobon (wrongly assigned by Z’graggen, following Wurm, 1975, to a putative East New Guinea Highlands microphylum) are now included in Madang and Isabi is now excluded (it belongs to the Goroka subgroup of TNG). Z’graggen also tentatively proposed an internal classification on typological and lexicostatistical grounds. Recent (and largely unpublished) comparative work using more classical subgrouping methods has led to various revisions in the subgrouping (Pawley, 1998; Pawley and Osmond, 1997; Ross, 2000). Five main branches can be distinguished based

Figure 1 Location of major subgroups of the Madang group.

on innovations in the pronouns (Ross, 2000) and other criteria:

• The Rai Coast group, consisting of about 30 languages, extends along the coastal lowlands from around the mouth of the Gogol River eastwards almost to the mouth of the Mot, and in places extends inland as far south as the Ramu River.
• The Croisilles group of some 50 languages subsumes the ‘Mabuso’ group and most languages of the ‘North Adelbert’ group proposed by Z’graggen (1975). Croisilles languages occupy the central Madang coast from the Gogol River north almost as far as Bogia, and cover much of the hinterland west and north of Madang town.
• The South Adelbert group contains 14 languages. Twelve are centered in the South Adelbert Range north of the Ramu River. The other two, Gants and Faita, are spoken south of the Ramu in separate pockets in or close to the Bismarck Range.
• Waskia and Korak, spoken on Karkar Island and on the coast just west of this, form another group.
• A fifth group consists of Kalam and Kobon (each a chain of diverse dialects), spoken around the junction of the Bismarck and Schrader Ranges where Madang Province meets Western Highlands Province.

Structural Characteristics of Madang Languages

Phonology

A good many Madang languages have syllables of the shape (C)V and (word finally) CVC, five vowels, and between 15 and 20 consonants, including series of nasals and oral and prenasalized (or voiceless and voiced) obstruents with contrasts at bilabial, apical, and velar (and often palatal) positions. Members of the South Adelbert and Kalam–Kobon groups resemble unrelated languages of the Sepik and Lower Sepik–Ramu families in making heavy use of a high central or mid central vowel which in some contexts is nonphonemic, being epenthetically inserted between consonants (Biggs, 1963; Pawley, 1966; Ingram, in press).

Grammar

The preferred order of constituents in verbal clauses is SOV but OVS often occurs as a marked structure.

Adpositions follow the verb, determiners and possessors follow the noun. Case marking is generally absent or little developed. Most languages organize pronominal affixes to show a nominative–accusative/ dative contrast. Common nouns are an open class but there are several closed classes of nominal roots such as kinship terms and locatives. In Kalam and Kobon, verb roots are a small closed class of about 130 members but in most Madang languages they are more numerous and probably form an open class. Minor word classes include adjectives, adverb roots, and verbal adjuncts (see below). Many Madang languages distinguish singular, dual and plural independent pronouns in three persons. The dual and plural forms are usually distinguished by a suffix. Morphology is chiefly suffixal. In certain Madang languages, especially in the west, nouns show considerable morphological complexity, including classifying and case-marking suffixes. In others noun morphology is simple but generally kinship nouns take bound possessor pronouns. Sentence-final verbs are typically inflected for tense-aspect-mood and for subject agreement. In some languages transitive verbs also carry a pronominal prefix or proclitic marking object agreement. Dependent verbs in nonfinal clauses are typically marked for relative tense and subject or topic identity with the final verb. All languages make extensive use of at least one of the following kinds of complex (multiheaded) predicates: (i) in verbal adjunct constructions, a verb, usually carrying a rather general meaning such as ‘make,’ ‘hit,’ or ‘go,’ occurs in partnership with a noninflecting base (the adjunct), which carries more specific meaning; (ii) in serial verb constructions two or more bare verb roots occur in sequence to express a tightly integrated sequence of subevents. Kalam and Kobon allow up to eight or nine verb roots to occur in a single predicate phrase. 
In constructions denoting uncontrolled bodily and mental processes (e.g., sweating, sneezing, bleeding, feeling sick) a noun denoting bodily condition is, arguably, the subject. The experiencer is generally marked by an object/dative pronoun and is the direct object. Long chains of clauses are commonly used to report a sequence of past events that make up a single episode. Generally, little use is made of conjunctions to show sequential, conditional, and causal relations. Instead, the main verb in each nonfinal clause carries a suffix which indicates (i) whether the event denoted by the medial verb occurs prior to or simultaneous with that of the final verb, and (ii) whether that verb has the same subject or topic as the next clause. Paragraphlike boundaries are frequently marked by head-to-tail linkage, in which the last clause of the previous sentence is repeated, to begin a new episode.

See also: Papua New Guinea: Language Situation; Papuan Languages; Trans New Guinea Languages.

Bibliography Biggs B (1963). ‘A non-phonemic vowel type in Karam, a ‘‘pygmy’’ language of the Schrader Mountains, central New Guinea.’ Anthropological Linguistics 5(4), 13–17. Carrington L (1996). A linguistic bibliography of New Guinea. Canberra: Pacific Linguistics. Davies J (1980). Kobon phonology. Canberra: Pacific Linguistics. Davies J (1981). Kobon. Amsterdam: North Holland. Davies J (1985). Kobon dictionary. MS. Ukarumpa, Papua New Guinea: Summer Institute of Linguistics. Foley W A (1986). The Papuan languages of New Guinea. Cambridge: Cambridge University Press. Ingram A (in press). Anamuxra: a language of Madang Province, Papua New Guinea. Canberra: Pacific Linguistics. Lane J (1991). Kalam serial verb constructions. M.A. thesis, University of Auckland. MacDonald L (1990). A grammar of Tauya. Berlin: Mouton de Gruyter. Pawley A (1966). The structure of Kalam: a grammar of a New Guinea Highlands language. Ph.D. thesis, University of Auckland. Pawley A (1987). ‘Encoding events in Kalam and English: different logics for reporting experience.’ In Tomlin R (ed.) Coherence and grounding in discourse. Amsterdam: John Benjamins. 329–360. Pawley A (1993). ‘A language that defies description by ordinary means.’ In Foley W (ed.) The role of theory in language description. Berlin: de Gruyter. 87–130. Pawley A (1998). A neogrammarian in New Guinea: searching for sound correspondences in the Middle Ramu. Printout. Department of Linguistics, Research School of Pacific and Asian Studies, Australian National University. Pawley A & Bulmer R (in press). A dictionary of Kalam with ethnographic notes. Canberra: Pacific Linguistics. Pawley A, Gi S P, Kias J & Majnep I S (2000). ‘Hunger acts on me: the grammar and semantics of bodily and mental processes in Kalam.’ In de Guzman V & Bender B (eds.) Grammatical analysis in morphology, syntax, and semantics: studies in honor of Stanley Starosta. Honolulu: University of Hawai’i Press. 153–185. Pawley A & Osmond M (1997). 
Proto Madang lexical reconstructions. Research School of Pacific and Asian Studies, Department of Linguistics, Australian National University.

Reesink G (1987). Structures and their functions in Usan, a Papuan language of Papua New Guinea. Amsterdam: John Benjamins. Roberts J (1981). Amele dictionary. MS. Ukarumpa, Papua New Guinea: Summer Institute of Linguistics. Roberts J (1987). Amele. London: Croom Helm. Roberts J (1991). ‘A study of the dialects of Amele.’ Language and Linguistics in Melanesia 22, 67–125. Ross M (2000). A preliminary subgrouping of the Madang languages based on pronouns. Printout. Research School of Pacific and Asian Studies, Department of Linguistics, Australian National University. Wurm S A (ed.) (1975). New Guinea area languages 3: Papuan languages and the New Guinea linguistic scene. Canberra: Pacific Linguistics. Z’graggen J (1971). Classificatory and typological studies in languages of the Madang district. Canberra: Pacific Linguistics.

Z’graggen J (1975a). ‘The Madang–Adelbert Range subphylum.’ In Wurm (ed.) 569–612. Z’graggen J (1975b). The languages of the Madang district, Papua New Guinea. Canberra: Pacific Linguistics. Z’graggen J (1980a). A comparative word list of the Rai coast languages, Madang Province, Papua New Guinea. Canberra: Pacific Linguistics. Z’graggen J (1980b). A comparative word list of the Northern Adelbert Range languages, Madang Province, Papua New Guinea. Canberra: Pacific Linguistics. Z’graggen J (1980c). A comparative word list of the Mabuso languages, Madang Province, Papua New Guinea. Canberra: Pacific Linguistics. Z’graggen J (1980d). A comparative word list of the Southern Adelbert Range languages, Madang Province, Papua New Guinea. Canberra: Pacific Linguistics.

Madsen Aarhus, Jacob (1538–1586) A R Linn, University of Sheffield, Sheffield, UK © 2006 Elsevier Ltd. All rights reserved.

Jacob Madsen from Århus in Jutland in the west of Denmark is remembered in linguistics for his pioneering study of general phonetic principles, De literis libri duo of 1586. However, like several of the great general linguists of the 16th and 17th centuries, such as Bishop John Wilkins in England, his language work was subsidiary to his principal employment and responsibilities. After eight years studying in Germany, Madsen returned to Denmark in 1574 to take up the professorship of Latin at the university in Copenhagen, to which post Greek was added the following year. In 1580 he became professor of theology, and in the last two years of his life he was rector of the university. He had earlier served as university librarian as well. He left a number of philosophical works as well as linguistic works, but much of the linguistic opus is known only from its titles (Dictionarium Latino-Danicum, Observationes Danicæ, Grammaticæ Latino-Danicæ, Etymologiæ, Syntaxis, De cognitione lingvæ) since his papers were destroyed by fire in the university library in 1728. His lexical and grammatical work prefigures that of the flourishing community of scholars of Danish of the following century, which includes Peder Syv and Erik Pontoppidan.

The first of Madsen’s Two books on the letters is subtitled De vera literarum doctrina. As was usual at the time, he did not distinguish between the spoken and written elements of the litera. After an exposition of the anatomy of the vocal organs, he worked through the vowels and consonants in Danish and the three classical languages: Latin, Greek, and Hebrew. This first book concludes with a diagram of his phonetic system, showing it to be built on a series of binary oppositions that force him, among other things, to analyze vowels as either ‘tongue vowels’ or ‘lip vowels.’ The second book, De diversæ doctrinæ incommodis, deals with a series of problems in the orthography and phonology of various modern and classical languages. In discussing the question of diphthongs and triphthongs in Romance languages, Madsen showed himself to have a good knowledge of recent linguistic work and to be a particular admirer of Petrus Ramus. The importance of De literis libri duo in the history of linguistics is that it is the first extant attempt to deal with general, rather than language-specific, phonetics, and it also provides a useful source of information about 16th-century Danish, not least that of Jutland. See also: Phonetics: Precursors to Modern Approaches; Ramus, Petrus (1515–1572); Wilkins, John (1614–1672).



Bibliography Elert C-C (1996). ‘Studiet av ljudläran i Norden fram till 1900.’ In Henriksen C, Hovdhaugen E, Karlsson F & Sigurd B (eds.) Studies in the development of linguistics in Denmark, Finland, Iceland, Norway and Sweden. Oslo: Novus Forlag. 11–30. Hens H A (1984). ‘Jacob Madsen Aarhus.’ In Cedergreen Bech S (ed.) Dansk biografisk leksikon 16: Woldbye–Aastrup. København: Gyldendal. 240–241. Hovdhaugen E, Karlsson F, Henriksen C & Sigurd B (2000). The history of linguistics in the Nordic countries. Helsinki: Societas Scientiarum Fennica. Møller C & Skautrup P (eds.) (1930). Acta jutlandica II,3/III,1: De literis libri duo [. . .] mit einer dänischen Übersetzung nebst einer Abhandlung über Text und Quellen von Franz Blatt. Aarhus: Trykt i stiftsbogtrykkeriet.

Madurese B Nothofer, Universität Frankfurt, Frankfurt, Germany © 2006 Elsevier Ltd. All rights reserved.

Madurese is the third most widely spoken regional language of Indonesia (after Javanese and Sundanese) and the fourth most widely spoken language in the Austronesian language family (after Malay/Indonesian, Javanese, and Sundanese). There are more than 13 million speakers from the island of Madura, from neighboring islands (Kangean Archipelago, Bawean, and Sapudi Islands), and from the northern parts of East Java that were settled by immigrants from infertile Madura. Java has the largest number of Madurese speakers (more than 6 million).

Stevens (1968) distinguished two main dialect groups, Maduran and Kangean. Within the Maduran group there are three subgroups: West Madurese, with Bawean and Bangkalan dialects; Central Madurese, with Pamekasan and Sampang dialects; and East Madurese, with Sumenep and Sapudi dialects. The Sumenep dialect is regarded as standard Madurese. The Madurese dialects of East Java vary according to the origins of the speakers.

Madurese is a member of the Western Malayo-Polynesian subfamily, which includes the languages of western Indonesia and the Philippines. Lexicostatistically, Madurese appears to be most closely related to Malay; it is related to a somewhat lesser degree to the major languages of Java, i.e., Sundanese and Javanese (see Dyen, 1965; Nothofer, 1975). So far, however, no satisfactory qualitative evidence has been adduced that may contribute to solving the question of the relationship of Madurese to its neighboring languages.

Madurese shares with Javanese, Sundanese, Balinese, and Sasak (spoken on Lombok) the existence of speech levels, which serve to indicate the social relationships of the discourse participants. Meanings for which there are low and high forms are mostly connected with human beings and refer above all to body parts, body actions, clothing, and personal belongings. Pronouns also have status forms. It is generally assumed that speech levels represent a Javanese innovation and that this system and its higher forms are borrowed from Javanese.

Madurese Phonology

Consonants

The Madurese consonant repertoire resembles that of other western Indonesian languages. However, Madurese has a three-way contrast among voiceless, voiced, and aspirated stop series. Stevens (1968) characterized these consonant series as follows: (1) voiceless stop, ‘voiceless, tense stops’, (2) voiced stop, ‘voiced, lax stop’, and (3) aspirated stop, ‘voiceless stop with indifferent tension followed by strong aspiration’. Clynes (1995) suggested that the aspirated stop should rather be described as ‘lax voice’ or ‘whispery voiced’. Only Madurese and Javanese oral stops exhibit five places of articulation, both sharing a phonemic distinction between dental and retroflex (described by Stevens (1968) as ‘alveolar stop with larger area of tongue contact than dentals’) consonants. Another unusual feature of Madurese is the existence of phonemic consonant gemination. All consonants with the exception of the glottal stop also occur geminated. The consonants occurring in the inherited lexicon of Madurese are shown in Table 1 (mainly based on Clynes (1995) and Davies (1999a)).

Table 1 Consonants in the inherited lexicon of Madurese [a]

               Labial   Dental   Retroflex   Palatal   Velar   Glottal
Stop
  Voiceless    p        t        ṭ           c         k       ʔ
  Voiced       b        d        ḍ           j         g
  Aspirated    bh       dh       ḍh          jh        gh
Nasal          m        n                    ɲ         ŋ
Fricative               s
Approximant    w        r, l                 y

[a] Based on Clynes (1995) and Davies (1999a).

Vowels

Table 2 Vowels in the inherited lexicon of Madurese

        Front             Central           Back
High    /i/ ([i], [e])                      /u/ ([u], [o])
Mid                       /ə/ ([ɨ], [ə])
Low                       /a/ ([A], [a])

Madurese inherited vocabulary has four vowel phonemes, each with a high and a low allophone (Stevens, 1968; Clynes, 1995; Davies, 1999a). The phoneme /i/ is realized as [i] or [e], /u/ as [u] or [o], /ə/ as [ɨ] or [ə], and /a/ as [A] or [a] (see Table 2). In order to account for vowel allophony, Stevens (1968) established the following three categories of Madurese consonants: DH, voiced and aspirated stops; DL, voiceless stops, nasals, and intervocalic /s/; and DN, liquids, glides, /ʔ/, and morpheme-initial and final /s/. The low vowel allophones occur after DL consonants, in word-initial position, and after immediately preceding low vowels. The high vowel allophones occur following DH consonants and after immediately preceding high vowels. The DN consonants do not affect the quality of the vowel: the vowel preceding a DN consonant determines the quality of the following vowel. If a vowel occurs after a word-initial DN consonant, this vowel behaves as though it is word-initial. Madurese vowel harmony results in verb forms with vowels that differ depending on whether they occur in a bare stem or in an active verb in which the initial consonant of the stem is replaced by a homorganic nasal (the prefix N- ‘active’). Examples are [melle] ‘active.buy’ vs. [billi] ‘buy’, and [napaʔ] ‘active.arrive at’ vs. [ḍApAʔ] ‘arrive at’ (Davies, 1999a).
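The allophony rule described above can be read as a small state machine that tracks vowel height across a word. The following Python sketch is an illustration only, not an implementation taken from Stevens (1968) or Davies (1999a). It assumes that words are supplied as lists of phoneme symbols, writes the central vowel as ə with allophones ɨ and ə, keeps the printed symbol A for the high allophone of /a/, and approximates ‘morpheme-initial and final /s/’ by treating any /s/ that is not between two vowels as a DN consonant. The names surface, active, and the lookup tables are hypothetical, and active handles only stop-initial stems.

```python
# Consonant classes after Stevens (1968), as summarized in the text.
DH = {"b", "d", "ḍ", "j", "g", "bh", "dh", "ḍh", "jh", "gh"}  # voiced and aspirated stops
DL = {"p", "t", "ṭ", "c", "k", "m", "n", "ɲ", "ŋ"}           # voiceless stops and nasals
# Every other consonant (liquids, glides, glottal stop) is treated as DN.

# phoneme -> (high allophone, low allophone); "A" is the symbol as printed.
ALLOPHONES = {"i": ("i", "e"), "u": ("u", "o"), "ə": ("ɨ", "ə"), "a": ("A", "a")}

# Homorganic nasal for each stem-initial stop (prefix N- 'active').
# Only b -> m and ḍ -> n are attested in the text; the rest is an assumption.
NASAL = {"p": "m", "b": "m", "bh": "m",
         "t": "n", "d": "n", "dh": "n", "ṭ": "n", "ḍ": "n", "ḍh": "n",
         "c": "ɲ", "j": "ɲ", "jh": "ɲ",
         "k": "ŋ", "g": "ŋ", "gh": "ŋ"}


def _intervocalic(phonemes, k):
    """True if the segment at index k stands between two vowels."""
    return (0 < k < len(phonemes) - 1
            and phonemes[k - 1] in ALLOPHONES
            and phonemes[k + 1] in ALLOPHONES)


def surface(phonemes):
    """Return the surface string for a list of phoneme symbols."""
    out = []
    high = False  # word-initial position patterns with the low allophones
    for k, seg in enumerate(phonemes):
        if seg in ALLOPHONES:
            hi, lo = ALLOPHONES[seg]
            out.append(hi if high else lo)  # a vowel keeps the current height
        else:
            if seg in DH:
                high = True
            elif seg in DL or (seg == "s" and _intervocalic(phonemes, k)):
                high = False
            # DN consonants leave the height set by the preceding vowel
            out.append(seg)
    return "".join(out)


def active(stem):
    """Replace a stem-initial stop by its homorganic nasal (other stems unchanged)."""
    return [NASAL.get(stem[0], stem[0])] + list(stem[1:])
```

Taking the printed stem at face value as /billi/, `surface(["b", "i", "l", "l", "i"])` gives `billi`, and `surface(active(["b", "i", "l", "l", "i"]))` gives `melle`, the pair cited in the text for ‘buy’.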

Morphology

The major morphological processes are affixation and reduplication (see Stevens, 1968). The verbal affixes include prefixes such as a- (‘perform action indicated by root; perform action on oneself; to own, have, or use’), ta- (‘to do unintentionally’), pa- (‘causative’), and ka- (‘agentless passive’). The prefix N- marks intransitive verbs (with a meaning such as ‘agentless action; reflexive action; be like; be in location’) and transitive ‘active’ verbs, whereas ‘passive’ verbs are marked by i-. A verbal circumfix is ka–an ‘be affected by’. Verbal suffixes are -a (‘future, conditional, wished for, possible’), -aghi (‘treat like, use object as instrument, perform action with, perform action for, make the object be’), and -i (‘plural, causative’). Nominal affixes include pa- (which derives action nouns from intransitive verbs), paN- (‘agent, instrument, result of action’), pa–an (‘location, agent, instrument’), and -an (‘result of action, that which is affected by action, location of action’). There are three kinds of reduplication: reduplication of the final syllable, total reduplication, and reduplication of the first syllable. The usual meanings with verbs are ‘repetition or frequency of action; no specified object or goal’; with nouns the usual meanings are ‘plural; groups of objects; instrument used to perform action’.

Writing System

Madurese used to be written in a script derived from the Javanese script (hanacaraka), which originated from the Pallava script of southern India. Today, Latin orthography is common. See also: Austronesian Languages: Overview; Indonesia: Language Situation; Javanese; Malay.

Bibliography Blust R A (1981). ‘The reconstruction of Proto-Malayo-Javanic: an appreciation.’ Bijdragen tot de Taal-, Land- en Volkenkunde 137, 456–469. Clynes A (1995). ‘Madurese.’ In Tryon D T (ed.) Comparative Austronesian dictionary. An introduction to Austronesian studies, (5 vols): part 1: fascicle 1. Berlin, New York: Mouton de Gruyter. 485–494. Cohn A C & Lockwood K (1994). ‘A phonetic description of Madurese and its phonological implications.’ Working Papers of the Cornell Phonetics Laboratory 9, 67–92. Davies W D (1999a). Madurese. Languages of the world/materials 184. München and Newcastle: LINCOM. Davies W D (1999b). ‘Madurese and Javanese as strict word order languages.’ Oceanic Linguistics 38, 152–167. Dyen I (1965). A lexicostatistical classification of the Austronesian languages. IUPAL memoir 19; supplement to IJAL 25. Baltimore: Waverly Press. Kiliaan H N (1897). Madoereesche spraakkunst (2 vols). Batavia (2nd edition, Semarang: Benjamins, 1911). Kiliaan H N (1904). Madoereesch-Nederlandsch woordenboek. Leiden: Brill. Nothofer B (1975). The reconstruction of Proto-Malayo-Javanic. Verhandelingen van het Koninklijk Instituut voor Taal-, Land- en Volkenkunde 73. The Hague: Nijhoff. Penninga P & Hendriks H (1913). Practisch Madoereesch-Hollandsch woordenboek. Semarang: Van Dorp. Stevens A M (1965). ‘Language levels in Madurese.’ Language 41, 294–302. Stevens A M (1968). Madurese phonology and morphology. American Oriental Series 52. New Haven: American Oriental Society.

Madvig, Johann Nicolai (1804–1886) D Haug, University of Oslo, Oslo, Norway © 2006 Elsevier Ltd. All rights reserved.

Madvig was a Danish classical philologist and politician. After short but intensive studies of Greek and Latin at the University of Copenhagen, he was appointed lektor (‘reader’) of classical philology there in 1828. After completing his doctorate on an ancient commentary on Cicero, he was appointed professor of Latin in 1829, and from 1851 he was professor of classical philology, a post he held until his retirement in 1879. He was also an active politician between 1848 and 1874 and served as Minister of Cultural Affairs from 1848 to 1851, during which time he reformed the Danish educational system.

Madvig is best known for his work as a text editor, especially of Cicero, and as a conjectural critic. But he also published two very successful grammars of Latin and Greek, which were translated into many languages, and throughout his life he was interested in and worked on questions of general linguistics.

Madvig maintained that no language was in itself superior to any other. To his mind, the importance of the classical languages lay in the role they played in a crucial period in the evolution of European culture, not in any intrinsic superiority of Greek or Latin. He was critical of any attempt to view language systems as reflections of the thought and ideas of the peoples who speak them, thus opposing an idea that was influential in speculative grammar as shaped by the German romantics, notably by Wilhelm von Humboldt. Instead, he focused on language as a means of communication, succinctly stating (Madvig, 1971: 157): “Die Sprache heißt der redende, verstanden sein wollende Mensch.” This opinion was at odds with the view of language as an independent organism that was predominant for much of the 19th century.

As a classical philologist, he was particularly interested in syntax. He opposed the dominance of etymology in the linguistics of his time and stressed the legitimacy and the importance of a synchronic approach to language. These views brought him into conflict with the neogrammarians, as can be seen from Brugmann’s rather condescending review of his linguistic articles (Literarisches Centralblatt, 1876). According to Madvig (1971: 86), the origin of language cannot contradict the presence and life of language, its form of existence. Thus, it can be argued that Madvig introduced uniformitarianism in linguistics, a credit that is usually given to Whitney. In general, Madvig’s linguistic theories are close to Whitney’s, as he discovered to his dismay on the publication of Whitney’s Language and the study of language (London, 1867). He himself had published his linguistic papers in Danish, and they remained virtually unknown until they were published in German in 1875, i.e., after Whitney’s book. See also: Whitney, William Dwight (1827–1894).

Bibliography Hauger B (1994). Johan Nicolai Madvig: the language theory of a classical philologist. Münster: Nodus Publikationen. Madvig J N (1841). Latinsk sproglære til skolebrug. Kjøbenhavn: O. Siesbye. [First English translation: Latin grammar for schools, Oxford: J. H. Parker, 1849.] Madvig J N (1846). Græsk ordføiningslære, især for den attiske sprogform. Kjøbenhavn: C. A. Reitzel. [First English translation: Greek Syntax, London: Rivingtons, 1853.] Madvig J N (1971). Sprachtheoretische abhandlungen. Kopenhagen: Munksgaard. Jensen P J (1981). J. N. Madvig. Nicolet A (trans.). Odense: Odense University Press. Spang-Hanssen E (1966). J. N. Madvig-bibliografi. København: Det kongelige bibliotek.

Maffei, Angelus Francis Xavier (1844–1899) M Almeida, Thomas Stephens Konknni Kendr, Goa, India © 2006 Elsevier Ltd. All rights reserved.

Angelo Maffei was born on November 12, 1844, at Pinzolo in the Rendena valley of the Alps, along the border between Italy and Austria. After classical studies at the gymnasium in Trent, he completed doctorates in philosophy and theology at the Gregorian University, Rome. After being ordained a priest he joined the Jesuit order in 1871 and taught philosophy and theology in Tyrol, Albania, and Dalmatia. In 1878 he came to India as one of the nine pioneering Jesuits of the new Mangalore mission. Maffei worked as an educationist and linguist. He died in 1899 in a remote mission station near Mangalore.

Maffei specialized in the Konkani language of Mangalore. In addition to Konkani, he learned English, Kannada, and Tulu, languages also spoken in Mangalore. In the opinion of his companions he had a flair for languages and a fantastic memory. He was learning Sanskrit, and even tried to learn Malayalam and Tamil when he was in Kerala for a short time.

In 1882 Maffei published his first Konkani grammar in English. Though he went about it quite scientifically, he was aware that it was somewhat tentative, as he gave corrections and improvements at the end of the book itself. Maffei’s second grammar of 1892, however, is far more compact and concise. The grammarian here was more orderly and sure about his presentation.

In 1883 Maffei published two Konkani dictionaries to help his confrères learn Konkani. The English–Konkani dictionary used the Roman alphabet with diacritics of the Lepsius system to write the Konkani words. The Konkani–English dictionary is smaller, and has the lexemes written in Kannada characters, followed by the Romanized version. In his two grammars and two dictionaries Maffei constantly kept the Konkani learner in mind; they are targeted at the learner who knows English. For this same reason he gave information on idioms, sayings, and proverbs current among Konkani speakers. His vade mecum for confessors, published in 1891, contains specialized Konkani vocabulary to help priests using Konkani. Maffei’s grammars and dictionaries are used even today by learners of Konkani in Karnataka state.

Bibliography Almeida M (1988). ‘Angelo Maffei and the Konkani language.’ Indica 25. Bombay: Heras Institute of Indian History and Culture. 143–150. Coelho J (1977). Restless for Christ. Series II. Mangalore: Karnataka Province. Maffei A F X (1882). A Konkani grammar. Mangalore: Basel Mission Book & Tract Depository. Maffei A F X (1883). An English–Konkani dictionary. (reprinted in 1983) New Delhi: Asian Educational Services. Maffei A F X (1883). A Konkani–English dictionary. (reprinted in 1983) New Delhi: Asian Educational Services. Maffei A F X (1891). The confessor’s Konkani vade mecum. Mangalore: Codialboil Press. Maffei A F X (1892). Konknni ranantlo sobit sundor tallo or a sweet voice from the Konkani desert. Mangalore: Codialboil Press. Moore J (1905). The history of the diocese of Mangalore. Mangalore: Codialboil Press. Rosario R (1981). ‘Jezuche sabhecho Anjelo Maffei.’ Amar Konkani, 1. Mangalore: Institute of Konkani. 14–17.

English translation: Greek Syntax, London: Rivingtons, 1853.] Madvig J N (1971). Sprachtheoretische abhandlungen. Kopenhagen: Munksgaard.

Jensen P J (1981). J. N. Madvig. Nicolet A (trans.). Odense: Odense University Press. Spang-Hanssen E (1966). J. N. Madvig-bibliografi. København: Det kongelige bibliotek.

Maffei, Angelus Francis Xavier (1844–1899) M Almeida, Thomas Stephens Konknni Kendr, Goa, India © 2006 Elsevier Ltd. All rights reserved.

Angelo Maffei was born on November 12, 1844 at Pinzolo in the Rendena valley of the Alps, along the border between Italy and Austria. After classical studies at the gymnasium in Trent, he completed doctorates in philosophy and in theology at the Gregorian University, Rome. After being ordained a priest he joined the Jesuit order in 1871, and taught philosophy and theology in Tyrol, Albania, and Dalmatia. In 1878 he came to India as one of the nine pioneering Jesuits of the new Mangalore mission. Maffei worked as an educationist and linguist. He died in 1899 in a remote mission station near Mangalore.

Maffei specialized in the Konkani language of Mangalore. In addition to Konkani, he learned English, Kannada, and Tulu, languages also spoken in Mangalore. In the opinion of his companions he had a flair for languages and a fantastic memory. He was learning Sanskrit, and even tried to learn Malayalam and Tamil when he was in Kerala for a short time.

In 1882 Maffei published his first Konkani grammar in English. Though he went about it quite scientifically, he was aware that it was somewhat tentative, as he gave corrections and improvements at the end of the book itself. Maffei’s second grammar of 1892, however, is far more compact and concise. The grammarian here was more orderly and sure about his presentation.

In 1883 Maffei published two Konkani dictionaries to help his confrères learn Konkani. The English–Konkani dictionary used the Roman alphabet with diacritics of the Lepsius system to write the Konkani words. The Konkani–English dictionary is smaller, and has the lexemes written in Kannada characters, followed by the Romanized version. In his two grammars and two dictionaries Maffei constantly kept the Konkani learner in mind; they are targeted at the learner who knows English. For this same reason he gave information on idioms, sayings, and proverbs current among Konkani speakers. His vade mecum for confessors, published in 1891, contains specialized Konkani vocabulary to help priests using Konkani. Maffei’s grammars and dictionaries are used even today by learners of Konkani in Karnataka state.

Bibliography
Almeida M (1988). ‘Angelo Maffei and the Konkani language.’ Indica 25. Bombay: Heras Institute of Indian History and Culture. 143–150.
Coelho J (1977). Restless for Christ. Series II. Mangalore: Karnataka Province.
Maffei A F X (1882). A Konkani grammar. Mangalore: Basel Mission Book & Tract Depository.
Maffei A F X (1883). An English–Konkani dictionary. (reprinted in 1983) New Delhi: Asian Educational Services.
Maffei A F X (1883). A Konkani–English dictionary. (reprinted in 1983) New Delhi: Asian Educational Services.
Maffei A F X (1891). The confessor’s Konkani vade mecum. Mangalore: Codialboil Press.
Maffei A F X (1892). Konknni ranantlo sobit sundor tallo or a sweet voice from the Konkani desert. Mangalore: Codialboil Press.
Moore J (1905). The history of the diocese of Mangalore. Mangalore: Codialboil Press.
Rosario R (1981). ‘Jezuche sabhecho Anjelo Maffei.’ Amar Konkani, 1. Mangalore: Institute of Konkani. 14–17.


Magnetoencephalography A C Papanicolaou, University of Texas – Health Science Center at Houston, Houston, TX, USA P G Simos, University of Crete, Crete, Greece S Sarkari, University of Texas – Health Science Center at Houston, Houston, TX, USA © 2006 Elsevier Ltd. All rights reserved.

Magnetoencephalography (MEG) is a completely noninvasive imaging method that allows investigation of cortical dynamics in real time, offering significant advantages over functional brain imaging techniques that rely on hemodynamic measures, such as positron emission tomography and functional magnetic resonance imaging (MRI). MEG involves the measurement of neuromagnetic signals emanating from the brain (for a detailed discussion, see Papanicolaou, 1998). Most of the magnetic activity measurable outside the head originates from the intracellular current flow associated with postsynaptic electrical events in the long apical dendrites of cortical pyramidal cells. Pyramidal neurons comprise nearly 70% of neocortical neurons, and their long dendritic processes are oriented perpendicular to the cortical surface. As current flows along the length of the dendrite, it forms an electric dipole: a pair of electric charges or magnetic poles of equal magnitude but opposite polarity, separated by a small distance. An advantage of MEG over electroencephalography (and event-related potentials) is that the neuromagnetic signal penetrates the skull and tissues without significant distortion.

Magnetic signals associated with bioelectrical activity are very weak, and special techniques have to be employed to discriminate them from extraneous magnetic fields (noise). The instrument used to measure the signals is called a neuromagnetometer; it is equipped with an array of magnetic sensors, each coupled to a special low-noise amplifier. State-of-the-art systems are equipped with a minimum of 200 magnetic sensors arranged to cover the entire head. The neuromagnetometer is placed over a person’s head, and recording takes place inside a magnetically shielded room designed to reduce extraneous magnetic fields.

During the MEG recording session, the phenomenon that exemplifies the function under investigation is repeated several times while the magnetic flux around the head is being sampled at regular intervals (typically every 4 ms for cognitive and language studies). An external stimulus is invariably presented at each instance in order either to induce the phenomenon (in the case of receptive language functions) or to act as a time cue (in the case of expressive language functions). Each segment of recorded activity, beginning a few milliseconds before and extending several hundred milliseconds after each repetition of the stimulus, is stored separately as an MEG epoch. One such epoch is stored for each magnetic sensor. The data are filtered and averaged to enhance quality and to remove components of the recorded magnetic flux caused by extraneous sources (mechanical and biological artifacts). Averaging involves computing the mean magnetic flux at each time point across epochs. The averaged magnetic response emerges as a waveform, or a time series of magnetic flux measurements, at each recording site around the surface of a person’s head, as shown in Figure 1. The resulting averaged event-related field consists of early components (50–200 ms after stimulus onset), which correspond to activation of the sensory cortex specific to the modality of the stimulus (visual or auditory), and late components (200 to ~800 ms after stimulus onset), which reflect activation of the association cortex or higher functions.

Figure 1 Left: Averaged event-related magnetic response to the spoken word ‘quiet.’ Stimulus onset (set at 0 ms) determines the onset of the response, which consists of an early (~50–200 ms) and a late portion (200–800 ms). Right: Surface distribution of magnetic flux in the form of a contour map measured at 500 ms after stimulus onset over the left hemisphere. Solid circles represent the array of magnetic sensors. ERF, event-related field.

The surface distribution of magnetic flux at various times can be reconstructed, resulting in a contour map like the one shown on the right-hand side of Figure 1. On the basis of the surface flux distribution, the position and strength of the brain source that produced it can be estimated. This final step consists of the application of a mathematical algorithm that considers the intracranial activity sources as equivalent to electrical current dipoles (Papanicolaou, 1998). Once estimates of the coordinates of the underlying electrical source in the MEG coordinate system are made, they are registered on a set of anatomical images of the brain (from MRI) in order to identify the anatomical location of each source. The procedures involved in reconstructing cortical activity maps based on MEG data are collectively referred to as magnetic source imaging.

Activity profiles consist of a spatial (anatomical) and a temporal component, providing information about where in the brain elevated levels of neurophysiological activity take place and when this activity occurs in relation to stimulus onset. Neurophysiological activation in each brain region where magnetic activity sources are found can be described using four complementary measures (see Figure 2).

Figure 2 Right: Axial MRI slice displaying activity sources computed during a visual word recognition task. Three distinct cortical areas showed transient increases in neurophysiological activity and were successfully modeled by a series of contiguous current dipoles, one every 4 ms (small white circles). Left: Plots of four complementary measures of the underlying activity in each region. The instantaneous strength of regional activity is indexed by two measures: current moment (Q, or estimated magnitude of the electrical current produced every 4 ms by the population of neurons that showed elevated levels of activity in a given region) and global field power (RMS), which is a direct measure of the magnitude of magnetic flux produced by that electrical current. The temporal aspect of activation is also indexed by two measures: onset latency, which corresponds to the earliest time after stimulus onset when magnetic activity was reliably detected in a given region, and duration of activity, indicated by the number of consecutive 4-ms magnetic activity sources localized in that region. Although correlations among Q, RMS, and duration-of-activity measures are high, duration of activity has the highest degree of empirically established concurrent validity (see text for more details). nA-m = nanoampere-meters; fT = femtotesla.

Figure 3 A coronal and a parasagittal section of a typical normal participant showing left hemisphere dominance for receptive language functions, and the location of Wernicke’s area in the left hemisphere. Each white circle represents one of several temporally contiguous brain activity sources that account for the late (>200 ms after stimulus onset) portions of the brain response to words presented auditorily. Dark circles indicate the location of late magnetic activity sources obtained in response to printed words. Note the overlap of the two activation maps associated with word recognition in the vicinity of Wernicke’s area (squares). Activity in Broca’s area is found mainly during the (silent) reading task (ellipsoid), presumably reflecting articulatory recoding processes.

Empirical evidence demonstrates that brain activation profiles obtained with MEG are reliable and valid representations of the brain mechanisms responsible for language functions. MEG affords unique insights into the inner workings of these mechanisms in real time, and this information is at least comparable to the data provided by traditional research methods in neurolinguistics, namely lesion studies, the Wada technique, and electrocortical stimulation mapping. Several studies establish the concurrent validity of MEG activation and data analysis protocols designed to examine the basic outline of the brain circuits responsible for receptive and expressive language functions.
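The epoching and averaging steps described earlier in this section can be illustrated with a minimal Python sketch. It is not the authors' processing pipeline: the function names, the 100 ms pre-stimulus and 800 ms post-stimulus windows, and the synthetic data are all assumptions made for illustration. Only the 4 ms sampling interval comes from the text. The last function computes a root mean square across sensors, one common way of summarizing field strength over time.

```python
import numpy as np

SAMPLE_INTERVAL_MS = 4  # sampling interval quoted in the text


def extract_epochs(recording, onsets, pre_ms=100, post_ms=800):
    """Cut one epoch per stimulus onset from a (sensors, samples) array.

    `onsets` holds stimulus onsets as sample indices. The window lengths
    are illustrative assumptions, not values taken from the article.
    """
    pre = pre_ms // SAMPLE_INTERVAL_MS
    post = post_ms // SAMPLE_INTERVAL_MS
    return np.stack([recording[:, s - pre:s + post] for s in onsets])


def average_epochs(epochs):
    """Average across epochs: (epochs, sensors, times) -> (sensors, times)."""
    return epochs.mean(axis=0)


def rms_across_sensors(evoked):
    """Root mean square of the averaged flux over sensors at each time point."""
    return np.sqrt((evoked ** 2).mean(axis=0))


# Synthetic demonstration: 4 sensors, 20 stimuli, identical response each time,
# buried in unit-variance noise.
rng = np.random.default_rng(0)
response = np.sin(np.linspace(0, np.pi, 200))      # 800 ms at 4 ms steps
onsets = np.arange(100, 100 + 20 * 300, 300)       # one stimulus every 1200 ms
recording = rng.normal(0.0, 1.0, size=(4, onsets[-1] + 300))
for s in onsets:
    recording[:, s:s + 200] += response

epochs = extract_epochs(recording, onsets)   # (20, 4, 225)
evoked = average_epochs(epochs)              # (4, 225)
rms = rms_across_sensors(evoked)             # (225,)
```

Because the noise is independent across repetitions while the response is time-locked to the stimulus, averaging leaves the response intact and shrinks the noise roughly in proportion to the square root of the number of epochs.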

Validity of Brain Activation Profiles

The main features of the cortical maps common to spoken and written word recognition and comprehension are shown in Figure 3. The first is the greater degree of activation of the left perisylvian region, attesting to the well-documented left hemisphere dominance for language in the overwhelming majority of neurologically intact individuals, both children and adults (between 90% and 95% across studies). These MEG-derived hemispheric asymmetries in the degree of regional activation are also in excellent agreement with the results of a standard invasive procedure used clinically to determine hemispheric dominance (the Wada test) in large consecutive patient series (Maestú et al., 2002; Papanicolaou et al., 2004). Interestingly, the closest agreement with the Wada technique is noted when measures of duration of neurophysiological activity are used. Measures of the strength of neurophysiological activity in each language-related region show hemispheric asymmetries in the same direction and approximate degree as duration but are not as good predictors of hemispheric dominance as determined by the Wada test (Papanicolaou et al., 2004).

A second invariable feature of profiles obtained in the context of receptive language tasks is activation of the posterior portion of the superior temporal gyrus, often extending into the banks of the superior temporal sulcus and the supramarginal gyrus. This finding is in agreement with prior knowledge (independent of functional imaging) regarding the location of Wernicke’s area, specialized for high-level analysis of verbal input (Boatman et al., 1995; Binder et al., 2000). Activation of this region is found during both listening and reading tasks. Activation of the middle temporal gyrus and medial temporal lobe structures is observed mainly when verbal stimuli, whether auditory or visual, are meaningful (pronounceable nonwords result in significantly reduced activation of these regions).

Direct tests of concurrent validity showed near perfect agreement between MEG-derived maps of language-specific activity within the dominant hemisphere and the results of direct electrocortical stimulation regarding the precise location and extent of receptive and expressive language cortex (Wernicke’s and Broca’s areas; Castillo et al., 2001; Simos et al., 1999). These results are important in demonstrating that electrical inactivation of cortical regions that reliably show increased levels of activity associated with a particular linguistic function significantly impairs that function. In addition to providing the means for external validation, this approach is generally very promising as an adjunct to any noninvasive functional imaging method but is difficult to implement, mainly because of time constraints and safety considerations associated with direct cortical stimulation studies (see for instance Simos, Breier et al., 2002).

Finally, performance of tasks that exemplify expressive language functions (overt or silent reading, picture naming, and word generation or fluency tasks) is invariably associated with activation in posterior prefrontal areas of the dominant hemisphere (in the vicinity of Broca’s area; Billingsley et al., 2004; Castillo et al., 2001; Kober et al., 2001; Salmelin et al., 1994). Typically, activity in Wernicke’s area is observed within the first 200–500 ms after stimulus onset, followed by activity in Broca’s area, which in turn precedes activity in motor cortices.
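The temporal measures discussed above (onset latency and duration of activity) and the comparison of the two hemispheres can be sketched as follows. This is a hedged illustration only. The article does not give a formula for quantifying hemispheric asymmetry, so the laterality index (L - R) / (L + R) used here is a conventional choice assumed for the example, and all function names and latency values are hypothetical.

```python
SAMPLE_INTERVAL_MS = 4  # one candidate activity source per 4 ms sample


def onset_latency_ms(source_times_ms):
    """Earliest latency (ms after stimulus onset) with a localized source."""
    return min(source_times_ms)


def duration_in_sources(source_times_ms, step_ms=SAMPLE_INTERVAL_MS):
    """Length of the longest run of temporally contiguous activity sources."""
    times = sorted(source_times_ms)
    if not times:
        return 0
    longest = run = 1
    for prev, cur in zip(times, times[1:]):
        run = run + 1 if cur - prev == step_ms else 1
        longest = max(longest, run)
    return longest


def laterality_index(left, right):
    """(L - R) / (L + R): +1 is fully left-sided, -1 fully right-sided.

    Conventional index assumed for illustration, not taken from the article.
    """
    return (left - right) / (left + right)


# Hypothetical late (>200 ms) source latencies in a temporal-lobe region.
left_times = list(range(300, 500, 4))          # 50 contiguous sources
right_times = [320, 324, 328, 332, 400, 404]   # longest contiguous run: 4

left_duration = duration_in_sources(left_times)
right_duration = duration_in_sources(right_times)
index = laterality_index(left_duration, right_duration)
```

With these made-up numbers the index is close to +0.85, that is, strongly left-lateralized. A real analysis would also need a criterion for which sources count as reliably localized, which this sketch leaves out.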

Spatial Resolution

While the temporal resolution of MEG is by definition adequate for mapping the temporal course of regional activation in real time, the degree of spatial (anatomical) resolution of the technique had to be empirically established. Spatial resolution of MEG is primarily constrained by the signal-to-noise ratio of the recordings, the suitability of the activity-source model used to convert the raw data into functional images, and finally the accuracy of the coregistration of these images onto the participant’s MRI of the brain. Under ideal conditions the localization accuracy for dipolar current sources is on the order of 4–5 mm (Kwon et al., 2002). Under realistic conditions, the estimated spatial accuracy of MEG is only slightly worse and therefore adequate for neurolinguistic investigations, as attested by the following facts. First, intraparticipant reproducibility for the anatomical locations of the sources of both early (50–200 ms) and late (200–800 ms) components of the brain magnetic responses originating in primary and association auditory cortices, respectively, is in the millimeter range (Roberts et al., 2000; Simos et al., in press). This figure is well within the spatial resolution limits of invasive brain-mapping procedures, which are considered the gold standard in the field of neurolinguistics (i.e., less than 8 mm in all cases; Lesser et al., 1994; Ojemann and Whitaker, 1978). Second, MEG data agree with prior knowledge regarding the intrinsic organization of primary somatosensory cortex (somatotopy; Tecchio et al., 1998), motor cortex (Beisteiner et al., 2004), and primary auditory cortex (tonotopy; Pantev et al., 1995).
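One simple way to put a number on the intraparticipant reproducibility mentioned above is to measure how far repeated source estimates scatter around their centroid. The sketch below is an assumption-laden illustration, not the method of the cited studies: the function name, the choice of mean distance from the centroid as the summary statistic, and the coordinates are all hypothetical.

```python
import numpy as np


def localization_spread_mm(locations_mm):
    """Mean Euclidean distance (mm) of repeated estimates from their centroid."""
    points = np.asarray(locations_mm, dtype=float)
    centroid = points.mean(axis=0)
    return float(np.linalg.norm(points - centroid, axis=1).mean())


# Hypothetical (x, y, z) source locations in mm from five repeated sessions.
sessions = [
    [40.0, 10.0, 50.0],
    [42.0, 10.0, 50.0],
    [38.0, 10.0, 50.0],
    [40.0, 12.0, 50.0],
    [40.0, 8.0, 50.0],
]

spread = localization_spread_mm(sessions)
within_invasive_limit = spread < 8.0  # 8 mm figure quoted in the text
```

For these invented coordinates the spread is 1.6 mm, which would fall inside the 8 mm limit the text quotes for invasive mapping procedures.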

Functional Brain Plasticity

Accumulating evidence indicates that the brain, in the presence of lesions, especially those that have an early onset, is capable of a substantial degree and extent of functional reorganization (e.g., Haglund et al., 1994). Until recently, however, hard evidence to support this claim was restricted to data from invasive studies of patients scheduled to undergo brain surgery. Corroborating evidence has been obtained noninvasively using MEG, along with other functional imaging methods (Hertz-Pannier et al., 2002), as part of a presurgical evaluation protocol for patients with intractable seizure disorder. The results indicate that the proportion of patients who are not left hemisphere dominant for language is significantly higher among patients with a seizure disorder of left temporal lobe onset than among patients who have space-occupying lesions in the left hemisphere or normal controls (Pataraia et al., 2004). This group of patients demonstrates a partial shift of the location of Wernicke’s area within the left (dominant) hemisphere.

Temporal Resolution

As mentioned in previous paragraphs, a unique feature of MEG compared with other noninvasive functional brain imaging techniques is its temporal resolution. Several findings afford remarkable insights into the temporal dynamics of cortical activation associated with the processing of verbal stimuli.

Temporal Dynamics of Hemispheric Asymmetries for Simple Language Functions

The most pronounced and consistent hemispheric asymmetries are noted for the duration of late magnetic activity, a finding that was originally demonstrated for word recognition and semantic categorization tasks (Simos et al., 1998). Smaller yet significant hemispheric asymmetries are also found in the context of simple phonetic discrimination tasks (Papanicolaou et al., 2003; Eulitz et al., 1995; Roberts et al., 2000). Figure 4 summarizes results from two experiments demonstrating the temporal pattern of hemispheric asymmetries for the duration of activity in auditory association cortex (including Wernicke’s area).

Figure 4 Time plot of hemispheric differences in the duration of neurophysiological activity in the posterior portion of the superior temporal gyrus during a phonetic discrimination task (upper panel) and a word recognition task (lower panel). (Data from Papanicolaou et al., 2003 and Valaki et al., 2004, respectively.)

Reading and Dyslexia

Another area that has been the subject of many MEG studies is reading. While several studies have focused on the components of the brain mechanism that supports reading in fluent adult readers, others have tried to uncover distinct aberrant features of this mechanism in struggling young and older readers. The majority of studies have examined single-word processing, and only recently have investigators begun to look at words in context, as explained in more detail below. Whenever possible, integration of MEG data with the results of direct electrical interference with cortical function and correlations between MEG and individual performance data have been used to determine the nature of the task-specific operation in which a particular activated brain area is involved (Simos et al., 2002).

The general outline of the activation profile specific to reading features initial bilateral activation of occipital regions, followed by activity in left lateral and basal occipito–temporal cortices. This activity occurs reliably between 130 and 180 ms after stimulus onset and is stronger for regular printed material (letters, syllables, or words) than for strings of meaningless symbols of the same length (Tarkiainen et al., 1999). Next, activity is noted bilaterally in the vicinity of Wernicke’s area and in the middle temporal gyrus. Finally, magnetic activity sources are found near Broca’s area in the inferior frontal gyrus (Simos et al., 2001). While evidence regarding the role of inferior frontal cortices in reading is far from conclusive, it suggests that this area is involved in the articulatory recoding of print, i.e., analysis of printed stimuli according to their corresponding articulatory representations (Pugh et al., 2000).

Reading of meaningful items (real words) entails a higher degree of activation of the left middle temporal gyrus, whereas reading meaningless but pronounceable letter strings (e.g., WOTE) is associated with reduced activation in this region and stronger activity in Wernicke’s area (Wydell et al., 2003). The critical role of the superior temporal gyrus for phonological decoding is supported by the finding that transient electrical interference with neural function in this region results in severe phonological processing deficits, including an inability to read pseudowords aloud (Simos et al., 2002). In contrast, the ability to read real words aloud was spared in all cases. MEG studies employing lexical decision tasks provide support for the view that the early stages of silent, rapid word recognition do not necessarily involve phonological processing or articulatory recoding. In this context, cortical activity along the banks of the superior temporal sulcus (extending into the middle temporal gyrus) precedes activity in the superior temporal gyrus and posterior–lateral frontal cortices (see Figure 5).

Figure 5 A representative participant’s spatiotemporal activation profile in the left hemisphere, elicited in response to word targets in a lexical decision task. The timing (onset and duration) of regional neurophysiological activity in each of the five regions is shown in the graph. The onset of stimuli was at 0 ms, and the participant’s mean reaction time was 480 ms (both indicated by vertical arrows on the time line). Abbreviations: Occipital, primary visual cortex; STSp, posterior superior temporal sulcus; STSa, cortex along the anterior banks of the superior temporal sulcus; STG, superior temporal gyrus; frontal, prefrontal cortex.

The studies reviewed above are important for understanding the dynamic performance of the brain circuit responsible for recognizing printed words as unique linguistic entities, a function that invariably involves lexical access. It is generally held that comprehension of word meaning, which involves access to one or more semantic representations associated with that word, takes place automatically. Few studies have examined the spatiotemporal profile of brain activity that is uniquely associated with semantic access. These studies have looked at comprehension of printed words presented in syntactic context, usually words that occur at the end of sentences. Results clearly indicate that semantic access regularly takes place within the first 250 ms after stimulus onset (Helenius et al., 1998; Simos et al., 1997) and is marked by neurophysiological activity along the banks of the left superior temporal sulcus, in agreement with the single-word data presented above.

MEG studies have been particularly useful in the study of development. Given that the technique is much more child-friendly than any other functional brain imaging method, children as young as 5 years can be tested with very low attrition rates. An important conclusion that stems from these studies is that the essential features of the reading-specific activation profile, described above, are in place early in the course of normal development (i.e., at least as early as the end of kindergarten; Simos et al., in press). Longitudinal data demonstrate, however, remarkable individual differences, even among children who eventually develop normal reading skills, suggesting that recruitment of brain areas already specialized for linguistic operations (primarily Wernicke’s and Broca’s areas) occurs naturally as children learn the alphabetic principle. Considerable changes take place during systematic instruction, especially in the speed of engagement of complex visual processing areas believed to play a critical role in graphemic analysis, and in the relative timing of regional activity.

In addition to establishing the normative activation profile associated with reading, distinct brain activation profiles associated with word recognition and phonological decoding have been found in children with severe reading disability. As shown in the upper and middle panels of Figure 6, these profiles consistently differentiated children diagnosed with severe reading disability from children without reading problems. The most consistent features of the individual aberrant profiles were the lack of activation of left temporo–parietal regions (mainly Wernicke’s area), coupled with pronounced activation of the homotopic right hemisphere regions (Simos, Papanicolaou et al., 2000; Simos, Breier et al., 2000). Compensatory increase in posterior prefrontal activity is also found bilaterally. This finding supports the belief that following visual processing of printed stimuli (complex visual processing areas), disabled readers do not use the left temporo–parietal region for phonological processing of the visual symbols but rather ineffectively employ the corresponding right hemisphere region. Increased prefrontal activity presumably reflects reliance on articulatory recoding strategies to support phonological decoding and word recognition.

Figure 6 MEG scans from a 9-year-old boy who never experienced difficulty in learning to read (top set of images) and a poor reader (middle set of images) during performance of a printed word recognition task. Note the clear preponderance of activity sources in the vicinity of Wernicke’s area in the left hemisphere (Lt) in the proficient child and in homotopic right hemisphere (Rt) in the poor reader. The poor reader also showed pronounced activity bilaterally in inferior prefrontal cortices (in the vicinity of Broca’s area). Bottom panel: Activation maps from the same poor reader after successful completion of an 8-week intensive intervention program. Note the dramatic increase in Wernicke’s activation associated with significant improvement in phonological decoding and word recognition ability (solid arrow).

Longitudinal data further suggest that the aberrant neural circuit that underlies severe reading problems in older children with dyslexia appears to be present during the early stages of reading acquisition, at a much earlier age than previously believed (Simos et al., in press). Unless intensive, explicit instruction in basic reading skills is provided, this aberrant brain organization may not change. With adequate intervention (i.e., intensive instruction targeting deficient component skills), the aberrant outline of the brain circuit for reading can undergo dramatic change (Simos, Fletcher et al., 2002; see Figure 6, lower panel).

As a functional brain imaging technique, MEG has a number of highly desirable features particularly valuable in the study of the brain circuits responsible for language functions. These include millisecond temporal resolution, excellent reproducibility, test-retest reliability, and anatomical resolution. Two other properties that are often neglected in state-of-the-art applications of functional brain imaging are standard features in MEG studies. First, MEG-derived brain activation scans are always presented for individuals rather than for groups of participants, circumventing well-known problems with spatial averaging of functional images (see for instance, Fernandez et al., 2003). Second, data acquisition and analysis protocols are empirically validated, allowing legitimate interpretation of studies aiming to elucidate neural networks involved in linguistic processing.
Although whole-head MEG is a relatively new technique and the number of research centers focusing on neurolinguistic investigations is small, a considerable amount of data has accrued that confirms classical hypotheses regarding the dynamic organization of brain circuits that support language functions.

Bibliography
Beisteiner R, Gartus A, Erdler M, Mayer D, Lanzenberger R & Deecke L (2004). ‘Magnetoencephalography indicates finger motor somatotopy.’ European Journal of Neuroscience 19, 465–472.
Billingsley R L, Simos P G, Castillo E M et al. (2004). ‘Spatio-temporal cortical dynamics of phonological and semantic fluency.’ Journal of Clinical and Experimental Neuropsychology 26, 1031–1043.
Binder J R, Frost J A, Hammeke T A et al. (2000). ‘Human temporal lobe activation by speech and nonspeech sounds.’ Cerebral Cortex 10, 512–528.
Boatman D, Lesser R P & Gordon B (1995). ‘Auditory speech processing in the left temporal lobe: an electrical interference study.’ Brain and Language 51, 269–290.
Castillo E M, Simos P G, Venkataraman V et al. (2001). ‘Mapping of expressive language cortex using Magnetic Source Imaging.’ Neurocase 7, 419–422.
Eulitz C, Diesch E, Pantev C et al. (1995). ‘Magnetic and electric brain activity evoked by the processing of tone and vowel stimuli.’ Journal of Neuroscience 15, 2748–2755.
Fernandez G, Specht K, Weis S et al. (2003). ‘Intrasubject reproducibility of presurgical language lateralization and mapping using fMRI.’ Neurology 60, 969–975.
Haglund M M, Berger M S, Shamseldin M et al. (1994). ‘Cortical localization of temporal lobe language sites in patients with gliomas.’ Neurosurgery 34, 567–576.
Helenius P, Salmelin R, Service E et al. (1998). ‘Distinct time courses of word and context comprehension in the left temporal cortex.’ Brain 121, 1133–1142.
Hertz-Pannier L, Chiron C, Jambaque I et al. (2002). ‘Late plasticity for language in a child’s non-dominant hemisphere: a pre- and post-surgery fMRI study.’ Brain 125, 361–372.
Kober H, Moller M, Nimsky C et al. (2001). ‘New approach to localize speech relevant brain areas and hemispheric dominance using spatially filtered magnetoencephalography.’ Human Brain Mapping 14, 236–250.
Kwon H, Lee Y H, Kim J M et al. (2002). ‘Localization accuracy of single current dipoles from tangential components of auditory evoked fields.’ Physics in Medicine and Biology 47, 4145–4154.
Lesser R, Gordon B & Uematsu S (1994). ‘Electrical stimulation and language.’ Journal of Clinical Neurophysiology 11, 191–204.
Maestú F, Ortiz T, Fernández A et al. (2002). ‘Spanish language mapping using MEG: a validation study.’ NeuroImage 17, 1579–1586.
Ojemann G A & Whitaker H A (1978). ‘Language localization and variability.’ Brain and Language 6, 239–260.
Pantev C, Bertrand O, Eulitz C et al. (1995). ‘Specific tonotopic organizations of different areas of the human auditory cortex revealed by simultaneous magnetic and electric recordings.’ Electroencephalography and Clinical Neurophysiology 94, 26–40.
Papanicolaou A C (1998). Fundamentals of functional brain imaging: a guide to the methods and their applications to psychology and behavioral neurosciences. Netherlands: Swets & Zeitlinger.
Papanicolaou A C, Simos P G, Castillo E M et al. (2003). ‘Differential brain activation patterns during perception of voice and tone onset time series: a MEG study.’ NeuroImage 18, 448–459.
Papanicolaou A C, Simos P G, Castillo E M et al. (2004). ‘Magnetoencephalography: a non-invasive alternative to the Wada procedure.’ Journal of Neurosurgery 100, 867–876.
Pataraia E, Simos P G, Castillo E M et al. (2004). ‘Reorganization of language-specific cortex in patients with lesions or mesial temporal epilepsy.’ Neurology 63, 1825–1832.
Pugh K R, Mencl W E, Jenner A J et al. (2000). ‘Functional neuroimaging studies of reading and reading disability (developmental dyslexia).’ Mental Retardation and Developmental Disabilities Review 6, 207–213.
Roberts T P, Disbrow E A, Roberts H C et al. (2000). ‘Quantification and reproducibility of tracking cortical extent of activation by use of functional MR imaging and magnetoencephalography.’ American Journal of Neuroradiology 21, 1377–1387.
Roberts T P, Ferrari P, Perry D et al. (2000). ‘Presurgical mapping with magnetic source imaging: comparisons with intraoperative findings.’ Brain Tumor Pathology 17, 57–64.
Roberts T P, Ferrari P, Stufflebeam S M et al. (2000). ‘Latency of the auditory evoked neuromagnetic field components: stimulus dependence and insights toward perception.’ Journal of Clinical Neurophysiology 17, 114–129.
Salmelin R, Hari R, Lounasmaa O V et al. (1994). ‘Dynamics of brain activation during picture naming.’ Nature 368, 463–465.
Simos P G, Basile L F & Papanicolaou A C (1997). ‘Source localization of the N400 response in a sentence-reading paradigm using evoked magnetic fields and magnetic resonance imaging.’ Brain Research 762, 29–39.
Simos P G, Breier J I, Fletcher J M, Bergman E & Papanicolaou A C (2000). ‘Cerebral mechanisms involved in word reading in dyslexic children: a magnetic source imaging approach.’ Cerebral Cortex 10, 806–816.
Simos P G, Breier J I, Fletcher J M et al. (2001). ‘Age-related changes in regional brain activation during phonological decoding and printed word recognition.’ Developmental Neuropsychology 19, 191–210.
Simos P G, Breier J I, Fletcher J M et al. (2002). ‘Brain mechanisms for reading words and pseudowords: an integrated approach.’ Cerebral Cortex 12, 297–305.

Magyar

See: Hungarian.

Mandarin

See: Hokan Languages.

Simos P G, Breier J I, Zourdakis G et al. (1998). ‘Identification of language-related brain activity using magnetoencephalography.’ Journal of Clinical and Experimental Neuropsychology 2, 706–722. Simos P G, Papanicolaou A C, Breier J I et al. (1999). ‘Localization of language-specific cortex by using magnetic source imaging and electrical stimulation mapping.’ Journal of Neurosurgery 91, 787–796. Simos P G, Papanicolaou A C, Breier J I et al. (2000). ‘Brain activation profiles in dyslexic children during nonword reading: a magnetic source imaging study.’ Neuroscience Letters 290, 61–65. Simos P G, Sarkari S, Clear T et al. (Under review). ‘Estimates of neurophysiological activity in Wernicke’s area: reproducibility of complementary measures using magnetoencephalography.’ Simos P G, Fletcher J M, Sarkari S et al. (In press). ‘Early development of neurophysiological processes involved in normal reading and reading disability.’ Neuropsychology. Simos P G, Fletcher J M, Bergman E et al. (2002). ‘Dyslexiaspecific brain activation profile becomes normal following successful remedial training.’ Neurology 58, 1203–1213. Tarkiainen A, Helenius P, Hansen P C et al. (1999). ‘Dynamics of letter string perception in the human occipitotemporal cortex.’ Brain 122, 2119–2132. Tecchio F, Rossini P M, Pizzella V, Cassetta E, Pasqualetti P, & Romiani G L (1998). ‘A neuromagnetic normative data set for hemispheric sensory hand cortical representations and their interhemispheric differences.’ Brain Research Protocols 2, 306–314. Valaki C E, Maestu F, Simos P G, Zhang W, Fernandez A, Amo O, Ortiz T, Papanicolaou A C (2004). ‘Cortical organization for receptive language functions in Chinese: a cross-linguistic study using MEG.’ Neuropsychologia 42, 967–979. Wydell T N, Vuorinen T, Helenius P et al. (2003). ‘Neural correlates of letter-string length and lexicality during reading in a regular orthography.’ Journal of Cognitive Neuroscience 15, 1052–1062.

Malagasy
R Kikusawa, National Museum of Ethnology, Osaka, Japan
© 2006 Elsevier Ltd. All rights reserved.

Geographical Distribution, Dialects, and Speakers

Malagasy is the main language spoken in Madagascar (population approximately 17.5 million according to a 2004 estimate), located off the east coast of Africa. Standard Malagasy, which is based on the Merina dialect spoken in and around Antananarivo, the capital city, is one of the two official languages (along with French) in Madagascar and is used in public contexts and also for education in grade schools and high schools. There are said to be 18 ethnic groups in Madagascar and regional dialects are referred to in association with these groups. Many show phonemic as well as phonetic, lexical, and morphosyntactic features that are different from those in Standard Malagasy. Descriptions are available for some of the dialects (Tsimilaza, 1981; Thomas-Fattier, 1982; Manoro, 1983; Rabenilaina, 1983; Raharinjanahary, 1984; Beaujard, 1998); however, there are a number of others that have not yet been well studied. The Malagasy dialects are considered to form two groups, a western group and an eastern group (Dez, 1963; Gueunier, 1988), distinguished by two sets of regular sound correspondences: Western Malagasy di corresponds to Eastern Malagasy li, while Western Malagasy tsi corresponds to Eastern Malagasy ti. Figure 1 shows the boundary of the two dialect groups, as well as some regional dialect names.

Figure 1 The names of the ethnic groups in Madagascar. The line indicates the boundary between the two dialectal groups, namely Western and Eastern Malagasy.

Genetic Relationships

Malagasy is an Austronesian language. Its most closely related language is considered to be Ma’anyan, a language that belongs to the Barito group of the Western Malayo-Polynesian languages. This implies that the people ancestral to the present Malagasy population migrated from southeast Kalimantan on Borneo Island, where the Barito languages are spoken. This took place probably around 700 A.D., but the exact routes and the reasons for this migration are still not clear (Adelaar, 1989; Dahl, 1991). Some Austronesian features in Malagasy reflect borrowings from genetically related languages, in particular Malay and Javanese, suggesting the possibility of multiple migrations after the initial Austronesian settlement in Madagascar, and/or of continuous contact among the speakers. The language also shows traces of contact with the speakers of such languages as Arabic, Bantu languages (in particular Swahili), and Sanskrit.

Writing Systems

The first writing system introduced to Madagascar was an Arabic script. It was introduced by Muslims in the 12th century, and the people in Taimoro learned it and adapted it to their own phonology (referred to as sorabe ‘great writing/drawing’). Currently, a Latin-based alphabetic system is used, which was introduced in 1823 by an early missionary of the London Missionary Society, David Jones.

Linguistic Features of Standard Malagasy

Phonology and Orthography

The letters used in the Malagasy languages and their phonemic properties are shown in Table 1. In Spoken Malagasy, /h/ often disappears, and word-

Table 1 The Malagasy orthography and phonemic system

Vowels

a
i, y    [i] (The letter y is used at the end of a word.)
e
o       [u]
ô       [o]

Consonants

                                 Nasal   Prenasalized stop   Voiced stop   Voiced spirant   Voiceless stop   Voiceless spirant
Labial                           m       mb                  b             v                p                f
Dentalveolar                     n       nd                  d             l                t
Alveolar (affricate)                     nj [ndz]            j [dz]        z                ts
Alveolar trill (or, retroflex)           ndr [ndr"nr"nB]     dr [dr"B]     r                tr [tr"
Velar                            ñ [ŋ]   ng [ŋg]             g

miambim-pòdy ‘to guard from birds’
oron-tsàka ‘the nose of a cat’
tranon-kàzo ‘woodshed’
tranom-bàrotra ‘business association’
(2e) lèna ‘wet (with)’ + rano ‘water’ > len-dràno ‘wet with water’

Morphosyntactic Characteristics

Typologically, Malagasy is considered to be a ‘pro-drop’, verb-initial language that shows the major properties of head-initial languages, with modifiers following the noun and nominal arguments following the verb. Verbs undergo various morphological derivations that are associated with different sentence structures, as shown here with verbs deriving from the root pasaka. In (3), the form mipasaka ‘to burst open’ (the initial consonant appearing as n marking the past tense in the example) appears as an intransitive verb requiring only a nominative argument.

(3) N-ipasaka ny ovy
    PAST-burst.open DET potato
    ‘the potatoes burst open’

In (4), with the form manapasaka ‘to smash’ (often labeled as ‘active voice’), the actor is expressed with a nominative pronoun aho ‘I,’ while in (5) and (6), where the verb forms are mopasahana ‘to smash something’ and voapasaka ‘have smashed something’ (often labeled as ‘passive voice’), it is expressed with a genitive pronoun -ko ‘I (agent).’

(4) N-anapasaka ny ovy aho
    PAST-smash DET potato I
    ‘I was smashing the potatoes’

(5) N-opasahi-ko ny ovy
    PAST-smash-1SG.GEN DET potato
    ‘I smashed the potatoes’

(6) Voa-pasa-ko ny ovy
    PERF-smash-1SG.GEN DET potato
    ‘I have finished smashing the potatoes / I have inadvertently smashed the potatoes.’

The form manapasahana ‘smash with’ (often labeled as ‘circumstantial’) in (7) typically appears in a relative clause modifying a noun which functions as an instrument or a location.

(7) T-amin’ny sotro no n-anapasaha-ko ny ovy
    PAST-with spoon that PAST-APPLI.smash-1SG.GEN DET potato
    ‘it was a spoon with which I was smashing the potatoes’
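
The verb forms cited in examples (3) to (7) can be collected into a small lookup table, which makes the split between nominative and genitive actors easier to see. The following is only an illustrative sketch: the names `VerbForm`, `PASAKA_FORMS`, and `forms_with_actor_case` are hypothetical, the forms and glosses are copied from the text above, and the case recorded for the circumstantial form is read off the gloss in (7) rather than stated in the prose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbForm:
    form: str        # citation form as quoted in the text
    gloss: str       # English gloss as quoted in the text
    label: str       # descriptive label used in the text
    actor_case: str  # case of the actor (or sole argument) in the example


# Forms derived from the root "pasaka", as cited in examples (3)-(7).
# This is a summary of the article's data, not a morphological analyzer.
PASAKA_FORMS = (
    VerbForm("mipasaka", "to burst open", "intransitive", "nominative"),
    VerbForm("manapasaka", "to smash", "active voice", "nominative"),
    VerbForm("mopasahana", "to smash something", "passive voice", "genitive"),
    VerbForm("voapasaka", "have smashed something", "passive voice", "genitive"),
    # Case taken from the 1SG.GEN gloss in example (7).
    VerbForm("manapasahana", "smash with", "circumstantial", "genitive"),
)


def forms_with_actor_case(case: str) -> list[str]:
    """Return the citation forms whose actor (or sole argument) takes `case`."""
    return [v.form for v in PASAKA_FORMS if v.actor_case == case]


if __name__ == "__main__":
    for case in ("nominative", "genitive"):
        print(case, forms_with_actor_case(case))
```

Grouping by `actor_case` simply restates the pattern in the examples: the intransitive and ‘active voice’ forms take a nominative argument, while the ‘passive voice’ and ‘circumstantial’ forms mark the actor with the genitive -ko.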

The alternations observed in Malagasy verb morphology, as well as the various sentence structures in which they occur, are of interest in that they correspond both typologically and historically to the ‘focus’ system in Philippine and Indonesian languages. See also: Austronesian Languages: Overview.

Bibliography

Adelaar K A (1989). ‘Malay influence on Malagasy: linguistic and culture-historical implications.’ Oceanic Linguistics 28(1), 1–46.
Beaujard P (1998). Dictionnaire malgache-français: dialecte tañala (sud-est de Madagascar), avec recherches étymologiques. Paris: Harmattan.
Dahl O C (1968). Contes malgaches en dialecte sakalava: textes, traductions, grammaire et lexique. Oslo: Universitetsforlaget.
Dahl O C (1991). Migration from Kalimantan to Madagascar. Oslo: Norwegian University Press.
Dahl O C (1993). Sorabe: revelant l’evolution du dialect antemoro. Antananarivo: Trano Printy, Fiangonana Loterana Malagasy.
Dez J (1963). ‘Aperçu pour une dialectologie de la langue malgache.’ Bulletin de Madagascar 204, 441–451; 205, 507–520; 206, 581–607; 210, 973–994.
Dyen I (1971). ‘Malagasy.’ In Sebeok T A (ed.) Current trends in linguistics 8: Linguistics in Oceania. The Hague/Paris: Mouton. 211–239.
Gueunier N-J (1988). ‘Dialectologie et lexicostatistique: cas du dialecte malgache de Mayotte (Comores).’ Etudes Océan Indien 9, 143–170.
Keenan E L & Polinsky M (1998). ‘Malagasy (Austronesian).’ In Spencer A & Zwicky A M (eds.) The handbook of morphology. Oxford/Malden, MA: Blackwell. 563–623.
Manoro R (1983). Description morphosyntaxique du Tsimihety (Madagascar). Doctoral thesis, University of Paris 7.
Pearson M & Paul E (eds.) (1996). The structure of Malagasy 1. Los Angeles: Department of Linguistics, University of California.
Rabenilaina Roger-Bruno (1983). Morpho-syntaxe du malgache: description structurale du dialecte Bàra. Paris: SELAF.
Raharinjanahary S (1984). Aspects de la dialectologie du malgache: morphophonologie: application à l’antanosy. Doctoral thesis, University of Paris 7.
Rajaona S (1977). Problèmes de morphologie malgache. Fianarantsoa, Madagascar: Ambozontany.
Rajaonarimanana N (2001). Grammaire moderne de la langue malgache. Paris: Langues & Mondes – L’Asiathèque.
Randriamasimanana C (1986). The causatives of Malagasy. Honolulu: University of Hawai’i Press.
Rasoloson J (2001). Malagasy-English English-Malagasy dictionary and phrase book. New York: Hippocrene.
Thomas-Fattier D (1982). Le dialecte sakalava du nord ouest de Madagascar. Paris: SELAF.
Tsimilaza A (1981). ‘Phonologie et morphologie du tsimihety.’ Doctoral thesis, University of Nancy 2.
Vérin P, Kottak C & Gorlin P (1969). ‘The glottochronology of Malagasy speech communities.’ Oceanic Linguistics 8(1), 26–83.

Malawi: Language Situation
S Mchombo, University of California, Berkeley, Berkeley, California, USA
© 2006 Elsevier Ltd. All rights reserved.

Lying south of Tanzania in the Great Rift Valley, wedged between Mozambique to the east and south and Zambia to the west, is the small country of Malawi. It has a landmass of 118 484 square kilometers. Of that, an area of 24 000 square kilometers is taken by a lake that dominates the country’s landscape. The lake was previously known as Lake Nyasa, from which the country obtained its former name of Nyasaland. It is the third largest lake in Africa, after Victoria Nyanza (Lake Victoria) and Lake Tanganyika. When the name of the country changed from Nyasaland to Malawi, the lake was accordingly designated Lake Malawi. However, political disputes over the proper boundaries between Malawi and its neighbors, which raged in the late 1960s, made the government of Tanzania reluctant to acquiesce to that designation. The renaming was viewed as tantamount to legitimizing the Malawi government’s claim, made under the leadership, and at the instigation, of its first president, the late Hastings Kamuzu
Banda, that the lake as a whole belonged to Malawi. Tanzania claimed, with some justification, that part of the northern extremity of the lake was Tanzanian waters, as the boundary, as originally drawn, included a portion of the lake in Tanzanian territory. This led to the retention of the two labels for the lake, as Lake Nyasa/Lake Malawi, in some of the maps, especially where the publishers expected the maps to be sold in both countries. With altered and somewhat improved political relations between the countries, the label Lake Malawi has become established and accepted, at least provisionally.

Malawi, according to recent estimates, has a population of between 11 and 12 million. It is, by all accounts, a relatively densely populated country. The population reflects a diversity of ethnicities that accounts for the range of languages within its borders. The major ethnic groups are the Nyanja, Lomwe, Tumbuka, Yao, Tonga, and Sena; the minor ones include the Lambya, Ngonde (Nkhonde), Nyakyusa, Nyiha, and Nsukwa/Ndali. Of these, the Nyanja have traditionally been viewed as constituting the majority group (cf. Kishindo, 1994; Matiki, 1997; Mchombo, 1998; Young, 1949).

During the era of colonialism the British recognized the importance of chi-Nyanja, the language spoken by the Nyanja. It was a language that could promote understanding between them as the rulers and the local population, the ruled, for effective government and administration. As a consequence, the colonial government adopted chi-Nyanja as the official language in some parts of the country, with the justification that ‘‘[I]n Nyasaland . . . the Nyanja speakers not only outnumber those of other areas, but their dialects have a much longer history of literary use’’ (Price, 1940: 129).
The adoption of Nyanja as the official language gave the impetus for it to be taught in schools. Further, knowledge of the language was required for government duties. As Hailey observed about the situation in Malawi (then Nyasaland), ‘‘Nyanja has been adopted as the official language by the government. Knowledge of it is compulsory for departmental examinations and it is intended to introduce it into all schools after two years instruction in the vernacular’’ (Hailey, 1938: 75). During the preindependence era, chi-Nyanja and chi-Tumbuka (the language of the Tumbuka) were used in the media and in elementary education. The inclusion of chi-Tumbuka rested on the recognition of its being the most widely spoken language in the northern part of Malawi.

Although the Nyanja have predominated in the central and southern parts of the country, the Tumbuka have traditionally been the dominant group in the north. As noted by Leroy Vail, the Tumbuka ‘‘live as subsistence agriculturalists in Zambia and Malawi. In Zambia, they live in the eastern Province’s Isoka and Lundazi Districts, while in Malawi, they are found in Chitipa, Karonga, Nkhata Bay, Rumphi, Mzimba and Kasungu Districts of the Northern and Central Regions. Their territory is about 20 000 square miles in extent, with the Luangwa River of Zambia forming their western boundary, Lake Malawi their eastern boundary, the valley of the North Rukuru River (latitude 10° S) their northern boundary, and approximately 12° 30′ S their southern limit’’ (Vail, 1972: xiii). The Tumbuka language is bordered by a number of smaller Bantu languages to its north. These include the languages of the Nyiha, Lambya, Sukwa/Ndali, and the Ngonde. In Zambia, Tumbuka is neighbor to Senga and Bisa. To the south, it borders chi-Chewa, the language of the Chewa group.

The Chewa are widely scattered, living in Malawi and Zambia. They were originally regarded as a subgroup of the Nyanja (who are also well represented in Mozambique), and their language, chi-Chewa, was considered a dialect variation of chi-Nyanja (cf. Watkins, 1937). Political developments in independent Malawi reversed these roles. With the ascendancy of a nationalist Chewa to the presidency when Malawi gained independence from Britain, subsequently becoming a republic, it was claimed, on the basis of basic misconceptions about the distinction between language and dialect (cf. Mchombo, 2005), that the Chewa constituted the main group and that chi-Chewa, their language, was the main language. On this view, it was the Nyanja who constituted a subgroup of the Chewa, and hence their language, chi-Nyanja, was a dialect of chi-Chewa.
Identification of the language was altered in Malawi, from chi-Nyanja to chi-Chewa, with the latter eventually accorded the status of national language (See: Nyanja). With the departure of Kamuzu Banda and altered political circumstances in Malawi, the situation is, yet again, being reversed, with a restoration of chi-Nyanja as the major language of Malawi. Part of the argument is that restoring the label chi-Nyanja for the language brings Malawi in line with its neighbors. In both Mozambique and Zambia, where the language is one of the major languages, it has always been identified by that label. Politically, the move constitutes a reduction of the dominance of the Chewa in the Malawian political and cultural fabric. Chi-Nyanja (henceforth Chinyanja) is by far the most widely spoken language in Malawi. The resources
devoted to its development and promotion under the language policy that made Chichewa the national language contributed greatly to its diffusion and usage in the country. With the elevation of Chichewa to the status of national language, Chitumbuka, along with other languages, suffered great marginalization. It lost the literary activity and the presence in the mass media that it had previously enjoyed. The profile of Chitumbuka declined after the national language policy was articulated, a policy that created ‘‘a situation which resulted in bitter resentment throughout the northern region, a situation made worse by Chewa-speakers’ triumphal assertions that other people of the country were cultureless because they had no language’’ (Vail and White, 1989: 183). The language policy thus angered various sections of the population, primarily the Tumbuka, which was to influence the nature of Malawi politics thereafter (Mchombo, 1998).

Of the remaining languages, Tonga is found in an area along the shores of Lake Malawi in an enclave in Nkhata Bay District, between Tumbuka to its north and west and Nyanja to the south. More visible in Malawi than the Tonga are the Yao. They are concentrated along the southern coastline of Lake Malawi, on both the Malawi and Mozambique sides, spreading into parts of southern Malawi. They are more numerous in Mozambique, where they spread into southern Tanzania. The Lomwe, although at some point viewed as constituting the most populous group after the Nyanja and ahead of the Yao and the Tumbuka, have normally been seen as a spill-over from Mozambique. Concentrated along the southeastern stretch of the Malawi/Mozambique border, and spreading westward into areas close to the major commercial city of Blantyre, they have rarely had a profile in the country’s linguistic landscape comparable to that of the other ethnic groups.
In part this is because during the first census taken in independent Malawi in 1966, most of the Lomwe, fearful of the potential for ethnic isolation and possible political repercussions of that, registered as Chewas. Most spoke Chinyanja/Chichewa, except in their home areas. This contributed toward diminished visibility of their language within the country. It remains the only language of a once major group for which manuals or descriptive grammars are lacking. Even the recent move to include news broadcasts in it, along with the other major languages, has not translated into readily accessible written works in or on the language or generated increased scholarly interest in it. This author intends to have that altered and to stimulate research into this language.

The southern tip of Malawi, spreading into southern Mozambique, is the Lower Shire Valley. The Shire River originates from the southern extremity of Lake Malawi. It flows through southern Malawi into Mozambique to join the Zambezi River. The Lower Shire Valley is home to the Sena. They, together with their language, chi-Sena, spread into southern Mozambique, toward the Zambezi River.

The remaining languages, Lambya, Nyakyusa, and Ngonde (Nkhonde), are minority languages spoken in the northernmost parts of the country. Also spoken in northern Malawi is Chindali or Chinsukwa. It is so rarely spoken that it is virtually at the top of the list of endangered languages in the country. Studies of this and other minority languages are currently being undertaken by individual researchers (e.g., Robert Botne of Indiana University), as well as under the auspices of the Center for Language Studies in Malawi, affiliated with the University of Malawi. One language that has been mentioned as bordering Chitumbuka in the west is Chinsenga. Although the language is primarily spoken in Zambia, it has some presence in Malawi, in Malawi/Zambia border areas such as Mchinji (cf. Simango, 1995). In some linguistic surveys of east, central, and southern Africa (e.g., work done by scholars in Mozambique [NELIMO]), this language has been classified as a dialect of Chinyanja.

One further ethnic group, the Ngoni, is traditionally included in the listing of ethnicities in Malawi. A significant percentage of the Malawi population call themselves Ngoni. Moving in from southern Africa, the Ngoni were a conquering nation. They waged war with other ethnic groups, including the Chewa and the Tumbuka. They then settled in some of the places that they had conquered. In fact, it is noted of the Tumbuka that they were not a tribe per se.
As indicated by Vail, ‘‘the Tumbuka people are historically not members of that discrete entity known commonly as a ‘tribe.’ This term, as generally used, has implied some sort of unity based on a common language and common cultural traits. Following from this synchronic cultural unity, it was assumed that a particular ‘tribe’ had a common past and an historical unity. In fact, however, the historical reality is far different, and Tumbuka history is more accurately seen when viewed as the aggregate history of individual clans’’ (Vail, 1972: xiv). Other historians have commented that the individual clans that constituted the Tumbuka had well-established dynasties and that they had developed their language, culture, and political and economic institutions during the period of their struggles with the Ngoni who had invaded their land (cf. Matiki, 1997; Pachai, 1973).

Thus, the Ngoni constitute a significant ethnic group that settled in Malawi. However, their language, chi-Ngoni, is not normally recognized in the linguistic map of Malawi because the Ngoni assimilated to either the Chewa in the south or the Tumbuka in the north. Although some of the culture and customs may have been preserved, such as traditional dances, initiation ceremonies marking rites of passage, and so on, the Ngoni adopted the languages spoken in their new habitat. If chi-Ngoni is used at all, it is used primarily during ceremonial occasions. In fact, when used, chi-Ngoni sounds very much like a variant of such Nguni languages as Zulu, spoken in southern Africa. Finally, very little has been said about Nyiha. This is one of the languages that borders Tumbuka. It does not normally feature among the languages of Malawi because it is spoken primarily in Zambia and makes contact with Tumbuka there. Language Maps (Appendix 1): Maps 21, 23.

Bibliography

Greenberg J (1966). The languages of Africa. Bloomington, IN: Indiana University Press.
Hailey F (1938). An African survey. London.
Kishindo P J (1990). ‘An historical survey of spontaneous and planned development of Chichewa.’ In Fodor I & Hagege L (eds.) Language reform. History and future. Hamburg: Helmut Buske. 59–82.
Kishindo P J (1994). ‘The impact of a national language on minority languages: the case of Malawi.’ Journal of Contemporary African Studies 12, 127–150.
Matiki A J (1997). ‘The politics of language in Malawi: a preliminary investigation.’ In Herbert R K (ed.) African linguistics at the crossroads: papers from Kwaluseni. Köln: Rüdiger Köppe. 521–540.
Mchombo S (2005). http://www.humnet.ucla.edu/humnet/aflang/chichewa.
Mchombo S A (1998). ‘National identity, democracy, and the politics of language in Malawi and Tanzania.’ The Journal of African Policy Studies 4, 33–46.
Nervi L (1994). Malawi: flames in the African sky. Gorle (Bg): Editrice VELAR.
Pachai B (1973). Malawi: the history of the nation. London: Longman.
Price T (1940). ‘Nyanja linguistic problems.’ Africa 13, 125–137.
Simango R (1995). ‘The syntax of Bantu double object constructions.’ Doctoral diss., University of South Carolina.
Vail L (1972). ‘Aspects of the Tumbuka verb.’ Doctoral diss., Madison: University of Wisconsin.
Vail L & White L (1989). ‘Tribalism in the political history of Malawi.’ In Vail L (ed.) The creation of tribalism in southern Africa. London: James Currey. 151–192.
Watkins M H (1937). A grammar of Chichewa. A Bantu language of British Central Africa. Philadelphia: The Linguistic Society of America.
Young C (1949). ‘A Review of A Practical Approach to Chinyanja with English-Nyanja Vocabulary by T. D. Thomson, Salisbury. 1947.’ Africa 19, 253–255.

Malay
B Nothofer, University of Frankfurt, Germany
© 2006 Elsevier Ltd. All rights reserved.

Malay, a member of the Malayic language group, belongs to the subfamily of the western Malayo-Polynesian languages of the Austronesian language family. Other Malayic variants that have Proto-Malayic as their common ancestor include Minangkabau, Kerinci, Banjar, Iban, and Jakarta Malay (Adelaar, 1992). Northwestern Borneo is thought to be the homeland of the speakers of the proto-language (Adelaar, 1995; Nothofer, 1997; Collins, 1998). About 2000 years ago some of them migrated to eastern Sumatra, while others remained behind and stayed in northwestern Borneo. Some of the latter traveled south to Ketapang and then crossed over to Bangka and Belitung (Nothofer, 1997). Those remaining in the homeland area are the ancestors of speakers of Malayic Dayak languages (e.g., Iban, Selako). The Malays who sailed to Sumatra settled the island’s east coast. Some moved on into the interior and to the west coast of southern Sumatra. While Middle Malay, Minangkabau, and Kerinci have inland and west coast variants as their origin, Malay itself developed from isolects spoken on the east coast. Later, Malay speakers from the southeast coast of Sumatra established Malay colonies in the Malay Peninsula. Other Malays returned to west Borneo, where they settled the coastal and riverine areas. The isolects spoken by these relatively recent migrants differ considerably from the isolects of Malays who never left Borneo. Coastal Borneo has other Malay isolects such as Sarawak Malay, Brunei Malay, Kutai Malay, and Banjar, perhaps as a result of a clockwise settlement that originated in western Borneo (Figure 1). Malay, the native language of the powerful kingdoms along the shores of the Straits of Malacca through which all traders from the west and the east


Thus, the Ngoni constitute a significant ethnic group that settled in Malawi. However, their language, chiNgoni, is not normally recognized in the linguistic map of Malawi because the Ngoni assimilated to either the Chewa in the south or the Tumbuka in the north. Although some of their culture and customs may have been preserved, such as traditional dances and initiation ceremonies marking rites of passage, the Ngoni adopted the languages spoken in their new habitat. If chiNgoni is used at all, it is used primarily on ceremonial occasions. In fact, when used, chiNgoni sounds very much like a variant of such Nguni languages as Zulu, spoken in southern Africa.

Finally, very little has been said about Nyiha. This is one of the languages that border on Tumbuka. It does not normally feature among the languages of Malawi because it is spoken primarily in Zambia and makes contact with Tumbuka there.

Language Maps (Appendix 1): Maps 21, 23.


Malay
B Nothofer, University of Frankfurt, Germany
© 2006 Elsevier Ltd. All rights reserved.

Malay, a member of the Malayic language group, belongs to the subfamily of the western Malayo-Polynesian languages of the Austronesian language family. Other Malayic variants that have Proto-Malayic as their common ancestor include Minangkabau, Kerinci, Banjar, Iban, and Jakarta Malay (Adelaar, 1992). Northwestern Borneo is thought to be the homeland of the speakers of the proto-language (Adelaar, 1995; Nothofer, 1997; Collins, 1998). About 2000 years ago some of them migrated to eastern Sumatra, while others stayed behind in northwestern Borneo. Some of the latter traveled south to Ketapang and then crossed over to Bangka and Belitung (Nothofer, 1997). Those remaining in the homeland area are the ancestors of speakers of Malayic Dayak languages (e.g., Iban, Selako).

The Malays who sailed to Sumatra settled the island’s east coast. Some moved on into the interior and to the west coast of southern Sumatra. While Middle Malay, Minangkabau, and Kerinci have inland and west coast variants as their origin, Malay itself developed from isolects spoken on the east coast. Later, Malay speakers from the southeast coast of Sumatra established Malay colonies in the Malay Peninsula. Other Malays returned to west Borneo, where they settled the coastal and riverine areas. The isolects spoken by these relatively recent migrants differ considerably from the isolects of Malays who never left Borneo. Coastal Borneo has other Malay isolects such as Sarawak Malay, Brunei Malay, Kutai Malay, and Banjar, perhaps as a result of a clockwise settlement that originated in western Borneo (Figure 1).

Figure 1 Map of the Malay-speaking area.

Malay, the native language of the powerful kingdoms along the shores of the Straits of Malacca through which all traders from the west and the east

had to sail, was well placed to become the means of communication of all those involved in commercial activities in the Indo-Malaysian archipelago. With the development of the spice trade, this language was carried all the way to the Moluccas and to the many other harbor towns of this archipelago. When the Portuguese arrived in the early 16th century, simplified forms of Malay had already spread east and developed into creoles replacing local languages (e.g., Kupang Malay, Ambon Malay, Larantuka Malay).

On the Malay Peninsula and on the adjacent southern islands, Malay developed literary varieties at the various royal courts. The most prestigious one was the literary classical Malay of the Riau-Johore kingdom, which had its roots in the literary tradition of the earlier sultanate of Malacca (Sneddon, 2003; Prentice, 1978).

The existence of two standard varieties of Malay, namely Malaysian (called ‘Bahasa Melayu’ in Malaysia) and Indonesian (‘Bahasa Indonesia’), is mainly the result of an agreement reached between the British and the Dutch, who in 1824 drew new boundaries of their colonial territories. The mainland part of the Malay-speaking area became part of the British realm, and Sumatra together with the offshore islands became part of the Dutch realm. The treaty divided the former Riau-Johore Sultanate into two separate entities, with Johore belonging to the British and the Riau archipelago belonging to the Dutch. Because of this political demarcation, the influential Riau-Johore variant of Malay was now spoken in two distinct territories, which were to become Malaysia and Indonesia. Since this prestigious Riau-Johore court language played a major role in the formation of the standard languages of both countries, Malaysian and Indonesian remained closely related and are dialects of one and the same language.

The differences between the two are most obvious in the vocabulary. The phonological, morphological, and syntactic differences are few and not very significant. There are a considerable number of cases in which Malaysian borrowed an English word and Indonesian a Dutch word, e.g., tayar vs. ban ‘tire’ or fius vs. sekering ‘fuse.’ Other variations occur when one of the two national variants has borrowed a European word, while the other one is a retention or an innovation, e.g., Malaysian dulang ‘tray’ (retention) vs. Indonesian baki ‘tray’ (from Dutch bakje) or Malaysian panggung wayang ‘cinema’ (innovation) vs. Indonesian bioskop (from Dutch bioscoop). There are cases when both Malaysian and Indonesian share the same word but with minor phonetic variation, e.g., Malaysian kerusi, Indonesian kursi ‘chair’ (from Arabic kursī). In some instances the Malay word underwent different semantic changes, e.g., Malaysian pusing ‘turn, revolve’ has the meaning ‘dizzy’ in Indonesian. Furthermore, Malaysian has borrowed more from Arabic than Indonesian, while Indonesian has undergone considerable Javanese and Jakarta Malay influence.

In Indonesia, the establishment of Malay as the national language was not disputed; its choice was not regarded as favoring any one ethnic group, since ethnic Malays constituted no more than 10% of Indonesia’s population. Furthermore, various forms of Malay had long been established throughout the


Indonesian archipelago.

In Malaysia, the situation was different. When Malaysia became independent in 1957, Malay became the national language and one of the official languages (the other is English). Malay became the only language of education. Since Malay was more or less the exclusive property of the Malays, who made up about 50% of the population, the Chinese and Indian population of Malaysia felt at a disadvantage. A change of the language name from Bahasa Melayu to Bahasa Malaysia was one of the compromises made to reassure the non-Malay population. Later, however, the name Bahasa Melayu was reintroduced. In Malaysia, English still plays an important role and today competes with Malay as the language of instruction; in 1993, English became the language of instruction in universities. The Malaysian government argued that this was done in the interest of science and technology (Sneddon, 2003). Since 2003, secondary schools have taught mathematics and sciences in English. The introduction of English as language of education was based on the government’s observation that the knowledge of English among pupils and students had deteriorated dramatically. Many Malays are worried that setting Malay and English against each other will result in a new linguistic scenario and marginalize the original national language policy.

In 1984 Malay also became the national language of Brunei Darussalam in northern Borneo, where it is also called Bahasa Melayu. In this country it is the sole official language. The standard language is lexically much closer to Malaysian. In addition to Bahasa Melayu, the state of Brunei also has another Malay variant (Brunei Malay). This variant constitutes the main lingua franca in the coastal regions of Brunei (Nothofer, 1991). Brunei has an official bilingual education policy that preserves the status of Malay but recognizes the importance of English by making it the medium of instruction from the upper primary school onward in almost all subjects.

Malay is also the national language of Singapore and one of its four official languages, along with English, Mandarin Chinese, and Tamil. There Malay is a minority language, spoken by not more than 15% of the population. In southern Thailand, more than a million speakers use a Malay variant, Pattani Malay.

Cooperation between Malaysia and Indonesia resulted in the spelling reform of 1972, which removed the differences in the spelling of consonants, e.g., former Malaysian ch and Indonesian tj are now spelled c; former Malaysian j and Indonesian dj are now spelled j. The cultural pact between the countries was intensified in 1972 with the establishment of a council known as the Language Council for Indonesia and Malaysia (MBIM). Its main tasks are to create a common scientific terminology and cooperate closely on matters pertaining to language. In 1986, Brunei Darussalam officially joined as a member of the Council, which took the new name MABBIM.

Malay Phonology

The description of the Malay phonology shown here is that of Standard Malay (SM), as defined by Adelaar (1992: 3). The vowel phonemes of SM are shown in Table 1, and consonant phonemes are shown in Table 2. The consonant /r/ is realized as a velar or uvular fricative and elided word finally by speakers of the traditional Malay areas. It is an apical flap or trill outside these areas and in official Indonesian (Adelaar, 1992: 8).

Table 1 Vowel phonemes in standard Malay

        Front   Central   Back
High    i                 u
Mid     e       ə         o
Low             a

Diphthongs: -ay, -aw.

Table 2 Consonant phonemes in standard Malay

                   Labial   Dental   Alveolar   Palatal   Velar   Glottal
Voiceless stops    p        t                   c         k
Voiced stops       b                 d          j         g
Nasals             m                 n          ñ         ŋ
Fricatives                           s                            h
Liquids                              l                    r
Semivowels         w                            y

Malay Morphology

Malay prefixes include: ber- ‘stative, habitual’; meN- ‘active, agent focus’; di- ‘passive, patient focus’; memper-/diper- ‘causative’; ter- ‘accidental state, involuntary, agentless, sudden’; and peN-, per-, pe- ‘actor of the performance, instrument with which the


action is performed, someone having a quality as a characteristic.’ The common Malay suffixes are: -an ‘collectivity, similarity, object of an action, place where the action is performed, instrument with which the action is performed’; -kan ‘causative, benefactive’; and -i ‘locative, repetitive, exhaustive.’ Malay circumfixes include: ber- -an ‘diffuse action, plurality of subject’; ke- -an (verbal) ‘unintentional action or state, potential action’; ke- -an (nominal) ‘nouns referring to a quality, abstract nouns, collectivity’; and peN- -an, per- -an ‘abstract nouns, place where the action is performed, goal or result of action.’

See also: Austronesian Languages: Overview; Brunei Darussalam: Language Situation; Indonesia: Language Situation; Javanese; Malayo-Polynesian Languages; Malaysia: Language Situation.

Bibliography

Adelaar K A (1992). Pacific Linguistics C–119: Proto-Malayic: the reconstruction of its phonology and parts of its lexicon and morphology. Canberra: Australian National University.
Adelaar K A (1995). ‘Borneo as a cross-roads for comparative Austronesian linguistics.’ In Bellwood P, Fox J J & Tryon D (eds.) The Austronesians: historical and comparative perspectives. Canberra: Research School of Pacific and Asian Studies, Department of Anthropology, Australian National University. 75–95.
Adelaar K A (1996). ‘Malay: the national language of Malaysia.’ In Wurm S A, Mühlhäusler P & Tryon D T (eds.) Atlas of languages of intercultural communication in the Pacific, Asia, and the Americas. Berlin: Mouton de Gruyter. 673–693.
Collins J T (1998). Malay, world language: a short history. Kuala Lumpur: Dewan Bahasa dan Pustaka.
Grimes C E (1996). ‘Indonesian: the official language of a multilingual nation.’ In Wurm S A, Mühlhäusler P & Tryon D T (eds.) Atlas of languages of intercultural communication in the Pacific, Asia, and the Americas. Berlin: Mouton de Gruyter. 719–727.
Martin P W, Ozog C & Poedjosoedarmo G (eds.) (1996). Language use and language change in Brunei Darussalam. Athens: Ohio University Center for International Studies.
Moeliono A & Grimes C E (1995). ‘Indonesian (Malay).’ In Tryon D T (ed.) Comparative Austronesian dictionary: an introduction to Austronesian studies 1. Berlin: Mouton de Gruyter. 443–457.
Nothofer B (1991). ‘The languages of Brunei Darussalam.’ In Steinhauer H (ed.) Papers in Pacific linguistics, A–81. Canberra: Australian National University. 151–176.
Nothofer B (1997). Dialek Melayu Bangka. Bangi: Penerbit Universiti Kebangsaan Malaysia.
Omar A H (1978). ‘The use of the Malaysian national language in a multilingual society.’ In Udin S (ed.) Spectrum: essays presented to Sutan Takdir Alisjahbana on his seventieth birthday. Jakarta: Dian Rakyat.
Prentice D J (1978). ‘The best chosen language.’ Hemisphere 22(3), 18–23; 22(4), 28–33.
Prentice D J (1990). ‘Malay (Indonesian and Malaysian).’ In Comrie B (ed.) The major languages of East and Southeast Asia, 2nd edn. London: Routledge. 185–207.
Sneddon J N (1996). Indonesian: a comprehensive grammar. Sydney: Allen & Unwin.
Sneddon J (2003). The Indonesian language: its history and role in modern society. Sydney: University of New South Wales.
Steinhauer H (1980). ‘On the history of Indonesian.’ In Barentsen A A, Groen B M & Sprenger R (eds.) Studies in Slavic and general linguistics, vol. 1. Amsterdam: Rodopi. 349–375.
Teeuw A (1959). ‘The history of the Malay language.’ Bijdragen tot de Taal-, Land- en Volkenkunde 115, 138–156.
Vikør L (1983). ‘Language policy and language planning in Indonesia and Malaysia.’ In Svensson T & Sørensen P (eds.) Indonesia and Malaysia. Scandinavian studies in contemporary society. London: Curzon. 47–74.

Malayalam
B Gopinathan Nair, St Xavier’s College, Thiruvananthapuram, Kerala State, India
© 2006 Elsevier Ltd. All rights reserved.

Language and Speakers

Malayalam, a major literary language of South India with long traditions of literature and scripts, is the main language of the state of Kerala and of the islands of Lakshadweep, which are 200–400 km off the southwest coast of India. Malayalis have migrated to different parts of India and overseas, especially to Malaysia, Singapore, the United States, Canada, the United Kingdom, and Australia. The number of Malayalam speakers in India is 31.83 million. In Kerala, 96% of the total population is composed of the religious majority (comprising Hindus, 58.1%) and the religious minorities (Muslims, 21.3%; Christians, 20.6%); these groups mostly speak Malayalam. Linguistic minorities comprise 5.2% of the population. Kerala has the highest literacy rate in India (90.6% of the population). The number of dailies and periodicals in Malayalam in 2000 was 1505 (according to the Manorama year book in 2004).


Etymology and Variant Names

Malayāḷam is a combination of mala ‘mountain’ with any of the following terms: aḷam ‘the place,’ denoting ‘the mountain country’; āłam ‘depth,’ representing ‘the land that lies between the mountain and the deep ocean’; or āḷ ‘man,’ meaning ‘mountain dweller.’ The last term may convey the original meaning of Malayāḷam, denoting both ‘the people,’ depicted by word forms such as malayāḷar, malayāḷi, and malanāṭṭukāran, and the region or country, as in the term malanāṭu. Early variants include malayāłma, malayāyma, and malayāṇma. Malayāḷam may be a later variant. Lilatilakam, a famous 14th-century work on the grammar and language of Malayalam, mentions only kēraḷabhāṣa to denote the language.

Development of Literature

Malayalam flourished in Kerala amidst continuous contact and convergence with Sanskrit, Prakrit, and Pali, borrowing lexical items profusely from these languages in addition to incorporating loans from Arabic, Persian, Urdu, Syriac, Portuguese, Dutch, Hindi, and English. The early development of Malayalam was considerably influenced by Sanskrit, the language of scholarship, and Tamil, the language of administration; eventually, Malayalam evolved in written documents and literature. Contact with the Brahmins had a profound impact, bringing several Indo-Aryan features into Malayalam. Malayalam has a recorded literary history of over eight centuries; the earliest document, the Vałappaḷḷi inscription of Rajasekhara, dates to the 9th century.

The early literature developed through three different traditions: (1) the Tamil tradition of pāṭṭu, the classical songs depicted in the first literary work, Ramacaritam, (2) the Sanskrit tradition of maṇipravāla, a literary innovation portraying a harmonious blend of bhāṣa and Samskṛita (i.e., the native language and Sanskrit – for instance, Vaiśikatantram), and (3) the native tradition of producing folk songs and ballads predominantly concerning indigenous elements. Bhasakautiliyam is the earliest prose written in simple language. All three traditions belong to the 12th century. Modern Malayalam literature is rich in fiction, poetry, prose, drama, short stories, biographies, and literary criticism.

Writing System

The early Malayalam writing system evolved from Vaṭṭeluttu, traceable to the Pan-Indian Brahmi script; this system continued for a long period, eventually adding symbols from Grantha script to represent Indo-Aryan loans. The writing is based on the concept of the akṣara ‘graphic syllable,’ wherein the graphic elements have to be read as units, although the individual vowels and consonants are easily recognizable. The script reform implemented in the 1970s reduced the number of less frequent conjunct consonants and of combinations of the vowel u with different consonants, to make a simpler writing scheme. The orthography is largely phonemic, with a separate letter for each phoneme (with a few exceptions). The dental and alveolar single nasals (n and ṉ) are written with the same letter, as are their long counterparts. Several letters are written in a clockwise direction. In a few cases, the direction is clockwise plus anticlockwise or vice versa within a single letter. Malayalam letters bear simple to complex allographic representations. Geminated consonants and heteroelemental consonant clusters are marked differently by writing the consonants side by side, or one above the other. Additionally, there are other combinations of consonants that seldom follow regular patterns in graphemic depiction. The six consonants m, n, ṇ, r, l, and ḷ have separate symbols in word-final position.

Grammatical Tradition

Malayalam grammatical tradition commenced with the 14th-century Lilatilakam. The European contributions in the early 18th century were of great importance, especially Hermann Gundert’s Malayāḷabhāṣāvyākaraṇam (1851, 1868). The 19th century saw the publication of grammatical treatises by a few native scholars, viz., George Mathan (Malayālmayuṭe vyākaraṇam), Kovunni Nedungadi (Kēraḷakoumudi; 1878), and some others, but the most widely used work was that of A. R. RajaRaja Varma, the Kēraḷa pāṇinīyam (1896). This was followed by L. V. Ramaswamy Iyer’s profound contribution to various aspects of Malayalam linguistics in 1925. The past four decades have witnessed the production of considerable work based on modern linguistic theories and descriptive techniques applied both to various written texts belonging to different centuries, ranging from Ramacaritam to the 16th-century Adhyatma Ramayanam, and to regional, caste, communal, and tribal dialects, with the ultimate goal of preparing a historical grammar for Malayalam that is still a desideratum (much of this work can be found in the Ph.D. dissertations by scholars in the Department of Linguistics, University of Kerala).


Dialect Variation

Malayalam dialect variations are discernible at the phonetic, phonological, grammatical, semantic, and lexical levels and in intonation patterns, along the parameters of caste, community, region, social stratum, education, occupation, style, and register. The speech forms of Travancore, Cochin, South and North Malabar, and the Lakshadweep islands show considerable differences. Among the 48 tribal languages in the hilly tracts of Kerala, many are dialects of Malayalam and a few belong to one or another of the South Dravidian languages.

The first systematic dialect survey of Malayalam was based on a single speech community, the Ezhava/Tiyas groups living throughout Kerala; the survey was completed in 1968, demarcating 12 major dialect areas (Subramoniam, 1974). This was followed by the Nair and Harijan dialect surveys. About 600 dialect maps were prepared concerning the Ezhava/Tiyas and Nair castes, along with frequency charts of the variants, showing differences with regard to 300 diagnostic lexical items. Copies of these maps are preserved in the collections of the Department of Linguistics, University of Kerala, and the International School of Dravidian Linguistics, Thiruvananthapuram, Kerala.

Among several dialect variations, the occurrence of y in place of ł is commonplace, as in pałam > payam ‘fruit,’ but the occurrence of t (ketakku ‘east’) is a rare feature found in the northern part of Kasargod. Initial v/b alternation, as in v/barin ‘come’ and v/bāppa ‘father,’ is a distinct feature of Muslim speech throughout Kerala and Lakshadweep, but in Cannanore district this change is found in the speech of other castes, demonstrating the overlap of caste, communal, and regional traits. Word-final n/m alternation is a feature of the Muslim dialect of Ernad and Lakshadweep, as in nēran/nēram ‘time.’ The present tense markers -anRa and -u a, as in kottanRa ‘chops’ and i u a ‘places,’ are a peculiarity of the PaRaya speech of Kasargod; -a a, as in baya a ‘comes,’ is found in the Muslim dialect of Lakshadweep.

The literary dialect is almost uniform. The language that is used in newspapers, in mass media, and in formal situations, which is largely understood by the majority of the people irrespective of caste, community, and region, is considered to be the standard variety. A standard colloquial is slowly evolving.

Genetic Affiliation

Malayalam shows affinity to Tamil, Kota, Toda, Irula, Badaga, Kodagu, Kannada, and Tulu, all of which belong to the South Dravidian branch of the Dravidian family. However, the affinity with Tamil is greater, since Malayalam emerged from Proto-Tamil–Malayalam; divergence occurred over a period of four or five centuries, from the 8th century onward, and Malayalam became established as a distinct language, separate from Tamil. Three distinctive features of Proto-Tamil–Malayalam include (1) k- > c- before front vowels, whether followed by a retroflex or not, (2) *e, *o > i, u before a derivative suffix beginning with a, and (3) the presence of the accusative suffix -ai. The features that distinguish Malayalam from Tamil are (1) progressive assimilation of nasal + stop > nasal + nasal except in retroflexes and labials, (2) loss of person–number–gender in finite verbs, (3) negative periphrastic construction with illa, and (4) prohibitive construction with infinitive + arutu.

Characteristic Features

Phonology

Five vowels with length contrast, i, e, a, o, and u, occur in the literary and spoken dialects. Two diphthongs, ai and au, occur in the literary language. The onglides y and v occur word initially before front and back vowels, respectively, in pronunciation. Vowels occur in all positions, except that short o does not occur word finally. The u is pronounced as a high back rounded vowel when it occurs initially, medially without length in the first syllable, and with length finally, whereas it is pronounced as a lower high back unrounded vowel (samvrutookaaram) medially except in the first syllable and without length word finally. Single voiceless stops (except in clusters with homorganic nasals) are pronounced with voicing (with or without slight fricativization) when occurring intervocalically; voiceless stops preceded by homorganic nasals are pronounced with slight voicing. Generally, aspirated plosives lose aspiration in pronunciation. Only six consonants, m, n, ṇ, r, l, and ḷ, can occur word finally; n occurs medially either with length or in clusters. All other consonants can occur word initially and medially. Voiceless stops occur medially with length and in clusters, except with homorganic nasals. For consonants, length contrast occurs only medially; r, s, ś, ṣ, h, and ł do not geminate.

Morphophonemics

The sandhi rules (a systematic blend of words; Sanskrit sandhi ‘to join’) fall under two categories, internal and external, the former operating within a word and the latter operating between words; the rules may operate in either category or in both categories (sandhis). For example, for a vowel V, the rule V1 + V2(V2) > V2(V2) operates only in close juncture: āyi + illa > āyilla ‘did not become.’ The following examples show other sandhi blends:

i/e/a + V(V) > i/e/a + y + V(V) (internal and external)
    kuṭṭi + uṭe > kuṭṭiyuṭe ‘child of’
    nalla + āḷu > nallayāḷu ‘good person’
    tala + alla > talayalla ‘head not’

u + V(V) > u + v + V(V) (internal and external)
    rāmu + um > rāmuvum ‘Ramu also’
    kuru + āyi > kuruvāyi ‘seed become’

l/… + n > ṇṇ (external)
    toḷ + nūRu > toṇṇūRu ‘ninety’
    kaṇ + nīru > kaṇṇīru ‘tears’

m/n/ṇ + STOP > homorganic nasal + stop
    cem + tāmara > centāmara ‘lotus’
    pin + tuṇa > pintuṇa ‘support’
    peṇ + kuṭṭi > peṇkuṭṭi ‘girl’
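The vowel sandhi patterns above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `join_sandhi` and the `close_juncture` flag are hypothetical, forms are treated as plain strings of transliterated letters, and only the vowel rules are covered (elision of the first vowel in close juncture, and insertion of y after i/e/a or v after u). The consonant rules are left out.

```python
import unicodedata

# Transliterated vowels, short and long (an assumption about the spelling used).
VOWELS = set("aeiouāēīōū")


def join_sandhi(first: str, second: str, close_juncture: bool = False) -> str:
    """Join two transliterated forms using the vowel sandhi patterns.

    Hypothetical helper: covers only vowel + vowel junctures.
    """
    # Normalize so that letters with macrons are single code points.
    first = unicodedata.normalize("NFC", first)
    second = unicodedata.normalize("NFC", second)
    if not first or not second:
        return first + second
    if first[-1] in VOWELS and second[0] in VOWELS:
        if close_juncture:
            # V1 + V2 > V2: the first vowel is dropped.
            return first[:-1] + second
        if first[-1] in "iea":
            # i/e/a + V > i/e/a + y + V
            return first + "y" + second
        if first[-1] == "u":
            # u + V > u + v + V
            return first + "v" + second
    # Anything else is simply concatenated in this sketch.
    return first + second


if __name__ == "__main__":
    print(join_sandhi("tala", "alla"))
    print(join_sandhi("rāmu", "um"))
    print(join_sandhi("kuru", "āyi"))
    print(join_sandhi("āyi", "illa", close_juncture=True))
```

Whether a given juncture counts as close is a lexical and grammatical matter that the sketch does not try to predict, so the caller has to supply it.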

Morphology

Noun stems fall under three categories, viz., personal pronouns (first person, inclusive and exclusive), second person, and reflexives. Demonstrative base + gender–number marker constitutes third-person pronouns, as in av-an ‘he,’ av-aḷ ‘she,’ and av-ar ‘they.’ Numerals consist of adjectival and case bases. Number markers are -n (gender singular), as in ñān ‘I’; -m (gender plural), as in nammal ‘we (inclusive)’; and -tu (nongender-neutral singular), as in atu ‘that.’ Examples of gender markers are -n, -ān, and -an (masculine) and -ḷ, -atti, and -ā i (feminine). Plural suffixes are -aḷ and -kaḷ. Nominative case has no marker; accusative uses -e and -a, dative uses -u and -kku, and instrumental uses -aal (in literary Malayalam; in dialects, postposition kontu ‘with’ is used). Sociative case uses -ōṭu and locative case uses -il.

Verbs do not distinguish person–number–gender. Both finite and nonfinite verbal forms consist of a verb stem followed by verbal suffixes, which take (or can take) tense markers. A few verbs do not take tense but can take negative markers (illa ‘no,’ alla ‘not,’ and arutu ‘do not’). Verbs fall into two groups, intransitive and transitive. Some of the former are transitivized morphologically in three ways: (1) by suffixing the markers -tt- and -kk- to the intransitive verb stem (iru ‘to sit,’ iru-tt- ‘to make to sit’; oṭi ‘to break,’ oṭi-kk- ‘to make to break’), (2) by geminating the stem-final stops (aak- ‘to become,’ ākk- ‘to make to become’; aaṭ- ‘to swing,’ aaṭṭ- ‘to make to swing’; kēR- ‘to climb,’ kēRR- ‘to make to climb’), and (3) by changing stem-final nasal + nasal to homorganic stop + homorganic stop (uRaṅṅ- ‘to sleep,’ uRakk- ‘to make to sleep’). Two causative markers, -i- and -ppi-, can occur simultaneously within a verb, as in paRay-i-ccu ‘caused to say,’ paRay-i-ppi-ccu ‘to cause to say.’

Three-way distinctions in tense occur, i.e., present, future, and past. Examples are -unnu (present tense) and -um (future tense), as in var-unnu ‘comes’ and var-um ‘will come.’ All vowel-ending stems take link morph -kk- before present and future markers, as in paṭhi-kk-unnu ‘learns’ and paṭhi-kk-um ‘will learn.’ The verb stem vēṇ- is peculiar in that it takes the future tense marker -am (vēṇ-am ‘will need’) but does not take either the present or the past tense suffix. There are several past tense markers: the vowel ending -i (pāṭi ‘sang’) and nasals -nn-, -ññ-, -ṇ-, and -nt- (iru-nnu ‘sat,’ kara-ññu ‘wept,’ tā-ṇu ‘drowned,’ no-ntu ‘pained,’ and ve-ntu ‘boiled’; only these last two verbs take the past tense -nt-). Stops are -t-, -ṭ-, -R-, and -c- (eṭu-ttu ‘took,’ kaṇ-ṭu ‘saw,’ pe-RRu ‘delivered,’ and aṭi-ccu ‘beat’).

Negative suffixes are -ātt- before the relative participle marker -a (var-ātt-a ‘that which did not come’), -āt- before the verbal participle marker -e (var-āt-e ‘having not come’), and -a, which freely varies with -ā (vēṇ-ṭa(a) ‘not needed’); -ān denotes the purposive infinitive (paRay-ān ‘saying’). The vowel-ending stems can be used as imperatives (paRa ‘(you) tell,’ nōkku ‘(you) look’), but these are less polite in speech than forms with -ū (the more polite forms are paRay-ū and nōkk-ū). The optative marker is -aṭṭe, as in var-aṭṭe ‘let (him) come.’

Syntax

Three major types of sentences, simple, complex, and compound, can be discerned. A simple sentence consists of the subject noun and predicate verb, as in raaman varunnu ‘Raman comes.’ A nominal sentence in which both the subject and the predicate are nouns is seen in atu maram ‘that is tree.’ The finite verb aanu is optional. Malayalam word order is not rigid, but subject–object–verb is the usual order. A noun or noun phrase can be the subject in a sentence. A noun phrase (NP) is expandable by modifiers, the structure of which is ± possessive ± demonstrative ± numeral Adj ± Adj + NP, as in enRe ā oru nalla peena ‘my that one good pen,’ i.e., ‘that good pen of mine.’ A noun phrase can be expanded by a relative participle, nouns, case base, and clitics, as in ceyta kāryam ‘thing done,’ kaykāryam ‘handling the affairs,’ tankaaryam ‘one’s own affair,’ kāṭṭukōłi ‘jungle fowl,’ and piRRe divasam ‘next day.’

Nouns/noun phrases can form the direct or indirect object. If the direct object is an animate noun, the accusative case suffix -e is added; if the direct object is an inanimate noun, the case suffix is dropped. The indirect object takes the dative case, as in avaḷ sītaykku (indirect object) oru pūccaye (direct object) koṭuttu ‘she gave a cat to Sita.’ Verbs/verb phrases are expanded by verbal participles, auxiliary verbs, or adverbial clitics, as in talayil cumaṭu veccukoṭuttu ‘placed a bundle on the head’ and patukke pooyi ‘slowly gone.’

Interrogative sentences can be formed by adding the interrogative clitic, which yields ‘yes’ or ‘no’ types of answers (avaḷ sītayāṇō ‘is she Sita?’). This clitic can also denote doubt. After the defective verbs illa and alla, the interrogative particle -ee is added (illee, allē ‘is it not’). Interrogative words such as aaru ‘who,’ eetu ‘which,’ and entu ‘what’ can be added to form sentences, as in āru paRaññu ‘who told,’ ētu kāryam ‘which subject,’ and entu vēṇam ‘what (do you) want.’

Negative sentences are formed either by negativizing the verb phrase by using morphological negative markers, or by negation of the sentence or verb phrase by using defective verbs such as illa and alla (pōyi ‘went,’ pōyilla ‘did not go,’ itu pēna āṇu ‘this is pen,’ itu pēna alla ‘this is not pen,’ āhāram uṇṭu ‘(there) is food,’ āhāramilla ‘(there) is no food’).

See also: Dravidian Languages; Kannada; Tamil; Telugu.

Bibliography

Andrewskutty A P (1971). Malayalam, an intensive course. Trivandrum: Dravidian Linguistics Association.
Asher R E & Kumari T C (1996). Malayalam. London and New York: Routledge.
Gopinathan Nair B (1971). ‘Caste dialects of Malayalam.’ In Proceedings of the First All India Conference of Dravidian Linguists. Trivandrum: Dravidian Linguistics Association. 409–414.
Panikkar G K (1973). Description of the Ernad dialect of Malayalam. Trivandrum: Dravidian Linguistics Association.
Somasekharan Nair P (1979). Cochin dialect of Malayalam. Trivandrum: Dravidian Linguistics Association.
Subramoniam V I (1974). Dialect survey of Malayalam. Ezhava/Tiya. Trivandrum: University of Kerala.

Malayo–Polynesian Languages
M Ross, The Australian National University, Canberra, Australia
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The term ‘Malayo–Polynesian’ today denotes the largest of the ten putative primary subgroups of the Austronesian language family (Blust, 1999). Malayo–Polynesian (MP) embraces perhaps 1100 languages, while the other nine groups consist only of the surviving fourteen Formosan languages of Taiwan (see Formosan Languages). Until Father Wilhelm Schmidt invented the term ‘Austronesian’ in 1899, however, ‘Malayo–Polynesian’ denoted the whole Austronesian language family. Its German equivalent, malayisch-polynesisch, was first used in print by Franz Bopp in 1841 (it is often wrongly attributed to Wilhelm von Humboldt, but is not found in his writings). ‘Malayo–Polynesian’ was in use with this meaning in English by the 1870s, but who first used it to refer to the language family is unclear (Ross, 1996). It continued to be used as a synonym for Austronesian until the 1970s, and is occasionally still used in this sense today.

In 1977 Robert Blust showed that the primary division of Austronesian was into several subgroups of languages spoken in Taiwan and a single subgroup which he labeled ‘Malayo–Polynesian’ and which includes all the Austronesian languages spoken outside Taiwan: the Austronesian languages of the Philippines, Southeast Asia, Madagascar, the Indo-Malaysian archipelago, New Guinea, Island Melanesia, Micronesia, and Polynesia. This is the sense of ‘Malayo–Polynesian’ in the remainder of this article. Some scholars prefer the term ‘Extra-Formosan’ in its place. Clearly the potential for overlap between this and a discussion of Austronesian languages is great, and the reader is referred to Austronesian Languages: Overview for further information.

The Integrity of the Malayo–Polynesian Subgroup

How do we know that all Austronesian languages outside Taiwan belong to a single subgroup? To determine a family tree we first compare the languages of the family and reconstruct the protolanguage from which they are descended, in this case Proto Austronesian (PAn). Then we identify subgroups of languages whose members share a set of innovations relative to PAn. We infer that the innovations are shared because they have been inherited from a single interstage language. This is far more probable than


the alternative assumption – that the innovations have occurred independently in each language that reflects them.

All Austronesian languages outside Taiwan reflect certain phonological innovations relative to PAn, and we infer that they occurred in a single interstage language which Blust named Proto Malayo-Polynesian (PMP). These innovations are enumerated here (from Blust, 1990) with minimal discussion and examples.

A. PAn *t and *C merged as PMP *t.
B. PAn *L and *n merged (with some unexplained exceptions) as PMP *n.
C. PAn *S became a glottal spirant of some kind, but did not merge with *h.

Innovation A is illustrated below, where PAn *t and PAn *C (which remain separate in the Formosan language Rukai) are merged in MP languages, exemplified by Itbayat, a language of the Batanes islands between Taiwan and Luzon:

PAn *tuLa ‘freshwater eel’ (Rukai tola) > PMP *tuna (Itbayat tuna)
PAn *pitu ‘seven’ (Rukai pito) > PMP *pitu (Itbayat pitu)
PAn *Calina ‘ear’ (Rukai tsaUina) > PMP *talina (Itbayat talinDa)
PAn *maCa ‘eye’ (Rukai matsa) > PMP *mata (Itbayat mata)

In innovation B, PAn *L and *n merged as PMP *n:

PAn *qaLup ‘hunt’ (Rukai alopo) > PMP *qanup (Itbayat anup)
PAn *wanan ‘right (hand)’ (Rukai vanan) > PMP *wanan (Itbayat wanan)

Innovation C is reflected in PAn *duSa ‘two’ (Rukai dosa) > PMP *duha (Itbayat duha).
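Innovations A to C can be read as segment-by-segment replacements, which the short Python sketch below applies to the PAn forms quoted above. It is a rough illustration, not a model of the reconstruction itself: the names `PAN_TO_PMP` and `pan_to_pmp` are hypothetical, forms are written without the asterisk, and *S is simply rewritten as h (as in the example *duha), which glosses over the point that *S and *h remained distinct in PMP.

```python
# PAn segments affected by innovations A-C and their PMP outcomes
# (an illustrative table; one character stands for one segment).
PAN_TO_PMP = {
    "C": "t",  # A: *C merges with *t
    "L": "n",  # B: *L merges with *n
    "S": "h",  # C: *S becomes a glottal spirant, written h in the examples
}


def pan_to_pmp(form: str) -> str:
    """Rewrite a PAn form segment by segment; unaffected segments pass through."""
    return "".join(PAN_TO_PMP.get(segment, segment) for segment in form)


# PAn forms and the PMP forms given for them in the text.
EXAMPLES = {
    "tuLa": "tuna",
    "pitu": "pitu",
    "Calina": "talina",
    "maCa": "mata",
    "qaLup": "qanup",
    "wanan": "wanan",
    "duSa": "duha",
}

if __name__ == "__main__":
    for pan, pmp in EXAMPLES.items():
        print(f"*{pan} > *{pan_to_pmp(pan)} (text gives *{pmp})")
```

Because *t and *n are left untouched while *C and *L are rewritten to them, the output no longer shows which source segment a PMP *t or *n came from. That loss of information is what makes these changes mergers.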

A major set of innovations in pronouns involved a ‘politeness shift’ (Blust, 1977). Just as the English plural pronoun you, used as a polite form of address, eventually displaced singular thou, so PMP underwent a set of changes in pronouns which were also related to politeness (for details see Ross, 2002: 51). No MP language reflects forms that predate the shift. PMP added to the verbal system the prefixes *paN- (distributive), *paR- (durative, reciprocal) and *paka- (aptative, potential) (Ross, 2002: 49–50). These are widely reflected in the languages of the Philippines and the western part of the Indo-Malaysian archipelago, and are preserved in fossilized form in many languages elsewhere in the MP subgroup.

History and Subgrouping

How did it come about that all Austronesian languages outside Taiwan belong to a single subgroup while perhaps nine coordinate groups are represented in Taiwan itself? The obvious answer is that PAn was spoken in Taiwan and diversified into a group of languages there. Speakers of one of these languages left Taiwan, presumably for the northern Philippines. Their language underwent the innovations noted above, becoming the language we call PMP. Archaeological dating suggests that the culture that spoke PAn flourished in Taiwan around 3000 B.C. and that the migration to the Batanes islands or Luzon which led to the genesis of PMP occurred around 2000 B.C. The descendants of PMP speakers evidently spread, mostly south and then eastward, at an astonishing speed, colonizing the Philippines, the Indo-Malaysian archipelago, parts of coastal New Guinea, and the Bismarck Archipelago in the northwest of Island Melanesia within about 500 years, by 1500 B.C.

This history is reflected in the tree diagram of the Austronesian family (see Figure 1). This shows some 20 to 25 groups of western MP languages, spoken in the Philippines and the western part of the Indo-Malaysian archipelago, with outliers on Hainan, in the Vietnamese highlands, on the islands along the western coast of Thailand and Myanmar, and on Madagascar (see Figure 2). The migration of MP speakers to Madagascar was a much later event (Adelaar, 1991). Adelaar (2004) provides a listing of western MP groups which reflects current understanding. Although there are frequent references in the literature to ‘Western Malayo–Polynesian,’ there was never a ‘Proto Western MP,’ as western MP languages as a whole share no innovations. The similarities among western MP languages, such as they are, reflect shared retentions from PMP. The fact that the tree shows so many coordinate branches reflects the rapidity with which MP speakers occupied the region.
Figure 1 Austronesian family tree.

They were agricultural people – rice growers – and probably encountered little significant opposition from the small populations of hunter-gatherers who had previously occupied these territories. There is reasonable evidence in the form of shared innovations that all MP languages in the regions labeled on the map as Central Malayo–Polynesian (CMP), South Halmahera/West New Guinea (SHWNG) and Oceanic are descended from a single language, shown on Figure 1 as Proto Central/Eastern Malayo-Polynesian (PCEMP) (Blust, 1993). However, the set of innovations that defines this grouping is not nearly as substantial as the set defining MP (see above), and we must infer that the period for which

PCEMP speakers remained an integrated speech community was short. The conventionally accepted family tree of Austronesian languages (originally proposed by Blust, 1977) gives PCEMP two daughters, ‘Proto CMP’ and ‘Proto Eastern MP.’ The status of both is doubtful. There is agreement among scholars today that PCEMP diversified, apparently rapidly, into a dialect network, and that the Eastern Malayo–Polynesian languages broke away from that network, probably as the dialects of the network were achieving the status of separate languages. There is no significant evidence, however, that there was ever a discrete Proto CMP (for more details, see Ross, 1995). The existence of Proto Eastern MP is also questionable, and it is possible that it was simply a peripheral section of the Central/Eastern MP dialect network (Ross, 1995; Adelaar, 2004). However, there is much less doubt about Proto SHWNG, the ancestor of a small group of languages in the south of Halmahera and scattered around the Bird’s Head Peninsula of New Guinea, and no doubt at all about Proto Oceanic. Proto Oceanic was the ancestor of the MP languages of New Guinea other than those belonging to SHWNG (see map) and of all the MP languages of Island Melanesia, Polynesia, and Micronesia other than Chamorro (Guam) and Palauan (Belau). The Oceanic languages share a striking set of innovations, larger than the set for Proto MP itself. These innovations were first recognised by Dempwolff (1937) in Volume 2 of his pioneering work on Austronesian (see Austronesian Languages: Overview) and have undergone various modifications since as the result of further research (Lynch et al., 2002: 63–67). The history represented by the varying strengths of the nodes in the Austronesian family tree diagram

(Figure 1) shows a period of relative stability during which Proto MP developed from the speech of those who emigrated from Taiwan, followed by 500 years of extraordinary settlement activity which culminated in the arrival of MP speakers in the Bismarck Archipelago. Here there seem to have been a few more centuries of relative stability, during which Proto Oceanic developed into a language that was at least phonologically and lexically rather different from its sisters around the Bird’s Head. Why the apparent halt in settlement activity? There were perhaps two reasons. First, New Guinea was already inhabited by Papuan-speaking agriculturalists (see Papuan Languages) with much greater population densities than their hunter-gatherer neighbors to the west, and there was little space for the newcomers. Second, there probably was continued settlement activity during the development of Proto Oceanic, but no further than the Solomon Islands, to the east of which there is a substantial sea gap (Pawley, 1981).

There is agreement among many linguists and archaeologists (but not all) working in Island Melanesia that Proto Oceanic was the language of the Lapita Culture, a group that produced distinctive pottery and exploded eastwards into the Pacific from about 1300 B.C. Island Melanesia (the Bismarck Archipelago, Solomon Islands, Vanuatu, New Caledonia, and Fiji), Tonga, and Samoa were all settled within a few hundred years (Kirch, 1997). A linguistic puzzle in this story is that Proto Polynesian, the Oceanic language ancestral to Tongan, Samoan, and the 40 or so languages of scattered Polynesian communities, is structurally rather different from other Oceanic languages, yet there is no obvious hiatus in the archaeological record during which these differences might have developed. It is reasonably certain, however, that Proto Polynesian or an immediate ancestor developed in the northeastern islands of Fiji (Geraghty, 1983).

Figure 2 The Austronesian family and major Malayo–Polynesian language groups.

The Structures of Malayo–Polynesian Languages

MP languages show an extraordinary structural diversity. General accounts can be found in Lynch et al. (2002: 34–53) for Oceanic languages and Himmelmann (2004) for other MP languages. The languages of the Philippines and parts of northern Borneo, northern Sulawesi, and Madagascar largely retain the structure of PAn (see Austronesian Languages: Overview; Formosan Languages). The western MP languages of Vietnam, Hainan, and the Thailand/Myanmar islands show the influence of Mon-Khmer languages. In the western MP languages of Malaysia and western Indonesia we find a complex set of developments in which the PAn voice system is much reduced but applicatives take over much of its functional load (Ross, 2002). The CMP, SHWNG, and Oceanic languages (other than Polynesian) have a broad typological similarity, but with many variations. Most of these languages have lost all trace of voice and have subject-referencing verbal prefixes or proclitics. How this system arose from systems reflected in western MP languages is traced by Lynch et al. (2002: 57–63). Klamer (2002) describes CMP language structures, Ross (2004) those of non-Polynesian Oceanic languages.

Further References

The MP family is vast, and the serious enquirer will need to look beyond this article. Detailed maps of the locations of MP languages and their dialects are found in Wurm and Hattori (1981–1983), although this atlas is not a reliable source for subgrouping. Adelaar and Himmelmann (2004) is the major reference for MP languages other than Oceanic, Lynch et al. (2002) for Oceanic languages. Both works also include a large collection of grammar sketches of a sample of languages. Tryon (1995) is an extensive comparative lexicon. Associated with the historical study of MP languages, especially of Oceanic, is a solid body of work on culture history. Ross et al. (1998; 2003) are the first two of five volumes in which the terminologies used by Proto Oceanic speakers are reconstructed, following more piecemeal work on the lexicons of MP languages in general (Pawley and Ross, 1994) and work on MP culture history by scholars in various disciplines (Bellwood et al., 1995). Pawley and Ross (1993) is a short survey mostly of MP historical linguistics and cultural history.

See also: Austronesian Languages: Overview; Formosan Languages; Malaysia: Language Situation; Papuan Languages.

Bibliography

Adelaar K A (1991). ‘New ideas on the early history of Malagasy.’ In Steinhauer H (ed.) Papers in Austronesian linguistics no. 1. Canberra: Pacific Linguistics. 1–22.
Adelaar K A (2004). ‘The Austronesian languages of Asia and Madagascar: a historical perspective.’ In Adelaar & Himmelmann (eds.) 1–42.
Adelaar K A & Himmelmann N (eds.) (2004). The Austronesian languages of Asia and Madagascar. London: Routledge.
Bellwood P S, Fox J J & Tryon D (eds.) (1995). The Austronesians: historical and comparative perspectives. Canberra: Department of Anthropology, Research School of Pacific and Asian Studies, the Australian National University.
Blust R A (1977). ‘The Proto-Austronesian pronouns and Austronesian subgrouping: a preliminary report.’ University of Hawaii Working Papers in Linguistics 9(2), 1–15.
Blust R A (1993). ‘Central and Central-Eastern Malayo–Polynesian.’ Oceanic Linguistics 32, 241–293.
Blust R A (1999). ‘Subgrouping, circularity, and extinction: some issues in Austronesian comparative linguistics.’ In Zeitoun E & Jen-kuei Li P (eds.) Selected Papers from the Eighth International Conference on Austronesian Linguistics. Taipei: Academia Sinica. 31–94.
Dempwolff O (1937). Vergleichende Lautlehre des Austronesischen Wortschatzes. (Vol. 2). Deduktive Anwendung des Urindonesischen auf Austronesische Einzelsprachen. Berlin: Dietrich Reimer. Beihefte zur Zeitschrift für Eingeborenen-Sprachen 17.
Geraghty P (1983). The history of the Fijian languages. Honolulu: University of Hawaii Press.
Himmelmann N (2004). ‘The Austronesian languages of Asia and Madagascar: typological characteristics.’ In Adelaar & Himmelmann (eds.) 110–181.
Kirch P V (1997). The Lapita peoples: ancestors of the Oceanic world. Oxford: Blackwell.
Klamer M A F (2002). ‘Typical features of Austronesian languages in Central/Eastern Indonesia.’ Oceanic Linguistics 41, 363–383.
Lynch J, Ross M & Crowley T (2002). The Oceanic languages. Richmond, Surrey: Curzon.
Pawley A K (1981). ‘Melanesian diversity and Polynesian homogeneity: a unified explanation for language.’ In Hollyman J & Pawley A (eds.) Studies in Pacific languages and cultures in honour of Bruce Biggs. Auckland: Linguistic Society of New Zealand. 269–309.
Pawley A & Ross M D (1993). ‘Austronesian historical linguistics and culture history.’ Annual Review of Anthropology 22, 425–459.
Pawley A & Ross M D (eds.) (1994). Austronesian terminologies: continuity and change. Canberra: Pacific Linguistics.
Ross M D (1995). ‘Some current issues in Austronesian linguistics.’ In Tryon (ed.) 1, 45–120.
Ross M (compiler) (1996). ‘On the origin of the term ‘‘Malayo–Polynesian.’’’ Oceanic Linguistics 35, 143–145.
Ross M (2002). ‘The history and transitivity of western Austronesian voice and voice-marking.’ In Wouk F & Ross M (eds.) The history and typology of western Austronesian voice systems. Canberra: Pacific Linguistics. 17–62.
Ross M (2004). ‘The morphosyntactic typology of Oceanic languages.’ Language and Linguistics 5, 491–541.
Ross M, Pawley A & Osmond M (eds.) (1998). The lexicon of Proto Oceanic: the culture and environment of ancestral Oceanic society (Vol. 1). Material culture. Canberra: Pacific Linguistics.
Ross M, Pawley A & Osmond M (eds.) (2003). The lexicon of Proto Oceanic: the culture and environment of ancestral Oceanic society (Vol. 2). The physical environment. Canberra: Pacific Linguistics.
Thurgood G (1999). From ancient Cham to modern dialects: two thousand years of language contact and change. Honolulu: University of Hawai’i Press.
Tryon D T (ed.) (1995). Comparative Austronesian dictionary. Berlin: Mouton de Gruyter.
Wurm S A & Hattori S (eds.) (1981–1983). Language atlas of the Pacific area. Part 1: New Guinea area, Oceania, Australia. Part 2: Japan area, Taiwan (Formosa), Philippines, mainland and insular Southeast Asia. Canberra: Australian Academy of the Humanities.

Malaysia: Language Situation
H O Asmar
© 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous editions, volume 5, pp. 2351–2353, © 1994, Elsevier Ltd.

Malaysia, which consists of Peninsular Malaysia, Sabah, and Sarawak (the latter two situated on Borneo), has from the beginnings of recorded time been a multilingual region, with its indigenous population belonging to different linguistic groups. This situation has been intensified with colonialism, immigration, and the nation’s language policy.

Language Policy

The national language policy of Malaysia states that Malay is the national and official language. It is written in Arabic and Roman scripts, but it is the latter that is the official writing system. Malay is also the main medium of instruction, which means that there are other language mediums allowed in the education system of the country: Arabic (Arabic, Standard) in Islamic schools, and Mandarin (Chinese, Mandarin) and Tamil in the national-type schools (those using a language other than the national language as medium of instruction, as opposed to national schools that use Malay). The national-type schools are required to teach Malay as a compulsory subject. All national and national-type schools are required to teach English, which has been given the status of a second language. ‘Pupils’ own languages’ (POL), that is, mother tongues, are taught in national schools if there is a request from at least 15 pupils to form a class. The POL classes so far are for Mandarin, Tamil, and Iban. Foreign languages such as Japanese and French are taught in certain secondary schools.

The multilingual nature of the country, supported by the education system, has made Malaysians bilingual. A great majority of non-Malay Malaysians know at least two languages: their own and Malay. With English, they become trilingual. There are monolinguals among the Malays, but most are bilingual at least, with English or Arabic as their second language. Radio Malaysia provides channels for Malay, English, Mandarin, and Tamil. The Sarawak local station broadcasts programs in Mukah Melanau (Melanau) and Iban, while the one in Sabah provides programs in Bajau Darat (Sama, Southern) and Tengara Dusun. Certain local stations in Peninsular Malaysia broadcast programs in two aboriginal languages, Temiar and Semai. TV Malaysia has programs in Malay, English, Mandarin, and Tamil. The policy has made Malay and English primary or high languages in the educational, social, and professional lives of Malaysians. It has also made almost 80% of Malaysians literate in Malay.

Indigenous Languages

The indigenous languages belong to two different stocks: Austronesian and Austroasiatic.

Austronesian Languages

Malay, which is the mother tongue of about 45% of the total population of Malaysia, is Austronesian. Native Malay speakers are found in Peninsular Malaysia and the coastlands of Sabah and Sarawak. The other Austronesian languages are native to Sabah and Sarawak only. A modest estimate of the number of these languages is about 100. ‘New’ languages are being discovered all the time. What had previously been termed dialects may be found to be

Ross M (2002). ‘The history and transitivity of western Austronesian voice and voice-marking.’ In Wouk F & Ross M (eds.) The history and typology of western Austronesian voice systems. Canberra: Pacific Linguistics. 17–62.
Ross M (2004). ‘The morphosyntactic typology of Oceanic languages.’ Language and Linguistics 5, 491–541.
Ross M (compiler) (1996). ‘On the origin of the term “Malayo–Polynesian.”’ Oceanic Linguistics 35, 143–145.
Ross M, Pawley A & Osmond M (eds.) (1998). The lexicon of Proto Oceanic: the culture and environment of ancestral Oceanic society (Vol. 1). Material culture. Canberra: Pacific Linguistics.
Ross M, Pawley A & Osmond M (eds.) (2003). The lexicon of Proto Oceanic: the culture and environment of ancestral Oceanic society (Vol. 2). The physical environment. Canberra: Pacific Linguistics.
Thurgood G (1999). From ancient Cham to modern dialects: two thousand years of language contact and change. Honolulu: University of Hawai’i Press.
Tryon D T (ed.) (1995). Comparative Austronesian dictionary. Berlin: Mouton de Gruyter.
Wurm S A & Hattori S (eds.) (1981–1983). Language atlas of the Pacific area. Part 1: New Guinea area, Oceania, Australia. Part 2: Japan area, Taiwan (Formosa), Philippines, mainland and insular Southeast Asia. Canberra: Australian Academy of the Humanities.

Malaysia: Language Situation
H O Asmah
© 2006 Elsevier Ltd. All rights reserved.
This article is reproduced from the previous edition, volume 5, pp. 2351–2353, © 1994, Elsevier Ltd.

Malaysia, which consists of Peninsular Malaysia, Sabah, and Sarawak (the latter two situated on Borneo), has been a multilingual region since the beginning of recorded history, with its indigenous population belonging to different linguistic groups. This situation has been intensified by colonialism, immigration, and the nation’s language policy.

Language Policy

The national language policy of Malaysia states that Malay is the national and official language. It is written in both Arabic and Roman scripts, but only the latter is the official writing system. Malay is the main medium of instruction, not the only one: the education system also allows Arabic (Arabic, Standard) in Islamic schools, and Mandarin (Chinese, Mandarin) and Tamil in the national-type schools (those using a language other than the national language as the medium of instruction, as opposed to national schools, which use Malay). The national-type schools are required to teach Malay as a compulsory subject. All national and national-type schools are required to teach English, which has been given the status of a second language. ‘Pupils’ own languages’ (POL), that is, mother tongues, are taught in national schools if at least 15 pupils request a class. The POL classes so far are for Mandarin, Tamil, and Iban. Foreign languages such as Japanese and French are taught in certain secondary schools.

The multilingual nature of the country, supported by the education system, has made Malaysians bilingual. A great majority of non-Malay Malaysians know at least two languages: their own and Malay. With English, they become trilingual. There are monolinguals among the Malays, but most are at least bilingual, with English or Arabic as their second language.

Radio Malaysia provides channels for Malay, English, Mandarin, and Tamil. The Sarawak local station broadcasts programs in Mukah Melanau (Melanau) and Iban, while the one in Sabah provides programs in Bajau Darat (Sama, Southern) and Tengara Dusun. Certain local stations in Peninsular Malaysia broadcast programs in two aboriginal languages, Temiar and Semai. TV Malaysia has programs in Malay, English, Mandarin, and Tamil.

The policy has made Malay and English the primary or high languages in the educational, social, and professional lives of Malaysians. It has also made almost 80% of Malaysians literate in Malay.

Indigenous Languages

The indigenous languages belong to two different stocks: Austronesian and Austroasiatic.

Austronesian Languages

Malay, which is the mother tongue of about 45% of the total population of Malaysia, is Austronesian. Native Malay speakers are found in Peninsular Malaysia and the coastlands of Sabah and Sarawak. The other Austronesian languages are native to Sabah and Sarawak only. A modest estimate of the number of these languages is about 100. ‘New’ languages are being discovered all the time. What had previously been termed dialects may be found to be heterogeneous languages, and a single language may turn out to be a subfamily of languages. Based on the information available, the languages of Sabah and Sarawak may be listed according to subfamilies or single languages. Most of them are known by the names of the places where they are located.

Languages of Sabah

1. Dusun/Kadazan subfamily: Papar, Dumpas, Bisayah (Bisaya, Sabah), Lotud, Tatana, Kuijau, Rungus, Timur, Tengah (Seko Tengah) (with the well-known dialects of Ranau, Bundu Tuhan, Penampang/Tengara, Tambunan), Keningau (Keningau Murut), Tempasu (Dusun, Tempasuk), Sugut (Dusun, Sugut).
2. Murut subfamily: Tagal (Tagal Murut), Tiulon, Beaufort (Dusun, Central), Timugon (Timugon Murut), Rundum (Tagal Murut), Takapan & Sook (Paluan), Nabay, Baukan.
3. Paitan subfamily: Lingkabau (Tombonuwo), Lobu (Keningau Murut), Abai Sungai, Tambanua (Tombonuwo), Ulu Kinabatangan.
4. Bajau subfamily: Bajau Darat and Bajau Laut (Sama, Southern).
5. Ida’an or Begahak subfamily (also known as Orang Sungai): located in Lahad Datu, Sandakan, and Kinabatangan.
6. Suluk (Tausug) in Semporna and Kinabatangan.
7. Banggi (Bajau, West Coast) on Banggi Island.
8. Illanun (I lanun) in Kota Belud.
9. Lun Dayeh in the western part of Sabah, bordering Sarawak.
10. Tidong (Kalabakan) in Labuk Sugut, Sandakan, and Tawau.

Languages of Sarawak

1. Bidayuh subfamily: Biatah, Bau, Jago (Jagol), Serian (Bukar Sadong), Lara (Lara’). Studies have indicated that Serian is itself a subgroup consisting of various languages.
2. Melanau subfamily: Oya–Dalat, Mukah (Melanau), Matu Daro (Daro–Matu), and Rejang (Kayan, Rejang).
3. Kayan subfamily: Kayan languages of Baram (Kayan, Baram), Baluy, and Belaga.
4. Kenyah subfamily: Upper Baram Kenyah (Kenyah, Upper Baram) (Lepo Umbo’, Lepo Tan, Lio Mato, Long Jaeh, and Long Tungan), and Lower Baram Kenyah (Merawan or Berawan, and Sebop).
5. Kelabit subfamily: Kelabit in Bario, Lun Bawang (Lundayeh) in Limbang, Tring, and Tabun.
6. Penan subfamily: Penan languages and subgroups in Mulu (the Penan Melinau), Upper Baram (Penan Busang), Sungai Tubu (Punan Tubu) (Penan Tubu), Sungai Terawang (Penan Libang), Long Iman (Penan Selungo), Belaga (Punan Bah–Biau) (Penan Ba), Niah (Penan La’ong and Penan Jelalong (Penan, Western)).
7. Iban: mainly in the Second, Sixth, and Seventh Divisions.
8. Narom in Marudu.
9. Selakau in Lundu.
10. Bintulu in Bintulu.
11. Miri in Miri.
12. Baketan (Bukitan), Ukitan (Ukit), Lahanan, Kajang, and Kejaman (Kajaman), which are heterogeneous languages of the Rejang area.

Sabah and Sarawak languages are minor languages, each with a few thousand speakers. Malay is the lingua franca among these groups. None has a writing system of its own.

Austroasiatic Languages

The speakers of these languages are the aborigines or the orang asli, who are found only in Peninsular Malaysia, around the Central Range stretching from Kedah in the north to Pahang and Selangor in the south. The languages may be divided into three groups: the Northern Group (Kensiu, Kentakbong, Jehai, Mendriq [Miriq], Bateg Deq [Batak], Mintil, Bateq Nong [Batek], Che Wong [Chewong]); the Central Group (Semnam, Sabum, Lanoh Jengjeng [Lanoh], Lanoh Yir, Temiar, Semai, Jak Hut); and the Southern Group (Mah Meri [Besisi], Semoq Beri [Semaq Beri], Semelai, Temoq). Semai has about 16 000 speakers and Temiar about 10 000. The others are well below these figures. Temiar is the lingua franca among the aborigines. The aboriginal groups of Jakun, Belandas, and Temuan are speakers of Malay in various dialectal forms.

Nonindigenous Languages

The nonindigenous languages can be divided into two groups: intraregional (whose speakers are from Indonesia and the Philippines) and extraregional (the Chinese dialects, the languages of the Indian subcontinent and Sri Lanka, Arabic, Thai, and English).

Languages from Indonesia and the Philippines

These languages are not widespread: Javanese in Selangor and Johor; Achehnese (Aceh) in Kedah; Mandailing (Batak Mandailing) in Perak and Selangor; Buginese (Bugis) and Chabacano (Chavacano) in Sabah. There are also non-Malaysian dialects of Malay whose speakers have settled in parts of Malaysia: the Rawa (Rawas), Kerinci, and Minangkabau dialects of Sumatra in Perak, Selangor, and Negeri Sembilan respectively; and Cocos (Malay, Cocos Islands) and Brunei Malay (Brunei) in Sabah.

Chinese Dialects

Chinese dialects in Malaysia are: Mandarin, Hokkien (Chinese, Min Nan), Hakka (Chinese, Hakka) (Khek), Cantonese (Chinese, Yue), Teochew & Hainanese (Chinese, Min Nan), Kwongsai, Hockchiu, Henghua (Chinese, Pur-xian), and Hockchua. Except for Mandarin, all the dialects have their own speech communities. The most populous is the Hokkien group, which has about 1 000 000 speakers, followed by Cantonese and Teochew. Mandarin is acquired through school education.

Languages of India and Sri Lanka

The languages of the Indian subcontinent and Sri Lanka altogether have just over 1 000 000 speakers. Of these, Tamil has the highest number of speakers (about 80% of the total Indian population). The other languages are Malayalam, Telugu, Gujarati, Sindhi, Bengali, Urdu, and Sinhalese (Sinhala).

Arabic

The speakers of Arabic as a first language do not form a community with its own geographical location. There seem to be more second-language speakers who have acquired Arabic through Islamic schools and colleges.

Thai

Thai is native to Malaysians of Thai origin living in the northern states of Perlis, Kedah, and Kelantan. It is a main requirement for police, customs, and immigration officers working along the northern border.

English

English has a very small first-speaker community of Eurasians, consisting of a few thousand people. It is an important language, besides Malay, in professions such as law and medicine, the financial sector, trade, and business. It is also a language of academia.

Creoles and Pidgins

The two creoles in Malaysia are located in Melaka, in Peninsular Malaysia. The first is the Malay-based Baba Malay (Malay, Baba), spoken by the Babas, that is, descendants of the first group of 15th-century Chinese immigrants. The second is the Portuguese-based Kristang (Malaccan Creole Portuguese), spoken by the descendants of the Portuguese who ruled Melaka in the 16th and 17th centuries. Both creoles have a few thousand speakers. A Malay pidgin, known as bazaar Malay, is used in the marketplaces. There are also pidgin Chinese and pidgin Tamil, which are intracommunity ‘market languages.’

See also: Malay.

Bibliography

Asmah H O (1982). Language and society in Malaysia. Kuala Lumpur: Dewan Bahasa dan Pustaka.
Asmah H O (1983). The Malay peoples, Malaysia and their languages. Kuala Lumpur: Dewan Bahasa dan Pustaka.

Maldives: Language Situation
Editorial Team
© 2006 Elsevier Ltd. All rights reserved.

The Republic of Maldives consists of a chain of islands in the Indian Ocean, south of India and west of Sri Lanka, and has a population of about 335 000. The language of the Maldives is Dhivehi (Divehi), also known as Maldivian, an Indo-European language most closely related to Sinhala. The standard form of Dhivehi is based on the speech of the capital, Male, while the forms of the southern atolls are the most divergent varieties. Dhivehi is also spoken by about 5000 speakers on Minicoy in India. Dhivehi is written in its own script called Tana (tāna, Thaana), written from right to left. Dhivehi is the national language of the Maldives and is used in all spheres of society, including public administration and government, the media, and education. The vast majority of Maldivians are Muslims, but Arabic is not widely understood, and Dhivehi plays an important role for religious purposes and instruction. Dhivehi is used both in radio and television, as well as in print media. The literacy rate (2002) is 97.2%. English is used for international communication, and in the tourist


industry, which is an important part of the economy of the Maldives.

See also: Dhivehi.

Bibliography

Cain B & Gair J (2000). Dhivehi (Maldivian). Munich: Lincom Europa.
Christopher R (2003). A Maldivian Dictionary. London: Routledge Curzon.
Dhivehi Bahuge Gavaaidhu. (Grammar of the Dhivehi Language, 4 vols). (1984). Male, Maldives: National Centre for Linguistic and Historical Research.

Maldivian

See: Dhivehi.

Mali: Language Situation
V Vydrine, Museum of Anthropology and Ethnography, St. Petersburg, Russia
© 2006 Elsevier Ltd. All rights reserved.

Mali is a large landlocked country in West Africa, covering about 1.24 million sq. km in the middle and upper valley of the Niger. More than half of the country is sparsely inhabited desert; the great majority of the population (10.5 million in 1998) is concentrated in the Niger valley and in the south.

Three major macrofamilies of Africa – Niger-Congo, Nilo-Saharan, and Afro-Asiatic – are represented in Mali. The Niger-Congo languages belong to four families: Mande (Bamana, Maninka, Kagoro, Marka-Dafing, Xasonka, Jalunga, Soninke, Bozo, Duun, Jo, Banka, Bobo), Atlantic (Pulaar/Fulfulde), Gur (the Senufo group, Bore, Pana, Samoma), and Dogon (Tommo-so, Toro-so, Jamsay, and others), altogether about 20 languages. The Afro-Asiatic languages of Mali are Hassaniya (Mauritanian Arabic, of the Semitic family) and the Tamasheq language/dialect cluster (of the Berber family). The Songhay languages form a linguistic family of their own; in fact, their membership in the Nilo-Saharan macrofamily is widely contested by specialists (as is the coherence of that macrofamily itself).

French is the official language of the country and the main language of education; Arabic is the language of religion for the majority of the population. In the mid-1990s a liberal law on Malian languages was adopted, and today a whole set of languages have the status of ‘national’ languages: Bamana, Pulaar/Fulfulde (Fula), Songhay (Gao), Tamasheq, Soninke, Xasonka, Maninka, Senufo, Minyanka, Boore, Dogon (Toro-so), Bozo (Tieyaxo).

Bamana (Bambara is its French name, borrowed from Soninke or Pulaar) is the most widespread language of the country. It is spoken as a first language by 26% of the population of Mali, and the total number of speakers is estimated at 80% of the population. Being the language of the capital, Bamako, it is often considered to be the ‘Malian language.’ The Bamako dialect serves as the point of reference for Standard Bamana, which is spoken in the main urban centers of Mali; its area has greatly expanded since the independence of the country (1960). Today most Senufo, Duun, Banka, Jo, and Bozo are bilingual in Bamana, as are a considerable part of the Soninke, Fulbe, Dogon, Bobo, and Xasonka. Lately, some infiltration of the Bamana language has been attested even in the Songhay area. In Mali the Maninka, whose language is closely related to Bamana (especially the Maninka dialect of the Manding area), tend to identify themselves with the Bamana. The same can be said of the Jula of the Sikasso area (in the south).

Rural Bamana dialects, spoken on both banks of the Niger from Bamako up to the Inner Delta, are rather different from the Bamako variant, often to the point of mutual unintelligibility. The Bamana dialect of Segu is worth special mention: even after the eclipse of the Segu Empire (18th–19th centuries), it remains somewhat prestigious.

Maninka is in fact a dialect continuum of 800 000 to 900 000 speakers in the west of Mali. Its southeast variants (the Manding Maninka) are quite close to Bamana (for instance, they share a seven-vowel system), while the northwest variants (the Kenieba Maninka) are very close to Xasonka and have a five-vowel system. The Kita Maninka is, in many respects, an intermediate variant. Kagoro (about 15 000 speakers) is very close to the latter; it is spoken by dispersed groups in a large area north of the Niger, from Kaarta to Segu. Many people


who identify themselves as Kagoro today speak only Bamana or Soninke.

Xasonka (about 130 000 speakers in the Kayes-Bafoulabe area, in northwest Mali) is very close to the adjacent variant of Maninka; it owes its status as a separate language to the strong ethnic identity of the Xasonka people, who trace their origin to a mixture of Fulbe, Maninka, and Soninke.

The dialects of the Marka-Dafing cluster (25 000 speakers in Mali) differ from Bamana and other predominant Manding languages in their ‘inverse’ tones and the presence of the segmental noun article -o (in Bamana, this article is represented by a floating low tone). Marka-Jalan, spoken in the town of San, is close to Marka-Dafing.

The Soninke area (about 700 000 speakers) stretches from west to east along the Mauritanian border. Presumably the predominant language of the ancient Wagadu (Ghana) empire, Soninke was much more widespread in the past, and numerous Soninke loanwords are attested in various languages of the western Sudan. The Soninke diaspora is represented in all the larger towns of Mali, and Soninke villages can sometimes be found far from the compact Soninke area.

The Bozo languages are spoken by a traditionally fishing population of about 120 000 along the middle course of the Niger and its tributaries. Four languages (sometimes referred to as dialects) are usually singled out: Xainyaxo (to the east of Segu), Tieyaxo (further to the east, especially around Diafarabe), Sorogaama, or Janama (Mopti and further to the north and west), and Tiema Cewe (in the northern part of the Inner Delta of the Niger). Sorogaama, the largest of the Bozo languages, is internally diverse and has six different dialects. Tieyaxo seems to be the most prestigious among the Bozo. In the west, the Bozo are mainly bilingual in Bozo and Bamana, and in the center and east, in Bozo and Fulfulde.

Only a small number of Bobo speakers (15 000 to 20 000) live in Mali, to the south of the Boomu area; the others live in Burkina Faso. This language is characterized by a high degree of dialectal divergence, and its dialects are not mutually intelligible; of these, the Tinkire, Yebe, and Bana dialects are spoken in Mali. Most of the Bobo seem to be bilingual in Bamana or Jula. The position of the Bobo language within the Mande family is uncertain; it is sometimes assigned to a separate branch. On the other hand, it shares some features and vocabulary with the geographically close Samogo languages.

The Duun (70 000 speakers), Jo (9100 in Mali and Burkina Faso), and Banka (5000) languages of the Sikasso area are also known as the Samogo group (together with Dzuun, Kpan, and Seeku in Burkina Faso); the great majority of their speakers seem to be bilingual in Bamana. Traditionally included in the West Mande branch, they have much in common with the East Mande languages in vocabulary and morphology, including a complicated pronoun system and more than two tone levels. Jo is the only language in the Mande family that has a grammatical opposition of feminine vs. masculine in the pronoun system.

There are some Jalunga, or Jallonke (about 10 000 speakers), settlements in the Faleya area, at the Guinean border (near the Senegalese border). This Mande language is close to Soso on the Guinean coast. The Jallonke are descendants of the ancient population of the Futa-Jallon plateau (Guinea), ousted by the Fulbe during the religious wars of the 18th century. Jalunga speakers in Mali are bilingual in Bamana, and their language seems to be endangered.

The only Atlantic language represented in Mali is Pulaar (this name is reserved for the western dialects), or Fulfulde (for the central and eastern dialects), the language of the Fulbe, spoken by about 1 000 000 people in this country alone. Traditionally nomads and livestock breeders, the Fulbe are predominantly settled today, and their settlements are scattered all over Mali. There is a Fulbe family or two in most Bamana, Maninka, or Senufo villages taking care of the cows of the entire village. The largest compact area of the Fulbe is Masiina (the Inner Delta of the Niger), where the central dialects are spoken (more than 900 000 speakers). Some specialists consider these dialects to be the most archaic. Another compact zone is in the northwest of Mali, near the border with Mauritania and Senegal, where about 180 000 speak a Pulaar dialect that is close to the Senegalese ones. Being culturally distinct from the agricultural population, the Fulbe have a strong sense of their identity and are often reluctant to use Bamana as a lingua franca (this is also related to their attachment to Islam, which is much stronger than among the Bamana). The Malian Fulbe represent just a segment of their large diaspora, stretching from Senegal to Sudan and even Ethiopia.

The most important Gur language spoken in Mali is Senufo, represented by several variants in the south of the country: Mamara, or Minyanka (500 000 to 900 000 speakers), Supyire (364 000), and Syenara (137 000). These variants are divergent enough to be considered different languages. A great majority of the Senufo are bilingual in Bamana, the language of the culturally dominant community. Another major Gur language in Mali is Bore (Bomu, Bobo Wulen), spoken by about 100 000 people at the Burkina border. Two minor languages, Pana (Sama) and Samoma (Kalamse), along the Burkina border, belong to the same family.

Dogon, earlier attached to the Gur family, is today considered to be a separate language family within Niger-Congo. The Dogon family numbers more than 20 languages (usually referred to as ‘dialects of Dogon’), the most important ones being Jamsay, Toro-so, Tommo-so, Donno-so, and Togo-kan. The central languages and dialects are close to each other, while smaller peripheral languages are sometimes very different, to the extent that their membership in the Dogon language family may be called into question. Of these languages, Toro-so has been given the status of ‘standard Dogon,’ a decision that is not yet generally accepted. Many Dogon are bilingual in Bamana and in Fulfulde, but the latter often has negative connotations.

The Songhay family is represented in eastern Mali, mainly in the Niger valley, by Koyra Chiini (the Tombuktu Songhay, about 200 000 speakers), Koyraboro Senni (the Gao Songhay, about 400 000), Humburi Senni (about 15 000 in Mali), and Tadaksahak (30 000 to 40 000, nomads culturally close to the Tuaregs); there are also a couple of Zarma-speaking villages at the Niger border. The Songhay area demonstrates a staunch resistance to the penetration of Bamana as the lingua franca.

The Berber branch of the Afro-Asiatic family is represented in Mali by the languages of the Tuaregs: Tamasheq (of which there are two considerably divergent dialects, Tombuktu and Tadhaq, with about 250 000 speakers) and Tamajaq (about 190 000 in the extreme east of Mali). After continuous wars against the central government in the 1980s and 1990s, the Tuaregs are being integrated into Malian life (at least at the political level), but the penetration of Bamana into their milieu is still very slow.

Hasaniya, or Mauritanian Arabic, is spoken by about 100 000 people at the Mauritanian border. The language of education at all levels is French; Arabic is taught in Koranic schools and madrasas. Education in the national languages (especially in Bamana, Fulfulde, and Songhay) is increasing both in schools and in adult literacy programs, but its quality often remains inferior. Nko writing is used for Bamana and Maninka to a certain extent, although Latin writing remains largely predominant. In Masiina, Adjami (arabographic writing) is still used by elders, but it is being overshadowed by Fulfulde writing on the basis of the Latin script. The same can be said about the Tefinag script among the Tuaregs.

See also: Berber; Fulfulde; Gur Languages; Mande Languages. Language Maps (Appendix 1): Maps 19, 20.

Bibliography
Barry A (1990). ‘Etude de plurilinguisme au Mali: le cas de Djenné’. In Kawada J (ed.) Boucle du Niger, approches multidisciplinaires. Tokyo: Institut de Recherches sur les Langues et Cultures d’Asie et d’Afrique, 2. 183–210.
Canut C (1996). Dynamiques linguistiques au Mali. Paris: Didier Erudition, Collection ‘Langues et Développement.’
Dombrowsky K, Dumestre G & Simonis F (1993). L’alphabétisation fonctionnelle en Bambara dans une dynamique de développement. Le cas de la zone cotonnière (Mali-Sud). Institut d’Études Créoles et Francophones URA 1041 du CNRS, Université de Provence.
Dumestre G (ed.) (1994). Stratégies communicatives au Mali: langues régionales, bambara, français. Institut d’Études Créoles et Francophones URA 1041 du CNRS, Université de Provence.
N’Diayé B (1970). Groupes ethniques au Mali. Bamako: Editions Populaires.

Malinowski, Bronislaw Kaspar (1884–1942) A T Campbell © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 5, pp. 2354–2355, © 1994 Elsevier Ltd.

Malinowski was born in Cracow, Poland. He was the son of a well-known Slavic philologist (Lucyan Malinowski). After a Ph.D. in physics and mathematics, he turned to anthropology, inspired, he said, by a reading of Frazer’s Golden bough. In 1910 he arrived at the London School of Economics, and in 1914 began anthropological fieldwork in New Guinea. Between 1915 and 1918 he did two years of fieldwork on the Trobriand Islands (off Papua New Guinea). After some years teaching at the LSE he was appointed to the first Chair of Anthropology there (1927). In 1938 he went to the United States where he was appointed professor at Yale. He died in New Haven at the age of 58.

In the three great ethnographies of the Trobriand Islands (Argonauts, Sexual life, and Coral gardens), Malinowski established the tradition of intensive fieldwork, participating in the life of the society being studied, emphasizing the effort to appreciate a total view of the society and to understand matters “from the natives’ point of view.” He constantly emphasized the importance of learning and working with the indigenous language. The vivid style of writing and the colossal amount of detail have established these ethnographies as classics. The account of the kula ring, a complex system of formal gift giving described in Argonauts, is one of the most celebrated cases in anthropology.

In terms of sociological theory, Malinowski’s functionalism and his ideas about ‘basic needs’ are pretty banal, and are only of interest as indicating an important shift in anthropology away from questions about origins and evolution which had previously been so dominant. On the other hand, his concern with language is much more fruitful. He was proud of his ability to pick up languages quickly, and, as well as insisting on the importance of conducting field research in the native language, he shows an intense concern with the processes and details of translation (Coral gardens, vol. 2). He laments the lack of sound ethnolinguistic theory. Although his attempt to provide one does little with regard to phonetics, phonology, or grammar, his writing on meaning is original.

In a remarkable essay (1923) appended to Ogden and Richards’ The meaning of meaning, he coined the phrase ‘phatic communion’ (gossip, pleasantries, and so on) to make the point that language should not just be seen as a vehicle for thought through which to communicate ideas, but as a mode of action which, in this example, establishes personal bonds between people. Also in this essay he claims originality for the notion of ‘context of situation’ in an argument which prefigures Wittgenstein’s ‘the meaning of words lies in their use,’ developed in Philosophical investigations.

J. R. Firth wrote: “I think it is a fair criticism to say that Malinowski’s technical linguistic contribution consists of sporadic comments, immersed and perhaps lost in what is properly called his ethnographic analysis” (1957: 117). But beyond a strictly ‘technical’ contribution, his expertise in translating from Trobriand and his writing on meaning and translation deserve a more generous judgment.

Bibliography
Malinowski B K (1922). Argonauts of the Western Pacific. London: Routledge.
Malinowski B K (1923). ‘The problem of meaning in primitive languages.’ In Ogden C K & Richards I A (eds.) The meaning of meaning (suppl. 1). New York: Harcourt Brace.
Malinowski B K (1929). The sexual life of savages. London: Routledge.
Malinowski B K (1935). Coral gardens and their magic (2 vols). London: Allen and Unwin.
Malinowski B K (1948). Magic, science and religion and other essays. Glencoe, IL: Free Press.
Malinowski B K (1967). A diary in the strict sense of the term. London: Routledge and Kegan Paul.
Firth J R (ed.) (1957). Man and culture: an evaluation of the work of Bronislaw Malinowski. London: Routledge and Kegan Paul.
Kuper A (1983). Anthropology and anthropologists: the modern British school. London: Routledge and Kegan Paul.

Malkiel, Yakov (1914–1998) S N Dworkin, University of Michigan, Ann Arbor, MI, USA © 2006 Elsevier Ltd. All rights reserved.

Born in Kiev, Malkiel received his doctorate from the Friedrich-Wilhelms Universität in Berlin in 1938, where he studied Romanistik under Ernst Gamillscheg. He left Germany in 1940 for the United States and joined the faculty of the University of California, Berkeley in 1943, where he taught until his retirement in 1983.

Throughout his career, Malkiel was one of the leading and most prolific specialists in Romance historical linguistics. His most important work dealt with Spanish historical grammar, especially derivational morphology and etymology, the latter understood as the preparation of full-blown lexical biographies. Studies on the genesis and diffusion of a given derivational suffix were often flanked by inquiries into the history of etymologically obscure items that bear the suffix at issue. Malkiel was one of the few Romanists explicitly concerned with the theory and methodology of etymology. The need to rejuvenate Romance etymology by demonstrating how it could be integrated into ongoing research in other branches of diachronic research, especially phonology and morphology, is a constant theme in Malkiel’s writings. His last book, Etymology (1993), is a personal overview and meditation on this field.

Malkiel also studied numerous problems in historical phonology and inflectional morphology, areas to which he made a number of significant methodological and analytical contributions. He advocated the concept of ‘weak’ sound change, i.e., a sound change that offered a relatively low degree of predictability. A key notion central to much of his diachronic research was multiple causation, the proposal that many linguistic shifts result from the interaction of several independent factors that come together to effect change. An explanatory hypothesis dear to his heart was the possible role of morphological conditions in sound change. Some sound changes may have taken as their starting point morphologically conditioned alternations, which then spread (often sporadically and unpredictably) beyond their original morphological locus. In his later years, Malkiel became fascinated by the possible role of phonosymbolism as a motivating and guiding factor in sound change. Phonosymbolism is to be understood here as the study of the intimate connection between the phonetic shape of a root morpheme and its meaning. The reader who wishes a sample of Malkiel’s writings on this topic can turn to his Edita and inedita, 1979–1988. Vol. I: diachronic problems in phonosymbolism.

Although the overwhelming majority of his studies deal with specific issues in the history of Spanish and Portuguese, he also wrote on questions pertaining to the Romance languages as a whole, as well as on topics germane to Gallo-, Italo-, and Daco-Romance. His favorite medium for presenting the findings of his research was the monograph and the heavily documented article channeled through a learned journal or a homage volume. Starting around the mid-1960s, Malkiel placed more emphasis on the broader issues of linguistic change to be learned from the particular Romance problem under study. The symbiotic relationship between data-rich historical Romance linguistics and more theoretically oriented general linguistics is a major theme in his writings. Malkiel also wrote extensively on the history of Romance linguistics as a scholarly discipline. He was founding editor (1947) of the journal Romance Philology. Collections of selected articles can be found in Malkiel (1968, 1979, 1989, 1990, 1992). An overview of his life and work can be found in Dworkin (1998, 2004).

Bibliography
Dworkin S N (1998). ‘Yakov Malkiel (1914–1998).’ La corónica 27, 248–262.
Dworkin S N (2004). ‘Yakov Malkiel (1914–1998).’ Language 80, 153–162 [revised version of Dworkin 1998].
Malkiel Y (1968). Essays on linguistic themes. Oxford: Blackwell.
Malkiel Y (1979). From particular to general linguistics: selected essays 1965–1978. Amsterdam and Philadelphia: John Benjamins.
Malkiel Y (1988–1989). A tentative autobibliography. Romance Philology. Special Issue, Duggan J J & Faulhaber C B (eds.) Berkeley and Los Angeles: University of California Press.
Malkiel Y (1989). Theory and practice of Romance etymology: studies in language, culture, and history (1947–1987). Tuttle E F (ed.). London: Variorum Reprints.
Malkiel Y (1990). Edita and inedita, 1979–1988. Vol. I: diachronic problems in phonosymbolism. Amsterdam & Philadelphia: Benjamins.
Malkiel Y (1992). Edita and inedita, 1979–1988. Vol. 2: diachronic studies in lexicology, affixation, phonology. Amsterdam & Philadelphia: Benjamins.
Malkiel Y (1993). Etymology. Cambridge: Cambridge University Press.
Malkiel Y (1995). ‘Supplement to A tentative autobibliography.’ Romance Philology 48, 351–388.

Malmberg, Bertil (1913–1994) B Sigurd, Lund, Sweden © 2006 Elsevier Ltd. All rights reserved.

Bertil Malmberg, professor of phonetics (1950–1969) and general linguistics (1969–1978) at the University of Lund, Sweden, was born on April 8, 1913 and died on October 8, 1994. He was an internationally well-known general linguist, phonologist, and Romanist, who also played a role in applied linguistics.

As a student, Malmberg encountered the structural movements in Europe, above all, the ideas of Ferdinand de Saussure and the Prague school with Nicolaj Trubetzkoj and Roman Jakobson. He spent several periods in Paris, where he learned experimental phonetics in the 1930s and was visiting professor of phonetics in the 1960s.

Malmberg was a pioneer in phonetics. He was quick to employ the sound spectrograph (Sonagraph) and managed to define the characteristics of the Swedish word accents manifested, e.g., in the famous minimal pair ánden (‘the duck’): ànden (‘the spirit’). He showed that the difference between accent one (acute) in ánden and accent two (grave) in ànden in the Southern Swedish dialect lies in the different placement of the intonation peak. Malmberg was also one of the first to use the equipment called Visible Speech at the Haskins laboratory in New York.

With his broad knowledge of the Romance languages, Malmberg had a wide field to which he could apply his ideas. He became a world-famous Romanist, and his publications include works on and in French, Italian, and Spanish.

The sign plays a paramount role in Saussure’s thinking, and Malmberg tried to include the theory of the linguistic sign under the general theory of signs (semiotics), as in Le langage, signe de l’humain (1979). Malmberg often criticized the modern generative linguists for their lack of a sign concept. He did not take much interest in Chomsky and the generative school, which is also evident from his history of linguistics published in 1991, Histoire de la linguistique: de Sumer à Saussure.

The phonetics chair at Lund was partly motivated by the demands of language teaching, a field in which Malmberg was very active. He also started courses for speech therapists. Many of his books have been translated into several languages. The French version of his book on phonetics is a best-seller in the series Que sais-je?

See also: Jakobson, Roman (1896–1982); Phonetics: Overview; Phonology: Overview; Saussure, Ferdinand (-Mongin) de (1857–1913); Swedish.

Bibliography
Malmberg B (1944). Die Quantität als phonetisch-phonologischer Begriff: Eine allgemeinsprachliche Studie. Lund: Lunds Universitets Årsskrift.
Malmberg B (1948). L’espagnol dans le nouveau monde: problème de linguistique générale. Lund: Gleerup.
Malmberg B (1959). Nya vägar inom språkforskningen: en orientering i modern lingvistik. Stockholm: Norstedt; (1964) New trends in linguistics: an orientation, translated from the Swedish original by Edward Carney. Stockholm: Naturmetodens språkinstitut.
Malmberg B (1963). Structural linguistics and human communication: an introduction into the mechanism of language and the methodology of linguistics. Berlin: Springer-Verlag.
Malmberg B (1973). La phonétique. Paris: Presses Universitaires de France, Que sais-je?
Malmberg B (1979). Le langage, signe de l’humain. Paris: Picard.
Malmberg B (1991). Histoire de la linguistique: de Sumer à Saussure. Paris: Presses Universitaires de France.

Malta: Language Situation Editorial Team © 2006 Elsevier Ltd. All rights reserved.

The Republic of Malta is a group of islands in the central Mediterranean Sea, about 100 km south of Sicily. The three main islands are Malta, Gozo, and Comino. With about 400 000 inhabitants, the islands have a comparatively high population density. The main economic activities are shipping and port activities, finance, manufacturing, and tourism. Malta has been inhabited since prehistoric times and was an important port of the Phoenicians and Romans. The islands were under Arab control from 870 until the Norman Conquest in 1090. From the 16th century onward, the islands were the seat of the Knights of Malta, the monastic Order of St. John, who ruled the islands until the French invaded in 1798. In 1814, the islands became a British colony, and they gained independence only in 1964. Malta has been a member of the European Union since 2004.

The two official languages of Malta are English and Maltese, a South Central Semitic language of Arabic origin. Maltese is also the national language. Besides Standard Maltese, there are also several dialects. Maltese is predominantly used in church services and other religious, national, and cultural activities. Until recently, much written official and administrative business was conducted in English, but increasingly efforts are being made to use Maltese. Maltese is used in courts, and the Maltese text is considered legally binding. Both Maltese and English are used in broadcasting and for the print media. However, most television is broadcast in English or received from Italian stations in Italian. There is no explicit policy regarding the use of languages in schools, but usually both Maltese and English are used, the former mostly in spoken communication and the latter for writing. Most Maltese are bilingual in English and Maltese to a varying degree, and code-switching and code-mixing between the two languages are common.

See also: Code Switching; Maltese.

470 Malmberg, Bertil (1913–1994)

one of the first to use the equipment called Visible Speech at the Haskins laboratory in New York. With his broad knowledge of the Romance languages, Malmberg had a wide field to which he could apply his ideas. He became a world famous Romanist, and his publications include works on and in French, Italian, and Spanish. The sign plays a paramount role in Saussure’s thinking, and Malmberg tried to include the theory of the linguistic sign under the general theory of signs (semiotics), as in Le Langage—signe de l0 human (1979). Malmberg often criticized the modern generative linguists for their lack of a sign concept. He did not take much interest in Chomsky and the generative school, which is also evident from his history of linguistics published in 1991, Histoire de la linguistique: de Sumer a` Saussure. The phonetics chair at Lund was partly motivated by the demands of language teaching, a field in which Malmberg was very active. He also started courses for speech therapists. Many of his books have been translated into several languages. The French version of his book on phonetics is a best-seller in the series Que sais-je?

See also: Jakobson, Roman (1896–1982); Phonetics: Over-

view; Phonology: Overview; Saussure, Ferdinand (-Mongin) de (1857–1913); Swedish.

Bibliography Malmberg B (1944). Die Quantita¨t als phonetischphonologischer Begriff: Eine allgemeinsprachliche Studie. Lund: ˚ rsskrift. Lunds Universitets A Malmberg B (1948). L’espagnol dans le nouveau monde: proble`me de linguistique ge´ne´rale. Lund: Gleerup. Malmberg B (1959). Nya va¨gar inom spra˚kforskningen: en orientering i modern lingvistik Stockholm: Norstedt; (1964) New trends in linguistics: an orientation translated from the Swedish original by Edward Carney. . Stockholm: Naturmetodens spra˚kinstitut. Malmberg B (1963). Structural linguistics and human communication: an introduction into the mechanism of language and the methodology of linguistics. Berlin: Springer-Vlg. Malmberg B (1973). La phone´tique. Paris: Presses Universitaires de France, Que sais-je? Malmberg B (1979). Le Langage—signe de l0 human. Paris: Picard. Malmberg B (1991). Histoire de la linguistique: de Sumer a` Saussure. Paris: Presses Universitaires de France.

Malta: Language Situation
Editorial Team
© 2006 Elsevier Ltd. All rights reserved.

The Republic of Malta is a group of islands in the central Mediterranean Sea, about 100 km south of Sicily. The three main islands are Malta, Gozo, and Comino. With about 400 000 inhabitants, the islands have a comparatively high population density. The main economic activities are shipping and port activities, finance, manufacturing, and tourism. Malta has been inhabited since prehistoric times and was an important port of the Phoenicians and Romans. The islands were under Arab control from 870 until the Norman Conquest in 1090. From the 16th century onward, the islands were the seat of the Knights of Malta, the monastic Order of St. John, who ruled the islands until the French invaded in 1798. In 1814, the islands became a British colony, and they gained independence only in 1964. Malta has been a member of the European Union since 2004.

The two official languages of Malta are English and Maltese, a South Central Semitic language of Arabic origin. Maltese is also the national language. Besides Standard Maltese, there are also several dialects. Maltese is predominantly used in church services and other religious, national, and cultural activities. Until recently, much written official and administrative business was conducted in English, but efforts are increasingly being made to use Maltese. Maltese is used in courts, and the Maltese text is considered legally binding. Both Maltese and English are used in broadcasting and in the print media. However, most television is broadcast in English or received from Italian stations in Italian. There is no explicit policy regarding the use of languages in schools, but usually both Maltese and English are used, the former mostly in spoken communication and the latter for writing. Most Maltese are bilingual in English and Maltese to a varying degree, and code-switching and code-mixing between the two languages are common.

See also: Code Switching; Maltese.

Malukan Languages 471

Maltese
J Cremona
© 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 5, pp. 2356–2357, © 1994, Elsevier Ltd.

Maltese is the national language of the Republic of Malta and one of its two official languages, the other being English. It is spoken by virtually all the 345 418 (1985) inhabitants (plus ca. 80 000 Maltese immigrants in Australia). Until the 1930s, its status was low, with the prestige languages being Italian and English. The first text written in Maltese, a poem, dates from ca. 1460 AD, but although texts appear sporadically thereafter, Maltese only began to be written systematically from about the end of the eighteenth century.

Maltese, a language of Arabic origin, shares many of the features that distinguish the modern Arabic vernaculars from literary Arabic. Maltese also displays those features that distinguish Maghrebine dialects from the rest; for example, the loss of gender distinctions in second-person-singular pronouns and verbs and the leveling of first-person markers in the imperfect to give {n . . . ø} for the singular and {n . . . u} for the plural. However, Maltese differs from most ‘core’ vernaculars of Arabic by having (a) adopted the Roman alphabet; (b) a phonemic system without emphatics, with fewer back consonants but more vowels; (c) virtually the whole of the vocabulary pertaining to intellectual, technical, and scientific pursuits taken from Sicilian, Italian, and English; (d) a number of conservative lexical features (e.g., ra ‘he saw’); and (e) grammatical innovations of Romance origin (e.g., passives with kien ‘he was’ or ġie ‘he came’ as auxiliaries). These features reflect the fact that Maltese has not had Classical Arabic as an acrolect for some seven centuries. This and the fact that it is now the national language of an independent state have given Maltese the status of a distinct language.

Sample:
Il-ġimgħa l-oħra sibt ruħi, għall-ewwel darba f’ħajti, f’‘lecture theatre’ ta’ l-Università.
/il ˈdʒima l ˈohra sipt ˈruːhi all ˈewwel ˈdarba f ˈhajti f ˈlektʃœr ˈtiːœtœr ta l universiˈta/
‘Last week I found myself, for the first time in my life, in a lecture theatre of the University.’

See also: Malta: Language Situation.

Bibliography Isserlin B S J (1990). ‘The Maltese language.’ In Bosworth C E, Van Donzel E, Lewis B & Pellat C (eds.) The encyclopaedia of Islam. Leiden: E. J. Brill. 295–298.

Malukan Languages
M Florey, Monash University, Clayton, VIC, Australia
© 2006 Elsevier Ltd. All rights reserved.

Some 128 languages are spoken in the geopolitical region of the Malukan islands in eastern Indonesia (see Figure 1). The majority of the 111 Austronesian languages of Maluku are subgrouped within the Central Malayo-Polynesian branch of Central-Eastern Malayo-Polynesian (CEMP) (Blust, 1978). A number of the Austronesian languages of north Maluku are subgrouped in the South Halmahera-West New Guinea branch of CEMP (Collins and Voorhoeve, 1983). Current information indicates that there are also 17 non-Austronesian languages spoken in Maluku (Grimes, 2000). Sixteen West Papuan phylum languages are found in the northernmost parts of Maluku, on Morotai, Ternate, Tidore, Halmahera, and nearby smaller islands. Oirata is a Trans-New Guinea phylum language of southern Kisar island, located near the north-eastern tip of East Timor.

Linguistically, Maluku is characterized by high linguistic diversity, serious endangerment, and little detailed documentation. Speaker populations have historically been much smaller than in ethnolinguistic communities in the western Austronesian region. Larger languages include Kei with 86 000 speakers and Buru with perhaps 43 000 speakers (Grimes, 1995). These numbers, however, give an overly optimistic picture of linguistic vitality. The highest documented degree of language endangerment in Indonesia is located in Maluku. A recent survey indicated that 10 languages are close to extinction and a further nine languages are seriously endangered (Florey, 2005). Centuries of contact with nonindigenous peoples through colonization and intensive trade for spices, and conversion to nonindigenous religions have all played a role in language endangerment, which is particularly severe in the central Malukan islands of Seram and Ambon. Language contact has resulted in the wide use of a number of Malay creoles throughout Maluku of which Ambonese Malay is the best known (Minde, 1997).

Figure 1 Map of Indonesia showing the location of the Malukan islands.

Tryon (1994: 12) suggested that Maluku is possibly the least known Austronesian area and many languages – both Austronesian and non-Austronesian – remain undescribed. The richest descriptions to date include those of Alune (Florey, 1998, 2001) and Buru (Grimes, 1991, 1995) in central Maluku, Taba (Bowden, 2001) and Tidore (Staden, 2000) in north Maluku, and Leti (Engelenhoven, 1995) in south Maluku. These descriptions provide some insights into oral genres, including origin tales, historical narratives, folktales, riddles, and incantations. Parallelism (paired correspondences at the semantic and syntactic levels) is a feature of incantations and some narrative genres. Among the special registers which have been documented are those which were associated with avoidance relationships, hunting, fishing, healing, and headhunting. In some communities, ritual language still accompanies ceremonies held to mark the passing of life stages, and ritual practices associated with agriculture, renewing inter-village alliances, and the building of ritual houses.

Comparative analysis indicates that, in central Malukan languages, the preferred word order within clauses is SVO. Actor arguments in Alune may occur as a full noun phrase, a pronoun, or a proclitic, and actor NPs and pronouns are optionally cross-referenced with a proclitic on the verb. Undergoer arguments may occur as a full noun phrase, a pronoun, or an enclitic.

Au   beta-’u-ru                           esi-tneu  behe  a-’eri-’e     sarei
1SG  opp.sex.sibling-1SG.POSS.INALIEN-PL  3PL-ask   CMP   2SG-work-APP  what
‘my younger siblings they asked me: “What did you work at?”’ (Alune AK: 45)

This pattern of cross-referencing is not always apparent today as the rapid language change which accompanies language endangerment is typically characterized by extensive variation both within and between speech communities.

A morphologically marked alienable–inalienable contrast has been described for a number of the languages of Central Maluku. Synchronically, this contrast is not found across all languages. In those languages in which the contrast is marked, inalienable possession denotes all items which are culturally considered to be intrinsically a part of oneself – the things which we as humans are born with, and certain physical and emotional states. Alienable possession denotes the things which we might acquire through our lives: certain relationships and objects or possessions. Inalienable possession is marked with enclitics and alienable possession with proclitics, as demonstrated in the following Haruku examples:

Au   oi  kura  au   ama-’u                   kura  au   ina-’u
1SG  go  with  1SG  father-1SG.POSS.INALIEN  and   1SG  mother-1SG.POSS.INALIEN
‘I went with my father and my mother’

Esi-kana   esi-lapu-na                  lalu      ani  reu
3PL-fetch  3PL.POSS.ALIEN-shirt-NM.PL  then.MAL  1PL  return.home
‘they fetched their shirts then we went home’
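The placement rule described for Haruku (enclitic for inalienable possession, proclitic for alienable possession) can be illustrated with a minimal Python sketch. The function name `mark_possession` and its parameters are hypothetical and introduced only for illustration; the noun stems and clitics are those attested in the two Haruku examples, and the sketch ignores everything else about Haruku morphology (for instance the plural suffix -na).

```python
def mark_possession(noun: str, clitic: str, alienable: bool) -> str:
    """Attach a possessor clitic to a noun stem.

    Alienable possession: the clitic precedes the noun (proclitic).
    Inalienable possession: the clitic follows the noun (enclitic).
    This is a schematic illustration, not a morphological analyzer.
    """
    if alienable:
        return f"{clitic}-{noun}"
    return f"{noun}-{clitic}"


# Forms taken from the Haruku examples in the text.
father = mark_possession("ama", "’u", alienable=False)  # 1SG inalienable
mother = mark_possession("ina", "’u", alienable=False)  # 1SG inalienable
shirt = mark_possession("lapu", "esi", alienable=True)  # 3PL alienable, plural -na omitted

print(father, mother, shirt)
```

The boolean flag stands in for the cultural classification of the possessed item, which the text describes as lexically and culturally determined rather than predictable from form.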

Mambila 473

See also: Austronesian Languages: Overview; Endangered Languages; Indonesia: Language Situation; Malayo-Polynesian Languages; Papuan Languages.

Bibliography

Blust R A (1978). ‘Eastern Malayo-Polynesian: a subgrouping argument.’ In Wurm S A & Carrington L (eds.) Proceedings of the Second International Conference on Austronesian Linguistics. Canberra: Pacific Linguistics.
Bowden J (2001). Taba: description of a South Halmahera language. Canberra: Pacific Linguistics.
Collins J T (1983). The historical relationships of the languages of Central Maluku, Indonesia. Canberra: Pacific Linguistics.
Collins J T & Voorhoeve C L (1983). ‘Moluccas (Maluku).’ In Wurm S A & Hattori S (eds.) Language atlas of the Pacific area 2: Japan area, Taiwan (Formosa), Philippines, mainland and insular south-east Asia. Canberra: Australian Academy of the Humanities in collaboration with the Japan Academy.
Engelenhoven A van (1995). A description of the Leti language (as spoken in Tutukei). Ridderkerk, Netherlands: Offsetdrukkerij Ridderprint.
Florey M (1998). ‘Alune incantations: continuity or discontinuity in verbal art?’ Journal of Sociolinguistics 2(2), 205–231.
Florey M (2001). ‘Verb and valence in Alune.’ La Trobe Papers in Linguistics 11, 73–120.
Florey M (2005). ‘Language shift and endangerment.’ In Adelaar A & Himmelmann N P (eds.) The Austronesian languages of Asia and Madagascar. London: Routledge. 43–64.
Florey M & Kelly B F (2002). ‘Spatial reference in Alune.’ In Bennardo G (ed.) Representing space in Oceania: culture in language and mind. Canberra: Pacific Linguistics. 11–46.
Florey M & Wolff X Y (1998). ‘Incantations and herbal medicines: Alune ethnomedical knowledge in a context of change.’ Journal of Ethnobiology 18, 39–67.
Grimes B F (ed.) (2000). Ethnologue: languages of the world (14th edn.). Dallas: SIL International.
Grimes C E (1991). ‘The Buru language of eastern Indonesia.’ Ph.D. dissertation, Australian National University.
Grimes C E (1995). ‘Buru (Masarete).’ In Tryon D T (ed.) Comparative Austronesian dictionary: an introduction to Austronesian studies. Berlin: Mouton de Gruyter. 623–636.
Minde D van (1997). Malayu Ambong: phonology, morphology, syntax. Leiden: Research School CNWS.
Staden M van (2000). Tidore: a linguistic description of a language of the North Moluccas. Delft: Systeem Drukkers.
Steinhauer H (forthcoming). ‘Endangered languages in south-east Asia.’ In Collins J T & Steinhauer H (eds.) Endangered languages and literatures in south-east Asia. Leiden: KITLV Press.
Tryon D T (1994). ‘The Austronesian languages.’ In Tryon D T (ed.) Comparative Austronesian dictionary: an introduction to Austronesian studies 1. Berlin: Mouton de Gruyter. 5–44.

Mambila
B Connell, York University, Toronto, ON, Canada
© 2006 Elsevier Ltd. All rights reserved.

Mambila is a Bantoid language situated in the Nigeria-Cameroon borderland. Mambila is a diverse language with approximately 20 different dialects. Among its interesting characteristics is its system of four level tones, and in one lect the presence of two fricative vowels that appear to be reflexes of the so-called super-close vowels of Proto-Bantu. Several Mambila lects are endangered, with some on the verge of extinction.

Classification

Mambila has been recognized since the early 1960s as a Bantoid language. A subgrouping within Bantoid now known as Mambiloid, which includes a number of other languages in the region, was proposed a decade later, although the precise relationship between Mambiloid and the rest of Bantoid remains a matter of debate. Mambila is the most diverse of the Mambiloid languages. It is spoken on both sides of the Nigeria-Cameroon border, on the Mambila Plateau in Nigeria, and on the western edges of the Adamawa Plateau and the Tikar Plain in Cameroon. The great majority of Mambila speakers – an estimated 90 000 of 100 000 total speakers – are in Nigeria. Mambila comprises some 20 dialects, which are divided into two clusters, referred to by their rough geographical orientation as East and West Mambila. Within each cluster there is limited mutual intelligibility among dialects, reflecting a dialect continuum; between the two clusters, mutual intelligibility does not exist, although speakers do recognize the relatedness of their languages to other languages in the region. Strictly on the basis of linguistic criteria, one might be inclined to refer to many of these dialects as distinct languages.

For this reason, the neutral term ‘lect’ is used in referring to individual varieties of Mambila. The main characteristic distinguishing the two dialect clusters is a difference in morpheme structure: In East Mambila a disyllabic root structure, CVCV(C), predominates, which corresponds to a monosyllabic CVC structure in West Mambila. A number of sound correspondences also serve to distinguish the two groupings, e.g., initial /f/ and /h/ in East Mambila correspond to /p/ and /f/, respectively, in West Mambila. Little descriptive work has been done on Mambila. The two lects that have received the greatest attention are Tungba, spoken in Nigeria, and Ba, in Cameroon. Both are West Mambila lects. The following paragraphs present a short summary of Mambila structural characteristics.
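The two distinguishing properties just described, root shape and initial-consonant correspondences, can be written down as a small Python sketch. The names `EAST_TO_WEST_INITIAL`, `west_initial`, and `root_shape` are hypothetical, and the inputs are abstract consonant/vowel skeletons rather than attested Mambila words; only the two correspondences and the two root templates stated in the text are encoded.

```python
import re

# Initial-consonant correspondences stated in the text (East -> West).
EAST_TO_WEST_INITIAL = {"f": "p", "h": "f"}


def west_initial(east_initial: str) -> str:
    """Return the West Mambila initial corresponding to an East Mambila initial.

    A single dictionary lookup is used so the two correspondences do not
    chain (East /h/ maps to West /f/, not onward to /p/). Consonants not
    covered by the stated correspondences are returned unchanged, which is
    an assumption of this sketch rather than a claim about the lects.
    """
    return EAST_TO_WEST_INITIAL.get(east_initial, east_initial)


def root_shape(skeleton: str) -> str:
    """Classify a consonant/vowel skeleton such as 'CVCV' or 'CVC'."""
    if re.fullmatch(r"CVCVC?", skeleton):
        return "disyllabic CVCV(C)"  # predominant in East Mambila
    if re.fullmatch(r"CVC", skeleton):
        return "monosyllabic CVC"    # corresponding West Mambila shape
    return "other"


print(west_initial("f"), west_initial("h"))
print(root_shape("CVCVC"), "|", root_shape("CVC"))
```

Treating the correspondences as a simultaneous mapping rather than as ordered rewrite rules keeps the sketch faithful to the statement that /f/ and /h/ correspond to /p/ and /f/ respectively.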

Phonology

Consonants

Across Mambila lects there is little difference among consonant systems; what differences do exist are mostly related to the historical developments described above. The Ba system, /p, b, t, d, k, g, , , m, n, ɲ, ŋ, , mb, mv, nd, ndʒ, ŋg, , f, v, s, h, l, j, w/, is fairly typical, both in its inventory and in the fact that /p, , , and / in those lects, where they do occur, are infrequent. Distribution of consonants within the word is skewed in Mambila, with all consonants occurring in the initial position, but typically only /p, t, k, m, n, ɲ, ŋ, l/ being found word finally.

Vowels

There is greater variation in the vowel systems of Mambila lects than for the consonants. The vowel system found in Ba, /i, e, a, e, u, o, ɔ/ is the smallest; Tungba has only a slightly larger system, /i, e, ɛ, a, u, o, ɔ, ɑ/, but its phonetic realization is divergent, with allophonic front rounded vowels. Len, another West Mambila lect, is even more divergent, particularly with the presence of two fricative vowels, / , /. These vowels appear to be the result of sub- or adstratal influence from the neighboring Grassfields languages, which may ultimately reflect the super-close vowels of Proto-Bantu.

Tone

Mambila is a register tone language; in West Mambila, it features four level tones that function both lexically and grammatically, and tones combine on single syllables to form a number of surface contours. Pitch realization in the Ba lect has been the subject of a number of experimental phonetic studies (Connell, 1999, 2000b, 2002). Tone in East Mambila lects has not been systematically investigated, although it is known that they have only three level tones.

Morphology

Mambila marks grammatical functions through affixation, typically suffixation. In most West Mambila dialects many of these functions are indicated only with a tonal morpheme; comparative evidence from both West and East Mambila reveals that a !CV melody is reconstructible in most cases. Despite the fact that Mambila is a Bantoid language, there is no system of nominal classification, and only traces remain of a former noun class system that reveals the heritage shared with Bantu. Pluralization is marked by means of a segmental suffix, !bV, except in Ba and other lects on the Tikar Plain, where a cognate prefix is used. There is evidence of an older means of marking plurals, and the presence of the common !bV is likely a recent development, perhaps through areal influences.

Syntax

Little can be said at this point concerning the syntax of Mambila. It has basic SVO word order, which varies to indicate narrative, focus, and other pragmatic functions. As mentioned above, tone is used to indicate a number of grammatical functions, including negation, imperatives, and discontinuous verb phrases.

Language Vitality

Since it has more than 100 000 speakers, one might expect that Mambila will remain relatively stable for the foreseeable future. However, when its internal dialect variation is considered, an average of approximately 5000 speakers exists for each lect. Many of these are spoken by considerably fewer speakers; indeed, a few Mambila lects are on the verge of extinction, while one other has just recently become extinct. Given the potential contribution that these lects could make not only to our understanding of our common linguistic heritage but also to the history and prehistory of sub-Saharan Africa, documentation of these lects must be considered a priority.

See also: Bantu Languages; Endangered Languages; Tone: Phonology.

Bibliography

Blench R (1993). ‘An outline classification of the Mambiloid languages.’ Journal of West African Languages 23(1), 105–118.
Connell B (1999). ‘Four tones and downtrend: a preliminary report on pitch realization in Mambila.’ In Kotey P (ed.) Trends in African linguistics 3: New dimensions in African linguistics and languages. Trenton, NJ: Africa World Press. 75–88.
Connell B (2000a). ‘The integrity of Mambiloid.’ In Wolff E & Gensler O (eds.) Proceedings of the 2nd World Congress of African Linguistics, Leipzig. Cologne: Rüdiger Köppe. 197–213.
Connell B (2000b). ‘The perception of lexical tone in Mambila.’ Language and Speech 43, 163–182.
Connell B (2002). ‘Tone languages and the universality of intrinsic F0: evidence from Africa.’ Journal of Phonetics 30, 101–129.
Perrin M (1974). ‘Direct and indirect speech in Mambila.’ Journal of Linguistics 10, 27–37.
Perrin M (1974). ‘Mambila.’ In Bendor-Samuel J (ed.) Ten Nigerian tone systems. Jos: Institute of Linguistics. 93–108.
Perrin M (1994). ‘Rheme and focus in Mambila.’ In Levinsohn S (ed.) Discourse features of ten languages of West-Central Africa. Dallas: Summer Institute of Linguistics and University of Texas at Arlington. 231–241.
Zeitlyn D (1993). ‘Reconstructing kinship or the pragmatics of kin talk.’ Man 28(2), 199–224.
Zeitlyn D & Connell B (2003). ‘Fractal history on an African frontier: Mambila – Njerep – Mandulu.’ Journal of African History 44, 117–138.

Manambu 475

Manambu
A Y Aikhenvald and P Y L Laki, La Trobe University, Bundoora, Australia
© 2006 Elsevier Ltd. All rights reserved.

Manambu belongs to the Ndu language family, and is spoken by about 2500 people in five villages: Avatip, Yawabak, Malu, Apa:n and Yuanab (Yambon) in East Sepik Province of Papua New Guinea. Between 200 and 400 speakers live in the towns of Port Moresby, Wewak, Lae, and Madang. Most Manambu speakers are proficient in Tok Pisin, the lingua franca of Papua New Guinea; many know English.

In terms of number of speakers, the Ndu family is the largest in the Sepik area, comprising 32% of the Sepik basin dwellers (Roscoe, 1994). It consists of at least eight languages spoken by over 100 000 people along the course of the middle Sepik River and to the north of it. Other documented languages in the family are: Abelam or Ambulas (ca. 40 000; this number includes speakers of a variety of dialects under the names of Maprik, Wosera, West Wosera, and Hanga Kundi); Boikin (ca. 30 000); Iatmul (ca. 12 000); Sawos (ca. 9000); Yelogu (ca. 200); and Ngala (ca. 130). No genetic links between Ndu and other languages of the Sepik area have been proved. The origins, protohome, and internal classification of the Ndu languages remain a matter for debate. Manambu’s closest relatives are Iatmul and Ngala. The trade relationship and marriage exchange with the Iatmul contributed to a large amount of lexical diffusion between the two groups in close contact.

Manambu is synthetic, agglutinating with some fusion, mostly suffixing, and predominantly verb-final. The phonology of Manambu is complicated, with 21 consonants, nine vowels, and contrastive stress. Nouns distinguish eight cases (subject; definite object/locative; dative/aversive; allative/instrumental; comitative; terminative ‘up to the point’; and two cases referring to ‘means of transport’). Three numbers (singular, dual, and plural) and two genders (feminine and masculine in the singular) are expressed via agreement on demonstratives, interrogatives, in possessive constructions, on verbs and on two adjectives (‘big’ and ‘small’). Singular and plural numbers are marked on kinship nouns, and on a few nouns from other semantic groups. The noun ‘child’ has a semisuppletive form for the dual number. Associative plural is marked on kinship nouns and personal names, as in Tanina-bər ‘Tanina and others.’ Gender is distinguished in second and third person singular independent pronouns, and neutralized in the plural. Nouns are assigned genders according to the sex of a human referent, and to shape and size of any other referent. That is, men are assigned to the masculine, and women to the feminine gender; a large house is masculine, and a small house feminine. By semantic extension, an unusually big or bossy woman can be treated as masculine, and a squat fattish man as feminine. Personal names are a distinct subclass of nouns, with special derivational suffixes not used anywhere else in the grammar.

Verbs have a plethora of grammatical categories, covering person, number, gender, tense, numerous aspects (e.g., completive, habitual, and repetitive) and modalities including irrealis, purposive, desiderative, and conditional. A verb in the declarative mood can cross-reference the person, number, and gender of the subject. Alternatively, if a clause contains a constituent that is more topical than the subject, this constituent can also be cross-referenced alongside the subject. The

Connell B (1999). ‘Four tones and downtrend: a preliminary report on pitch realization in Mambila.’ In Kotey P (ed.) Trends in African linguistics 3: New dimensions in African linguistics and languages. Trenton, NJ: Africa World Press. 75–88.
Connell B (2000a). ‘The integrity of Mambiloid.’ In Wolff E & Gensler O (eds.) Proceedings of the 2nd World Congress of African Linguistics, Leipzig. Cologne: Rüdiger Köppe. 197–213.
Connell B (2000b). ‘The perception of lexical tone in Mambila.’ Language and Speech 43, 163–182.
Connell B (2002). ‘Tone languages and the universality of intrinsic F0: evidence from Africa.’ Journal of Phonetics 30, 101–129.
Perrin M (1974). ‘Direct and indirect speech in Mambila.’ Journal of Linguistics 10, 27–37.
Perrin M (1974). ‘Mambila.’ In Bendor-Samuel J (ed.) Ten Nigerian tone systems. Jos: Institute of Linguistics. 93–108.
Perrin M (1994). ‘Rheme and focus in Mambila.’ In Levinsohn S (ed.) Discourse features of ten languages of West-Central Africa. Dallas: Summer Institute of Linguistics and University of Texas at Arlington. 231–241.
Zeitlyn D (1993). ‘Reconstructing kinship or the pragmatics of kin talk.’ Man 28(2), 199–224.
Zeitlyn D & Connell B (2003). ‘Fractal history on an African frontier: Mambila – Njerep – Mandulu.’ Journal of African History 44, 117–138.

Manambu A Y Aikhenvald and P Y L Laki, La Trobe University, Bundoora, Australia © 2006 Elsevier Ltd. All rights reserved.

Manambu belongs to the Ndu language family, and is spoken by about 2500 people in five villages: Avatip, Yawabak, Malu, Apa:n and Yuanab (Yambon) in East Sepik Province of Papua New Guinea. Between 200 and 400 speakers live in the towns of Port Moresby, Wewak, Lae, and Madang. Most Manambu speakers are proficient in Tok Pisin, the lingua franca of Papua New Guinea; many know English.

In terms of number of speakers, the Ndu family is the largest in the Sepik area, accounting for 32% of the inhabitants of the Sepik basin (Roscoe, 1994). It consists of at least eight languages spoken by over 100 000 people along the course of the middle Sepik River and to the north of it. Other documented languages in the family are: Abelam or Ambulas (ca. 40 000; this number includes speakers of a variety of dialects under the names of Maprik, Wosera, West Wosera, and Hanga Kundi); Boikin (ca. 30 000); Iatmul (ca. 12 000); Sawos (ca. 9000); Yelogu (ca. 200); and Ngala (ca. 130). No genetic links between Ndu and other languages of the Sepik area have been proved. The origins, protohome, and internal classification of the Ndu languages remain a matter for debate. Manambu’s closest relatives are Iatmul and Ngala. The trade relationship and marriage exchange with the Iatmul contributed to extensive lexical diffusion between the two groups, which are in close contact.

Manambu is synthetic, agglutinating with some fusion, mostly suffixing, and predominantly verb-final. The phonology of Manambu is complicated, with 21 consonants, nine vowels, and contrastive stress.

Nouns distinguish eight cases (subject; definite object/locative; dative/aversive; allative/instrumental; comitative; terminative ‘up to the point’; and two cases referring to ‘means of transport’). Three numbers (singular, dual, and plural) and two genders (feminine and masculine in the singular) are expressed via agreement on demonstratives, interrogatives, in possessive constructions, on verbs, and on two adjectives (‘big’ and ‘small’). Singular and plural numbers are marked on kinship nouns, and on a few nouns from other semantic groups. The noun ‘child’ has a semisuppletive form for the dual number. Associative plural is marked on kinship nouns and personal names, as in Tanina-bər ‘Tanina and others.’ Gender is distinguished in second and third person singular independent pronouns, and neutralized in the plural. Nouns are assigned genders according to the sex of a human referent, and according to the shape and size of any other referent. That is, men are assigned to the masculine, and women to the feminine gender; a large house is masculine, and a small house feminine. By semantic extension, an unusually big or bossy woman can be treated as masculine, and a squat fattish man as feminine. Personal names are a distinct subclass of nouns, with special derivational suffixes not used anywhere else in the grammar.

Verbs have a plethora of grammatical categories, covering person, number, gender, tense, numerous aspects (e.g., completive, habitual, and repetitive), and modalities including irrealis, purposive, desiderative, and conditional. A verb in the declarative mood can cross-reference the person, number, and gender of the subject. If a clause contains a constituent that is more topical than the subject, this constituent can also be cross-referenced alongside the subject. The imperative mood also marks person and number of the subject, employing a different set of markers. The only fully productive prefix in the language is a-, the marker of second person imperative. Three suffixes expressing prohibition differ in their illocutionary force. Many of the verbal categories – including person and tense – are neutralized in negative clauses. Verb compounding is highly productive; up to three verbal roots can occur together, but the meaning of the combination is frequently unpredictable. Directionality (up, down, inside, outside) is marked both on verbs and on demonstratives. In addition, demonstratives encode six degrees of distance and visibility. Like other non-Austronesian languages of New Guinea, Manambu has extensive clause chaining and a complex system of switch-reference, whereby a nonfinal clause is marked differently depending on whether its subject is the same as, or differs from, that of the main clause. See Aikhenvald with Laki (forthcoming) for a full account of Manambu grammar, and also Aikhenvald (1998) and Allen and Hurd (1972).

The relative complexity of Manambu could be partially accounted for by the substrata of languages spoken by members of neighboring tribes conquered by the Manambu as a result of inter-tribal warfare (Harrison, 1993).

Manambu culture places particular importance on ownership of personal names and various kinds of cultural knowledge. Ritualized debates among rival leaders and the clan groups they represent are, traditionally, the main political forum, and ownership of names is an oft-debated issue. A detailed study of Manambu ethnography is in Harrison (1990, 1993), which also contain a detailed analysis of the kinship system and relationships (of Siouan type). Traditional genres include mourning songs grakudi and foiled marriage songs namai (Harrison, 1982; Takendu, 1977).

Manambu is an endangered language. All the Manambu are bilingual in Tok Pisin (and some also know English). Children in the villages prefer using the local lingua franca, Tok Pisin, in their day-to-day interaction. A literacy program in Manambu is currently being implemented at the local school at Avatip.

See also: Gender, Grammatical; Papua New Guinea: Language Situation; Switch Reference; Tok Pisin.

Bibliography

Aikhenvald A Y (1998). ‘Physical properties in a gender system: a study of Manambu.’ Language and Linguistics in Melanesia 27, 175–187.
Aikhenvald A Y with Laki P Y L (forthcoming). The Manambu language from East Sepik, Papua New Guinea.
Allen J D & Hurd P W (1972). ‘Manambu phonemes.’ Te Reo 15, 37–44.
Farnsworth R & Farnsworth M (1966). Manambu grammar sketch 1: Morphology. Ukarumpa: Summer Institute of Linguistics.
Harrison S (1982). Laments for foiled marriages: love songs from a Sepik River village. Port Moresby: Institute for Papua New Guinea Studies.
Harrison S (1990). Stealing people’s names: history and politics in a Sepik river cosmology. Cambridge: Cambridge University Press.
Harrison S (1993). The mask of war. Manchester/New York: Manchester University Press.
Laycock D (1965). The Ndu language family (Sepik District, New Guinea). Canberra: Linguistic Circle of Canberra Publications.
Roscoe P (1994). ‘Who are the Ndu? Ecology, migration, and linguistic and cultural change in the Sepik Basin.’ In Strathern A J & Stürzenhofecker G (eds.) Migration and transformations: regional perspectives on New Guinea. Pittsburgh: University of Pittsburgh Press. 49–84.
Takendu D (1977). ‘Avatip village, Ambunti sub-province, East Sepik.’ Oral History 5(5), 2–53.

Manchu-Tungus See: Tungusic Languages.

Mandarin See: Chinese.


Mande Languages D Dwyer, Michigan State University, East Lansing, MI, USA © 2006 Elsevier Ltd. All rights reserved.

The Distribution of the Mande-Speaking People

Today, the Mande (or Mandé in French) language group consists of some 30 languages spoken in West Africa from Nigeria to Senegal by an estimated 10 million speakers. The term Mande and its variants (see Table 1) provide not only the basis for many of the names of the Northern Mande languages, but the various names accorded the language family as well. These variants are attributable to (1) minor vowel alternations (e/i and a/e), (2) the consonantal alternation (nd/n/l) found throughout the Mande-speaking area, and (3) the suffix -ka(n), meaning ‘language or dialect.’ The map of the distribution of the Mande languages (Figure 1) shows that the heaviest concentration of Mande languages is in the republics of Guinea and Mali, and adjacent areas of Senegal, Sierra Leone, Liberia, and Ivory Coast. Furthermore, these western languages are contiguous and cover larger areas than those to the east, which appear as islands in a sea of Niger-Congo languages.
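The three sources of variation named above can be made concrete with a small sketch. It assumes, purely for illustration, a stem of the shape m-V-C-V in which the first vowel takes the a/e alternation, the medial consonant the nd/n/l alternation, and the final vowel the e/i alternation; the function names are hypothetical, and the stem shape is not a reconstruction taken from the article. The sketch overgenerates, which is expected: the alternations describe where attested names differ, not a rule that predicts every name.

```python
from itertools import product

# Alternations named in the text. Assigning a/e to the first vowel and e/i
# to the final vowel is an assumption made for this illustration only.
FIRST_VOWELS = ("a", "e")       # a/e alternation
CONSONANTS = ("nd", "n", "l")   # nd/n/l alternation
FINAL_VOWELS = ("e", "i")       # e/i alternation


def candidate_variants():
    """Return every m-V-C-V form licensed by the three alternations."""
    return sorted(
        ("m" + v1 + c + v2).capitalize()
        for v1, c, v2 in product(FIRST_VOWELS, CONSONANTS, FINAL_VOWELS)
    )


def with_language_suffix(stem, nasal=True):
    """Attach the suffix -ka(n) 'language or dialect' to a stem."""
    return stem + ("kan" if nasal else "ka")


# Variants listed in Table 1 (Mandi(ng) is entered without its optional -ng).
ATTESTED = {"Mandi", "Mande", "Mende", "Mani", "Mane", "Mali", "Male"}

forms = candidate_variants()
unattested = [form for form in forms if form not in ATTESTED]
```

Under these assumptions every variant listed in Table 1 falls inside the generated set, and `with_language_suffix("Mande")` yields the name Mandekan.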

The Reconstructed History of the Mande

While scholars have not reached a total consensus on how the Mande evolved, evidence from the historical, archaeological, and linguistic record suggests the following six stages.

Phase 1: The Drying of the Desert

According to McIntosh and McIntosh (1984), the Mande originally lived in a much wetter Saharan area and practiced a herding–fishing–collecting economy. Lexicostatistical evidence (Dwyer, 1989) suggests that 4000 years ago the Mande people were undifferentiated linguistically. Around 3000 B.P. (before present), in response to decreasing rainfall, one branch of Proto-Mande (the earliest form of Mande) speakers migrated southward, where wetter conditions would permit their herding–fishing–collecting way of life. The other branch, known as the western branch, responded to the increasing dryness by intensifying their cultivation of cereals.

Phase 2: The Development of Agriculture

At Jenne-Jeno in the upper Niger delta, archaeologists have identified a site, continuously occupied from about 2250 B.P., that exhibits a second agricultural phase. This elaboration of agriculture may well have been responsible for the diversification and the westward expansion of the Central Mande speakers. This expansion may also have been responsible for pushing the pastoral Soninke further to the north.

Phase 3: The Rise of the Sudanic Kingdoms of Ghana and Mali

At the time of the sedentarization of the Western Mande, the people of this area were engaged in extensive trans-Saharan trade with North Africa. The stimulus for the trade was the alluvial gold found in deposits along the upper Niger River, which was exchanged for Mediterranean merchandise and salt. This trade gave rise to the Soninke-speaking empire of Ghana (700–1100) and the Manding-speaking empire of Mali (800–1550). The substantial area taken up by the Western Mande can be attributed to the expansion of this empire.

Phase 4: Rice and the Development of Forest Agriculture

While research in this area is still in progress, evidence suggests that a form of upland rice in the Guinea Highlands and iron tools permitted the Mande (and Atlantic) populations living along the rainforest–savannah border to enter the forest to practice swidden agriculture. The map shows a number of Mande groups straddling this line, including the Southwestern Mande, the Vai/Kono into present-day Sierra Leone and Liberia, and many of the Eastern Mande peoples into Liberia and Ivory Coast. Using oral traditions and genealogies, Person (1961) concludes that this movement into the rain forest took place in the 15th century. As these agricultural people moved into the sparsely populated rain forests, they increased the risk of malarial infections. In response to this situation, the percentages of the sickle cell trait (an adaptation to malaria) increased in these populations to the point where they are among the highest in the world (Livingstone, 1958).

Table 1 Variants of Mande

Variants: Mandi(ng), Mande (Mandé), Mende, Mani, Mane, Mali, Male
Derived language names: Mandinka, Mandingo, Mandekan, Manianka, Malinke


Figure 1 Map of the Mande languages.

Table 2 Linguistic evidence

Phase 5: The Arrival of the Europeans

Beginning with the Portuguese in 1455, contacts, trade, and finally settlement in this area increased, so that by 1500 permanent trading outposts and slaving operations were fully established. One effect of this development was the decreasing economic importance of the trans-Saharan trade and the decline of the Sudanic kingdoms.

The Linguistic Evidence

Following a technique developed by Ehret (1980), Table 2 shows the Proto-Mande terminology relating to economic activity (hunting, herding, and agriculture). Once the lexicostatistical dates are established, vocabulary items common to all or most members of a branch are taken to have existed at the time the branch was an undifferentiated language. Thus the terms for wine, mortar, and dog were common to the western branch but not the eastern branch, and are presumed to have been part of Western Mande before it separated into its constituent groups. This linguistic evidence is consistent with that proposed for the early phases of Mande.

The Classification of Mande Languages

Current Classification

The internal classification of Mande (see Table 3) has undergone a series of revisions, the most recent and most accurate being that done by Kastenholz (1996). For a full classification of the Mande languages, see the Ethnologue website.

Earlier Classifications

Mande was first recognized as a related group of languages by Sigmund Koelle, who used the term Mandinga (Koelle, 1854). Shortly thereafter Heymann Steinthal (1867) introduced the term Mande (or Mandé). Maurice Delafosse offered the first subclassification of Mande in 1901, in which the major distinction, based on the words for ‘ten,’ was between Mande Tan (the northern group minus Susu and Yalunka) and Mande Fou. Over time, the Tan/Fou categorization became increasingly suspect, but it stood until William Welmers (1958), using a lexicostatistical approach based on the Swadesh 100-word basic vocabulary list, rejected it and produced the first version of the currently accepted system. Welmers concluded that the word tan was a more recent innovation in Western Mande, not the fundamental split that Delafosse had assumed, and introduced the East–West division that remains today.

Table 3 Current classification

Mande (4000 BP) a
  West Mande (3200 BP)
    Central (southwestern) (3200 BP)
      South western (3000 BP): Kpelle, Mende, Looma (Loma)
      Central (2100 BP): Susu-Yalunka, Manding-Jogo
    Northwestern (2400 BP)
      Jowulu
      Soninke-Bobo (2750 BP): Bobo, Soninke
  Eastern Mande (2400 BP)
    Eastern-eastern
      Bisa (Bissa): Barka, Lebir
      Busa: Boka (Boko), Bokabaru (Bokobaru), Busa-Bisa (Busa), Tyenga (Kyenga)
      Sam (Samo), San (Samo), Sane (Samo)
    Eastern southern
      Guro-Tura: Guro, Yaouré, Tura (Toura), Dan, Mano
      Nwa-Ben: Ben, Gban (Gogu), Nwa (Wan), Mwan

a The BP dates are from Dwyer (1989). Each date represents the estimated date at which the languages in the group separated, based on common percentages of basic vocabulary cognates. Thus Central Mande, for example, with its time depth of 2100 BP, is based on a common cognate percentage of 40%.

Table 4 Westermann

West Sudanic
  West Atlantic
  Mande
  Gur
  Togo Remnant
  Kwa
  Benue-Cross River

Source: Westermann (1927).

Table 5 Greenberg

Niger-Kordofanian
  Niger-Congo
    West Atlantic
    Mande
    Gur
    Togo Remnant
    Kwa
    Benue-Cross River
  Kordofanian

Source: Greenberg (1963).

Table 6 Williamson

Niger-Congo
  Kordofanian
  Mande
  Atlantic-Congo
    Atlantic
      North
      Bijago
      South
    Volta-Congo
      Kru
      Kwa
      Benue-Congo
      Dogon
      Adamawa-Gur-Ubangi
    Ijoid?

Source: Williamson (1977).
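The footnote to Table 3 ties a time depth of 2100 BP to a shared cognate percentage of 40%. The article does not say how the conversion was made, so the following is only a sketch under a stated assumption: the classic glottochronological formula t = ln(c) / (2 ln(r)), with a retention rate r of 0.805 per millennium (the constant usually associated with the Swadesh 200-item list). The function name and the choice of constant are illustrative, not taken from Dwyer (1989).

```python
import math

# Assumed retention rate per millennium (the 200-item constant, 0.805).
# The source does not state which constant was used; 0.86 is the figure
# usually quoted for the 100-item list and gives a noticeably older date.
RETENTION_PER_MILLENNIUM = 0.805


def separation_time_years(shared_cognates, retention=RETENTION_PER_MILLENNIUM):
    """Estimate years since two languages split: t = ln(c) / (2 ln(r))."""
    if not 0.0 < shared_cognates <= 1.0:
        raise ValueError("shared_cognates must be a proportion in (0, 1]")
    millennia = math.log(shared_cognates) / (2.0 * math.log(retention))
    return 1000.0 * millennia


# Central Mande in Table 3: 40% shared basic vocabulary.
central_mande = separation_time_years(0.40)
```

With these assumptions, 40% shared cognates corresponds to a little over 2100 years, which is in line with the figure in the footnote; with a retention rate of 0.86 the same percentage would give roughly 3000 years, so the estimate is sensitive to the constant chosen.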

Mande as a Niger-Congo Language

Since the time of Koelle, four major hypotheses concerning the placement of Mande in Niger-Congo have been offered:

• Westermann (1927) included Mande in his West Sudanic, which was very similar to Greenberg’s Niger-Congo (Table 4).
• In 1963, Joseph Greenberg, using a methodology based on the mass comparison of lexical items, accepted and refined Westermann’s view. He renamed West Sudanic as Niger-Congo, the name it bears now, and included it as a branch of a larger grouping, Niger-Kordofanian. Of all the Niger-Congo languages, Greenberg considered Mande the least remote (Table 5).
• Consistent with common usage, Williamson (1977) replaced Greenberg’s Niger-Kordofanian with the term Niger-Congo. Williamson then placed Mande along with Atlantic-Congo (the main body of Niger-Congo languages) and Kordofanian as the first three branches of Niger-Congo (Table 6).
• Also in 1977, Hans Mukarovsky proposed a substantial restructuring in which Mande and Benue-Congo were removed from the old Niger-Congo (renamed West Sahelian) and placed with Songhai (Songhay), previously not considered a Niger-Congo language, as branches of Sahelian (Table 7).

Table 7 Mukarovsky

Sahelian
  West Sahelian [Niger-Congo minus Mande and Benue-Congo]
    West Atlantic: Senegalian, Mel, West Guinean
    West Nigritic: Mel, West Guinean, Togo Remnant, Gur, Western Kwa, Eastern Kwa
    Kwa: Western Kwa, Eastern Kwa
  [Mande-Benue-Songhai]
    Mande
    Benue-Congo
    Songhai

Source: Mukarovsky (1976–1977).

Although the Mukarovsky model is still seen as an interesting hypothesis, currently most scholars favor the Williamson proposal. Nevertheless, the progression from Westermann to Williamson to Mukarovsky does show an increasing awareness of Mande as a remote branch of Niger-Congo. This development has raised questions about whether Mande is actually a Niger-Congo language. Part of this suspicion arises because Mande is unique among the Niger-Congo languages in showing no evidence of a noun class system, which is found in other Niger-Congo languages, and in its almost universal subject-object-verb word order. This led Dwyer (1998) to compare the vocabulary of Mande and samples from other Niger-Congo branches. This study shows that Mande is a lexically coherent group. By lexically coherent, I mean that the best way to explain the vocabulary, basic and other, is to attribute a common ancestor (Proto-Mande) to these languages. The study also found that Niger-Congo (specifically Mande, Benue-Congo (including Bantu), and the western Nigritic core) is also lexically coherent. Finally, the study concluded that of these three language groups, Mande is lexically least related. These conclusions are fully consistent with the Williamson hypothesis but not that of Mukarovsky.

Linguistic Properties

Phonology

A tentative reconstruction of the Proto-Mande consonant system (Table 8) suggests a series of labial, alveolar, velar, and labiovelar voiced and voiceless stops. Because of the eccentric but relatively consistent bimodal patterning of the voiceless stops, Dwyer (1994) tentatively suggested the possibility of a second series of fortis voiceless stops (t′, k′, kp′). Interestingly, this dual series of voiceless stops is analogous to that postulated for Upper Cross by Dimmendaal (1978) and Sterk (1979), and for Volta-Congo by Stewart (1976). In addition, Mande appears to have only an s/z fricative contrast and labial, alveolar, and palatal nasals, along with the liquid l and the glides y and w.

Table 8 Mande consonants

                Labial   Dental   Palatal   Velar   Labiovelar
Stop            p/b      t/d                k/g     kp/gb
Fricative                s/z
Nasal           m        n        ñ
Liquid/Glide             l        y                 w

Tone

Most Mande languages have two level lexical tones (high and low), along with a falling tone, analyzed as a sequence of high followed by low, and a rising tone. Bobo (Bobo Madaré), Mano, and Kpelle have three tones, and one language, Sembla (Seeku), has four. Both Kpelle and Bobo (Bobo Madaré) (Dwyer, 1994) can be shown to have independently evolved a third tone through tone splitting. This suggests that they originally had a two-tone system.

Morphosyntax

One of the most striking facts about the Mande languages is the structural unity of the group and its distinctiveness from other Niger-Congo languages. Syntactically, the Mande languages have an SOV word order, with oblique objects being marked as the objects of specialized postpositions. None of the Mande languages use serial verbs. Many Mande languages distinguish between alienable and inalienable possession. Tense and aspect are generally marked through a combination of verb suffixes and postsubject formatives. Definite articles, demonstratives, and plurals tend to follow the noun or noun + attribute, while possessive pronouns precede the noun.

Research in the area of comparative morphology and syntax is beginning to emerge. Creissels (1980) charted the distribution of four verbal particles in Mandekan dialects, concluding that no clear evolutionary sequence could be ascertained from these data. Grégoire (1980) compared the distinctive properties of Mande relative clauses from all of its major branches: northern, southwestern, southeastern, and Bobo. Dwyer (1985) has traced the evolution of the definite articles in Northwestern Mande. Comparative reconstruction is a far more challenging task than lexicostatistical analysis, but promises more interesting results, not only in the study of the development of the language, but also in the area of cultural history and in understanding the relationship between synchronic and diachronic rules.

Noun Classes

Typically, Niger-Congo languages have several semantically based noun classes (animate, inanimate, diminutive, augmentative, and abstract), usually marked by prefixes, both singular and plural. Languages of the Mande branch do not make use of this morphological device. One possible explanation is that the noun class system developed after Mande separated from Niger-Congo. Alternatively, the system could have been part of Niger-Congo and subsequently lost in Mande. In the latter situation, one would expect some of the Mande languages to show remnants of such a noun class system. However, despite numerous attempts, no evidence has turned up. For example, Dwyer (1990) examined Bobo (Bobo Madaré), which has a very complex system of plural formation requiring the positing of a number of noun classes in order to derive the correct form. However, these noun classes did not turn out to be related (semantically or morphologically) to the Niger-Congo noun classes.

A number of Mande languages have developed writing systems, including a Vai syllabary (Stewart, 1972) that has been in use continuously since the 1830s.

Resources

The Mande Studies Association (MANSA) website has several useful links to other resources in French (Actualités de la recherche au Mali and the Bulletin d’Anthropologie et d’Histoire Africaines en Langue Française) and English. Both the Summer Institute of Linguistics and Ethnologue contain descriptions of individual Mande languages and more detailed maps. The site of the Union Mandingue has posted a history of the Manding-speaking peoples. Additional sources can be found by entering the individual language names given in this article in any search engine. The most thorough bibliography can be found in Kastenholz (1988).

See also: Benin: Language Situation; Guinea: Language Situation; Ivory Coast: Language Situation; Liberia: Language Situation; Niger-Congo Languages; Nigeria: Language Situation; Senegal: Language Situation; Sierra Leone: Language Situation.

Bibliography

Bendor-Samuel J (ed.) (1989). Niger-Congo. Boston: University Press of America.
Creissels D (1980). ‘Variations dialectales dans les systèmes de marques prédicatives des parlers manding.’ In Guarisma & Platiel (eds.). 139–155.
Delafosse M (1901). Essai de manuel pratique de la langue Mandé ou Mandingue. Paris: Leroux.
Dimmendaal G (1978). The consonants of Proto-Upper Cross and their implications for the classification of Upper Cross languages. Leiden: Department of African Linguistics.
Dwyer D (1989). ‘Mande.’ In Bendor-Samuel (ed.). 47–65.
Dwyer D (1990). ‘A second look at Bobo plurals.’ Colloquium on African Linguistics, University of Leiden, September 5–7.
Dwyer D (1994). ‘Is there tone splitting in Bobo?’ Journal of African Languages and Linguistics 15(1), 29–45.
Dwyer D (1998). ‘The place of Mande in Niger-Congo.’ In Maddieson I & Hinnebusch T J (eds.) Language history and linguistic description in Africa. Trenton, NJ: Africa World Press.
Ehret C (2002). The civilizations of Africa: a history to 1800. Charlottesville: University Press of Virginia.
Greenberg J (1966). The languages of Africa. Bloomington: Indiana University Press.
Grégoire C (1980). ‘Les Structures relatives en mandé.’ In Guarisma & Platiel (eds.). 139–155.
Guarisma G & Platiel S A (eds.) (1980). Dialectologie et comparatisme en Afrique Noire. Paris: SELAF.
Kastenholz R (1988). Mande Languages and Linguistics. Hamburg: Helmut Buske.
Kastenholz R (1996). Sprachgeschichte im West-Mande: Methoden und Rekonstruktionen. Cologne: Rüdiger Köppe.
Koelle S W ([1854] 1963). Polyglotta Africana. Graz: Akademische Druck-u. Verlagsanstalt.
Livingstone F (1958). ‘Anthropological implications of sickle cell gene distribution in West Africa.’ American Anthropologist 60, 533–562.
McIntosh R & McIntosh S (1984). ‘Early iron age economy in the inland Niger delta.’ In Clark D & Brandt S (eds.) From hunters to farmers. Berkeley: University of California Press.
Mukarovsky H (1976–1977). A study of Western Nigritic (2 vols). Vienna: Institut für Ägyptologie und Afrikanistik.
Sterk J (1979). ‘The fortis/lenis contrast in upper cross consonants: a survey.’ Kiabárá special issue.
Stewart G (1972). ‘The early Vai script as found in the book of Ndole.’ Paper presented to the Conference on Manding Studies, School of Oriental and African Studies, University of London.
Stewart J M (1976). Towards Volta-Congo reconstruction. Leiden: Leiden University Press.
Welmers W (1971). ‘Mande Niger Congo.’ Current Trends in Linguistics 7, 113–140.
Westermann D (1927). Die westlichen Sudansprachen und ihre Beziehungen zum Bantu. Berlin: Walter de Gruyter.
Williamson K (1989). ‘Niger-Congo Overview.’ In Bendor-Samuel (ed.). 4–45.

Relevant Websites

www.ethnologue.com – Ethnologue.
www.manding-benelux.org – Union Mandingue.
www.sil.org – Summer Institute of Linguistics.
www.swt.edu – Mande Studies Association (MANSA).

Maori W Bauer, Wellington, New Zealand © 2006 Elsevier Ltd. All rights reserved.

Maori is the language of the Polynesian people who settled in New Zealand over 1000 years ago. It belongs to the Eastern Polynesian branch of the Malayo-Polynesian language family. Its current situation is typical of indigenous languages subjected to the effects of European colonization.

The early English missionaries Samuel Marsden and Thomas Kendall were instrumental in having a writing system devised for Maori. This has largely served the language well, although it failed to distinguish between long and short vowels. The dictionary produced by three generations of the Williams family remains highly significant, and W. L. Williams made a substantial contribution to the grammatical description of Maori.

In the first half of the 20th century Maori children were taught English at the expense of Maori. Influential Maori leaders advocated the use of English in Maori homes, and speaking Maori at school was often punished. By the mid-20th century Maori was rapidly dying, although small Maori-speaking communities remained in isolated rural areas. In 1951, Auckland University introduced Maori as an academic subject, which raised its status a little, as did the grammatical descriptions produced by the linguist Bruce Biggs. However, the future looked bleak.

In the 1970s, a Maori political revival began (the ‘Maori renaissance’). It was accompanied by serious endeavors to revitalize the language. Kohanga reo ‘language nests’ were established (preschool education centers providing education in Maori), and these were followed by Maori-medium schools (kura kaupapa Māori), or immersion or bilingual units in mainstream schools. Small Maori radio stations were established with variable amounts of broadcasting in Maori. The Maori language became an official language of New Zealand. A Maori Language Commission was established to aid revitalization, and it has manufactured many vocabulary items from Maori elements to cater to modern needs. A Maori television channel went on the air in 2004.

Today the future of Maori is unclear. It might survive, testimony to the success of the revitalization process, but it is not yet secure. Most speakers have learned Maori as a second language, and many are ‘semi-speakers.’ Many teachers of Maori are not fully fluent, and the quality of the Maori taught in some Maori-medium classrooms is poor. Many children who leave kohanga reo speaking fluently do not use the language as teenagers. Most native speakers of Maori are over 70 years old, although there are some (particularly from the Tuhoe area, where Maori remained strong 20 years longer than elsewhere) still in the workforce. It is common to hear ‘relexicalized English’ – English structures where the content words and some of the grammatical words are replaced by Maori lexical items. Fluent Maori speakers often speak English to each other rather than Maori. While the latest surveys suggest that more people are speaking more Maori, there are very few who are fully conversant with the language. The last New Zealand census contained a question about use of Maori.

Such dialect differences as exist are tribally based. Most are lexical and phonological (with divergent realizations of phonemes rather than different phonological systems), and grammatical differences are not very significant.

Maori has a small phoneme inventory, with ten consonants (/p t k m n ŋ r w f h/), and five vowels (/a e i o u/), each of which may be short or long.



Orthographically, all use the obvious single roman letters except for <ng> (for /ŋ/) and <wh> (for /f/). The proper analysis of the long vowels is debatable: they may be analyzed as clusters of identical short vowels (reflected in ‘double vowel’ orthography), or as separate phonemes (reflected in the macron orthography, which is now official): Maaori vs. Māori. Syllables are open, and there are no consonant clusters. All pairs of short vowels can occur in clusters, but some behave as one syllable and some as two. There are also longer vowel clusters. The rhythm is based on the mora (of form (C)V, where V is short), but stress operates with a bigger unit (C)V(V). Word stress is predictable, with syllables ranked according to the nature of the vowel. All content words contain at least two morae.

Maori has virtually no inflectional morphology and very little derivational morphology, although the allomorphs of the passive suffix have raised significant linguistic interest. The syntax is surface VSO, but the most likely underlying word order is VOS, with a rule that normally moves all but the first phrase in a complex predicate to the right of the subject. The basic unit of syntax is the phrase, which has a grammatical particle indicating the phrase function preceding the lexical material. Modifiers usually follow the lexical head. The grammatical particles include markers of tense and aspect, and prepositions that indicate noun-phrase function, and may be tense-marked. The sentence subject is the only NP without an introductory preposition. Maori does not have a copular verb, and many sentences lack overt verbs. The basic syntax is illustrated in the following:

kei      raro       ngā     pukapuka  tawhito
at-PRES  underside  DEF.PL  book      old
a             Hone  i              t-ō-na            moe-nga
DEF.PL.APOSS  John  at.TNSNEUTRAL  DEF.SG-OPOSS-3SG  sleep-NOML
‘John’s old books are under his bed’

ka   tope  a         Wāka
FUT  chop  PERS ART  Wāka
i    te      rākau  rimu
ACC  DEF.SG  tree   rimu
a       te      Rāhina
at.FUT  DEF.SG  Monday
‘Wāka will chop down the rimu tree on Monday’
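The phonotactic statements made above (open syllables, no consonant clusters, a mora of the form (C)V) lend themselves to a mechanical check. The following Python sketch is illustrative only and is not part of the original description. It assumes the macron orthography, treats <ng> and <wh> as single consonants, and assumes that a long vowel counts as two morae; the function names are hypothetical.

```python
import re

# Assumption: macron orthography; the digraphs <ng> and <wh> are one consonant each.
SEGMENT = re.compile(r"ng|wh|[ptkmnrwh]|[aeiouāēīōū]")
LONG_VOWELS = set("āēīōū")
VOWELS = set("aeiou") | LONG_VOWELS


def tokenize(word):
    """Split a word into consonant and vowel segments."""
    lowered = word.lower()
    segments = SEGMENT.findall(lowered)
    if "".join(segments) != lowered:
        raise ValueError(f"unexpected character in {word!r}")
    return segments


def is_well_formed(word):
    """True if the word ends in a vowel and has no adjacent consonants."""
    segments = tokenize(word)
    if not segments or segments[-1] not in VOWELS:
        return False
    pairs = zip(segments, segments[1:])
    return all(a in VOWELS or b in VOWELS for a, b in pairs)


def count_morae(word):
    """Count morae, assuming short vowel = 1 and long vowel = 2."""
    return sum(2 if s in LONG_VOWELS else 1 for s in tokenize(word) if s in VOWELS)


if __name__ == "__main__":
    # Words taken from the example sentences above.
    for w in ["pukapuka", "tawhito", "moenga", "rākau", "ngā", "te"]:
        print(w, is_well_formed(w), count_morae(w))
```

Under these assumptions the particle te comes out as a single mora, while the content words in the examples all have at least two, which is consistent with the generalization stated in the text.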

The pronoun system distinguishes singular, dual, and plural, and in the first person, inclusive and exclusive. Number is marked on determiners, not nouns (except for a handful of personal nouns), the singular usually with an initial t- and the plural with Ø (e.g., tētahi – sg, ētahi – pl) and there is a special determiner for proper names. There is a very complex system for the expression of ownership, including a distinction akin to alienable/inalienable (marked by a- vs. o-forms). Most lexical items can serve without morphological change in noun phrases, verb phrases, or as modifiers. In narratives, the passive is used for most event sentences, which has given rise to much debate about ergative vs. accusative syntax. A typologically unusual feature is that lexical modifiers of passive verbs take a passive suffix in agreement with the verb. The direct object is not integrated into the grammatical system of Maori in the way that would be expected if the language was originally accusative: for instance it cannot normally be relativized on directly. This is probably one remnant of a former ergative syntax.

See also: Malayo-Polynesian Languages; New Zealand: Language Situation; Number; Passives and Impersonals.

Bibliography
Bauer W A (1993). Descriptive grammar series: Maori. London: Routledge.
Bauer W A (1997). The Reed reference grammar of Māori. Auckland: Reed Books.
Benton R & Benton N (2000). ‘RLS in Aotearoa/New Zealand 1989–1999.’ In Fishman J A (ed.) Can threatened languages be saved? Reversing language shift, revisited: a 21st century perspective. Clevedon: Multilingual Matters Ltd. 423–450.
Biggs B (1961). ‘The structure of New Zealand Māori.’ Anthropological Linguistics 3, 1–54.
Biggs B (1973). Let’s learn Maori: a guide to the study of the Maori language (2nd edn.). Wellington: A. H. & A. W. Reed.
Hale K (1968). Review of P. W. Hohepa, Memoir 20, International Journal of American Linguistics: A profile-generative grammar of Maori. Journal of the Polynesian Society 77, 83–99.
Harlow R B (2001). A Māori reference grammar. Auckland: Pearson Education NZ Ltd.
Head L (1989). Making Maori sentences. Auckland: Longman Paul.
Hohepa P W (1967). Memoir 20, International Journal of American Linguistics: a profile-generative grammar of Maori. Waverley, Baltimore: Indiana University Publications in Anthropology and Linguistics.
King J (2001). ‘Te kōhanga reo: Māori language revitalization.’ In Hinton L & Hale K (eds.) The green book of language revitalization in practice. San Diego: Academic Press. 118–128.


Mapping Syntax Using Imaging: Problems and Prospects for the Study of Neurolinguistic Computation
D Embick, University of Pennsylvania, Philadelphia, PA, USA
D Poeppel, University of Maryland, College Park, MD, USA
© 2006 Elsevier Ltd. All rights reserved.

Noninvasive imaging techniques fall into two classes: hemodynamic (positron–emission tomography, or PET; functional magnetic resonance imaging, or fMRI) and electromagnetic (electroencephalography, magnetoencephalography). A primary thread in imaging research has been directed at identifying brain areas associated with subcomponents of linguistic competence, such as phonology (Poeppel, 1996), syntax (Stowe et al., 2004), and semantics (Bookheimer, 2002). Research in this area makes extensive use of hemodynamic imaging because the impressive spatial resolution offered by such techniques (millimeter scale) lends itself naturally to functional mapping. Here we examine a set of findings that implicate Broca’s area – the canonical language area – in syntactic processing and review the extent to which recent results converge in a way that is interpretable from the perspective of the language sciences. Our focus is on the experimental and conceptual issues that render results from imaging difficult to interpret. We put aside discussion of the more technical problems that confront functional imaging studies, including issues associated with the experimental design (e.g., subtraction methodology), analysis (complexity and potential arbitrariness of criteria in data analysis), and implicit assumptions about the relationship between loci of activation and cognitive systems. The activation of Broca’s area has been reported in many studies of both syntactic comprehension and production (Caplan et al., 1998; Dapretto and Bookheimer, 1999; Embick et al., 2000; Friederici et al., 2000; Friederici, 2002; Homae et al., 2002; Indefrey et al., 2001; Kaan and Swaab, 2002; Kang et al., 1999; Moro et al., 2001; Musso et al., 2003; Newman et al., 2003; Ni et al., 2000), from which it has been concluded that this area has a privileged status with respect to this aspect of grammar. One complicating factor is the use of different terminologies in identifying the relevant cortical regions. 
In particular, there are reports in terms of both Brodmann areas (i.e., BA 44/45) and in terms of gyral and sulcal anatomy (pars opercularis, pars triangularis; also F3op, F3t). Because these definitions are not coextensive, there is a potential for terminology-induced confusion. This situation is in principle capable of being remedied by making reference to standardized coordinate systems, such as the atlas of Talairach and Tournoux (1988), although this process itself is not unproblematic because of the anatomical transformations that are required for such a standardization. The identification of ‘Broca’s area’ in such terms ranges in Talairach coordinates from x −28 to −55, y −8 to +34, and z 0 to +28 (even without including studies that take BA 47 to be part of Broca’s area). It seems, then, that the great variability found in the discussion of this area precludes any straightforward biological interpretation.

Specific studies that associate Broca’s area with syntax employ a number of different design types. Dapretto and Bookheimer (1999) used fMRI in a block design and presented sentences auditorily to subjects, who performed one of two tasks. In the ‘syntactic’ condition, the task was to judge whether two sentences – one active (the policeman arrested the thief) and one passive (the thief was arrested by the policeman) – were the same or different. In a ‘semantic’ condition, subjects judged ‘same’ for two sentences that differed by a single word (the lawyer/attorney questioned the witness) or (the lawyer/driver questioned the witness). Activation in BA 44 was reported for the comparison syntax minus semantics (as well as syntax minus rest), and activation in BA 47 for semantics minus syntax. Auditory presentation was also used in the event-related fMRI study performed by Ni et al. (2000). Subjects performed syntactic and semantic oddball tasks, in which a sequence of grammatical sentences contained an occasional deviant oddball (syntactic: *trees can grew; semantic: #trees can eat). Activation in BA 44/45 was reported for the subtraction of semantics from syntax. The PET study of Moro et al. (2001) used a block design with visual presentation; the task involved silent reading and acceptability judgments on four types of Italian sentences.
In addition to a baseline of Jabberwocky (il gulco gianigevale brale), there were three types of violation: word-order (*gulco il gianigevale brale); morphosyntactic (*il gulco ha gianigiata questo bralo); and phonotactic (*il gulco gianigzlevale brale). Activation for the syntactic and morphosyntactic conditions minus the phonotactic condition was found in left BA 45 and right BA 44/45. Kang et al. (1999) used an event-related design with fMRI; subjects were presented visually with phrasal stimuli containing syntactic and semantic violations. The stimuli were verb phrases like drove cars (the normal condition), in addition to which there were two deviant conditions: syntactically deviant *forgot made and semantically deviant *wrote beers. Relative to the normal condition, activation was found for both the syntactically and semantically deviant stimuli in BA 44/45; the activation in left BA 44 was greater for syntax than for semantics.

In addition to the studies using anomaly detection/judgment outlined above, activation in Broca’s area has been reported in studies of the syntax of artificial language learning (Musso et al., 2003) as well as in studies of syntactic complexity (Caplan et al., 1998). Musso et al. taught subjects artificial ‘grammars’ with two distinct types of rules. One type was a rule found in natural languages but not in the language of the subjects. A second type of rule involved a string manipulation of a type that is not attested in the world’s languages. Subjects were asked to perform an acceptability task on visually presented sentences. Results showed increased activation in BA 45 for the first type of rule (real) in comparison with the second type of rule (unreal). Musso et al. concluded from these results that Broca’s area is specialized for the learning of syntactically possible rules, independent of the age of the learner. Caplan et al. (1998) showed that syntactic complexity effects (measured in terms of processing time differences) are reflected in an increase in signal in Broca’s area. They concluded that this region is specialized for the processing of certain aspects of syntactic structure.

The activation of Broca’s area (defined as BA 44/45) in a number of ‘syntax’ studies employing distinct tasks and designs seems at first glance to confirm the claim that this area is specialized for syntax. There are two further considerations, however, that suggest that this conclusion is at best an oversimplification.
First, Broca’s area has been reported to be active in a number of linguistic tasks that are not syntactic: tasks ranging from lexical tasks, for instance auditory lexical decision (Zatorre et al., 1992; Poeppel et al., 2004) and studies of minimal pairs in tone languages (Gandour et al., 2000), to phonological/phonetic tasks, such as the discrimination of rapid phonetic transitions (Fiez et al., 1995) or the processing of phoneme sequences as opposed to hummed notes (Gelfand and Bookheimer, 2003). The role of Broca’s area in phonetics/phonology is reviewed in Burton (2001). The claim that Broca’s area is exclusively devoted to syntax is thus incorrect, although the possibility that Broca’s area might be specialized for language in some broader sense is left open.

The second consideration that complicates the view of a syntactic Broca’s area is the fact that Broca’s area is active in a number of entirely nonlinguistic tasks. The tasks include motor activation (Iacoboni et al., 1999), motor imagery (Binkofski et al., 2000; Hamzei et al., 2003), and rhythmic perception (Halpern and Zatorre, 1999; Platel et al., 1997). These findings constitute a challenge to the weaker position that Broca’s area is specialized for language in the broad sense.

The interpretation that identifies Broca’s area as responsible for syntax is informed by sources of evidence other than imaging studies: deficit-lesion studies, electrophysiological studies, clinical findings, and so on. From the imaging studies, it is clear that a simple association between Broca’s area and syntax cannot be maintained. At the same time, the apparent set of contradictions generated by imaging studies cannot be surprising given a realistic view of how cognitive functions, including syntax, are computed. In linguistic domains other than syntax, for instance, a complex internal structure is clearly required for processes such as phonetic and phonological analysis, lexical analysis, and so on. Therefore the expectation that syntax should be a simplex, unstructured computation associated with a single undifferentiated cortical region is unrealistic. While it is clear that some of the computational subroutines that are essential for syntactic processing/production are computed in the inferior frontal gyrus (IFG), these are not syntax per se; they are subcomponents of syntax. What is required is a theory of these computations at the correct level of abstraction or granularity, a theory that seeks to associate these computations with different subparts of Broca’s area. For example, two components essential to syntax are the creation of hierarchical structures and a process that linearizes these hierarchical structures. Computations of this type may be factored out of syntax in the broad sense and are perhaps associated with different subparts of the IFG. The natural assumption is that differently structured cortical areas are specialized for performing different types of computations, some of which are necessary for language but also for other cognitive functions.
For instance, the activation of ‘mirror neurons’ in the IFG has a role in motor action/imitation, but this activation also is relevant to the linguistic domain in the context of ‘forward’ models of speech perception (Halle and Stevens, 1962). Some preliminary proposals making distinctions among subregions of BA 44/45 are found in Horwitz et al. (2003). In conjunction with an appropriately granular theory of the computations performed in the brain, the spatial information provided by imaging has the potential to illuminate aspects of the biological foundation of language by providing the critical link between specialized cortical areas and cognitively relevant types of computations.

Acknowledgments
David Poeppel is supported by NIH R01 DC05660.

See also: fMRI Studies of Language; Imaging Brain Lateralization; Syntax of Words.

Bibliography
Binkofski F, Amunts K, Stephan K et al. (2000). ‘Broca’s region subserves imagery of motion: a combined cytoarchitectonic and fMRI study.’ Human Brain Mapping 11, 273–285.
Bookheimer S (2002). ‘Functional MRI of language: new approaches to understanding the cortical organization of semantic processing.’ Annual Review of Neuroscience 25, 151–188.
Burton M (2001). ‘The role of inferior frontal cortex in phonological processing.’ Cognitive Science 25, 695–709.
Caplan D, Alpert N & Waters G (1998). ‘Effects of syntactic structure and propositional number on patterns of regional cerebral blood flow.’ Journal of Cognitive Neuroscience 10, 541–552.
Dapretto M & Bookheimer S (1999). ‘Form and content: dissociating syntax and semantics in sentence comprehension.’ Neuron 24, 427–432.
Embick D, Marantz A, Miyashita Y et al. (2000). ‘A syntactic specialization for Broca’s area.’ Proceedings of the National Academy of Sciences USA 97, 6150–6154.
Fiez J, Tallal P, Raichle M et al. (1995). ‘PET studies of auditory and phonological processing: effects of stimulus type and task condition.’ Journal of Cognitive Neuroscience 7, 357–375.
Friederici A (2002). ‘Towards a neural basis of auditory sentence processing.’ Trends in Cognitive Sciences 6(2), 78–84.
Friederici A, Opitz B & von Cramon D (2000). ‘Segregating semantic and syntactic aspects of processing in the human brain: an fMRI investigation of different word types.’ Cerebral Cortex 10(7), 698–705.
Gandour J, Wong D, Hsieh L et al. (2000). ‘A crosslinguistic PET study of tone perception.’ Journal of Cognitive Neuroscience 12(1), 207–222.
Gelfand J & Bookheimer S (2003). ‘Dissociating neural mechanisms of temporal sequencing and processing phonemes.’ Neuron 38(5), 831–842.
Halle M & Stevens K (1962). ‘Speech recognition: a model and a program for research.’ IEEE Transactions on Information Theory 8(2), 155–160.
Halpern A & Zatorre R (1999). ‘When that tune runs through your head: a PET investigation of auditory imagery for familiar melodies.’ Cerebral Cortex 9, 697–704.
Hamzei F, Rijntjes M, Dettmers C et al. (2003). ‘The human action recognition system and its relationship to Broca’s area: an fMRI study.’ NeuroImage 19, 637–644.
Homae F, Hashimoto R, Nakajima K et al. (2002). ‘From perception to sentence comprehension: the convergence of auditory and visual information of language in the left inferior frontal cortex.’ NeuroImage 16, 883–900.
Horwitz B, Amunts K, Bhattacharyya R et al. (2003). ‘Activation of Broca’s area during the production of spoken and signed language: a combined cytoarchitectonic mapping and PET analysis.’ Neuropsychologia 41(14), 1868–1876.
Iacoboni M, Woods R, Brass M et al. (1999). ‘Cortical mechanisms of human imitation.’ Science 286, 2526–2528.
Indefrey P, Brown C, Hellwig F et al. (2001). ‘A neural correlate of syntactic encoding during speech production.’ Proceedings of the National Academy of Sciences USA 98(10), 5933–5936.
Kaan E & Swaab T (2002). ‘The brain circuitry of syntactic comprehension.’ Trends in Cognitive Sciences 6(8), 350–356.
Kang A, Constable R, Gore J et al. (1999). ‘An event-related fMRI study of implicit phrase-level syntactic and semantic processing.’ NeuroImage 10, 555–561.
Moro A, Tettamanti M, Perani D et al. (2001). ‘Syntax and the brain: disentangling grammar by selective anomalies.’ NeuroImage 13, 110–118.
Musso M, Moro A, Glauche V et al. (2003). ‘Broca’s area and the language instinct.’ Nature Neuroscience 6, 774–781.
Newman S, Just M, Keller T et al. (2003). ‘Differential effects of syntactic and semantic processing on the subregions of Broca’s area.’ Brain Research Cognitive Brain Research 16(2), 297–307.
Ni W, Constable R, Mencl W et al. (2000). ‘An event-related neuroimaging study distinguishing form and content in sentence processing.’ Journal of Cognitive Neuroscience 12, 120–133.
Platel H, Price C, Baron J et al. (1997). ‘The structural components of music perception: a functional anatomical study.’ Brain 120, 229–243.
Poeppel D (1996). ‘A critical review of PET studies of phonological processing.’ Brain and Language 55, 317–351.
Poeppel D, Guillemin A, Thompson J et al. (2004). ‘Auditory lexical decision, categorical perception, and FM direction discrimination differentially engage left and right auditory cortex.’ Neuropsychologia 42, 183–200.
Stowe L, Haverkort M & Zwarts F (2004). ‘Rethinking the neurological basis of language.’ Lingua 115, 997–1042.
Talairach J & Tournoux P (1988). Co-planar stereotaxic atlas of the human brain. Stuttgart, Germany: Thieme Medical Publishers.
Zatorre R, Evans A, Meyer E et al. (1992). ‘Lateralization of phonetic and pitch discrimination in speech processing.’ Science 256, 846–849.


Mapudungan
F Zúñiga, University of Zurich/University of Leipzig/Centro de Estudios Públicos, Zurich, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

The dialects of Mapudungu(n), less deviant from one another than the dialects of English, are spoken by the Mapuche of south-central Chile and central Argentina. Current conservative estimates place the number of fluent speakers at one-third of the almost 1 000 000 ethnic Mapuche; more than 90% are in Chile, of which more than 40% are in or around Santiago and only 30% live in the traditional Mapuche territory. The main present-day dialects are (1) Mapudungun proper or Central Mapudungun, in south-central Chile, and (2) Pehuenche, to the east of the former. Further dialects such as Argentinean Ranquel and Chilean Picunche and Huilliche are either obsolescent or extinct. Other names for the language are Araucanian (from araucanos, the ethnonym used by the Spaniards; present-day Mapuche avoid using this term), Mapuchedungun, and (Re)chedungun.

Several genetic affiliations have been proposed, not only with languages spoken in the south of the continent such as Kawésqar (Qawaskar) and Yaghan (Yámana) but also with language families as distant as Arawak, Carib, and Mayan; to date, none of these proposals has been convincingly substantiated. The first grammar dates from the early 17th century. Written texts began to appear in the early 1900s and have become more numerous only at the end of the 20th century.

The phoneme inventory is simpler than the ones found in neighboring languages: the vowels are /a, e, i, o, u, ɨ/ (where /ɨ/ is unrounded high central when stressed and close to a schwa when unstressed). The glides are palatal /j/, labiovelar /w/, and velar /ɰ/. The consonants are the voiceless unaspirated noncontinuants /p, t̪, t, ʈʂ, c, k/ (where /c/ is alveolopalatal), the voiceless fricatives /f, θ, s/, the nasals /m, n̪, n, ñ, ŋ/, and the liquids /l̪, l, ʎ, ɻ/. The dental series /t̪, n̪, l̪/ contrasts with the alveolar one /t, n, l/ only in highly conservative speech; most speakers have an alveolar series only. Pehuenche has voiced fricatives [v] and [ð] instead of [f] and [θ]. Primary stress can be largely predicted from syllable structure (it tends to fall on the penultimate mora) – with some exceptions, as in a number of disyllabic adverbs whose stress is lexically assigned to the ultima.

Because there is no universally accepted orthographic convention, it is not uncommon to find some variation in the literature and deviant spellings such as instead of (for /θ/), instead of (for /ŋ/), and instead of (for /1000 (1993)

a The Mexican census also lists the numbers of speakers of ‘Guatemalan’ languages now resident in Mexico: Chuj 3900, Jakateko (Popti’, Ab’xab’al) 1300, Q’eqchi’ 1700, K’iche’ 640, Kaqchikel 610, Ixil 310, Awakateko 60, Teko 50.

evident in market contexts), which blurs language and dialect boundaries. Often the divisions between language groups are determined more by political divisions and historical identities than by isoglosses. Rivalry between families in Aguacatán brought about the splintering of Awakateko, spawning a new ‘language,’ Chalchiteko, which won official recognition in Guatemala in 2003. Likewise, historic autonomy and a tradition of armed and political conflict between Q’umarkaaj (the K’iche’ capital) and Rab’inal (the Achi’ center) have created localized identities, which override mutual intelligibility in determining the language boundaries between the two groups. Residents of San Miguel Acatán and San Rafael La Independencia have traditionally considered themselves Q’anjob’al speakers, but the official recognition of Akateko as a language, with its own representation in the Academy of Mayan Languages of Guatemala, has served to accentuate linguistic differences and has discouraged use of educational materials, more widely available in Q’anjob’al. On the other hand, Mam, one of the four largest Guatemalan Mayan languages, and Chuj, spoken in northwestern Guatemala, both have deep internal dialectal splits. Dialects may differ in more than 20% of their core vocabulary, undergo different syntactic processes, and allow different sentential word orders, yet these languages maintain a shared identity.

Estimates of the number of speakers are also highly political. Despite official rhetoric praising the ethnic richness of their countries, both Guatemala and Mexico have traditionally promoted assimilation to a national identity that is indigenous only ancestrally. (Also, El Salvador does not recognize any modern Maya as traditional ethnicities, although it does again host Mayan populations displaced by the genocidal war in Guatemala, 1960–1995. Honduran populations until recently were counted as Spanish speaking, although in the north there were ethnically Ch’orti’ peoples; in 2001, some Honduran rural schools began limited bilingual education, although without materials.) Leopoldo Tzian (1994) points out that official governmental censuses in Guatemala consistently underestimate the number of Mayas compared to surveys done by linguists, by international development agencies, and by health workers. Table 1 gives population figures for Guatemala: official census figures for the Mayan population (note the difference between ethnically identified Maya and those who speak their mother tongue), Tzian’s data, the figures of AJPOPAB’CHI’ (the Commission for the Officialization of the Indigenous Languages of Guatemala), and those of the Ministry of Education Survey for 2003. Table 2 contains population figures for Mexico, showing the official government figures (Instituto Nacional de Estadística, Geografía e Informática, 2000) and those of the Summer Institute of Linguistics (published 2004) with the date of the survey in parentheses. The first label under the rubric ‘language’ gives the traditional name for the language/ethnic group, used in most academic publications and in official documents prior to 2000; the second is the indigenous autodenomination.

Table 3 Sample verbs in Kaqchikel
First-person plural   Gloss               Second-person plural   Gloss
xojwa’                ‘we ate’            xixwa’                 ‘y’all ate’
xixqaq’etej           ‘we hugged y’all’   xojiq’etej             ‘y’all hugged us’

Table 4 Sample positionals in Mam
State     Gloss        Intransitive   Gloss            Transitive    Gloss
wa’li     ‘standing’   wa’ee          ‘s/he stands’    twa’b’in      ‘s/he stood her/him up’
xhjewli   ‘twisted’    xhjewee’       ‘it twists’      txhjewb’in    ‘she/he twists it’
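The agreement pattern behind the Kaqchikel forms in Table 3 can be sketched in a few lines of Python. The sketch is illustrative only. Its segmentation is an assumption that the article does not state: x- is taken to mark completive aspect, oj and ix to be the first- and second-person plural absolutive markers, and qa- and i- the corresponding ergative markers, with wa’ and q’etej as the stems for ‘eat’ and ‘hug’. The function names are hypothetical. The point it illustrates is that the subject of an intransitive verb and the object of a transitive verb draw on the same (absolutive) set, while the transitive subject draws on a separate (ergative) set.

```python
# Assumed segmentation (not given in the article): aspect + absolutive + ergative + stem.
COMPLETIVE = "x"
ABSOLUTIVE = {"1pl": "oj", "2pl": "ix"}  # intransitive subject, transitive object
ERGATIVE = {"1pl": "qa", "2pl": "i"}     # transitive subject


def intransitive(subject, stem):
    """The sole argument of an intransitive verb takes an absolutive marker."""
    return COMPLETIVE + ABSOLUTIVE[subject] + stem


def transitive(subject, obj, stem):
    """The object takes an absolutive marker; the subject takes an ergative one."""
    return COMPLETIVE + ABSOLUTIVE[obj] + ERGATIVE[subject] + stem


if __name__ == "__main__":
    print(intransitive("1pl", "wa'"))           # 'we ate'
    print(intransitive("2pl", "wa'"))           # 'y'all ate'
    print(transitive("1pl", "2pl", "q'etej"))   # 'we hugged y'all'
    print(transitive("2pl", "1pl", "q'etej"))   # 'y'all hugged us'
```

With these assumptions the four calls reproduce the four forms in Table 3 (the source prints the last one with an internal space, which is taken here to be a typesetting artifact). The marker oj appears both in ‘we ate’ and in ‘y’all hugged us’, which is the homology of intransitive subjects and transitive objects that defines an ergative system.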

Grammatical Characteristics The Mayan Languages share many important characteristics, among these are ergativity, positionals, directional particles, and noun and numeral classifiers. These categories are developed in different ways in the various languages.

separate category (marked by ergative pronouns). Most of the Mayan languages show this system, with variations in subparts of the grammar, in which a nominative–accusative agreement pattern (like that Indo-European languages) surfaces. Such systems are referred to as split-ergative. Ch’orti’ has a splitergative system, with the change being triggered by subordination. In addition, Ch’orti’ has a third pronominal set, which serves as prefixed subject markers of incompletive intransitive verbs. Table 3 shows sample verbs in Kaqchikel with subject pronouns in bold type and object pronouns in italics. Note the homology of intransitive subjects and transitive objects. Positionals

Positionals are a special word class in Mayan languages, so-called because many denote positions such as ‘standing,’ ‘lying prone,’ and ‘stuck crosswise in an opening.’ However, some simply name conditions or states, such as ‘wet,’ ‘naked,’ and ‘round.’ Words that belong to this class have special derivational characteristics. The roots are inflected to form two or three types of nonverbal predicates (adjectives), intransitive verbs, and transitive verbs. Table 4 shows examples from Mam. Some languages form reduplicated adjectives from the positional roots, for example, in Chuj nhojanhojan ‘walk fluffily, like a shaggy sheep’ and linganlingan ‘be hanging.’ Kaqchikel (Table 5), Tz’utujiil, and K’iche’ copy the vowel of the root and the first consonant and then add a suffix for singular or plural agreement to form adjectives from positional roots.

Ergativity

Directional Particles

Ergative languages mark the relationship between the verb and its arguments with inflections that treat the subjects of intransitives and objects of transitive verbs as one category (marked by absolutive pronouns) and the subjects of transitives and possessors of nouns as a

These particles, usually variants of intransitive verbs of motion, serve as a complement to main verbs. They may indicate actual movement of the actor or action, or they may add aspectual information. In Mam, transitive verbs almost always cooccur with a

552 Mayan Languages Table 5 Reduplicated adjectives in Kaqchikel State

Gloss

Adjective

Gloss

setesı¨ k kotokı¨ k

‘round, singular’ ‘crooked, singular’

setesa¨q kotoka¨q

‘round, plural’ ‘crooked’

Table 6 Poqomam directionals Directional

Gloss

Intransitive verb

Gloss

Phrasal use

Gloss

ala aka koon pa qa

‘out’ ‘in’ ‘stay’ ‘thither’ ‘down’

-il-ok-kahn-pan-qaj-

‘leave’ ‘enter’ ‘stay, remain’ ‘arrive there’ ‘descend’

xa’ila ala xah’oka aka xahkahna koon xahpana pa xahqaja qa

‘we go out, we leave’ ‘we go in, we enter’ ‘we stay here’ ‘we arrive there’ ‘we go down, we descend’

Table 7 Noun classifiers in Popti’

| Classifier | Objects in the class | Classifier | Objects in the class |
|---|---|---|---|
| komam | Male supernaturals, diseases | komi’ | Female supernaturals |
| ya’ | Adult person | ho’ | Young men |
| xo’ | Young women | naj | Male, unknown, not respected |
| ix | Female, unknown, not respected | unin | Human baby |
| no’ | Animals (other than the dog) | metx | Dog |
| te’ | General plants and their products | ixim | Grains |
| tx’al | Cotton or synthetic thread | tx’anh | Fiber, string |
| q’ap | Cloth | tx’otx’ | Earth, earthenware |
| ch’en | Metal, rock, mineral | atz’am | Salt |
| ha’ | Water, liquid | q’a’ | Fire |

directional complement (see Table 6 for examples in Poqomam). Verb phrases in Yucatec, however, now rely on conjunction rather than complementation.

Noun Classifiers

These particles precede the nouns they modify and ascribe some property, social or material, to the noun. In the Q’anjob’alan group, noun classifiers are highly exploited by the grammar. They serve as definite articles and as pronouns. (See Table 7 for examples in Popti’.) In the neighboring Mamean languages, the system is more attenuated. In K’iche’an languages, classifiers are used more as titles before names than as classifiers. In Yucatec, only morphological vestiges appear in names for a few plants and animals.

Numeral Classifiers

Numeral classifiers may be of two types: suffixal, marking what kind of entity is being counted (Table 8), and independent, showing how the object counted is measured (Table 9). The suffixal type distinguishes three classes in Q’anjob’alan languages: people, animals, and other. Other Mayan languages have only trace suffixes, sometimes invariant in form.
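The three-way suffixal system can be pictured as a lookup from noun class to suffix. The following minimal Python sketch uses the Popti’ forms given in Table 8. The segmentation of the suffixes (`wanh`, `k'onh`, `e'`) is inferred from those forms and is an assumption of this sketch, as are the function and dictionary names. Apostrophes are typed as plain ASCII.

```python
# Suffixes inferred from the Table 8 forms (an assumption of this sketch).
CLASS_SUFFIX = {
    "person": "wanh",
    "animal": "k'onh",
    "other": "e'",
}


def count_form(number_root: str, noun_class: str) -> str:
    """Attach the classifier suffix for noun_class to a number root.

    Number roots are cited with a trailing hyphen (e.g. "kanh-"),
    which is stripped before the suffix is added.
    """
    return number_root.rstrip("-") + CLASS_SUFFIX[noun_class]


print(count_form("kanh-", "person"))    # four people
print(count_form("waj-", "animal"))     # six animals
print(count_form("b'alunh-", "other"))  # nine things
```

The point of the sketch is that the suffix is chosen by the class of the thing counted, not by the number itself.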

Vocabulary

Mayan languages have borrowed words from many languages, including Nahuatl (Náhuatl) (masat ‘deer,’ tinamït ‘town,’ in Kaqchikel), Spanish (mexa ‘table,’ kaxtilanh winakhin ‘I’m a Spaniard, cock’s crow,’ in Chuj), and English (tab’ana’ klik pa ruwi’ ruk’in ri maws ‘click on it with the mouse,’ in Kaqchikel). They have also lent many words, for example, English hurricane < Kaqchikel juraqän ‘(lit.) one leg’ and Spanish makuy < majkuy ‘an herb.’ New words are constantly being coined as cultures come into contact and new educational curricula are implemented. In Guatemala, the Academia de las Lenguas Mayas de Guatemala, a semi-autonomous branch of the government, is authorized to promote and develop the national languages. In Mexico, the federal government provides bilingual educational support, supplemented by the efforts of the Academia de La Lengua Maya in Yucatán, Campeche, and Quintana Roo and by Sna Tz’ib’alom, the independent writers’ cooperative in Chiapas.

Table 8 Popti’ numeral classifier suffixes

| Number root | Gloss | Root with suffix | Gloss |
|---|---|---|---|
| kanh- | ‘four’ | kanhwanh | ‘four people’ |
| waj- | ‘six’ | wajk’onh | ‘six animals’ |
| b’alunh- | ‘nine’ | b’alunhe’ | ‘nine things’ |

Table 9 Tz’utujiil measure words (a)

| Measure | Gloss | Number root | Gloss | Combined form | Gloss |
|---|---|---|---|---|---|
| mooq’ | ‘fistful’ | ox- | ‘three’ | oxmooq’ | ‘three fistfuls’ |
| quum | ‘sip’ | kaj- | ‘four’ | kajquum | ‘four sips’ |
| tz’uur | ‘drop’ | juu- | ‘one’ | juutz’uur | ‘one drop’ |
| seel | ‘slice, layer’ | ka’- | ‘two’ | ka’seel | ‘two slices’ |
| peer | ‘plane surface’ | waq- | ‘six’ | waqpeer tz’alam | ‘six planed boards’ |
| raab’ | ‘long, cylindrical’ | wuq- | ‘seven’ | wuqraab’ kolo’ | ‘seven ropes’ |

(a) Note that the measure word or classifier serves as the base. The number is prefixed in an abbreviated combinatorial form.
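The construction described in the note to Table 9 amounts to prefixing the abbreviated number root to the measure word. A minimal Python sketch using only the Tz’utujiil forms from Table 9 follows. The function and dictionary names are hypothetical, and apostrophes are typed as plain ASCII.

```python
# Abbreviated combinatorial number roots, taken from Table 9.
NUMBER_PREFIX = {1: "juu", 2: "ka'", 3: "ox", 4: "kaj", 6: "waq", 7: "wuq"}


def measured_count(number: int, measure: str) -> str:
    """Prefix the number root to the measure word, which serves as the base."""
    return NUMBER_PREFIX[number] + measure


print(measured_count(3, "mooq'"))  # three fistfuls
print(measured_count(4, "quum"))   # four sips
print(measured_count(2, "seel"))   # two slices
```

For the last two rows of Table 9, the counted noun (tz’alam, kolo’) follows the combined form as a separate word, which the sketch does not model.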

See also: Bilingual Education; Communities of Practice; Ergativity; Guatemala: Language Situation; Language Education Policy in Latin America; Mesoamerica: Scripts; Mexico: Language Situation; Protolanguage.

Bibliography

AJPOPAB’CHI (1998). Propuesta de modalidad de oficialización de los idiomas indígenas de Guatemala. Comisión de Oficialización de los Idiomas Indígenas de Guatemala. Guatemala City, Guatemala: Editorial Nojib’sa.
Campbell L (1977). University of California publications in linguistics 81: Quichean linguistic prehistory. Berkeley, CA: University of California Press.
Campbell L & Kaufman T (1983). ‘Mesoamerican historical linguistics and distant genetic relationships: getting it straight.’ American Anthropologist 85, 362–372.
Campbell L & Kaufman T (1985). ‘Mesoamerican linguistics: where are we now?’ Annual Review of Anthropology 14, 187–198.
England N C (1992). La autonomía de los idiomas mayas: historia e identidad. Guatemala City, Guatemala: Cholsamaj.
Fox J (1978). Proto-Mayan accent, morpheme structure conditions, and velar innovations. Ph.D. diss., University of Chicago.
Instituto Nacional de Estadística, Geografía e Informática (2000). Censo nacional. Guatemala City, Guatemala: Cholsamaj.
Kaufman T (1976). ‘Archaeological and linguistic correlations in Mayaland and associated areas of Meso-America.’ World Archaeology 8, 101–118.
Tzian L (1994). Kajlab’aliil Maya’iib’ xuq mu’siib’: ri ub’antajiik Iximuleew. Mayas y ladinos en cifras: el caso de Guatemala. Guatemala City, Guatemala: Cholsamaj.

Relevant Website http://www.sil.org – Ethnologue.


Joshi A K (1982). ‘Mutual beliefs in question-answer systems.’ In Smith N V (ed.) Mutual knowledge. London: Academic Press. 181–197.
Kasher A (1976). ‘Conversational maxims and rationality.’ In Kasher A (ed.) Language in focus: foundations, methods and systems. Dordrecht: D. Reidel Publishing Company. 197–216.
Keenan E O (1976). ‘The universality of conversational postulates.’ Language in Society 5, 67–80.
Leech G (1983). Principles of pragmatics. London: Longman.
Levinson S C (1983). Pragmatics. Cambridge: Cambridge University Press.
Recanati F (1987). Meaning and force: the pragmatics of performative utterances. Cambridge: Cambridge University Press.
Rundquist S (1992). ‘Indirectness: a gender study of flouting Grice’s maxims.’ Journal of Pragmatics 18, 431–449.
Sperber D & Wilson D (1995). Relevance: communication and cognition (2nd edn.). Oxford: Blackwell.
Thomas J (1995). Meaning in interaction. London: Longman.
Weiner B (1985). ‘“Spontaneous” causal thinking.’ Psychological Bulletin 97, 74–84.

Mayenowa, Maria Renata (1910–1988)
R A Rothstein, University of Massachusetts, Amherst, MA, USA
© 2006 Elsevier Ltd. All rights reserved.

Maria Renata Mayenowa (née Gurewicz) was born on June 2, 1910, in Belostok, Russia (from 1919 Białystok, Poland) and died on May 7, 1988, in Warsaw. She studied Polish language and literature at Stefan Batory University in Wilno, graduating in 1932. After teaching for four years in a Jewish high school she returned to the university, earning her doctorate in May 1939. She worked as a teaching assistant at the university (after 1939 the Vilnius University in Soviet Lithuania) until the Nazi invasion of 1941. She survived the occupation in hiding in northern Lithuania. After working as a librarian in the Białystok City Library in 1945, she spent the next five years in Prague, where her husband, Józef Mayen, was press attaché at the Polish Embassy. In 1948 she was one of the cofounders of the Institute for Literary Studies of the Polish Academy of Sciences (IBL PAN); she served as deputy director of IBL from 1957 to 1968. She held the rank of professor at the academy and at Warsaw University from 1954 until 1968, when she was dismissed as a result of the ‘anti-Zionist’ campaign of that year. She remained active, however, as a scholar, editor, and organizer of scholarly activity to her very last days.

The main focus of Mayenowa’s individual publications was a linguistically informed stylistics and poetics, summed up in her 1974 handbook Theoretical Poetics: Questions of Language (in Polish; second, enlarged and revised edition 1979), which she characterized as dealing with the question of “what in verbal messages (especially those belonging to literature) is determined by their linguistic character” (p. 5). Stylistics and poetics were also at the center of her work as editor and translator, which resulted in collective volumes with titles like Tekst, język, poetyka (Text, language, poetics); Semantyka tekstu i języka (The semantics of text and language); Tekst i język: problemy semantyczne (Text and language: semantic problems); O spójności tekstu (On textual coherence), as well as annotated collections of translations of French, Czech, and Russian works on stylistics. Several of the volumes were the result of the symposia that she organized in Warsaw; they brought together linguists and literary scholars from Poland, Czechoslovakia, and the Soviet Union, as well as Western scholars, and served as an important medium of international intellectual communication.

Mayenowa initiated and played an active editorial role in the dictionary of 16th-century Polish, which has reached volume 31 (volume 1 appeared in 1966), and she was similarly responsible for a series of editions of Polish literary texts, the Library of Polish Writers, covering works from the Middle Ages to the 17th century with extensive explanatory lexicographic and grammatical apparatus. Through such editorial and organizational work, and through her formal and informal didactic activity (including the weekly colloquia that she held at her Warsaw apartment), Mayenowa trained and inspired a whole pleiad of younger scholars, including the Polish-born Australian semanticist Anna Wierzbicka, the Lublin linguist and folklorist Jerzy Bartmiński, the Warsaw lexicologist Zygmunt Saloni, and the late medievalist and versologist from Wrocław University, Jerzy Woronczak.

See also: Lexicography: Overview; Poetry: Stylistic Aspects; Stylistics.


Bibliography

Chodźko B et al. (eds.) (2003). Okna pamięci: Maria Renata Mayenowa, 1910–1988. (Windows of memory: Maria Renata Mayenowa, 1910–1988.) Białystok: Towarzystwo Literackie im. Adama Mickiewicza and Uniwersytet w Białymstoku, Instytut Filologii Polskiej.
Czachowska J (comp.). (1980). Bibliografia prac Marii Renaty Mayenowej. (Bibliography of works of Maria Renata Mayenowa.) Warsaw: Polska Akademia Nauk.
Mayenowa M R (1968). ‘Stylistics in Poland.’ Style 2, 159–173.
Mayenowa M R (1979). Poetyka teoretyczna: Zagadnienia języka (Theoretical poetics: questions of language), 2nd edn. Wrocław: Ossolineum.
Mayenowa M R (1985). ‘Textual coherence and the reader’s attitude.’ In Dziechcińska H (ed.) Textual coherence and problems of the autobiography. Wrocław: Ossolineum. 7–25.
Mayenowa M R (1993). Studia i rozprawy. (Studies and papers). Axer A & Dobrzyńska T (eds.). Warszawa: Wydawnictwo IBL PAN.

Mayan Languages
J M Maxwell, Tulane University, New Orleans, LA, USA
© 2006 Elsevier Ltd. All rights reserved.

The Mayan language family traditionally stretched from what is now northern El Salvador and Honduras, through Guatemala and Belize, and up to the southern states of Mexico, including Chiapas, Quintana Roo, Campeche, Yucatán, and part of La Huasteca. Today the family is more dispersed due to out-migration. Large colonies of Mayan speakers can be found in Los Angeles and other California communities, Arizona, Texas, and Florida. Most linguistic descriptions recognize 31 Mayan languages, including the extinct Chikomulselteko (Chicomuceltec).

Most historical linguists posit the Maya homeland as the Cuchumatán peaks of Guatemala, the area with the greatest linguistic diversity today (Kaufman, 1976; Campbell, 1977; Fox, 1978; Campbell and Kaufman, 1983, 1985). The model of diversification correlates phonological, morphological, and syntactic changes with a least-moves model of out-migration, seeking confirmation in the archaeological method. Based on these reconstructions, Proto-Maya, the mother language from which the modern diversity springs, would have been spoken approximately 4100 years ago. People began to migrate outward, sharing innovations as they moved. The family eventually split into four divisions.

(Note that many of the names of Mayan languages have a variety of spellings. These spellings reflect not only the writing traditions of various authors (English, Hispanic, Mayan), but also their political orientation. In Guatemala, particularly, Mayans have fought for and won official recognition of their own orthographies. In Chiapas, Mayan educators have also elected non-Spanish-based spelling systems. In Yucatán, however, a long tradition of literacy in Maya Yucateco has militated against changing orthographies. The spellings used in this article reflect the local practice.)

1. Wasteko, composed today only of Wasteko (Huasteco).
2. Yucatecan, composed of Maya Yucateco (Yucatán Maya), Mopan (Mopán Maya), Itzaj (Itzá), and Lakantun (Lacandón).
3. Western Division, broken into two branches: Ch’olan and Q’anjob’alan. The Ch’olan branch in turn has two subgroups, Ch’olan Proper, consisting of Chontal, Ch’ol, and Ch’orti’ (Chortí), and Tzeltalan, consisting of Tzotzil and Tzeltal. The Q’anjob’alan branch has two subgroups, Chujean, consisting of Tojolab’al and Chuj, and Q’anjob’alan Proper, consisting of Q’anjob’al (Eastern Kanjobal), Akateko (Western Kanjobal), Popti’ (formerly Jakalteko (Jacalteco)), and Mocho’ (Mochó).
4. Eastern Division, which is subdivided into the Mamean and K’iche’an branches. The Mamean branch is broken into Mam Proper, consisting of Mam and Teko (also called Tektiteko (Tectiteco)), and Ixilan, which includes Ixil (Nebaj Ixil) and Awakateko (Aguacateco). The K’iche’an branch includes the outliers Uspanteko (Uspanteco) and Q’eqchi’ (Kekchí) and two major subdivisions, K’iche’an Proper, consisting of K’iche’ (Quiché), Achi’, Kaqchikel (Central Cakchiquel), Tz’utujiil (Tzutujil), Sakapulteko (Sacapulteco), and Sipakapense (Sipacapense), and Poqom, consisting of Poqomchi’ (Pocomchí) and Poqomam (Pocomam).

Within subgroups there is a high degree of mutual intelligibility and multilingualism (particularly


Mayotte: Language Situation
W Full, University of Mainz, Mainz, Germany
© 2006 Elsevier Ltd. All rights reserved.

Mayotte is the easternmost of the four islands that form the Comorian archipelago halfway between the east African coast and the northern tip of Madagascar. Strictly speaking, Mayotte consists of two islands, Grande Terre and Petite Terre, inhabited by more than 100 000 people.

In the referendum for independence in 1974, the majority on Mayotte voted for staying with France. Therefore, when the other three Comorian islands, Grande Comore (Ngazija/Shingazija), Mohéli (Mwali), and Anjouan (Ndzwani/Shindzwani), became independent in 1975, Mayotte remained a French overseas territory with the status of a collectivité territoriale. Despite many resolutions by the United Nations and the Organization of African Unity demanding a unified, independent Comorian archipelago, the political split still exists. In 2002 Mayotte received the increased status of a collectivité départementale. Because of financial support from France, the economic and social conditions on Mayotte are better than on the independent islands of the archipelago.

While the three other islands are linguistically very homogeneous, with one Comorian dialect per island, the situation on Mayotte is quite different. Although most people also use a form of Comorian as their first language, there are two groups with other languages. In the biggest town, Mamoudzou, and on Petite Terre, we find people of French or Creole origin who speak French as their first language. French is also the official language of Mayotte, and because its education system is better and more efficient than that of the other islands, the knowledge of French is generally more profound. Mainly in the western and southern part of Mayotte, a Sakalava (Bushi) dialect of Malagasy, the national language of Madagascar, is spoken in some towns and villages. Its speakers are descendants of immigrants from Madagascar who arrived on Mayotte in the 19th century. They were culturally assimilated to a large extent but retained their original language.

The most widespread Comorian dialect on Mayotte is Shimaore, which shows greater regional differences than the other dialects do. The ‘purest’ Shimaore is spoken in the west and south of the island, whereas in the northeast many immigrants from Anjouan have affected the language spoken there. Generally, Shimaore and Shindzwani from Anjouan are the two Comorian dialects that are linguistically closest to one another (phonologically, lexically, and morphologically).

In the west and south of Mayotte are also two small enclaves (with two settlements each) where mainly descendants of immigrants from Grande Comore live. Parts of their language, such as verbal morphology, have preserved the Shingazija character while others, like lexicon or nominal morphology, have been clearly influenced by the surrounding Shimaore. Today their language is so different from the other dialects that it must be regarded as an independent dialect within Comorian. I have named it Shikombani after the main settlement. Linguistically its position on the continuum of Comorian dialects is in the middle between Shingazija on the one hand and Shindzwani/Shimaore on the other hand, sharing this position with Shimwali, the dialect on Mohéli.

See also: Bantu Languages; Comoros: Language Situation; Malagasy; Swahili.

Bibliography

Blanchy S (1996). Dictionnaire Mahorais-Français, Français-Mahorais. Paris: L’Harmattan.
Full W (Forthcoming). Dialektologie des Komorischen.
Gueunier N (1986). ‘Lexique du dialecte malgache de Mayotte (Comores).’ Études Océan Indien 7 (numéro spécial).
Nurse D & Hinnebusch T (1993). Swahili and Sabaki: a linguistic history. Berkeley: University of California Press.
Ottenheimer M & Ottenheimer H (1994). Historical dictionary of the Comoro Islands. Metuchen: Scarecrow Press.
Rombi M (1983). Le Shimaore (Ile de Mayotte, Comores): première approche d’un parler de la langue comorienne. Paris: SELAF.

Mayrhofer, Manfred (b. 1926)
R Schmitt, Laboe, Germany
© 2006 Elsevier Ltd. All rights reserved.

Manfred Mayrhofer, one of the most eminent Indo–European and Indo–Iranian scholars of our time, was born on September 26, 1926 in Linz (Austria). After completing his secondary education in 1944 and being held as a prisoner of war for several months, he matriculated at the University of Graz, studying German language and literature in the winter term of 1945–1946. He soon turned, however, to Indo–European comparative grammar and to Indo–Iranian and Semitic studies. He later admitted to being, in many respects, an autodidact in those fields. In 1949, he obtained his doctorate, and two years later he qualified as a lecturer in Indo–European linguistics and Indo–Iranian philology. In 1953, he was offered a position at the University of Würzburg, where he continued to teach as a lecturer and finally was appointed full professor of comparative linguistics in 1958. After a four-year interlude as professor of comparative Indo–European linguistics and Indo–Iranian studies at the University of the Saarland in Saarbrücken, from 1962 to 1966, he went back to his native Austria and was appointed to the chair of general and Indo–European linguistics at the University of Vienna, which had a worldwide reputation from the time of Friedrich Müller and Paul Kretschmer. Since his retirement in 1988, he has lived as a professor emeritus in Vienna. He is a member of more than a dozen academies of sciences and was particularly active in the Austrian Academy of Sciences, notably as a member of the Committee from 1970 to 1982.

Mayrhofer is the author of a large number of manuals, dictionaries, other monographs, articles, and reviews (for a full bibliography up to 1995, see Schmitt, 1979–1996), not only in the field of Indo–Iranian linguistics and onomastics, but also in general Indo–European studies and the history of this discipline. He is one of the most prolific linguists of our time, and in an autobiographical sketch (1991) he characterized himself as an enthusiastic writer of manuals (begeisterter Handbuch-Schreiber).

Mayrhofer’s magnum opus, and actually his lifework, is the etymological dictionary of Old Indo–Aryan; more exactly, his two etymological dictionaries of this language. Beginning during his student days, he gathered material for such a piece of work, the publication of which began when he reached the age of 25. It was finally completed in four volumes in 1980. As the publication went on, the conception of the Kurzgefaßtes etymologisches Wörterbuch des Altindischen/A Concise Etymological Sanskrit Dictionary (1956–1980), which originally had been planned as a rather succinct outline, changed radically, and the dictionary became more and more detailed and informative. As a result of decades of practical experience and of thorough consideration of how to write an etymological dictionary of such a richly attested language, Mayrhofer began, immediately after its completion, to work out an entirely novel successor work, the Etymologisches Wörterbuch des Altindoarischen (1992–2001). This book had nothing in common with its predecessor but its author and subject matter.
Here, he introduced into etymological research of Old Indo–Aryan a sharp differentiation between the words attested already in (more ancient) Vedic literature, and those of the younger language, (Sanskrit proper) documented not earlier than the two great epics for the first time. Those two layers of the vocabulary differ to a large extent in the principal sources of the words themselves, for the older, already Vedic part of the lexicon mainly consists of inherited words (going back to Indo–Iranian and Indo–European). Among the words that appear for the first time only in later centuries, there is a great number of loan words borrowed from either the Middle Indo–Aryan vernaculars (Prakrits), or from non-Aryan languages. As a supplement to this dictionary, in a sense anyway, Mayrhofer most

recently published a monographic study on the personal names attested in the Rig-Veda, to which were added the names of the poets of the Rig-Vedic hymns (2003). With this, he drew attention to a longneglected field, which would deserve intensive efforts by future scholars. He did other work in the field of Indo–Aryan linguistic history, beginning with the linguistic remains of the early Indo–Aryans in the 2nd millennium Ancient Near East (1966). Likewise, he also treated problems of the alleged Dravidian and Austro-Asiatic substratum influences in the Old Indo–Aryan lexical stock, which he viewed recently, however, in a more and more skeptical way. The first major contribution of Mayrhofer’s in the field of the Old Iranian languages was the etymological glossary, which he contributed to the 1958 Spanish and, together with the morphological section, to the later German versions of the Handbuch des Altpersischen (Brandenstein and Mayrhofer, 1964), which his teacher Wilhelm Brandenstein had begun. This glossary was a milestone, insofar as it was the first Old Persian dictionary to also contain the vocabulary attested (outside the royal inscriptions themselves) in the so-called collateral tradition of Elamite, Aramaic, and other non-Iranian sources. By that and by several subsequent reports on current research, he got more and more involved in anthroponomastic studies, which in Old Iranian are of crucial importance for etymological reasons. In 1973, he published a most careful interim analysis of the (mostly Old Iranian) personal names attested in the Elamite cuneiform tablets excavated in Persepolis and belonging, for the most part, to the heyday of the Achaemenid Empire under Darius I (and more exactly to the years 509–494 B.C.). Previously, in 1969, Mayrhofer had initiated the Iranisches Personennamenbuch as a large-scale project of the Austrian Academy of Sciences. 
Someday it will replace Ferdinand Justi’s age-old Iranisches Namenbuch (1895), which over the course of time became somewhat obsolete, owing to the enormous increase in linguistically Iranian onomastic material. He himself prepared the first volume (after which several others followed, and many more will follow in the future), treating the names attested in the Avesta and in the Old Persian inscriptions (Mayrhofer, 1977– 1979). After having taken over the editorship of the Indogermanische Grammatik, founded by Jerzy Kuryłowicz, Mayrhofer prepared a systematic account of the phonology of the Indo–European proto-language (1986), which took into consideration all the particulars and subtleties of the laryngeal theory as well. This book became his most important

556 Mayrhofer, Manfred (b. 1926)

contribution to Indo–European studies in general. Moreover, a rather characteristic feature of his research is his keen interest in problems of the history of Indo–European studies in the 19th and 20th centuries. He once dealt with the question of how the reconstruction of the Proto-Indo–European phonological system moved step by step away from the original Sanskrit-like notation (Mayrhofer, 1983), and two years before that he had investigated the reception of Ferdinand de Saussure’s seminal ideas and theories by today’s Indo–Europeanists (Mayrhofer, 1981). Mayrhofer’s writings stand out for their admirable acquaintance with all the relevant specialist literature and for their careful and balanced judgment. They clearly show that he is capable of presenting even the most complicated matters and theories with didactic skill and in a clear, concise, and quite agreeable linguistic form. See also: Benfey, Theodor (1809–1881); Kuryłowicz, Jerzy (1895–1978); Müller, Friedrich Max (1823–1900); Saussure, Ferdinand (-Mongin) de (1857–1913).

Bibliography
Brandenstein W & Mayrhofer M (1964). Handbuch des Altpersischen. Wiesbaden: Harrassowitz. [First in Spanish: Antiguo Persa, 1958. Madrid: Consejo Superior de Investigaciones Científicas.]
Justi F (1895). Iranisches Namenbuch. Marburg: Elwert. [Repr. Hildesheim: Olms. 1963.]
Mayrhofer M (1956–1980). Kurzgefaßtes etymologisches Wörterbuch des Altindischen (4 vols). Heidelberg: Winter.
Mayrhofer M (1966). Die Indo–Arier im Alten Vorderasien. Wiesbaden: Harrassowitz.
Mayrhofer M (1973). Onomastica Persepolitana. Das altiranische Namengut der Persepolis-Täfelchen. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Mayrhofer M (1977–1979). Iranisches Personennamenbuch. I: Die altiranischen Namen. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Mayrhofer M (1981). Nach hundert Jahren. Ferdinand de Saussure’s Frühwerk und seine Rezeption durch die heutige Indogermanistik. Heidelberg: Winter.
Mayrhofer M (1983). Sanskrit und die Sprachen Alteuropas. Zwei Jahrhunderte des Widerspiels von Entdeckungen und Irrtümern. Göttingen: Vandenhoeck & Ruprecht.
Mayrhofer M (1986). Indogermanische Grammatik. Vol. I, part 2: Lautlehre. Heidelberg: Winter. [Segmentale Phonologie des Indogermanischen.]
Mayrhofer M (1991). ‘Mein Weg zur Sprachwissenschaft.’ In Gauger H-M & Pöckel W (eds.) Wege in die Sprachwissenschaft. Festschrift für Mario Wandruszka. Tübingen: Narr.
Mayrhofer M (1992–2001). Etymologisches Wörterbuch des Altindoarischen (3 vols). Heidelberg: Winter.
Mayrhofer M (2003). Die Personennamen in der Ṛgveda-Saṃhitā: Sicheres und Zweifelhaftes. Munich: Verlag der Bayerischen Akademie der Wissenschaften.
Schmitt R (1979–1996). ‘Bibliographie Manfred Mayrhofer.’ In Mayrhofer M Ausgewählte Kleine Schriften, vols I–II. Wiesbaden: Reichert.

McCawley, James (1938–1999) M Brdar, University of Osijek, Osijek, Croatia © 2006 Elsevier Ltd. All rights reserved.

One of the truly great figures of 20th-century linguistics, most famous as the champion of generative semantics, and a genuine libertarian/anarchist original, James Quillan McCawley Jr. was born in Glasgow, Scotland, on March 31, 1938, as the first child of James Quillan McCawley, a businessman, and Dr. Monica Bateman McCawley, a physician and surgeon. The family emigrated to the United States during World War II; first James Sr. went to Toronto and then to New York with his two brothers, finally settling in Chicago after the war, when he was joined by his wife and children. James Jr. (later, on becoming a U.S. citizen, he changed his name to James David McCawley, jettisoning the Jr.) was soon recognized as a very bright student, skipping several grades while attending parochial grade schools and St Mel’s High School. In his teenage years, he developed an interest in languages and started learning them (so that during his linguistic career he was able to draw on a dozen or so languages). In 1954, at the age of 16, he entered the University of Chicago and progressed rapidly towards the graduate school, where he received an M.S. in mathematics in 1958. Many years later he would say that it was for no particularly good reason that he went for mathematics, and in fact he was increasingly turned off by mathematics. He soon discovered linguistics through a course taught by Eric Hamp. He accepted a Fulbright fellowship to study mathematics and logic at the Westfälische Wilhelms University in Münster,

Germany in 1959 and 1960, but instead of doing mathematics, he started taking all kinds of language courses there. This continued on his return to Chicago, where he fell in love with Japanese. He learned about the new graduate program in linguistics being started at the Massachusetts Institute of Technology (MIT) and applied for it, spending three years in the first Ph.D. class there. He worked as a research assistant at MIT’s Mechanical Translation Group from 1962 to 1963, leaving to study Japanese at Seton Hall in 1963 and to do research at the IBM Watson Center in Yorktown Heights in 1964. His Ph.D. dissertation on the accentual system of modern standard Japanese, submitted in 1965, was supervised by Noam Chomsky. He returned in 1964 to the University of Chicago as an assistant professor of linguistics, where he spent the remainder of his career, except for numerous travels as visiting professor. He was tenured in 1969 as an associate professor and was promoted to full professor in 1970. He married linguist Noriko Akatsuka in 1971, their marriage ending in an amicable divorce in 1978. He died on April 10, 1999. McCawley’s general philosophy may be said to be characterized by consistency across epistemology (science) and praxeology (politics, economics, and ethics), vision being matched by sensitivity to empirical details and data. Although McCawley is chiefly known as a syntactician and semanticist, he clearly resists such ready-made classificatory labels, as he was also a phonologist, morphologist, lexicologist, logician, epistemologist, and pragmaticist (besides being a passionate musicologist and a culinary artist). He first became widely known with his 1968 paper, where he proposed what can be seen in retrospect as a first draft of generative semantics. This movement suited him very well because it was compatible with his background in mathematics and logic, and its arguments were data-driven.
He characterized his approach as a revisionist version of transformational grammar, centering on some ideas of transformational grammar that he considered to be fruitful, such as constituency, multiple syntactic strata, and the cyclic principle. McCawley clearly disagreed with generative linguists regarding the autonomy of linguistics, especially syntax. For him it was impossible to do any syntax without crossing area and disciplinary boundaries. In his framework, syntactic phenomena constituted primarily dependent variables, to be explained as epiphenomena caused by interactions of such independent variables as semantics and pragmatics, as well as morphology and the lexicon. In a very important paper, entitled An un-syntax, McCawley claimed that “much of what has been thought of as syntax is largely a reflection of other things, such as morphology, logic, production strategies, and principles of cooperation” (1981b: 168). This insight actually epitomized all of his work. He kept revising his model incrementally, adding more and more rules and thus extending its reach to accommodate more and more phenomena, until it reached its zenith in The syntactic phenomena of English (1988, 2nd edn., 1998), a reference full of details about English grammar that have always wanted more attention from linguists. Along with his logic book (Everything that linguists have always wanted to know about logic (but were ashamed to ask); 1981a, 2nd edn., 1993), this was no doubt among his most enduring contributions to the pool of classical linguistic literature. Both are eclectic references that synthesize the most significant findings by himself and other linguists about semantics, syntax, pragmatics, and the philosophy of language during his career. Both books were constructive proof that generative grammar can be used to produce sane descriptive statements about language. See also: Chinese; Chomsky, Noam (b. 1928); Constituent Structure; Information Theory; Generative Grammar; Generative Semantics; Japanese; Modularity; Transformational Grammar: Evolution.

Bibliography
Brentari D, Larson G N & Macleod L A (eds.) (1992). The joy of grammar: a festschrift in honor of James D. McCawley. Amsterdam, Philadelphia: John Benjamins.
Koyama W (2000). ‘How to be a singular scientist of words, worlds, and other (possibly) wonderful things: an obituary for James D. McCawley (1938–1999).’ Journal of Pragmatics 32, 651–686.
Lawler J (2003). ‘James D. McCawley.’ Language 79, 614–625.
McCawley J D (1968). ‘The role of semantics in a grammar.’ In Bach E & Harms R T (eds.) Universals in linguistic theory. New York: Holt, Rinehart and Winston. 124–169.
McCawley J D (1970). ‘English as a VSO language.’ Language 46, 286–299.
McCawley J D (ed.) (1976). Notes from the linguistic underground. New York: Academic Press.
McCawley J D (1981a, 2nd edn., 1993). Everything that linguists have always wanted to know about logic (but were ashamed to ask). Chicago: University of Chicago Press, and Oxford: Blackwell.
McCawley J D (1981b). ‘An un-syntax.’ In Moravcsik E & Wirth J (eds.) Current approaches to syntax. New York: Academic Press. 167–193.
McCawley J D (1982). Thirty million theories of grammar. London: Croom Helm, and Chicago: University of Chicago Press.
McCawley J D (1988, 2nd edn., 1998). The syntactic phenomena of English (2 vols). Chicago: University of Chicago Press.
Mufwene S S, Francis E J & Wheeler R S (eds.) (2005). Polymorphous linguistics: Jim McCawley’s legacy. Cambridge, MA: MIT Press.

McDavid, Raven I., Jr (1911–1984) W Viereck, Universität Bamberg, Bamberg, Germany © 2006 Elsevier Ltd. All rights reserved.

Raven Ioor McDavid, Jr. was born in Greenville, South Carolina, in 1911. In 1931 he graduated from Greenville’s Furman University. In graduate school at Duke University he specialized in Milton, receiving his Ph.D. in 1935 with a dissertation on Milton as a political thinker. The following year McDavid taught at a military college in Charleston. In 1937 he participated in the Second Linguistic Institute, taught by famous scholars in the field such as Leonard Bloomfield, Edward Sapir, Hans Kurath, and Bernard Bloch; the latter discovered in McDavid an aptitude for phonetics. Here he was also introduced to H. L. Mencken’s American language. The Linguistic Institute was the strongest impetus to McDavid’s career. He began to study American English as it was then spoken, an activity only interrupted during the World War II years, when he was called upon to do war-related work. He then studied Burmese and also worked on a dictionary of spoken Chinese. From 1945 to 1950 McDavid conducted over 500 interviews for the Linguistic atlas project at Kurath’s request – first on the Atlantic seaboard and then in the north central states. These investigations, which he had begun as early as 1941, led to his life’s work of linguistic geography. In 1952 McDavid finally found a regular teaching position at Western Reserve, and in 1957 he went to the University of Chicago, where he became full professor of English and linguistics in 1964, a position he occupied until his retirement in 1978. Mencken’s American language was instrumental in changing McDavid’s interest from literature to language. In 1963 he published a one-volume abridged edition of this influential book with annotations and modifications reflecting developments after 1948.

McDavid is best known for his work in linguistic geography, both regional and social. The first major outcome of his work in this field was The pronunciation of English in the Atlantic states (1961), written together with Kurath. Moreover, McDavid was editor-in-chief of a number of American regional linguistic atlases. Lexicography, an interest of McDavid’s dating back to his dictionary work during the war years, attracted his attention again later in his career. Among other areas, he investigated the prepublication criticism of the controversial Webster’s third. McDavid was no doubt the leading researcher in American English speech patterns. He made a significant contribution not only to traditional dialectology, but also to the study of social variation. Without him, variationists could not have made their advances. See also: Bloch, Bernard (1907–1965); Bloomfield, Leonard (1887–1949); Kurath, Hans (1891–1992); Sapir, Edward (1884–1939).

Bibliography
Davis L M (ed.) (1972). Studies in linguistics in honor of Raven I. McDavid, Jr. University, AL: University of Alabama Press.
Dill A S (ed.) (1980). Varieties of American English. Essays by Raven I. McDavid, Jr. Stanford, CA: Stanford University Press.
Kretzschmar W A et al. (eds.) (1979). Dialects in culture: Essays in general dialectology by Raven I. McDavid, Jr. University, AL: University of Alabama Press.
Mencken H L (1963). The American language. An inquiry into the development of English in the United States. (4th edn. and 2 supplements, abridged, with annotations and new material by Raven I. McDavid, Jr.) New York: Alfred A. Knopf.

McLuhan, Marshall (b. 1911) R Chatterjee, Lado International College, Silver Spring, MD, USA © 2006 Elsevier Ltd. All rights reserved.

Herbert Marshall McLuhan, a professor at the University of Toronto for most of his active life, was born July 21, 1911, in Edmonton, Alberta. After his college years in Canada, he entered the University of Cambridge for higher studies in English. At an early stage, McLuhan found himself absorbed in James Joyce, Giambattista Vico, and G. K. Chesterton. At Cambridge, he studied and worked closely with I. A. Richards, author of Practical criticism, and, with C. K. Ogden, of The meaning of meaning. McLuhan was also deeply interested in William Blake’s work. He converted to Roman Catholicism in 1937, later emphasizing the connection between his readings in religion and his media study (Gordon, 1997: 75). Marshall McLuhan achieved distinction as a critic and analyst of the media. His aim on leaving Cambridge after the completion of his studies has been described as ‘the training of perception,’ an activity compatible with Cambridge criticism in general. McLuhan saw media (e.g., phonetic writing, radio) as extensions of the human body and the nervous system, an idea originating in the work of his friend Harold Innis (Gordon, 1997: 148), to which he was much indebted. From Innis too came the idea that print creates nationalism. The slogan ‘the medium is the message’ is universally associated with McLuhan. Its meaning has been much expounded and debated. In a letter to David Segal, McLuhan wrote: “Major insight on ‘medium is the message’ is this: each new technology, be it house, or wheel, or radio, creates a new human environment” (1964). Here it is notable that McLuhan thought of language as mankind’s first technology. He was concerned with the question, ‘How can we escape the inevitable changes that new technologies bring?’ Before escape, the changes have to be understood. Commentators list some 10 different ways in which media have an impact on the human being and the environment.

McLuhan classified media like TV and radio into two groups, ‘hot’ and ‘cool.’ Hot media are ‘high definition,’ meaning they are loaded with information, leaving little for the user to do, and include radio, print, and movies, in contrast to cool media like telephone, speech, and TV respectively. He expressed the ‘laws of media’ as a tetrad. For example, e-mail (1) extends writing, (2) renders obsolete long-distance phone calls, (3) retrieves the telegram (gives it a new, intensified form) and (4) reverses into dialogue when writers are online together (Gordon, 1997: 341). McLuhan’s literary training aided his richly metaphoric thinking about media. Among linguists, he read Saussure and Whorf closely. It is debated whether his reading of the former is more correct than Jacques Derrida’s postmodern one. Since the 1990s, critical works on McLuhan have appeared. An early challenging and dismissive work is Jonathan Miller’s McLuhan. See also: Ogden, Charles Kay (1889–1957); Richards, Ivor Armstrong (1893–1979); Saussure, Ferdinand (-Mongin) de (1857–1913); Whorf, Benjamin Lee (1897–1941).

Bibliography
Gordon W T (1997). Marshall McLuhan: escape into understanding: a biography. New York: Basic Books.
McLuhan M (1951). The mechanical bride: folklore of industrial man. New York: Vanguard Press.
McLuhan M (1962). The Gutenberg galaxy: the making of typographic man. Toronto: University of Toronto Press.
McLuhan M (1964). Understanding media: the extensions of man. New York: New American Library.
McLuhan M (1967). The medium is the message. New York/Toronto: Bantam Books.
Miller J (1971). Marshall McLuhan. New York: Viking Press.
Neill S D (1993). Clarifying McLuhan: an assessment of process and product. Westport, CT: Greenwood Press.
Stamps J (1995). Unthinking modernity: Innis, McLuhan, and the Frankfurt School. Montreal: McGill-Queen’s University Press.
Willmott G (1996). McLuhan, or modernism in reverse. Toronto: University of Toronto Press.

Mead, George Herbert (1863–1931) N Denzin, University of Illinois at Urbana–Champaign, Urbana, IL, USA © 2006 Elsevier Ltd. All rights reserved.

George Herbert Mead was a leading figure in American pragmatic philosophy, along with William James, Charles Sanders Peirce, and John Dewey. Mead never published a finished version of his theories, and his ideas and lectures were gathered together by students in several posthumous publications, including The philosophy of the present (1932), Mind, self, and society (1934), Movements of thought in the nineteenth century (1936), The philosophy of the act (1938), George Herbert Mead on social psychology (1956), Mead: selected writings (1962), and The individual and the social self: the unpublished work of George Herbert Mead (1982). These works have figured centrally in the American school of sociology called symbolic interactionism. The son of a New England clergyman and a mother who would later become president of Mt. Holyoke College, Mead graduated from Oberlin College in 1883. From 1887 to 1888, he did graduate work in philosophy and psychology at Harvard University, studying under William James (he lived in the Jameses’ house and tutored their children), Josiah Royce, and George Herbert Palmer. From 1888 to 1891, Mead studied in Berlin and Leipzig, where he came into contact with Wilhelm Wundt’s work on linguistics and physiological psychology and met G. Stanley Hall, who had studied with Wundt. On October 1, 1891, in Berlin, Mead married Helen Castle of the rich landowning Castle family of Honolulu, Castle and Cooke. In the same year, Mead returned to the United States (without a Ph.D.) and took a position in the philosophy department at the University of Michigan, where John Dewey, James Tufts, and Charles Horton Cooley were teaching. Mead followed Dewey to the University of Chicago in 1894, where he remained until his death on April 26, 1931. With Dewey, Angell, and Tufts, he played a major part in the formation of what he called the Chicago School of philosophy.
Mead taught a variety of courses at Chicago, including Social Psychology, Movements of Thought in the Nineteenth Century, the Philosophy of Eminent Scientists, French Philosophy, Aristotle, Hume, Hegel, Leibniz, German Romanticism, Ethics Problems in the Theory of Relativity, and Whitehead. Mead’s course in social psychology was a major influence on Chicago-trained sociologists, including Ellsworth Faris, Robert Park, and Ernest Burgess. After he died, this course was taken over by Herbert Blumer, who would, in 1937, coin the phrase ‘symbolic interaction’ to describe Mead’s unique approach to social psychology.

Five terms were central to Mead’s social psychology: gesture, symbol, self, mind, and act. Mead developed a theory of mind and self that did not regard these phenomena as antecedents to society and social behavior. He argued that the self arises out of the human’s ability to use language in a way that permits the perspective of the other to be entered into. The self consists of two parts, the ‘I’ and the ‘me.’ The ‘I’ is the subjective source of action, of the impulse to act in a situation. The ‘me’ consists of the attitudes of the other who is taken into account as the ‘I’ thinks and acts. The self is a social process involving constant symbolic interaction between the ‘I’ and the ‘me’ phases of experience. The self emerges through three stages of development called the play, the game, and the generalized other phases. In each phase, the person is better able to take the collective attitudes of the other toward their action in a concrete situation. Mind is a social process. It involves the use of gestures and symbols that call out in the person meanings that are shared with others. A triadic process structures the conferral of meaning on a thing: a gesture signifies action to another, it signifies what the person making the gesture plans to do, and it signifies the joint action that is to occur for both parties in the situation. Human experience is not a continuous flow but is organized in terms of social acts. Acts have three phases. Acts arise out of problematic situations in which ongoing activity is blocked. Phase one is the perception of the problematic situation (e.g., ‘I am hungry’). Phase two is the manipulation phase. The person moves to resolve the problematic situation by acting on an object (e.g., preparing a meal). Phase three is consummation. The act is brought to completion (e.g., eating the meal). In each phase of the act, different phases of the self (the ‘I’ and the ‘me’) are experienced.

Mead argued that the “supreme test of any present philosophy . . . must be found in its interpretation of experimental science” (Philosophy of the act, 1938: 505). By experimental science, he meant several things, including modern physics (as reworked by Whitehead), Darwinian naturalism, and Dewey’s instrumentalism (as given in Dewey’s Essays in experimental logic). Mead argued that a scientist must have a clearly formulated problem, or hypothesis, and a clear method of determining if a hypothesis is

confirmed or disconfirmed. The scientist works through the three phases of the act in the resolution of a scientific problem. Scientists must have a procedure that does not allow them to enter into the situation in advance of experimental inquiry. Mead’s pragmatism, or scientific method, was a version of Dewey’s instrumentalism. Mead remains a central figure in contemporary social theory and social psychology. His work has influenced the theories of Jürgen Habermas, Anthony Giddens, and Anselm Strauss, for example. Mead’s works are still topics of empirical investigation and philosophical and theoretical controversy (Blumer and Miller, 2004).

Bibliography
Blumer H & Miller D L (2004). ‘On George Herbert Mead’s contributions to understanding human conduct.’ In Morrone T J (ed.) Herbert Blumer: George Herbert Mead and human conduct. Walnut Creek, CA: AltaMira. 109–154.
Cook G A (1993). George Herbert Mead: the making of a social pragmatist. Urbana: University of Illinois Press.
Joas H (1985). G. H. Mead: a contemporary re-examination of his thought. Cambridge: MIT Press.
Mead G H (1938). The philosophy of the act. Chicago: University of Chicago Press.
Miller D L (1973). George Herbert Mead: self, language, and the world. Austin: University of Texas Press.

Meaning Postulates
K Allan, Monash University, Victoria, Australia
© 2006 Elsevier Ltd. All rights reserved.

Carnap (1956: 225) proposed that meaning postulates are relative to a purpose, for instance:

Suppose [a man constructing a system] wishes the predicates ‘Bl’ and ‘R’ to correspond to the words ‘black’ and ‘raven’. While the meaning of ‘black’ is fairly clear, that of ‘raven’ is rather vague in the everyday language. There is no point for him to make an elaborate study, based either on introspection or on statistical investigation of common usage, in order to find out whether ‘raven’ always or mostly entails ‘black.’ It is rather his task to make up his mind whether he wishes the predicates ‘R’ and ‘Bl’ of his system to be used in such a way that the first logically entails the second. If so, he has to add the postulate (P2) ‘$(x)(Rx \rightarrow Bl\,x)$’ to the system, otherwise not.

Given P2 nothing can be a raven that is not black. If the system is a semantic metalanguage for English, then (according to Carnap) it is analytically true that $\forall x[\mathrm{raven}'(x) \rightarrow \mathrm{black}'(x)]$ (to use a different, but equivalent formulation). One may dispute the choice of example (to be raven-haired is to have black hair, but an observed raven may be albino); yet the method for introducing nonlogical vocabulary into formal semantics seems appropriate. For instance it is generally agreed that something like (1) and (2) are valid:

(1) $\forall x\,\forall y[\mathrm{kill}'(x,y) \rightarrow \mathrm{cause}'(x,(\mathrm{become}'(\neg(\mathrm{alive}'(y)))))]$

(2) $\forall x[\lambda y[\mathrm{bull}'(y) \wedge \mathrm{animal}'(y)](x) \rightarrow \mathrm{male}'(x)]$
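To illustrate how a postulate such as (1) licenses an inference, the following short derivation may help. It is only a sketch: the individual constants $a$ and $b$ are assumptions introduced for the example and do not appear in the postulate itself.

```latex
\begin{align*}
&1.\ \forall x\,\forall y[\mathrm{kill}'(x,y) \rightarrow \mathrm{cause}'(x,\mathrm{become}'(\neg\mathrm{alive}'(y)))] && \text{meaning postulate (1)}\\
&2.\ \mathrm{kill}'(a,b) && \text{premise}\\
&3.\ \mathrm{kill}'(a,b) \rightarrow \mathrm{cause}'(a,\mathrm{become}'(\neg\mathrm{alive}'(b))) && \text{universal instantiation, 1}\\
&4.\ \mathrm{cause}'(a,\mathrm{become}'(\neg\mathrm{alive}'(b))) && \text{modus ponens, 2, 3}
\end{align*}
```

Step 4 follows from the postulate together with the premise alone, which is the sense in which the relation between the two sides of the arrow is treated as necessary rather than contingent.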

The predicates $\mathrm{kill}'$, $\mathrm{cause}'$, $\mathrm{bull}'$, etc. are nonlogical vocabulary treated as semantic primitives. The stipulation of meaning postulates for any given language is problematic. There is no consensus over what constitutes the set of semantic primitives for any language. There is the problem of correspondence between the metalanguage and the object language (e.g., between $\mathrm{kill}'$ and ‘kill’). There is often a problem determining which semantic relations are necessary and which are contingent. For instance, one might question how the figurative killing of fires or conversations can be accommodated with (1). (2) is valid for papal bulls, male calves, male elephants, male whales, male seals, and male alligators; but how is it to be extended to incorporate the default reference to an adult bovine?

Fodor (1998, and elsewhere) has argued against meaning postulates (which he supported in Fodor et al., 1975). In part his argument is that lexical meanings are not compositional for words like ‘kill’ and ‘bull.’ But there is no reason to insist that the right-hand side of the arrow in either (1) or (2) represents part of the semantic composition of the left-hand side. It simply shows a valid relation between structures containing some semantic primitives. Fodor also adopts Quine’s (1953) objections to analyticity and claims that no principled distinction can be drawn between meaning postulates and encyclopedic knowledge. Determining what is semantic and what is encyclopedic is a knotty problem we cannot resolve here (see Allan, 2001 for a discussion). Horsey (2000) marshals several arguments against Fodor’s position. Meaning postulates remain a useful device for semantics.
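One way to picture what a meaning postulate does is to treat it as a filter on admissible models, in the spirit of Carnap's P2. The sketch below is only an illustration under simplifying assumptions: predicates are one-place, a model is a mapping from predicate names to sets of individuals, and the names `violations`, `admissible`, and the sample individuals are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A model maps each (hypothetical) predicate name to its extension:
# the set of individuals the predicate is true of.
Model = Dict[str, Set[str]]

# A postulate (antecedent, consequent) stands for:
# for all x, antecedent(x) -> consequent(x).
Postulate = Tuple[str, str]


def violations(model: Model, postulate: Postulate) -> Set[str]:
    """Individuals that satisfy the antecedent but not the consequent."""
    antecedent, consequent = postulate
    return model.get(antecedent, set()) - model.get(consequent, set())


def admissible(model: Model, postulates: List[Postulate]) -> bool:
    """A model is admissible only if no postulate has a counterexample in it."""
    return all(not violations(model, p) for p in postulates)


P2: Postulate = ("raven", "black")  # Carnap's P2, restated as a pair

# Illustrative models; the individuals are invented for this example.
all_black: Model = {"raven": {"r1", "r2"}, "black": {"r1", "r2", "coal"}}
with_albino: Model = {"raven": {"r1", "albino"}, "black": {"r1"}}

print(admissible(all_black, [P2]))   # True
print(violations(with_albino, P2))   # {'albino'}
```

On this picture, adopting P2 does not report a fact about ravens; it rules out any model, such as `with_albino`, in which something counts as a raven without counting as black.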

See also: Formal Semantics; Ideophones; Metalanguage versus Object Language; Semantic Primitives.

Bibliography

Allan K (2001). Natural language semantics. Oxford & Malden, MA: Blackwell.
Carnap R (1956). Meaning and necessity (2nd edn.). Chicago: University of Chicago Press. [First published 1947.]
Dowty D R, Wall R E & Peters S (1981). Introduction to Montague semantics. Dordrecht: Reidel.
Fodor J A (1998). Concepts: where cognitive science went wrong. Oxford: Oxford University Press.
Fodor J D, Fodor J A & Garrett M F (1975). ‘The psychological unreality of semantic representations.’ Linguistic Inquiry 6, 515–531.
Horsey R (2000). ‘Meaning postulates and deference.’ UCL Working Papers in Linguistics 12, 45–64.
Quine W V O (1961). From a logical point of view: 9 logico-philosophical essays. Cambridge, MA: Harvard University Press. [First published 1953.]

Meaning, Procedural and Conceptual
D Blakemore, University of Salford, Greater Manchester, UK
© 2006 Elsevier Ltd. All rights reserved.

The distinction between procedural and conceptual meaning resulted from the attempt to reanalyze Grice’s (1989) notion of conventional implicature in relevance theoretic terms, or, more generally, from the attempt to provide a cognitive reanalysis of the distinction drawn between truth conditional and nontruth conditional meaning (see Grice, 1989; see also Grice, Herbert Paul (1913–1988); Implicature). However, soon after its introduction (Blakemore, 1987), it was shown that the relevance theoretic distinction was not coextensive with the distinction between truth conditional and nontruth conditional meaning, and thus represented a departure from the approach to semantics that underlies Grice’s notion of conventional implicature. While there is a range of phenomena that can be analyzed in terms of the conceptual–procedural distinction, the emphasis in pragmatics has mostly centered on its application to the analysis of so-called ‘discourse markers’ (see Discourse Markers).

It has to be said, however, that the term ‘procedural’ is not always used in the same way, and that the distinction between conceptual and procedural encoding is not always drawn within a relevance theoretic framework. In argumentation theory, any expression which has an argumentative function is said to encode procedural meaning (Moeschler, 1989). According to this criterion, ‘because’ is procedural; however, it is not procedural according to the relevance theoretic definition outlined below. Fraser (1996, 1998) has used the term ‘procedural’ to describe the meaning of any expression with an indicating function, where indicating is intended in its speech act theoretic sense, in which it contrasts with saying or describing. Because the relevance theoretic distinction is not coextensive with the traditional distinction between truth conditional and nontruth conditional meaning, it is not surprising that there are expressions which are procedural according to Fraser but conceptual according to the relevance theoretic definition – e.g., ‘as a result.’

This article will focus on the relevance theoretic distinction, showing, on the one hand, how it derives from a relevance theoretic view of linguistic semantics, and on the other, how the principle of relevance provides an explanation for the fact that languages have developed means for encoding procedures (see Relevance Theory). It will then turn to the way in which procedural encoding has been applied in the analysis of discourse markers or connectives, and the questions this type of analysis raises.

Relevance Theoretic Semantics

Grice’s (1989) notion of conventional implicature was a means of maintaining a definition of what is said in which linguistic meaning coincides with truth conditional content. While he accepts that the suggestion carried by ‘therefore’ in (1) is linguistically encoded, he does not wish to allow that in his “favoured sense of ‘say,’ one who utters (1) would have said that Bill’s being courageous follows from his being a philosopher” (1989: 21):

(1) Bill is a philosopher and he is, therefore, brave.

Grice’s solution to the problem posed by nontruth conditional linguistic meaning was to modify his definition of what is said so that it applied only to the performances of so-called central speech acts, and not to the performances of noncentral speech acts indicated by expressions such as ‘therefore’ (see Speech Acts; Speech Acts and Grammar).

However, as Sperber and Wilson (1995) and Carston (2002) have shown, the assumption that linguistic meaning coincides with the truth conditional content of the utterance cannot be justified, because the propositional content of utterances is underdetermined by their linguistic properties. While the thoughts communicated by utterances may have truth conditions, it cannot be said that linguistic form encodes thoughts. Linguistic meaning is an input to the pragmatically constrained inferential processes that use contextual information to deliver thoughts. The question for linguistic semantics, in this view, is not about the relationship between language and the world but about the relationship between linguistic form and the inferential processes that deliver thoughts (see Pragmatics and Semantics).

The distinction between conceptual and procedural encoding derives from the argument that if utterance interpretation involves the construction and inferential manipulation of propositional representations, then it is reasonable to expect two answers to this question. On the one hand, linguistic form can encode the constituents of the conceptual representations that take part in inferential computations, and on the other, it can encode information that makes particular kinds of computations salient. Consider (2a) and (2b), where (2b) could be interpreted as a contextual implication derived in an inference that has (2a) as a premise, or as a premise in an inference that has (2a) as a conclusion.

(2a) Tom will be late.
(2b) He is coming from London.

The claim that linguistic meaning can encode procedures is the claim that there are linguistic expressions – e.g., ‘so’ or ‘after all’ in this case – which guide the hearer to the inferential procedure that yields the intended interpretation.

Why Languages Develop Procedural Encoding

It has been argued that the fact that languages have developed expressions which constrain inferential procedures can be explained within relevance theory in terms of the communicative principle of relevance (Sperber and Wilson, 1995). According to this principle, a hearer who recognizes that a speaker has deliberately communicated with her is entitled to assume not just that the utterance is relevant enough (in terms of effects and effort) to be worth her attention, but that it is the most relevant utterance the speaker could have produced, given the speaker’s interests and abilities. Because the degree of relevance is affected by the amount of processing effort required for the derivation of the intended cognitive effects, this means that the use of an expression that encodes a procedure for identifying the intended interpretation would be consistent with the speaker’s aim of achieving relevance for a minimum processing cost.

Traugott and Dasher (2002) have argued that languages tend to develop procedural constraints on interpretation out of their existing conceptual resources. However, although they agree that there is a distinction between procedural and conceptual (or, as they call it, ‘contentful’) meaning, their account of how meaning change takes place is based on the assumptions of cognitive semantics and neo-Gricean pragmatics rather than on relevance theoretic pragmatics, and it is not clear that the distinction is drawn in the same way (see Cognitive Linguistics; Neo-Gricean Pragmatics). Moreover, their account assumes that meaning change results from the conventionalization of pragmatic inferences, where this is unpacked in (Gricean) terms of the conventionalization of the relation between a linguistic form and the proposition or propositions derived by inference. A relevance theoretic account would approach this process in terms of the conventionalization of an inferential routine or process.

If procedural meaning develops from existing conceptual resources, then it would not be surprising for an expression to encode both a concept and a procedure. Although it has been argued that this possibility has been ruled out in relevance theory, it is consistent with relevance theoretic assumptions, and there have been relevance theoretic analyses (e.g., Blakemore’s 1987 analysis of ‘but,’ Nicolle’s 1997 and 1998 account of ‘be going to,’ and Wilson’s 2004 analysis of expressions such as ‘few’ and ‘several’) where it is argued that a single form may encode two types of meaning.

The Conceptual–Procedural Distinction and Conventional Implicature

While the notion of procedural encoding was developed as a means of analyzing expressions which encode constraints on the recovery of implicit content – e.g., ‘therefore,’ ‘so,’ ‘after all’ – subsequent investigation suggested that it could be extended to expressions that encode constraints on the recovery of explicit content. Some of these – mood indicators, attitudinal particles – are analyzed as encoding constraints on the inferential processes involved in the recovery of higher-level explicatures. For example, Wilson and Sperber (1993) suggest that the use of ‘huh’ in (3) encourages the hearer to construct the higher-level explicature in (4) (see also Wilson and Sperber, 1988; Clark, 1993):

(3) Peter’s a genius, huh!
(4) The speaker of (3) doesn’t think that Peter is a genius.

In this case, the equation between procedural meaning and nontruth conditional meaning is maintained, because higher-level explicatures are not regarded as contributing to truth conditional content. However, it has also been suggested that there are expressions, notably pronouns, which should be analyzed as constraints on the proposition expressed, and thus contribute to truth conditional content. At the same time, it has been argued (Wilson and Sperber, 1993; Ifantidou-Trouki, 1993; Blakemore, 1990, 1996) that there are expressions, such as sentence adverbials or parentheticals, which, although they do not contribute to truth conditional content, must be analyzed as encoding concepts (see Parentheticals). This means that the procedural–conceptual distinction cannot be coextensive with the distinction that underlies Grice’s notion of conventional implicature.

If Carston’s (2002) conception of linguistic semantics is right, then the fundamental distinction must be the conceptual–procedural distinction rather than the distinction between truth conditional and nontruth conditional meaning. This means that linguistic semantics must include a means of distinguishing conceptual from procedural meaning. Within relevance theory, attention has been drawn to properties that distinguish expressions that encode procedures from those that encode concepts.

First, in contrast with concepts, procedures cannot be brought to consciousness (Wilson and Sperber, 1993: 16). This explains why even native speakers find it difficult to judge whether expressions which encode procedures – e.g., ‘but’ and ‘however’ in English or ‘dakara’ and ‘sorede’ in Japanese – are synonymous without testing their intersubstitutability in all contexts. In contrast, even when the definition of a concept proves controversial, speakers can say whether two expressions encode the same concept without having to test whether they can be substituted for each other in every context. If it is difficult for native speakers to make synonymy judgments, then it is not surprising that the translation of expressions that encode procedures is notoriously difficult, particularly since languages do not necessarily conventionalize the same inferential routines. However, the translation of procedural meaning has yet to be investigated systematically.

Second, while expressions that encode concepts can be semantically complex, expressions that encode procedures cannot combine with other expressions to produce semantically complex expressions. Compare (5) with (6):

(5) In total, absolute confidence, she has been promoted.
(6) Sue likes red wine. *Totally however, Mary drinks beer.

Rouchota (1998) has shown that while expressions that have been analyzed as encoding procedures can combine in some way, they do not combine to form complex discourse markers. Compare (7) with (5):

(7) Sue fell asleep during the lecture. But after all, she had heard it all before.

Procedural Analyses of Discourse Markers

The notion of procedural encoding has been applied to the analysis of a range of nontruth conditional discourse markers in a variety of languages (e.g., Blass, 1990; Itani, 1993; Unger, 1996; Matsui, 2002; Iten, 2000; Blakemore, 2002). However, this work has suggested that the original notion is neither sufficiently fine-grained to capture the differences between closely related but nonintersubstitutable discourse markers (e.g., ‘but’ and ‘however’ or ‘dakara’ and ‘sorede’), nor sufficiently broad to capture all the ways in which linguistic form can encode information about the inferential computations involved in the interpretation of utterances. As it is defined by Blakemore (1987), procedural encoding is linked to the three cognitive effects defined within relevance theory: contextual implication, strengthening, and elimination. In order to account for the differences between expressions such as ‘but’ and ‘however,’ it has been argued that the notion of procedural encoding must be broadened to include the activation of particular types of contexts. Moreover, it has been suggested that the meanings of some discourse markers (e.g., ‘well’) may not necessarily be linked to the activation of cognitive effects at all (see Blakemore, 2002).

Future Directions

Procedural encoding has also played a role in the analysis of the role of intonation in interpretation (see Fretheim, 2002; House, 2004). If, as Gussenhoven (2002) has argued, aspects of intonation have become grammaticalized so that certain pitch contours encode arbitrary meanings, then it is plausible that these meanings should be analyzed in terms of instructions for interpretation. However, Wharton’s (2003a, 2003b) work on natural codes suggests it may also be possible to generalize the notion of procedural meaning to accommodate phenomena that are on the borderline of language (e.g., interjections) as well as natural or paralinguistic aspects of intonation (see Interjections). Research in this area has yet to be developed, but it seems clear that the scope of procedural encoding extends beyond the analysis of nontruth conditional discourse markers.

See also: Approaches to Translation: Relevance Theory; Cognitive Linguistics; Discourse Markers; Grice, Herbert Paul (1913–1988); Implicature; Interjections; Neo-Gricean Pragmatics; Parentheticals; Pragmatics and Semantics; Relevance Theory; Speech Acts and Grammar; Speech Acts.

Bibliography

Blakemore D (1987). Semantic constraints on relevance. Oxford: Blackwell.
Blakemore D (1990). ‘Performatives and parentheticals.’ Proceedings of the Aristotelian Society 91.3, 197–213.
Blakemore D (1996). ‘Are apposition markers discourse markers?’ Journal of Linguistics 32, 325–347.
Blakemore D (2002). Relevance and linguistic meaning: the semantics and pragmatics of discourse markers. Cambridge, UK: Cambridge University Press.
Blass R (1990). Relevance relations in discourse: a study with special reference to Sissala. Cambridge: Cambridge University Press.
Carston R (2002). Thoughts and utterances: the pragmatics of explicit communication. Oxford: Blackwell.
Clark B (1993). ‘Relevance and pseudo-imperatives.’ Linguistics and Philosophy 16, 79–121.
Fraser B (1996). ‘Pragmatic markers.’ Pragmatics 6, 167–190.
Fraser B (1998). ‘Contrastive markers in English.’ In Jucker A & Ziv Y (eds.) Discourse markers: descriptions and theories. Amsterdam: John Benjamins. 301–326.
Fretheim T (2002). ‘Intonation as a constraint on inferential processing.’ In Proceedings of Speech Prosody 2002, University of Aix-en-Provence.
Grice H P (1989). Studies in the way of words. Cambridge, MA: Harvard University Press.
Gussenhoven C (2002). ‘Intonation and interpretation: phonetics and phonology.’ In Proceedings of Speech Prosody 2002, University of Aix-en-Provence, 47–57.
House J (2004). ‘Constructing a context with intonation.’ Paper read at the 6th NWCL International Conference, Prosody and Pragmatics, Preston, UK.
Ifantidou-Trouki E (1993). ‘Sentential adverbs and relevance.’ Lingua 90, 69–90.
Itani R (1993). ‘The Japanese particle ka: a relevance theoretic approach.’ Lingua 90, 129–147.
Iten C (2000). ‘Non-truth conditional’ meaning, relevance and concessives. Ph.D. thesis, University of London.
Matsui T (2002). ‘Semantics and pragmatics of a Japanese discourse marker: dakara.’ Journal of Pragmatics 34.7, 867–889.
Moeschler J (1989). Argumentation, relevance and discourse. Argumentation (3.3). Paris: Hermès.
Nicolle S (1997). ‘A relevance theoretic account of be going to.’ Journal of Linguistics 33, 355–377.
Nicolle S (1998). ‘A relevance theoretic perspective on grammaticalization.’ Cognitive Linguistics 9, 1–35.
Rouchota V (1998). ‘Procedural meaning and parenthetical discourse markers.’ In Jucker A & Ziv Y (eds.) Discourse markers: descriptions and theories. Amsterdam: John Benjamins. 301–326.
Sperber D & Wilson D (1995). Relevance: communication and cognition. Oxford: Blackwell.
Traugott E & Dasher R (2002). Regularity in semantic change. Cambridge: Cambridge University Press.
Unger C (1996). ‘The scope of discourse connectives: implications for discourse organization.’ Journal of Linguistics 32, 403–438.
Wharton T (2003a). ‘Interjections, language and the ‘showing-saying’ continuum.’ Pragmatics and Cognition 11.1, 39–91.
Wharton T (2003b). ‘Natural pragmatics and natural codes.’ Mind and Language 18.4, 447–477.
Wilson D (2004). ‘Relevance and argumentation theory.’ Paper delivered at Pragmatic Interfaces, Geneva.
Wilson D & Sperber D (1988). ‘Mood and the analysis of non-declarative sentences.’ In Dancy J, Moravcsik J & Taylor C (eds.) Human agency: language, duty and value. Stanford, CA: Stanford University Press.
Wilson D & Sperber D (1993). ‘Linguistic form and relevance.’ Lingua 90, 1–25.

Meaning, Sense, and Reference
D J Cunningham, Indiana University, Bloomington, IN, USA
© 2006 Elsevier Ltd. All rights reserved.

Attempts to define and/or construct a theoretical account of meaning can be found throughout the arts, sciences, and humanities. Dewey and Bentley (1949: 297) contended that the word ‘meaning’ is “so confused that it is best never used at all.” As our task is to characterize meaning from a semiotic point of view, perhaps we can limit rather than contribute to this confusion. Consider the following scenarios:

• A college student hears her professor use an unfamiliar word and looks it up in her dictionary
• A pedestrian carrying a white cane approaches a busy intersection, pauses, and then walks into the street, whereupon the traffic yields
• A cat races to the kitchen at the sound of an electric can opener
• A doctor sees a dark area on an X-ray photograph of the lung of a patient who has been coughing up blood
• An anthropologist observes that among Nestlik Native Americans, sons have a more familiar relationship with their maternal uncle than with their father
• A bicyclist rides past a dog that suddenly raises its head, arches its neck, and bares its teeth
• A husband notices that his wife has suddenly stopped showing any physical affection toward him
• A high school history teacher asks her students to analyze the St. Crispin’s day speech from Shakespeare’s Henry V as part of their unit on war in England
• A three-year-old child points to the picture of her absent father and says “Daddy!”
• A concert-goer unexpectedly begins to weep while listening to Barber’s ‘Adagio for strings’.

Within semiotics, these scenarios are often treated as having something very important in common despite their many obvious differences (context, organism, medium, etc.). In one sense, these scenarios all describe an effort after meaning, of the word, cane, sound, shadow, etc. More broadly, they are all examples of semiosis or sign action. In studying these scenarios, in ascertaining the meaning of the word, cane, sound, and so forth, we must ask a series of questions: What is a sign? How are these signs organized? How are these signs related to other signs and sign systems?

Since the topic of meaning is so deeply embedded in language and linguistics in general and semiotics in particular, other entries in this encyclopedia have already implicitly or explicitly dealt with the topic. Here, we will try to particularize the meaning of meaning for semiotics by examining some of the distinctions that have been drawn in past attempts to characterize it.

Meanings of Meaning

In their classic book The meaning of meaning, Ogden and Richards (1956: 185–208) compiled a ‘representative list’ of 16 definitions of the word ‘meaning,’ and several more can be discerned in their full discussion. Included in their list are the familiar meanings, such as a dictionary entry, the feelings or images aroused, an intention (as in ‘I meant no harm.’), something of significance (as in ‘Religion adds meaning to my life.’), and so forth. To sort through these and many other definitions, it will be helpful to review some of the distinctions raised in previous philosophical analyses. Many of these accounts focus on the meaning of words and statements, that is, semantics, and one of our tasks here will be to demonstrate how semiotics enables us to look beyond language signs.

Frege (1892) is usually credited with distinguishing between meaning as sense and meaning as reference. Even in the formal language of mathematics, such a distinction seems fruitful. For example, in the equations 2 + 2 = 4 and 3 + 1 = 4, both 2 + 2 and 3 + 1 refer to the same object, 4, but in different ways. We can say that 2 + 2 and 3 + 1 are the same in that they refer to the object 4, but, of course, they are also different from 4 (as sums) and from each other (as two different sums). They are two expressions or senses of the same thing – there is more information contained in them than in the tautology 4 = 4. Frege’s most famous example is that the expressions ‘morning star’ and ‘evening star’ both refer to the same celestial object (the planet Venus), but in a different sense (as seen in the morning before sunrise or in the evening after sunset). Meaning as reference tells us that a thing is (That object in the sky there is the planet Venus) while meaning as sense tells us what a thing is (If there is a bright star in the sky right after sunset, it is the evening star). Meaning as reference proposes an identity, while meaning as sense proposes interpretation.
According to Frege, ‘‘By means of a sign, we express [the object’s] sense and designate its reference’’ (1892). Other related accounts of meaning make a distinction similar to Frege’s including Carnap on intension (sense) vs. extension (reference) and J. S. Mill on

Meaning, Sense, and Reference 567

connotation (sense) vs. denotation (reference). While there are subtle and sometimes not so subtle differences between these accounts and Frege’s and among themselves, it is beyond our purposes here to compare them. We will use the sense/reference distinction to examine semiotic models of meaning.
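Frege's arithmetic example can be made concrete with a minimal sketch. It assumes, purely for illustration, that the form of an expression (its two addends) stands in for its sense and that its computed value stands in for its reference; the class name `Sum` and the helper functions are hypothetical and not part of Frege's account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sum:
    """A sum expression; its form (the two addends) stands in for its sense."""
    left: int
    right: int

    def reference(self) -> int:
        # The object the expression designates.
        return self.left + self.right


def same_reference(a: Sum, b: Sum) -> bool:
    # Two expressions co-refer when they designate the same object.
    return a.reference() == b.reference()


def same_sense(a: Sum, b: Sum) -> bool:
    # Dataclass equality compares the addends, i.e., the way the object is presented.
    return a == b


two_plus_two = Sum(2, 2)
three_plus_one = Sum(3, 1)

print(same_reference(two_plus_two, three_plus_one))  # True
print(same_sense(two_plus_two, three_plus_one))      # False
```

The two sums agree in reference but not in sense, which is why an identity between them carries more information than 4 = 4.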

Meaning and Semiotics

As stated above, the topic of meaning is embedded deeply within semiotics in general and the concept of sign in particular. Here we will briefly describe how the two major founders of semiotics, Saussure and Peirce, have dealt with meaning.

Ferdinand de Saussure and Structuralism

Saussure, it will be remembered, regarded the sign as constituted by a signifier (or sound image in the case of speech) and a signified (a concept or idea). Thus the spoken word tree is a language sign made up of a signifier (the ‘psychological imprint of the sound’) for the mental concept Tree, the signified. Saussure sometimes spoke of the relationship between signifier and signified as signification. Since the signified and signifier are both mental concepts, the sign they constitute has no necessary relationship to a referent. In fact, one of the hallmarks of Saussure’s semiology is that the relationship between a sign and its object is arbitrary, established by human social convention. Thus the sign tree is a language sign in English solely because it has been adopted by use as such, just as the sign arbre has been adopted in use by French-speaking communities. Meaning is, therefore, a mentalistic concept for Saussure, arising not from the links between signs and the real objects to which they refer, but from the interplay between and relationships that arise from the action of signs. In short, meaning arises from the structure of signs. As a linguist, Saussure naturally emphasized language signs and structures. He defined language as a ‘‘system of interdependent terms in which the value of each term results solely from the simultaneous presence of others . . .’’ (Saussure, 1959: 116). Language signs get their meaning within the structure as a whole, in the relationships between one sign and the others. These relationships are particularly determined by differences. Just as at the phonemic level of a language where some sounds are recognized as different and others are not (e.g., the initial sounds of tin and kin are regarded as different in English whereas those of coal and call are not), language signs also get their meaning in terms of recognized differences with other language signs.
It is therefore the structure of the whole that determines meaning, not reference to independently existing objects.

It is more than just the structures of a language that determine meaning, however. In addition, a language also includes rules for distinguishing one sign from another and for relating signs to one another. These rules, called syntax, allow us to combine and recombine signs in a potentially indefinite number of ways. Without syntax, a structure of signs is static. With syntax, it is possible to manipulate signs and their structure so that new meanings are brought to light or new possibilities for a structure can be constructed. We can therefore build possible structures, meanings that may have no embodiment outside of our semiosis. For example, without a system of signs and syntax to manipulate them, we could not build the possible worlds of the past or the future. The past and the future do not exist in our immediate experience, but we can construct the past and future and act upon their meanings in language signs using syntax. Following Saussure, a number of scholars expanded this notion beyond language signs per se to all manner of cultural phenomena. Work in this vein is often called structuralism (see Hawkes, 1977 for a useful summary). Many of these analyses explicitly talk of signs in terms of signifiers and signifieds organized into a language-like structure, as if they embody a grammar. So we have semiotic analyses of the ‘language’ of gesture, dance, clothing, cars, advertisements, professional wrestling, circus, nearly anything you can imagine. Claude Lévi-Strauss (1966) dedicated his career to demonstrating that there are underlying systems of structure and meaning behind such dazzlingly varied and complex social systems as kinship patterns (as in scenario 5 above) and myth. Roland Barthes (1964) took all of culture as his domain as he mapped out systems of connotation and denotation for such objects as cars, food, and clothing. There are many other examples, of course.
One important outcome of these analyses is that once you have uncovered the underlying structure for a particular phenomenon, manipulation of the syntax can create new forms of meaning that may not have emerged already; for example, structuralist analyses of advertisements can lead to a reversal or even a mocking of the dominant structures. Mystery stories usually keep the reader in the dark about the perpetrator of the crime until the last chapter, but the television detective show Columbo was very popular, in part, because the culprit was revealed at the beginning. Structuralist models tend to suggest there is a single meaning to a text, a notion attacked forcefully by poststructuralists such as Derrida (1984). The problem arises from the signifier–signified relationship itself. The meaning of a text cannot reside in the

text but only in the signs that represent the text. As such, the signs are only a part of the potential meanings of any text, and alternate relations lead to alternative meanings. To speak of the meaning of a text is nonsense; there will be as many meanings as there are contexts, potential structures of signs to interpret it. But these meanings are nonetheless related to the network of relationships mapped out by the sign system in question. The relation, as Eco (1984) showed, is one of inference, where:

To walk, to make love, to sleep, to refrain from doing something, to give food to someone else, to eat roast beef on Friday – each is either a physical event or the absence of a physical event, or a relation between two or more physical events. However, each becomes an example of good, bad, or neutral behavior within a given philosophical framework. (1984: 10, emphasis his)

C. S. Peirce

Charles S. Peirce was originally drawn to the study of signs by his search for a model of meaning valid for scientific inquiry. For Peirce, hypotheses are signs, inferred from the world of experience to give meaning to aspects of that world. Imagine, for example, stumbling across some bit of experience that, in itself, is quite puzzling and surprising (e.g., scenario 7 above). Yet, if that bit of experience is treated not for its unique properties, but as an example or case or sign of some rule of experience in action (e.g., perhaps the medication she is taking for high blood pressure is reducing her libido), then it is transformed into an ordinary affair. Couple this notion with Peirce’s assumption that the things of the world of experience that we claim to be true are at best only more or less plausible and therefore only meaningful hypotheses, and we can begin to understand why the concept of meaning plays such a central role in his theory, why he came to equate logic and semiotics. That signs can be both the product of and a further source of inference opens semiotic inquiry in quite a different direction from the structuralist approaches described earlier, to the processes of sign use within organisms: semiosis. A sign, according to Peirce, is ‘‘something which stands to somebody for something in some respect or capacity’’ (2.228 – As is common in Peircean scholarship, quotes and citations will be identified by volume and paragraph number from Peirce [1931–1958]). The sign stands for something, the object, by linking it to an interpretant, an additional sign that stands for some aspect of the object. All three elements, sign, object, and interpretant, are necessary for sign process to occur and are not decomposable into dyads. Thus, Peirce’s conception of the sign includes both its sense and its reference, but not as

separate components. A sign is only an incomplete representation of the object or referent. Only certain aspects of the object are represented, and it is these aspects that come to define the interpretant, the sense of the sign process. Different signs may represent different aspects of the object and thereby produce different senses. Additionally, signs have aspects that are not relevant to the object (that may be characteristics of the sign as something in the world of experience, but not of the object) and that can produce additional, different interpretants. In other words, our experience of the world is mediated through signs and can never, therefore, be isomorphic with the objects of the world. In essence, we create our world of experience by creating and using signs as we interact with objects in our environment. Meaning emerges in the action of signs, also called semiosis. The varieties of meaning that are possible are due, in large part, to the complex interplay of sign, object, and interpretant. This interplay is seen clearly in Peirce’s sophisticated analysis of signs and sign process. Perhaps his most famous derivation (2.254–2.264) was to propose three trichotomies of signs (or aspects of signs). In brief, these are (1) the nature of the sign itself, (2) its relation to the object it represents, and (3) its relation to the effect it produces (i.e., to its interpretant). In the first trichotomy, the sign itself can be a quality, a single individual thing, or a general type. Peirce labeled these three sign aspects as qualisign, sinsign and legisign, respectively. The second trichotomy is perhaps the best known. A sign can represent an object by resembling it (icon), by being existentially connected to it (index), or by a general rule (symbol). The third trichotomy considers the manner in which the sign relates to its interpretant. The sign can lead to the effect of possibility (rheme), of actual fact or existence (dicent), or of formal law (argument). 
These trichotomies can be combined to identify additional aspects of sign process. For reasons too complicated to go into detail here, when the three trichotomies outlined above are combined, 10 classes of signs can be identified: Rhematic Iconic Qualisign, Rhematic Iconic Sinsign, Rhematic Iconic Legisign, Rhematic Indexical Sinsign, Rhematic Indexical Legisign, Rhematic Symbolic Legisign, Dicent Indexical Sinsign, Dicent Indexical Legisign, Dicent Symbolic Legisign, and Argument Symbolic Legisign. But the potential varieties of signs do not end here. Peirce estimated that many thousands of sign types could be identified (8.343) but left their identification to ‘future explorers.’ What is important to note for our purposes of understanding Peirce’s concept of meaning is that the distinctions raised by sign types only identify aspects of signs, not isolated pure categories. These

aspects emerge in the action of signs or semiosis, that is, in the quest for meaning. In the perpetual action of semiosis, only certain emphases can be noted, only certain tendencies about the force of the sign action can be identified as the process of semiosis spreads from sign to sign. Sign action is in no way limited to any single triad but spreads throughout a network of interpretants, a process characterized by Eco (1984) as unlimited semiosis. A second point is a derivative of the first. Semiosis is the action of signs, not of a person. Certainly a person is an essential element in any human semiosis, but the action involves all three elements: the sign, object, and its effect. The interpretant could be an action, feeling, or thought of a person (Houser, 1987), but in its most general sense, the interpretant could be any effect, including additional sign action – a new triad of sign, object, and interpretant. Alternatively, a new aspect of sign process could emerge in a particular context, as when, for example, the iconic aspect of a symbol (e.g., the shape of a word heretofore treated as a symbol) comes to the foreground. For Peirce, meaning is the ‘‘proper significate effect of a sign’’ (5.475). Consider the following example. Suppose our student in scenario 1 above is looking up the word ocelot that she heard her professor use in her Biology class. In her trusty Merriam-Webster, she finds ‘one of a family (Felidae) of feral, carnivorous, usually solitary and nocturnal mammals.’ Not knowing the meaning of Felidae, she looks up the word to discover that the ocelot is a variety of cat. This discovery immediately links her to all of the prior knowledge, prior semiosic structures that she constructed from her experiences (feelings, thoughts, and actions) with cats. Returning to the original definition, she looks up other words she does not understand. 
Under carnivorous, she reads ‘subsisting or feeding on animal tissue; rapacious.’ Never having come across the word rapacious before, she looks it up to read ‘excessively grasping or covetous; ravenous or voracious.’ As she moves from word to word, from interpretant to interpretant, new meanings are attached to her original one. She is slowly developing her structure for ocelot. We can also move beyond words. Our dictionary could include a drawing of an ocelot. Her professor could mime the actions of an ocelot in stalking, crouching, and springing upon its prey. The growl could be recorded and played in the class. It is in this spread of signs, in the potential for unlimited semiosis, that meaning is to be found. According to Peirce, ‘‘a sign is not a sign unless it translates itself into another sign in which it is more fully developed’’ (5.594). Interpretants become signs for additional semiosis, generating additional interpretants that link to additional objects that in turn

become signs for additional semiosis. Of course, this spread does not continue indefinitely. Eventually, it must resolve itself into a set of structures with which the world as we experience it is meaningful to us. Peirce referred to this set of structures as ‘beliefs.’ The concept of belief is key in Peirce’s view, and he spoke of semiosis in general and meaning in particular as a movement toward ‘fixing’ a belief. The converse of belief is doubt, and Peirce was very explicit in drawing the distinction between them:

We generally know when we wish to ask a question and when we wish to pronounce a judgment, for there is a dissimilarity between the sensation of doubting and that of believing. But this is not all that distinguishes doubt from belief. There is a practical difference. Our beliefs guide our desires and shape our actions . . . Doubt is an uneasy and dissatisfied state from which we struggle to free ourselves and pass into the state of belief; while (belief) is a calm and satisfactory state which we do not wish to avoid, or to change to a belief in anything else. On the contrary, we cling tenaciously, not merely to believing, but to believing just what we do believe. (5.370–5.372)

Peirce called such doubt ‘genuine doubt.’ As such, it is situated in our experience, not a methodological strategy as in Descartes’ use of doubt. Doubt arises when the structures we have created, our current beliefs, do not account for some experience, when the character of signs does not fit our understanding. Peirce proposed four methods of resolving doubt and fixing beliefs: tenacity, authority, a priori, and experiment. Briefly, tenacity is invoked whenever one holds on to beliefs in the face of doubt and asserts that the beliefs will eventually accommodate the doubtful event. We use the method of authority to fix beliefs when we accept the beliefs of authority figures like teachers or scientists. Nowhere is the method of authority more widely used, and abused, than in the field of education. The a priori method is invoked when our beliefs change in the context of already existing structure of beliefs, a conceptual coherence to a worldview that has served us well so far. The three methods described so far all resolve doubt by opinion, stubbornly maintained, taken from others, or reasoned from premises. The fourth method, which Peirce preferred, is the method of experiment, where one seeks to remove doubt by collecting observations, generating potential hypotheses to account for the surprising experience, and reaching a conclusion based upon the interplay of inferential processes. Inference, in fact, is implicated in all the methods of resolving doubt. Elsewhere (e.g., 5.145) Peirce describes three modes of inference – abduction, induction, and deduction – through which observers can build and work with structures of signs:

Deduction is the only necessary reasoning. It is the reasoning of mathematics. It starts from a hypothesis, the truth or falsity of which has nothing to do with the reasoning; and of course its conclusions are equally ideal. The ordinary use of the doctrine of chances is necessary reasoning, although it is reasoning concerning probabilities. Induction is the experimental testing of a theory. The justification of it is that, although the conclusion at any stage of the investigation may be more or less erroneous, yet the further application of the same method must correct the error. The only thing that induction accomplishes is to determine the value of a quantity. It sets out with a theory and measures the degree of concordance of that theory with fact. It can never originate any idea whatsoever. No more can deduction. All the ideas of science come to it by way of Abduction. Abduction consists in studying facts and devising a theory to explain them. Its only justification is that if we are ever to understand things at all, it must be in that way. (5.145)

In the case of the method of experiment, a surprising experience might lead us to abduce a new hypothesis and examine it deductively to see if it squares with the available facts. Suppose, for example, I held the view that individual members of a species tended to be larger in colder climates. Recent data, however, have shown that this difference was not true of fish. If I engage in abduction, I might generate the hypothesis that the original relationship applies only to mammals. If observation showed this hypothesis to be tenable, then the surprising experience would be a matter of course. Deductively, I could link my hypothesis to other varieties of animals and inductively test the consequences. Similar inferential strategies can be observed in a priori, authority, and even tenacity. The validity of our beliefs is tested in accord with Peirce’s pragmatic maxim: Consider what effects, that might conceivably have practical bearings, we conceive the object of our conception to have. Then, our conception of these effects is the whole of our conception of the object. (5.402)

If our beliefs are adequate to account for the phenomena before us, then we are satisfied. It is doubt that drives semiosis. One important source of doubt comes from comparing our beliefs with those of others. Logic is grounded in the collective nature of semiosis itself and oriented toward future activity. Peirce’s understanding of community was in terms of both the present communities of practice (scientist, citizen, family member, etc.), and the ‘family’ of all participants, past and future as well as present, who have worked, are working, and will work on clarifying our ideas and understandings.

Finally, as what anything really is, is what it may finally come to be known to be in the ideal state of complete information, so that reality depends on the ultimate decision of the community; so thought is what it is, only by virtue of its addressing a future thought that is in its value as thought identical with it, though more developed. In this way, the existence of thought now depends on what is to be hereafter; so that it has only a potential existence, dependent on the future thought of the community. (5.316)

In summary, Peirce argued that the subject matter of semiotics is semiosis, the action of signs in all domains of life. Semiosis is an effort after meaning. Our understanding of the world is entirely mediated by signs, and therefore to understand meaning, we must understand the nature of our signs: What is a sign? How is one sign related to another sign? What do signs reveal about the real world? What do they obscure? How are signs formed? What are the ways in which signs can stand for something else? The identification, understanding, and use of signs are fundamental parts of inquiry. In fact, the process of semiotics within inquiry was seen by Peirce to be an emergent process, and one quite explicitly linked to our cognitive natures: Symbols grow. They come into being by development out of other signs . . . We think only in signs. These mental signs are of mixed nature; the symbol-parts of them are called concepts. If a man makes a new symbol, it is by thoughts involving concepts. So it is only out of symbols that new symbols can grow. Omne symbolum de symbolo. A symbol, once in being, spreads among the people. In use and in experience, its meaning grows. Such words as force, law, wealth, marriage, bear for us very different meanings from those they bore for our barbarous ancestors. (2.302)
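The reduction of the three trichotomies to the 10 classes of signs listed earlier can be sketched as a small enumeration. The text does not spell out the reasons for the reduction; the sketch assumes the ordering constraint commonly given in Peircean scholarship, namely that, ranking each trichotomy from first to third, the sign's relation to its object cannot rank higher than the sign itself, and its relation to its interpretant cannot rank higher than its relation to the object. The function name `sign_classes` is hypothetical.

```python
from itertools import product

# Each trichotomy listed in rank order (first, second, third).
SIGN_ITSELF = ["Qualisign", "Sinsign", "Legisign"]
OBJECT_RELATION = ["Iconic", "Indexical", "Symbolic"]
INTERPRETANT_RELATION = ["Rhematic", "Dicent", "Argument"]


def sign_classes():
    """Enumerate the 27 raw combinations and keep those satisfying the
    assumed constraint: sign rank >= object rank >= interpretant rank."""
    classes = []
    for s, o, i in product(range(3), repeat=3):
        if s >= o >= i:
            name = f"{INTERPRETANT_RELATION[i]} {OBJECT_RELATION[o]} {SIGN_ITSELF[s]}"
            classes.append(name)
    return classes


for name in sign_classes():
    print(name)
```

Under this assumption the enumeration yields the same 10 names given in the text and excludes combinations such as a hypothetical ‘Argument Iconic Qualisign.’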

Dictionary vs. Encyclopedia

In a seminal chapter in his Semiotics and the philosophy of language, Umberto Eco (1984: 46–86) further elaborated some of the issues raised above. In particular, Eco explored the question of whether the meaning of a linguistic expression could be captured in a synonymous expression or definition. As in scenario 1 above, the act of consulting a dictionary is often associated with seeking the meaning of a word or phrase. Even in the case of a nonlinguistic sign (as in the white cane of scenario 2), we often convert it to a linguistic sign (e.g., a blind man) and consult a dictionary. We expect a dictionary to help us pinpoint meaning and disambiguate the meaning of one word from another. For example, the dictionary entry for ram should provide a synonymous expression or paraphrase (e.g., adult male sheep) as well as

similarities/differences from other words (e.g., buck, stag, ewe, lamb, wether). This effect is accomplished by implicit (or sometimes explicit) reference to a hierarchical structure or Porphyrian tree. The synonymous terms or definitions are usually more general or abstract entries arrayed in a tree-like structure that depicts the relationships between the word whose meaning is sought and a network of related ones. For example, to define a ram as an adult male sheep specifies a relationship between ram and sheep but also raises the issues of the definition of sheep, differences between sheep and goats, cows, horses, and so on. Such taxonomies and classifications as this one are common within scientific domains, but the question raised by Eco (1984) is whether these structures actually analyze meaning. Structures achieve their classificatory power by proposing a finite number of hierarchically ordered categories by means of which terms are located. Ram has its place in this structure, and its place is linked to higher order terms (sheep, mammal, animal, living thing) and differentiated from other terms (goats, fish, plants, nonliving) by its position in the hierarchy. In Frege’s terms, such classifications seek to pinpoint the reference where the meaning of a term consists of an analysis that identifies the more abstract referents or properties to which the term in question is related. Ideally, this process should be an automatic one in which there is no ambiguity about whether the classification is correct. Indeed, a computer programmed with the requisite structures could perform our meaning analysis. In such an analysis, there is no room for interpretation! In a brilliant series of critiques, Eco clearly demonstrated the inadequacy of the ‘dictionary’ approach to meaning. There is something very artificial about the terms in these hierarchical tree structures – in the languages they use and the worlds they propose. 
Even at a formal level, logical inconsistencies crop up in even the most ‘natural’ classification systems. Thus, while a ram is an adult male sheep, it is difficult to construct a classification that appropriately differentiates on all three criteria at once (e.g., is gender superordinate or subordinate to sheep?). The very notion of hierarchy seems very slippery when probed – wool can be a property of and therefore subordinate to sheep in one analysis, whereas sheep can be an instance of and hence subordinate to the category of ‘wool-producing animals’ in another. Most serious, however, is that the attempt to minimize interpretation ultimately fails. According to Eco (1984: 57), ‘‘either the primitives (more abstract referents) cannot be interpreted, and one cannot explain the meaning of a term, or they can and must be interpreted and one cannot limit their number.’’
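The ‘dictionary’ model discussed above, in which a term is located by its position in a hierarchy, can be illustrated with a minimal sketch. The tree below follows the ram example in the text, but its exact shape and the helper names `superordinates` and `differentiated_from` are illustrative assumptions, not Eco's own formalization.

```python
# A toy Porphyrian tree: each term points to its single superordinate term.
PARENT = {
    "ram": "sheep",
    "ewe": "sheep",
    "sheep": "mammal",
    "goat": "mammal",
    "mammal": "animal",
    "fish": "animal",
    "animal": "living thing",
    "plant": "living thing",
}


def superordinates(term):
    """Return the chain of higher-order terms above `term`, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def differentiated_from(term):
    """Return the terms that share `term`'s immediate superordinate."""
    parent = PARENT.get(term)
    return sorted(t for t, p in PARENT.items() if p == parent and t != term)


print(superordinates("ram"))        # ['sheep', 'mammal', 'animal', 'living thing']
print(differentiated_from("ram"))   # ['ewe']
```

The lookup is mechanical, as the dictionary model requires, but the single-parent tree has no obvious place for ‘adult’ or ‘male’ alongside ‘sheep,’ which is the kind of difficulty the critique above points to.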

Any act of meaning, therefore, must involve both reference and sense. To understand the meaning of ram, we must inevitably bring to bear some portion of an immense body of potentially relevant world knowledge. In most everyday uses of the word ram, for example, connections to generic spheres of knowledge such as animal husbandry, Greek culture, hunting, clothing manufacture, astronomy, climate, etc. are possible as are more personal connections such as living on a sheep ranch, hearing nursery stories about sheep, etc. Most actual dictionaries make explicit links to portions of this world knowledge and for this reason, Eco regards them as ‘disguised’ encyclopedias. In some very specific context, the meaning of the word water might be limited to the expression H2O, but in most contexts, it would be immediately connected to possible spheres of world knowledge related to drinking, washing, sailing, extinguishing fires, and so forth. The task of a person trying to figure out the meaning of a term is, therefore, not one of establishing the referent, but contextualizing the sense in which the referent is being interpreted – in Peirce’s terms by means of interpretants in a (potentially unlimited) process of semiosis. Eco (1984) explored several metaphors to describe encyclopedic competence and settled upon the rhizome. A rhizome is a root crop, a prostrate or underground system of stems, roots, and fibers whose fruits are tubers, bulbs, and leaves. A tulip is a rhizome as is rice grass, even the familiar crab grass. The metaphor of rhizome specifically rejects the inevitability of such notions as hierarchy, order, node, kernel, or structure. The tangle of roots and tubers characteristic of rhizomes is meant to suggest a form of mind where:

• Every point can and must be connected with every other point, raising the possibility of an infinite juxtaposition
• There are no fixed points or positions, only connections (relationships)
• The structure is dynamic, constantly changing, such that if a portion of the rhizome is broken off at any point it could be reconnected at another point, leaving the original potential for juxtaposition in place
• There is no hierarchy or genealogy contained as where some points are inevitably superordinate or prior to others
• The rhizome whole has no outside or inside, but is rather an open network that can be connected with something else in all of its dimensions.

The notion of a rhizome is a difficult one to imagine, and any attempt to view it as a static picture risks minimizing its dynamic, temporal, and even

self-contradictory character. Eco (1984) labeled the rhizome as ‘an inconceivable globality’ to highlight the impossibility of any global, overall description of the network. Since no one (user, scientist, or philosopher) can describe the whole, we are left with ‘local’ descriptions, a vision of one or a few of the many potential structures derivable from the rhizome. Every local description of the network is an hypothesis, an abduction, constantly subject to falsification. To quote Eco: Such a notion . . . does not deny the existence of structured knowledge; it only suggests that such a knowledge cannot be recognized and organized as a global system; it provides only ‘local’ and transitory systems of knowledge which can be contradicted by alternative and equally ‘local’ cultural organizations; every attempt to recognize these local organizations as unique and ‘global’ – ignoring their partiality – produces an ideological bias. (1984: 84, emphasis his)

This last statement emphasizes the point that we are not proposing the metaphor of rhizome for an individual mind making meaning, but for minds as distributed in social, cultural, historical, and institutional contexts. Except as a degenerate case, there is no such thing as a single mind, unconnected to other minds or to their (collective) social cultural constructions. Thinking, or whatever we choose to call the activity of mind as it makes meaning, is always dialogic, connected to another; either directly as in some communicative action or indirectly via some form of semiotic mediation – signs and/or tools appropriated from the sociocultural context. We are connected to other people individually but also collectively as in the speech communities or social languages in which we are all embedded. We are connected to the sociocultural milieu in which we operate, a milieu characterized by the tools (computers, cars, television, and so forth) and signs (language, mathematics, drawing, etc.) that we may appropriate for our thinking. Thus meaning is not an action that takes place within a mind within a body, but rather at the connections, in the interactions. But, it is worth saying again that this thinking is always ‘local,’ always a limited subset of the potential (unlimited) rhizomous connections that embody our meanings. Thinking is, rather, a matter of constructing and navigating a local, situated path through a rhizomous labyrinth, a process of dialogue and negotiation with and within a local sociocultural context. Although this analogy fails if pushed too far, the connectivity we have in mind is a bit like the World Wide Web. While the ‘results’ of a connection to WWW are experienced via an interface with one’s local workstation, that experience is possible only as a result of connections with many (potentially an

infinite) number of servers all over the world. The local workstation both contributes to (constructs) and is constructed by its connections. Meaning is in the connections.

Umwelt – Meaning beyond Words

Jacob von Uexküll’s (1957, 1982) notion of ‘Umwelt’ reinforces this conception of meaning as arising out of ‘local’ connections, but also extends the analysis beyond language signs. In his paper titled ‘The theory of meaning’ (1982; originally published in 1940) von Uexküll elaborated Umwelt as the phenomenal or personal world of an organism; that is, the set of connections constructed by the organism via species-specific sensory characteristics and particular experiences in the physical world. Ticks, for example, individually and collectively, construct their Umwelt in part from their biologically determined sensitivity to butyric acid and the particular physical surroundings in which they find themselves. The tick will climb up a tree or tall grass and launch itself toward a source of butyric acid, usually a passing mammal, burrow into its skin, gorge itself with blood, drop off, lay eggs, and die. The Umwelt of the tick is not its environment, that which an observer of an organism might describe as pre-adjacent to and independent of the organism, but is, rather, that environment as selectively reconstituted and structured according to the particular species’ characteristics and according to the specific needs and experiences of the individual organism. The organism and its Umwelt are indivisible – it makes no sense to speak of an organism apart from its personal world. So in this sense, all semiosis is simultaneously distributed and situated in the personal world of an individual organism. From the point of view of its inhabitants, an Umwelt is the actual world of experience; that is, an organism’s particular slice of the rhizome is the ‘real’ world for that organism. As external observers of species and individual organisms, we can attempt to gain some understanding of these worlds, and how they influence and are influenced by the multitude of other worlds to which they are coupled.
In a delightful paper titled ‘A stroll through the worlds of animals and men’ (1957; originally published in 1934), von Uexküll described numerous examples of the Umwelten that organisms individually and collectively create that then serve to mediate their experience in the world. Bees, dogs, snails, flies, earthworms, and even human children and adults are among the subjects whose Umwelt he attempts to map. Of course, as von Uexküll acknowledged, analyses of an Umwelt will always be partial and incomplete in that we cannot fully know the Umwelt, the
systems of meaning, of another. Indeed the same physical object often provides a quite different experience both across and within species. In a famous example, von Uexküll (1957) described the various Umwelten created by a large oak tree: a rough textured and convoluted terrain for a bug, a menacing form for a young child, a set of limbs for a nesting bird, a crop to be harvested by a woodsman, and so on. In all these cases, the environment of the tree was the same; that is, the bark, the height, the limbs were ‘available’ to each of the organisms, yet their experience of them, their meaning, was quite different. Objects of the world are ‘meaning carriers’ and our sensory-motor mechanisms are ‘meaning receivers.’ The coordination of meaning carriers and meaning receivers constitutes a ‘functional circle’ out of which our Umwelt arises. Thus the fundamentals of the meaning making process are shared by all organisms. However, the human Umwelt, dubbed the ‘Lebenswelt’ by von Uexküll (1957), includes not only biological and physical factors but cultural ones as well. Although both humans and animals engage in sign action, humans are unique in their ability to consciously manipulate signs, thus enabling them to construct worlds apart from nature and direct experience. The possibility for signs that are arbitrary (e.g., language signs) allows humans individually and collectively to create an infinite array of meanings and possibilities for reality through the manipulation of signs. The importance of signs in creating the Lebenswelt lies in their creative power for infinite representation and meaning-making, or unlimited semiosis as described above. Indeed, culture itself can be described as a web of signs. Culture, in this view, is not some mind-independent, pre-adjacent social reality, but rather a collective construction that is reconstructed (or more accurately co-constructed in context) by each new participant.
And via these structures, we literally construct our knowledge dynamically as we interact in the world. As an example, consider the following. In their provocative book Metaphors we live by, Lakoff and Johnson (1980) built a solid case that the way in which we perceive and think about a situation is a function of the metaphors (they could have as easily said sign structures) we have adopted for and use in that situation. If we take a cultural phenomenon like schooling, we can examine the dominant metaphors that define its meaning. Marshall (1988) has argued convincingly that the fundamental metaphor in many schools is ‘School Is Work.’ We speak of students needing to work harder on their studies, to complete their homework, to earn a grade, and so forth. Teachers are trained to manage their classes and are often held accountable in terms of their productivity.

Teachers to whom this metaphor is pointed out often deny it and yet are surprised at how often they catch themselves and other educators using language consistent with the workplace metaphor. Other metaphors are equally powerful. The notion that the ‘Mind Is A Container’ is very pervasive and in fact underlies much of our teacher training and instructional design and development. If the mind is a container and knowledge is an entity that can be transferred to the mind (a Conduit metaphor) to fill the void, then naturally the task of the educational system is to facilitate this process of knowledge communication. If, however, we understand the metaphorical basis of our thinking, we can raise the possibility of choosing different metaphors and thereby different meanings, and consequently different roles for educators. If we adopt the metaphor of ‘School As A Community Resource,’ for example, perhaps we will have our students working on authentic tasks that meet community needs (designing bicycle routes through the city; looking into water quality of local streams, etc.) through which they learn the academic content and skills deemed worthwhile. The focus now becomes not on what the students know, but the processes whereby something can become known, how we know it, and for what purpose. So all cognition, all human process, is situated in the sense that we as meaning makers are inseparable from our Lebenswelt. Yet as human observers using signs in ways unique to our species, we can gain some awareness of the sign process itself in ourselves and in others and some limited ability to manipulate it. With the recognition that our personal world is some portion of, some selection from, infinite possibilities comes the realization that any attempt to assert our view as global inevitably introduces an ideological bias that must be justified. 
As we build and rebuild our Lebenswelt, we can gain some reflexive self-identity, some autonomy in determining through deliberate choice what ideological bias we will adopt in our actions and our words. In the felicitous phrase of Peirce, we become ‘masters of our own meaning’ (5.393). To do so, we must solve the problem of ‘other minds.’ We structure our interactions in the world in terms of our ideas of how our mind works and how other minds work. In other words, we are continually engaged in Umwelt research. To understand the meanings of others, we must endeavor to understand our own meanings and how they are connected. The character of any human interaction – but communication in particular – is guided by how the participants establish intersubjectivity based upon their respective conceptions of Umwelt. At its most fundamental level, communication depends upon this intersubjectivity, the raising of a distinction between one’s mind and that of others, all the while attending to their connection.

Summary

We began with the assertion that the topic of meaning is embedded deeply within semiotics in general and the concept of sign in particular. Using Frege’s distinction between sense and reference as an analytical tool, we have seen how various models of semiotics have dealt with meaning. From the dazzling array of structuralist accounts of the interplay of meanings in cultural contexts by Saussure and his followers, through Peirce’s brilliant integration of both sense and reference in his triadic model of sign and sign process, to the elaborations of Eco and von Uexküll of unlimited semiosis and Umwelt, we can confirm that the whole of semiosis can be seen in large part as an effort after meaning. Meaning is not a possession of an individual or a property inherent in an object, but the inevitable outcome of semiosis, the action of signs. Returning briefly to the scenarios with which we began this entry, we have seen that even in the prototypical case of looking up the meaning of a word in a dictionary (scenario 1), one is quickly immersed in a rhizomous network of related meanings and experiences. A white cane in the hands of a pedestrian (scenario 2) can have a powerful influence on the behavior of others and in a very real sense comes to define that person in that context. Yet a whole other set of connections, different senses of the person, emerge when our pedestrian sets her cane aside and now uses her hands to drink coffee at a sidewalk cafe. These various senses must be shared by the community if they are to be acted upon. Meaning therefore is always an interactive outcome, not an inherent characteristic of either individuals or objects. Learning certainly plays a major role in the development of meaning. The cat in scenario 3 learned over time that the sound of the can opener heralded a possible meal. Our doctor in scenario 4 must likewise learn to distinguish between suspicious and ordinary features of an X-ray in order to make a correct diagnosis.
Signs do not speak for themselves, they must be interpreted! And in that interpretation, particular conceptual contexts will be brought to bear that highlight some aspects of the complex situation and obscure others. Our anthropologist might be using a structuralist model of culture (e.g., Levi-Strauss) in interpreting scenario 5, a model that stresses certain interpersonal characteristics of this culture while ignoring others. We must always be alert to the fact that meanings can change as interpretive context changes.

Meanings must be communicated if they are to make a difference. The bicyclist and the dog in scenario 6 are, in a sense, searching for a means to coordinate their behaviors, as are the husband and wife in scenario 7. This process of interpretation and coordination draws heavily upon inference, particularly abductive inference. What meaning could account for these unexpected signs that would make them sensible? Generating and testing potential meanings lies at the heart of human semiosis. Finally, to become ‘masters of our own meaning,’ we must become more reflexive about the centrality of meaning in our life and how we can play a role in enriching that process. For a child to make the link between a photograph and her absent father (scenario 9) is an achievement that every parent should celebrate, because this act signifies the beginning of an awareness of the difference between a sign and its sense (interpretant) and reference (object). These humble beginnings can grow into the wonders of the hermeneutic process whereby we can come to know the worlds we have created and our role within them. In such worlds, Shakespeare and ‘Adagio for strings’ become possible. What other wonders await us?

See also: Denotation versus Connotation; Eco, Umberto: Theory of the Sign; Reference: Semiotic Theory; Saussure: Theory of the Sign; Semiosis.

Bibliography

Barthes R (1964). Elements of semiology. Lavers A & Smith C (trans.). New York: Hill and Wang.
Derrida J (1984). ‘Languages and the institutions of philosophy.’ Recherches Sémiotiques/Semiotic Inquiry 4, 91–154.
Dewey J & Bentley A (1949). Knowing and the known. Boston: Beacon.
Eco U (1984). Semiotics and the philosophy of language. Bloomington: Indiana University Press.
Frege G (1892). ‘Über Sinn und Bedeutung.’ In Zeitschrift für Philosophie und philosophische Kritik. 100. Reprinted as ‘On Sense and Reference.’ In Geach P & Black M (eds.) Translations from the philosophical writings of Gottlob Frege. Oxford: Basil Blackwell. 25–50.
Hawkes T (1977). Structuralism and semiotics. Berkeley, CA: University of California Press.
Houser N (1987). ‘Toward a Peircean semiotic theory of learning.’ The American Journal of Semiotics 5, 251–274.
Lakoff G & Johnson M (1980). Metaphors we live by. Chicago: University of Chicago Press.
Levi-Strauss C (1966). The savage mind. Chicago: University of Chicago Press.
Marshall H (1988). ‘Work or learning: implications of classroom metaphors.’ Educational Researcher 17, 9–16.
Ogden C & Richards I (1956). The meaning of meaning (8th edn.). New York: Harcourt, Brace & Co.
Peirce C S (1931–1958). Collected papers of Charles Sanders Peirce. Hartshorne C & Weiss P (eds.). Cambridge, MA: Harvard University Press.
Saussure F de (1959). Course in general linguistics. New York: Philosophical Library.
von Uexküll J (1957). ‘A stroll through the worlds of animals and men: a picture book of invisible worlds.’ In Schuller C (ed.) Instinctive behavior: redevelopment of a modern concept. New York: International University Press, Inc.
von Uexküll J (1982). ‘The theory of meaning.’ Semiotica 42, 25–82.

Meaning: Cognitive Dependency of Lexical Meaning

P A M Seuren, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

It is often thought or implicitly assumed, even in circles of professional semanticists, that predicate meanings, as codified in their satisfaction conditions (see Lexical Conditions), are lexically fixed in such a way that they automatically produce truth or falsity when applied to appropriate reference objects. This assumption is unfounded. In many, perhaps most, cases, the satisfaction conditions imply an appeal to nonlinguistic knowledge, so that the truth and falsity of assertive utterances are not the product of mere linguistic compositional computation, but are codetermined by nonlinguistic knowledge, either of a general encyclopedic or of a context-bound, situational nature. An obvious case is provided by a large class of gradable adjectival predicates, such as expensive, old, and large, whose applicability depends on (preferably socially recognized) standards of cost, age, and size, respectively, for the objects denoted by their subject terms. The description of such standards is not part of the description of the language concerned, but of (socially shared) knowledge. Further obvious examples are ‘possession’ predicates, such as English have, lack, and with(out), and whatever lexical specification is needed for genitives, datives, and possessive pronouns. These clearly require general encyclopedic knowledge for their proper interpretation. Consider the following examples:

(1a) This hotel room has a bathroom.
(1b) This student has a supervisor.

For (1a) to be true, it is necessary that there be one unique bathroom directly connected with the room in question, whose use is reserved for the occupants of that room. When the room carries a notice that its bathroom is at the end of the corridor to the right, while the same bathroom serves all the other rooms in the corridor, (1a) is false – not just misleading but false, as any judge presiding over a court case brought by a dissatisfied customer will agree. But for (1b) to be true, no such uniqueness relation is required, as one supervisor may have many students to look after. This is not a question of knowing English, but of knowing about the world as it happens to be. The same goes for the parallel sentences:

(2a) This is a hotel room with a bathroom.
(2b) This is a student with a supervisor.
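
The contrast between (1a) and (1b) can be made concrete with a small illustration. The following Python sketch is not part of the original entry: the table WORLD_KNOWLEDGE, the function has, and the parameter shared_by are hypothetical names chosen for this example, and the single yes/no "exclusivity" flag is a deliberately crude stand-in for the socially shared knowledge that the text appeals to. It is meant only to show how a satisfaction condition might consult a knowledge base rather than being fixed by the lexicon alone.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for socially shared world knowledge (an assumption of
# this sketch): for each (possessor kind, possessed kind) pair, must the
# possessed object be reserved for a single possessor for "have" to hold?
WORLD_KNOWLEDGE: Dict[Tuple[str, str], bool] = {
    ("hotel room", "bathroom"): True,   # a room "has" a bathroom only if it is its own
    ("student", "supervisor"): False,   # supervisors are normally shared
}


def has(possessor_kind: str, possessed_kind: str, shared_by: int,
        knowledge: Dict[Tuple[str, str], bool] = WORLD_KNOWLEDGE) -> bool:
    """Toy satisfaction condition for 'have'.

    shared_by counts how many possessors of the given kind the object serves,
    including the one under discussion (0 means there is no such object).
    """
    if shared_by < 1:
        return False
    must_be_exclusive = knowledge[(possessor_kind, possessed_kind)]
    # The same predicate yields different truth conditions depending on
    # what the knowledge base says is normal for this pair of kinds.
    return shared_by == 1 if must_be_exclusive else True


print(has("hotel room", "bathroom", shared_by=1))   # private bathroom
print(has("hotel room", "bathroom", shared_by=8))   # bathroom down the corridor
print(has("student", "supervisor", shared_by=8))    # supervisor with eight students
```

Under these assumptions the second call comes out false and the third true, even though the two situations have the same shape (one object serving eight possessors); the difference lies entirely in the knowledge table, not in the definition of has.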

Possession predicates, therefore, must be specified in the lexicon as involving an appeal to what is normally the case regarding their term referents. They express a well-known relation of appurtenance between the kind of object referred to in subject position and the kind of object referred to in object position. The semantic description (satisfaction conditions) of have and other possessive predicates is thus taken to contain a parameter for ‘what is well-known,’ making the interpretation of this predicate in each token occurrence truth-conditionally dependent on world knowledge. Not all possession predicates are subject to the same conditions. Possessive pronouns, for example, may express a relation of ‘being responsible for’ or ‘taking care of,’ which other possession predicates cannot express. An example is sentence (3) uttered by a gardener with regard to the flower beds he is tending:

(3) Please don’t mess up my flower beds.

This sentence can be uttered appropriately without the speaker implying that the flower beds are owned by him. Many such examples can be given. Consider the predicate flat said of a road, a tire, a mountain, a face, or the world. There is an overall element ‘spread out, preferably horizontally, without too much in the way of protrusions or elevations,’ but that in itself is insufficient to determine what ‘being flat’ amounts to in these cases. The full meaning comes across only if it is known what roads, tires, mountains, faces, and the world are normally thought to be like. Dictionaries, even the best ones, limit themselves to giving examples, hoping that the user will get the hint. Another example is the predicate fond of, as in:

(4a) John is fond of his dog.
(4b) John is fond of cherries.
(4c) John is fond of mice.

In (4a), obviously, John’s fondness is of a rather different nature from what is found in (4b): the fondness expressed in the one is clearly incompatible with the fondness expressed in the other. The fondness of (4c) can be either of the kind expressed in (4a) or of the kind expressed in (4b). The common element in the status assigned to the object-term referents is something like ‘being the object of one’s affection or of one’s pleasure,’ but again, such a condition is insufficient to determine full interpretation. Cognitive dependency is an essential aspect in the description of predicate meanings. The fact that some predicate meanings contain a parameter referring to an available nonlinguistic but language-independent, cognitive knowledge base means that neither utterance-token interpretation nor sentence-type meaning is compositional in the accepted sense of being derivable by (model-theoretic) computation from the linguistic elements alone. As regards utterance-token interpretation, this is already widely accepted, owing to valuable work done in pragmatics. The noncompositionality of sentence-type meaning, defined at the level of language description, is likewise beginning to be accepted by theorists of natural language. This type-level noncompositionality does not mean, however, that the specification of the satisfaction conditions of predicates is not truth-conditional, only that standards embodied in socially accepted knowledge have become part of the truth conditions of sentences in which the predicate occurs. In most treatises on lexicology, the term polysemy is used for phenomena such as those presented above. At the same time, however, it is widely recognized that this is, in fact, little more than a term used to give the problem a name. The problem itself lies in the psychology of concepts. 
One may assume that there are socially shared concepts like ‘possession,’ ‘flatness,’ and ‘fondness,’ but it is not known in what terms such concepts are to be defined. In a general sense, Fodor (1975, 1998) is probably right in insisting that lexical meanings are direct reflexes of concepts that have their abode in cognition but outside language. The necessary and sufficient conditions taken to define the corresponding lexical meanings cannot, according to Fodor, be formulated in natural language terms, but must be formulated in a ‘language of thought,’ which is categorically different from any natural language and whose terms and combinatorial properties will have to be established as a result of psychological theorizing. It is clear, in any case, that phenomena like those shown in (1)–(4) pose a serious threat to any attempt at setting up a model-theoretic theory of lexical meaning, such as Dowty (1979): the neglect of the cognitive factor quickly becomes fatal in lexical semantics. Context-bound or situational knowledge plays a role in the interpretation of predicates that involve a ‘viewpoint’ or ‘perspective,’ such as the pair come and go, or predicates such as to the right (left) of, in front of, and behind. The two versions of (5) are truth-conditionally identical, but they differ semantically in that the ‘mental camera,’ so to speak, has stayed in the corridor in the went version, but has moved along with Dick into the office in the came version.

(5) Dick and Harry were waiting in the corridor. Then Dick was called into the office. After five minutes, Harry [went/came] in too.

In a similar manner, sentences (6a) and (6b) may describe the same situation, but from different points of view. In (6a), schematically speaking, the viewer, the tree, and the statue are in a straight line; in (6b), it is the viewer, the tree, and the fountain that are in a straight line:

(6a) There was a statue behind the tree, and a fountain to the left of the tree.
(6b) There was a fountain behind the tree, and a statue to the right of the tree.
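
The geometry behind (6a) and (6b) can be checked with a short computation. The sketch below is an illustration added here, not a model proposed in the entry: the function relation, the coordinates, and the reduction of behind, left, and right to a sign test on a two-dimensional cross product are all assumptions made for the example. It treats the line from the viewer to the landmark (the tree) as the viewing direction and classifies a second object relative to that direction.

```python
from typing import Tuple

Point = Tuple[float, float]


def relation(viewer: Point, landmark: Point, obj: Point, tol: float = 1e-9) -> str:
    """Classify obj relative to landmark as seen from viewer (toy 2-D model)."""
    # Viewing direction: from the viewer toward the landmark.
    dx, dy = landmark[0] - viewer[0], landmark[1] - viewer[1]
    # Position of the object relative to the landmark.
    rx, ry = obj[0] - landmark[0], obj[1] - landmark[1]
    cross = dx * ry - dy * rx   # > 0: counterclockwise of the viewing direction
    dot = dx * rx + dy * ry     # > 0: farther along the viewing direction
    if abs(cross) <= tol:
        # Object lies on the viewer-landmark line.
        return "behind" if dot > 0 else "in front of"
    return "to the left of" if cross > 0 else "to the right of"


# One fixed scene (hypothetical coordinates), two viewpoints.
tree, statue, fountain = (0.0, 0.0), (0.0, 2.0), (-2.0, 0.0)
viewer_a = (0.0, -3.0)   # viewer, tree, and statue in a straight line, as in (6a)
viewer_b = (3.0, 0.0)    # viewer, tree, and fountain in a straight line, as in (6b)

for viewer in (viewer_a, viewer_b):
    print("statue", relation(viewer, tree, statue), "the tree;",
          "fountain", relation(viewer, tree, fountain), "the tree")
```

With these coordinates the positions of tree, statue, and fountain never change; only the viewer moves, and the descriptions switch from the pattern of (6a) to that of (6b). The viewpoint is thus a required input of the predicate, which is the sense in which such predicates depend on situational knowledge.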

A further cognitive criterion for the lexical meaning of predicates, especially those denoting artifacts, seems to be the function of the objects denoted. What defines a table or a chair is not their physical shape or the material they are made of, but their socially recognized function. The same holds for a concept like ‘luxury.’ Laws imposing special taxation on luxury goods or luxury activities usually enumerate the goods and activities in question, making exceptions for special cases (such as frock coats for undertakers). Yet what defines luxury is not a list of goods or activities, but socially recognized function – roughly, anything relatively expensive and exceeding the necessities of life. A peculiar example of cognitive dependency, probably based on function, is provided by the English noun threshold and its Standard German translation equivalent Schwelle. In their normal uses, they denote the ridge or sill usually found between doorposts at floor level. Yet these two words differ in their capacity for semantic extension: the elevations in roads and streets that are normally called speed bumps in English are called Schwelle in German. Yet it is unthinkable that speed bumps should be called thresholds in English. The question is: why? One is inclined to think that, at some ill-understood level of interpretation, the word threshold implies containment within a space or a transition from one kind of space to another, perhaps as a result of its etymology (which is not fully known). Schwelle, by contrast, is a swelling in the ground that forms an obstacle to be got over – which is also its etymology, although, on the whole, German speakers do not realize that. The difference between the two words is not a question of the ontological properties of the objects concerned, but, apparently, of the ways they are conceived of. The role of etymology in this case is intriguing.

See also: Cognitive Semantics; Lexical Conditions; Polysemy and Homonymy.

Bibliography

Dowty D (1979). Word meaning and Montague grammar. Dordrecht: Reidel.
Fodor J A (1975). The language of thought. Hassocks, Sussex: Harvester Press.
Fodor J A (1998). Concepts: Where cognitive science went wrong. New York: Oxford University Press.

Meaning: Development

E V Clark, Stanford University, Stanford, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

How do children assign meanings to words? This task is central to the acquisition of a language: words allow for the expression of the speaker’s intentions, they combine to form larger constructions, and the conventional meanings they have license their use for making references in context. Without them, there is no language. In the acquisition of meaning, children must solve the general mapping problem of how to line up word forms with word meanings. The forms are the words they hear from other (mainly adult) speakers. The meanings they must discern in part from consistencies in speaker usage in context from one occasion to the next and in part from inferences licensed by the speaker on each occasion. Possible meanings for unfamiliar words, then, are built up partly from children’s conceptual representations of events and partly from the social interactions at the heart of adult-child conversation. One critical task for children is that of working out the conventional meanings of individual words (e.g., cup, team, friend, truth). Yet, doing so is not enough: syntactic constructions also carry meanings that combine with the meanings contributed by the actual words used (causative constructions, as in They broke the cup or The boy made the pony jump; the locative construction, as in She put the carving on the shelf; the resultative construction, as in He washed the floor clean). However, children start mapping word meanings before they begin combining words.

Languages differ in how they lexicalize information – how they combine particular elements of meaning into words – and in the kinds of grammatical information that have to be expressed. They may package information about events differently; for example, combining motion and direction in a single word (depart) or not (go + toward), combining motion and manner (stroll), or not (walk slowly). They also differ in the grammatical distinctions made in each utterance. Some always indicate whether an activity was completed; others leave that to be inferred. Some always indicate whether the speaker is reporting from direct observation, or, for example, from the report of someone else. Some indicate whether object-properties are inherent or temporary. The grammatical distinctions that languages draw on vary, as do the ways in which they lexicalize information about objects and events. Mapping meanings onto words is not simply a matter of equating meanings with conceptual categories. Children have to select and organize conceptual information as they work out what the conventional meanings are for the words they are learning. How do children arrive at the meanings they first assign to unfamiliar words? How do they identify their intended referents? And how do they arrive at the relations that link word meanings in different ways? The general conversational context itself serves to identify relevant information on each occasion for children trying to work out the meaning of an unfamiliar word. Adult language use presents them with critical information about how words are used, their conventional meanings, and the connections among words in particular domains.

Meaning: Development 577

English are called Schwelle in German. Yet it is unthinkable that speed bumps should be called thresholds in English. The question is: why? One is inclined to think that, at some ill-understood level of interpretation, the word threshold implies containment within a space or a transition from one kind of space to another, perhaps as a result of its etymology (which is not fully known). Schwelle, by contrast, is a swelling in the ground that forms an obstacle to be got over – which is also its etymology, although, on the whole, German speakers do not realize that. The difference between the two words is not a question of the ontological properties of the objects concerned, but, apparently, of the ways they are conceived of. The role of etymology in this case is intriguing.

See also: Cognitive Semantics; Lexical Conditions; Polysemy and Homonymy.

Bibliography Dowty D (1979). Word meaning and Montague grammar. Dordrecht: Reidel. Fodor J A (1975). The language of thought. Hassocks, Sussex: Harvester Press. Fodor J A (1998). Concepts: Where cognitive science went wrong. New York: Oxford University Press.

Meaning: Development
E V Clark, Stanford University, Stanford, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

How do children assign meanings to words? This task is central to the acquisition of a language: words allow for the expression of the speaker’s intentions, they combine to form larger constructions, and the conventional meanings they have license their use for making references in context. Without them, there is no language. In the acquisition of meaning, children must solve the general mapping problem of how to line up word forms with word meanings. The forms are the words they hear from other (mainly adult) speakers. The meanings they must discern in part from consistencies in speaker usage in context from one occasion to the next and in part from inferences licensed by the speaker on each occasion. Possible meanings for unfamiliar words, then, are built up partly from children’s conceptual representations of events and partly from the social interactions at the heart of adult-child conversation. One critical task for children is that of working out the conventional meanings of individual words (e.g., cup, team, friend, truth). Yet, doing so is not enough: syntactic constructions also carry meanings that combine with the meanings contributed by the actual words used (causative constructions, as in They broke the cup or The boy made the pony jump; the locative construction, as in She put the carving on the shelf; the resultative construction, as in He washed the floor clean). However, children start mapping word meanings before they begin combining words.

Languages differ in how they lexicalize information – how they combine particular elements of meaning into words – and in the kinds of grammatical information that have to be expressed. They may package information about events differently; for example, combining motion and direction in a single word (depart) or not (go + toward), combining motion and manner (stroll), or not (walk slowly). They also differ in the grammatical distinctions made in each utterance. Some always indicate whether an activity was completed; others leave that to be inferred. Some always indicate whether the speaker is reporting from direct observation, or, for example, from the report of someone else. Some indicate whether object-properties are inherent or temporary. The grammatical distinctions that languages draw on vary, as do the ways in which they lexicalize information about objects and events. Mapping meanings onto words is not simply a matter of equating meanings with conceptual categories. Children have to select and organize conceptual information as they work out what the conventional meanings are for the words they are learning. How do children arrive at the meanings they first assign to unfamiliar words? How do they identify their intended referents? And how do they arrive at the relations that link word meanings in different ways? The general conversational context itself serves to identify relevant information on each occasion for children trying to work out the meaning of an unfamiliar word. Adult language use presents them with critical information about how words are used, their conventional meanings, and the connections among words in particular domains.

Conventionality and Contrast

Adult speakers observe two general pragmatic principles when they converse. First, they adhere to the conventions of the language they are speaking and in so doing make sure their addressees identify the meanings intended in their utterances. The principle of conventionality takes the following form: ‘For certain meanings, there is a form that speakers expect to be used in the language community.’ So if there is a conventional term that means what the speaker wishes to convey, that is the term to use. If the speaker fails to use it or uses it in an unusual way, that speaker risks being misunderstood. For conventions to be effective, conventional meanings must be given priority over any nonconventional ones. The second general principle speakers observe is that of contrast: ‘Speakers take every difference in form to mark a difference in meaning.’ When speakers choose a word, they do so for a reason, so any change in word choice means they are expressing a different meaning. These two principles work hand-in-hand with the Cooperative principle in conversation and its attendant maxims of quality (be truthful), quantity (be as informative as required), relation (make your contribution relevant), and manner (avoid ambiguity; Grice, 1989). Acting in a cooperative manner demands that one observe the conventions of the language in order to be understood. At the same time, if there is no conventional term available for the meaning to be expressed, speakers can coin one, provided they do so in such a way that the addressee will be able to interpret the coinage as intended (Clark, 1993).

In Conversation

Adults talk to young children from the very start, and what they say is usually tied closely to specific objects and activities. This feature of conversation presents young infants with opportunities to discern different intentions, marked by different utterances from early on. Infants attend to adult intentions and goals as early as 12 months of age. They show this, for example, by tracking adult gaze and adult pointing toward objects (e.g., Carpenter et al., 1998), so if they are also attentive to the routine words and phrases used on each type of occasion, they have a starting point for discerning rational choices among contrasting terms and gestures. Consider the general conditions for conversational exchange: joint attention, physical co-presence, and conversational co-presence. Adults observe these conditions and indeed impose them, as they talk to very young children. They work to get 1- and 2-year-olds to attend, for instance when planning to tell them about an unfamiliar object, and only then do they talk to them about whatever object or event is visibly present (Clark, 2001). By first establishing joint attention, adults set young children up to identify and then to help add to common ground. Children can do this by ratifying offers of new words by repeating them or else indicating in some other way that they have taken up an unfamiliar term (Clark, 2003). When adults offer unfamiliar words, they do so in the conversational context; that is, with children who are already attending to whatever is in the locus of joint attention. This feature, along with any familiar terms that are co-present in the conversation, allows children to make a preliminary mapping by identifying the intended referent, whether it is an object or an action (Tomasello, 2002). In effect, the conditions on conversation narrow down the possible meanings that young children might consider for a new term to whatever is in the current joint focus of attention. However, adults do more in conversation. They accompany their offers of unfamiliar words with additional information about the intended referent on that occasion and about how the target word is related to other terms in the same semantic field. Among the semantic relations adults commonly offer are inclusion (An X is a kind of Y), meronomy or partonomy (An X is part of Y), possession (X belongs to Y), and function (X is used for Y; Clark and Wong, 2002). After offering one term, adults often offer others that happen to contrast in that context, so a dimensional term like tall may be followed up by short, wide, narrow, and long (Rogers, 1978).
In fact, the meanings of words for unfamiliar actions may also be inferred in part from their co-occurrence with terms for familiar objects affected by those actions, and the meanings of words for unfamiliar objects may be inferred in part from the verbs with which the nouns in question occur (e.g., Goodman et al., 1998; Bowerman, 2005). All this information offers ways for children to link new terms to any relevant words they already know. Children learn from child-directed speech about general properties of the lexicon – taxonomic relations, nonoverlapping categories within levels, opposites, overlaps in meaning (through hierarchical connections) vs. in reference, and so on. In short, adults are the experts in providing the conventional terms used for specific meanings in the speech community. The novices, children, ask them innumerable What’s that? questions from around age 2;0–2;6 on and treat them as reliable sources for how to talk about new things (e.g., Diesendruck and

Markson, 2001). Moreover, when young children make errors, adults frequently check up, through side sequences and embedded corrections, on what they intended to say, and so present children with the conventional forms to aim for (Chouinard and Clark, 2003).
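
The semantic relations that adults offer alongside new words (inclusion, part-whole, possession, function) can be pictured as labeled links between terms. The Python sketch below is only an illustration of that idea, not a model proposed in the work cited here; the class name `LexiconSketch`, its methods, and the relation labels are hypothetical.

```python
from collections import defaultdict


class LexiconSketch:
    """Toy store of labeled links between terms (hypothetical, for illustration)."""

    def __init__(self):
        # Maps (relation, term) to the set of terms it is linked to.
        self.links = defaultdict(set)

    def add(self, term, relation, other):
        """Record one adult-offered link, e.g. 'a spaniel is a kind of dog'."""
        self.links[(relation, term)].add(other)

    def superordinates(self, term):
        """Follow 'kind-of' links upward and collect every category reached."""
        found = set()
        to_visit = [term]
        while to_visit:
            current = to_visit.pop()
            for parent in self.links.get(("kind-of", current), set()):
                if parent not in found:
                    found.add(parent)
                    to_visit.append(parent)
        return found


lexicon = LexiconSketch()
lexicon.add("spaniel", "kind-of", "dog")
lexicon.add("dog", "kind-of", "animal")
lexicon.add("ear", "part-of", "dog")

print(sorted(lexicon.superordinates("spaniel")))  # ['animal', 'dog']
```

Keeping the relation label as part of the key means that a part term such as ear is stored without being mistaken for a kind of dog, which mirrors the distinction between inclusion and partonomy drawn above.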

Making Inferences

When children hear a new term for some object or activity, they can infer in context that the term probably applies to the object or activity to which they are attending. However, the information that adults often follow up with allows children to make more detailed inferences about the candidate meaning. Mention of class membership – for example, A sparrow is a bird – tells them that they can add the term sparrow to the set of terms they already know for birds, perhaps just chicken and duck. Comments on the size, characteristic song, or flight each allow further inferences about how sparrows differ from ducks and chickens. What evidence is there that young children take in such information? In spontaneous conversations, they give evidence of attending to what adults say in several ways. First, they repeat new terms in their next conversational turn, either as single words or embedded in a larger utterance; second, they acknowledge the adult offer with forms like yeah, uh-huh, and mmh; and third, they continue to talk about the relevant semantic domain (Clark, 2004). Children’s readiness to make inferences from added information offered by adults has also been examined in word-learning experiments. In one study, children aged just 2 years old were taught words for two sets of objects (A and B) that were similar in appearance and had the same function. After teaching the first word for the first set (A), the experimenter introduced the second set of objects while saying just once, ‘‘Bs are a kind of A.’’ He then proceeded to teach the second word, B. Children were then tested by asking them to find all the As and then all the Bs. For the first request, they typically selected As; for the second, they consistently (and correctly) picked only Bs (Clark and Grossman, 1998). In short, the one statement of an inclusion relation was enough for even young 2-year-olds to make use of it in this task.
In another condition, again teaching two new words for two sets that resembled each other, children infer that there could be an inclusion relation but they have no way to tell which way it should go, so some include A in B, and some B in A. Children rely on contrast in context to make inferences about the most probable reference for a newly introduced word. For example, if they already know what the object they are attending to is called, they are more likely to infer that a new term denotes a subordinate, a part, or some other property of it (Taylor and Gelman, 1989). This propensity was exploited directly in studies of whether children could decide in context whether a new word was intended to denote an object or an activity. Young 2-year-olds were presented with the same object doing different actions, with one action labeled with the new term, or else several objects, one labeled with the new term and all doing the same action. The children readily inferred that the new word denoted an activity in the first case and an object in the second (e.g., Tomasello, 2002). Young children are also able to discern the intended from the accidental. When shown various actions being demonstrated, infants aged 18 months imitated intended actions (marked by utterances like ‘There’) more frequently than unintended ones (signaled by utterances like ‘Oops’). By age 2, young children know to ignore errors in wording, for example, and attend only to the final formulation of what someone is saying. In one study, for example, children were taught a word for a set of objects; then the experimenter exclaimed, ‘‘Oh, I made a mistake: these aren’t As, they’re Bs’’ and proceeded to teach the word B in place of the earlier A. When tested, even children who were just 2 years old knew that they did not know what A, the first word, meant (e.g., Clark and Grossman, 1998). All the inferences presented so far have been overt inferences about unfamiliar word meanings, made on the spot by children exposed to the new words. Yet, although adults make clear offers of new words, marking them as new by introducing them in formulaic deictic frames (e.g., This is a . . .), with utterance-final stress, many of the other words they use will be unfamiliar to very young children. How do children assign meanings to all those words?
The answer lies in the covert use of Roger Brown’s (1958) ‘‘original word game.’’ Basically, the child notices an unfamiliar word, makes inferences in context about its probable meaning and acts on that, and then adjusts those inferences in light of the adult’s responses. Consider these scenarios by way of illustration:

(a) Young child watching parent in the kitchen, with several drink containers on the counter
Mother (to older sibling): Hand me that mug, will you?
(Child, wondering what a mug is, watches sibling pick up a mug)
Mother: Thanks
(Child infers for now that mug denotes something that has a handle, is a solid color, and is made of ceramic)

Sometimes, the inferences that children make are informed more directly by the parent’s responses, as in (b).

(b) Young child holding two plastic animals, a cat and a dog
Father: Can you give me the spaniel?
(Child, uncertain what spaniel means, holds out the cat)
Father: No, the spaniel please.
(Child infers that spaniel must refer to a kind of dog rather than a kind of cat, and so hands over the plastic dog instead)

In both cases, the child makes preliminary or tentative inferences that can then be adjusted or changed in light of adult follow-up utterances, further exposures in other contexts, and additional, often explicit information about inclusion, parts, properties, or functions. Of course, inferences like these can also be made about terms for actions, relations, and states, as well as about those for objects, parts, and properties.
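
The guess-and-adjust cycle of the word game can be sketched as a small elimination loop. This is a deliberately simplified illustration in Python, assuming that the child offers one candidate referent at a time and that the adult's response amounts to acceptance or rejection; the function name `play_word_game` and its parameters are hypothetical.

```python
def play_word_game(candidates, rounds):
    """Narrow down what an unfamiliar word might pick out.

    candidates: referents the child could offer.
    rounds: (offered_referent, adult_accepted) pairs, in order.
    Returns the candidates still compatible with the adult's responses.
    """
    remaining = list(candidates)
    for offered, accepted in rounds:
        if accepted:
            # The adult took the offer, so settle on it for now.
            return [offered]
        # The adult repeated the request, so rule this referent out.
        remaining = [c for c in remaining if c != offered]
    return remaining


# Scenario (b): the child holds out the cat and the father repeats the request.
print(play_word_game(["cat", "dog"], [("cat", False)]))  # ['dog']
```

The result is a list, not a single answer, because the inference stays tentative: with more than two candidates, one rejection would leave several referents open until further exposure narrows them down.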

Pragmatics and Meaning

In the conversational exchanges considered so far, adult and child both follow the cooperative principle characterized by Grice (1989), as seen by their observation of joint attention, physical co-presence, and conversational co-presence. In addition, each participant in the exchange must add to common ground and keep account of the common ground that has been accumulated so far (H. Clark, 1996). All of this requires that speakers keep careful track of the intentions and goals being conveyed within an exchange (Tomasello, 1995; Bloom, 2000). Infants are attentive to nonlinguistic goals very early. For example, if 14-month-olds are shown an unusual action that achieves a goal – for example, an adult bending down to touch a panel switch with her forehead – they imitate it. If 18-month-olds watch an adult try and fail to place a metal hoop on a prong, the infants will produce the action successfully, even though they have never seen it completed (Meltzoff, 1995). That is, infants infer that the adult intended to turn on the light or intended to hang up the hoop. Intentions are what is critical, Meltzoff demonstrated, not just observation of the relevant actions, because infants do not re-enact these events when the actions are performed by a mechanical hand. In much the same way, infants attend to the words that adults use. Upon hearing a word, they infer that the speaker is referring to the entity physically present in the locus of joint attention. If instead the speaker produces a different word, they infer that the speaker is now referring to something else and therefore has a different goal in speaking. That is, each linguistic expression chosen indexes a different intention, thus exemplifying the speaker’s reliance on contrast, as well as on conventionality (Clark, 1993). This recognition then licenses young children to use words to express their intentions and in this way to convey specific goals. Adult usage provides the model for how to do so within conversational exchanges. Infants also grasp quite early that the words used to express certain meanings are fixed and conventional. For example, they know that adults who wish to refer to a squirrel use the term ‘squirrel’ or to refer to a sycamore tree use the term ‘sycamore,’ and so on. As a result, when they notice adults who fail to use ‘squirrel’ when looking at a squirrel, but instead use another expression, they can readily infer that the speaker must therefore mean something else. In effect, young children, just like adults, assume that if the speaker intends to talk about a squirrel, he will use the conventional term for it. If instead, he uses something else, then he must intend to convey some other meaning. As a result, in situations where children already know terms for some of the objects they can see, they expect the adult to use a familiar term for any familiar object. If the adult instead produces an unfamiliar term, in the presence of an unfamiliar object, they will infer that he intended to refer to the object for which they do not yet have a term. So they use contrast, together with the assumption that conventional terms always take priority, to interpret the speaker’s intentions on such occasions. The result is that they consistently assign unfamiliar terms to as-yet unnamed objects or actions. This pragmatic strategy for interpreting intentions and thereby making a first assignment of meaning to an unfamiliar word helps young children in many settings. Take the case of an adult looking at a single familiar object that is well known to the child.
The adult, clearly talking about that object, does not use the expected conventional term. What can the child infer from that? There are two common options: (1) the unfamiliar expression denotes a superordinate or subordinate category, or (2) it denotes a part or property of the familiar object. Then, the remainder of the utterance can typically provide the child with important clues about the correct inference. For example, production of a familiar term for a known object is typically followed by a part term accompanied by a possessive pronoun (as in his ear), whereas such expressions as is a kind of or is a are associated with assignments to class membership in a superordinate category (Clark and Wong, 2002; Saylor and Sabbagh, 2004). Use of a verb like looks or feels (as in it looks smooth, it feels soft) often accompanies the introduction of property terms, and when the unfamiliar term is introduced before the

familiar one with kind of (a spaniel is a kind of dog), the child readily infers that the new term, here spaniel, designates a subordinate category. Finally, children as young as age 2 rely on a combination of syntax cues and physical co-presence in identifying generic noun phrases; for example, when asked something like What noise do dogs make? with just one dog in sight. What these findings indicate is that even very young children are highly attentive to the locus of joint attention and to whatever is co-present physically and conversationally. When one adds in whatever linguistic knowledge children have already built up about word meanings and constructions, it becomes clear that they have an extensive base from which to make inferences about possible, plausible meanings of unfamiliar words. This holds whether the words are presented explicitly as ‘new’ by adult speakers or whether children simply flag them en passant as unfamiliar and therefore in need of having some meaning assigned. At the same time, young children may have a much less firm grasp on the meanings of many of their words than adult speakers, and incidental or even irrelevant pragmatic factors may affect their interpretations and responses. Take the case of the Piagetian conservation task where the experimenter ‘checks up’ on the 5- or 6-year-old near-conserver’s answer by asking, for the second time, whether the amount that has just been transferred to a new container or transformed into a different array ‘‘is still the same.’’ Children on the verge of conserving typically change their initially correct answers from ‘yes’ to ‘no’ at this point. They do so because, pragmatically, asking the same question a second time signals that the initial answer was unsatisfactory (Siegal, 1997).
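
The pragmatic strategy described in this section, with conventional terms taking priority and any new form taken to signal a new meaning, can be written out as a short decision procedure. The Python sketch below is an illustrative simplification and not a claim about how children actually compute these inferences; the function `interpret_term`, its arguments, and the result labels are all hypothetical.

```python
def interpret_term(heard, known_terms, in_focus):
    """Guess what a heard term is about.

    heard: the word the adult just used.
    known_terms: dict mapping words the child knows to object labels.
    in_focus: labels of the objects in the locus of joint attention.
    Returns a (kind_of_inference, object_label) pair.
    """
    if heard in known_terms:
        # Conventionality: a familiar term keeps its conventional meaning.
        return ("conventional", known_terms[heard])
    named = set(known_terms.values())
    unnamed = [obj for obj in in_focus if obj not in named]
    if len(unnamed) == 1:
        # Contrast: a new form marks a new meaning, so pick the unnamed object.
        return ("term-for-unnamed-object", unnamed[0])
    if not unnamed and len(in_focus) == 1:
        # The only object already has a name: the new term probably marks
        # another level of category, or a part or property of that object.
        return ("other-level-part-or-property", in_focus[0])
    return ("undetermined", None)


known = {"ball": "ball", "dog": "dog"}
print(interpret_term("whisk", known, ["ball", "kitchen gadget"]))
print(interpret_term("spaniel", known, ["dog"]))
```

In the second call the sketch cannot say whether spaniel names a subordinate category, a part, or a property; as the text notes, that choice depends on cues in the rest of the utterance, such as a possessive pronoun or the frame is a kind of.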

Another Approach

In another approach to the acquisition of lexical meaning, some researchers have proposed that the task is so complex for young children that they must start out with the help of some a priori constraints. These constraints are designed to limit the kinds of meanings children can attribute to new words. What form would these constraints take, and what evidence is there for them? Among the constraints proposed are whole object – ‘Words pick out whole objects’ – and mutual exclusivity: ‘Each referent is picked by just one word’ (e.g., Markman, 1989). The whole object constraint predicts that young children should assume that any unfamiliar word picks out a whole object and not, for example, a part or property of that object. The mutual exclusivity constraint predicts that young children should assume that an unfamiliar word must pick out something other than whatever has a name that is already known to the child. So this constraint predicts that children will systematically reject second terms they hear apparently applied to an already labeled referent, as well as fail to learn second terms. The predictions from these and other constraints have been tested in a variety of word-learning experiments where the target referents are objects. In fact, the whole object and mutual exclusivity constraints apply only to words for objects, so they would have infants treat all unfamiliar words as if they designated only objects and never actions, relations, or properties. How do such constraints work, as they conflict with many properties of word meanings? For example, mutual exclusivity would cause children to not learn inclusion relations in taxonomies because they would need to apply two or more terms to the same referent category in learning that an X can be called a dog, specifically a subtype called a spaniel, and that a dog is also a kind of animal. The whole object constraint would cause children to not learn terms for parts and properties. It would militate against children learning any terms for activities or relations. One could propose that such constraints apply only in the early stages of acquisition, after which they are overridden. However, then one has to specify what leads to their being overridden; in other words, what the necessary and sufficient conditions are for each constraint to be dropped so children can start to learn words for activities and relations, say, from adult usage, or words for parts and properties, as well as words for objects. Under this view of meaning acquisition, children could start out observing the various constraints and then drop each one at a certain stage in development so as to be able to learn other kinds of meanings up to then blocked by the constraints.
In short, children should at first ignore much of what their parents say about words and word meanings and reject second labels whenever they are offered to mark a different perspective, for example. They should also look for words only to pick out objects, mistakenly assigning any that might, in context, seem to be designating categories of actions or relations as words for objects instead. Is this a realistic picture of development? No, because it calls for selectively ignoring or rejecting a large amount of what adults do with language as they talk about the world to their children, offer them words for objects and events in the locus of joint attention, and provide extensive commentary on parts, properties, motion, and functions associated with specific category members. The constraints approach ignores conditions imposed on conversational

exchanges, such as joint attention, and physical and conversational co-presence, and what they contribute to assigning meaning. It also conflicts with adult usage, which offers a range of perspectives on specific objects and events. A piece of fruit can be just that, fruit, or it can be an apple, dessert, or a snack, depending on the perspective chosen (Clark, 1997). Yet, these factors must all be taken into account in both designing and interpreting studies of meaning acquisition.

Sources of Meanings

Children draw on conceptual categories already known to them and on information offered in context, both nonlinguistic and linguistic, when they assign a first meaning to new words. Infants build up and organize conceptual categories of the objects, relations, and events they observe months before they try to use words to evoke the relevant categories. As they assign candidate meanings, they rely on these conceptual categories to connect category instances and words as they start in on language (Slobin, 1985). However, because languages differ, children learn, for purposes of talking, to attend automatically to some aspects of events and ignore others; for example, whether an action is complete or not or whether the speaker witnessed an event for herself or simply heard about it. It is therefore important to distinguish between conceptual knowledge about events and the knowledge speakers draw on when they talk about those events (Slobin, 1996). Children try to make sense of what the adult wants. This means relying on any potentially useful source of information for interpreting and responding to adult utterances. What children know about the conceptual categories that appear to be at the focus of joint attention seems to provide initial strategies for coping when they do not yet understand all the words. The physical and conversational contexts, with joint attention, identify the relevant ‘space’ in which to act. This holds just as much for responding to half-grasped requests as for direct offers of unfamiliar words. Children attend to what is physically present, to any familiar words, and to any conceptual preferences. These preferences may include choosing greater amounts over lesser ones, assuming that the first event mentioned is the first to occur, and exchanging one state of affairs for another (Clark, 1997).
Such coping strategies may be consistent with the conventional meanings of certain words, so children will appear to understand them when in fact they do not. The match of coping strategies and meanings offers one measure of complexity in acquisition: Matches should be simpler to acquire than cases of mismatch.

Children can draw on what they already know about objects and events, relations, and properties for their world so far. Their current knowledge about both conceptual categories and about their language at each stage offers potential meanings, in context, assignable to unfamiliar words. These preliminary meanings can be refined, added to, and reshaped by adult usage on subsequent occasions. This way, children learn more about the meanings that abut each word, the contrasts relevant in particular semantic domains, and the number of terms in a domain that have to be distinguished from one another. To succeed in this effort, children have to identify consistent word uses for specific event-, relation-, and object-types. They have to learn what the conventions are for usage in the speech community where they are growing up (e.g., Eckert, 2003).

Summary

As children learn new words, they rely on what they know so far – the conceptual and linguistic knowledge they have already built up – to assign them some meaning in context. These initial meanings draw equally on their own conceptual categories and on adult patterns of word use within the current conversational exchange. In effect, joint attention, along with what is co-present physically and conversationally, places pragmatic limits on what the meaning of an unfamiliar word is most likely to be. In addition, adults often provide further information about the referent object or action, linking the word just offered to other words for relevant properties and actions, and thereby situating the new word in relation to terms already known to the child. This framing by adults for new word meanings licenses a variety of inferences by the child about what to keep track of as relevant to each particular word meaning (Clark, 2002). Adults here are the experts and constitute both a source and resource for finding out about unfamiliar word meanings.

See also: Brown, Roger William (b. 1925); Cognitive Semantics; Context and Common Ground; Cooperative Principle; Cross-Linguistic Comparative Approaches to Language Acquisition; Developmental Relationship between Language and Cognition; Gestures: Pragmatic Aspects; Grice, Herbert Paul (1913–1988); Language Development in School-Age Children, Adolescents, and Adults; Language Development: Overview; Lexical Semantics: Overview; Lexicalization; Lexicon: Structure; Pragmatic Determinants of What Is Said; Sense and Reference: Philosophical Aspects; Slobin, Dan Isaac (b. 1939); Social Aspects of Pragmatics; Social-Cognitive Basis of Language Development; Word Formation.

Bibliography

Bloom P (2000). How children learn the meanings of words. Cambridge, MA: MIT Press.
Bowerman M (2005). ‘Why can’t you “open” a nut or “break” a cooked noodle? Learning covert object categories in action word meanings.’ In Gershkoff-Stowe L & Rakison D (eds.) Building object categories in developmental time. Mahwah, NJ: Lawrence Erlbaum.
Brown R (1958). Words and things. New York: Free Press.
Carpenter M, Nagell K & Tomasello M (1998). ‘Social cognition, joint attention, and communicative competence from 9 to 15 months of age.’ Monographs of the Society for Research in Child Development 63(176).
Chouinard M M & Clark E V (2003). ‘Adult reformulations of child errors as negative evidence.’ Journal of Child Language 30, 637–669.
Clark E V (1993). The lexicon in acquisition. Cambridge: Cambridge University Press.
Clark E V (1997). ‘Conceptual perspective and lexical choice in acquisition.’ Cognition 64, 1–37.
Clark E V (2001). ‘Grounding and attention in the acquisition of language.’ In Andronis M, Ball C, Elston H & Neuvel S (eds.) Papers from the 37th meeting of the Chicago Linguistic Society, vol. 1. Chicago: Chicago Linguistic Society. 95–116.
Clark E V (2002). ‘Making use of pragmatic inferences in the acquisition of meaning.’ In Beaver D, Kaufmann S, Clark B Z & Casillas L (eds.) The construction of meaning. Stanford, CA: CSLI Publications. 45–58.
Clark E V (2003). First language acquisition. Cambridge: Cambridge University Press.
Clark E V (2004). ‘Pragmatics and language acquisition.’ In Horn L R & Ward G (eds.) Handbook of pragmatics. Oxford: Blackwell. 562–577.
Clark E V & Grossman J B (1998). ‘Pragmatic directions and children’s word learning.’ Journal of Child Language 25, 1–18.
Clark E V & Wong A D-W (2002). ‘Pragmatic directions about language use: words and word meanings.’ Language in Society 31, 181–212.
Clark H H (1996). Using language. Cambridge: Cambridge University Press.
Diesendruck G & Markson L (2001). ‘Children’s avoidance of lexical overlap: a pragmatic account.’ Developmental Psychology 37, 630–641.
Eckert P (2003). ‘Social variation in America.’ Publication of the American Dialect Society 88, 99–121.
Goodman J C, McDonough L & Brown N B (1998). ‘The role of semantic context and memory in the acquisition of novel nouns.’ Child Development 69, 1330–1344.
Grice H P (1989). Studies in the ways of words. Cambridge, MA: Harvard University Press.
Markman E M (1989). Categorization and naming in children: problems of induction. Cambridge, MA: MIT Press.
Meltzoff A N (1995). ‘Understanding the intentions of others: re-enactment of intended acts by eighteen-month-old children.’ Developmental Psychology 31, 838–850.
Rogers D (1978). ‘Information about word-meaning in the speech of parents to young children.’ In Campbell R N & Smith P T (eds.) Recent advances in the psychology of language. London: Plenum. 187–198.
Saylor M M & Sabbagh M A (2004). ‘Different kinds of information affect word learning in the preschool years: the case of part-term learning.’ Child Development 75, 395–408.
Siegal M (1997). Knowing children: experiments in conversation and cognition (2nd edn.). Hove, Sussex: Psychology Press.
Slobin D I (1985). ‘Crosslinguistic evidence for the language-making capacity.’ In Slobin D I (ed.) The crosslinguistic study of language acquisition, vol. 2. Hillsdale, NJ: Lawrence Erlbaum. 1157–1249.
Slobin D I (1996). ‘From “thought and language” to “thinking for speaking.”’ In Gumperz J J & Levinson S C (eds.) Rethinking linguistic relativity. Cambridge: Cambridge University Press. 70–96.
Taylor M & Gelman S A (1989). ‘Incorporating new words into the lexicon: preliminary evidence for language hierarchies in two-year-old children.’ Child Development 60, 625–636.
Tomasello M (1995). ‘Joint attention as social cognition.’ In Moore C & Dunham P J (eds.) Joint attention: its origins and role in development. Hillsdale, NJ: Lawrence Erlbaum. 103–130.
Tomasello M (2002). ‘Perceiving intentions and learning words in the second year of life.’ In Bowerman M & Levinson S C (eds.) Language acquisition and conceptual development. Cambridge: Cambridge University Press. 132–158.

Meaning: Overview of Philosophical Theories
R M Martin, Dalhousie University, Halifax, NS, Canada
© 2006 Elsevier Ltd. All rights reserved.

The Direct Reference Theory

It is obvious that an important fact about language is that bits of it are systematically related to things in the world. ‘Referential’ theories of meaning hold that the meaning of an expression is a matter, somehow, of this connection. The most immediately plausible application of this theory is to the meaning of proper names: the name ‘Benedict Spinoza’ is connected to the philosopher, and this fact appears to exhaust the meaning of that name. The other names – ‘Benedictus de Spinoza,’ ‘Baruch Spinoza,’ and ‘Benedict d’Espinosa’ – mean the same because they are connected to the same person.

But even in the case of proper names, problems arise. For example, consider proper names with nonexistent referents, such as ‘Santa Claus.’ If the meaning of a proper name is constituted by nothing but its relationship to the bearer of that name, then it follows that ‘Santa Claus’ is meaningless; but this seems wrong, because we know perfectly well what ‘Santa Claus’ means, and we can use it perfectly well, meaningfully.

Another example would be the two proper names applied to the planet Venus by the Ancient Greeks, who were unaware that it was the same planet that appeared sometimes in the evening sky, when they called it ‘Hesperus,’ and sometimes in the morning sky, when they called it ‘Phosphorus.’ Because these two names in fact refer to one and the same object, we should count them as having exactly the same meaning. It would appear to follow that someone who knew the meaning of both names would recognize that the meaning of one was exactly the same as the meaning of the other, and therefore would be willing to apply them identically. Yet the Greeks, when seeing Venus in the morning sky, would apply ‘Phosphorus’ to it but refuse to apply ‘Hesperus.’ Does it follow that they did not fully understand the meanings of those terms?
That is an implausible conclusion, since these are terms of the ancient Greek language: how could the most competent speakers of that language fail to understand the meanings of two terms in that language? It looks much more plausible to say that the fact that Hesperus and Phosphorus are identical is not a consequence of the meanings of those words. So meaning is apparently not exhausted by reference. (This example and this argument were originated by Frege.)

But here is a second sort of difficulty for the reference theory. Even if it could do a plausible job of explaining the meaning of proper names, it is not at all clear what it should do with other elements of language. Proper names, after all, make up only a small part of language, and an atypical part, insofar as meaning is concerned, at that; one does not find them in most dictionaries, for example. Consider whether this sort of approach to meaning might be extended to linguistic items other than proper names. It is a short step from there to more complex, less direct ways of referring, for example, ‘the Amsterdam-born author of the Ethics.’ If this definite description gets its meaning by its reference, then since it refers to Spinoza again, it must mean the same as those other names. But a problem immediately arises here, similar to the ‘Hesperus/Phosphorus’ worry. One might understand the meaning of ‘Benedict Spinoza’ perfectly, it seems, without knowing some facts about the philosopher, for example, that he was born in Amsterdam and wrote the Ethics; and, as a result, although one understood ‘the Amsterdam-born author of the Ethics’ he or she would not know that this referred to Spinoza. A similar problem arises with the ‘Santa Claus’ worry: ‘the present king of France’ is surely meaningful, although it is without reference (Russell’s famous example). Still other linguistic elements provide greater difficulty for a reference theory. How, for example, is the meaning of the predicate ‘is wise,’ occurring, for example, in ‘Spinoza is wise,’ to be explained in terms of reference? Particular wise objects exist, to be sure – Spinoza for one. But clearly it is not helpful to think that ‘is wise’ here gets its meaning merely by referring to Spinoza again – which would add nothing – or to some other wise person – which seems irrelevant. 
And what if that sentence were false (but meaningful), and Spinoza were not among the wise things: what would ‘is wise’ refer to then? More likely ‘is wise’ refers to a set of things – the wise individuals (Spinoza, Socrates, Bertrand Russell, etc.). But the sentence clearly tells us more than that Spinoza belongs to the group Spinoza, Socrates, Bertrand Russell, etc. It refers not to that group, it seems, but rather to the property that is necessary for inclusion in that group: wisdom. It is controversial whether we really need to populate our universe with strange objects such as ‘properties’ and ‘universals’; but, in any case, even if they do exist, it’s not clear that ordinary predicates refer to them. For example, ‘wisdom’ may be the name of a particular thing, referred to in the sentence, ‘Wisdom appeals to Francine,’ but it is much less clear that this thing is

referred to in the sentence ‘Spinoza is wise.’ A similar difficulty is posed by common nouns, e.g., ‘philosopher.’ It does not seem that we could explain the meaning of this element in the sentence ‘Spinoza is a philosopher’ by claiming reference to a particular philosopher, the class of philosophers, or philosopher-ness. Furthermore, reference has nothing to do with grammatical structure, which one would think is an important part of the meaning of any sentence. These two sentences, ‘Jane loves John’ and ‘John loves Jane,’ make the same references (to Jane, John, and loving, perhaps) but they surely mean something very different. A sentence conveys more than a series of references. It does not merely point at Jane, John, and the property of loving; in addition, it makes the claim that Jane loves John (or vice versa).

Meaning as Truth Conditions

Perhaps a more promising way to extend the reference theory for common nouns, predicates, and other linguistic elements is to think of them as functions. Consider the analogy with arithmetic: 5, 13, −9, and so on are the names of numbers (whatever they are), but x/3 = 9 is a function from numbers to a ‘truth value.’ With ‘27’ as the argument, its value is TRUE. With ‘16’ as the argument, its value is FALSE. Its meaning consists in the systematic way in which it pairs arguments with truth values. Now consider the systematic way ‘x is wise’ relates arguments to truth values. Substitute the proper name of certain things (any of the wise things) and the value is TRUE. Substitute the proper name of other things and the value is FALSE. The systematic way in which arguments and values are related in this case (it seems) exhausts the meaning of ‘is wise.’

Philosophers have proposed similar ways to deal with other linguistic elements. For example, adverbs might be regarded as functions taking a predicate as ‘argument’ and yielding a predicate as ‘value.’ This amendment in the spirit of the direct reference theory considerably extends its power and explains the function, basically in terms of reference, of various parts of language that do not by themselves refer. Partially because some of the functions in this approach have TRUE and FALSE as values, it was proposed that these truth values be considered the referents of sentences. (This move has seemed implausible to many, however: what are these things called truth values?)

In the 1930s, Alfred Tarski proposed a definition of ‘truth’ that some philosophers thought would be the basis of a good theory of meaning. Tarski’s proposal was that what would constitute a definition of TRUE

for a language L would be a complete list of statements giving the truth conditions for each of the sentences in L. So one of these statements defining truth-in-English would be ‘Snow is white’ is true in English if and only if snow is white. (This may appear ludicrously trivial, because the sentence whose truth conditions are being given, and the reference to the truth condition itself, are in the same language. Of course, if you did not know what made ‘Snow is white’ true, this statement would not tell you. But that is not a problem with Tarski’s view in particular: no statement of a theory would be informative to somebody who didn’t speak the language in which the theory was stated.) Now, when we know the truth-conditions for a sentence, then do we know its meaning? Once you know, for example, what it takes to make ‘Snow is white’ true, then, it seems, you know what that sentence means. And what it takes, after all, is that snow be white. Obviously what one learns when one learns the meaning of a language cannot be the truth conditions of each sentence in the language, one at a time, because there are an infinite number of sentences. What is learned must be the meanings of a finite number of elements of sentences, and a finite variety of structures they go into. In the Tarskian view, then, the semantic theory of a language consists of a large but finite number of elements (words, perhaps), together with composition rules for putting them together into sentences and information sufficient for deriving the truth conditions for each of a potentially infinite number of sentences. One might object that this elaborate theory could not be what people know when they know what their language means. But perhaps providing this is not the purpose of a semantic theory (or a theory of anything). Baseball players are adept at predicting the path of a ball but only rarely are familiar with Newtonian theory of falling bodies. The idea here is attractive. 
If you know what sort of world would make a sentence true, then it seems that that is all it would take for you to know what that sentence means. This idea (though not particularly through Tarski’s influence) was the basis of the ‘logical positivist’ theories of meaning and of meaningfulness. Logical positivists enjoyed pointing out that Heidegger’s famous assertion “Das Nichts nichtet” (‘The nothing nothings’) was associated with no particular ways the world might be that would make it either true or false, and concluded that this statement, along with many others in metaphysics (e.g., McTaggart’s assertion that time is unreal), was meaningless. But it seemed that baby and bathwater alike were being flushed down the drain. Coupled

with a rather narrow and ferocious empiricism, this criterion ruled out as meaningless a number of assertions that were clearly acceptable. What empirical truth conditions are there now for statements about the past, or for assertions about invisible subatomic particles? But this may be a problem more about the logical positivists’ narrow empiricism than about their theory of meaning/meaningfulness. More germane here is the problem that many perfectly meaningful sentences have no truth conditions because they’re neither true nor false: ‘Please pass the salt,’ for example.
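The functional picture described in this section can be illustrated informally. The following Python sketch treats ‘x/3 = 9’ and ‘x is wise’ as functions from arguments to truth values, and a predicate modifier as a function from predicates to predicates. The set WISE, the function names, and the modifier negate (a simple stand-in, not a genuine adverb) are assumptions introduced for the example; nothing here is drawn from Tarski or from any author discussed above.

```python
from typing import Callable, Set

# Hypothetical toy world: the individuals that happen to be wise.
WISE: Set[str] = {"Spinoza", "Socrates", "Bertrand Russell"}

# A predicate is modelled as a function from a name to a truth value.
Predicate = Callable[[str], bool]


def third_is_nine(x: float) -> bool:
    """The arithmetic function 'x/3 = 9': a number in, a truth value out."""
    return x / 3 == 9


def is_wise(name: str) -> bool:
    """The predicate 'x is wise': a name in, a truth value out."""
    return name in WISE


def negate(pred: Predicate) -> Predicate:
    """A modifier: takes a predicate as argument, yields a predicate as value."""
    return lambda name: not pred(name)


is_not_wise = negate(is_wise)

print(third_is_nine(27), third_is_nine(16))
print(is_wise("Spinoza"), is_not_wise("Spinoza"))
```

On this sketch the ‘meaning’ of each expression is nothing more than the pairing of arguments with truth values, which is also why it has no place for a sentence such as ‘Please pass the salt.’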

Sense and Reference

Because of the Hesperus/Phosphorus problem mentioned above, Frege rejected the idea that the meaning of an expression is the thing it refers to. So Frege distinguished the thing to which a symbol referred (in his words, the Bedeutung, the ‘referent’ or ‘nominatum’) from what he counted as the meaning (the Sinn, usually translated as the ‘sense’) of the symbol, expressed by the symbol. The sense of a symbol, according to Frege, corresponded to a particular way the referent was presented. It might be called the ‘way of thinking about’ the referent. While his theory separated meaning from reference, nevertheless it can be considered a ‘mediated reference theory’: senses are ways a reference would be encountered, ways of getting to things from the words that refer to them. But it is the reference of included terms, not their sense, that determines the truth value of the sentence. Frege’s approach led him to a problem with sentences such as these:

(1) Fred said, “Venus is bright tonight.”
(2) Fred believes he’s seeing Venus.

Both sentences include examples of ‘opaque context,’ a part in the sentence in which substitution of a co-referring term can change the truth value of the sentence. In each of these sentences, substituting ‘the second planet from the sun’ for ‘Venus’ may make a true sentence false, or a false one true. In (1), an example of ‘direct discourse,’ if Fred’s very words did not include ‘the second planet from the sun,’ then that substitution can make a true sentence into a false one. That substitution in (2) may result in a false sentence if Fred believes that Venus is the third planet from the sun. Frege’s solution to this problem is to stipulate that in direct discourse – word-for-word quotation – the expression quoted refers to ‘itself,’ rather than to its usual referent (in this case, Venus). And in belief contexts and some other opaque contexts, expressions refer to their ‘senses,’ not to their usual referents.
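The contrast between transparent and opaque contexts can be made concrete with a small Python sketch. It is only a toy model under stated assumptions: the dictionary REFERENT, the sets BRIGHT_TONIGHT and FRED_BELIEVES_HE_SEES, and both function names are hypothetical, and the expression itself is used as a crude stand-in for its Fregean sense.

```python
# Toy reference relation (hypothetical labels): two expressions, one referent.
REFERENT = {
    "Venus": "venus",
    "the second planet from the sun": "venus",
}

# Facts about the world, stated in terms of referents.
BRIGHT_TONIGHT = {"venus"}

# Fred's beliefs, stated in terms of the expressions he would accept.
# Using the expression as a proxy for its sense is an assumption of this sketch.
FRED_BELIEVES_HE_SEES = {"Venus"}


def is_bright_tonight(expression: str) -> bool:
    """Transparent context: truth depends only on the referent."""
    return REFERENT[expression] in BRIGHT_TONIGHT


def fred_believes_he_sees(expression: str) -> bool:
    """Opaque context: truth depends on how the referent is presented."""
    return expression in FRED_BELIEVES_HE_SEES


for term in REFERENT:
    print(term, is_bright_tonight(term), fred_believes_he_sees(term))
```

Substituting one co-referring expression for the other leaves the first function’s value unchanged but can change the second’s, which is the behavior that Frege’s stipulation about belief contexts is meant to accommodate.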

But what, exactly, are these ‘senses’? First, it is clear that Frege did not intend them to be the ideas anybody associates with words. Frege’s ‘senses’ are objective: real facts of the language whose conventions associate senses with its symbols. One may have any sort of association with a bit of language, but the conventions of the language specify only certain of them as meaning-related. (Therefore, Lewis Carroll’s Humpty Dumpty does not succeed in meaning ‘There’s a nice knock-down argument for you’ with ‘There’s glory for you,’ even though he insists ‘‘When I use a word it means just what I choose it to mean – neither more nor less.’’) But why not admit that in addition to the public languages there can be ‘private’ ones with idiosyncratic senses? More will be said about this later. But second, neither can ‘senses’ be characteristics of the things referred to: for then, whatever is a sense of ‘Hesperus’ would be a sense of ‘Phosphorus.’ Furthermore, there appear to be symbols in a language that have sense but no reference: ‘the present king of France’ and ‘Atlantis’ are examples. Reference-less terms pose a problem for Frege. Should he consider them words with sense but no reference? If so, then how can they figure in sentences with a truth-value? (This is similar to the ‘Santa Claus’ problem.) A promising approach seems to be to say that the sense of a term ‘T’ consists of those characteristics judged to be true of things that are called ‘T’ by competent speakers of the language. But this immediately creates a problem with proper names. If ordinary proper names have senses – associated characteristics that are the way of presenting the individual named, associated conventionally with that name by the language – then there would be corresponding definitional (hence necessary) truths about individuals referred to by proper names. But this is problematic. 
The sense of a name applying to one individual cannot be the sense of the name of any other individual, because senses are the way terms pick out their designata. So the characteristics associated with a name would have to constitute that individual’s ‘individual essence’ – characteristics uniquely and necessarily true of that individual. But many philosophers have doubted that there are any such characteristics. Even if we can find characteristics that uniquely designate an individual, Kripke (1972) influentially argued that these characteristics are never necessary. Suppose, for example, that ‘Aristotle’ carries the sense ‘Ancient Greek philosopher, student of Plato, teacher of Alexander the Great.’ It would follow that this determined the referent of ‘Aristotle’; so if historians discovered after all that no student of Plato’s ever taught Alexander the Great, then ‘Aristotle’ would turn out to be a

bearer-less proper name, like ‘Santa Claus.’ But this is not how the proper name would work. Instead, we would just decide that Aristotle did not teach Alexander after all. Kripke argues that all sentences predicating something of a proper-named individual are contingent, so proper names do not have senses. But, of course, they are meaningful bits of language. This problem may apply even more broadly than merely to proper names. Consider the meaning of the term ‘water.’ Back before anyone knew the real properties of what made water water – i.e., its chemical constitution – competent speakers applied the term to any colorless, odorless liquid. But they were sometimes wrong, because the characteristics then used to distinguish proper and improper applications of the term, although they happened to pick out water on the whole, were not the genuinely necessary and sufficient conditions for something to be water at all. In those days, then, the real sense of the word was totally unknown and unused by any competent speaker of the language. Quine argued that Frege’s senses lack what is necessary for well-behaved theoretical objects. We have no idea, for example, of their identity conditions: is the sense of this word the same as the sense of that? More about Quine will be discussed in the final section of this article.

The Idea Theory

The theories discussed so far consider what linguistic elements mean, but other classical theories have concentrated on what people mean by bits of language. Words, Locke argued, are used as ‘sensible marks’ of ideas; the idea corresponding to a word is its meaning. This has a certain amount of intuitive plausibility, in that non-philosophers think of language as a way of communicating ideas that is successful when it reproduces the speaker’s idea in the hearer’s mind. The ‘idea theory’ of meaning received its fullest expression in conjunction with British Empiricist epistemology. For the classical empiricists, our ideas were copies of sense-impressions – presumably similar to the sense experiences themselves, except dimmer. These mental representations served us as the elements of thought and provided the meanings for our words.

However, problems with this theory are obvious. For one thing, not every such association is relevant to meaning. For example, the word ‘chocolate’ makes me picture the little shop in Belgium where I bought an astoundingly impressive chocolate bar. But although one might want to say, in a sort of loose way, that that’s what ‘chocolate’ means to me, it doesn’t seem to be a real part of the word’s meaning.

Someone else could be completely competent in using that word without any mental pictures of Belgium. Also, there are some meaningful terms that seem to be associated with no mental imagery, for example, ‘compromise.’ The problem of the meaning of ‘unicorn’ is solvable: images of a horse’s shape and an antelope’s horn can be mentally pasted together to provide a representation; but there are other problems. My image of my cat Tabitha might picture her facing right; but I’m to use this also to identify her as the bearer of that name when she’s facing left; so the mere association of a word with an image is not enough to give that word meaning. There must also be some procedure for using that image. Common nouns (‘dog’) could stand for any and all dogs, whereas the meaning of ‘dog’ was presumably a particular image of a particular dog. More of a theoretical mechanism is needed to explain why this image stands for Fido and Rover but not for Tabitha. And other sorts of words – logical words, prepositions, conjunctions, etc. – raise problems here too: how could sensory experiences be the source of their meaning? A more recent concern about the idea theory arose from the fact that the ideas that gave bits of language their meaning were private entities, whereas the meanings of a public language were presumably public. Clearly I would learn the word ‘cat’ by hearing you use the word in the presence of cats, but not in their absence; but according to the idea theory, I would have learned it correctly when my private image matches yours – something that is impossible for either of us to check. What we can check – identical identifications of objects as cats and non-cats – does not ensure identical private imagery and (according to the idea theory) has nothing directly to do with the meaning we invest ‘cat’ with anyway. Wittgenstein’s ‘private language argument,’ deployed against the idea theory, was considered devastating by many philosophers. 
Very briefly put, this argument is that the meaning of a public bit of language could not be given by a supposedly necessarily private item, such as a mental representation, because this would make impossible any public check – any real check at all – on whether anyone understood the meaning of a term; and without the possibility of a check, there was no distinction between getting the meaning right and getting it wrong.

Meaning as Use

Wittgenstein’s hugely influential suggestion was that we think instead of sentences as “instruments whose senses are their employments” (1953: 421). Starting in the 1930s and 1940s, philosophers began thinking of the meaning of linguistic items as their potential for particular uses by speakers and attempting to isolate and describe a variety of things that people do with words: linguistic acts, accomplished through the use of bits of language. One clear theoretical advantage of this approach over earlier ones was its treatment of a much wider variety of linguistic function. Whereas earlier approaches tended to concentrate on information giving, now philosophers added a panoply of other uses: asking questions, giving orders, expressing approval, and so on. This clearly represented a huge improvement on the earlier narrower views, which tried to understand all the elements of language as signs – representations – of something external or internal.

Austin distinguished three kinds of things one does with language: (1) the ‘locutionary act,’ which is the utterance (or writing) of bits of a language; (2) the ‘illocutionary act,’ done by means of the locutionary act, for example, reporting, announcing, predicting, admitting, requesting, ordering, proposing, promising, congratulating, thanking; and (3) the ‘perlocutionary act,’ done by means of the illocutionary act, for example, bringing someone to learn x, persuading, frightening, amusing, getting someone to do x, embarrassing, boring, inspiring someone. What distinguishes illocutionary acts is that they are accomplished just as soon as the hearer hears and understands what the utterer utters. It is clear that the illocutionary act is the one of these three that is relevant to the meaning of the utterance. The performance of the act of merely uttering a sentence obviously has nothing to do with its meaning. Neither does whatever perlocutionary act is performed: you might bore someone by telling her about your trip to Cleveland or by reciting 75 verses of The faerie queene, but the fact that both of these acts may serve to bore the hearer does not show that they are similar in meaning.
But similarity in meaning is demonstrated by the fact that two different locutionary acts serve to accomplish the same illocutionary act. For example, ‘Do you have the time?’ and ‘What time is it, please?’ perform the same illocutionary act (a polite request for the time) and are thus similar in meaning. However, this approach does not deny that what the other theories concentrated on is a significant part of language. In Austin’s classification, one part of many locutionary acts is an act of referring; when I say, ‘‘Aristotle was the student of Plato,’’ I’m probably performing the illocutionary act of informing you, but I’m doing that by means of the locutionary act of referring to Aristotle and Plato. And clearly many linguistic acts include ‘propositional content’: one reports, announces, predicts, admits, requests, orders, proposes, promises, and so on, ‘that p,’ so it seems

that inside this theory we would need an account of the way any of these linguistic acts correspond to actual or possible states of the world. A recent influential theory from Donald Davidson responds to these needs by combining, in effect, speech act theory with Tarskian semantics. According to Davidson’s proposal, the list of truth conditions for each assertion in a language provides an account of the language’s semantics – at least, of the propositional content of sentences in the language: what a statement, prediction, assertion, question, etc., states, predicts, asserts, asks, etc. This explains what is shared by ‘The light is turned off,’ ‘Turn off that light,’ ‘Is the light turned off?’ and so on. But secondly, according to Davidson, a theory of meaning needs a ‘mood indicator’ – an element of the sentence that indicates the use of that sentence – e.g., as a statement, request, or question.
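The two-part analysis just described, a shared propositional content plus a mood indicator, can be pictured as a simple data structure. The Python sketch below is illustrative only: the class names Mood and Utterance, the content label, and the helper same_content are assumptions made for the example and are not Davidson’s own formalism.

```python
from dataclasses import dataclass
from enum import Enum


class Mood(Enum):
    """Hypothetical mood indicators: how a sentence is being used."""
    STATEMENT = "statement"
    REQUEST = "request"
    QUESTION = "question"


@dataclass(frozen=True)
class Utterance:
    mood: Mood
    content: str  # a label standing in for the truth-conditional content


LIGHT_OFF = "the light is turned off"

# The three example sentences from the text, analysed into mood plus content.
SENTENCES = {
    "The light is turned off.": Utterance(Mood.STATEMENT, LIGHT_OFF),
    "Turn off that light.": Utterance(Mood.REQUEST, LIGHT_OFF),
    "Is the light turned off?": Utterance(Mood.QUESTION, LIGHT_OFF),
}


def same_content(a: Utterance, b: Utterance) -> bool:
    """True when two utterances share propositional content, whatever their mood."""
    return a.content == b.content


statement = SENTENCES["The light is turned off."]
request = SENTENCES["Turn off that light."]
print(same_content(statement, request), statement.mood is request.mood)
```

The sketch shows only the division of labor: the content field is where a Tarskian truth condition would attach, while the mood field records the use to which the sentence is put.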

Quine’s Skepticism

Quine’s skepticism about meanings was among the most important 20th-century developments in the philosophy of language. Here is one way of approaching his position. Imagine a linguist trying to translate a tribe’s language. Suppose that the tribesmen said “gavagai” whenever a rabbit ran past. Does ‘gavagai’ mean ‘lo, a rabbit!’? The evidence might as well be taken to show that ‘gavagai’ asserts the presence of an undetached rabbit part, a temporal slice of a rabbit, or any one of a number of other alternatives, including even ‘Am I ever hungry!’ or ‘There goes a hedgehog!’ (if the tribesmen were interested in misleading you). Quine called this the ‘indeterminacy of translation.’

But, we might object, couldn’t further observation and experimentation decide which of these alternatives is the right interpretation? No, argued Quine, there are always alternatives consistent with any amount of observation. But, we might reply (and this is a more basic objection), what that shows is that a linguist never has absolutely perfect evidence for a unique translation. This is by now a familiar (Quinean) point about theory: theory is always to some extent underdetermined by observation. In any science, one can dream up alternative theories to the preferred theory that are equally consistent with all the evidence to date. But in those cases, isn’t there a right answer – a real fact of the matter – which, unfortunately, we may never be in a perfect position to determine, because our evidence must always be equivocal to some extent? At least in the case of meaning, argued Quine, the answer is no, because for Quine, linguistic behavior is all there is to language, so there are no hidden facts about meaning to discover, with linguistic behavior as evidence. So

meanings are not given by ideas in the head, Fregean senses, or anything else external to this behavior. A similar sort of position was argued for more recently by Kripke (1982). Imagine that someone – Fred – has used a word ‘W’ to refer to various things, A, B, and C. Now he encounters D: is that referred to by ‘W’? One wants to say: if D is like A, B, and C, then Fred should go on in the same way and apply ‘W’ to D. But Kripke argues that there is no fact about Fred’s intentions or past behavior – no fact about what he means by ‘W’ – that would make it correct or incorrect for him to apply ‘W’ to D. Neither is there, in the external world, a real sufficient (or insufficient) similarity of D to A, B, and C that make it correct (or incorrect). The only thing that would make that application correct or incorrect is what a community of speakers using that word would happen to agree on. But does anti-realism about meaning really follow from these considerations? Suppose that Quine and Kripke are right, and all there is to language is social behavior. But maybe this does not imply that meanings are unreal. When we consider an action as social behavior, after all, we do not think of it (as Quine, in effect, did) merely as bodily movements. There are facts about the social significance of the behavior, above and beyond these movements, that give the movement its social significance. Perhaps it is these facts that would make one linguistic theory rather than another correct – that determine the meaning of the noises made by Quine’s natives, and whether Fred is following the linguistic rule when he applies ‘W’ to D. Language is a tool we know how to use, and the real meaning of our utterances is what we know when we know how to use that tool. 
See also: Descriptions, Definite and Indefinite: Philosophical Aspects; Direct Reference; Empiricism; Empty Names; Expression Meaning versus Utterance/Speaker Meaning; Ideational Theories of Meaning; Indeterminacy, Semantic; Intention and Semantics; Mood, Clause Types, and Illocutionary Force; Nominalism; Private Language Argument; Proper Names: Philosophical Aspects; Radical Interpretation, Translation and Interpretationalism; Realism and Antirealism; Reference: Philosophical Theories; Rigid Designation; Sense and Reference: Philosophical Aspects; Speech Acts; Thought and Language: Philosophical Aspects; Truth Conditional Semantics and Meaning; Use Theories of Meaning; Use versus Mention.

Bibliography

Alston W P (1964). Philosophy of language. Englewood Cliffs, NJ: Prentice-Hall.
Austin J L (1962). How to do things with words. Urmson J O & Sbisà M (eds.). Cambridge, MA: Harvard University Press.
Blackburn S (1984). Spreading the word: groundings in the philosophy of language. Oxford: Clarendon Press.
Davidson D (1967). ‘Truth and meaning.’ Synthese 17, 304–323.
Frege G (1892). ‘On sense and reference.’ In Geach P & Black M (eds.) Translations from the philosophical writings of Gottlob Frege. Oxford: Basil Blackwell.
Grice H P (1957). ‘Meaning.’ Philosophical Review 66, 377–388.
Kripke S (1972). ‘Naming and necessity.’ In Davidson D & Harman G (eds.) Semantics of natural language. Dordrecht: Reidel.
Kripke S (1982). Wittgenstein on rules and private language. Cambridge, MA: Harvard University Press.
Martin R M (1987). The meaning of language. Cambridge, MA: The MIT Press.
Mill J S (1872). A system of logic, book I (8th edn.). London: Longmans.
Quine W V O (1951). ‘Two dogmas of empiricism.’ In From a logical point of view. Cambridge, MA: Harvard University Press.
Quine W V O (1960). Word and object. Cambridge, MA: The MIT Press.
Russell B (1905). ‘On denoting.’ Mind 14, 479–493.
Searle J R (1969). Speech acts. Cambridge: Cambridge University Press.
Stainton R J (1996). Philosophical perspectives on language. Peterborough, ON: Broadview Press.
Wittgenstein L (1953). Philosophical investigations. Anscombe G E M (trans.). Oxford: Basil Blackwell.


Meaning: Pre-20th Century Theories
G Haßler, Potsdam, Germany
© 2006 Elsevier Ltd. All rights reserved.

Early Theories of Meaning

The problem of the basis on which linguistic signs mean something and can designate real objects or relations is a central subject of theories of language. The starting point of debates on this matter might be the dialog Cratylus by Plato (428/7–349/8), in which two opposite positions are contrasted: the conventionalist position (thesei), which ascribes the imposition of names to a voluntary convention between people, and the naturalist position, which supposes a natural denomination of an object depending on its properties (physei). Plato introduced Socrates (470/69–399) as a mediator between these two positions and presented them as complementary. For Aristotle (384/3–322), names are kata syntheken: unlike the unarticulated sounds of animals, they are historically imposed. The Aristotelian concept of the sign, which corresponds to the notion of arbitrariness in later theories, implies semantic function as well as the relation between the name and the designated reality. During the following centuries, several changes brought to the fore this relation between sounds and objects and its genetic explanation and partially disregarded the question of how signs mean something. The idea that linguistic signs were not naturally motivated, but invented by voluntary imposition, found different expressions (non natura sed ad placitum, ex arbitrio, arbitrarius, willkürlich, arbitraire, arbitrary). The explanation of the semantic function could then concentrate on the description of this voluntary imposition. For the medieval authors of modi significandi a sound (vox) became a word by its meaning (ratio significandi). The bilateral character of the linguistic sign was expressed in this context by the opposition of significatum and quod significatur, which appeared in several variations (Thomas of Erfurt, Grammatica speculativa, I, §3: signum, vel significans; Martinus de Dacia: res designata/vox significans).
The relation between both sides of the linguistic sign was presented as a problem by nominalist theories. The Logic of William of Ockham (1285?–1349?) was mainly based on semantic observations of expressions that have meanings. The signification, corresponding to the result of the previous formation of concepts, was differentiated into suppositions concerning the actual and contextual relation of a sentence. In this manner, the significatio as a potential of meaning was opposed to the suppositio, the actualized meaning. This consideration of meaning by the nominalists implied a higher degree of independence of meaning with regard to reality. The doctrine of St. Augustine (354–430) led to another conception of meaning and its relation to reality. A main tenet of the Augustinian rationalist doctrine was the merely spiritual nature of all notions and of the relations between them. The denotation of a term was regarded as a mental object that could only have a representational relation to the word and could not depend on linguistic signs and their corporeal nature. The form words obtain in the different languages was regarded as arbitrary, whereas the composition of the concept was universal and did not depend on sensations. For the rationalist thinkers, the necessity of language appeared only with communication between people when the transmission of pure incorporeal notions was impossible. But linguistic signs met the necessities of communication in a very insufficient way because intuitive conceptions overwhelmed our mind, whereas their linguistic signs distracted from their content and slowed down the process of thinking. Although words had different forms in different languages, the ideas designated were neither Greek nor Latin but universal and independent of any sensual support. This rationalist theory limited the impact of language on cognition and, in the same way, the trustworthiness of sensory cognition. But in some authors we find the opposite perspective. Let us mention the Spaniard Luis de León (1527–1591), who wanted to achieve knowledge about the nature of religious concepts by studying the denominations of Christ in the Bible. The basis of his De los nombres de Cristo was a semantic theory that supposed a capacity of representation of the denominated (nombrado) by the name (nombre) in cognitive processes.
By the separation of the original status of language from the development of historical languages, the validity of the biblical doctrine on the origin of language was clearly established and the possibility of reflection on arbitrary signs based on human reason was opened.

Intension and Extension in Port-Royal Logic

The Grammar (1660) and the Logic (1662) of Port-Royal took up and developed semantic concepts that exercised an important influence on the further development of semantic theories. The authors of the Port-Royal Logic, Antoine Arnauld (1612–1694) and Pierre Nicole (1625–1695), treated the problem of the arbitrary sign in its complexity, regarding words not as natural (signes naturels) but as conventional signs that functioned as grounded on tradition (signes d’institution et d’établissement). But the establishment of a convention presupposed the existence of already formed ideas, which could be denominated by words chosen conventionally. The primacy of thought set prerequisites for the interpretation of arbitrariness. What could be regarded as arbitrary were not ideas themselves but only the attribution of sounds to these ideas, which would exist even without them. The meanings of the linguistic signs did not appear in the moment of their conjunction with sounds but, rather, they existed as clear and reasonable ideas and were independent of their denomination. The fact that convention varied in different languages could prove easily that it had nothing to do with the nature and formation of ideas. But this conclusion, which derived from the rationalist thesis of innate ideas, was not the only restriction on the arbitrariness of signs. The authors of the Logic declared that the relation between sound and meaning was only arbitrary in the individual use of language but that it was determined by common use through communication between people. It seems remarkable that the assignment of sounds to meanings was regarded as a relation between two mental entities. The idea of the psychological nature of both sides of the linguistic sign is obviously not a product of the beginning of the 20th century. In the examination of meaning, Arnauld and Nicole distinguished the intension (compréhension) and the extension (étendue) of a sign. By intension they meant the totality of features that constituted an idea, none of which might be taken away without destroying the idea. Extension was defined as the totality of objects, notions, and subnotions to which an idea could refer.
What determined the identity of a meaning was not its reference to different objects and notions but the intension. Although it was always possible to limit the extension of a word, leaving out a feature of its intension would make it lose its identity. The reference point for the intensional definition of meaning was language as a product and not its use in communication. The meaning was obligatory in its features and was determined by convention, whereas the extension depended on the use of the word to denote either a whole class of objects or an individual. The discussion of the actualization of conventional signs in use leads to another distinction in the Port-Royal Logic. Arnauld and Nicole distinguished between a proper meaning (signification propre) and accessory ideas (idées accessoires). The latter derived from the dynamic character of meaning in communication in which accessory ideas were evoked. For example, the sentence you have lied has the same principal (proper) meaning as you know that the contrary of what you are saying is true, but it evokes the accessory ideas of insult and contempt. Accessory ideas were not ascribed to a word by common use, but they appeared in the individual use of language. Repetition could lead to generalization of such accessory ideas and, as the result of this process, accessory ideas might be linked habitually to words, especially to synonyms. But in this case, we would have to take into account the inconsistency of such accessory ideas. The differential function of the accessory ideas was highly estimated by the authors of Port-Royal. They even called for them to be explained in dictionaries. This assertion of distinct words with similar meanings helped to prepare dictionaries of synonyms that became popular in many European countries. Arnauld and Nicole also discussed oppositions and proposed a classification of them. They called ‘divisions’ such pairs of opposites that covered the whole extension of a notional field and could be related to a hyperonym (pair/impair → nombre). In cases in which the opposites were divided by a zone of indifference, they even described the field of this zone, including the antonyms in a narrow sense (Table 1). The confusion of these two types of oppositions was regarded as harmful for logical conclusions, if a person took the negation of a term as its opposite and did not take into account the zone of indifference. For a further differentiation, Arnauld and Nicole gave the terms of opposites shown in Table 2. The Logic of Port-Royal contained a comprehensive theory of different semantic oppositions and even attempts at a systematic description of the vocabulary.

Table 1 Opposites and zones of indifference

Opposites | Zone of indifference
sain/malade | indisposé, convalescent
jour/nuit | crépuscule
avarice/prodigalité | libéralité, épargne louable, générosité

Table 2 Kinds of oppositions

Name of the opposition | Examples
Termes relatifs | père/fils, maître/serviteur
Termes contraires | froid/chaud; sain/malade
Termes privatifs | vie/mort; vue/aveuglement
Termes contradictoires | voir/ne pas voir


The Recognition of the Historical and Cultural Nature of Meaning

The Recognition of the Genius of a Language

At the end of the 17th century, the notion of the genius of the language (génie de la langue) became very fashionable. It appeared in systematizations of Vaugelas’s Remarques sur la langue française by Louis Du Truc (1668), Jean Menudier (1681), and Jean d’Aisy (1685) and was taken up by grammarians and philosophers as well. The ‘Essay concerning human understanding’ (1690) by John Locke (1632–1704) gave a new answer to the question of how thought could be influenced by language. According to Locke, linguistic signs did not represent the objects of knowledge but the ideas that the human subject created. The nominalist explanation of complex ideas led to the denial of innate ideas and to the supposition of a voluntary imposition of signs on a collection of simple ideas for which there was no pattern in reality. This rendered possible an extension of the concept of arbitrariness to the composition of meaning. Language was no longer regarded as a simple expression of a universal reason but as a system that organized thought following historical and social principles. Bernard Lamy (1640–1715), who was influenced by the Port-Royal authors but found a way to integrate their theory into the assertion of a sensual basis of human cognition, remarked on the different division of lexical areas in various languages. In his Rhetoric (La rhétorique ou l’art de parler, 1699), he explained the quantity and differentiation of words by the attention different peoples give to some fields of knowledge. In this way, he explained the existence of more than 30 words to denote ‘camels’ in Arabic as well as the fact that people who cultivated the sciences and arts had a developed and highly differentiated vocabulary. Especially in the formation of diminutive and augmentative denominations, languages differed considerably: In Italian, for example, many diminutives had developed, whereas they were absent in French.
Because languages used different points of view in the denomination of things and concepts, a word-by-word translation from one language into another was nearly completely impossible. This was an opinion that was largely shared by 18th-century authors. If a person looked at the motives for the formation of words, the differences between languages became even more obvious. Lamy used an old example for the argumentation on this subject: the words for ‘window’ in the Romance languages that are derived from three Latin etyma. The Spanish ventana (< ventus) emphasizes the blowing of the wind, the Portuguese janela (< janua) uses a comparison with a gate, and the French fenêtre (< fenestra < Greek phainein ‘to shine’) uses the transmission of light. This meant that the same word, which stood for a fundamental idea, had different significations in different languages.

The Study of Metaphors

With the positive evaluation of the cognitive force of sensation and emotional speech, the problem of metaphors became more important. César Chesneau de Du Marsais (1676–1756) dedicated a digression in his book Des Tropes ou des différents sens dans lesquels on peut prendre un même mot dans une même langue (1730) to the study of metaphors and idioms in various languages. He started with the difficulties of translation of lexical metaphors and the lack of assistance from dictionaries. Although it was possible to find an equivalent for the proper meaning (sens propre) for which the word had originally been installed, it was often impossible to translate all the figurative meanings (sens figurés) of words. The use of words in their metaphorical meaning was not regarded as an exception. Naming an idea by a sign that was related to an associative idea was a regular case in language. In addition, the use of words in their figurative meanings could fill the gaps in vocabulary. The French word voix, for example, had, in addition to its proper meaning ‘the sounds emitted by the mouth of animals, and especially by human mouth,’ several figurative senses: (1) ‘inner inspiration or pangs of conscience’ in the sentence le mensonge ne saurait étouffer la voix de la vérité dans le fond de nos cœurs (‘the lie will not be able to suffocate the voice of truth in the deep of our hearts’), (2) ‘inner sensations’ in word groups such as la voix du sang (‘the voice of blood’) and la voix de la nature (‘the voice of nature’), and (3) ‘opinion, view, judgment.’ Not all of these figurative meanings of voix could be translated by the Latin word vox. In the same manner, it would not be possible to translate porter ‘to carry’ by ferre in the case of figurative uses such as porter envie, porter la parole, and se porter bien.
Du Marsais criticized the usual practice in French-Latin dictionaries, which gave just a series of verbs (in the case of porter: ferre, invidere, alloqui, valere) without mentioning the specificity of their semantic qualities. If dictionaries continued in this way, they might arrive at the explanation of aqua ‘water’ by feu ‘fire,’ simply because the cry for help regarding a fire in Latin was aquas aquas and in French au feu. From a certain referential meaning of a word, it was not possible to infer its essential semantic property. As Du Marsais claimed, dictionaries should first of all give the significations propres and then explain the figurative meanings that could not be deduced from the proper one. This explanation should be given by examples and by grouping the figurative meanings according to the given state of language. But a historical explanation of figurative meanings should also be possible because all of them developed out of the proper meaning that was regarded as the original one. The manner of such a development might differ in various languages, which would lead to a complicated picture of the relations between figurative meanings of words in two languages. In his Fragment sur les causes de la parole, Du Marsais arrived at an important theoretical conclusion concerning the semantic function of words. He supposed meaning to be a systematic virtual property of language and opposed it to the realizations of words and their meanings in language use. The systematic property, called la valeur des mots by Du Marsais, was acquired by education and contact with other people. It corresponded to an abstraction of the different senses a word might have.

The Study of Synonymy

The study of synonymy in 17th- and 18th-century linguistics did not concern just supplying different means for the expression of the same idea but mainly differentiating the meanings of synonyms and determining the exact definition of the meaning of every single word. In addition to the theoretical interest that synonyms as a simple systematic phenomenon represented, practical needs contributed to the study of synonymy. The starting point was the doctrine of a verbum proprium, an appropriate word for each purpose, which should be defined by its invariable semantic properties. Defining synonyms became an aristocratic game in France in the 17th century and was disseminated throughout Europe. The most influential work was written by Gabriel Girard (1677/8–1748) (La Justesse de la langue française, ou les différentes significations des mots qui passent pour synonymes, 1718; Synonymes français, leurs significations et le choix qu’il faut en faire pour parler avec justesse, 1736). Girard declared explicitly that the language of a certain state formed a system, despite the unsystematic character of the formation of languages. In this context, he defined the valeur of a word as its correct meaning, which corresponded to the current use in the language. He started from a rationalistic position and supposed that ideas were independent of language and only had to be denominated by words. The valeur of a word consisted in the representation of ideas that the language use had related to it, and therefore it was determined by a social convention or an explicit individual imposition. But in the description of synonyms, Girard mainly regarded the differences of their values. Following his approach, synonyms were words that expressed a common idea but were distinctive in the expression of accessory ideas. The similarity of their meanings did not encompass the whole area of their significations. The accessory ideas gave a special and proper character to every synonym and determined its correct use in a certain situation. The richness of a language was therefore not only determined by a multitude of words but by the distinctions between their meanings and by precision in the expression of simple and complex ideas. In Girard’s system of synonyms, only the valeur counted, and it was described by the relations of a synonym to words with similar significations. He rejected the opinion that synonyms should only help to avoid monotony by variation in sound. In the practical distinction of synonyms, Girard presented the similarities and distinctions of synonyms using the genus proximum–differentia specifica scheme of antiquity and scholasticism. He tried not to create differences that could not be observed in language use; however, in many cases he attempted to establish a logical relation between synonyms, for example, a gradation (ordinaire, commun, vulgaire, trivial), a purpose-means relation (projet – dessein), or causal relations (réformation – réforme). Girard’s description of complex lexical structures showed that the term synonymy could be used in 18th-century texts for lexical fields and their structure. The words designated as synonyms could be different in their intension and extension, showed use restrictions, or entered into various oppositions.
On this basis, Girard distinguished the elements of the semantic field of intellectual qualities, establishing semantic compatibilities and determining features of meaning for every word. Apart from a substantial description of the meaning of words, Girard also supplied oppositions, such as:

esprit – bêtise
raison – folie
bon sens – sottise
jugement – étourderie
entendement – imbécilité
conception – stupidité
intelligence – incapacité
génie – ineptie

In Girard’s doctrine, the value (valeur) of a word is a use-independent semantic property that could be described by its relations to other words. In the use of language, it allowed different significations to be produced. Girard’s approach was widely disseminated in Europe. It was used, for example, by Johann August Eberhard (1730–1809) in his Synonymisches Handwörterbuch der deutschen Sprache für alle, die sich in dieser Sprache richtig ausdrücken wollen (1802). In Spain, synonymy became an important field of discussion, as expressed in works such as the Ensayo de los synonimos by Manuel Dendo y Avila (1756).

Condillac’s and the Idéologues’ Semantics

Etienne Bonnot de Condillac (1714–1780) formulated a coherent sensationalist theory of cognition by substituting for Locke’s dualist explication of sensation and reflection the concept of transformed sensation (sensation transformée), which helped to explain even complex thought as being made up of simple sensations. The instrument allowing this transformation was language, to which Condillac attributed an important role in human thought. The signs of human language operated according to the principle of analogy, which corresponded to a motivated relation between signs of analogous content. Instead of signes arbitraires, and in order to emphasize the genetic character of language, Condillac used the term signes d’institution, finally preferring in his Grammar (1775) the denomination signes artificiels. An arbitrary relation existed not only between sounds and the ideas related to them but in the composition of complex ideas. This arbitrariness was relativized by the long historical process of imposition of signs, in which a language evolved its specific shape or genius. This specificity concerned the functions of languages as analytic means and it became important in the discussions of the Idéologues of the end of the 18th century. The linguistic ideas of the Idéologues have mainly been studied in relation to theories they took up and modified, as well as from the point of view of the continuation of their legacy by later linguistic theorists. Whether the Idéologues really were representatives of a transitional thinking that rendered possible the explication of the school grammar categories of the 19th and 20th centuries has also been discussed. They were considered, moreover, as the starting point and the background for several theoreticians, for example Wilhelm von Humboldt (1767–1835) and Ferdinand de Saussure (1857–1913), for whom language was, above all, an instrument of the articulation of thought. But the Idéologues themselves did not admit that they had much continuity with the theory of Condillac; they even stressed their independence from all previous authors. This was an expression of a break with mechanical education.

Meaning in 19th-Century Linguistics

General Evolution

In this section we do not follow the fascinating history of reflection on meaning in general and on the meaning of words in particular in the 19th century; instead, we examine some approaches to semantic description. In the 19th century, the fascination with meaning led to a sudden increase in the number of books, treatises, and pamphlets on semantic topics, in the widest sense of the term. Books on words were widely read and shaped the popular image of semantics, thereby undermining its claims as a science. Even Bréal’s Essai de sémantique (1897) was regarded as entertaining or amusing. Conversations on etymology contributed to this general soft image of semantics in a century that is now considered the advent of historical and comparative linguistics, with a focus on the discovery of sound laws. Nevertheless, during the 19th century semantics was a very productive field. It led to innovations and went through three phases or stages (see Nerlich, 1992: 3). During the first stage, questions about the origin of language (see the previous discussion of the proper signification, the Grundbedeutungen, or original meanings) were gradually replaced by the problem of a continuous evolution or transformation of language and meanings. The search for a true meaning was abandoned in favor of the search for the types, laws, or causes of semantic change. It was claimed that the meaning of a word was not given by its etymological ancestry but by its current use, and that omitting etymology was an important factor in the functioning and evolution of language. During the second stage, questions about the types and causes of semantic change were slowly replaced by reflections on the mechanism of communication, comprehension, and linguistic interaction between speakers and hearers in a particular situation or context. During the third stage, semantics merged with what we now call pragmatics; word meaning was seen as an epiphenomenon of the sentence meaning.
Even though these three stages did not emerge neatly separated, profound changes took place. Semantics shed its early historical ties to comparative philology to become more and more attached to other fields, such as psychology and sociology. The history of semantics in the various countries did not, in any way, unfold simultaneously. There were also major differences among influences on the development of semantics, stemming from various philosophical traditions, on the one hand, and from the various natural sciences (biology, geology, medicine, etc.), on the other. Finally, the development of semantics differed by the influence of other linguistic disciplines and fields such as rhetoric, classical philology, comparative philology, and grammar, in general, and etymology, lexicography, and phonetics, in particular. The birth of semasiology in Germany and sematology in England almost coincided. Semasiology was brought into being by Christian Karl Reisig’s (1792–1829) lectures, given in the 1820s and posthumously published in 1839; sematology was founded with the publication of Benjamin Humphrey Smart’s (1786–1872) treatise on sign theory, Outline of sematology (1831). In Germany, the maturation process of the discipline started with the edition of Friedrich Haase’s (1808–1867) lectures on semasiology in 1874, followed by the re-edition of Reisig’s work from 1881 to 1890. In England, sematology matured when, in 1857, the Philological Society of London decided to embark on a huge project: the New English Dictionary. In 1831, Smart already noted the similarity and difference between French Ideology and his sematology. Both were theories of signs, but French Ideology studied the development of ideas in isolation, whereas Smart was interested in the context and in meaning construction. At the end of the 19th century, a group of linguists adhered strictly to the doctrine of August Schleicher (1821–1868) and regarded linguistics as a natural science and language as an evolving organism. Similar metaphors were also used by Émile Littré (1801–1881), already in 1850, and by Darmesteter (1887), although in Arsène Darmesteter’s (1846–1888) case the influence of psychology on his semantic thought was soon outweighed by that of biology.
The psychological influence brought about a shift from an interest in the mere classifications of types or laws of semantic change to a search for the causes of semantic change. According to different versions of psychology, especially of Völkerpsychologie (psychology of people), developed by Heymann Steinthal (1823–1899) and Moritz Lazarus (1824–1903), on the one hand, and by Wilhelm Wundt (1832–1920), on the other, these causes were to be sought in different phenomena. Psychology became the midwife to the second and most successful kind of semantics, developed in France mainly by Michel Bréal (1832–1915). He too referred to Condillac and the conception of the sign put forward by the Idéologues, in order, however, to criticize those naturalists who saw language as a natural organism or studied the growth, life, and death of words. For Bréal, words as signs had no autonomous existence. They changed because they were signs of thought, which were created and used by the speakers of a language. In his lectures and articles, he argued, like William Dwight Whitney (1827–1894) in the United States, against the German way of doing linguistics – the search for an Ursprache (original language) and the study of language as an organism. He excluded from his criticism only Franz Bopp (1791–1867), citing his work as an example of careful methodology.

Semasiology in Germany

Between 1830 and the end of the 19th century, it is possible to detect an increasing broadening of the field of semasiology, from the atomistic study of words and the rather ill-conceived classification of types of semantic changes toward the recognition of the importance of cultural and social factors and the acknowledgment of context as a factor in semantic change. The logico-historico-classificatory tradition (Reisig; Haase; and Ferdinand Heerdegen, 1845–1930) was followed by the psychological phase (Lazarus and Steinthal) and the further development of semasiology (Max Hecht, 1857–?). Another landmark was the Völkerpsychologie of Wundt, who made possible a psychological tradition in semasiology. At the turn of the century, the time was ripe for new ideas in semantics, whether developed in response to Wundt or independently of his work (e.g., by Hermann Paul, 1846–1921; Philipp Wegener, 1848–1916; Stücklein, 19th century; and Karl-Otto Erdmann, 1858–1931).

The Development of the Sémantique in France

For Bréal, the history of a language was not just an internal history; it was intimately linked to political, intellectual, and social history, where creation and change were at any moment our own work. In his lectures (1866–1868), he showed that the meaning or function of words could survive the alteration of form. The force countering the alterations and filling the gaps was the human mind. Darmesteter asked himself what were the causes and laws of semantic change. The evolution of language at all its higher levels was based on what Darmesteter called, in accordance with Bréal, the disregard of etymology. What was here described as a loss of etymological adequacy was seen in La vie des mots étudiée dans leurs significations (1887) as the gain of an adequate signification for language use.

From Sematology to Significs in England

English semantics in the 19th century was not a theoretically well-established field. It consisted of two disjointed strands of thought (Nerlich, 1992: 207): sematology and semasiology. The first type of semantics was a predominantly philosophical one that emerged from thinkers such as Locke, John Horne Tooke (1736–1812), and Dugald Stewart (1753–1828), culminating in Smart. The second type was a predominantly practical one that sprang from lexicographers and etymologists such as Samuel Johnson (1709–1784), Noah Webster (1758–1843), and Charles Richardson (1775–1865), culminating in Richard Chenevix Trench (1807–1886), James Murray, and Walter William Skeat (1835–1912). By the end of the century, Lady Victoria Welby (1837–1912) had rekindled philosophical interest in semantic questions and fostered a new approach to the problem of meaning. This approach was called significs and constituted a return to Smart’s reflections on semiotics, but it lifted sematology, semasiology, and la sémantique onto a higher psychological, moral, and ethical plane.

Summary

To summarize this section (see Nerlich, 1992: 19), in Germany we can distinguish between two main approaches to semantics: an atomistic-historical attitude and a holistic-psychological one. In France, the historical sémantique sprang directly from psychological insights into language evolution. And in England, the psychological approach was marginal and the historical one dominant until the end of the century.

See also: Aristotle and the Stoics on Language; Augustine, Saint: Theory of the Sign; Bopp, Franz (1791–1867); Bréal, Michel Jules Alfred (1832–1915); Cognitive Semantics; Color Terms; Condillac, Etienne Bonnot de (1714–1780); Definition; Dictionaries and Encyclopedias: Relationship; Encyclopédie; Epistemology and Language; Etymology; Humboldt, Wilhelm von (1767–1835); Language Ideology; Lexical Fields; Locke, John (1632–1704); Metaphor: Philosophical Theories; Onomasiology and Lexical Variation; Paul, Hermann (1846–1921); Plato and His Predecessors; Port-Royal Tradition of Grammar; Reisig, Karl (1792–1829); Saussure, Ferdinand (-Mongin) de (1857–1913); Schleicher, August (1821–1868); Steinthal, Heymann (1823–1899); Synonymy; Thought and Language: Philosophical Aspects; Tooke, John Horne (1736–1812); Universal Language Schemes in the 17th Century; Whitney, William Dwight (1827–1894); Wundt, Wilhelm (1832–1920).

Bibliography

Arnauld A & Nicole P (1965–1967). L’Art de penser. La Logique de Port-Royal (2 vols). Baron von Freytag Löringhoff B & Brekle H E (eds.). Stuttgart-Bad Cannstatt: Friedrich Frommann Verlag (Günther Holzboog).
Auroux S (1979). La sémiologie des encyclopédistes. Essai d’épistémologie historique des sciences du langage. Paris: Payot.
Auroux S & Delesalle S (1990). ‘French semantics of the late nineteenth century and Lady Welby’s significs.’ In Schmitz W (ed.) Essay on significs: papers presented on the occasion of the 150th anniversary of the birth of Victoria Lady Welby (1837–1912). Amsterdam/Philadelphia: John Benjamins. 105–131.
Dutz K D & Schmitter P (eds.) (1983). Historiographia Semioticae: Studien zur Rekonstruktion der Theorie und Geschichte der Semiotik. Münster: Nodus Publikationen.
Gauger H-M (1973). Die Anfänge der Synonymik: Girard (1718) und Roubaud (1785). Ein Beitrag zur Geschichte der lexikalischen Semantik. Tübingen: Tübinger Beiträge zur Linguistik.
Gipper H & Schmitter P (1975). ‘Sprachwissenschaft und Sprachphilosophie im Zeitalter der Romantik.’ In Sebeok T A (ed.) Current trends in linguistics, vol. 13/2. The Hague: Mouton. 481–606.
Gordon W T (1982). A history of semantics. Amsterdam/Philadelphia: John Benjamins.
Haßler G (1991). Der semantische Wertbegriff in Sprachtheorien vom 18. bis zum 20. Jahrhundert. Berlin: Akademie-Verlag.
Haßler G (1999). ‘Sprachtheorie der idéologues.’ In Schmitter P (ed.) Geschichte der Sprachtheorie, vol. 4. Tübingen: Narr. 201–229.
Knobloch C (1987). Geschichte der psychologischen Sprachauffassung in Deutschland von 1850–1920. Tübingen: Niemeyer.
Nerlich B (1992). Semantic theories in Europe 1830–1930. Amsterdam/Philadelphia: John Benjamins.
Quadri B (1952). Aufgaben und Methoden der onomasiologischen Forschung. Eine entwicklungsgeschichtliche Darstellung. Bern: Francke.
Swiggers P (1982). ‘De Girard à Saussure: Sur l’histoire du terme valeur en linguistique.’ In Travaux de linguistique et de littérature, publiés par le centre de Philologie et de littératures romanes de l’Université de Strasbourg 20/1: Linguistique, Philologie, Stylistique. Paris: Klincksieck. 325–331.

Media and Language: Overview
S McKay, University of Queensland, Brisbane, Australia
© 2006 Elsevier Ltd. All rights reserved.

The mass media are generally considered to include the press, radio, and television; the Internet, although not strictly a medium, is also increasingly included. News and current affairs articles and programs, documentaries, sports news and broadcasts, radio phone-ins, advertisements, reality television, quiz shows, soap operas, websites, and so on help to organize the ways we understand our society and culture. They are often the only way we have of understanding other societies and cultures. The media’s influence on everyday life is usually taken for granted. They supply accounts of reality and construct particular forms of knowledge and pleasure: they inform and educate, they sell products, they tell stories, they entertain, they connect us with others, they help form our identities, they influence trends and mobilize opinion, they circumscribe our experience in what has become a media-saturated environment. Linguists and others working in language and communication have always been interested in the language of the media. Bell (1995: 23) gives four reasons for this: the ready availability and accessibility of media texts as sources of language data; the importance of media for evidence of language use and language attitudes in a speech community; the use the media themselves make of language; and the way the media reflect culture, politics and social life. Not surprisingly, other researchers from a range of disciplines and fields such as media studies and journalism, communication, cultural studies, sociology, education, and psychology have also been interested in how the media shape these understandings. In these fields, media are studied not just through their texts but also through the wider cultural contexts associated with industrial production processes, and through the effects they have on their audiences (for example, Watson, 1998; Curran and Gurevitch, 2000). 
While texts are at the center of any concern related to language and the media, the analysis and critique of production processes and audience effects are often valuable as well. This article will discuss a variety of approaches to media analysis and outline briefly some of the recent debates about the media to emphasize the usefulness of the range of critical resources that have been applied to media texts and media language.

Approaches to Media Language Analysis

Media texts have been analyzed both quantitatively and qualitatively (see Media: Analysis and Methods). Early work, especially in the United States, concentrated on content analysis. Content analysis focuses on the message and assumes that its content can be broken down into units of meaning that can be counted in a process that is designed to be objective and replicable. Much of this early work looked at the content of American newspapers, especially subject matter categories like politics, crime, and sports. Towards the end of the 1930s, Harold Lasswell, who was interested in the relationship between propaganda and public opinion in the ‘new’ mass medium of radio, used content analysis to investigate political values. During World War II, the U.S. government, through the Experimental Division for the Study of War-Time Communications under Lasswell, conducted the ‘World attention survey’ through a content analysis of major newspapers, which was designed to investigate newspaper coverage of foreign affairs. Others investigated the propaganda output from various organizations and individuals, focusing on aspects like the ‘values’ expressed in political speeches or the ‘tone’ of the headlines. In essence, the approach originated in the study of newspaper content and then evolved into a way of evaluating political messages, especially Nazi and Soviet propaganda messages during World War II and the Cold War. Much of this research was quantitative, relying on coding aspects of content into a number of discrete categories and counting their frequency. Later content analyses applied a similar methodology to television, but just about every type of media communication has been studied. 
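The coding-and-counting procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three categories echo the subject matter categories mentioned in the text, but the indicator words, the sample headlines, and the function names are invented and are not drawn from any of the studies cited. A real content analysis would rely on a written coding manual and on checks of agreement between coders rather than on simple keyword matching.

```python
from collections import Counter
import re

# Hypothetical codebook: category -> indicator words (invented for this sketch).
CODEBOOK = {
    "politics": {"election", "senate", "minister", "vote"},
    "crime": {"robbery", "murder", "arrest", "trial"},
    "sports": {"match", "league", "score", "champion"},
}


def code_unit(text):
    """Return the sorted categories whose indicator words occur in one unit of text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(cat for cat, keys in CODEBOOK.items() if words & keys)


def count_categories(units):
    """Count how many units were coded into each category."""
    counts = Counter()
    for unit in units:
        counts.update(code_unit(unit))
    return counts


# Invented sample headlines; each headline is treated as one coding unit.
HEADLINES = [
    "Senate vote delayed after election dispute",
    "Arrest made in downtown robbery",
    "League champion wins final match",
    "Minister on trial for fraud",
]

if __name__ == "__main__":
    for category, n in sorted(count_categories(HEADLINES).items()):
        print(category, n)
```

In this sketch a unit may fall into more than one category (the last headline is coded as both politics and crime); a study that requires mutually exclusive categories would need an explicit rule for resolving such cases.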
Modern content analysis is still used for news content or detecting political bias, but it has also been used to investigate topics like gender representation in advertisements, violence in children’s television programs, and representations of minority groups (for example, the work of the Glasgow Media Group in the United Kingdom on the news; and the research by George Gerbner and his associates in the United States on the Cultural Indicators project). The first edition of the Encyclopedia of language and linguistics included an article on media language as part of the communicative process and outlined a different and more theoretically based tradition in the study of media language. It started with semiotics (see Saussure, Ferdinand (-Mongin) de (1857–1913)) to demonstrate the need to think of media language as
part of a sign system or as a process of communication with complex social and cultural influences affecting the way in which media texts are produced and understood. The article reviewed the major approaches to the study of media language, such as critical linguistics, noting its approach to ideology and power in the representation of social reality, and its agenda in unmasking the seemingly objective nature of news reports; semiotics, especially as a way of decoding both verbal and visual aspects in advertisements; and discourse analysis, again primarily of news texts, but also phone-ins, interviews, and disc jockey monologues, with an emphasis on pragmatics, speech acts, and turn-taking. It noted a shift in emphasis away from textual analysis alone towards a more audience-focused approach. Other overviews published more recently have outlined the research trajectories in language in the media through the work of individual linguists, especially with respect to the growing influence of Critical Discourse Analysis. Bell’s (1995) overview maps out the approaches made in the study of media language and discourse from, first, Critical Linguistics and then Critical Discourse Analysis (CDA). He notes the growing importance of CDA and its sociopolitical concern to reveal inequalities of power as a standard approach to media texts, outlining the contributions of Fowler through his development of Critical Linguistics to apply functional grammar to news texts; of van Dijk’s using text linguistics to develop discourse analysis to apply to news story structure; of Fairclough’s bringing social theory, especially the work of Foucault, to discourse analysis, and to his contribution, along with that of van Dijk, to the development of Critical Discourse Analysis with its explicit sociopolitical stance. Bell’s own work links text analysis of news stories to media production processes and the role of the audience. 
Critical discourse analysts are interested in both details of the text itself and the broader social, political, and cultural functions of media discourse to determine other layers of meaning. Much of their work to date has been on the analysis of factual genres like news rather than fiction or advertising. CDA has become an important approach for studying media texts, especially in European linguistics and discourse studies. However, to argue uniformity in the CDA approach could be misleading. The use of the individuals’ contributions to structure overviews of the field is instructive here. CDA is not a holistic approach, using a single theory or even a single methodology. Its foundations are derived from classical rhetoric, pragmatics, text linguistics, sociolinguistics, and applied linguistics and reflect the growing
interdisciplinarity in research between the humanities and the social sciences. Ideology and power underpin applications of CDA to issues relating to gender, class, and ethnicity and also to more general discussions about media discourses relating to politics or the economy (see Media, Politics, and Discourse: Interactions). However, depending on the discipline background of the researcher, the methodologies may differ, with both large empirical studies found as well as small, focused qualitative case studies. Weiss and Wodak’s (2003) volume covers some of the research practices under the umbrella of CDA, and their introduction outlines some of the critiques and debates that have evolved with it. The dominant paradigms of media language research have tended to produce critical evaluations of the power of the media to influence and even to subordinate their audiences. As indicated above, much of the work on media language, especially in critical linguistics and critical discourse analysis, has been undertaken on the news and is concerned with uncovering its underlying ideologies. In particular, this work targets for special attention social problems like discrimination and prejudice (for example, van Dijk’s 1991 work on racism). The studies undertaken in CDA are designed with an emancipatory purpose; that is, they have a sociopolitical agenda intended to shed light on issues of power and domination. Thus, it should not be surprising to see so much CDA focused on the news. News is not considered as some neutral image of the real world but as a product of news gathering and news making. In other words, news is the end result of a number of processes, including organizational policies and preferences that set the news agenda and selection and judgments about relative importance and significance. 
It is thus a representational discourse made by converting the raw data from a variety of sources, including eyewitness accounts, interviews, and media releases, into stories within the context of technical constraints like production deadlines, and in accordance with the news values of the time (see News Language; Newspeak). News values, or the set of criteria used to determine newsworthiness and hence whether or not an item is likely to appear as news, act as a filtering mechanism or gate-keeping device for what is reported. The study of the news has always been at the forefront of media discourse research. The constructedness and selectivity of the news and its concentration on negative events, together with its connection to powerful institutions and commercial market imperatives, have proved compelling for language researchers, especially those who are interested in critical approaches to ideology, and have stimulated
investigations into story content, underlying values, and structure, as well as studies of representations of specific issues and groups. However, media language has been studied at other levels beyond journalistic practices, media production processes, and textual analysis. The traditional distinction between news and entertainment has transferred into analytic approaches, with linguists and discourse analysts using conventional, language-focused, empirical methods on news texts, especially on news content, news values, and ideals of objectivity. This has left theorists from other fields like sociology, cultural studies, and women’s studies to supply much of the insight into the entertainment side of media output. It is from these traditions that studies about consumption, popular taste, the politics of the everyday, and notions of pleasure and resistance have been undertaken. A major contribution here has been insights from reception studies into reading practices and how readers or audiences negotiate meaning and respond to various media texts such as soap operas and women’s magazines, as well as news programs. These are questions that usually lie outside traditional language or discourse analysis. Audience research has been the mainstay of media and cultural studies research for some years now as researchers endeavor to find out how audiences make sense of media texts. Reception studies have focused on specific genres to see how different social groups (based on age, class, ethnicity, or gender) or subcultures interpret texts in different ways. Hermes (1995), for example, in her analysis of women’s magazines, draws heavily on responses from women who describe the pleasure they derive from reading what is often considered a devalued genre. Fewer studies have undertaken detailed linguistic analysis to relate textual features to audience interpretation. 
Richardson’s work (1998) on economic reporting in television news is an exception here, as she links an analysis of media discourse on the economy with an analysis of the reception of that discourse. Studying texts with images and sounds has presented challenges to conventional discourse analysis, which has valued modes of language through speech and/or writing over visual images or music. The mass media produce multimodal texts, that is, texts that draw from language, pictures, or other graphic elements and sounds in various combinations. Considerations of the multimodal nature of media texts are difficult to incorporate in language-based media analysis. Two examples of work on multimodality in the media, linked to linguistics, are Kress and van Leeuwen’s work on text layouts (1998, 2001) and Cook’s work (2001) on advertisements.

Kress and van Leeuwen have pointed out how the multimodality of the media needs to be taken into account when analyzing discourse. Their (1998) semiotic analysis of newspaper front page layouts shows how conventions related to the positioning of headlines, blocks of text, and photographs produce meaning and coherence; how visual cues (size, color, contrast) produce hierarchies of meaning; and how frames like lines and spaces produce separations or connections. They emphasize the importance of layout analysis in the critical study of newspaper language. The nature of advertising also means that analysts need to take account of pictorial and musical modes, even in advertisements where language predominates (see Media: Semiotics; Visual Semiotics; Barthes, Roland (1915–1980)). Cook (2001) stresses the links between language in advertising and the other modes and urges analysts to understand the interconnections they make with each other to construct meaning. In spite of the difficulties in trying to capture such multimodality, concentrating on language and ignoring the other modes is to miss much of the potential for meaning of contemporary media texts.

Current Issues

The Impact of New Media

Researchers and other commentators have continued to be interested in media language and have expanded the field to include newer forms of media, mainly through attention to computer-mediated communication (CMC) (see Language in Computer-Mediated Communication). The development of global, digital, networked communication systems and their infrastructure in the latter part of the 20th century has captured the public imagination in terms of the seemingly limitless potential to provide access to information and to facilitate interaction with others. This has been greeted with either utopian predictions of a better informed and more democratic society with greater facilities to connect and interact with others through the creation of virtual communities, or a darker view of information overload, unregulated content, personal isolation, and even loss of ability to interact face-to-face, and a ‘digital divide’ between those with access and those without. New on-line technologies have transformed news through the capacity to accumulate unlimited multimodal information in a single text and to permit different routes of access and different levels of interaction. On-line news departs from the constraints of space and time associated with print and broadcast news and instead organizes vast amounts of information content as self-supporting layers with hypertext
links (Lewis, 2003). In addition, various new forms of news have emerged with webcasts, web logs, news groups, and news alert services that provide constant access to news and embedded information. Research on language use in computer-based media is struggling to keep pace with rapid technological innovations. The impact of the Internet and related technologies on language has been discussed (often negatively) in terms of language change and effects on literacy. Crystal’s (2001) linguistic perspective offers evidence of the emergence of distinctive language varieties developing in Internet-related forms of communication, including e-mail, chat groups, virtual worlds (imaginary environments for text interactions), and the World Wide Web. In each, he notes adaptations in graphology, grammar, semantics, and discourse to suit the characteristics of the technology and its uses. Rather than forecasting the demise of the written form or, more dramatically, the death of languages as the result of the impact of new technologies, Crystal argues that the Internet is providing creative possibilities and enrichment opportunities for its users. Work related to language in the new media has included studies on topics like web pages and hypertext links, e-mail discussion lists, and Internet video conferencing, and studies on themes like multilingualism in chat sites, the dominance (or not) of English in the new media, the extent of democracy, and the division between the information-rich and information-poor (for a range of studies, see Pemberton and Shurville, 2000; Herring, 1996; Jenkins and Thorburn, 2003; Aitchison and Lewis, 2003). Language research has focused primarily on computer-mediated communication via the Internet, its relationship to face-to-face communication, and its relative anonymity, as well as textual aspects like special language features and idiosyncratic codes of conduct and etiquette (or ‘netiquette’). 
E-mail users and Internet chat users have developed spelling shortcuts like letter homophones or acronyms as a kind of coded shorthand to speed up typing (by saving keystrokes), and even special symbols created originally from ASCII characters (‘emoticons’ like smiley faces) to add emotional content. These shortcuts and symbols also help to differentiate regular users from newcomers. The so-called ‘netspeak’ or ‘netlingo’ has spread beyond CMC, with similar shortcuts and symbols adopted by mobile telephone users for text messaging, and CMC jargon like ‘flaming’ and ‘spamming’ becoming more widely used and understood. Internet communication channels like chatrooms, bulletin boards, newsgroups, complex hyperlinked websites, and electronic games need to be thought of differently from traditional media texts
(see Language in Computer-Mediated Communication). Such texts are characterized by their interactivity but also by their intertextuality, as they draw on other texts and play with established conventions of form and representation (Buckingham, 2000). In addition, many new media forms with their built-in potential for interactivity make the traditional distinction between author and reader (or producer and audience) less important, as texts are cooperatively produced and meanings are created, challenged, and changed. While it may be compelling to foreground the benefits of intertextuality and interactivity in contemporary media texts, especially in terms of creativity and innovation, many texts produced in new media are just refashioned conventional texts whose interactivity is based on limited choice and pathways.

Revisiting Traditional Media

The more traditional media continue to generate research interest with new insights (see Radio: Language; Television: Language; Documentary; News Language; Sports Broadcasting). Changes to the language of the media have been noted, such as growing tendencies for a more casual, conversational style with an increasing use of colloquial vocabulary and vernacular idiom by journalists and interviewers; a greater emphasis on celebrity; stress on the personal; and a marked trend to the short, sharp, sound bite designed to be pithy and provocative, especially when edited into news reports. In debates about declining journalistic standards, the news media across a range of countries increasingly have been criticized for not separating news from entertainment, for personalizing and sensationalizing stories, and for focusing on ‘softer’ or ‘lighter’ stories rather than concentrating on more serious issues in a process termed ‘tabloidization’ (for international perspectives on tabloidization, see Sparks and Tulloch, 2000). Mass-market magazine content also has become more sensationalized with a growing emphasis on celebrity coverage, gossip, scandal, and intrigue. Franklin (1997: 4) has criticized both print and broadcast journalism, commenting that “the trivial has triumphed over the weighty; the intimate relationships of celebrities from soap operas, the world of sport or the royal family are judged more ‘newsworthy’ than the reporting of significant issues and events of international importance.” While news and public affairs remain an important part of media output, a significant portion of television output is devoted more directly to entertainment through soap operas, dramas, talk shows, reality
television, quiz shows, sports, and lifestyle programming (see Television: Language). It is likely that the increasing global trend of the media supplying more and more of these shows is underpinning the tendency for the news to become more conversationalized and more ‘tabloid,’ as it too comes under market pressure to entertain and to connect more closely with ordinary values and practices. Research has continued as well on a range of variously disenfranchised groups, including women, children, the aged, ethnic minorities, and disabled groups (see Media and Marginalized Groups), and on moral panics, not just in the traditional sense of ‘muggings’ and escalating urban violence (Hall et al., 1978) but in a more diverse sense that includes media coverage of issues related to pornography, pedophilia, health risks, and medical negligence (see Media Panics). Research also has been conducted into media’s role in language development and literacy. These new and continuing directions in research are contributing to the growing critique of the media and how they operate. Contemporary media language is changing, especially under the influence of globalizing technologies and changes to the media industries that are affecting the way our social world is presented to us. As global media corporations operate beyond the symbolic spheres of national influences and cultures, we are seeing increasingly globalized programming and convergent technologies, but whether this will translate into a homogenized culture with a global language is not at all certain. It is more likely that improvements in technology will give us more choice about where we get our information from and how we entertain ourselves. 
What is evident within the study of language in the media is a continuing trend away from just text-based and/or purely linguistic approaches, not only in order to take account of factors surrounding media texts like the production processes, the characteristics of the medium, and the circumstances of audience reception but also to look at the nature of media language itself as it changes in relation to changing industry, social pressures, and emerging technologies.

See also: Barthes, Roland (1915–1980); Critical Discourse Analysis; Documentary; Genre and Genre Analysis; Language in Computer-Mediated Communication; Media: Analysis and Methods; Media and Marginalized Groups; Media Panics; Media, Politics, and Discourse: Interactions; Media: Pragmatics; Media: Semiotics; News Language; Newspeak; Radio: Language; Saussure, Ferdinand (-Mongin) de (1857–1913); Speech Acts; Speech Community; Sports Broadcasting; Television: Language; Text and Text Analysis; Visual Semiotics.

Bibliography

Aitchison J & Lewis D M (eds.) (2003). New media language. London: Routledge.
Bell A (1991). The language of news media. Oxford: Blackwell.
Bell A (1995). ‘Language and the media.’ Annual Review of Applied Linguistics 15, 23–41.
Bell A & Garrett P (eds.) (1998). Approaches to media discourse. Oxford: Blackwell.
Berelson B (1984). Content analysis in communication research. New York: Hafner Press. [Facsimile of 1952 edn.]
Brookes R (2000). ‘Tabloidization, media panics and mad cow disease.’ In Sparks C & Tulloch J (eds.) Tabloid tales: global debates over media standards. Lanham, MD: Rowman & Littlefield. 195–209.
Buckingham D (2000). After the death of childhood: growing up in the age of electronic media. Cambridge, UK: Polity Press.
Chouliaraki L & Fairclough N (1999). Discourse in late modernity: rethinking critical discourse analysis. Edinburgh: Edinburgh University Press.
Cook G (2001). The discourse of advertising (2nd edn.). London: Routledge.
Corner J (1995). Television form and public address. London: Edward Arnold.
Crystal D (2001). Language and the internet. Cambridge: Cambridge University Press.
Curran J & Gurevitch M (eds.) (2000). Mass media and society (3rd edn.). London: Arnold.
Dijk T A van (1988). News as discourse. Hillsdale, NJ: Lawrence Erlbaum.
Dijk T A van (1991). Racism and the press. London: Routledge.
Eldridge J (ed.) (1995). Glasgow Media Group reader, vol. 1: news content, language and visuals. London: Routledge.
Fairclough N (1995). Media discourse. London: Edward Arnold.
Fowler R (1991). Language in the news: discourse and ideology in the press. London: Routledge.
Franklin B (1997). Newszak and news media. London: Arnold.
Graddol D & Boyd-Barrett O (eds.) (1994). Media and texts: authors and readers. Clevedon, UK: Multilingual Matters.
Gross L (1998). ‘Minorities, majorities and the media.’ In Liebes T & Curran J (eds.) Media, ritual and identity. London: Routledge. 87–102.
Hall S, Critcher C, Jefferson T, Clarke J & Roberts B (1978). Policing the crisis: mugging, the state and law and order. London: Macmillan.
Hermes J (1995). Reading women’s magazines: an analysis of everyday media use. Cambridge: Polity Press.
Herring S (ed.) (1996). Computer-mediated communication: linguistic, social and cross-cultural perspectives. Amsterdam: John Benjamins.
Jenkins H & Thorburn D (eds.) (2003). Democracy and new media. Cambridge: MIT Press.
Kress G & Leeuwen T van (1998). ‘Front pages: (the critical) analysis of newspaper layout.’ In Bell A & Garrett P (eds.) Approaches to media discourse. Oxford: Blackwell. 188–219.
Kress G & Leeuwen T van (2001). Multimodal discourse: the modes and media of contemporary communication. London: Arnold.
Langer J (1998). Tabloid television: popular journalism and the ‘other news.’ London: Routledge.
Lewis D M (2003). ‘On-line news: a new genre?’ In Aitchison J & Lewis D M (eds.) New media language. London: Routledge. 95–104.
Livingstone S (2002). Young people and new media. London: Sage.
Manovich L (2001). The language of new media. Cambridge: MIT Press.
Pemberton L & Shurville S (eds.) (2000). Words on the web: computer mediated communication. Exeter: Intellect Books.
Richardson K (1998). ‘Signs and wonders: interpreting the economy through television.’ In Bell A & Garrett P (eds.) Approaches to media discourse. Oxford: Blackwell. 220–250.
Sparks C & Tulloch J (eds.) (2000). Tabloid tales: global debates over media standards. Lanham, MD: Rowman & Littlefield.
Thurlow C, Lengel L & Tomic A (2004). Computer mediated communication: social interaction and the Internet. London: Sage.
Watson J (1998). Media communication: an introduction to theory and process. Basingstoke: Macmillan.
Weiss G & Wodak R (eds.) (2003). Critical discourse analysis: theory and interdisciplinarity. Basingstoke: Palgrave Macmillan.
Wodak R & Meyer M (eds.) (2001). Methods of critical discourse analysis. London: Sage.

Media and Marginalized Groups
A Jakubowicz, University of Technology, Sydney, Australia
© 2006 Elsevier Ltd. All rights reserved.

How Do Media Marginalize Groups?

When the Nazi Party built their media outlets of print, radio, and cinema, they discovered the great benefit of producing constant propaganda about ‘Untermenschen’: the pillorying of ‘lower people’ on grounds of supposed racial and cultural differences, among them Jews, Slavs, Gypsies, Blacks, homosexuals, and people with impairments. The aim of this marginalization was to dehumanize their targets and gain public support for their exploitation of these groups, and then their extermination.

The media provide environments through which communication occurs. These environments are set within social, political, economic, and cultural parameters that reflect wider structures of power at the local, national, and international levels. ‘Marginalization’ is a process that reflects differentials in power and is expressed in unequal access to and representation in the media. Such marginalization can occur in any or all of the four analytical levels of the media: ownership and regulation, production and creativity, textual content, and audience consumption.

‘Marginalized groups’ can refer to any segment of a population with shared perceptions and experiences, who are unable to fully realize their rights as articulated in both the Human Rights and the Economic Social and Cultural Rights conventions of the United Nations. In particular, the term ‘marginalized groups’ refers to groups whose social or cultural characteristics generate opprobrium or hostility among other groups in society, so that their chances for a long life are significantly reduced.

This article examines the process through which marginalization occurs in the media, and gives some examples of resistance to such marginalization. Most research concentrates on the way the media relate to ethnic minorities, seeing them as the most obvious example of groups disempowered by communication structures in societies dominated by a single ethnic group, for instance, Whites in the United States, Anglo-Celts in Australia, or metropolitan Francophones in France. Studies of minorities and the media have been undertaken in most countries where there is ethnic diversity and have been a theme in the study of international communication. More recently, research on disability and the media has pointed to the significant role of stereotypes communicated by the mass media in disabling people with impairments.

Ownership

Most of the world’s media are owned either by large multinational conglomerates (AOL Time Warner, Disney, General Electric, News Corporation, Viacom,



Sony, Bertelsmann, AT&T and Liberty Media) (Shah, 2004) or by governments. While there are many additional smaller companies operating in national and local media markets, the large corporations set the broad tone and focus for media practices, especially in liberal democratic states with market economies. Most media markets are under some sort of control by the state, ranging from the almost total control exhibited in China and Cuba, through to the almost anarchic operation of the market in some post-Communist states. However, the dominant paradigm remains an environment where the print media are privately owned, and the electronic media are either private or have some government involvement. In the United Kingdom, Australia, and Canada, the government-run electronic media have a reputation for independence, taking seriously their role as defenders of the public interest. In the United States, public broadcasting has some support from the government, but is essentially a not-for-profit sector.

These patterns of ownership play an important part in how minority groups experience the media and use them to communicate within their own boundaries and with wider societies. Commercial media need to build an audience that they can ‘deliver’ to their advertisers, on whom their economic viability rests. Thus, they tend to search for ‘demographics’ that will be most attractive to advertisers, and that can be reached most efficiently. They seek out commonalities in audiences – language, culture, economic affluence – that they can use to expand their reach. Those people who are part of the majority group, particularly if it has greater economic and social power, will tend to find their interests and worldviews reflected in the media. On the other hand, minorities who are less powerful, and poorer, are less likely to find their interests addressed. One consequence of this exclusion has been the development of media targeting minority groups in their own languages and from within their cultures.

Government media may present a more wide-ranging view, depending on whether government policies seek to facilitate minority group inclusion and the overcoming of marginalization, or the reinforcement of existing social hierarchies. Thus, government media may contribute to the demonizing of minorities who are perceived to be enemies of the state (and indeed commercial media may collaborate in such attacks).

With the development of the Internet and new media technologies in relation to print and broadcast, many countries have seen a rapid expansion of minority media outlets targeting particular communities. Global diasporas can thus maintain links across worlds, and cultural communities can build diversity within them, though at the real cost of separating different language communities from each other.

Regulation

Most countries have some form of media regulation, in which governments establish the rules under which the media can operate. These rules can range from prescribing who can own media (for example, excluding foreigners, banning political opponents, preventing monopolies, or encouraging cultural diversity) to censorship of what can be printed, broadcast, or uploaded to the World Wide Web. Many systems of regulation, including the self-regulatory codes of practice adopted by some media sectors, specifically prohibit discrimination against or the vilification of minority groups. In this they mirror the UN ESC Rights convention, and its signatories’ commitment to implement those rights “without discrimination of any kind as to race, color, sex, language, religion, political or other opinion, national or social origin, property, birth or other status” (United Nations, 1966/1976: Article 2: 2).

The rise of racism on the Internet has become a particular focus of government attention in Europe. The European community has undertaken a series of studies through the European Monitoring Centre on Racism and Xenophobia (EUMC) that have sought to document the expansion of hate speech on the Internet, and then to develop strategies to resist its impact on vulnerable communities. While some countries have specific legislation under which hate speech on the Internet is a criminal offense, others have preferred to develop avenues for civil recourse or have argued that free speech rights should not be restrained on the Internet. Indeed, the regulation of hate speech altogether has become a matter of significant controversy in many countries, especially where the issues have arisen from religious differences.

Studies of anti-Semitism have shown how complex contemporary debates over the Israeli-Palestinian conflict have intensified this already emotionally charged arena, while Arab and Muslim groups in Western countries have pointed to the need for restrictions on the demonizing of Arabs and Muslims in the media. Thus, national regulatory regimes increasingly have to respond to some of the dynamics of marginalizing, trying to balance the values of open discussion and debate with the dangers of humiliation of minority groups whose participation in society can be heavily compromised by hate speech. Regulation thus bears on the process of media production, and on how messages are generated and transmitted.

Production

If communication is a process through which meaning is conveyed from a point of production to a point of consumption, then meanings are first constructed by media professionals who choose elements to form into narratives. The production locus is always one of choice; the producers select from a range of possibilities and build an argument, story, or experience that they believe will move their audiences in some way. Whether it is journalism, fiction, advertising or nonfiction, television game shows, or ‘reality TV,’ there are creative minds building conceptions of how the audience would be moved – emotionally, viscerally, and intellectually. Communication is always about change – changing how people are before they experience it – and trusting or hoping that the desired effect will result.

Studies of newsrooms have shown how the dynamics of news reporting are influenced by the ethnic makeup of the reporters. In societies where aboriginal communities survive, research demonstrates that few aborigines are involved in media production, and their stories are underrepresented in the media. Similarly, immigrant and second-generation populations are underrepresented in news media organizations, while people with impairments are almost totally excluded. In racially diverse societies (e.g., France, the United States, the UK) this diversity, to some extent, may be reflected through the range of media, but is rarely reflected within the dominant media organizations. Despite some moves towards affirmative action, the tendency for powerful positions to be held by powerful men from the dominant community remains. These production structures have tended to reinforce existing practices and worldviews. Even where minorities exist in significant numbers in local communities, American researchers have shown that this has only a marginal effect on content and employment.

Where economic hierarchies mimic racial ones, the media production forms reflect both economic and racial inequalities, and, to some extent, sustain and reinforce them. Thus there may be no intentional vilification of minorities, yet their continuing marginalization may be a consequence of their historic exclusion and the worldviews in the organizations that control the communication system. It requires a conscious self-awareness by media proprietors of the consequences of their actions – many of which are unintended – to begin to remedy such continuing inequities. There are many examples of managers deciding to change the organizational culture of exclusion – recruitment campaigns for minority employees, active dialogue with excluded communities, strategies to widen audience appeal by diversifying content, and the sustained selection of minority community members as expert commentators (and not just on minority affairs issues).

Disability communities have also sought to modify stereotypes and marginalization. Some media have included people with disabilities in their normal lineup, with British broadcasting organizations in 2004 pledging their commitment to raise disabled involvement in production. However, disability remains a significant area where marginalization continues. Some advertisers have also seen the importance of normalizing disability. In the United States and Europe, people with impairments are appearing more regularly on TV screens, and the ‘access’ movement, seeking to ensure the disability-friendliness of the Web, has been increasingly successful in generating and disseminating accessibility standards.

Content

Given the range and diversity of media outlets, it could be argued that every community of language or interest can find a forum for communication, particularly in larger and wealthier societies. This view carries with it a serious implication: the communication networks that enable societies to function as integrated units may be in the process of unraveling, rendering those societies more like places in which multiple diasporic communities cohabit in their own communication spaces. Complex ‘mediascapes’ carry embedded within them ‘ethnoscapes,’ or visions of the world as perceived through the eyes of particular ethnic collectivities.

Dominant media tend to sustain the views of the dominant communities, so that the content of these media tends to privilege the interests and concerns of these groups. Studies of stereotyping of minorities – Jews, Blacks, disabled people, Muslims – have shown that these groups tend to be framed by the perspectives of able-bodied White Christians. The stereotypes tend to speak to the fears and threats these marginalized groups are felt to represent. Thus, Jews were often portrayed in Euro-American media as outsiders who carried particular cultural attributes as a group, sometimes positive, sometimes negative. In recent years, Jews have again been represented in harsher terms in Arab media.

African-Americans were typically represented in the past in American media as criminals, comics, or sports people, though some have emerged into the front rank of Hollywood actors. Still, for the most part, stories about Blacks in dominant culture media neither canvass the range of Black experiences, nor explore issues relevant to Black people in any depth or from their point of view. Such stereotypical practices have two effects: they exclude Black people from the communication community formed by specific media, and they transmit biased and limited information to other participants in those communication communities. Such miscommunication serves to further marginalize Blacks from the collective discourse of the society, and thus both reinforces their alienation and legitimates wider practices of social exclusion. In the United States, there has been a major increase in Black (and Hispanic) cable television channels, and a turning away by these audiences from ‘mainstream’ channels.

Over recent decades, and especially since the events of September 11, 2001, Arabs, Muslims, and other people from the Middle East and South Asia have experienced the full force of stereotype and hostility in Western media, while the media in their countries of origin have tended to emphasize the most oppressive aspects of the imperial histories of Western cultures. Many Arab and Muslim organizations have developed strategies to influence media content in Western countries, through media training of spokespeople, the development of media monitoring projects, and the use of complaints mechanisms.

Textual studies have demonstrated the processes through which discourses of marginalization are sustained. News programming, for instance, will frame stories within the context of expected protagonists and antagonists. In societies with racialized histories, the discourses mobilized in the media may tend to legitimate preexisting racial hierarchies, while reinforcing prejudices against immigrants from minority backgrounds. Disability may be used as a metaphor for evil, so that fear of people with impairments is reinforced and their social exclusion intensified.

Advertising carries some of the most important content, through its constant messages about appropriate social behavior. Television has become one of the most important socialization agents for children in advanced Western societies, with children in the United States watching on average about twenty hours per week; in a year they may see up to 200 000 commercials. Children learn a great deal from this experience about race and disability, their own racial groups and others, and about what it means to be normal. Biased, negative, or minimal representation of minority groups can intensify social exclusion directed against them, and undermine their own feelings of self-worth and personal capacity.

Audience/Consumption

Audiences realize meaning by working on texts. While producers propose ways that texts should be interpreted, usually privileging one interpretation, the meaning (i.e., the reinforcement or changes in perception, perspective, or behavior among audiences) is not totally determined by the intent of the authors. While meanings are individually experienced, they are socially constituted through the shared communal discourses developed over time within subcultures.

The rise of mass media during the 20th century was predicated on the creation of undifferentiated audiences, and pushed towards homogenized lifestyles shaped by the growth of large organizations and the suburbanization of metropolises. Over the past three decades, and accelerating in this century, technological change and social diversification have undermined the reality of mass society, generating in its place a more variegated, urbanized, and older population. Most Western societies are characterized by an ageing dominant population group with low birth rates, while their minority groups tend to be younger and have higher birth rates.

These demographic changes have already seen major audience shifts for the US media and similar transformations are likely for other societies with immigrant and refugee populations. By 2050, the non-Hispanic White population of the United States will have declined to about five in ten, a decrease from about seven in ten in 2000. In the meantime, the African-American population will have risen slightly from 13 to 15%, with Asian and Pacific Islanders rising nearly 300% to one in twelve, and Hispanics growing from one in nine to one in four. One major effect of these changes is that the child and youth media market will be dominated by minorities. Those who are currently marginalized by the mainstream will together have become almost half the population, and much more than half the youth population.

Audiences seek reinforcement from their media experiences. They look for cultural experiences that confirm their identities, and validate their expectations and desires. Where there is choice, and they experience marginalization or exclusion, they will turn towards media that satisfy their needs and address their realities. Increasingly, wealthier and more educated populations are turning to the Internet for their information needs, selecting services that suit them. Internet providers talk of building user communities who both draw on the materials provided to them and interact with other users of the same services. As communities of interest develop, communities of place become less important, and the discursive environments used can become global rather than local. Audiences for any one source become smaller, even though demand for information and entertainment continues to expand.

US studies of children as audiences have shown that they would welcome a more diverse range of representations of people in their television viewing. Studies of minority audiences in Europe, Canada, and Australia also demonstrate their desire for a better representation of their own stories and experiences in mainstream media. Many express their frustration with the repetition of tired stereotypes and also their clear sense of the peripheral nature of their participation in society as it is expressed in dominant group media.
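The 2050 figures quoted in this section can be checked with simple arithmetic. The following Python sketch is an illustration only: its variable and function names are hypothetical, the fractions are read directly from the figures given in the text (15%, one in twelve, one in four), and it assumes for simplicity that the three groups do not overlap.

```python
from fractions import Fraction

# Projected 2050 shares of the US population, as quoted in the text.
# "13 to 15%" -> 15/100; "one in twelve" -> 1/12; "one in four" -> 1/4.
projected_2050 = {
    "African-American": Fraction(15, 100),
    "Asian and Pacific Islander": Fraction(1, 12),
    "Hispanic": Fraction(1, 4),
}


def combined_share(shares):
    """Sum the group shares, assuming (for illustration) no overlap between groups."""
    return sum(shares.values(), Fraction(0))


total = combined_share(projected_2050)
print(f"Combined share in 2050: {float(total):.1%}")
```

Under these assumptions the sum is 29/60, or roughly 48%, which is consistent both with the statement that these groups will together form almost half the population and with a non-Hispanic White share of about five in ten.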

Alternatives

Many of these issues have been recognized by government, community, and corporate bodies, which have developed a variety of strategies that have started to remedy the exclusion and marginalization of ethnic and disabled communities. Technological change (e.g., digital broadcasting, satellite newspaper transmission, the Internet, cable television) has facilitated the development of alternative media outlets which provide opportunities for communities to hear their own voices or read their own stories (although with only minor impact outside the community boundaries).

Within the mainstream media in the United States, Canada, and the UK, there have been programs to implement affirmative action, with the active recruitment of minority staff as journalists, writers, producers, and directors. Similar approaches have recently been extended to cover people with disabilities, with a cross-Europe initiative emerging in 2003.

However, European studies have shown that the challenge continues for the media to talk ‘with’ rather than ‘about’ minorities, indicating that there is still a pressing need to enhance equal representation by giving voice to ethnic, cultural, and religious minority sources, and by involving those affected by discrimination in the media. Minority perspectives need to be covered in the mainstream media, not only on issues that concern the ethnic minorities, but also in other general news genres. Governments need to convince media professionals of the necessity to integrate immigrants’ points of view in their productions more widely, and not only in productions dealing with minority issues.

The media can contribute to societal integration and cohesion by reflecting diversity and giving voice to groups that have something to say about issues concerning them and society as a whole, whether they are marginalized for their culture, language, disability status, or religion. The processes of marginalization occur at all levels of the media and therefore need to be addressed systematically.

See also: Discourse of National Socialism, Totalitarian; Gender and Political Discourse; Genres in Political Discourse; Identity and Language; Media: Analysis and Methods; Media Panics; Media, Politics, and Discourse: Interactions; Modes of Participation and Democratization in the Internet; News Language; Political Rhetorics of Discrimination.

Bibliography

Allen T & Eade J (1999). Divided Europeans: understanding ethnicities in conflict. The Hague and Boston: Kluwer Law International.
Anderson B (1983). Imagined communities: reflections on the origin and spread of nationalism. London: Verso.
Appadurai A (1999). ‘Dead certainty: ethnic violence in the era of globalization.’ In Meyer B & Geschiere P (eds.) Globalization and identity: dialectics of flow and closure. Oxford: Blackwell.
Bang H K (2003). ‘Minorities in children’s television commercials: new, improved, and stereotyped.’ Journal of Consumer Affairs 37(1), 42–68.
Berger P L & Bertelsmann S (1998). The limits of social cohesion: conflict and mediation in pluralist society: a report of the Bertelsmann Foundation to the Club of Rome. Boulder, CO: Westview Press.
Braziel J E & Mannur A (eds.) (2003). Theorizing diaspora: a reader. Oxford: Blackwell.
Cottle S (2000). Ethnic minorities and the media: changing cultural boundaries. Buckingham, UK: Open University Press.
Daniels T & Gerson J (eds.) (1989). The colour black: Black images in British television. London: British Film Institute.
Ebo B L (1998). Cyberghetto or cybertopia?: race, class, and gender on the Internet. Westport, CT: Praeger.
Goggin G & Newell C (2003). Digital disability. Oxford: Rowman & Littlefield.
Greenberg B & Brand J (1993). ‘Minorities and the mass media: 1970s to 1990s.’ In Bryant J & Zillmann D (eds.) Media effects: advances in theory and research. Hillsdale, NJ: Lawrence Erlbaum.
Gunew S M (2003). Haunted nations: the colonial dimensions of multiculturalisms (1st edn.). New York: Routledge.
Hjarvard S (2001). News in a globalized society. Göteborg: Nordicom.
Jakubowicz A (2003). ‘Multiculturalism and the cultural politics of cyberspace.’ In Thorburn D & Jenkins H (eds.) Democracy and new media. Cambridge, MA: MIT Press.
Jakubowicz A, Goodall H, Martin J, Mitchell T, Randall L & Seneviratne K (eds.) (1994). Racism, ethnicity and the media. St. Leonards, N.S.W.: Allen & Unwin.
Norden M (1994). The cinema of isolation: a history of physical disability in the movies. New Brunswick, NJ: Rutgers University Press.
Pointon A & Davies C (1997). Framed: interrogating disability in the media. London: British Film Institute.
Ross K (1997). ‘But where’s me in it? Disability, broadcasting and the audience.’ Media, Culture and Society 19(4), 669–677.
Ross K (ed.) (2001). Black marks: research studies with minority ethnic audiences. London: Ashgate.
Said E (1995). ‘Orientalism.’ In Ashcroft B, Griffiths G & Tiffin H (eds.) Post-colonial studies reader. London and New York: Routledge.
Shah A (2004). ‘Corporate influence in the media: media conglomerates, mergers, concentration of ownership.’ http://www.globalissues.org/HumanRights/Media/Corporations/Owners.asp.
United Nations (1966/1976). ‘International covenant on economic, social and cultural rights.’ Geneva: United Nations. http://www.unhchr.ch/html/menu3/b/a_cescr.htm.
van Dijk T A (ed.) (1985). Discourse and communication: new approaches to the analysis of mass media discourse and communication. Berlin and New York: W. de Gruyter.
van Dijk T A (ed.) (1991). Racism and the press. New York: Routledge.
van Dijk T A (ed.) (1993). Elite discourse and racism. Newbury Park, CA: Sage.
Walter J (ed.) (2002). Racism and cultural diversity in the mass media: an overview of research and examples of good practice in the EU Member States, 1995–2000. Vienna: European Research Centre on Migration and Ethnic Relations (ERCOMER) on behalf of the European Monitoring Centre on Racism and Xenophobia, Vienna (EUMC).
Whillock R K & Slayden D (eds.) (1995). Hate speech. Thousand Oaks, CA: Sage.
Zilber J & Niven D (2000). ‘Stereotypes in the news.’ Harvard International Journal of Press/Politics 5(1), 32–50.

Media Panics
J Leach, University of Queensland, Brisbane, QLD, Australia
© 2006 Elsevier Ltd. All rights reserved.

Cultural panics have existed as long as culture itself. Herodotus reports that the citizens of Asia Minor panicked in the 7th century B.C.E. when rumors surfaced that their wells had been poisoned by the Greeks. The rise of the mass media from the 17th century onward, however, meant that panics could be monitored, contained, and most importantly, created by the media themselves. In the 20th century, the introduction of social statistics and the ever-increasing amount of scientific information available for public scrutiny encouraged moral panics over increasing crime, environmental degradation, changing social norms of gender and sexuality, and a host of health-related issues. Strikingly, these panics frequently tell us more about the structure and power of the media than they do about the issue supposedly causing panic. The model of a media panic discussed below owes much to theories of social panic and there are some limitations to seeing media interventions in social problems in these terms. After considering the elements of media panics, there is a discussion of the limits of the model.

Media panics usually start in the news. Broadsheet newspapers, television news, radio news sources, tabloids, and Internet news sources describe periodic or new phenomena as systemic disorders that are worthy of sustained audience and media attention and that need some sort of urgent intervention. As such, reports of phenomena such as crime statistics articulate with traditional news values. Most crucially, to be news it must be ‘new.’ However, crime is not a new phenomenon and crime statistics are analyzed periodically. The ‘newness’ of the statistics in combination with typical media claims that crime is ‘getting worse’ creates a message of a systemic social disorder. Adding to the gravity of this message, the claim that there is a ‘systemic disorder’ produces a contradiction between our knowledge of crime as a persistent phenomenon and the claim to its newness. To resolve this contradiction, new categories of crime or deviance are created and audiences are morally sensitized to these new forms of crime; in short, new categories are created for audiences to worry about.

In addition, the media, needing to direct and hold audience attention, selectively engage events in order to show their most extraordinary, most worrisome, or otherwise unique features. Thus, violent crimes, bank robberies, and ‘moral deviance’ receive more news attention than white-collar crime or domestic violence. Reports of extraordinary or worrisome events are usually coupled with a generalized claim that these remarkable events ‘could affect you.’ This conforms to the news value that news events should be audience relevant. The contradictory messages produced by claiming both the remarkable and the everyday can lead to audience confusion, apprehension, annoyance, and alleged social and moral panic. The structural features of a ‘media panic’ include a news item and statistics or narratives that support the

Ross K (1997). ‘But where’s me in it? Disability, broadcasting and the audience.’ Media, Culture and Society 19(4), 669–677.
Ross K (ed.) (2001). Black marks: research studies with minority ethnic audiences. London: Ashgate.
Said E (1995). ‘Orientalism.’ In Ashcroft B, Griffiths G & Tiffin H (eds.) Post-colonial studies reader. London and New York: Routledge.
Shah A (2004). ‘Corporate influence in the media: media conglomerates, mergers, concentration of ownership.’ http://www.globalissues.org/HumanRights/Media/Corporations/Owners.asp.
United Nations (1966/1976). ‘International covenant on economic, social and cultural rights.’ Geneva: United Nations. http://www.unhchr.ch/html/menu3/b/a_cescr.htm.
van Dijk T A (ed.) (1985). Discourse and communication: new approaches to the analysis of mass media discourse and communication. Berlin and New York: W. de Gruyter.
van Dijk T A (ed.) (1991). Racism and the press. New York: Routledge.
van Dijk T A (ed.) (1993). Elite discourse and racism. Newbury Park, CA: Sage.
Walter J (ed.) (2002). Racism and cultural diversity in the mass media: an overview of research and examples of good practice in the EU Member States, 1995–2000. Vienna: European Research Centre on Migration and Ethnic Relations (ERCOMER) on behalf of the European Monitoring Centre on Racism and Xenophobia, Vienna (EUMC).
Whillock R K & Slayden D (eds.) (1995). Hate speech. Thousand Oaks, CA: Sage.
Zilber J & Niven D (2000). ‘Stereotypes in the News.’ Harvard International Journal of Press/Politics 5(1), 32–50.

Media Panics

J Leach, University of Queensland, Brisbane, QLD, Australia
© 2006 Elsevier Ltd. All rights reserved.

Cultural panics have existed as long as culture itself. Herodotus reports that the citizens of Asia Minor panicked when rumors that their wells had been poisoned by the Greeks surfaced in the 7th century B.C.E. The rise of the mass media from the 17th century onward, however, meant that panics could be monitored, contained, and, most importantly, created by the media itself. In the 20th century, the introduction of social statistics and the ever-increasing amount of scientific information available for public scrutiny encouraged moral panics over increasing crime, environmental degradation, changing social norms of gender and sexuality, and a host of health-related issues. Strikingly, these panics frequently tell us more about the structure and power of the media than they do about the issue supposedly causing panic. The model of a media panic discussed below owes much to theories of social panic, and there are some limitations to seeing media interventions in social problems in these terms. After considering the elements of media panics, this article discusses the limits of the model.

Media panics usually start in the news. Broadsheet newspapers, television news, radio news sources, tabloids, and Internet news sources describe periodic or new phenomena as systemic disorders that are worthy of sustained audience and media attention and in need of some sort of urgent intervention. As such, reports of phenomena such as crime statistics articulate to traditional news values. Most crucially, to be news it must be ‘new.’ However, crime is not a new phenomenon and crime statistics are analyzed periodically. The ‘newness’ of the statistics in combination with typical media claims that crime is ‘getting worse’ creates a message of a systemic social disorder. Adding to the gravity of this message, the claim that there is a ‘systemic disorder’ produces a contradiction between our knowledge of crime as a persistent phenomenon and the claim to its newness. To resolve this contradiction, new categories of crime or deviance are created and audiences are morally sensitized to these new forms of crime; in short, new categories are created for audiences to worry about.

In addition, the media, needing to direct and hold audience attention, selectively engage events in order to show their most extraordinary, most worrisome, or otherwise unique features. Thus, violent crimes, bank robberies, and ‘moral deviance’ receive more news attention than white-collar crime or domestic violence. Reports of extraordinary or worrisome events are usually coupled with a generalized claim that these remarkable events ‘could affect you.’ This conforms to the news value that news events should be audience relevant. The contradictory messages produced by claiming both the remarkable and the everyday can lead to audience confusion, apprehension, annoyance, and alleged social and moral panic.

The structural features of a ‘media panic’ include a news item and statistics or narratives that support the claim that this news item is relevant, even worthy of urgent attention, for general audiences. This news item must then be articulated to other issues of public concern or labeled extraordinary and construed as deviant and needing public scrutiny or action. The ‘panic’ of a media panic emerges from a series of contradictions about newness, newsworthiness, relevance, and deviance. Finally, the audience should not be left in a state of panic or anxiety. Socially sanctioned experts typically set out, again in the media, a range of options for audiences to intervene in the issue. This might include the media source exhorting audiences to ‘be aware,’ ‘seek out more information,’ ‘call their local politician,’ or even to buy a certain product to reduce confusion or panic. The ultimate intervention in media-generated panics is legislative change by political, policy, or judicial systems against social actors that have been labeled as deviant or part of the problem.

The events that cause media panic are rarely ‘made up’ or entirely fictional. Rather, the media directs attention to phenomena and links these phenomena with troubling public issues or labels certain social actors as ‘deviant’ and thus worthy of public scrutiny and intervention. The global HIV/AIDS pandemic provides a striking example. The emergence of the disease worldwide is an actual event that has killed over 14 million people. The media, drawing attention to the unknown cause of the disease and its debilitating and wasting effects, made the case that HIV/AIDS was worthy of audience attention. Coupled with statistical evidence that the death toll was rising and the disease was associated with sexual practices and drug use that the media deemed deviant, the disease received more press than other emerging pandemics such as tuberculosis. The ensuing panic was both ‘moral’ and media generated. Deviance was created in relation to the disease and activists and educators have had to work against this initial media-generated definition. The crucial issue is not that the HIV/AIDS pandemic is not real or serious; the issue is that the media focus attention and create worry about particular aspects of the pandemic with moral overtones. Thus, ‘media panic’ about HIV/AIDS sanctioned homophobia and a distrust of certain immigrant groups. The appropriate intervention, given this framing of the issue, is avoidance and suspicion of particular social groups.

Media panics demonstrate the power of the media in setting agendas and framing social issues. The use of statistical and scientific evidence, as well as scientific and medical expertise, has become increasingly important to media sources and plays a larger role in media panics. Drug panics, for example, are recurrent in the media as different recreational drugs, from cocaine to ecstasy to ‘chroming’ (sniffing fumes from paints), are introduced to audiences and their effects are discussed by experts. While experts in the recent past have included politicians, stakeholders, religious leaders, and other ‘right-minded’ figures, medical or health-related issues are frequently dissected by physicians or experts with academic or professional scientific qualifications. This adds another level of media engagement with audiences. Not only do audiences need to do something urgently, there is also more that they need to know. That knowledge can also come from experts introduced in the media.

In the drug panic example, concerned parents might be alerted to a new threat by a news story. This may be followed up by a news magazine program where doctors and scientists discuss the possible physiological effects of drug use and the concurrent social behaviors parents might observe in drug-taking children. In addition, there might be statistical evidence of ‘at-risk’ populations and experts to comment on ways of lowering that risk. There also might be experts to advise parents on the best strategies of intervention. While all of this works to raise rather than lower anxiety, a wide body of expert knowledge comes to be needed before the audience – here parents – can take action. There are contradictions here also. Experts frequently disagree on the causes and risks of social behaviors and even on the results of scientific studies. Audiences knowing more, then, becomes a difficult proposition among a range of opinions, interpretations, and ambiguities. This can further support the role of the media as audiences look for clarification or it can turn audiences away as they throw their hands up in the face of so much conflicting information. It is clear, however, that the rise of expert culture has perpetuated and supported a number of media panics.

The rise of global media systems marks an unprecedented centralization of media attention and the potential to create ‘panics’ in large audiences. News sources can focus mass audience attention in more precise ways and create concerns that run across national boundaries. Creating concern or ‘panic’ works in the media’s interest as concerned audiences are consistent viewers and consistent viewers help media companies sell advertising. Panic is good for the media system; often the media will even attempt to create panic about the possibility of media panics, raising concern about the media’s own ability to raise concern. Media panics are also useful for the state insofar as deviance can be defined and ‘rooted out’ or ‘policed’ by concerned members of the public, thus creating an opportunity for audiences to feel empowered even when the terms of that empowerment are defined in the media. Media panics may also reflect a trend in dominant cultures to be concerned with their own plight. In this way, media panics are symptoms of a larger crisis of a ‘risk society.’ On this account, the various plague, crime, and moral downfall scenarios being circulated in media systems cause panic as a culture becomes increasingly aware of its potential for self-annihilation through nuclear war, environmental degradation, social exclusion, and pandemics of disease.

However, the model described above points out certain features of media panic and ignores others. As the issues raised are not fictional, what would proper media treatment of important social issues look like? How would they properly emerge in a way not to cause panic? The features outlined above also give the impression that media panics follow a formal road map. In actuality, they differ substantially; the origins and resolution of HIV/AIDS panics around issues of sexual orientation only bear some resemblance to the issues of class and race that motivated drug panics. Finally, the term ‘moral panic’ itself has been used to political ends in the media. At the end of the 1990s, sex abuse scandals within the clergy were making worldwide headlines. Members of the clergy defended themselves by accusing the media of creating panics, thus attempting to diffuse audience attention and interest. This last feature of ‘media panics,’ the ability of the media to create panic and, by that very act, to be open to the criticism of causing undue panic, demonstrates both the power of the media to control social agendas and the power of audiences to resist and restructure media agendas.

See also: Media and Language: Overview; Media and Marginalized Groups; News Language; Television: Language.

Bibliography

Beck U (1992). Risk society: towards a new modernity. New York: Sage.
Behlmer G K (2003). ‘Grave doubts: Victorian medicine, moral panic, and the signs of death.’ Journal of British Studies 42, 206–235.
Bessant J (2003). ‘Stories of disenchantment: supervised chroming, the press and policy-making.’ Media International Australia incorporating Culture and Policy 108.
Chiricos T, Eschholz S & Gertz M (1997). ‘Crime, news and fear of crime: toward an identification of audience effects.’ Social Problems 44, 342–357.
Cohen S (1972). Folk devils and moral panics: the creation of the Mods and Rockers. Oxford: Martin Robertson.
Cohen J & Richardson J (2002). ‘Pit bull panic: media coverage of a dog breed.’ Journal of Popular Culture 36, 285–317.
Hall S, Critcher C, Jefferson T, Clarke J & Roberts B (1978). Policing the crisis: mugging, the state, and law and order. London: Macmillan.
Hier S P (2002). ‘Raves, risks and the ecstacy panic: A case study in the subversive nature of moral regulation.’ Canadian Journal of Sociology 27, 33–57.
Humphreys M (2002). ‘No safe place: disease and panic in American history.’ American Literary History 14(4), 845–857.
Kroker A & Kroker M (1989). Panic encyclopedia. New York: St. Martin’s Press.
O’Sullivan T, Fiske J & Saunders D (1983). Key concepts in communication and cultural studies. New York: Routledge.
Quigley M & Blashki K (2003). ‘Beyond the boundaries of the sacred garden: children and the Internet.’ Information Technology in Childhood Education Annual, 309–317.
Robbins T (1994). ‘Satanic panic: the creation of a contemporary legend.’ Sociology of Religion 55, 373–375.
Staff (2002). ‘Outbreak of legionnaires’ disease in the United Kingdom.’ British Medical Journal 325, 1033.

Media, Politics, and Discourse: Interactions

B Busch, University of Vienna, Vienna, Austria
© 2006 Elsevier Ltd. All rights reserved.

Media texts are frequently used as corpora in linguistic analysis that addresses politics- and policy-related questions such as social inclusion and exclusion, stereotypes, and constructions of national or ethnic identities (Wodak and Busch, 2004). For instance, over 40% of the papers published in the journal Discourse & Society are based on media texts (Garrett and Bell, 1998: 6). There are different reasons for this: given their public nature, media texts are easily available and accessible, offering even a historic dimension through archives. Media texts can be assumed to have an impact as they address and reach multipliers or a more general large public. It can be assumed that they receive attention from their audiences, as their reception is voluntary. Media texts, like other texts in the public domain, provide discursive and linguistic resources that can be seen as an authoritative voice (Bourdieu, 1982). Although news is also one of the most widely studied media forms in media studies, the connection between media and politics has not been sufficiently investigated (Fiske, 1987: 281), and no coherent theory that integrates media theories, political theory, and social change has been developed so far.

The first part of this article discusses the correlations between media and politics. The historical perspective on developments in Western Europe shows a change of paradigms that becomes visible in the media order, in conceptions of the public sphere, and in media theory more generally. The second part focuses on approaches to the analysis of political discourse in the media. Given that the political field and the media field both undergo a process of rapid change, a flexible framework is needed, which is not based on fixed categories such as ownership or media genre and which establishes a close connection between media texts and contexts of production and reception. The article therefore foregrounds approaches developed within critical discourse analysis (CDA).

Media, Politics, and the Public Sphere

The post-World War II media order in Western Europe was characterized by a strong public service broadcasting sector that aimed at reinforcing a unified national public sphere and was therefore protected by broadcasting monopolies. While print media were differentiated according to ideological, class, regional, and language criteria and therefore reached only segments of the national public sphere, the public broadcasting service addressed the nation as a whole. The public service saw itself as a kind of classroom for the nation, which it served, represented, and constituted. National broadcasting could create “a sense of unity – and of corresponding boundaries around the nation,” “turn previously exclusive social events into mass experiences,” and “link the national public into the private lives of citizens” (Morley, 2000: 107). Within the public service the state language was seen as a means of strengthening national identities. In the case of multilingual states, such as Switzerland or Belgium, this led to the creation of parallel systems in the respective languages within the national broadcasting order. Imagining the audience as a national community and structuring communication in a unidirectional, paternalistic, top-down manner also determined language practices that emphasized the cultivation of a ‘pure’ standard language. This became apparent not only in language practices excluding ‘deviant’ accents, but also in ideologically loaded metalinguistic discourses.

Early mainstream theories in media sociology were based on the political philosophy of the Enlightenment and viewed the individual as an actor in fulfilling the social contract that grounds civil society. The ideal of media as the fourth estate, as an autonomous controlling institution that faces the legislative, the executive, and the judiciary, presupposes that there is a relationship of autonomy and distance between the media and the political sector. Such approaches to media and politics tend either to conceive of media as an omnipotent manipulative force or to relativize the power of media with respect to politics. In both, communication is conceived as a linear process, in which the transfer of meaning from the sender to the receiver is the central concern. Such approaches fail to locate media, individual actors, and audiences within conflicting fields of interest and power relationships, which are in turn constitutive for the formulation of subject positions (Mattelart, 1999: 94).

Habermas’s discourse-based concept of the public sphere defines the public sphere as the mediating institution between society and the state. Media fulfill a dual role: on the one hand they are a commodity depending on the rules of the media and the advertising market; on the other they structure social relationships. In the first phase of the constitution of bourgeois nation states, media contributed to the development of the public sphere, whereas in a later phase – because of commercialization and the intimate linkage between the state and the economy – they contributed to its erosion. Habermas’s model, originally published in 1962, was based on the assumption of a single unified (national) public sphere. Critiques of Habermas’s model were formulated within feminist studies (e.g., Young, 1987) and later also by scholars addressing the question of minority exclusion (e.g., Morley, 2000; Husband, 2001) or language as a factor of such exclusion (Busch, 2004). Habermas (1990) himself revised his model and conceded that he had neglected the existence of counter discourses, counter publics, and counter cultures. Drawing upon Foucault’s notion of discourse, he recognized that the exclusion of the ‘other’ is a constitutive element in the formation of the public sphere and that the formation of dominant discourses is based on mechanisms of exclusion. Nevertheless, suppressed discourses can have a transformative impact on dominant discourses.

Struggle of Discourses

In the 1960s and 1970s, various processes of differentiation transformed the media landscape. All over Europe, media became a central topic on the political agenda. The state monopoly on broadcasting as well as the commercial media trusts that gained power in the concentration process were criticized for excluding a range of social groups and deviant political positions. In other words, social movements claimed an adequate representation within the national media systems. To achieve this aim, social movements largely had to rely on their own media such as posters, leaflets, and alternative print media to express alternative or counter opinions in public. At the same time, public service broadcasting differentiated its programs according to different styles and tastes. Alongside the nationwide programs, specific regional programs gained in importance. On the political level, regionalization in the media corresponded to a revalorization of regional autonomy and of regional languages. The idea of cultural homogeneity conceived as masculine, white, and monolingual lost its impact, as counter discourses challenged the dominant views and hegemonic social relations.

The main challenge for the public service media stemmed from the pressure of private media enterprises. It was the opposition of public versus private that dominated media policies and finally led to the abolition of broadcasting monopolies in Western Europe. So-called liberalization at first concerned only radio, which had already ceded its function as the leading medium to television in the 1960s. In commercial media, the information aspect became subordinated to the entertainment component.

Technology-driven media theories, which prioritize aspects of form and format over aspects of content, attribute to the medium as such the leading role in shaping and transmitting a message. This approach became popular with McLuhan’s succinct formula “the medium is the message,” which marks a break from concepts that ascribe a preeminent power in influencing public opinion to media, both in an educational and in a manipulative way. It neglects, however, the question of how technological developments are appropriated socially and culturally. Discourse-oriented approaches to society and politics as developed by Habermas, Foucault, or Bourdieu have had a strong impact on media studies and theories. This shift of perspective emphasizes social relations and cultural factors and makes it possible to reveal hegemonies and power relationships. In this sense, media can be seen as sites of discourse struggles.

Fragmentation and Reconfiguration of Media Spaces

In the 1990s, the process of decentering the national public sphere gained momentum. The availability of satellite and cable TV and later of the Internet has led to a further fragmentation of media audiences, a reconfiguration of media spaces, and an increasing multidirectionality of communication flows. Two distinctive developments characterize the new kinds of transnational broadcasting spaces and new transnational configurations of culture: the emergence of global broadcasting regions, which link populations of neighboring countries on the basis of proximity, common cultural heritage, language, or ethnic identity, and the creation of new diasporic broadcasting spaces, which gather into a single audience different national communities scattered across the world (Robins, 1997: 15–16).

In the broadcasting sector, the media landscape becomes complex and manifold. Alongside the traditional national public service, not only global players emerge but also a range of local media that themselves tend to have various translocal and transborder connections. While commercial interests are constructing a transnational space as a by-product of their primarily economic motives, nonprofit ventures seek to empower local communities, marginalized populations, and civic activists. Media formats and genres developed by public service media change in nature. Genres and text categories that used to be separated are blurred, and new genres appear (e.g., infotainment, edutainment, reality soaps). Political discourse is not confined to the information genre, but also has its impact in the entertainment sector.

This development in the media field corresponds to a decentering of the nation-state at the political level. Under the pressure of an increasingly globalized economy, the state gradually loses its central role as an organizing principle in society and delegates former core competencies, on the one hand, to institutions and expert committees on a supra-state level, and on the other, to a substate level of regions and communes as well as private bodies or NGOs (Castells, 2003). Neoliberal discourses are replacing the former vision of the welfare state. The deregulation of the media system and the weakening of state influence on the media order potentially provide the media with more space for acting in their own right.
In the current debate on the desideratum of a European public sphere, it becomes obvious that the nation state with its public arena is not withering away yet. At the same time, shortcomings already inherent to the notion of a national public sphere – exclusions and fragmentations – are becoming more accentuated. Husband (2001) draws attention to the necessity of reflecting on possible interfaces between parallel and mutually exclusive public spheres in order to guarantee some social coherence. Whereas political discourse could easily be tied to (national) discourse communities within the nation state paradigm, it seems to be becoming more multilayered now.


Approaches to the Analysis of Political Discourse in the Media

In the academic field, certain assumptions that were taken for granted within the nation-state paradigm are increasingly submitted to revision. In retrospect, media are acknowledged to have played a central role in the formation of the modern nation-state as well as in the standardization and homogenization of national languages. Present media developments, however, also seem to allow more space for nonstandard linguistic practices, multilinguality, and multivoicedness (Busch, 2004).

The present trend in approaches to media texts can be characterized in both disciplines, in media studies as well as in linguistics, by a shift away from text-internal readings, where readers are theorized as decoders of fixed meanings, to more dynamic models, where meanings are negotiated by actively participating readers. Consequently, there is a stronger emphasis on processes of transformation of discourses, on recontextualization, and on intertextuality. Nevertheless, studies that focus on production-related transformations are still scarce. Although the reader has been attributed an active role in the constitution of meanings since Stuart Hall’s (1980) influential model of encoding and decoding, studies that combine the moment of the media text and its reception are not very common.

Rapid changes within the media system and within the political order demand an analytical framework for the study of political discourse in the media that is flexible and sufficiently open. Critical discourse analysis (CDA) provides such a framework, as it combines in an interdisciplinary approach close textual analysis with the analysis of the larger social context as well as with routines of textual production and reception. CDA foregrounds a dual focus in the approach to political discourse in the media: the detailed analysis of communicative events and the order of discourses, i.e., locating these communicative events “within the fields of social practice and in relation to the social and cultural forces which shape and transform those fields” (Fairclough, 1998: 143). This involves the analysis of texts, of discursive practices of text production, distribution, and consumption, and the analysis of social and cultural practices that frame discourse practices and texts (Fairclough, 1998: 144).

The Field of Politics and of the Media

To capture how the political and the media domain are articulated and how media discourses and political discourses are interlinked, it is useful to draw upon the notion of different fields that characterize modern society as developed by Pierre Bourdieu (1982). According to Bourdieu, the separate fields of the economy, the state, the legal system, the arts, politics, the media, etc. are each marked by their own particular form of institutionalization and by their own social and discursive practices.

The political field has undergone significant changes that led to an increasing specialization and professionalization of the involved actors. Political discourse manifests itself, among other settings, in parliamentary debates, political meetings, party conferences, and public debates. A certain amount of political discourse is designed from the outset to be reported and represented in the media. This is, for example, the case for parliamentary speeches in which politicians often address not only the assembly but also a larger general public, as they anticipate mediatization. Referring to the field of politics, Bourdieu suggests that political discourse is doubly determined, on the one hand internally within the field of professional politics and on the other externally, i.e., in relation to the people politicians represent as well as in relation to other fields, the media field being one of them.

The field of journalism and the field of the media are determined by power relations and competition between different media, which become visible through certain indicators such as the market share, the value on the advertising market, and the symbolic capital of prestigious journalists. The importance of the media field in relation to other fields, such as the field of politics, is that it holds a sort of monopoly on the means of production and on large-scale distribution of information. Political discourse and action are subjected to the principle of selection exerted by journalists and the media within the logic of the journalistic field, which is in turn dominated by market competition (Bourdieu, 1996: 45ff).
In order to escape possible censorship that results from the selection process in media production, political actors develop strategies of presenting their concerns in forms that are likely to become media events or stories with news value. Political actors adapt their agenda and style to the requirements of media presence (e.g., short statements, studied gestures, hairstyle) and of media formats (live debates, talk shows). Both fields, the political and the journalistic, have in common that they are directly under the influence of the sanctions of the market and of plebiscite. According to Bourdieu (1996: 92), the linkage between the two fields amplifies the tendencies of the agents involved in the political field to act according to pressures exerted by the expectations of a mass public and reduces the autonomy of the political field. In the media field, competition between different media enterprises has become more accentuated in the past few years. Since the fall of the state monopolies, the public service broadcasting cultures have also changed significantly. Even genres that used to be relatively stable, such as the national news programs on television, have experienced fundamental transformations. For instance, there seems to be an acceleration of pace, a compression of time, in which live broadcast is given priority over in-depth investigation, and short and diversified news items are combined into a more or less connected rapid succession of news flashes.

From the perspective of CDA, the relationship between the field of politics and the field of media can be understood and analyzed as a chain of recontextualizations. Writing or speaking about any social practice is already an act of recontextualization (Caldas-Coulthard, 2003: 276). Recontextualization can involve suppression and filtering of meaning potentials, but it can also result in expanding meaning potentials by adding to or elaborating upon an earlier version of the text (Chouliaraki and Fairclough, 1999). Van Leeuwen and Wodak (1999: 96) suggest that transformations due to the recontextualization of political discourse include deletion, rearrangement (e.g., changing the order of propositions, altering emphasis), substitution (through linguistic means such as nominalization, metaphor, metonymy, synecdoche, personalization), and addition (adding new elements to the representation of social practices). In this sense, recontextualization can in turn have a transformative effect on a particular practice or create a new practice. Linking the fields of politics and media, recontextualization takes place in two directions: media recontextualize political discourse (e.g., speeches) and politicians recontextualize political discourse derived from media (e.g., quoting media as public opinion, as the voice of the common man in the street).
In the process of recontextualization, discourses may also be transposed to a more legitimate context (e.g., from an informal context to the more formal context of parliamentary speech) and thus gain in power and status. Legitimation of a text can also be gained through reference to authority, reification, moral evaluation, or through reference to “true stories from real life.” Authority can be gained through reference to law, a person with recognized authority, or an institution. On the other hand, laws in democratic societies often emerge from chains of political discourses in which media have their place (Wodak, 2000).

Analyzing Media Texts

With the rapid and dramatic changes in the media sector, obvious categorizations of media with regard to their area of dissemination (national, local, international, etc.), ownership structures (public service, private), or orientation (information, entertainment, education) are becoming more and more difficult. The formerly propagated separation between information and entertainment/edification can hardly be maintained, as new genre names such as infotainment or reality soap suggest. Political discourse does not only appear in news or other information programs, but also in formats formerly reserved to other fields, such as talk shows. Pinning down the language of news programs or the language of political media discourse is becoming all but impossible. The following part of this article therefore presents elements for an open and flexible framework for the analysis of (political) discourse in media.

Imagining the Audience

Who is the addressee and how is the relationship with the audience structured? These are central questions in the analysis of media texts. It is in fact the structuring of the producer–audience relationship and the ways in which audiences and their expectations concerning texts are imagined that determine how a text is shaped. Genres and text categories depend on this aspect. The notion of the target audience, which encompasses a spatial (local, regional, national, global) and/or a social (social status, income, age, gender) dimension, is based on rigid and reified audience categories. Research on media coverage and definitions of target audiences are instruments of marketing research and correspond to criteria established by the advertising industry. Ang (1991) demonstrates that this approach is based on a discursive construct of audience that is unable to grasp the actual relationship between media and audiences and to conceive communication processes. She distinguishes between two main orientations: audience-as-public and audience-as-market. The first is generally associated with the public service media sector, in which the addressee is seen as a citizen (of a state); the relationship with the audience is paternal and aims at transmitting values, habits, and tastes. It is linked to the so-called transmission model of communication, in which the transmission of a message and the ordered transfer of meaning is the intended consequence of the communication process. The second configuration of audience is associated with the private commercial media sector. Audiences are addressed as consumers in a double sense: as consumers of the media product and as potential consumers of the products advertised in the programs. In the attention model of communication (McQuail, 1987), communication is considered successful as soon as attention is actually aroused in audiences. The transfer of meaning plays a secondary role. The scoop, the extraordinary, and the scandal gain in importance as means of awakening attention.

In the alternative media sector, the conception of the audience is determined by the idea of an active public that participates in social action and media production. The aim is to overcome the division between producers and audiences, to move closer to a situation in which the Other is able to represent itself, in which the heterogeneity of “authentic informants” is not reduced (Atton, 2002: 9). Alternative or third-sector media are consequently closer to the ideal of representing the multivoicedness of society in all three dimensions which Bakhtin (Todorov, 1984: 56) has described: heterology (raznorečie), i.e., the diversity of discourses, heteroglossia (raznojazyčnie), i.e., the diversity of language(s), and heterophony (raznoglossie), i.e., the diversity of individual voices.

These different basic orientations in conceiving the producer–audience relationship result in preferences for particular media formats (e.g., authoritative information-centered programs, infotainment programs, and dialogic forms such as phone-in programs) and a choice of particular linguistic practices. They also determine the way in which discourses are shaped, reproduced, and transformed. It has, for example, been observed that the public service sector, at least in some segments of its programming, is becoming more market-oriented and that formats and genres developed in a certain sector are taken up – sometimes in a transformed way – by others.

Modalities and Meanings

Media communication is inherently multimodal communication: this means that language in written and spoken form is one of several modes available for expressing a potential of meanings. For instance, in print media layout and image are available in addition to the written word; in radio, language is present in its spoken form, alongside music and different sounds; in television all the aforementioned modes can be drawn upon in a context in which the moving image holds a central position. Similarly, in computer-mediated communication, a wide range of modes is available. “A multimodal approach assumes that the message is ‘spread across’ all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message. All modes, speech and writing included, are then seen as always partial bearers of meaning only. This is a fundamental challenge to hitherto current notions of ‘language’ as a full means of making meaning” (Kress, 2002: 6).

How these modes interact is not only a question of technical availability, but rather a question of social appropriation and convention, as Kress and van Leeuwen (2001) point out in their multimodal social semiotic theory. The interplay between the different modes has undergone substantial changes in media history. Writing was considered in many cultural environments as the central mode for the transfer of canonical knowledge and authoritative discourse. This predominance of the written text influenced radio production, so that practically all radio texts in the early days of the medium were produced first in written form and then read out in the broadcast. Even in television, news broadcasts were for some time read without transmitting the image of the speaker, as it was considered that the moving image could distract attention. Linguistic practices and text genres from established media have exerted, and still exert, a considerable influence on new media and vice versa. Gradually, with television, the image has moved into a central position. This also has an impact on political discourse, where appearances and the visible mise-en-scène of political events are becoming more important and must be considered part of the political discourse. Live broadcasts of TV debates between representatives of different political orientations have, for instance, become central events in elections. Computer-mediated communication (e.g., Internet, e-mail) as well as other forms of new communication media (e.g., mobile phone news services) are increasingly present in the political field. The conversationalization of (political) discourse in the media (Fairclough, 1995: 9–10) has already gained momentum with the image and television; possibly the new media will contribute to accelerating this development.

Text and Context

Media production is regulated by institutional routines, and media reception by everyday practices and arrangements, both depending on available resources. The production of media texts can be seen as a series of transformations, a chain of communicative events that link sources in the public domain to the private domain of media reception (Fairclough, 1995: 48–49). Media production also encompasses the collection and selection of raw material. At each stage in media production, earlier versions of the text are transformed and recontextualized in ways that correspond to the priorities and goals of the current stage. In a multimodal context, different modes can become separated; for example, images can be subtitled with other texts or associated with other contexts.

During the production process, journalists can draw on different kinds of source material: political speeches, interviews, items already preprocessed by news agencies, press releases, archives, other media texts, etc. Current transformations in media production can be characterized on the one hand by an increasing specialization of journalists in narrower fields of reporting (specialization in topics, geographical areas, etc.) and on the other hand by a decreasing division of labor between the technical and journalistic parts of production. The journalist is not only responsible for the text, but also for the layout, the selection of images, and even parts of the technical production. Replacing the typesetter, the reader and/or the sound technician, the journalist becomes the designer of a multimodal text. At the same time, because of the economic imperative of reducing the fixed costs in media enterprises, the amount of genuine journalistic investigation decreases in favor of ready-made products such as news agency material and preproduced elements and formats. Journalistic work is becoming more and more a matter of selection rather than of investigation and news production. This process is encouraged by an oligopolistic ownership structure and practices of cross-referencing between different media. In relation to political discourse, this means that certain topics can have sometimes quite unexpected media careers and become dominant in certain media areas (Siegert, 2003). Another aspect frequently referred to in the context of media and political discourse is the acceleration of the rhythm of news production, especially with the rise of electronic media, which in turn has an accelerating influence on the political agenda. That the political field responds to rhythms of urgency was already noted by Plato, who drew attention to the differences between the actors in the agora (the public space, the political arena) and philosophers, who obey other rhythms.
No doubt political and media agendas and discourses influence each other in multiple ways. Whereas earlier theories assumed a predominance of politics over the media or vice versa, contemporary approaches foreground interdependencies and interactions between the political and the media fields.

See also: Language Politics; Media: Analysis and Methods; Media and Language: Overview; Media and Marginalized Groups; Media: Semiotics; Modes of Participation and Democratization in the Internet; News Language; Newspeak; Radio: Language; Society and Language: Overview; Television: Language; Word and Image.

Bibliography

Ang I (1991). Desperately seeking the audience. London: Routledge.
Atton C (2002). Alternative media. London: Sage.
Bourdieu P (1982). Ce que parler veut dire. L’économie des échanges linguistiques. Paris: Fayard.
Bourdieu P (1996). Sur la télévision. Paris: Liber-Raisons d’agir.
Busch B (2004). Sprachen im Disput. Medien und Öffentlichkeit in multilingualen Gesellschaften. Klagenfurt: Drava.
Caldas-Coulthard C R (2003). ‘Cross-cultural representation of “Otherness” in media discourse.’ In Weiss G & Wodak R (eds.) Critical discourse analysis, theory and interdisciplinarity. Basingstoke: Palgrave Macmillan. 272–296.
Castells M (2003). Das Informationszeitalter III. Jahrtausendwende. Opladen: Leske + Budrich.
Chilton P (2004). Analysing political discourse. Theory and practice. London: Routledge.
Chilton P & Schäffner C (2002). ‘Themes and principles in the analysis of political discourse.’ In Chilton P & Schäffner C (eds.) Politics as text and talk: analytical approaches to political discourse. Amsterdam: John Benjamins. 1–41.
Chouliaraki L & Fairclough N (1999). Discourse in late modernity: rethinking critical discourse analysis. Edinburgh: Edinburgh University Press.
Fairclough N (1995). Media discourse. London, New York: Edward Arnold.
Fairclough N (1998). ‘Political discourse in the media: Analytical framework.’ In Bell A & Garrett P (eds.) Approaches to media discourse. Oxford: Blackwell. 142–162.
Fiske J (1987). Television culture: popular pleasures and politics. London: Methuen.
Garrett P & Bell A (eds.) (1998). ‘Media discourse: a critical overview.’ In Bell A & Garrett P (eds.) Approaches to media discourse. Oxford: Blackwell. 1–21.
Habermas J (1990/1962). Strukturwandel der Öffentlichkeit. Frankfurt/Main: Suhrkamp. (English translation: (1989) Structural transformation of the public sphere. Cambridge: Polity Press).
Hall S (1980). ‘Encoding/decoding.’ In Centre for Contemporary Cultural Studies (ed.). Culture, media, language. Working papers in cultural studies 1973–1979. London: Hutchinson. 128–138.
Husband C (2001). ‘Über den Kampf gegen Rassismus hinaus. Entwurf einer polyethnischen Medienlandschaft.’ In Busch B, Hipfl B & Robins K (eds.) Bewegte Identitäten. Medien in transkulturellen Kontexten. Klagenfurt: Drava. 9–20.
Kress G (2002). ‘The multimodal landscape of communication.’ Medien Journal 4, 4–19.
Kress G & van Leeuwen T (2001). Multimodal discourse. The modes and media of contemporary communication. London: Arnold.
Mattelart A (1999). Kommunikation ohne Grenzen? Geschichte der Ideen und Strategien globaler Vernetzung. Rodenbach: Avinus Verlag.
McQuail D (1987). Mass communication theory: an introduction. London: Sage.
Morley D (2000). Home territories. Media, mobility and identity. London: Routledge.
Robins K (ed.) (1997). Programming for people. From cultural rights to cultural responsibilities. United Nations World Television Forum. New York, 1997. Geneva: European Broadcasting Union.
Siegert G (2003). ‘Im Zentrum des Taifuns: Die Ökonomisierung als treibende Kraft des medialen Wandels?’ Medien Journal 1, 20–31.
Todorov T (1984). Mikhail Bakhtin. The dialogic principle. Manchester: Manchester University Press.
Van Leeuwen T & Wodak R (1999). ‘Legitimizing immigration control: a discourse-historical analysis.’ Discourse Studies 1(1), 83–118.
Wodak R (2000). ‘Recontextualization and the transformation of meanings: a critical discourse analysis of decision making in EU meetings about employment policies.’ In Sarangi S & Coulthard M (eds.) Discourse and social life. London: Longman. 185–206.
Wodak R & Busch B (2004). ‘Approaches to media texts.’ In Downing J, McQuail D, Schlesinger P & Wartella E (eds.) Handbook of media studies. London: Sage. 105–123.
Young I M (1987). ‘Impartiality and the civic public.’ In Cornell D & Benhabib S (eds.) Feminism as a critique. Cambridge: Polity Press. 56–76.

Media: Analysis and Methods

J Thornborrow, Cardiff University, Cardiff, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The media provide a vast, continuous, and increasingly varied supply of linguistic data. The analytic methods which have been used to examine and explain these data are correspondingly diverse. From the early days of terrestrial television, with limited broadcasting hours, we have moved to 24-hour, multiple-channel satellite and digital broadcasting; radio stations on FM supply a wide choice of local and national programs; the Web has provided a whole new mode of electronic communication, including new ways of accessing print journalism, as most newspapers now publish on the Internet as well as in newsprint. ‘Interactive’ has become the digital buzzword for 21st-century media, as audiences of all kinds (readers, listeners, viewers) are invited to participate by phoning, e-mailing, texting (SMS messaging), and joining bulletin boards and Internet discussion groups. The ‘voice’ of the public is now as much a part of media discourse as the voice of the newscaster or commentator. How has this proliferation of communicative data been approached by discourse analysts and linguists? What kind of media texts have been the focus of analysis and what kind of questions do we ask about such texts? This article presents some of the main analytic approaches to media discourse over the last two decades, and gives a summary of the concepts and issues that have been most salient in this work. The first section concentrates on aspects of participation: work dealing with analyses of speaker roles, voices, and identity in media discourse. The second section outlines aspects of content, dealing with work which has focused on how the media construct and represent the world. The final section addresses the particular characteristics of selected media genres, summarizing those aspects of media discourse which have been considered as context specific, institutionally oriented, and, as such, analyzably different from other forms of text and talk.

Analyzing Participation in Media Discourse

A major concept within interactional sociolinguistics, based around Goffman’s notion of participation frameworks and ‘footing’ (Goffman, 1981), is the complex relationship between speakers, hearers, and the context of utterance. Goffman challenges the ‘speaker/hearer’ conduit models of communication and shows how these two categories can be described in far more contextually sensitive terms.

Participation Frameworks and Footings

Linguistic analyses of media discourse have drawn heavily on Goffman’s work, particularly in relation to the discourse of public participation broadcasting (talk shows, radio phone-ins, panel debates, etc.), where members of the public interact with media professionals and other institutional representatives. These programs set up particular configurations of participant roles that are institutionally determined and generically specific. So, for example, in a British TV current affairs panel debate such as the BBC’s Question Time, the host, the members of the panel, and the audience occupy different participant roles as the program unfolds. All are ‘ratified participants’ in the communicative event, but the relationship between their position as speaker or hearer, addresser or addressee, is both complex and shifting. In his analysis of indirectly targeted utterances in this TV panel discussion, Levinson (1988: 221) pointed out that “having a set of participant role categories is one thing, but working out who stands in which when can be quite another, on a vastly greater plane of complexity.” A microanalytic account which addresses this complex relationship in mediated interaction is Hutchby’s (1999) analysis of frame attunement and footing in talk radio phone-in call openings. Starting from Goffman’s notion of ‘mutually ratified participation’ in situated interactions, Hutchby described how callers move from a state of ‘incipient’ speakership to actual speakership and topic initiation in the first three seconds of these, usually routine, opening sequences, such as the following:

Host: Joan calling from Clapham now. Good morning.
Caller: Good morning Brian. Erm:, I (li) I also agree that the . . .
(Hutchby, 1999: 46)

He argues that the sequential organization of the call opening (i.e., host goes first, caller goes second, with a specialized distribution of turn types; see ‘Media Talk as “Specialized” Speech Exchange Systems’ below) is only partly how institutionality is achieved in this context. Temporality is also crucial here; Hutchby identifies the moment-by-moment shifts of footing hosts and callers move through (i.e., the greeting, ‘buffer’ [erm], and topic initiation) in order to effect a transition from the private sphere of the caller into the public sphere of the broadcast. In addition to providing a set of categories and a description of the range of participant roles in a TV or radio broadcast, the analysis of participant roles and turn-taking positions in these programs enables us to examine the relationship between the institutional and the communicative context for talk. An analysis of the participation framework also enables us to examine who typically does what in those roles, i.e., the type of turns taken by each participant. As an illustration, a popular radio phone-in format will typically involve a host, a studio guest, and a caller (as well as the wider listening audience). In their institutional roles, these participants do different things: the role of the host is to mediate interaction between caller and guest, while the caller asks the questions and the guest gives the answers. However, although the caller occupies the potentially powerful role of questioner, their right of reply is not given in this context; the third-turn receipt of the answer is generally taken by the host, who may or may not offer a further turn to the caller. This third turn is a host’s resource for controlling the trajectory of the talk (Thornborrow, 2002).

Participation and Identity in Public Participation Broadcasting

Often linked to the analysis of speaker roles, another analytic concept used to examine participation in media discourse is the use of the categories ‘expert’ and ‘lay.’ These terms describe the institutional and social status of participants, and have a bearing on the kind of talk that they will be invited to produce. In a discussion of the participation framework in Kilroy, a former British TV talk show, Livingstone and Lunt described the difference between expert and lay discourse. Talk by lay audience members is real, authentic, grounded in experience, ‘hot’; while expert talk is ‘cold’: artificial, fragmented, and ungrounded (Livingstone and Lunt, 1994: 102). Experts speak for others, while the audience speak for themselves. Again, using Goffman’s model of ‘footing’ to describe the relationship of a speaker to their utterance, it was claimed that a lay participant speaks from a position of ‘animator, author and principal,’ in their own voice, in their own words, with commitment to what they say, while experts talk as animator rather than author or principal, speaking with an institutional voice, for ‘the profession’ and not from personal experience or commitment. The theme of authenticity, and what is meant by an authentic voice in media discourse, has been taken up in Thornborrow and van Leeuwen (2001). Another way of thinking about the expert/lay dichotomy in public participation broadcasting is to look at how speakers identify themselves in this context. While some participants in these programs are identified by the host by name and by their institutional relevance to the topic at hand, so-called lay participants have to do this work for themselves. Callers to radio phone-ins, and participants in talk show debates, routinely construct a relevant situated identity in relation to what they have to say:

Caller: hello yes uh my question uh to the prime minister is on health .hh I’m a nurse in (.) a London teaching hospital and (.) my question is this [–]
(Thornborrow, 2001b: 464)

This process of ‘discursive grounding’ provides a relevant frame for their contribution to the talk. A similar kind of action can be found in the way callers legitimize their opinions and viewpoints through a process Hutchby (2001: 495) called ‘witnessing’: “they bring into play claims that are – or are assumed to be – incontrovertible, such as being a member of the category of pensioners, or having seen an event with one’s own eyes.”

Participation and Identity on the Net

The question of participant identity is also one that has been explored in relation to computer-mediated communication (CMC), where participation frameworks are configured differently again from broadcast radio or television media genres. It has been suggested that one of the primary features of Web-based communication is the option not to reveal aspects of one’s social identity. It is possible to conceal age, gender, ethnic background, and class – the traditional sociolinguistic ‘big four’ variables – and construct a persona which is free from visual or spoken identity markers. The advantages of Web-based communication have been studied from the perspective of people who suffer from some kind of impediment or disability in face-to-face, or voice-to-voice, communicative situations, and for whom interaction is facilitated through the written modes of e-mail and messaging (cf. Lupton, 2002). However, the construction of a virtual identity can clearly have negative effects when used to intentionally deceive others. In spite of the initial view that the Internet was going to release users from conventional social hierarchies and prejudices, research has shown that in CMC, people still make assumptions about who they are communicating with based on stereotypical interpretations of gendered behavior (cf. Deuel, 1996).

Analyzing Representation and Ideology in Media Discourse

Turning from participation in mediated interaction to the subject matter of media discourse, the analysis of content – particularly of print and broadcast news – was the focus of research in media studies begun in the mid-1970s by a group known as the Glasgow Media Group (1976, 1980). This concern with content as ideological representation was also taken up from a linguistic perspective in the development of a body of work known as ‘critical linguistics’ (cf. Kress and Hodge, 1976).

Critical Linguistics and Critical Discourse Analysis

In terms of media analysis, critical linguists began to look in detail at the linguistic structures in news reporting, particularly at the processes of grammatical and lexical selection and representation. Basing their claim on what has been described as a ‘weak version’ of the Sapir–Whorf hypothesis, they argued that ideological positions can to some degree be ‘read off’ from linguistic form. Linguistic processes such as transitivity, nominalization, and passivization, realized as systematic textual choices, present different ideological versions, or theories, of the world, and of events, actors, and actions in it (Fowler, 1991; Simpson, 1993). Driven to some extent by the political context at that time (in Britain this was characterized by monetarist economics, industrial unrest, and, as in the United States, hard-line nuclear defense policies), this work was developed in order to challenge and critique the dominant ideologies of Thatcherism and Reaganism in a range of textual genres, from media news reports to political speeches and policy documents (Chilton, 1988; Fairclough, 1995). Montgomery’s (1996) analysis of the news reports of the British coal miners’ strike of 1984–1985 showed how contrasting representations of what is ostensibly the same event can be produced through different selections of linguistic forms.

Key theories which have been elaborated by Norman Fairclough within the field of critical discourse analysis (CDA) are ‘synthetic personalization’ and the ‘marketization’ of discourse, the ‘commodification’ of social institutional processes such as education, and the ‘mediatization’ of politics and government. With the change in the political agenda brought about by the election of a Labour government in Britain in 1997, Fairclough (2000) turned the spotlight on the language of Tony Blair’s New Labour, analyzing the rhetorical construction of political spin: the control over the way politicians use language to communicate their ideas through the media.

The agenda of CDA, as it has been set out by Norman Fairclough (1995), Teun van Dijk (1988), and their colleagues across Europe over the last 10 to 15 years (Wodak and Meyer, 2002), is to provide a set of critical tools and analytic methods with which to challenge and critique ideological discourse practices.
Examples of this approach as it is applied to media texts can be found in van Dijk’s analysis of racist discourse in the press (1991) and in Ruth Wodak’s research into attitudes towards immigration, racism, and the European Union. Gunther Kress and Theo van Leeuwen (1998) also take a critical approach to media texts in their work on the semiotics of newspaper layouts. This is an analytic approach which takes into account the organization of not just the verbal, but also the visual aspects of mediated communication, and offers a multimodal analysis of the components implicated in the production of textual meanings.

The theoretical grounding of CDA and its methods are not without critics. There can be a tendency to overtheorize at the expense of paying close attention to empirical data, and a proliferation of technical terms that are not always clearly defined. As a methodological ‘toolkit’ for the analysis of media (or indeed other) texts, however, one of the clearest and most systematic accounts of the application of CDA to media texts can be found in van Dijk (1998). Here he laid out a set of analytic parameters (social functions, cognitive structures, and discursive expression and reproduction) and summarized the relationship between them, demonstrating how they can be used to unpack the ideologies which organize social attitudes and opinions in news stories.

Despite these criticisms, it is clear that what versions of discourse analysis taking a sociocritical approach to textual data can convincingly do is show the kind of underlying background assumptions and prevailing discourses which structure hegemonic social meanings in texts. As an example, Justine Coupland’s work on the representation of social groups in a range of advertising texts, from dating ads to face cream ads, focuses on the social construction of age and identity. Coupland uses the concept of commodification of the self to analyze the discourse of dating adverts (1996) and, more recently, to analyze the ideological discourse practices involved in advertising skin care products (2003), i.e., the presupposition that ageing is problematic. CDA has established itself as one of the principal methods for analyzing and critically evaluating the content of media discourse broadly understood, whether in spoken mode (see Fairclough, 1995, for analyses of popular television series in Britain such as Crimewatch UK or Medicine Now) or in the print media, from advertising to newspaper reporting.

Analyzing Generic Properties of Media Texts

The analysis of representational aspects of media discourse as outlined above is only part of the story. The analysis of media discourse as a particular form of talk has been the subject of a wide spectrum of research, which has focused on unscripted, broadcast talk. This is first and foremost ‘public’ talk (Scannell, 1991), since one of the most significant ways in which media talk differs from talk in many other nonmediated contexts is the fact that it is produced for an audience that is not copresent (often termed ‘the overhearing audience’). Most of the major analytic work which focuses on mediated talk as interaction has been undertaken in the field of conversation analysis (CA).

Media Talk as ‘Specialized’ Speech Exchange Systems

From the perspective of CA, the media have provided a highly fruitful source of institutional data. Examining mediated interaction as a context for talk which has a restricted and preallocated turn-taking system, and a specialized distribution of turn types, conversation analysts have been able to describe the organization of talk in media genres such as news interviews and public participation programs on both radio and television. The turn-taking system of a news interview is “a course of interaction” (Clayman and Heritage, 2002: 13) designed to maintain the audience as addressee. Specifically, news interview openings set agendas and regulate interviewees’ access to interaction, while closings are managed to limit abruptness within strict time limits. The turn-taking process involves one party asking questions (the interviewer) and the other responding to them (the interviewee); furthermore, it always falls to the interviewer to respond to an answer in third-turn receipt position, which enables them to do things such as reformulate an answer either positively, negatively, or aggressively. This speech exchange system is very different from conversational turn taking, where there is no preallocation of speakers or specialized distribution of turns.

Issues of agenda setting, agenda shifting, and neutrality in news interviews have also been addressed by conversation analysts: for example, the design of interviewer questions, and the various strategies deployed by interviewees in answering (or not answering) the question. Clayman and Heritage pointed out that interviewers “often work to place some degree of distance between themselves and their more overtly opinionated remarks” (2002: 152) and the most usual way of doing this is by attributing that point of view to a third party, as in the following example:

IR: .hhh People have used the phrase ‘concentration camps’: and the Bosnians themselves have used that phrase. Do you believe there’s any justification for that at all? (Clayman and Heritage, 2002: 153)

Neutrality is jointly produced by both interviewer and interviewee, and interviewees collaborate in maintaining the neutralistic stance by not challenging third-party assertions, although as Clayman and Heritage pointed out, they generally tend to refute them. Current styles of adversarial interviewing, and the increase of debate interviews where multiple interviewees take part in the interaction, usually with the intention of providing institutional ‘balance,’ are now also beginning to be investigated by conversation analysts as the genre changes and evolves.

It is not only the turn sequences of questions and answers in news interviews that have been shown to be contextually specific. This is also the case in other broadcasts such as radio phone-in programs, where callers can engage in various kinds of questioning activity. An example of how mediated interaction can be seen to differ from conversational interaction can be found in Hutchby’s (1995) analysis of the design of advice-giving turns in a radio phone-in program. In this context again, listeners have to be maintained as ratified participants in the talk event, as well as the host, expert adviser, and callers. The advice giver manages this by moving from the particular to the general, addressing not only the current caller with the problem, but all listeners who may also have the same problem, thus making their advice relevant for not just one addressee but for their wider audience. The design of talk for this ‘overhearing audience’ has been one of the central concepts in conversation-analytic approaches to media discourse, and it has also been taken up more generally by discourse analysts interested in the particular, public, nature of broadcast talk.

Media Talk as ‘Performance’

Drawing on the notion that mediated communication is public talk produced for the listening and viewing audience, a body of work has emerged which analyzes media discourse as ‘performance.’ With specific reference to television talk shows, Tolson (2001b) examined the ways in which interaction in these programs is ‘doubly articulated,’ i.e., designed for its immediate recipients as well as the studio and/or viewing audience. Focusing on discourse genres such as conflict talk, argument, narrative, and therapy talk, which characterize many TV talk shows, these studies show how unscripted talk produced in such contexts is nevertheless performed for its audience. Such talk can be considered as “a form of play with pragmatic expectations of conversational practice” (Tolson, 2001a). For instance, Myers (2001) examined how a topic is made into an issue on The Jerry Springer Show through four distinct stages: defining and representing stances, making those stances controversial, making them dramatic, then finally, making them meaningful. Wood (2001) looked at how Kilroy (former host of a British talk show of the same name) pursues an agenda through particular types of question and answer sequences which maximize conflict and enhance levels of ‘televisuality.’ She argued that in these shows it is not simply a case of lay speakers being given a voice in the sense that ordinary people are invited to talk about their everyday lifeworlds; rather that the talk is constructed and managed in order to produce debate and dissent – i.e., the pursuit and performance of conflict. Similarly, Thornborrow (2001b) showed how narrative discourse of lay participants is elicited and managed in order to maximize the most dramatic moments of a story for the audience, examining the role of the host as narrative dramatizer as well as narrative elicitor.

Media Discourse Genres: Some Examples

The fact that mediated communication gives rise to highly context-specific genres of discourse has also been addressed by sociolinguists interested in describing forms of talk generated by broadcasting institutions, from DJ talk to sports commentary. Many of these media discourse genres have highly distinctive linguistic and pragmatic features, not only in terms of their content, but also in terms of their register, modes of address, and deictic format.

DJ Radio Talk

In a seminal paper on DJ talk, following Goffman (1981), Montgomery (1986) produced a further challenge to the model of mass communication which understands the communicative event in terms of a single speaker addressing a mass audience. Basing his analysis on pragmatic concepts of deixis and speech act theory, Montgomery showed how monologic DJ talk foregrounds an interpersonal relationship between the DJ and his audience, continually shifting its mode of address between different constituencies in the listening audience. Interpersonal features of DJ talk include simulating copresence through the linguistic devices of social and spatial deixis, the use of speech acts that require responses, e.g., questions and directives, and the use of expressives, e.g., congratulation and commiseration. Montgomery also drew on Goffman’s concept of participation frameworks and footing in order to examine alignments between speaker (DJ) and an array of hearers, through a process of interpolation, e.g.:

Okay Fleet Street (they’re all awake now) I have news of a rock star (Montgomery, 1986: 437)

In this way, radio DJs constantly realign their talk to address different segments of the audience, from the individual to the collective, but very rarely to the general mass.

Mediated Narratives

Narrative discourse is pervasive in broadcasting. In addition to the fictional genres of film, drama, soap, and sitcom, contexts for storytelling include news, documentary, TV talk shows and talk radio broadcasting, and, most recently, reality TV. An early account of how private stories are produced as public discourse is Montgomery’s study of narrative discourse on a popular British radio station. In his analysis of ‘Our Tune,’ a regular slot through the 1970s and 1980s which featured listeners’ stories of overcoming personal difficulties in their lives, Montgomery used a Labovian framework of oral narrative analysis to show how different components of the stories become adapted to this particular mediated event. The DJ becomes the ‘epistolary narrator’ transforming the story from its original form of a private letter to the public medium of the spoken word. Through the use of ‘empathetic orientation’ towards the story protagonists, and specialized evaluation clauses that Montgomery calls ‘generic maxims,’ the DJ reconfigures the relationship between the teller, the story, and its recipients for its new, broadcast context (Montgomery, 1991).

More recently, the design features and discourse structure of storytelling by lay participants on British and American talk shows have been examined from a range of analytical perspectives (cf. Thornborrow, 1997, 2001b; Lorenzo-Dus, 2003). Drawing on CA, as well as the pragmatics of facework, Goffman’s concept of role relations, and Giddens’s notion of the ‘project of the self,’ this work examines aspects of situated narrative form as well as its contextual function as a component of public participation broadcasting. Telling stories becomes a quasitherapeutic act, what Lorenzo-Dus called ‘emotional DIY,’ conducted in the public domain.

Live Commentary

Relaying live events to a listening or TV viewing audience has become one of the primary functions of public broadcasting.
The language of radio and television commentary, from sporting events to state ceremonies, is distinctive in its form, and linguistically particularly interesting as a media discourse genre. As Stephanie Marriott (1997: 194–195) explained, the commentator is “perpetually poised on the edge of the new,” and commentary constructs a shared “emergent present” between speakers and recipients. For radio, this involves a shared temporal framework, and in the case of television, also a shared spatial perspective: the visual field of the television monitor in the studio and the television screen of the viewer. This intersubjectivity between commentator and viewer is created linguistically through a high incidence of deictic expressions which enforce the mutual, shared experience of the here and now, of first- and second-person pronouns which foreground the interpersonal relationship between speaker and hearer, of hedged opinions which allow for the uncertainty of the moment, and of present and present perfect tenses which locate events in the ‘now’ and the ‘just now.’ Here are some examples from a bowls match commentary, and a snooker game:

1. Well, Mervyn, if he can make contact with his own nearest blue bowl and punch out the nearest McMahon bowl, could establish a set and match winning lie – he’s running after it – he rather likes it – he’s thereabouts – he’s got it.
2. I promise you the pressure has got to be the greatest ever.
3. Well that’s worked out exceptionally well for Steve.

(Marriott, 1997: 194)

‘Netspeak,’ E-mails, and SMS Messaging

To conclude this selection of linguistic analyses of media genres, no account would be complete without a brief comment on the most recent phenomena of the electronic age of communication, the Internet and the cell phone. These media not only give rise to different participation frameworks for communication (see above) but also, it is claimed, to new forms of written language. David Crystal (2001) examined some of the changes that the new medium of the Internet has brought about, e.g., neologisms, new conventions in formatting and punctuation, and the use of graphic symbols (emoticons) as expressive and evaluative discourse markers. Crystal takes the view that language is not only changing on the Internet, but it is changing because of the Internet. Others are more cautious in their claims, asking whether language use on the Internet is really so different from other varieties and registers as to merit a label of its own. However, a consensus seems to be developing that ‘netspeak’ or ‘netlingo’ is at the very least a form of written communication that shares some characteristics with spoken language, and these characteristics can be contextually identified and linguistically described.

E-mail communication has been described both as “letters by phone” and as “speech by other means” (Baron, 2003), capturing the difficulty of pinning down precisely what is distinctive about this form of communication. The language style is informal (but not as informal as face-to-face speech), a fast response is expected (but doesn’t always get acknowledged), an e-mail is generally intended for a limited audience (but leaves a trace and can be forwarded to others without the sender’s knowledge), and it is often treated as ephemeral: unedited and with the production errors left in (but it can be printed out, edited, and traced).
The synchronous (real time) interaction that takes place in online chatrooms, with instant messaging, or in multiple user dialogs can be as complex as multiparty talk: it is characterized by a high level of ‘addressivity,’ short turns, different conversational strands running simultaneously, and back channel support/minimal response tokens (Thurlow et al., 2003). While this form of communication is immediate, sharing many features of face-to-face interaction, it nevertheless relies on the use of the keyboard. The need for interactional speed leads to the development of specialized forms such as letter homophones, acronyms, and hybrids of both, capitalization for stress and emphasis, stylized spelling, and the use of emoticons and other graphic symbols which form a code, a linguistic variety used in this specialized context, which, like all mediated interaction, is constantly evolving.

See also: Documentary; Language in Computer-Mediated Communication; Media and Language: Overview; Media and Marginalized Groups; Media Panics; News Language; Radio: Language; Sports Broadcasting; Television: Language.

Bibliography

Aitchison J & Lewis D M (eds.) (2003). New media language. London: Routledge.
Baron N (2003). ‘Why e-mail looks like speech: proofreading, pedagogy and public face.’ In Aitchison & Lewis (eds.). 85–94.
Bell A & Garrett P (eds.) (1998). Approaches to media discourse. Oxford/Malden, MA: Blackwell.
Chilton P (ed.) (1988). Language and the nuclear arms debate: nukespeak today. London: Pinter.
Clayman S & Heritage J (2002). The news interview: journalists and public figures on the air. Cambridge: Cambridge University Press.
Coupland J (1996). ‘Dating advertisements: discourses of the commodified self.’ Discourse and Society 7(2), 187–207.
Coupland J (2003). ‘Ageist ideology and discourses of control in skin care product marketing.’ In Coupland & Gwyn (eds.). 127–150.
Coupland J & Gwyn R (eds.) (2003). Discourse, the body, and identity. London: Palgrave Macmillan.
Crystal D (2001). Language and the Internet. Cambridge: Cambridge University Press.
Deuel N (1996). ‘Our passionate response to virtual reality.’ In Herring S (ed.) Computer mediated communication: linguistic, social and cross-cultural perspectives. Amsterdam/Philadelphia: John Benjamins. 129–146.
Fairclough N (1995). Media discourse. London: Arnold.
Fairclough N (2000). New Labour, new language? London: Routledge.
Fowler R (1991). Language in the news. London: Routledge.
Glasgow University Media Group (1976). Bad news. London: Routledge.
Glasgow University Media Group (1980). More bad news. London: Routledge.
Goffman E (1981). Forms of talk. Philadelphia: University of Pennsylvania Press.
Hutchby I (1995). ‘Aspects of recipient design in expert advice-giving on call-in radio.’ Discourse Processes 19(2), 219–238.
Hutchby I (1999). ‘Frame attunement and footing in the organisation of talk radio openings.’ Journal of Sociolinguistics 3(1), 41–63.
Hutchby I (2001). ‘“Witnessing”: the use of first hand knowledge in legitimating lay opinions on talk radio.’ Discourse Studies 3(4), 481–497.
Kress G & Hodge R (1976). Language as ideology. London: Routledge and Kegan Paul.
Kress G & van Leeuwen T (1998). ‘Front pages: (the critical) analysis of newspaper layout.’ In Bell & Garrett (eds.). 186–219.
Levinson S (1988). ‘Putting linguistics on a proper footing: explorations in Goffman’s concepts of participation.’ In Drew P & Wootton A (eds.) Erving Goffman: exploring the interaction order. Cambridge: Polity Press. 161–227.
Livingstone S & Lunt P (1994). Talk on television: audience participation and public debate. London: Routledge.
Lorenzo-Dus N (2003). ‘Emotional DIY and proper parenting in Kilroy.’ In Aitchison & Lewis (eds.). 136–145.
Lupton D (2002). ‘“I am normal on the net”: disability, computerized communication technologies and the embodied self.’ In Coupland & Gwyn (eds.). 246–265.
Marriott S (1997). ‘The emergence of live television talk.’ Text 17(2), 181–197.
Montgomery M (1986). ‘DJ talk.’ Media, Culture and Society 8, 421–440.
Montgomery M (1991). ‘Our Tune: a study of a discourse genre.’ In Scannell (ed.). 138–177.
Montgomery M (1996). An introduction to language and society (2nd edn.). London: Routledge.
Myers G (2001). ‘“I’m out of it. You guys argue”: making an issue of it on The Jerry Springer Show.’ In Tolson (ed.). 173–192.
Scannell P (1991). Broadcast talk. London: Sage.
Simpson P (1993). Language, ideology and point of view. London: Routledge.
Thornborrow J (1997). ‘Having their say: the function of stories in talk show discourse.’ Text 17(2), 241–262.
Thornborrow J (2001a). ‘Authenticating talk: building public identities in audience participation broadcasting.’ Discourse Studies 3(4), 459–479.
Thornborrow J (2001b). ‘“Has it ever happened to you?”: talk show stories as mediated performance.’ In Tolson (ed.). 117–138.
Thornborrow J (2002). Power talk: language and interaction in institutional discourse. London: Pearson Education.
Thornborrow J & van Leeuwen T (eds.) (2001). Discourse Studies 3(4). Special Issue on Authenticity in Media Discourse.
Thurlow C, Lengel L & Tomic A (2003). Computer mediated communication: social interaction and the Internet. London: Sage.
Tolson A (2001a). ‘Talking about talk: the academic debates.’ In Tolson (ed.). 7–30.
Tolson A (ed.) (2001b). Television talk shows: discourse, performance, spectacle. Mahwah, NJ: Lawrence Erlbaum Associates.
van Dijk T A (1988). News as discourse. Hillsdale, NJ: Lawrence Erlbaum Associates.
van Dijk T A (1991). Racism and the press. London: Routledge.
van Dijk T A (1998). ‘Opinions and ideologies in the press.’ In Bell & Garrett (eds.). 21–63.
Wodak R & Meyer M (eds.) (2002). Methods of critical discourse analysis. London: Sage.
Wood H (2001). ‘“No, YOU rioted!”: the pursuit of conflict in the management of “lay” and “expert” discourses on Kilroy.’ In Tolson (ed.). 65–88.

Media: Pragmatics
K C Schrøder, Roskilde University, Denmark
© 2006 Elsevier Ltd. All rights reserved.

Today, many, if not most, people in the world live in societies that can be described as ‘mediatized societies.’ A mediatized society is one in which the meaning processes, or discourses, provided by the communication media play an increasing, even overwhelming, role in the way society is economically, politically, and culturally organized, affecting the way we as individuals and groups think about everything and thus what we do in all contexts of life. The mediatized society affects us in whatever social roles we have to fill in everyday life. As citizens, we are concerned about the organization and power relations of society; as consumers, we have to take care of our material, intellectual, and wider cultural needs; and as human beings, we have to organize our private lives as individuals, couples, or families on a daily, weekly, and yearly basis. In all these respects, we are surrounded and affected by the sea of discursive meanings produced by the media. It is therefore mandatory for the understanding of modern society to understand the complex social meaning processes that have media at their core. This requires a ‘pragmatics of media’ that explores media discourses in their situational and social contexts.

Approaches to the Study of Media Discourses

It has gradually become accepted, at least in principle, that in order to understand the workings of the mediatized society it is necessary to adopt a holistic perspective of the media, one that not only analyzes the media texts but also considers the production and reception processes involved in media texts, as well as the macrosocial context, as interdependent objects of empirical analysis. For a number of years, these processes were conceptualized theoretically in semiotic terms as a signifying process, along the lines laid down by the so-called ‘encoding/decoding’ model of mass communication (Hall, 1980) (Figure 1). This model implies that any study of a media genre or of the media coverage of real-world events must research, in addition to the textual aspects, the production and reception stages around the text. The

Figure 1 The encoding/decoding model. Reproduced from Hall S (1980). ‘Encoding/decoding’ In Hall S, Hobson D, Lowe A & Willis P (eds.) Culture, media, language. London: Hutchinson.

624 Media: Pragmatics

model serves to remind analysts that in analyzing a text, they are dealing not with a fixed structure of meaning, but with a volatile phenomenon resulting from the signifying codes of both the producers and the recipients of the text. Crucially, these codes need not be identical; indeed, since the codes at the disposal of any individual consist of a unique assemblage of the meanings assimilated during that person’s life history, the codes of producers and receivers are in principle nonidentical. Consequently, one should expect to find multiple meanings resulting from different individuals’ reading of a news bulletin, an advertisement, or a TV reality show. On the other hand, people also belong to interpretive communities (constituted by such factors as class, ethnicity, gender, age, profession, location, etc.) in which meanings are shared to a large extent (Schrøder, 1994). Therefore, a purely textual analysis of a media text can still be justified, as long as the analytical findings are offered cautiously as potential meanings, or made on behalf of a specific interpretive community whose sign universe makes these meanings plausible. In recent years, the term ‘discourse analysis’ has gained general acceptance as the way to characterize the theoretical and methodological framework, often holistic, within which analyses of media language and communication are carried out in the interdisciplinary field of media studies, cultural studies, and communication studies. For a general introduction to the most prominent distinctive approaches to discourse analysis in the social sciences and the humanities, see Jørgensen and Phillips (2002).

First-Generation Discourse Analysis: ‘Critical Linguistics’

Critical linguistics developed in the 1970s as an influential school of early discourse analysis because it was able to demonstrate the close relationship between the detailed linguistic choices and the production of ideology in media texts (especially news) and, by implication, to explain how media ideology contributed to the reproduction of a social order characterized by inequality and oppression (Fowler et al., 1979). Critical linguistics adheres to an early version of linguistic constructionism, according to which the words of our language function as a conventionalized mental grid through which we perceive reality. Words constitute the reality that they designate, and when newspapers inform about social states and events they inevitably construct those very states and events. Going one step further, critical linguistics claims that the syntactic choices made in a text also have a constraining effect on the construction of social reality. The most important of such syntactic–ideological processes are those of passivization and nominalization.

This should not be taken to mean that the verbal choices in news reports can be made by journalists at will. As mentioned previously, the formation of public opinion through the media in capitalist societies is controlled by those with economic and social power over the mass media, who will see to it that social affairs are represented in such a manner as to not jeopardize their interests and privileges. Consequently, there is a power dimension in all public communication, an ideological thrust that by its sheer omnipresence in the aggregate media landscape manages to establish the current social arrangements as natural and inevitable and to discredit alternative perspectives as being contrary to common sense.

It is the task of critical linguists to expose the ideology conveyed by the media, to demonstrate that news reports of social affairs are indeed constructed and that they systematically portray a state of affairs that is not in the best interest of the majority of the population. It is thus the goal of detailed linguistic analysis to demonstrate how various seemingly innocent linguistic features in a text convey ideological meanings that reproduce existing unequal power relations. Through linguistic analysis of the news text, the critical linguist is able to expose “warped versions of reality” (Fowler, 1985: 68).

Drawing on Halliday’s (1978) theory of functional grammar and social semiotic, innumerable publications by the core group of critical linguists through the late 1970s and the 1980s have described the specific linguistic features that “will probably repay close examination” (Fowler, 1985: 68; see Fowler et al., 1979; Hodge and Kress, 1988). The ‘checklist’ presented in Fowler (1985) includes the following:

1. Lexical processes: Under this heading, critical linguists examine the way a text uses different lexical fields through the choice of vocabulary (including metaphors) from specific areas of experience, such as scientific vocabulary in a cosmetics ad or management jargon in a political text.
2. Transitivity designates the textual construction of reality through the description of participants and processes, as reflected in the nouns and verbs of the text. As Fowler (1985: 70) states, “Different choices of transitivity structure will add up to different worldviews.”
3. Syntactic transformations: Critical linguists believe that certain syntactic transformations of sentences, particularly those labeled ‘passivization’ and ‘nominalization,’ are ideologically problematic because they may make agency invisible and thereby obscure who did what to whom or significantly change the relative prominence of the participants.
4. Modality: Under this heading, analysts search for the different linguistic features (e.g., modal verbs and adverbs) through which speakers and writers may express their attitudes toward the events depicted by the sentences in which they occur.
5. Speech acts and turn-taking: Analysts should consider for each sentence or utterance what speech act it appears to perform and how it may build positions of power in the situation in which it is written or spoken. The analyst may also produce interesting insights about the power aspect of interpersonal interaction by examining the turn-taking patterns of different kinds of dialogic and discussion-oriented TV programs: who can speak when and about what, who can open up new topics, etc.
6. Implicature: This linguistic feature is best explained as the meaning that can be found by ‘reading between the lines.’ There is more to meaning than what is said, and an essential part of the meaning communicated through language is inferred by the speakers from the situational and social context.
7. Address and personal reference: Under this heading, the analyst may consider how different stylistic choices can be seen as addressing some readers rather than others (such as a Latinate vocabulary addressing the educated), how naming conventions affect the degree of formality, and how personal pronouns (‘you’ used with simultaneous individual and mass appeal in ads, and ‘we’ used to include or exclude listeners from the group the speaker belongs to) may affect interpersonal relationships.

This checklist of potentially ideology-bearing features developed by ‘critical linguistics’ has found its way into many later forms of qualitative textual analysis of the media, including the approaches to discourse analysis discussed later. It is not evident, however, that they are all equally useful for the production of insights about the media’s signifying processes. For instance, transitivity analysis requires an immensely time-consuming scrutiny of textual details, and even when a kind of ideological pattern does emerge from the analysis (see the case analyzed in Fowler, 1985), it is usually a fairly predictable one that even a cursory glance at the text would have discovered. Therefore, especially for students with little linguistic training, transitivity analysis is usually not worth the effort.

Regarding the ‘syntactic transformations’ of nominalization and passivization, it is doubtful whether the claims of their mystifying effects are really warranted. In most cases, they appear to be based on erroneous assumptions about the ‘nonrecoverability’ of the transformed or deleted linguistic items (Trew, 1979). It seems to be equally probable that on the basis of their overall knowledge of the world, their familiarity with the media agenda and yesterday’s news reports, and their general communicative competence, average newsreaders will have no difficulty in reconstructing who did what to whom, despite the agent having been deleted from the sentence through a passive construction. Interestingly, an empirical study of readers’ reception of newspaper articles previously analyzed by Trew found that the syntactically constructed ideology of the articles did not determine the readers’ views of the events reported. Their views depended on their identities and life histories as much as on an ideological effect attributable to the news language (Sigman and Fry, 1985).

Critical Discourse Analysis of the Media

Although clearly intellectually rooted in critical linguistics, critical discourse analysis (CDA), as developed by Norman Fairclough and others since the late 1980s, represents a significant theoretical and methodological advance toward the interdisciplinary study of media discourse (Fairclough, 1995, 2003; van Dijk, 2001; Wodak and Reisigl, 2001). Rather than concerning itself just with the media’s textual reproduction of ideology, CDA sets up a comprehensive theoretical framework that relates textual features systematically to the situations in which those texts are produced and consumed and to the larger social processes of the society in question. This theoretical framework is often described through a model of three embedded boxes (Figure 2), each of which represents a dimension of analysis (Fairclough, 1995: 59).

Figure 2 Dimensions of analysis in CDA. Reproduced with permission from Fairclough (1995).

‘Texts’ stand at the core of the model and are explored through the same forms of linguistic analysis that are used by critical linguistics, with the purpose of illuminating the way the text represents social reality and the way it portrays the identities and relations of the participants in the textual universe. The second dimension of analysis deals with ‘discourse practices’ – that is, the processes through which the media text is produced in media institutions and consumed, or ‘decoded,’ by the audiences and users in the context of everyday life. The discourse practices are seen as mediators between texts and macrolevel ‘sociocultural practices,’ which constitute the third dimension of analysis. On this level, the phenomena brought to light by the other two dimensions are viewed in relation to the macrosocial processes that characterize the societal ‘order of discourse’ at a given point in time (see Discourse, Foucauldian Approach).

CDA, in contrast to critical linguistics, is founded on an acute awareness of the ambivalent role of media discourses in the social formation. Inscribing his model into the current debates about ‘structure’ versus ‘agency’ (Beck et al., 1994; Giddens, 1984), Fairclough placed his approach within a social constructionist theory of society according to which media discourses are constituted by social practices and also are constitutive of such practices. This means that, on the one hand, mainstream media discourses are constrained, by the economic and political frameworks within which they operate, to produce versions of reality that are on the whole supportive of the existing social order.
On the other hand, the fact that the existing social order is not monolithic but characterized by diversity and ideological struggle means that the faithful representation of this very reality must also be characterized by diversity and struggle, and the outcome of such representational struggles is by no means certain to always favor the power elites. The end result of the media’s discursive practices in a given area is therefore often uneasily balanced between social reproduction and social change, between convention and innovation. For example, the macrosocial phenomenon that Fairclough terms the ‘conversationalization’ of public discourse in the media (i.e., the increasing occurrence of informal speech forms and colloquial expressions in television news and documentary programs) can be seen as sometimes working to support ideologically hegemonic forces because it may trivialize and simplify complex social relationships. However, conversationalization may also serve as a generator of cultural democratization because it may make complex issues easier to understand: “The communicative style of broadcasting lies at the intersection of . . . democratizing, legitimizing, and marketing pressures, and its ambivalence follows from that” (Fairclough, 1995: 149).

‘Intertextuality’ is a key analytical concept in CDA designating a principle of textual construction and recognition encompassing several distinct processes. It is meant to cover the basic fact that any text is indebted to innumerable previous source texts and will itself potentially become a source for an infinite number of future texts. Intertextuality also includes the way a text may stylistically echo one or more well-established genres, or particular well-known texts (e.g., when a TV commercial echoes the Western genre or a specific Western film), as well as the way a text may use specific recognizable passages from other texts (e.g., when a U.S. presidential hopeful inserts into his speech a passage from the Declaration of Independence or the Bible, or when news stories rely on direct or indirect quotations of a politician’s statements). Finally, the intertextual perspective means that the analyst should search for the way the particular text may draw on different ‘orders of discourse.’ This is a term that Fairclough borrowed from Foucault and that he defined as “a structured configuration of genres and discourses . . . associated with a given social domain” (Fairclough, 1998: 145) with clear implications for the regime of knowledge and power that rules within the particular domain. In the case of political discourse in the media, one may find that the conventional political order of discourse is intermingled with scientific and technological orders of discourse, the order of discourse of grassroots politics, the everyday order of discourse, etc. to create a new hybrid superordinate order of discourse that may herald innovative processes in the political domain.
The main limitation of CDA, which in no way invalidates its achievement as a stimulating theoretical framework for media analysis, is the lack of empirical consideration of the middle level of analysis, the discourse practices. Fairclough deliberately excludes this aspect from his own analyses, stating that “my emphasis will be upon linguistic analysis of texts. . . . I am not concerned . . . with direct analysis of production or consumption of texts” (Fairclough, 1995: 62). It is nevertheless a limitation that becomes acute if the analyst wants to discuss the sociocultural implications for audiences of the meanings found through textual analysis. Very few researchers have undertaken a fully holistic, empirical study of media discourses examining the text-mediated communicative circuit between senders and recipients. Among the exceptions are Swales and Rogers’s (1995) study of corporate mission statements and the studies of the news circuit by Gavin (1998) and Deacon et al. (1999).

Conversation Analysis and Discursive Psychology

A third important discourse analytical approach has been developed mainly by scholars in the field of discursive psychology (Potter and Wetherell, 1987). Among its heterogeneous ancestry, there is no doubt that the ethnomethodology/conversation analysis complex has had the most formative influence on the approach with regard to the actual procedures of analysis. It is the analytical aim of conversation analysis to explore the situational micromechanics of verbal interaction, illuminating among other things the speakers’ management of turn-taking processes through adjacency pairs, the role played by silences and interruptions in the flow of interaction, the way speakers manage topic development and topic change, mechanisms for ‘opening up closings,’ and so on (for general introductions to conversation analysis, see Nofsinger (1991) or Have (1999)).

Like conversation analysis, discursive psychology takes its point of departure in the situational context in which language is used. However, it does so in order to explore how the micromechanics of verbal interaction affect wider cultural, political, and social processes, for instance, in the analysis of nationalism in institutional and everyday settings (Billig, 1995, chap. 5), and in order to reconceptualize the study within psychology and social psychology of topics such as attitudes, memory, and attribution (Potter and Wetherell, 1987). The study of the mass media is not central to discourse psychology proper. However, it is clear that the approach has a lot to offer theoretically and analytically in this respect as the electronic media become increasingly dominated by programs that borrow from or replicate the verbal interaction of everyday life and as the digital interactive media open up enticing prospects of virtual communities based on verbal exchange (Hutchby, 2001) (see Cognitive Technology).
Briefly characterized, discourse psychology analyzes talk in everyday situations because it is in interpersonal encounters that an important part of social reality is constructed, as speakers position themselves and each other in situational roles according to their individual and social interests. Discourse psychology is particularly concerned with the way speakers engage in fact construction – that is, the way they attempt to establish their accounts, or ‘versions,’ of social events as true and factual and to undermine the factuality and truth of the versions of their interlocutors – a focus that is particularly appropriate to investigate many television news interviews and studio debate programs (Potter, 1996). When they produce their accounts of social events, staking a claim for their version, participants are drawing on meaning resources based on interpretive repertoires, a kind of ‘framework of understanding’ (Potter and Wetherell, 1996: 89):

“By interpretative repertoires we mean broadly discernible clusters of terms, descriptions, and figures of speech often assembled around metaphors or vivid images. In more structuralist language we can talk of these things as systems of signification and as the building blocks used for manufacturing versions of actions, self, and social structures in talk.”

In a study of the discursive construction of politics in Danish media, Phillips and Schrøder (2004) found that the media made sense of the political through six different interpretative repertoires for understanding politics in a wide sense, ranging from the parliamentary arena, through the subpolitical arena of grassroots activism, to the life–political arena of political consumption in daily life: 1. ‘parliament-at-work,’ offering a positive perspective on capable and active politicians; 2. ‘the dirty underside of the party game,’ in which politicians are seen as scheming and manipulative; 3. ‘populism,’ pitting sensible citizens against distant, ignorant, and arrogant politicians representing ‘the system’; 4. ‘grassroots activism,’ in which citizens can make a difference by joining forces around single issues; 5. ‘everyday politics,’ where the negotiation of individual responsibility for social issues results in small-scale political action; and 6. ‘politics as a meta-phenomenon,’ an interpretive repertoire that constructs an outside evaluative and reflective perspective on the mechanics and limits of political institutions and agents. The important point about such repertoires is that they are not mutually exclusive but may coexist in a particular media discourse about politics, serving different rhetorical purposes in different situational circumstances. Also, they should be seen not just as contributing to the formation of citizens’ personal ‘attitudes’ about politics but also as generative meaning practices, which result in different conceptualizations of the possibilities and limits of political action.

As already mentioned, discourse psychology takes no particular interest in the media. However, Potter’s comprehensive analysis of the situational construction of facticity is full of examples from media discourses, as he demonstrates how speakers in news programs take great pains verbally to demonstrate that they do not ‘have an axe to grind,’ to voluntarily confess to having a stake in some state of affairs in order to create an impression of honesty and trustworthiness despite their stake, to claim that “facts show that . . .” something is the case, to bolster credibility by adducing the testimony of sources whose identity may be difficult to establish (e.g., when news reports draw on so-called ‘community sources’ for their reporting of city gang warfare), and so on.

The most interesting and systematic work on media discourses from a situational perspective of dynamic interpersonal negotiation, however, has come from scholars who view themselves as conversation analysts rather than discourse psychologists (Clayman and Heritage, 2002; Greatbatch, 1998; Heritage and Greatbatch, 1991). One growing body of work has analyzed the most interactive genre of news production, the news interview (see Conversation Analysis). In such studies, there is not much interest in the possible ideological meanings of the sequence of utterances. Attention is focused on the interactive dynamics of the interview exchanges as their turn-taking patterns are compared with those of ordinary everyday conversation in order to illuminate the specific constraints and options that govern the situational production of such interviews within the framework of public service broadcasting. Noting that news interviews deviate systematically from ordinary conversation in replacing the latter’s question–answer–receipt pattern with a question–answer–question sequence, Heritage (1985) explained this difference by the fact that news interviews are produced for an overhearing audience. By avoiding the evaluation inherent in the third-turn ‘receipt’–response characteristic of normal conversation, the interviewer declines the role of evaluating answer recipient while maintaining the neutral role of answer elicitor.
The importance of the overhearing audience also manifests itself in the frequent occurrence of so-called ‘formulating’ utterances, in which an interviewer (by saying to the interviewee, “So you’re suggesting that . . .”) may make explicit the potentially controversial implications of a politician’s answer while merely appearing to rephrase what the interviewee just said. Formulations may thus also serve an important function within the institutional framework of public service broadcasting, which requires journalists to maintain impartiality and balance in the coverage of controversial matters.

A completely different type of TV interaction for an overhearing audience is analyzed by Crow (1986), who explores the conversational pragmatics of a U.S. phone-in program in which a sexologist host gives advice about sexual problems, a genre that falls between private interpersonal talk and talk explicitly designed for an overhearing audience. Montgomery (1986) offers an excellent example of the analysis of broadcast monologue as he demonstrates how radio DJ talk, in contrast to third-person-based radio news monologue, operates on the axis between first and second person pronominal address. The DJ constructs an imagined community with his or her listeners in a simulated half-dialogue in which he or she does not display the slightest sign of awkwardness that his or her initiating speech acts (e.g., greetings and questions) are not responded to by anybody. Scannell (1991) presented a diverse range of studies on different types of conversational interaction in the broadcast media (see Talk Shows).

Analyzing the Visual Aspects of Media Discourse

The visual aspects of modern media discourses have presented a difficult challenge for analysts of mediated meaning processes for many years. It is characteristic of discourse analytical approaches that they are almost exclusively focused on media language, whereas the visual dimensions of news reports in print and electronic media are at best given secondary attention. This situation exists despite the fact that it has long since become conventional wisdom for media research that the media landscape is increasingly dominated by still and moving pictures, which carry a substantial part of the total meaning communicated in newspapers and magazines, on television, and in the new media.

When visual analysis of media pictures is attempted, the analytical tools always derive from the same two sources: Roland Barthes’s operationalization of the linguistic concepts of denotation and connotation for the analysis of images, especially photographs (see Barthes, Roland (1915–1980)), and semiotician Charles S. Peirce’s concepts of iconic, indexical, and symbolic signs (for a detailed analysis of news photographs, see Hall (1973) and Fiske (1990)).

Barthes (1964) suggested that we distinguish between two orders of meaning in a photograph: the denotative level, which carries the innocent, factual meanings available to any observer irrespective of cultural background, and the connotative level, which carries the visual meanings that a specific culture assigns to the denotative message. Barthes’s original example presented an advertisement that denotatively pictures a string shopping bag in which one can see some onions, a green pepper, a can of tomato sauce, and two packets of pasta; the colors are yellow and green on a red background. In the French context of Barthes’s analysis, this visual message acquires the cultural meaning (connotation) of ‘Italianicity,’ and as a selling proposition the ad offers not just a number of unrelated products but the whole atmosphere associated with Italian cuisine. In similar fashion, other ads may offer visually based connotations such as sexuality, family happiness, scientific progress, and historical authenticity. According to Barthes, these connotative meanings will appear to the consumer as naturally given, not as ideological constructs, because they are ‘grafted’ onto the underlying, innocent denotative meaning. In this way, advertisers and other message senders may use connotations to convey taken-for-granted meanings that are shared within a culture without making these (ideological) meanings available for critical scrutiny.

The analytical terms borrowed from Peircean semiotics have to do with the relations between signs and the real-world objects to which they refer (Peirce, 1985). Whereas a ‘symbol’ is a sign whose connection with its object is purely a matter of convention (the linguistic ‘word’ being the obvious example), an ‘icon’ is a sign that is related to its referent through similarity; thus, a photograph in a news article or in an advertisement is an iconic sign of the real-world phenomenon it represents. An ‘index’ is a sign that signifies by existential or physical connection with its object, such as when a product advertised in a magazine ad is made visually contiguous to the paraphernalia of a desirable lifestyle, whose qualities may thereby become associated with the product.
It should be stressed that the three Peircean concepts are not to be thought of as three different kinds of sign but, rather, as three dimensions that are inherent properties of all signs in relation to their referents. A media picture of the White House is thus simultaneously an iconic representation of a particular building located in Washington, DC; an indexical representation of the government of the United States since it houses and thereby stands for its primary executive officer; and a symbolic representation of the connotative values conventionally associated with the United States and its president, be they those of the coercive global policeman or the home of freedom and democracy. It is especially the indexical/metonymical aspects of visual signs that may carry powerful ideological implications because they seem to establish a natural connection between the sign and its referent, between the ‘part’ that is selected by the photographer or editor for visual representation in the news photo and the ‘whole’ scene that the photo supposedly represents. Thus, a metonymic photograph of a single violent incident may convey a wrong impression of a demonstration that was otherwise peaceful and orderly (see Pragmatic Indexing).

These semiotic tools have proved to be of considerable heuristic value for the analysis of media images, but it must be acknowledged that they have been unable to provide insights beyond the commonsensical. Moreover, the distinction between denotation and connotation is theoretically dubious because it is impossible within the terms of the theory to define precisely the threshold between noncultural and culturally invested meaning on which the distinction relies (Eco, 1968). Some attempts have been made in recent years to change these theoretical terms by drawing on a cognitive approach to visual perception, according to which there is no fundamental difference between the perceptual processes of making sense of real-world and pictorial visual stimuli. The first-order visual media meanings are naturally perceived by the visual sense and then enter into a process of cultural investment and interpretation according to conventionalized signifying processes (Messaris, 1997).

The challenge of developing an innovative approach to the analysis of visual discourse has been taken up by scholars who wish to take discourse analysis much further than merely extending its field of operation from linguistic to also include visual signs. They suggest that the modern media from school textbooks to the World Wide Web are increasingly producing texts that are multimodal, making use of a range of representational and communicational modes within the limits of one text (Kress and van Leeuwen, 2001). The range of the different ‘modes’ of communication includes, in addition to verbal language, the visual (including graphic styles, spatial display, diagrams, pictures from line drawings, and still and moving photos), the gestural, sound, etc., and requires the building of a comprehensive ‘discourse semiotic’ in which all kinds of human semiosis are explored within the terms of one theoretical platform (Kress et al., 1997: 258):

“Discourse analysis has, on the whole, focused on the linguistically realized text. In the multimodal approach the attempt is to understand all the representational modes which are in play in the text, in the same degree of detail and with the same methodological precision as discourse analysis is able to do with linguistic text.”

The overall theoretical framework of Kress and van Leeuwen’s visual discourse semiotics is strongly akin to Fairclough’s three-dimensional model, whereas the analytical practice is inspired eclectically by theoretical and analytical work in linguistics, visual semiotics, film theory, art criticism, as well as numerous predecessors in the various fields of media research, especially the analysis of advertising (Cook, 1992; Myers, 1994; Williamson, 1978). At this stage, however, it is not evident that the multimodal approach represents the kind of genuine innovation of the analysis of visual media discourses claimed by the authors. First, irrespective of the authors’ protestations to the contrary, with its indebtedness to Halliday’s (1978) theory of social semiotics, the approach is based on and to some extent biased toward a linguistic conceptualization of the other modes of representation. Moreover, both some of its theoretical ground pillars and some of the analytical insights balance somewhat uneasily between the postulatory and the commonsensical, and the analytical conclusions sometimes lapse into a simplistic view of the transfer of ideology from verbal–visual text to reader.

Toward a Pragmatics of Media

This article has argued that a holistic theoretical perspective is necessary to understand the way the media communicate with citizens and consumers living in a mediatized world. One consequence of the adoption of a holistic, discourse-analytical perspective on the media is that in addition to focusing the analytical spotlight on the textual meaning processes in the media, the analyst is also invited to explore the discursive practices through which media texts are produced and received, as well as the larger sociocultural framework within which these processes take place. This requires the analyst to supplement traditional forms of critical analysis of media texts with ethnography-inspired interview-based fieldwork of, for instance, journalistic production routines and audience reception processes. Such discourse-ethnographic work produces new textual objects of analysis in the form of qualitative interviews with media producers and audience members about the meaning processes they engage in around the media product (Lindlof, 1995). Clearly, this research agenda is no less imbued with theoretical and methodological hazards than that of traditional media language analysis, which may explain the reluctance of media discourse analysts to embark on the kind of ethnographic fieldwork that they readily acknowledge is necessary. However, the payoff in terms of explanatory power is evident in those few instances in which scholars have faced the holistic challenge and brought together insights from production, textual, and reception studies of mediated communication (Deacon et al., 1999).

See also: Barthes, Roland (1915–1980); Cognitive Technology; Communication: Semiotic Approaches; Constructivism; Conversation Analysis; Critical Discourse Analysis; Eco, Umberto: Theory of the Sign; E-mail, Internet, Chatroom Talk: Pragmatics; Ethnomethodology; Factivity; Fowler, Roger (b. 1938); Halliday, Michael A. K. (b. 1925); Marketing and Semiotics: From Transaction to Relation; McLuhan, Marshall (b. 1911); Media and Language: Overview; Media, Politics, and Discourse: Interactions; Media: Semiotics; Multimodal Interaction with Computers; Nominalization; Peirce, Charles Sanders (1839–1914); Pragmatic Indexing; Radio: Language; Saussure, Ferdinand (-Mongin) de (1857–1913); Social Construction and Language; Social Semiotics; Talk Shows; Telephone Talk; Television: Language; Transitivity: Stylistic Approaches; Visual Semiotics; Word and Image.

Bibliography

Barthes R (1964). ‘Rhétorique de l’image.’ Communications 4, 40–51. (English translation: Barthes R (1977). ‘Rhetoric of the image.’ In Image–Music–Text. London: Fontana/Collins. 32–51.)
Beck U, Giddens A & Lash S (1994). Reflexive modernization. Politics, tradition and aesthetics in the modern social order. Cambridge: Polity Press.
Bell A & Garrett P (eds.) (1998). Approaches to media discourse. Oxford: Blackwell.
Billig M (1995). Banal nationalism. London: Sage.
Clayman S & Heritage J (2002). The news interview. Journalists and public figures on the air. Cambridge: Cambridge University Press.
Cook G (1992). The discourse of advertising. London: Routledge.
Crow B K (1986). ‘Conversational pragmatics in television talk: the discourse of “good sex.”’ Media, Culture & Society 8, 457–484.
Deacon D, Fenton N & Bryman A (1999). ‘From inception to reception: the natural history of a news item.’ Media, Culture & Society 21, 5–31.
Eco U (1968). La struttura assente. Introduzione alla ricerca semiologica. [The absent structure. Introduction to semiological research.] Milan: Bompiani.
Fairclough N (1995). Media discourse. London: Arnold.
Fairclough N (1998). ‘Political discourse in the media: an analytical framework.’ In Bell A & Garrett P (eds.).
Fairclough N (2003). Analysing discourse. Textual analysis for social research. London: Routledge.
Fiske J (1990). Introduction to communication studies. London: Routledge.
Fowler R (1985). ‘Power.’ In van Dijk T (ed.) Handbook of discourse analysis, vol. 4. London: Academic Press.
Fowler R, Hodge B, Kress G & Trew T (eds.) (1979). Language and control. London: Routledge & Kegan Paul.
Gavin N T (ed.) (1998). The economy, media and public knowledge. London: Leicester University Press.
Giddens A (1984). The constitution of society. Cambridge: Polity Press.

Greatbatch D (1998). ‘Conversation analysis: neutralism in British news interviews.’ In Bell A & Garrett P (eds.). 163–185.
Hall S (1973). ‘The determination of news photographs.’ In Cohen S & Young J (eds.) The manufacture of news. London: Constable. 176–190.
Hall S (1980). ‘Encoding/decoding.’ In Hall S, Hobson D, Lowe A & Willis P (eds.) Culture, media, language. London: Hutchinson.
Halliday M A K (1978). Language as social semiotic. The social interpretation of language and meaning. London: Arnold.
Heritage J (1985). ‘Analyzing news interviews: aspects of the production of talk for an overhearing audience.’ In van Dijk T (ed.) Handbook of discourse analysis, vol. 3. London: Academic Press.
Heritage J & Greatbatch D L (1991). ‘On the institutional character of institutional talk: the case of news interviews.’ In Boden D & Zimmerman D H (eds.) Talk and social structure: studies in ethnomethodology and conversation analysis. Oxford: Polity Press.
Hodge B & Kress G (1988). Social semiotics. Cambridge: Polity Press.
Hutchby I (2001). Conversation and technology: from the telephone to the Internet. Cambridge: Polity Press.
Jørgensen M W & Phillips L (2002). Discourse analysis as theory and method. London: Sage.
Kress G & van Leeuwen T (2001). Multimodal discourse. The modes and media of contemporary communication. London: Arnold.
Kress G, Leite-Garcia R & Van Leeuwen T (1997). ‘Discourse semiotics.’ In van Dijk T (ed.) Discourse as structure and process. London: Sage.
Lindlof T (1995). Qualitative communication research. London: Sage.
Messaris P (1997). Visual persuasion: the role of images in advertising. London: Sage.
Montgomery M (1986). ‘DJ talk.’ Media, Culture & Society 8, 421–440.
Myers G (1994). Words in ads. London: Arnold.

Nofsinger R E (1991). Everyday conversation. Newbury Park, CA: Sage.
Peirce C S (1985). ‘Logic as semiotic: the theory of signs.’ In Innis R (ed.) Semiotics: an introductory anthology. London: Hutchinson.
Phillips L & Schrøder K (2004). Sådan taler medier og borgere om politik (‘This is how media and citizens talk about politics’). Aarhus, Denmark: Aarhus University Press.
Potter J (1996). Representing reality. Discourse, rhetoric and social construction. London: Sage.
Potter J & Wetherell M (1987). Discourse analysis and social psychology. London: Sage.
Potter J & Wetherell M (1996). ‘Discourse analysis.’ In Smith J A, Harré R & Van Langenhove L (eds.) Rethinking methods in psychology. London: Sage.
Scannell P (ed.) (1991). Broadcast talk. London: Sage.
Schiffrin D, Tannen D & Hamilton H E (eds.) (2001). The handbook of discourse analysis. Oxford: Blackwell.
Schrøder K C (1994). ‘Audience semiotics, interpretive communities and the “ethnographic turn” in media research.’ Media, Culture & Society 16, 337–347.
Sigman S J & Fry D L (1985). ‘Differential ideology and language use: readers’ reconstructions and descriptions of news events.’ Critical Studies in Mass Communication 2, 307–322.
Swales J M & Rogers P (1995). ‘Discourse and the projection of corporate culture: the mission statement.’ Discourse and Society 6, 223–242.
Ten Have P (1999). Doing conversation analysis: a practical guide. London: Sage.
Trew T (1979). ‘Theory and ideology at work.’ In Fowler R et al. (eds.). 94–116.
van Dijk T (2001). ‘Critical discourse analysis.’ In Schiffrin et al. (eds.). 352–371.
Williamson J (1978). Decoding advertisements. Ideology and meaning in advertising. London: Marion Boyars.
Wodak R & Reisigl M (2001). ‘Discourse and racism.’ In Schiffrin D et al. (eds.). 372–397.

Media: Semiotics G Withalm, University of Applied Arts, Vienna, Austria © 2006 Elsevier Ltd. All rights reserved.

Introduction

Although it might sound trivial, it is nevertheless true that our everyday life is increasingly perfused, guided, and even structured by media and media products/messages. In order to cope with the media messages as such and with the changing variety of media we encounter – from newspapers and magazines to virtual reality environments, from movies and television shows to the Internet – we are in need of an integrated theory and of analytical models that are appropriate to handle the messages both theoretically and in practice. It is most likely that whoever is dealing with media messages – from high school students or laypersons to academics or media professionals – will sooner or later stumble upon the notion ‘media semiotics’ and subsequently try to get further information.

Media Semiotics and/or Medium in Semiotic Handbooks – A First Glimpse

Usually, a first approach to a topic is to look it up either in printed sources or in the more recent and increasingly popular way to get information, the


Internet. A quick Web search with the two keywords ‘media’ and ‘semiotics’ returns more than 83 000 results (almost a moderate figure compared to some 328 000 results in the case of a search for ‘semiotics’ alone). After a first glance, however, the number of relevant results diminishes drastically, since the majority of the first 50 or so entries (those normally checked) refer either to the few books that have the words ‘media semiotics’ in the title, or to Web pages offering directories of links to the same handful of sites. So the reader seeking information will certainly return to more conservative sources, that is, printed material, and she/he will look up the entries in diverse dictionaries and encyclopedias, and the respective articles on medium, media studies, or media semiotics in introductory works. Apart from the desired first information, such texts can also show whether the field in question, in our case media semiotics, is recognized and appreciated as an individual area of semiotic research, or, more generally, whether it is present at all in the consulted texts, and how the topic is dealt with in the pertinent literature. Unfortunately, looking for definitions of, or at least descriptive passages about, media semiotics as a distinct field of semiotic research, analysis, and theorizing will not bring much help for the inquisitive novice, since the search does not yield the expected results. A switch to the search for ‘medium’ or ‘media’ is, again, not very promising. This lack of entries, however, is definitely not caused by a lack of interest among semioticians in the field covered, or, even worse, by a neglect of the notion as somewhat irrelevant to semiotics. An explanation for this observation lies in the comparatively late process of academic autonomy and independence.
Like all disciplines, or even subdisciplines and paradigms, semiotics has its own set of notions, the definitions of which sometimes overlap and coincide with those found in neighboring and adjacent disciplines; in other cases, they differ extremely from those generally known. To construct a concise terminological building in its own right, semiotic dictionaries and handbooks have to concentrate on the core terms. Although the process of mediation is by definition fundamental to semiotics and thus a pivotal area of semiotic reasoning and a crucial topic of semiotic research, and despite the interest in the field and the widespread use of the notion, ‘medium’ as such is not a genuinely semiotic concept.

How is ‘media semiotics’ (or, more generally, medium) dealt with in some of the standard reference works published in the last two decades? One of the handbooks to start with for an in-depth look at notions and concepts is certainly the three-volume Encyclopedic dictionary of semiotics (edited by the long-term editor-in-chief of the journal Semiotica, the late Thomas A. Sebeok, and first published in 1986). The second volume contains an article titled ‘Medium.’ Unfortunately, the one-and-a-half pages give no information whatsoever on medium, let alone media semiotics, since they deal with nothing but the notion referred to in the subtitle: ‘Message.’ This focus is somewhat strange, as there is another entry explicitly titled ‘Message/Medium’ that in turn discusses nothing but Marshall McLuhan’s view of the ‘medium/message’ complementarity. Although small in size, Vincent Colapietro’s Glossary of semiotics (1993) is usually a good spot for a first check on notions. With regard to our subject, the reader is less successful, since the book does not have a single entry on medium or media, let alone on media semiotics. The index to the first 100 volumes of Semiotica, published 1969–1994, offers a ‘Subject Index: Scientific Fields’ that, although quite detailed, does not even list the category keyword ‘medium’ or ‘media’ (1997). Looking more closely through the list of fields reveals, however, that the observation must not be read as a lack of papers on media topics: there are 2 articles subsumed under the heading ‘communication,’ 28 under ‘cinema,’ and some 40 more under ‘film.’ With the 1998 Encyclopedia of semiotics (edited by Paul Bouissac), it is quite the same at first sight: the reader will not find an individual article on medium or media. Thanks to the index, he or she can check other entries to find something on the ‘ideological role of media,’ on ‘transformations,’ ‘violence,’ and ‘women in media.’ In addition, media is generally cross-referenced to mass communication, which has an entry.
Further articles deal with ‘semiotics of advertising,’ cartoons, cinema, comics, communication, computer and computer-mediated communication, film semiotics (plus additional entries on grande syntagmatique, imaginary signifier, and Christian Metz), mass communication, photography, and pictorial semiotics (1998). Like all the other volumes, the 2000 Encyclopedic dictionary of semiotics, media and communications (edited by Marcel Danesi) has no description of the field of media semiotics, but, of course, it includes an entry on medium that will be discussed below, since it offers a concise definition of, and approach to, the notion (2000). Winfried Nöth’s Handbuch der Semiotik was first published in 1985 in German; in 1990, a heavily revised English version, the Handbook of semiotics, appeared, which was the basis for the second German edition of 2000. The 1990 English version grouped several chapters dealing with topics usually subsumed under ‘media’ under the heading ‘Aesthetics and visual communication.’ The second German edition features a new chapter, explicitly titled ‘Mediensemiotik.’ The areas presented are: media, image, image and text, maps, comics, photography, film, and advertising. The subject index, however, enumerates several other instances where the notion media is discussed. Taking up ideas first presented in his introductory contributions to two collective volumes (Nöth, 1997, 1998), Nöth started his chapter on ‘media’ with some general observations on the relationship between semiotics and the media, naming both early examples of semiotic studies of media texts and several strands and ideas within semiotics adopted for media studies. He then continued with three subchapters dealing with ‘themes of media semiotics,’ ‘signs, medium and the media,’ and ‘signs, reality and hyperreality.’

So far the largest (and most recent) reference work is Semiotik: Ein Handbuch zu den zeichentheoretischen Grundlagen von Natur und Kultur/Semiotics: a handbook on the sign-theoretic foundations of nature and culture (hereafter referred to as Semiotik/Semiotics), edited by Roland Posner, the former President of the International Association for Semiotic Studies IASS-AIS. Due to the structure of the handbook, the central article on our topic in volume 3 has the term in question, media semiotics, only in the subtitle (and unlike ‘semiotics of culture’ or ‘cultural semiotics,’ it is not listed in the index either). However, the text on ‘Semiotic aspects of mass media studies: media semiotics’ (Wolf, 2003) gave a good overview of the relationship between the two adjacent, or even overlapping, fields. Starting from the history of mass media studies, Wolf went on to outline the main areas of mass media research and recent trends.
When it came to the actual relationship, he proceeded from a strong focus on Eco’s theoretical models of the code, as well as the discussion of the reciprocal expectations and changing semiotic paradigms, to the interconnection between semiotics and British cultural studies. The last part showed “thematic areas where integration between semiotics and mass media studies is more advanced and is required by the media under examination” (Wolf, 2003: 2932), like the “textual structures and utterance processes,” or “cognitive activities of the media audience” (Wolf, 2003: 2933). Wolf ended with “open questions on the research agenda” (Wolf, 2003: 2934), which will be presented later on. Within the same chapter of the third volume of Semiotik/Semiotics, presenting the relations between semiotics and individual disciplines, there is also a contribution on ‘Semiotische Aspekte der Filmwissenschaft: Filmsemiotik’ (‘semiotic aspects of film studies: semiotics of the cinema,’ Kloepfer, 2003). Concerning ‘medium,’ the handbook contains several articles that mention the notion even in the title. Chapter 2 of the first volume, for instance, is already titled ‘Aspects of semiosis – channels, media, and codes,’ and among others it features ‘Technische Medien der Semiose’ (‘Technical media in semiosis,’ Böhme-Dürr, 1997), and ‘Social media of semiosis’ (Threadgold, 1997). At the end, in volume 4, which offers a chapter on selected topics of semiotics, there is also an article on ‘Multimediale Kommunikation’ (‘Multimedia communication’; Hess-Lüttich and Schmauks, 2004). Thanks to the detailed index of subjects, the reader is able to look up some further contributions where medium or media are discussed.

Finally, as already mentioned in the very first paragraphs, there are some textbooks, introductory works, collective volumes, or chapters in more general overviews of semiotic studies that carry the formula ‘media semiotics’ in their title. The earliest coupling was, most probably, a German collective volume edited by Günter Bentele in 1981, titled Semiotik und Massenmedien. The various articles focused on different media and the analyses of media texts. Approximately a decade later, Ernest W. B. Hess-Lüttich (1990a) dealt with ‘Massenmedien und Semiotik’ in the context of semiotics in the individual sciences. A volume that focused entirely on this field was certainly Semiotics of the media: state of the art, projects, and perspectives, edited by Winfried Nöth (1997), which contained a large number of papers presented in Kassel at the 1995 conference on the semiotics of media. Finally, two more introductory publications are Media semiotics: an introduction (Bignell, 1997) and Understanding media semiotics (Danesi, 2002).
In the latter volume, the field is defined in a first approach in the following way:

The primary object of media semiotics is to study how the mass media create or recycle signs for their own ends. It does so [...] by asking: (1) what something means or represents, (2) how it exemplifies its meaning, and (3) why it has the meaning that it has (Danesi, 2002: 34).

The restriction to the explicit naming of semiotics in the title, however, might exclude several publications that proceed from semiotic concepts without saying so, such as a collection of early papers from the Birmingham Centre for Contemporary Cultural Studies (Hall et al., 1980). The same applies to volumes on special topics, for instance narrativity, edited or written by semioticians (cf. Kloepfer and Möller, 1986; Cobley, 2001).

Semiotics of the Media and/or Media Semiotics

What most semioticians seem to agree upon is the view that there is nothing like a unified media semiotic approach (cf. Bentele, 1981: 26; Hess-Lüttich, 1990a: 177; Nöth, 1997: 1); the more so since, on the one hand, media studies themselves have to be considered a vast field of research rather than a single discipline, and on the other hand, semiotics cannot be regarded as a unified single-paradigm discipline either. As will be shown in several other entries (see Paris School Semiotics), the semiotic enterprise is made up of different schools and traditions, starting from the big divide between the philosophic or Peircean or American line of triadic concepts of semiosis and the linguistic or Saussurean or continental line of binary sign models. Evidently, these two and many other semiotic models and theories found their way into the study of media. Of particular interest for both disciplines seems to be the special kind of relationship between semiotics and media studies. As will be shown below, the history of semiotic studies of media and media texts goes back at least to the 1960s, which means that semiotics was there long before the media studies hype commenced. Nonetheless, semiotics has not always been valued as an approach fruitful for media studies (or at least not all over the world). Rereading the various introductory texts on media semiotics written over some two decades also reveals how the relation has changed during the last years. Hess-Lüttich (1990a) observed a certain skeptical attitude toward semiotics among the various disciplines that share the research object ‘mass media’:

Zu sehr sind die Medienforscher noch ihren jeweiligen Fachtraditionen verpflichtet und verhaftet, als daß sie die integrierende Kraft semiotischer Theorienbildung methodisch zu nutzen gelernt hätten. In Literaturwissenschaft wie Publizistik wird gegen die Semiotik zuweilen als ‘Mode’ polemisiert, ihre zweitausendjährige Geschichte verkennend (Hess-Lüttich, 1990a: 203). [Media researchers are still too committed and attached to their respective disciplinary traditions to have learned to make methodical use of the integrating power of semiotic theory building. In literary studies as in journalism studies, semiotics is at times attacked as a ‘fashion,’ in disregard of its two-thousand-year history.]

In the mid-1990s, Nöth discussed the range in the geographical distribution of media semiotics:

While in countries like Italy (especially under the influence of Umberto Eco), France, Spain, and Brazil, the terms ‘semiotics’ and ‘media studies’ are often almost used synonymously, in Germany and in the English speaking countries, the semiotics of the media has been considered as less central in media studies (Nöth, 1997: 6–7).

One specific area, however, deserves a separate discussion: British cultural studies, which to a large degree focus on media and are rooted in semiotics. As a more general observation, it seems that in the last years semiotics has finally found its way into general media and communication studies, as university curricula, reading lists, and undergraduate textbooks can show. Depending on whether the matter is viewed from the side of semiotics or from that of media studies, the question of which of the two is considered just an auxiliary discipline or method, and questions of dominance and subordination, are answered quite controversially. Winfried Nöth even titled his contribution in the 1998 volume ‘Die Semiotik als Medienwissenschaft’ (or ‘Semiotics as a media science par excellence,’ as he named the relation in his introduction to Semiotics of the media; Nöth, 1997: 5), and considered semiotics to be the fundamental discipline for media studies. Accordingly, his enumeration of topics dealt with in media semiotics is by far the most comprehensive, and almost coextensive with that of semiotics in general, since it comprises themes like:

Kommunikation, Kognition und Emotion, Mediensemiose und «Realität», Referenz und Selbstreferentialität, Wahrheit, Mythos und Ideologie, Information, «Objektivität» oder Manipulation und schliesslich auch die evolutionsgeschichtlichen Wurzeln der Zeichenproduktion und -rezeption in den Medien (Nöth, 1998: 53). [Communication, cognition and emotion, media semiosis and ‘reality,’ reference and self-referentiality, truth, myth and ideology, information, ‘objectivity’ or manipulation, and finally also the evolutionary roots of sign production and reception in the media.]

Although generally used in a synonymous way, the two different notions in the section heading can be used to distinguish two fields. Analogous to the distinction between semiotics of culture and cultural semiotics, the semiotically rooted occupation with the media goes in two directions: (1) the semiotic exploration of media genres and texts (or, how semiotics can be used for the analysis of movies, ads, or newspapers, as it is sometimes stated in media studies textbooks); and (2) a semiotic theory of (mass) communication and media. Both in general recognition and in the literature, the first one is by far the more present and touches each and every mass medium used, from the more traditional ones to the latest electronic developments in information and communication technologies. Nevertheless, it is exactly the latter field in which the modeling and conceptual quality of semiotics is able to contribute to the further development of media studies: semiotics is not just another discipline dealing with a particular object (which in the case of semiotics is vast enough: signs and sign processes of all kinds), but it is always and at the same time a metadiscipline reflecting on concepts, paradigms, and approaches. In this sense, media semiotics is not only reflecting on media and media texts, but it is also able to offer the forum for the dialogue between media studies and semiotics and to cogitate on their relation and their tasks.


Semiotics and (Mass) Communication Processes

The relation between semiotics and both communication studies and media studies is rather intricate. In order to discuss the role of semiotic studies and models with regard to media (studies), the reflection on a common topic could be an appropriate point of departure: the communication process. Since it is generally agreed that, as Jakobson has put it, “[t]he subject matter of semiotic is the communication of any messages whatever” (Jakobson, 1973: 32), semiotics and communication/media studies are neighboring disciplines that share a research object: communication (see Communication: Semiotic Approaches). Semiotics, however, cannot be reduced to the study of communication, because it is equally concerned with signification (cf. Prieto, 1968; Mounin, 1970), or significative and cognitive processes (cf. Nöth, 1985: 235). Krampen made yet another necessary clarification concerning the particular relation between communication and semiosis:

A communication process in which a sender transmits second hand experiences to a receiver is a sign process par excellence. In this sense, models of communication processes are always models of semioses, but not vice versa (Krampen, 1997a).

Accordingly, reflections on communication are often present when semioticians talk about sign processes. One of these authors for whom communication is a central concept is Charles W. Morris (1946). Alongside the divide between communication and signification, Umberto Eco offers a discussion on communication in his Trattato di semiotica generale (1975).

Models and Concepts

The preoccupation of semiotics with communicative processes is not a recent phenomenon and can look back on more than one hundred years of history (even if the two and a half millennia of occidental engagement with semiotic questions at large are not taken into account). Decades before Shannon and Weaver formulated their communication model with sender and receiver in 1949, Ferdinand de Saussure (1916) (see Saussure, Ferdinand (-Mongin) de (1857–1913)) presented his view of the communicative situation (i.e., the speech circuit model), which, contrary to the 1940s diagram, already implied three fundamental dimensions: the semantic dimension or the constitution of meaning, the pragmatic dimension showing sign producers in action, and the dialogic dimension of message exchange. Unfortunately, the speech circuit model was hidden within Saussure’s courses on linguistics and was hardly received outside this discipline. Communication studies would have started differently if the ruling paradigm had come from semiology. Although they envisaged a broader application of their information-theoretic model, Shannon and Weaver are not to blame for the development in the 1950s and 1960s and the resulting lack of a semantic and a pragmatic focus, since they created the model for a specific purpose and never dealt with semantic aspects of the message. The problem lies rather with those followers who eagerly took the model as a ready-made device and insufficiently adapted it to human communication. As a consequence, generations of students in communication and media studies were taught technicistic concepts based on the short-sighted transmission model of communication.

Ferdinand de Saussure’s diagram might have been the first semiotic model of communication, but it is definitely not the only one. Since semiotics is not a single-approach enterprise, but, as mentioned above, consists of several different currents, there is also more than one model of the communication process or of the sign process, of ‘semiosis,’ to use a more general term. (An extensive discussion of the diverse views of semiosis as well as a complex and comprehensive semiosic matrix is presented in Krampen, 1997a; cf. also Krampen, 1997b.) Chronologically, the next diagrammatic formulation that has to be taken into consideration for the study of communication was developed by Karl Bühler (see Bühler, Karl (1879–1963)) in the 1930s. His tripartite organon model of language (1934: 28) has the sign at its center, as the point of intersection between three elements, together with the three respective functions of the sign with regard to these elements and the three correlated sign moments: the sender (Ausdrucksfunktion or the expressive function; symptom or index), the receiver (Appellfunktion or the appellative or conative function; signal), and the objects or states of affairs (Darstellungsfunktion or referential/representational function; symbol). Bühler’s schema has to be mentioned, although its reception was still restricted to linguistics, since it was quite influential for another concept that, in turn, has found its way into media and communication studies: Roman Jakobson’s communication model (see Jakobson, Roman (1896–1982)). Jakobson proceeded from Karl Bühler’s organon model of language and extended it to a schema combining six constitutive factors of communication, which are paired with six equivalent functions (named in parentheses): addresser (emotive), message (poetic), addressee (conative), context (referential), contact (phatic), and code (metalinguistic).

636 Media: Semiotics

Although the schema seems similar to the standard one-way model of communication between a sender and a receiver (in particular in the graphic presentation), there is a fundamental difference with regard to the elements, owing to the inclusion of the context and the code. Originally conceived for verbal communication, Roman Jakobson’s model of these six “factors inalienably involved” (Jakobson, 1960: 353) has been adopted for the analysis of nonverbal communicative and media processes. Applications can be found in domains as different as architecture and advertising. The most recent example is the discussion of Jakobson’s concepts with regard to new media and multiagent environments (Petric et al., 2001). Relying on the systemic quality of Jakobson’s thinking, Itamar Even-Zohar adapted the schema to the literary (poly)system (cf. Even-Zohar, 1990: 31), which could be considered just a special case of media systems in general.

It seems that over the years communication studies has developed a strong tendency to deal with aspects that are clearly associated with semiotics. According to Berger and Chaffee,

communication science seeks to understand the production, processing, and effects of symbol and signal systems by developing testable theories, containing lawful generalizations, that explain phenomena associated with the production, processing, and effects (Berger and Chaffee, 1987: 17).

Despite the wording (and the concepts referred to), the way scholars in communication science intended to handle the topic is again not qualitative, but aimed at empiricist approaches delivering quantifiable results. Semiotically oriented researchers, however, describe this attitude quite differently, even if they proceed from the same observation, and thus prepare the ground for a sign-theoretic foundation of the study of communication processes. In his article in Semiotik/Semiotics on semiotic aspects of mass media studies, Mauro Wolf stated:

Indeed, there was a strong need in communication studies to focus on the complex nature of the communicative process, and semiotics was better suited than psychology or sociology to understand this element of crucial relevance in mass communication processes (Wolf, 2003: 2930).

As already shown, what semiotics can indeed offer is a differentiated view of the communication process as sign process. Looking through the many definitions of communication to be found in the various encyclopedias, glossaries, and textbooks, we can see that many scholars in communication studies include signs and sign production in their formulas. Accordingly, the focus on Man as a producer and recipient (or reproducer) of the message, which is necessarily and always conveyed in the form of sign vehicles or signifiers or representamina or signantia (depending on the semiotic theory adopted), is no longer mistaken for a restriction, since it includes by definition an antireductionist approach compared to shortsighted technicistic models. The view of communication as semiosis is not confined to meaning-oriented considerations as opposed to society-oriented approaches, because in semiosis three moments are interconnected: the production/constitution of meaning (semantic dimension); the structure of the message (syntactic dimension); and the usage of the message, including the attitudes and actions resulting from it (pragmatic dimension). The most explicit integration of social aspects can be found in Ferruccio Rossi-Landi’s sociosemiotic theory of social reproduction, in which communication is defined as sign exchange. For Rossi-Landi, social reproduction always comprises “three indissoluble correlated moments”: production–exchange–consumption. Exchange “is always, at the same time and constitutively, external material exchange [and] sign exchange, that is communication, including as such within it: sign production, sign exchange in the strict sense, and sign consumption” (Rossi-Landi, 1975: 65; cf. 1985: 38).

Different Areas of Communication

The study of communication is confronted with a vast field of quite different types of communication, or communicative situations, in which quite different means of communication are used to convey messages and meaning. Owing to the importance of the sign system language, there can hardly be an analysis of verbal communication without semiolinguistic concepts, starting from the situation, the course of the conversation, the various registers used, or the content of the utterance. In much the same way, many of the publications on nonverbal communication are based within semiotics, like the research on ‘body language’ or proxemics. As soon as we transcend the domain of communication between human beings and turn to man–machine communication, we have to consider intricate relations and overlaps between semiotics and the cognitive sciences or AI research, respectively. With the area of mass communication, finally, we enter the diffusion zone between communication and media studies. Looking at the various subfields of mass communication research, additional connections can be found, like the transition from research on subconscious effects to the occupation with the recipients’ competence in handling mass media. From the point of view of semiotics, processes of mass communication are also semiosic processes like any other communication. When split into the partial processes involved and analyzed in detail, however, the degree of complexity appears to be much higher. Krampen distinguishes the subsequent and simultaneous phases and the various channels used, and he concludes that

a mass communication event is a supersemiosis S constituting a matrix of semioses Z, modeled by semiosic matrices, with n kinds of channels [. . .] during m kinds of stages in the production of a mass communication event (e.g., acquisition, editing, elaborating, sequencing or layout, sending or printing, etc.) (Krampen, 1997b: 96).

As discussed above, the question of the proportion of semiotic theories and analyses within media studies proper, or the specific relation of the two disciplines, can only be answered after a closer look at the entirety of semiotic approaches (including closely related concepts) and their respective historic influence (cf. Wolf, 2003).

Notions and Concepts

Medium and Media

Even though ‘medium’ is not part of the set of semiotic notions proper, it is widely used within semiotic texts as well and thus deserves some attention, all the more so since one of the tasks for which semiotics is especially suited is the theoretical reflection on, and systematic reconsideration of, pivotal concepts and models in media studies, like channel or medium (cf. Wulff, 1978). Since several different disciplines are occupied with the study of media, media texts, media institutions, etc., the notion of medium is in itself an iridescent notion. Unfortunately, the frequency of its use in communication and media studies is not always matched by the availability of pertinent definitions. In his entry on ‘Multimedia communication’ in Sebeok’s Encyclopedic dictionary of semiotics, Hess-Lüttich addressed this problem:

Medium’s widespread usage in mass communication and psychology, in economics and cybernetics, in physics and philology, makes it difficult to agree on an integrated basis for founding a category of material transmission of social meaning (Hess-Lüttich, 1986: 574).

More often than not, apart from the general quoting of the Latin origin of the term (medius ‘between, the middle’), the solution offered by the authors seems to be a description resorting to everyday experiences (and uses) of the term and an enumeration of examples rather than a definition sensu stricto. Texts that try to cope with the term ‘medium’ usually start from the physical meaning of the word (that is, the contact matter or physical substance), only to add in the next sentences that it is generally used for the means of communication. An example of the contextually explicable restriction to the latter can be found in the ‘Glossary’ for the first Elsevier Encyclopedia of language and linguistics (ELL1), which names two definitions: “1 The means used in a communication, i.e., whether it is spoken, written, symbolic, color coded, etc., e.g., phonic, aural, visual medium. 2 A channel of communication as in mass media” (1994: 5144–5145). Mass media in turn are defined as “(t)he (mass produced) media (see medium 2) which seek to communicate with a mass audience, e.g., television, newspapers” (1994: 5144).

A widespread classificatory approach distinguishes between primary media (which are entirely based on the abilities of our body and function without technical equipment), secondary media (the sender uses some kind of machine), and tertiary media (both sender and receiver have to rely on technical equipment) (cf. Pross, 1972); recently, the tripartite scheme has been augmented by a quaternary type, which comprises the digital media. Another tripartite scheme is elaborated by Marcel Danesi in his Encyclopedic dictionary of semiotics, media and communications. He opened the entry with general definitions: “1. any means, agency, or instrument of communication; 2. the physical means by which a sign or text is encoded (put together) and through which it is transmitted (delivered, actualized)” (Danesi, 2000: 141–142). In the added explanatory notes, he then proceeded from prewriting media for communication (oral–auditory and pictographic), and discussed the alphabet in terms of a Kuhnian paradigm shift. After briefly discussing McLuhan, and describing how each medium implicates knowledge of the specific kinds of codes that the medium itself determines are to be deployed, Danesi offered three more entries on three different types of medium (and the respective examples) that should be distinguished in the analysis: (1) the artifactual medium: “artifactual means or mode of encoding and decoding a message,” like books, paintings, sculptures, or letters; (2) the mechanical medium: “mechanical means or mode of transmitting a message,” that is, telephones, radios, television sets, computers, videos; and (3) the natural medium: “natural means or mode of encoding and decoding a message,” like the voice (speech), the face (expressions), the body (gesture, posture, etc.) (Danesi, 2000: 142).

Despite such differentiated views, today the notion of medium is often used just in the plural and thus confined to denominate the means of mass communication. One of these definitions of mass media is included in the entry on mass communication of the 1998 Encyclopedia of semiotics, considering them to be

channels of communication, located at the institutional and corporate levels of society that use large-scale high-technology methods to supply standardized communication products to widespread heterogeneous audiences (Fulton, 1998: 389).

In somewhat wider descriptions, the concept of (mass) media includes such different instances of the communication or sign process as the channel (and thus also the physical substrate used for transmission), the entire apparatus, the sender (often in the form of an organization or institution), the codes, and the signs (and sign systems) used. In line with such a multifaceted view, Karin Böhme-Dürr defines medium in her Semiotik/Semiotics article on technical media in the following way:

Der Begriff ‘Medium’ bezieht sich [. . .] auf Kommunikationsmittel, also auf die Mittel zur Weitergabe von Zeichen. Kommunikationsmittel sind zum einen technische Geräte (Instrumente, Apparate) und zum anderen Zeichenkörper (= Zeichenträger) (Böhme-Dürr, 1997: 358). [The notion ‘medium’ refers [. . .] to means of communication, that is, to the means of passing on signs. Means of communication are, on the one hand, technical devices (instruments, apparatuses) and, on the other hand, sign bodies (= sign vehicles).]

To make this rather general definition operational, she distinguished three clusters conforming to the various disciplines dealing with it: the concept of media as used (1) in the social sciences, (2) the natural sciences, and (3) the technological media concept (cf. Böhme-Dürr, 1997: 358). The most differentiated classificatory grid of subconceptions is offered by Roland Posner. Proceeding from the actual use of the notion both in everyday language and in the pertinent literature, he distinguished six different uses of the notion medium based on the kind of sign processes involved: (1) the biological concept, which characterizes the sign systems according to the body organs and sense modalities involved in production, distribution, and reception of signs (eyes/visual media; ears/auditive media; nose/olfactory media; taste/gustatory media; skin/tactile media); (2) the physical concept, which characterizes the sign systems based on the chemical elements and the physical conditions; (3) the technological concept, based on the technical means and apparatuses used; (4) the sociological concept, covering social institutions that organize the biological, physical, and technical means; (5) the culture-related concept, which characterizes the sign systems according to the aims of the messages conveyed by them; and (6) the code-related concept, which characterizes the sign systems according to the rules to correlate messages and sign vehicles (Posner, 1985: 255–257).

On the one hand, this plurality of concepts is only helpful when the various notions and the correlated media concepts and criteria are kept apart for the sake of precision and clarity. On the other hand, it has to be clear that the differentiation of these six media conceptions is only made for analytical reasons; with regard to the actual process, they have to be viewed together, since they appear simultaneously or consecutively, much in the same way as Krampen has defined the mass communication process as a supersemiosis.

A starting point for a truly semiotic reconsideration of the concept of the medium, however, could be a statement Charles S. Peirce made (presumably in 1905) in which he even equates sign and medium: “A sign is plainly a species of medium of communication” (Peirce, MS 283, 125). In 1906, he goes one step further and writes: “All my notions are too narrow. Instead of ‘Sign,’ ought I not to say Medium?” (Peirce, MS 339). With one exception, however, this use did not become the current one: laying stress on the specific quality of the sign as mediator between object and interpretant, Max Bense and Elisabeth Walther introduced the notion Mittel or medium instead of sign or representamen. (For Walther’s discussion of medium, cf. Walther, 1997.)

A Plethora of Neighboring Notions

‘Medium’ is not just in itself a multifaceted term. In the pertinent literature, it is always accompanied by several other notions that are only partly core concepts of semiotic theory. Although widely used, ‘channel’ is one of these notions coming from neighboring disciplines. Originating from information-theoretical texts, it has also entered the (media) semiotic discourses. The problem, however, lies not in the term as such but in a synonymous use of channel and medium (as analyzed in Wulff, 1978), which does not help to clarify the various elements and phases of the process. Unfortunately, since all terms in question have diverse meanings, even semioticians are sometimes not very precise in their respective use of the notions. When Thomas A. Sebeok, for instance, discussed how a source is linked with a destination, he described it as “a sort of medium, or channel, a passageway through which the two are capable of establishing and sustaining their communicative exchange” (1991: 27).

Even more complicated is the case of code (see Postmodernism), which is likewise part and parcel of the terminology in several disciplines dealing with media and communication. ‘Code’ is often confined to transformation rules or the rules of correlating the elements of two different systems (the ‘s-codes’ in Eco’s terminology). Sometimes a code is regarded as a “set of substitution equations relating significata to signs or signifiers” (Watt and Watt, 1997: 408). Finally, the term code is often used coextensively with ‘sign repertoire’ or even ‘sign system,’ as for instance in the Jakobson model discussed above.

Although ‘sign system’ is a central semiotic term, the definitions vary in the different schools and currents. The widest and most fruitful conception of sign system with regard to media semiotics is offered by the Italian semiotician and philosopher Ferruccio Rossi-Landi. In his definition, a sign system contains

at least one code, that is the materials on which one works, and the instruments with which one works; [. . .] the rules to apply the latter on the former [. . .] the channels and the circumstances that allow communication, the senders and receivers who make use of the code (Rossi-Landi, 1985: 242).

But he goes even one step further when he writes that it also includes

all the messages which are exchanged or can be exchanged within the universe institutionalized by the system itself (Rossi-Landi, 1985: 242).

Fields of Media Semiotics

As indicated earlier, semiotics has turned to the study of media texts and has contributed its share to their analysis at least since the 1960s. This contribution is evidenced by the remarkable number of works in applied semiotics that can be subsumed under the heading ‘media semiotics.’ To establish some kind of order among the hundreds of works related to the media, and among the various ways semioticians deal with them, two directions can be taken: to follow the usual listing of (mass) media, and to discuss some topics in which media semiotics is especially interested and for which it is especially suited.

Advertising

From the very beginning, advertising was a preferred area, and the various publications show the variety of semiotic models and concepts that are used to deal with the verbal and visual messages encoded in promotional texts. One of the most famous papers dealing with print ads is Roland Barthes’ (see Barthes, Roland (1915–1980)) analysis of a French magazine ad for Panzani pasta sauce (1964). Proceeding from the objects depicted (an open string bag with spaghetti packages, a tin of pasta sauce, a sachet of grated parmesan, and assorted fresh vegetables: tomatoes, onions, peppers, a mushroom, etc.), he developed the levels of signs involved (fresh products, italianità, complete meal, still life, etc.) and the three types of messages: a linguistic, a coded iconic (symbolic), and a noncoded iconic (literal) message. Following the line Barthes opened with the title of this paper, that is, ‘Rhetoric of the image,’ Umberto Eco adapted the classic rhetorical figures to visual texts and used them to analyze advertisements (Eco, 1968). (With regard to rhetorics, the work of the Belgian Groupe µ figures among the most prominent) (see Rhetoric: Semiotic Approaches). Even though a lot of papers are rooted in the semiologic or semiolinguistic tradition, the Peircean trichotomy of icon, index, and symbol has definitely found its way into the general analytic tool set. Accordingly, these concepts form the basis of many publications on posters or print advertisements. One of the best known theoreticians in the field of advertising is Jean-Marie Floch, who also worked in practice. His analyses of ad messages are based on a complex network of semiotic squares in the Greimasian tradition. Apart from rhetoric or narrative analyses of ad messages, or those offering critical views concerning the myths and ideologies diffused through ads, there is yet another field of semiotic research and reasoning that is closely related to advertising: marketing and consumer research (cf. Umiker-Sebeok, 1987; Floch, 1990; Mick, 1997) (see Marketing and Semiotics: From Transaction to Relation and Brands and Logos).

Visual Semiotics

Although visual semiotics (Sonesson, 1989) cannot be reduced to media texts in the narrow sense, since works of fine art also belong to the topics investigated, it has to be included in such an overview. There are several groups of visual texts that have a long history of semiotic analysis, like photographs, cartoons, or comics (Eco, 1994) (see Photography: Semiotics and Comics: Semiotic Approaches). One of the classic domains of semiotic investigation is the relation between images and words (the accompanying verbal texts), be it in advertising, graffiti, popular prints, or newspapers (Schnitzer, 1994). For both areas, semiotic associations were founded many years ago: the International Association for Visual Semiotics/Association Internationale de Sémiotique Visuelle and the International Association for Word and Image Studies, respectively (see Visual Semiotics).

Film and Television Semiotics

Although film theorists sometimes have their reservations about subsuming film theory under the trendy label of media studies, film has to be considered a part of the media semiotic endeavor. Since the 1960s, a large part of the theoretical reflection on film has been based on film semiotic theories and analyses, and both central theoretical concepts and models of analysis in film and television are of semiotic origin. Accordingly, film semiotics plays an important role within applied semiotics in general. The film semiotic discourse dates even further back than media semiotics in general, and it unfolds in several phases that are characterized by changing paradigms, thus showing that there is no such thing as a monolithic, unified film semiotics (in the same way as we should speak of ‘semiotics’ in the plural rather than in the singular). The plurality of film semiotic approaches, concepts, and models mirrors the plurality of schools and currents that are today subsumed under the general label ‘semiotics.’

At the very beginning, we find the writings of the Russian Formalists (Boris M. Ejchenbaum, Viktor B. Shklovskij, Jurij N. Tynjanov) of the 1920s, who can be considered to be film semioticians avant la lettre. Both in individual publications and in the programmatic collection Poetika kino (1927), they focused on the question of film as language (discussing a syntax – ‘film sentence’ – and semantics of film) and on literary, in particular poetic, features of film. The fundamental distinction between syuzhet (‘plot’) and fabula (‘story’) goes back to their writings. During the following decades, there are a few, by now almost legendary, publications, like Jan Marie Lambert Peters’ thesis De taal van den film (1950).
The first stage of a broader semiotically based reflection on film started almost simultaneously with the first proliferation and organizational establishment of general semiotics in the middle of the 1960s (for instance, the International Association for Semiotic Studies IASS-AIS was founded after several years of preparatory conferences in 1969 in Paris). The major topic discussed in these years was the question whether film can be considered a language (in the sense of langue or language system) and, accordingly, be analyzed within the framework of semiolinguistic models (mostly in the Saussurean tradition). The search for the smallest units and a double articulation in the linguistic sense culminated in the well-known discussions between Eco and Pasolini at the 1965 and 1966 Pesaro film festivals. A first answer was given by Christian Metz, who considered cinema as a language without language system (cf. Metz, 1968: 65) and presented a differentiated theory of (general and particular) cinematographic codes and nonspecific codes (Metz, 1971).

A second phase (1970s and early 1980s) is characterized by a stronger orientation towards film reception and the recipients, based on (at least) two different paradigms: psychoanalytic and Marxian models in the tradition of Lacan and Althusser, respectively. (With regard to the psychoanalytically inspired film semiotics, one influential publication has to be cited: Metz’s Le signifiant imaginaire [1975].) In addition, this decade is also the period of the flowering of British film and television studies, which focused to a great extent on the sociocultural positioning of the recipient and the role it plays in her/his reception and constitution of meaning.

Finally, the film semiotic discourse in the 1980s and 1990s, the third phase, is characterized by a real plurality of approaches, concepts, and models and a proliferation of semiotic writings in film and television studies. On the one hand, film semiotic reflections (both theoretical and analytical) entered a positive exchange with several neighboring disciplines and related fields (like feminist theories as well as models rooted in narratology or discourse and text theory). Another strong field with which general semiotics started to come into contact is constituted by the cognitive sciences; accordingly, we can speak of the beginnings of a cognitive turn in film semiotics in these years. On the other hand, classic semiotic paradigms were taken up again and developed in new directions, as in neoformalist or semiopragmatic approaches. Part of this trend is also a reconsideration of those concepts and models from general semiotics that have so far not been adapted to film theoretic contexts. One way is to go back to the ‘founding fathers,’ as for instance Charles S. Peirce, and their writings, which were also subject to changed readings and interpretations during the last decades.
Since the 1960s, a few (film) theorists (like Peter Wollen [1969] or Gilles Deleuze [1983, 1985]) have dealt with Peirce’s thought, and some of his concepts, like the trichotomy of icon–index–symbol, belong to the standard terminology in film and media studies. Nevertheless, a close reading of Peirce’s works can still be expected to offer helpful inspiration for film studies. One example of the application of Peircean semiotics to film is the work of Werner Burzlaff, who examines montage and stylistic devices on the basis of the Peircean phaneroscopy (cf. 1992). But there are many more semiotic/semiophilosophic texts from the various traditions, schools, and currents that could contribute to a reflection on film. Birgit Recki (2004), for instance, presents Ernst Cassirer outside the philosophical circles to a public interested in film studies and adopts aspects from his Philosophie der symbolischen Formen for her discussion of film as an artistic medium.

Digital Media/Hypertext/Computer/Internet

The most recent fields of research in media semiotics are ‘new media’ or ‘digital media,’ hypertextuality, and virtual environments/virtual reality. One of the earliest manifestations of this field was a 1995 conference (cf. Nöth (ed.), 1997). Papers in this area deal with various topics: the fundamental questions of representation in the digital age; whether the computer can be regarded as a (semiotic) medium (Santaella, 1998; cf. Andersen et al., 1993); the production and reception of hypertexts (Landow, 1994); the (narrative) structure of computer games (Wenz, 1998); or the semiotic analyses of expert systems. However, semioticians in this field are occupied not only with analysis, but also with the sign-theoretically reflected production of multimedia applications or websites (Stockinger, 1993; Stockinger et al., 1998). Both directions, semiotic theory and semiotically rooted practice, as well as their integration, can be found at the annual COSIGN conference on computational semiotics for games and new media.

Topics across the Media

Apart from the fundamental questions of representation and signification with regard to media, or the particularity of the sign systems used for media semioses, there are some specific fields media semiotics has dealt with throughout the decades. Media texts are increasingly characterized by intertextuality and intermediality, respectively, which are areas of genuinely semiotic research. The scholar most often quoted in connection with ‘intertextuality’ is Julia Kristeva (1967, 1980), who is credited with having coined the term in her essay “Bakhtine, le mot, le dialogue et le roman.” Topics of research closely related to intertextuality are questions of reflexivity and self-awareness of/in media texts. Of particular interest is the ever increasing number of self-reflexive media texts, that is, media texts dealing with, and occasionally criticizing, their own medium (institution) and its conditions and modes of production, and sometimes even reflecting on their own status as a medium text. Multimediality, on the other hand, goes beyond both the question of how individual texts can be related to each other and the question of how multimodality can be achieved in texts (cf. Hess-Lüttich and Schmauks, 2004). Another area of semiotic analysis that touches several different media is constituted by the stories, the narrative.

Finally, semiotics is not only concerned with syntactics and semantics, with formal questions, and with the structural properties of media texts. From the very beginning of the semiotic enterprise, there was always a strong focus on the pragmatic dimension of sign processes and their role within a sociocultural context, in short: a sociosemiotic view on the topic. Given the increasing mediatization of our world and the role globalized communication and media (texts) play in today’s societies (in particular their omnipresence and the merging of telecommunication technologies and media), this feature is one that makes semiotics particularly important, also for the future. Pragmatics asks about sign work, about the production and reception (or reproduction) of texts, about the use and misuse of signs and sign systems, about the ideological implications of messages transmitted, and about the way people can be trained to handle these messages in order to resist manipulative tendencies and to learn to cooperate for a better future, as was already foreseen by Morris (1946), or formulated more recently by Ponzio and Petrilli in their ‘semioethics’ (2005).

See also: Barthes, Roland (1915–1980); Brands and Logos; Bühler, Karl (1879–1963); Comics: Semiotic Approaches; Communication: Semiotic Approaches; Jakobson, Roman (1896–1982); Marketing and Semiotics: From Transaction to Relation; Paris School Semiotics; Photography: Semiotics; Postmodernism; Rhetoric: Semiotic Approaches; Saussure, Ferdinand (-Mongin) de (1857–1913); Visual Semiotics.

Bibliography

Andersen P B, Holmqvist B & Jensen J (eds.) (1993). The computer as medium. Cambridge/New York: Cambridge University Press.
Barthes R (1964). ‘Rhétorique de l’image.’ Communications 4, 40–51.
Bentele G (ed.) (1981). Schriftenreihe der Deutschen Gesellschaft für Publizistik- und Kommunikationswissenschaft. 7: Semiotik und Massenmedien. München: Ölschläger.
Berger C R & Chaffee S H (eds.) (1987). Handbook of communication science. Newbury Park, CA: Sage.
Bignell J (1997). Media semiotics: an introduction. Manchester: Manchester University Press.
Böhme-Dürr K (1997). ‘Technische Medien der Semiose.’ In Posner R, Robering K & Sebeok T A (eds.) vol. 1. 357–384.
Bouissac P (ed.) (1998). Encyclopedia of semiotics. New York/Oxford: Oxford University Press.
Bühler K (1934). Sprachtheorie: Die Darstellungsfunktion der Sprache. Jena: Fischer.
Burzlaff W (1992). ‘A Peircean Theory of Film/Une théorie peircienne du film.’ In Deledalle G, Balat M & Deledalle-Rhodes J (eds.) Approaches to Semiotics. 107: Signs of humanity: Proceedings of the 4th International Congress, International Association for Semiotic Studies 2. (L’homme et ses signes: Actes du IVe Congrès Mondial, Association Internationale de Sémiotique 2.) Barcelona/Perpignan, March 30–April 6, 1989. Berlin: Mouton de Gruyter. 849–862.
Cobley P (2001). Narrative. London/New York: Routledge.
Colapietro V M (1993). Paragon house glossaries for research, reading, and writing: Glossary of semiotics. New York: Paragon House.
Danesi M (2000). Toronto Studies in Semiotics and Communication: Encyclopedic dictionary of semiotics, media, and communication. Toronto: University of Toronto Press.
Danesi M (2002). Understanding media semiotics. London: Arnold.
Deleuze G (1983). Cinéma 1: L’image-mouvement. Paris: Minuit; Engl. transl. Cinema 1: The movement-image. Minneapolis: University of Minnesota Press.
Deleuze G (1985). Cinéma 2: L’image-temps. Paris: Minuit; Engl. transl. Cinema 2: The time-image. Minneapolis: University of Minnesota Press.
Eco U (1968). La struttura assente. Milano: Bompiani; German rev. ed.: Einführung in die Semiotik. München: Fink.
Eco U (1975). Trattato di semiotica generale. Milano: Bompiani. (Engl. transl.: A theory of semiotics. Bloomington IN: Indiana University Press.)
Eco U (1994). Apocalypse postponed. Bloomington IN: Indiana University Press.
Even-Zohar I (1990). Polysystem Studies. (Poetics Today) 11(1). Durham: Duke University Press.
Floch J-M (1990). Sémiotique, marketing et communication. Paris: PUF. (Engl. transl.: Semiotics, Marketing and Communication. London: Palgrave & Macmillan 2001.)
Fulton H (1998). ‘Mass communication.’ In Bouissac (ed.). 389–393.
Hess-Lüttich E W B (1986). ‘Multimedia communication.’ In Sebeok T A (ed.). 573–577.
Hess-Lüttich E W B (1990a). ‘Massenmedien und Semiotik.’ In Koch W A (ed.) Semiotik in den Einzelwissenschaften, 1. Bochum: Brockmeyer. 176–213.
Hess-Lüttich E W B (1990b). ‘Mass media and semiotics.’ In Koch W A (ed.) Semiotics in the Individual Sciences. Bochum: Brockmeyer. 455–485.
Hall S et al. (eds.) (1980). Culture, media, language: working papers in cultural studies, 1972–1979. London: Hutchinson.
Hess-Lüttich E W B & Schmauks D (2004). ‘Multimediale Kommunikation.’ In Posner R, Robering K & Sebeok T A (eds.) vol. 4, 3487–3503.
Jakobson R (1960). ‘Linguistics and poetics.’ In Sebeok T A (ed.) Style in language. Cambridge, MA: MIT Press. 350–377.
Jakobson R (1973). Main trends in the science of language. London: Allen & Unwin.
Kloepfer R (2003). ‘Semiotische Aspekte der Filmwissenschaft: Filmsemiotik (Semiotic aspects of film studies: Semiotics of the cinema).’ In Posner R, Robering K & Sebeok T A (eds.) vol. 3, 3188–3212.

Kloepfer R & Möller K D (eds.) (1986). Papmaks 19: Narrativität in den Medien. Mannheim: MANA, und Münster: MAkS.
Krampen M (1997a). ‘Models of Semiosis.’ In Posner R, Robering K & Sebeok T A (eds.) vol. 1, 247–287.
Krampen M (1997b). ‘Semiosis of the mass media: modeling a complex system.’ In Nöth (ed.). 87–97.
Kristeva J (1967). ‘Bakhtine, le mot, le dialogue et le roman.’ Critique 23(239), 438–465.
Kristeva J (1980). ‘Word, Dialogue, and Novel.’ In Kristeva J (ed.) Desire in language: a semiotic approach to literature and art. New York: Columbia University Press.
Landow G P (ed.) (1994). Hyper-text-theory. Baltimore: Johns Hopkins University Press.
Metz C (1968). Essais sur la signification au cinéma I. Paris: Éditions Klincksieck, 2nd ed. 1971. [Engl. transl.: Film language: a semiotics of cinema. Chicago: The University of Chicago Press 1991, 1st ed.: New York: Oxford University Press 1974.]
Metz C (1971). Langage et cinéma. Paris: Larousse; augm. edn.: Paris: Éditions Albatros 1977.
Metz C (1975). ‘Le signifiant imaginaire.’ Communications 23, 3–55. [Engl. transl. ‘The imaginary signifier.’ Screen 16(2). 14–76.]
Mick D G (1997). ‘Semiotics in marketing and consumer research: balderdash, verity, pleas.’ In Brown S & Turley D (eds.) Consumer research: postcards from the edge. London: Routledge. 249–262.
Morris C W (1946). Signs, language, and behavior. New York: Braziller.
Mounin G (1970). Introduction à la sémiologie. Paris: Minuit.
Nöth W (1985). Handbuch der Semiotik. Stuttgart: Metzler. [2nd rev. edn.: Stuttgart & Weimar: Metzler 2000; Engl. edn. A handbook of semiotics. Bloomington: Indiana University Press.]
Nöth W (1997). ‘Introduction.’ In Nöth (ed.). 1–11.
Nöth W (1998). ‘Die Semiotik als Medienwissenschaft.’ In Nöth & Wenz (eds.). 47–60.
Nöth W (ed.) (1997). Approaches to Semiotics 127: Semiotics of the media: state of the art, projects, and perspectives. Berlin/New York: Mouton de Gruyter.
Nöth W & Wenz K (eds.) (1998). Intervalle 2: Medientheorie und die digitalen Medien. Kassel: Kassel University Press.
Peters J M L (1950). De taal van de film: Een linguistisch-psychologisch onderzoek naar de aard en de betekenis van het expressiemiddel film. Ph.D. diss: Katholieke Universiteit Nijmegen; published as: De taal van de film: wezen, werking, schoonheid en belang van het expressiemiddel film. Den Haag/The Hague: A. N. Govers.
Petric M, Tomic-Koludrovic I & Mitrovic I (2001). ‘A missing link: the role of semiotics in multiagent environments.’ In Clarke A, Fencott C, Lindley C, Mitchell G & Nack F (eds.) Proceedings COSIGN 2001: 1st Conference on Computational Semiotics for Games and New Media, Amsterdam, 10–12 September 2001. Amsterdam: Centrum voor Wiskunde en Informatica. 108–112.
Petrilli S & Ponzio A (2005). Semiotics unbounded: interpretive routes through the open network of signs. Toronto: University of Toronto Press [forthcoming].

Posner R (1985). ‘Nonverbale Zeichen in öffentlicher Kommunikation.’ Zeitschrift für Semiotik 7(3), 235–271.
Posner R, Robering K & Sebeok T A (1997–2004). Semiotik: Ein Handbuch zu den zeichentheoretischen Grundlagen von Natur und Kultur (Semiotics: a handbook on the sign-theoretic foundations of nature and culture) (3 vols). Berlin/New York: de Gruyter.
Prieto L (1968). ‘La sémiologie.’ In Martinet A (ed.) Le langage. Paris: Gallimard. 93–114.
Pross H (1972). Medienforschung. Darmstadt: Habel.
Recki B (2005). ‘Überwältigung und Reflexion: Der Film als Mythos und als Kunst.’ In Waniek E, Nagl L & Mayer B (eds.) Film/Denken: Film und Philosophie. Wien: Synema. [forthcoming].
Rossi-Landi F (1975). Janua Linguarum, Series Maior 81: Linguistics and economics. The Hague: Mouton.
Rossi-Landi F (1985). Studi Bompiani – Il campo semiotico: Metodica filosofica e scienza dei segni. Nuovi saggi sul linguaggio e l’ideologia. Milano: Bompiani.
Santaella L (1998). ‘Der Computer als semiotisches Medium.’ In Nöth & Wenz (eds.). 121–158.
Saussure F de (1916). Cours de linguistique générale. Bally C & Sechehaye A (eds.). Lausanne/Paris: Payot.
Schnitzer J (1994). Wort und Bild: Die Rezeption semiotisch komplexer Texte. Wien: Braumüller.
Sebeok T A (1991). A sign is just a sign. Bloomington/Indianapolis: Indiana University Press.
Sebeok T A (ed.) (1986). Encyclopedic dictionary of semiotics. Berlin: de Gruyter.
Semiotica (1997). Index to Semiotica 1–100 (1969–1994).
Shannon C E & Weaver W (1949). The mathematical theory of communication. Urbana: University of Illinois Press.
Sonesson G (1989). Pictorial concepts: inquiries into the semiotic heritage and its relevance for the analysis of the visual world. Lund: Lund University Press.
Stockinger de Pablo E, Fadili H & Stockinger P (1998). ‘SémioNet. Spécification, production et implémentation de services d’information en-ligne.’ In Bernard J & Withalm H (eds.) Kultur und Lebenswelt als Zeichenphänomene. Akten eines internationalen Kolloquiums zum 70. Geburtstag von Ivan Bystřina und Ladislav Tondl, Wien, Dezember 1994. Wien: ÖGS/ISSS. 177–234.
Stockinger P (ed.) (1993). Explorations in the world of multimedia. S – European Journal for Semiotic Studies 5(3).
Threadgold T (1997). ‘Social media of semiosis.’ In Posner R, Robering K & Sebeok T A (eds.) vol. 1, 384–404.
Umiker-Sebeok J (ed.) (1987). Marketing and semiotics: new directions in the study of signs for sale. Berlin: Mouton de Gruyter.
Walther E (1997). ‘The sign as medium, the medium relation as the foundation of the sign.’ In Nöth (ed.). 79–85.
Watt G T & Watt W C (1997). ‘Codes.’ In Posner R, Robering K & Sebeok T A (eds.) vol. 1, 404–414.
Wenz K (1998). ‘Narrativität in Computerspielen.’ In Schade S & Tholen C (eds.) Konfigurationen: Zwischen Kunst und Medien. München: Fink. 209–219.
Wolf M (2003). ‘Semiotic aspects of mass media studies.’ In Posner R, Robering K & Sebeok T A (eds.) vol. 3, 2926–2936.
Wollen P (1969). Signs and meaning in the cinema. London: Secker & Warburg, 2nd edn. 1972; Bloomington: Indiana University Press; London: British Film Institute 1998, 4th edn.: revised & enlarged.
Wulff H J (1978). ‘Medium und Kanal.’ In Dutz K D (ed.) Papmaks 10: Zur Terminologie der Semiotik I. Münster: MAkS Publikationen, 3rd edn., 47–71.

Relevant Website

http://www.cosignconference.org – COSIGN conference on computational semiotics for games and new media.

Medical Communication, lingua francas
I Taavitsainen, University of Helsinki, Helsinki, Finland
© 2006 Elsevier Ltd. All rights reserved.

The term lingua franca refers to the earliest Romance-based pidgin and has gained the meaning of a widely used auxiliary language that enables communication between people of different mother tongues. At present, English is the most widespread lingua franca of the Western world, used in several fields, including science and medicine. Latin was the lingua franca of Western medical writing for several centuries. The roots of Western medicine lie in Greek; medical learning was transmitted in Latin translations of Greek and Arabic texts, mostly by translators whose first language was not a European vernacular but Arabic or Greek. Galen’s texts became available in the 13th century in Latin commentaries, with several layers of additions. Medical texts started to be translated into vernacular languages such as French, English, German, Portuguese, and Catalan in the 14th and 15th centuries, almost simultaneously in different parts of Europe (see, e.g., Crossgrove et al., 1998). Latin nevertheless retained its firm position as a pan-European language of science. The situation started to change in England at the end of the 17th century, and several authors published in both
languages (Webster, 1975: 267). Latin retained its position longer in other parts of Europe, e.g., in German-speaking countries. It is possible that journal publication in medicine from the 17th century onward played a part in nationalizing medical communication. In the early 20th century, there were rival languages for the lingua franca position in science: French, German, and English. German was a very strong candidate before the Second World War. German had served as a lingua franca in large parts of Europe for centuries, for example, in the Baltic area since the Middle Ages, and the names of several scholarly journals and series in many fields are still in German. The situation started to change in the 1950s and 1960s in favor of English with the increasing impact of Anglo-American culture. The development has been extremely rapid in the last decade, and English prevails in medical research writing to the extent that researchers have noticed signs of register narrowing; for example, in Scandinavian languages the trend was noticed more than ten years ago (Gunnarsson and Bäcklund, 1995: 47). Practically all medical dissertations in the Nordic countries are written in English, and the same trend has been noticed in other parts of Europe. The widening circles of English in medical research writing can easily be verified by looking at the authors of recent issues of the British Medical Journal. The former lingua franca of medicine, Latin, has not entirely lost its position. In hospital communication between medical doctors, Latin is still used in written documents in a highly conventionalized form; for example, in the Nordic countries diagnoses are regularly given in Latin in the title line, and Latin anatomical terminology is employed. As a rule, the rest of the medical report is in the native language, for example, in Finnish or Swedish. A new feature is the use of English abbreviations and English words like sign instead of the native word. In pharmacological use and in medical recipes, Latin abbreviations prevail; abbreviations provide an easy way around the problem of inflectional endings, as Latin is no longer taught much in schools. The trend is toward the increasing use of English.

See also: Medical Discourse: Early Genres, 14th and 15th Centuries; Medical Discourse: Developments, 16th and 17th Centuries.

Bibliography

Crossgrove W, Schleissner M & Voigts L E (eds.) (1998). Early Science and Medicine: A Journal for the Study of Science, Technology and Medicine in the Pre-Modern Period 3(2), Special issue: ‘The vernacularization of science, medicine, and technology in Late Medieval Europe.’
Gunnarsson B-L & Bäcklund I (1995). Writing in academic contexts. TEFA 11, Forskningsgruppen för text- och fackspråkstudier, Uppsala Universitet.
Webster C (1975). The great instauration: science, medicine and reform 1626–1660. London: Duckworth.

Relevant Website

http://bmj.bmjjournals.com/ – British Medical Journal.

Medical Communication: Professional-Lay
K K Zethsen, Aarhus School of Business, Aarhus, Denmark
I Askehave, Aalborg University, Aalborg, Denmark
© 2006 Elsevier Ltd. All rights reserved.

The Information Society and the Changing Status of Experts

Fifty years ago, authorities were generally more clearly defined in society than they are today. Teachers, doctors, and lawyers had undisputed expert status; as such, they were highly respected, and their authority was rarely questioned by the average citizen. Academic training was the exception rather than the rule, and a much larger number of people were employed within the agricultural sector or within other manual sectors following a very limited period of schooling. Being practical, skillful, and able to manage were important values, whereas intellectualism, theoretical discussions, and criticism of society were not the order of the day. To a large degree, knowledge was still passed on by one’s elders and gained by experience. Although not academically trained, most people (men) were skilled in something, often through some form of apprenticeship, and areas of expertise were generally acknowledged. Gradually, this has changed. In the 1960s, academics started challenging the authorities and their status as omniscient. They demanded the right to have a say (not only within their own fields), and this demand required information. From the 1960s on, the number of people receiving academic training
has exploded. Jobs that just a few decades ago required very little schooling cannot now be had without years at college. Written material is distributed like never before, the various media provide a constant stream of information, and the advent of the computer, and of the Internet in particular, has catapulted us into the era of the information society. Many citizens and consumers today consider information their right in a democratic society, but there is another side to the coin. Apart from the self-assured, critical, and challenging citizen, there are still many people who find it difficult to digest even fairly simple texts and who are not used to asserting themselves publicly. This group of people is at a disadvantage in the self-service information society. Authorities have been able to lower their level of personal service and have become used to handing out brochures and referring people to web pages. Furthermore, within some areas the reason for publishing huge amounts of information is not a true desire to convey information but, rather, a way to limit responsibility. That is why, for example, patient package leaflets contain long lists of extremely unlikely side effects, a measure that prevents the medical company from being taken to court on a ‘should-have-told’ basis.

The New Roles of Medical Experts

Society is now at a stage at which it is possible to obtain information about almost anything within a very short period of time. This has changed the way in which experts and authorities are perceived. Now that there is easy access to so much information, it has become possible, and quite common, to criticize and challenge the views and decisions of experts. Within the field of medicine this has led to new roles for medical professionals and the entire medical industry (it should be noted that the contents of this section apply first and foremost to the industrialized world). To a large degree, the medical profession is still highly respected, and much authority still surrounds the health-care practitioner. Nevertheless, many patients do not accept a diagnosis as readily as they used to; they seek second opinions, read up on the matter themselves, suggest alternatives, and so on. Even once they are convinced that the diagnosis is right, they do not always accept the treatment proposed by the doctor; they may challenge his/her views by means of the latest research available on the Internet, and they may seek alternative treatment. Patients want access to their files to check what is going on. The medical industry is being met with demands for openness and for information about the medicine it produces. Patients want to know about possible side effects (and to have them graded statistically) in order to decide whether or not to take the medicine prescribed. Medical experts no longer just diagnose and prescribe; they are expected to be willing and able to inform the patient and to discuss matters with the patient to a degree never seen before. Generally speaking, patients are more literate than ever and are used to seeking and digesting new information. However, even the best-educated members of the population do not possess the background knowledge of a medical expert and are, linguistically speaking, not part of the discourse community of medical experts. There is no doubt that patients who are used to digesting complex texts will often benefit from the information they obtain. However, many people do not possess this ability, and this is problematic because in today’s society they are expected to. In earlier days, the doctor assumed complete responsibility; on the one hand, this meant that the patient was left to the mercy or competence of the individual doctor; on the other hand, it meant that the patient could leave things in the hands of fate and the doctor and did not have to carry the burden of being informed and of making informed decisions. But in the 21st century, authorities expect people to understand the technical and semitechnical language of doctors, various health campaigns on prevention and warning signals, the contents of patient package inserts (also in connection with the increasing amount of medicine sold over the counter, i.e., without any consultation), a number of informative brochures on specific ailments, their own medical files, and so on. This may be quite feasible for part of the population, but it is likely to be problematic for many people.

What Is Medical Language?

Medical language is traditionally regarded as the language used by medical experts when communicating in an expert-to-expert context. It is the language of the ‘specialist,’ often defined as a special language as opposed to the general language used by the general public in everyday situations:

Special languages are semi-autonomous, complex semiotic systems based on and derived from general language: their use presupposes special education and is restricted to communication among specialists in the same or closely related fields. (Sager et al., 1980: 69)

Those who master medical language have been encultured or socialized into the language. The student of medicine automatically becomes a student of medical language when attending university to become a health-care practitioner. Thus, apart from acquiring knowledge about the medical field, s/he learns to communicate with peers using the linguistic tools appropriate in the medical context. It is a process in which books, medical journals, traineeships at hospitals, conversations with lecturers, fellow-students, doctors, etc. contribute to a gradual buildup of a specialist, medical language. When the student graduates, s/he not only possesses thorough knowledge of the medical field, s/he also masters the language of the medical discourse community.

Characteristics of Medical Language

The most obvious characteristic of medical language is its extensive use of words related to the subject matter – also referred to as ‘medical jargon.’ Apart from the medical jargon, medical communicators also favor a passive and impersonal style that focuses on objective, measurable phenomena rather than concrete actions. This style is attained through the use of heavy noun phrases (with nominalized actions), passive clauses, and a preference for third-person pronouns rather than first-person pronouns. The medical jargon and the passive and impersonal style allow experts to provide precise and condensed information for other experts who are trained to perceive and consequently talk about the physical world in a rational, objective, and measurable way. An example is a pharmacist describing the attributes of a medicinal product in the so-called product summary, an official document from the pharmaceutical company that provides approving authorities with the information they need to authorize the marketing of the product. It also gives health-care practitioners detailed information about the medicinal product, as in the following extract:

The antidepressant, antiobsessive-compulsive and antibulimic actions of fluoxetine are presumed to be linked to its inhibition of CNS neuronal uptake of serotonin. Studies at clinically relevant doses in man have demonstrated that fluoxetine blocks the uptake of serotonin into human platelets. Studies in animals also suggest that fluoxetine is a much more potent uptake inhibitor of serotonin than of norepinephrine. (http://pharmahelp.com)

This example is used within a medical discourse community that is ‘pure,’ in the sense that the community is composed of equals or near-equals in knowledge and professional role (pharmacist to doctor). However, medical genres and medical discourse are not restricted to experts talking to experts. As patients, consumers, and members of society in general, nonspecialists momentarily enter the medical discourse community – not as producers but as consumers of medical texts, such as patient package inserts for medicinal products and social marketing leaflets explaining the dangers of smoking, or when they consult GPs or pharmacists. Therefore, medical language is not restricted to the discourse community of experts but can also be found in communities in which the addresser is a professional and the addressee is a layperson.

The need for medical information accessible to people outside the professional medical domain has brought about a significant change in the premises of medical communication. The traditional symmetrical communication between equals (e.g., pharmacist to doctor) has been challenged by the demand for asymmetrical communication between experts and laypeople (e.g., doctors to consumers). This demand calls for recognition of a medical language at different levels of abstraction. If both communicators are specialists, the highest level of abstraction (i.e., ‘traditional’ medical language) is the obvious choice. If the communication is asymmetrical (professional-lay), the medical subject matter has to be adjusted to the knowledge of laypeople. In practice, this means that texts that originate from an expert discourse community but serve as the basis for a consumer-oriented version need to be ‘translated’ to become meaningful to nonexpert readers.

What Is Professional-Lay Medical Language?

This section deals with written communication, although many features apply to oral communication as well.

Target Group

In 1859, Kierkegaard wrote about the following situations in which experts, or people who know more than others, want to convey their knowledge to other people: If I am to succeed in guiding another human being towards a certain goal, I have to find the place where he is and start right there [. . .]. In order to help somebody, I certainly have to understand more than he does, but first and foremost understand what he understands. (our translation of Kierkegaard, 1869, in Becker Jensen, 2001: 18)

Kierkegaard tells us that if you do not have this understanding it is no help that you are more knowledgeable than your target group, and he adds that: All true helpfulness begins with humbleness towards the person I seek to help, and this is why I have to understand that helping is not wanting to rule, but to serve. If I cannot do this, I cannot help anybody (our translation of Kierkegaard, 1869, in Becker Jensen, 2001: 18).


Kierkegaard points out two important maxims that are still valid for expert-to-layperson communication:

1. It is the level of the target group that should determine the level and style of the text, not the writer’s level – otherwise the extra knowledge of the expert becomes useless to the reader.
2. The writer must be humble and possessed of a true desire to be of assistance, i.e., should not be preoccupied with his own status and authority.

The target group in expert-to-layperson communication is often potentially the entire population and must therefore be characterized as very broad indeed, which is why visualizing a target group may well be a substantial problem for the writer. The expert writer is in danger of overestimating his audience because of his own extensive knowledge and may be afraid of sounding patronizing if too much is explained. An expert writer may also feel that very simple language questions his status as an expert – after all, expert language signifies expertise and authority to many. All things considered, if there is a true desire to make a text understandable to all laypeople and not just half of them, the safe way is to use the lowest common denominator as a yardstick. If there is a well-defined target group, the lowest common denominator within that group should be used. Kierkegaard’s two maxims should be kept in mind at an overall level, whereas the following may be of assistance at a more specific level when writing for laypeople.

Characteristics of Professional-Lay Medical Language

Incomprehensible Medical Jargon

Perhaps the most defining feature of medical – or expert – language is the use of expert terms unknown to most people. For example, we may see the use of ‘therapeutic indications,’ ‘contra-indications,’ and ‘interactions’ in the headings of an insert. These terms should be replaced by lay terms when possible or should be paraphrased. It should be noted that most medical jargon has Latin or Greek roots, but that the extent to which Latin medical terms have been incorporated in everyday language varies greatly from country to country. French and English have, for instance, been more receptive to Latin than German and the Scandinavian languages.

False Friends

False friends within an expert-to-lay context are words and expressions that are used both in everyday situations and in special contexts but whose meaning differs depending on the context in which they are used. For example, ‘to administer’ is the formal expression for giving someone a drug. However, the expression is also used in business or legal settings with a totally different meaning. It may be very confusing to the reader if he knows the word well from other contexts but cannot make sense of it in the context in question. The use of such terms should therefore be avoided.

Inconsistent Use of Synonyms

Generally, medical language is characterized by sparse use of synonyms. When synonyms do occur, perhaps especially in expert-to-lay texts in which semiexpert terms or lay terms are used as well, the reader who does not possess the expert knowledge needed to judge whether the terms are synonyms may become confused when confronted with two or three different terms for the same thing. This is, for example, the case when ‘lactation’ and ‘breastfeeding’ are used interchangeably. Stylistic variation should be avoided if there is a risk of sacrificing understanding.

Long or Complicated Words or Expressions

Traditionally, medical expert language often makes use of officialese. Strictly speaking, this has nothing to do with medical jargon or with the advantages of expert language such as brevity and precision; rather the opposite. Still, medical texts are often characterized by unnecessarily long or complicated words and expressions that make the text more difficult to digest. These superfluous words or blown-up expressions should be removed or replaced with simpler ones.

Long and Complicated Sentences

These often arise when long and complicated words and expressions are coupled with an overly complex sentence structure, resulting in lengthy, inflated sentences. Such sentences should be reduced by omitting superfluous words and expressions and can in many cases be split up into two or three shorter sentences.

Passive and Impersonal Style

In expert communication, it is often not particularly relevant to know who the ‘actor’ is.
Thus, instead of using the active voice, medical experts rely on a passive style making use of the passive voice and nominalization. In the passive voice, the person performing the action is deleted. This strategy may be quite useful in texts in which the actor is either unknown or simply not important, but in cases in which the patient needs to know that s/he is to perform some kind of action (e.g., in connection with a patient package leaflet) it is important to use the active style. The passive style makes the text impersonal and forces the reader to take extra mental steps as s/he converts the passive sentence into an active one in order to work out “who is doing what.” For example, an impersonal expression such as “X should not be taken during the first 3 months of pregnancy” should be replaced by the more active expression “you should not take X during the first 3 months of pregnancy.” Passive sentences should be turned into the active voice, and undue nominalizations should be avoided in favor of the more direct verbal form.

Too Much Information in One Sentence

In contrast to sentences containing superfluous words, these sentences are complex because they contain too much relevant information. For example, “This includes previously untreated patients and patients who have previously responded to treatment with X, but whose condition has recurred.” Apart from listing too much relevant information, the sentence is also complex because it is packed with heavy noun phrases. Very often, sentences become long and complex because the writer adds extra information to the noun by means of pre- and postmodifiers. The result is heavy noun phrases, which – when unpacked – would constitute sentences in their own right. To nonexperts, these sentences can be very difficult to unpack correctly. Such sentences should be edited by splitting them up into two or three shorter sentences.

Remnants from Translation

Many medical experts are used to reading and discussing in English although it is not their native language. Modern non-English medical language is often heavily influenced by English lexis and even syntax. This may be a problem for non-English-speaking laypeople.

Presuppositions

A text that has a logical structure to an expert may not be logical to a layperson. If the writer relies on presumed (but nonexistent) background knowledge of the reader, the reader may be unable to follow the thoughts of the writer and may think that he jumps to conclusions.
Evidently, it is not realistic to avoid all of the above expert features completely. But the reason why expert language may be difficult for laypeople is that these features tend to appear together in a sentence. Several studies show that they make the text less accessible to the general reader (see, e.g., Killingsworth and Gilbertson, 1992; Killingsworth and Steffens, 1989).

Length

There are of course different conventions for the length of a text, but lay texts should never be longer than strictly necessary.

Print Size

Information on medicine is often provided in far too small a print, which may discourage readers even before they have started. The print size should be reader-friendly.

Order of Information

The order of information should be as logical as possible. Usually a sound strategy is to place the most important information at the beginning of the text.

Headings

Well-placed, informative, and precise headings help weak readers navigate through a text. Headings should be written in clear and understandable terms and should not be too long.

Pictograms

For instructional texts, pictograms may be a good solution, but the pictures or symbols should in no way be open to interpretation. This is more difficult than it sounds – a picture of a glass of water may to one reader mean that a pill should be dissolved in a glass of water, whereas to another it indicates that a glass of water should be drunk after swallowing the pill.

Conclusion

In spite of this attempt to describe the ‘ideal’ characteristics of professional-lay medical communication, professional-lay communication is still in its infancy and the discourse conventions are not in place. Professional-lay discourse in the medical context may best be described as a ‘pseudo’ discourse for the time being, in which professionals – with varying success – try to adapt their medical language to a mixed audience (potentially the entire population) whose knowledge of the subject matter and the discourse conventions of the medical field is very restricted. What we experience is a semiprofessional discourse that attempts to merge the qualities of traditional medical language with the discourse conventions of ‘plain English’ style guides. One could speculate about the reasons for the lack of successful professional-lay communication. No doubt many of the problems can be attributed to the enculturation of health-care practitioners into the medical domain through language, which means that the discourse and practice of medicine are difficult to separate. More specifically, it has the following consequences:

• Medical experts generally lack the ability to downgrade their special language in order to accommodate a target group of nonexperts. The medical experts are experts within medicine, not within plain English communication.
• Medical experts feel less inclined to adopt a fully professional-lay discourse, as it may question their status as experts – after all, expert language signifies expertise and authority to many. They may even feel that talking medicine at a lower level of abstraction demystifies their profession and results in status loss.
• Finally, medical experts resist the professional-lay discourse for ideological reasons. Because of their scientific schooling, medical experts may regard professional-lay discourse as a language for mediation, inappropriate for talking about medicine because the required simplification in professional-lay discourse (which promotes a personal, subjective, action-oriented style) does not meet the demands for precision, conciseness, objectification, and passivity, which constitute the paradigm of medical science.

Thus, instead of developing a professional-lay medical discourse, which enables experts to explain medical concepts at a lower level of abstraction, the experts resort to their habitual way of communicating (i.e., the language into which they have been socialized when acquiring their expert knowledge), even though their expert language is difficult for ‘outsiders’ to understand and therefore hampers the readability and usability of consumer-oriented medical documents. And so, today, although the medical community produces numerous documents for patients (package inserts, health leaflets, information letters prior to hospitalization, etc.), the professional-lay discourse has not been fully developed (and embraced) by the medical discourse community. Therefore, the discourse remains a hybrid between traditional medical language and consumer-oriented plain English.

See also: Jargon; Languages for Specific Purposes; Medical Communication: Linguas Francas; Socialization.

Bibliography

Askehave I & Zethsen K K (2000a). The patient package insert of the future. Report for the Danish Ministry of Health [Danish and English version]. Aarhus: Aarhus School of Business.
Candlin C N & Candlin S (2003). ‘Health care communication: A problematic site for applied linguistics research.’ Annual Review of Applied Linguistics 23, 134–154.
Consumers’ Association (2000). Patient information leaflets: sick notes? Report, June 2000.
Janssen D & Neutelings R (eds.) (2001). Reading and writing public documents. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Killingsworth J M & Steffens D (1989). ‘Effectiveness in the environmental impact statement.’ Written Communication 6, 155–180.
Killingsworth J M & Gilbertson M K (1992). Signs, genres and communities in technical communication. New York: Baywood Publishing Company.
OECD and Statistics Canada (2000). Literacy in the information age: Final report on the International Adult Literacy Survey. Paris: Author.
Sager et al. (1980). English special languages: principles and practice in science and technology. Wiesbaden: Oscar Brandstetter Verlag.
Sless D & Wiseman R (1997). Writing about medicines for people – usability guidelines for consumer medicine information. Canberra: Communication Research Institute of Australia.

Medical Discourse and Academic Genres

J Piqué-Angordans, Universitat de Valencia, Valencia, Spain
S Posteguillo, Universitat Jaume I, Castelló, Spain
© 2006 Elsevier Ltd. All rights reserved.

The Scientific Medical Discourse Community

The construction of knowledge is both an individual and a social task. It is a continuous dialectical process that has been taking place for generations. This process has adopted different forms of communication, namely, oral (conferences, conversations, dialogues, remarks, among other spoken genres) and written (editorials, research articles, letters, peer reviews and referees’ remarks, case studies, and even chats and e-mails). New knowledge is drawn from the conversion of research resources derived from “how authors, editors, and reviewers together transform the raw material of manuscripts into the finished product of validated knowledge claims” (Chubin and Hackett, 1990: 95). This is usually carried out through a series of conventions that each discourse community has. While attributing to the communicative purpose the role of linking genres, tasks, and discourse community goals,


Swales (1990: 34) proposed a conceptualization of discourse communities:

Discourse communities are socio-rhetorical networks that form in order to work toward sets of common goals. One of the characteristics that established members of these discourse communities possess is familiarity with the particular genres that are used in the communicative furtherance of those sets of goals.

This is why he further adds that genres do not belong to individuals, but “are the properties of discourse communities.”

Dialogical communication among scientists within a discourse community takes different shapes and forms. More often than not, their communication is an effort to win acceptance by their readership, particularly when controversial research is involved. In Myers’s (1990: 144) words, scientific texts often represent “a negotiation of knowledge claims, not simply the communication of knowledge.” This negotiation is carried out through the different genres at the disposal of a given discourse community. In addition, genres not only are established sites of social action, but also help to coordinate the work of groups and organizations. This social action in the science-making process is carried out through scientific communication in a continuous exchange of information with other researchers in the same field and finally represents the acceptance of the investigation by the scientific community.

Communication among scientists and between scientists and their readership has been particularly enhanced by developments in computer science through which a web of information is distributed with no limitations in sight. “Links from electronic articles,” wrote Delamothe (2002: 1477), “can pass backwards, forwards, and sideways to other articles.” Delamothe further added, quoting from the Charleston Report (2002, 7, 1–2), that “a single document is no longer the pivot of knowledge but rather a node in a cognitive web in a system of coupled databases.” The Internet’s potential is said to be unlimited, at least in reference to most first-world countries. Unfortunately, however, this does not hold true for third-world countries. Data show that only 2% of the world population has access to the Internet.

Written Communication

Since the 17th century, medical writing has gone through what Lang (2000) called the age of “formalized medical writing,” in which a series of major steps took place. The first Western scientific journal, Journal des Sçavans, was published in French and in English by Denis de Sallo, a member of the French Parlement, in 1665, antedating the Philosophical Transactions of the Royal Society of London by only 10 weeks. The first English medical periodical was the Medicina Curiosa, of which the only two issues extant appeared on June 17, 1684 and October 23, 1684. Over a century later, in 1797, the first medical journal, The Medical Repository, was founded in the United States. In 1812, J. C. Warren established The New England Journal of Medicine and Surgery and the Collateral Branches of Science. This journal, after merging with Medical Intelligencer, became the weekly Boston Medical and Surgical Journal, which has been the official conduit of information of the American Medical Society since 1914, and, finally, in 1928 it became The New England Journal of Medicine. In Great Britain, the first issue of the Provincial Medical and Surgical Journal, considered the precursor of the British Medical Journal (BMJ), was published in 1840. A hundred years later, the American Medical Writers Association (AMWA) was established. In the 1960s, authors began to incorporate abstracts in medical journal articles, and the first edition of the AMA Manual of Style appeared. Pasteur introduced the methods section in the 1870s, essentially creating the IMRD (Introduction, Methods, Results, Discussion) format, also known as IMRAD (Introduction, Materials and Methods, Results And Discussion) and including TAIMRAD (Title, Abstract, Introduction, Materials and Methods, Results, And Discussion) (Maher, 1992); however, it was not until 1972 that this structure became the official standard for presenting scientific information (Standard Z39 of the American National Standards Institute). In 1979, the Uniform Requirements for Manuscripts Submitted to Biomedical Journals were published, and the structured abstract was introduced in 1987. Finally, in 1997, the Consolidated Standards for Reporting Trials Statement (the CONSORT Statement) was published.
The organization of top biomedical journals is not always well delimited and structured, resulting in dissimilar patterns. This can be attested by looking at five of the journals most widely consulted by researchers, namely, Annals of Internal Medicine (twice monthly), British Medical Journal (BMJ) (weekly), Journal of the American Medical Association (JAMA) (weekly), The New England Journal of Medicine (weekly), and The Lancet (weekly), all listed in Index Medicus and Medline. For example, most of them will include the typical original and review articles, editorials, reports from clinical medicine, occasional notes or brief notes, book and journal reviews, and the like. In addition, some will feature special articles, such as The New England Journal of Medicine, in which a series of cases from the registry of the Massachusetts General Hospital is presented. Both The Lancet and Annals of Internal Medicine also offer short comments and brief communications, and the BMJ has special cases based on Internet reader feedback. Of the five top medical journals mentioned, BMJ and The Lancet offer free access to their full contents, although the latter requires the user’s free registration; quite a few articles are also offered in JAMA. In the remaining journals, only contents (major features) and abstracts can be freely accessed, and sometimes a summary for patients (as in Annals of Internal Medicine). Obviously, the easier the access to these journals, the more feedback is obtained from the readership through the Internet, which makes the process one of the most innovative means of communication to and from a given medical journal.

Academic Genres in Medicine

The Importance of Worldwide Communication in the Medical Profession

Communication in medical English has been essential throughout the history of medicine (see Medical Discourse: Sociohistorical Construction). As medical science became internationally widespread, the need to keep in touch for the sake of the development of the science has become increasingly important for a number of reasons: first, as a means to complement research findings; second, to avoid the repetition of experiments that yielded no positive results; and third, to control new diseases. These three factors are closely interconnected, as the following example illustrates: when AIDS began to spread around the world, major medical research centers had to share information, first, in order to control the disease, and, second, to create useful treatments that, frequently, were the result of complementing reports of previous investigations by other colleagues. Most likely, SARS will again show the relevance of all these factors. The words that doctors articulate in medical communication may be organized into various formats, but they always follow well-designed communicative written or spoken patterns. Linguists have come to label these recognizable patterns as ‘academic genres’ (Swales, 1990). The fact that medical professionals systematically resort to specific genres is a key factor in medical communication. A second essential element is the fact that doctors around the world basically communicate in a single language: English. This has a series of consequences for all professionals in the field. In the first place, doctors must attain a reasonable level of English proficiency in order to read about new advancements in their discipline and to publish their findings internationally (necessary for promotional purposes in most countries). The problem is that the international use of the English language in the medical profession does not simply entail transferring one’s findings into English, but also includes adapting one’s rhetoric to the features that are representative of academic English rhetoric (in its written or spoken versions).

Written and Spoken Genres

The concept of genre, as we have already mentioned, is a key term in medical communication. In fact, all medical communicative events may be classified into specific written or spoken genres (see Medical Discourse: Doctor–Patient Communication; Medical English: Conferencing; Psychotherapy and Counselling). With regard to written genres, we may highlight the following: editorials, research articles, abstracts, case reports, review articles, peer reviews, replies to these reviews, letters of acceptance/rejection of a paper, conference programs, medical popularizations, letters of application, book reviews, and letters to the editor (see Medical Journals: Letters to the Editor), to name but a few. Most of these genres are found in many other academic disciplines; however, each of them develops a set of peculiarities characteristic of the medical profession alone. The general patterns as well as some of the specific medical intricacies of several of these genres are elucidated below.

Stability through Time

Medical genres, especially written genres, have become quite stable in their form, structure, and style. This is mainly due to the fact that, as an independent scientific discipline, medicine is centuries old. Doctors have thus had hundreds of years to find fixed patterns – at least in certain paradigmatic genres – that adequately account for their medical findings. The Royal Society of London contributed to the development of key genres such as the research article in medicine. Nowadays, research articles in this discipline tend to follow the well-known IMRAD pattern, which has since been adopted by several other scientific disciplines. Engineering, however, and, in particular, new branches in this area, such as computer engineering, do not necessarily follow the IMRAD model.

Dynamism

The concept of genre, as defined by Swales (1990), implies that these textual patterns are subject to change and evolution. As outlined above, in medical writing there is an overall tendency toward a specific set of fairly stable patterns or genres. This, however, does not mean that the dynamic characteristic of genres is absent in medicine. Medical genres are also sensitive to changes in the outer and wider contexts surrounding the discipline. For instance, the introduction of new technologies or, as they are currently referred to, the information and communication technologies (ICTs), has triggered a significant amount of change and transformation in the way that research is generally carried out or communicated throughout the world. The ICTs have generated the following major changes in medical communication and in medical genres:

1. The introduction of completely new generic forms (e.g., videoconferences used for lecturing via the Internet or even to carry out surgery under the supervision of a highly qualified surgeon miles away from the operating room; the use of e-mail and websites).
2. The increasing relevance of some already existing specific genres, namely, abstracts, which were used for decades before the invention of the Internet and Computer-Mediated Communication (CMC), even before computer databases. However, the Internet and, most particularly, the World Wide Web have increased the importance of abstracts exponentially as a key means of gaining access to relevant information.
3. The progressive transformation of existing genres, as in the case of conventional research articles, which have seen two stages in their inclusion in the web system. Initially, academic papers were simply copies on the Internet that could be consulted or downloaded as they appeared in academic journals; however, the existence of on-line journals and the introduction of faster Internet facilities currently allow medical researchers to incorporate into their papers high-quality images, diagrams with movement, URLs with complementary information, etc. Thus, research articles now include a new set of elements heretofore absent from conventional written academic research articles.

Abstracts

The myriad of medical research papers published around the world today has made abstracts a tool of increasing relevance for medical professionals (see Medical Discourse: Structured Abstracts). The abstract allows a doctor or a scientist to select papers that are relevant to their everyday practice or research projects. The Internet, on-line databases, and other electronic resources have been added to the already vast research documentation available in conventional academic journals. Not only has the Web assisted researchers in locating relevant papers more quickly but, at the same time, it has also exponentially increased the number of texts likely to be profitable reading. The medical professional, therefore, will inevitably have to read abstracts in order to be selective when choosing from such a wide array of reading material. Consequently, the abstract has become an essential genre in medicine. However, important as abstracts are nowadays for medical advancement, it is interesting to note that they are a recent addition to the already established genres, in particular, the research and review papers, to which they are closely related. According to the Ad Hoc Working Group for Critical Appraisal of the Medical Literature (1987: 600), up until the late 1960s, abstracts were not included by most clinical journals. The next step taken by some journals was to move the summary and conclusions to the beginning of the research article, as occurred in JAMA and the Canadian Medical Association Journal, and shortly thereafter in the BMJ.

Editorials

Editorials are introductory articles in the initial pages of an academic journal. As opposed to research articles or case reports, editorials are subjective in their content, they are usually written by a single author, and they reflect the expertise of that person. For instance, Salager-Meyer and Alcaraz-Ariza (2003), when studying the frequency of academic criticism in four written genres of medical discourse, namely, editorials, review articles, research papers, and case reports, found that the frequency of critical statements was significantly greater in editorials than in the other three genres. We believe that the results of this study, although carried out on a corpus of Spanish medical texts in relation to the use of criticism, could also be extended to medical discourse at large. Thus, editorials are subjective, argumentative, and evaluative (in fact, evaluation is their main communicative function). Editorials are written by expert knowledge-holders and knowledge-builders and include more critical statements than other medical genres. It is likely that only peer reviews and book reviews might incorporate a similar or greater level of direct academic criticism, though occurring to a lesser extent. Editorials are also persuasive in their nature, since they try to convince their readership of the writer’s point of view regarding the assessment of the papers being introduced or the ideas being expressed. Accordingly, editorials could be more specifically defined as introductory, persuasive, critical, and evaluative academic texts. They are opinion pieces within medical journals, analyses to evaluate previous and present work, and essays to suggest further areas of research.


Research Articles: Internal Structuring

The scientific or research article can be defined as a technical document that describes a significant experimental, theoretical, or observational extension of current knowledge or advances in the practical application of known principles. The reported findings must not only be original, that is, previously unpublished, but also valid in terms of providing sufficient and important information, and they must be published in accordance with a structure and style previously agreed on by the scientific medical community. The Internet has introduced a whole new set of parameters in biomedical publication. Top journals in medicine have decided to go on-line with their publications, and this is the case, for instance, for the five top journals mentioned above, to name but a few. In some cases, and for some specific types of articles, journals have begun to avoid printing and, consequently, the articles in question never appear in the print journal. Newmark and Tracz (1997) predicted that, within 5 years, on-line publications would still closely resemble papers. We believe, with Edward Huth (2000: 5), Editor Emeritus of Annals of Internal Medicine, that it may basically be an economic question, a question of “needs and costs in a marketplace.” Electronic publishing of journals will actually supersede paper publishing, he adds, “when the new methods satisfy needs of authors and readers better than the paper medium and at a lower price.” In addition, through multiple links, information about other articles from different journals is provided, a given paper or a report is cited, and readers are also invited to search for possible related articles through databases such as Medline. As indicated above, medical papers are basically organized following the IMRAD macrostructure. This is the required macrostructure for both paper journals and on-line publications. Their mode of referencing, through the so-called Vancouver system, has been consistent since it was agreed on at a meeting of medical journal editors in Vancouver, Canada, held in 1978. Let us now examine the internal structure of each section of the IMRAD macrostructure.

Introduction

The introduction is probably the most complex section of the medical research article, but it also follows a more fixed structure than the other three sections do. It cannot be reduced to a mere expression of the author’s aim in writing the paper. Rather, it should constitute the connecting link between readers and the research carried out by the authors. Through this fixed structure of the introduction, authors control their ideas and adapt them to their rhetorical aims and to the needs derived from their belonging to a specific scientific community. The structure of the introduction is constituted by a set pattern based on three basic moves that Swales (1990) described in terms of the CARS model (Create A Research Space), later reinterpreted and applied to the whole body of the medical research article by Nwogu (1997). These three introductory moves and their ‘constituent elements’ are depicted in Table 1.

Materials and Methods

The methods section, or materials and methods section depending on the type of study (e.g., laboratory-based studies), basically describes the sample and its size, the subjects and criteria for inclusion or exclusion from the research, and the procedures followed in carrying it out. This section includes all the necessary details that enable readers to accurately grasp the research process, to the extent that the scientific community will be able to replicate it with the information provided. Structurally this section may include three basic moves (see Table 2): the first describes the data collection procedure (source and size of data, sample description, and selection criteria), the second describes the experimental procedure (apparatus description and experimental process), and the third describes the data analysis procedure (definition of terminology, classification of data, analysis or modification of the instrument or the procedure, etc.).

Table 1 Introduction moves in the medical research article (Nwogu, 1997: 135)
Move 1: Presenting background information by:
  (1) Reference to established knowledge in the field
  (2) Reference to main research problems
Move 2: Reviewing related research by:
  (1) Reference to previous research
  (2) Reference to limitations of previous research
Move 3: Presenting new research by:
  (1) Reference to research purpose
  (2) Reference to main research procedure

Table 2 Structure of the Materials and Methods section (Nwogu, 1997: 135)
Move 4: Describing data collection procedure by:
  (1) Indicating source of data
  (2) Indicating data size
  (3) Indicating criteria for data collection
Move 5: Describing experimental procedures by:
  (1) Identification of main research apparatus
  (2) Recounting experimental process
  (3) Indicating criteria for success
Move 6: Describing data analysis procedures by:
  (1) Defining terminologies
  (2) Indicating process of data classification
  (3) Identifying analytical instrument/procedure
  (4) Indicating modification to instrument/procedure

Results

In the results section, which is the core of the research paper, the results are presented along with charts, graphs, tables, and figures, and the most important findings are underscored. Generalizations (see Table 3) are made based on the results obtained in the study and possible reasons are drawn to justify those results in view of results from other studies, termed by Nwogu (1997) as an indication of “consistent observations.” It also provides possible negative results, that is, “those results which do not conform with expected outcomes in the study,” so-called “nonconsistent observations” (Nwogu, 1997: 131).

Table 3 Structure of the Results section (Nwogu, 1997: 135)
Move 7: Indicating consistent observation by:
  (1) Highlighting overall observation
  (2) Indicating specific observations
  (3) Accounting for observations made
Move 8: Indicating nonconsistent observations

Discussion

The final section, or discussion, has as its main objective the description of how these results fit in with what is already known about the topic of the study. Authors will refer back to the main purpose or hypothesis advanced in the introduction and see whether, in view of the results obtained, it can be held or not and whether it agrees with the findings of other researchers (see Table 4). This section often ends by making a generalization of the results obtained, their implications, and their possible applications, and, if an interesting but as yet unanswered question has arisen from the study, future research is usually recommended. Explanations or speculations may also be necessary to justify the possible limitations of these results.

Table 4 Structure of the Discussion section (Nwogu, 1997: 135)
Move 9: Highlighting overall research outcome
Move 10: Explaining specific research outcomes by:
  (1) Stating a specific outcome
  (2) Interpreting the outcome
  (3) Indicating significance of the outcome
  (4) Contrasting present and previous outcomes
  (5) Indicating limitations of outcomes
Move 11: Stating research conclusions by:
  (1) Indicating research implications
  (2) Promoting further research

Table 5 Structure of the RA Discussion section (Reprinted from Docherty and Smith, 1999: 1224)
• Statement of principal findings
• Strengths and weaknesses of the study
• Strengths and weaknesses in relation to other studies, discussing particularly any differences in results
• Meaning of the study: possible mechanisms and implications for clinicians or policy makers
• Unanswered questions and future research

Docherty and Smith (1999) argued for a discussion section in which similar points are raised, but with particular emphasis on differences from previous research, as well as on unanswered questions throughout the author’s research. They proposed the structure shown in Table 5. This idea of speculating about the results in the discussion section of the research paper has given rise to a certain amount of polemic. To some extent, it reflects the opinion of those authors who defend a less rigid structure and favor the message itself. However, even though there is widespread acceptance of the writing conventions of a scientific community, some authors prefer – perhaps in an effort to increase their chances of publication in top medical journals – to focus more exclusively on the message itself. It is further contended that there is a tendency among authors to use excessive ‘rhetoric’ to speculate on the results, going in fact ‘beyond data’ and thus inducing polemic within the scientific community. Their advice to authors is that they should adhere more to the facts and dwell less on rhetorical devices. Skelton and Edwards (2000), however, opposed this restrictive view and suggested that speculation not only is desirable but, in fact, cannot be removed by imposing structural rules in medical research papers, since science is actually speculation, because there is a great deal of doubt and uncertainty. From a linguistic point of view, this is shown by the systematic use of modality in this section. These authors justified their opinion by pointing to the fact that it is in the results section where we find the data and where the statistical conventions are found to determine what is ‘significant,’ whereas determining how they are ‘relevant’ is the prerogative of the discussion section.

Case Reports

Case reports constitute a special section within certain journals; in these reports, the medical history of a single patient, usually related to an unusual pathology, is described. However, it is also true that these reports often constitute a case series about a specific and rare disease in which the medical histories of a number of patients may be studied and described. The aim of this type of text is basically to illustrate an aspect of the treatment prescribed and the reaction to that treatment in relation to the disease common to all the patients in the survey or in the series of case reports. Some authors (Iles, 1998) stress the importance of case reports and encourage writing them since it is one of the best ways to get started in medical writing. Case reports have also served as source material in educational programs throughout the medical university curricula.

The structure of this genre depends on the type of case study. In some journals, they appear as a text of considerable length, in which the case report is preceded by an introduction and followed by a discussion. However, the medical literature essentially provides its information based on the structure shown in Table 6. Iles (1998) proposed a similar structure that consists of the following sections: an introduction and a case description, followed by a discussion and comments. This structure closely resembles the widely used problem–solution paradigm for text structure that consists of four moves, namely, situation, problem, solution, and evaluation.

Table 6 Typical structure of a case report
(a) Presentation of signs and symptoms of the case in question
(b) Description of tests performed and interventions
(c) Presentation of possible diagnoses
(d) Treatment administered
(e) Outcome evaluation

The overall frequency of the case report was reduced between 1965 and 1995. This reflects the marginalization of the genre in the medical literature. The total number of case reports in the BMJ has gradually decreased: the January 1965 issues contained 43 case reports; between 1970 and 1990, the figures ranged between 8 and 20; and in the January 1995 issues there was only 1 case report. At the same time, the genre has been adapted to new specialized functions. The BMJ has introduced special sections in which case reports are selected for specific purposes under the headings ‘Lesson of the week’ and ‘Drug points,’ where the didactic purpose is emphasized and the focus is on a particular aspect, such as medication or complications.

Another explanation for the decrease of case reports in printed journals is that they are being published in on-line versions. The following is a paradigmatic example of how a case report can be used with the active participation of readers through the Internet. The BMJ presents a case report that involves many interested people and, at the same time, contributes to the science-making process through the subsequent dialogical communication that is produced. This journal includes interactive case reports, under the general heading of ‘Clinical Review.’ Sodeck et al. (2003) presented a three-part case report in three different issues (April 26, 2003; May 3, 2003; and May 24, 2003), in which the case was studied and a follow-up of its progression proposed. Its pedagogical implications became obvious when, in the process, the readers were asked several questions, thus inviting them to respond through the Internet. The questions from the first two parts produced 127 responses from readers in over 35 different countries, between April 26, 2003 and May 17, 2003. Part 3 offered an added element to this interactive case report; Peile (2003: 1136) postulated that there is more to be learned from the discussion than from the diagnosis and this occurs by focusing primarily on learning.

Review Articles

When speaking of primary and secondary research sources, review articles are considered as examples of secondary research material. They are, nonetheless, a very useful tool for researchers since they give a general view of the topic under discussion. They provide not only possible source materials, but also information on how to locate them. The starting point of review articles is the original medical literature which has already been published. Through a thorough analysis, review article authors give an overview of recent publications and developments in a given area, subarea, or very specific topic. Review articles may then be considered a state-of-the-art description and revision of a given topic. The literature classifies review articles into three different categories:

Systematic Reviews

A systematic review can be defined as a method of locating, evaluating, and synthesizing evidence. It is subject to predefined criteria for authors, and it has been recently qualified as a “new PubMed filter” (Nahim, 2002). The criteria are usually defined by the journal itself and authors must adhere to them for publication in that given journal. They are usually set out in the journal’s instructions and mainly concern the identification, selection, limitation, and interpretation of the bibliographical material. In other words, they have to do with data source, for instance, conference proceedings only; data collection and analysis, in which the process of abstraction of data is described; and the method of analysis and number of variables analyzed in order to obtain the results for the review.

Nonsystematic Reviews

Nonsystematic reviews, also called narrative reviews, are not subjected to any set of pre-established criteria of selection of the materials to be reviewed. Without a rigorous criterion of analysis, however, the authors’ opinion is paramount and the risk of bias in the selection of materials, as well as in the identification process, may be present. This creates the risk of producing undesired results, due mainly to the writer’s possible bias in reviewing other people’s work.

Meta-analytic Reviews

In meta-analytic reviews, a clear set of guidelines for reviewing results of previous studies is offered. Some authors have referred to this type of review as a series of ‘analyses of analyses,’ in which the review article analyzes data collected from different studies that are considered as being part of one large investigation. Its scope, therefore, is also on one topic but only in terms of analyzing the results obtained in those various studies reviewed. It is not just a simple description of results; it tries to answer the question of why some organ or some mechanism works the way it does.

Peer Reviews

Peer reviews have existed since the 17th century; however, because of their heretofore undisclosed significance, they have been an undervalued genre. For centuries, peer reviews have remained anonymous, their identity accessible only to a select group within the scientific community. This, however, is no longer the case, especially in medicine. Peer reviews have become absolutely essential in the development of medical research. It is through these texts that decisions on what is to be published or rejected are made (see Medical Writing, Revising and Editing). High-ranking medical journals have a rejection rate of 80% and elite journals of approximately 90%. Peer reviews are the tool used for these rejections. Nevertheless, authors consider one-third of these reviews as irrelevant and 17.5% as incompetent (see JAMA, 1998). If this is true, relevant research is not being published for subjective reasons. Nature showed good sense in admitting that its peer review process has not always successfully identified significant new work, but it is not the only journal to have made ‘historical misjudgments.’ Campanario (2004) compiled an extensive list of rejections and criticisms of manuscripts reporting Nobel-quality breakthroughs.

Peer reviews are expected to be blunt. This is striking in the scientific community, where mitigated assertions and modalized statements characterize the rest of the genres, particularly research articles. In a peer review, on the other hand, one reads an average of only 1.51 mitigated criticisms but 6.88 blunt criticisms. This could explain why 15% of authors qualify the peer reviews they receive as ‘impolite,’ 73.8% as ‘more or less polite,’ and only 11.2% as ‘very polite’ (Kourilová, 1996: 9, 11).

Editors usually rely on reviewers and their comments. Most editors accept or reject a paper on the basis of peer reviews, although this is not always the case. The BMJ, for instance, receives over 4000 research papers per year. Approximately two-thirds of these submissions are rejected in-house by editors without consulting external peer reviewers (Schroter and Barratt, 2004).

Reviewers have no specific training to carry out their task. They are given instructions only regarding what to assess. Since training programs for reviewing have yet to be implemented, the best solution thus far seems to be to select the best reviewers possible. In this respect, the best quality reviews are generally submitted by younger reviewers (in their 40s rather than in their 60s), who are likely to have a junior status in academia. To be more precise, the ideal reviewer is young, is a resident of the United States, has training in epidemiology or statistics, is a current research investigator, and takes approximately 3 h to write the report (Black et al., 1998). It has also been shown that blinding reviewers as to the identity of authors decreases the rejection rate, and even more so if editors make sure that no reviewer reports on authors they may know personally (Godlee et al., 1998). Additionally, if reviewers are asked to sign their reviews, rejection also decreases. Moreover, reviewers who have published extensively themselves produce better and more reliable reports. A study of reviewers at the BMJ showed that editors were efficient in detecting good reviewers, taking into account the factors described above.

It should be noted that blinding reviewers decreases the rejection rate but does not at the same time improve the quality of a review itself. Blinding seems to be a good technique to reduce bias, but further efforts need to be made to improve the reviewing process. Having reviewers sign their reports, thus making these reports public, also seems to be a good technique for, at least, putting an end to the heretofore hidden status of this genre. There is also an overall tendency to accept these techniques within the medical community at large. In 1998, approximately 50% of reviewers did not wish to sign their reports or to have their reports made public. Nowadays, in the BMJ only 5% of reviewers maintain this position. It is likely that important journals will start to make their reviews known via official websites (Hagan, 2003).


Two issues remain open for further study in relation to peer review in medicine: the bias against non-U.S. researchers and the role of women in the reviewing process of top journals. It has been detected that U.S. reviewers favor U.S. submissions much more than non-U.S. reviewers do (Link, 1998) and that the presence of women on editorial boards of highly ranked medical journals is not proportional to their presence in medical associations nor to the number of papers published by women in those same journals (Dickersin et al., 1998).

See also: Medical Discourse: Developments, 16th and 17th Centuries; Medical Discourse: Doctor–Patient Communication; Medical Discourse: Sociohistorical Construction; Medical Discourse: Structured Abstracts; Medical English: Conferencing; Medical Journals: Letters to the Editor; Medical Writing, Revising and Editing; Psychotherapy and Counselling.

Bibliography

Ad Hoc Working Group for Critical Appraisal of the Medical Literature (1987). ‘A proposal for more informative abstracts of clinical articles.’ Annals of Internal Medicine 106, 598–604.
Bartlett C, Sterne J & Egger M (2002). ‘What is newsworthy? Longitudinal study of the reporting of medical research in two British newspapers.’ British Medical Journal 325, 81–84.
Black N, van Rooyen S, Godlee F, Smith R & Evans S (1998). ‘What makes a good reviewer and a good review for a general medical journal?’ Journal of the American Medical Association 280, 231–233.
Campanario J M (2004). ‘Rejecting Nobel class papers.’ http://www2.uah.es/jmc/nobel.html [accessed November 2, 2004].
Chubin D & Hackett E (1990). Peerless sciences: Peer review and U.S. science policy. Albany, NY: SUNY Press.
Delamothe T (2002). ‘Is that it? How online articles have changed over the past five years.’ British Medical Journal 325, 1475–1478.
de Semir V, Ribas C & Revuelta G (1998). ‘Press releases of science journal articles and subsequent newspaper stories on the same topic.’ Journal of the American Medical Association 280, 294–295.
Dickersin K, Fredman L, Flegal K M, Scott J D & Crawley B (1998). ‘Is there a sex bias in choosing editors? Epidemiology as an example.’ Journal of the American Medical Association 280, 260–264.
Docherty M & Smith R (1999). ‘The case for structuring the discussion of scientific papers.’ British Medical Journal 318, 1224–1225.
Godlee F, Gale C F & Martyn C N (1998). ‘Effect of the quality of peer review of blinding reviewers and asking them to sign their report.’ Journal of the American Medical Association 280, 237–240.
Hagan P (2003). ‘Review queries usefulness of peer review.’ The Scientist. http://www.the-scientist.com [accessed January 28, 2004].
Huth E J (2000). ‘Paper journals: Ready for burial?’ AMWA Journal 15(2), 5–6.
Iles R L (1998). Guidebook to better medical writing. Olathe, KS: Island Press.
Journal of the American Medical Association (1998). 280, Special Edition on Peer Reviews.
Kourilová M (1996). ‘Interactive functions of language in peer reviews of medical papers written by non-native users of English.’ Unesco ALSED-LSP Newsletter 19(1), 4–21.
Lang T (2000). ‘Medical writing: Where it’s been, where it’s going.’ AMWA Journal 15(2), 9–12.
Link A M (1998). ‘US and Non-US submissions: An analysis of reviewer bias.’ Journal of the American Medical Association 280, 246–247.
Maher J C (1992). International medical communication in English. Ann Arbor: University of Michigan Press.
Myers G (1990). ‘The social construction of science and the teaching of English: An example of research.’ In Robinson P C (ed.) Academic writing: Process and product. Modern English Publications and The British Council, ELT Documents 129. 143–150.
Nahim A M (2002). ‘New PubMed filter: Systematic reviews.’ NLM Technical Bulletin (January–February) 324, e7.
Newmark P & Tracz V (1997). ‘The electronic future: What might an online scientific paper look like in five years’ time?’ British Medical Journal 315, 1692–1696.
Nwogu K N (1997). ‘The medical research paper: Structure and functions.’ English for Specific Purposes 16, 119–138.
Peile E (2003). ‘More to be learnt from the discussion than the diagnosis.’ British Medical Journal 326, 1136.
Salager-Meyer F & Alcaraz-Ariza M A (2003). ‘Academic criticism in Spanish medical discourse: A cross-generic approach.’ International Journal of Applied Linguistics 13(1), 96–114.
Schroter S & Barratt H (2004). ‘Editorial decision-making based on abstracts.’ EASE (European Science Editing) 30(1), 8–9.
Skelton J R & Edwards S J L (2000). ‘The function of the discussion section in academic medical writing.’ British Medical Journal 320, 1269–1270.
Sodeck G, Partik B & Domanovits H (2003). ‘A 42-year-old man with acute chest pain: Case progression.’ British Medical Journal 326, 920; 974; 1133–1136.
Swales J M (1990). Genre analysis. English in academic and research settings. Cambridge: Cambridge University Press.


show that the expression of doubt and possibility is central to the negotiation of claims and that what counts as effective persuasion is influenced by the fact that evidence, observations, data, and flashes of insight must be shaped with due regard for the nature of reality and their acceptability to an audience.

See also: Accessibility Theory; Corpora; Corpus Studies: Second Language; Genre and Genre Analysis; Medical Discourse and Academic Genres.

Bibliography

Adams Smith D (1984). ‘Medical discourse: Aspects of author’s comment.’ English for Specific Purposes 3, 25–36.
Fahnestock J (1986). ‘Accommodating science: The rhetorical life of scientific facts.’ Written Communication 3(3), 275–296.
Hyland K (1998a). Hedging in scientific research articles. Amsterdam: Benjamins.
Hyland K (1998b). ‘Boosting, hedging and the negotiation of academic knowledge.’ TEXT 18(3), 349–382.
Hyland K (2000). Disciplinary discourses: Social interactions in academic writing. London: Longman.
Hyland K & Tse P (2005). ‘Hooking the reader: A corpus study of evaluative that in abstracts.’ English for Specific Purposes 24(2), 123–129.
Lakoff G (1972). ‘Hedges: A study in meaning criteria and the logic of fuzzy concepts.’ Chicago Linguistic Society Papers 8, 183–228.
Myers G (1989). ‘The pragmatics of politeness in scientific articles.’ Applied Linguistics 10(1), 1–35.
Salager-Meyer F (1994). ‘Hedges and textual communicative function in medical English written discourse.’ English for Specific Purposes 13(2), 149–170.
Skelton J (1997). ‘The representation of truth in academic medical writing.’ Applied Linguistics 18(2), 121–140.

Medical Discourse, Illness Narratives
L-C Hydén and P H Bülow, Linköping University, Linköping, Sweden
© 2006 Elsevier Ltd. All rights reserved.

Introduction

During the past decades, the health field has become a battlefield where alternative concepts of illness, health, and treatment compete with the dominant traditions of Western scientific medicine. New health care practices such as complementary and alternative medicine have gained status. Etiological factors have come to include ‘lifestyle’ factors, as in the relationship between smoking and lung cancer or the relationship between stress and certain coronary diseases. Patients have access today via the Internet to sources of information that were previously unavailable to them and can approach their physicians with more knowledge and new demands. As part of these changes, researchers have become interested in questions having to do with the relationship between, on the one hand, the patient and his or her illness and, on the other hand, traditional scientific medicine and its concept of diseases. In this context, the study of illness narratives has gained prominent status. Research on the forms and functions of illness narratives has expanded rapidly since the early 1980s. Its development is marked by diversity in the theoretical perspectives and methods that are brought to bear on a variety of problems.

The study of illness narratives is a wide field encompassing interview studies of patients’ narratives of illnesses, studies of the way that narratives are used in the interaction between medical staff and patients, and clinical studies of how narratives could be used by medical professionals in encounters with patients. Another field of study that has emerged is the study of written and published illness narratives; Hawkins (1993) called these pathographies. Theoretically, researchers have been interested in the narrative as an opportunity to study the subjective experience of illness, the way in which identity is reconstructed narratively in the face of illness, and how the institutional context affects the relationship between medical professionals and patients. This is especially interesting in terms of power and the patient’s ability to make his or her voice heard in the medical encounter. Thus, researchers focus on narrative structure and coherence, as well as on the functions of narratives in various social contexts.

Two Voices

A central problem area for many studies of medicine and illness narratives is the relationship between what is called the voice of the lifeworld and the voice of medicine. These concepts were introduced by Elliot Mishler in his book The discourse of medicine (1984). Mishler pointed out that in the discourse of ordinary medical interviews it was possible to discern two different attitudes to the problems that patients brought to the encounter with the medical doctor. What he called “the voice of the lifeworld” is characterized by a biographical contextualization of the events and problems of patients’ lives. Problems and problematic events are taken at face value and related directly to the patient and his or her actions and experiences. In contradistinction to this, “the voice of medicine” is characterized by a scientific and critical attitude; events and problems are removed from the biographical context and related to more general and abstract medical theories of diseases. When these two attitudes were found in the same discourse or conversation, a conflict between the voices could be seen, with the voice of medicine being dominant. This struggle results in a disruption of the flow of discourse. That is, the patient tries to voice his or her view of things, whereas the medical doctor tends to ignore this while holding onto the voice of medicine. This leads to an ‘objectification’ of the patient and ‘to a stripping away of the lifeworld contexts of patient problems.’ As a result, the “form of discourse severely limits, if it does not exclude entirely, the possibility of humane medical practice” (Mishler, 1984: 128). A similar point was raised by Arthur Frank (1995, 1997) when he argued that medicine “hails” ill people, categorizing them as patients with a certain disease and thereby placing them in a story told by medicine using medical reasoning. According to Frank, “many ill people find they cannot live the story, or just the story, that biomedicine tells of their illnesses; the need for a voice of one’s own is a particularity of our times” (1997: 31).
This conflict between different attitudes toward illness-related events and experiences, i.e., the fact that patients and medical professionals view illness, health, and treatment very differently, is found at the heart of much of the research on illness and narrative. Researchers have focused on various aspects of this conflict and in the following we discuss three different strands of this research. First, we focus on the way that ill persons use narratives as a way to understand illness and reconstruct their identity in light of the illness experience, i.e., illness narratives in the lifeworld of patients. Second, we discuss research on narratives in the encounter between the patient and medical professionals. This includes research on the use of narratives as a clinical tool, i.e., narratives in the encounter between the world of medicine and the lifeworld. Finally, we focus on the way in which medical professionals use narratives, i.e., the narratives in the medical world.

Narrative, Lifeworld, and Identity

Illness narratives, that is, stories about illness told by the ill persons themselves (as opposed to narratives about illness told by medical professionals; Hydén, 1997), usually concern life-threatening diseases or chronic illnesses. This kind of illness commonly, perhaps always, influences the ill person’s perception of self. A serious and chronic illness not only means a biographical disruption; it also means that life takes a new turn, often totally unrelated to how life was before illness. According to American sociologist Kathy Charmaz (1983), chronic illness diminishes an individual’s identity and sense of self, and a fundamental consequence of illness is the experience of loss of self. This changed perception of self can primarily be described as a social interactional process that partly originates from the loss of social contacts that is inherent in illness and partly from the way that other people seem to regard ill persons and treat them. Considering narrative as a powerful cultural resource for organizing, making sense of, and communicating experiences such as illness, the study of illness narratives becomes important for understanding phenomena such as identity and transformation of self due to illness. This holds true for oral as well as written stories and for what are sometimes called performed or enacted stories (Frank, 1997; Langellier, 2001). Chronic illness is almost always undesired and probably beyond what a person has imagined his or her life to be. Being unexpected and unwanted, illnesses raise questions such as “why me?” and “why this?” These are questions that medicine is hardly ever able to answer in a clear, straightforward way. In an ill person’s endeavors to understand what is happening, narratives become an important resource for making sense of things. Narratives help to sort out and organize different kinds of experiences into a more comprehensible and coherent whole.
Hence, it becomes possible to use narratives to connect events and experiences that did not seem to be related at the time they occurred. The connection can be used to explain illness. Time is central to chronic illness. For one thing, it is a characteristic of a chronic illness that there is no immediate and certain cure and that such an illness thus can be part of the ill person’s life for many years. Another typical feature of chronic illness is the insidious onset (Bury, 2000), which makes it difficult for medicine as well as for the ill person to know when the illness actually started. The difficulty of fixing the time of the debut has implications for issues of responsibility for illness and suffering. This vagueness simultaneously offers the possibility of temporalizing illness and incorporating illness into the personal life story and thereby handling questions about blame and responsibility and ultimately the sense of self. In a study concerning a medically unexplained illness – a contested illness exemplified by Chronic Fatigue Syndrome (CFS) – it was argued that the interviewees temporalized their illness in different ways. The study also discussed the implications this had for questions about responsibility, blame, and freedom from liability (Bülow and Hydén, 2003). The analysis draws on the shadows of time concept, borrowed from literary historians (Morson, 1994). This concept involves events casting their shadows over the narrator’s present. These shadows of time can come from the back (backshadowing, i.e., what one later thinks should have been foreseen), can come from the side (sideshadowing, i.e., alternative roads and ends), or can appear as a kind of vortex (i.e., when several events separated in time finally converge into illness). The use of different time shadows (e.g., backshadowing) when telling one’s illness narrative means that the ill person might blame him- or herself for not acting in a proper way in order to prevent illness or for not handling it differently. It can also mean the reverse, that the temporalization of illness might give the narrator limited or even total freedom from liability. In the study, this also became obvious in the way that interviewees used backshadowing by placing their illness back in time and simultaneously accounting for why they should not be blamed and in what way they would be seen as responsible persons. This could be achieved by making statements about an early visit to a doctor and describing the feeling of confidence in the results of medical examinations that did not indicate any disease and in the competence of the doctor who told the narrator not to worry.
In so doing, the narrator presents him- or herself as a person who, even at the time he or she is describing, showed great responsibility. This way of telling the story of illness counteracts some of the questions that might be posed about possible actions to be taken. Another way of managing illness narratively by using a medical diagnosis was shown by Lisa Capps and Elinor Ochs (1995) in an analysis of the stories told by a 34-year-old married woman about agoraphobia. The researchers showed how the woman created a master narrative “that holds for all her panic experiences, wherein anxiety is presented as an irrational response to being in an immediate activity setting” (Capps and Ochs, 1995: 80). The connection established by the narrator between the situations she finds herself in and the feeling of panic makes her think that a causal relation exists between the area in which she is not able to move and her anxiety. At the same time, she regards her own avoidance behavior as a token of being agoraphobic. Capps and Ochs demonstrated with a rigorous linguistic analysis that it was possible to interpret the panic attacks as a delayed ‘no’ to accommodate the situation in which panic later appears. The authors argued that the diagnostic construction provided the woman “... with a medical reason for neither accommodating nor even negotiating demands and desires imposed or promoted by others” (Capps and Ochs, 1995: 100). At the same time, she created a coherent story of her life. Illness narratives are also important for the identity of the ill because of the way in which narratives and narrating work in gaining a new identity as a special kind of person. This transformation of identity might happen through learning to tell a new story about oneself, one’s life, and the form of illness, often together with other sufferers. This emphasizes the social act of storytelling that is sometimes lost in narrative analysis. Research on self-help groups and patient associations provides examples of this transformation of identity through storytelling in groups. In her study of Alcoholics Anonymous (AA), American anthropologist Carol Cain (1991) indicated that narrating is probably the most important element for someone becoming a new member of the group. She argued that “identity reconstitution in AA takes place through reinterpretation, of self and of one’s life, and that the major vehicle for this is the AA personal story.” According to Cain, becoming a member of an AA group is about learning to tell a new story from prototypical narratives about life. This learning takes place by participating in collective storytelling.
Within AA, there is a well-defined idea that it is the effects of drinking alcohol that constitute the problem and not the other way around. To become an AA member, one must adopt this view, which means that one must rethink the life one has lived and one’s relationship to alcohol. Through what Cain called identity diffusion, the old identity is weakened. “In acquiring a new identity,” she explained, “individuals must understand the identity, internalize it, and become emotionally attached to it” (1991: 218). This gives rise to stories resembling conversion stories in which the teller reinterprets his or her past in light of a turning-point experience. Cain stated that it is the process of learning to tell his or her story with the appropriate structure that helps an AA member to understand him- or herself as an alcoholic. In this process, newcomers learn from old-timers to tell the story in the right way. Cain’s study indicated that the longer a person has been a member of AA, the more his or her story resembles the prototypical AA story. The major vehicle for the identity reconstitution process is thus the AA personal story. Such stories, which Cain described as a ‘learned genre,’ can be told at meetings in a formalized manner as well as being published in pamphlets and the like. By reading the stories of others and by hearing other alcoholics tell their stories at meetings, newcomers learn both the structure and the content of the prototypical AA personal story and at the same time these stories also provide a newcomer with experiences to compare with his or her own. This shows that learning to tell a new life story is something that people do together in social interactions. Learning to tell a new story through interaction in a group such as AA shows a kind of enculturation – becoming a group member and gaining a new identity through this membership. This process partly differs from the idea of narrative as a means to articulate a voice of one’s own and thereby resist a medical categorization. On the contrary, each member in an AA group adopts the collective identity of a nondrinking, recovering alcoholic by learning to tell the right story in the correct way. Many chronic illnesses have an insidious, vague onset. This not only makes it difficult to describe at what point one’s illness started, but also means that changes in the sense of self can constitute a very long process that is not always possible to follow in a single story. On the contrary, it might more often be the case that people describe changes in their sense of self and identity transformations through a series of stories. American sociologist Susan Bell has studied what she calls “DES daughters,” i.e., women who were prenatally exposed to the drug diethylstilbestrol (DES), which increases their own risks of infertility, miscarriages, and vaginal and cervical cancer.
In her study, Bell (1988) showed how a slow and gradual change in identity is narrated with the help of what she called linked stories. “Through linked stories, people explain how their experiences – and their interpretations of these experiences – have changed over time” (Bell, 1988: 101). By reducing each story to its core narrative, it becomes possible to see how the several narratives, though not told in a direct sequence, are linked to one another through structure and content (Bell, 1988). Together, the multiple narratives describe a change in the attitude the woman has toward her illness. In her first story, she presents a rather distant and disinterested attitude in regard to the illness and the risk that this implies. In the second story, the consequences of being a DES daughter become obvious for the narrator. In a third narrative, she describes a political awakening and commitment that also includes an interest in other women in a similar situation. This gradual change also includes an altered view of medicine and physicians. Drawing on Mishler’s (1984) voice of medicine and voice of the lifeworld, Bell demonstrated how the transformation is linked in the first two narratives to a confidence in the voice of medicine; she receives what she regards as reliable information about the illness from medical sources. In the third story, her trust in medicine has changed into a critique of medicine as the narrator criticizes the way that medicine treats the risks to which these women have been exposed. The narrator also criticizes the medical world for not providing the right information, for not preventing problems, and finally for prescribing the medication that caused the problem in the first place. According to Bell, “becoming political means balancing the voice of medicine with the voice of the lifeworld” (1988: 120). Following Bell, Langellier (2001: 152) studied what she called “interpretive links” among the stories one woman tells about her experiences of having breast cancer. Langellier, however, emphasized “the performative struggle of identity” in these narratives by asserting that the full meaning of stories is enacted rather than semantic and “located in the consequences of narrative as well as its meanings” (Langellier, 2001: 175).

Narratives in the Clinical Encounter

In a study on the use of narratives in the actual encounters between patients and doctors, Clark and Mishler (1992) pointed out that narratives do not constitute a neutral instrument. Instead, there is a struggle in the encounter to determine which narrative about the patient’s illness will be the dominant one – the patient’s everyday narrative about the illness or the biomedical narrative about the disease. In typical medical interviews, physicians either ignore or interrupt patients’ everyday accounts of their problems. The traditional narrative presented by the medical doctor often has the form of a chronicle in which signs and symptoms are organized according to a biomedical plot but removed from the larger context of patients’ lives. Patients try to restore the everyday context of their illnesses, locating symptoms within their daily experiences and observing the impact of these symptoms on how they function personally and socially. In order for the patient to be able to present these contextualized stories, a realignment of the relationship as it is typically enacted in medical interviews is required. The physicians must relinquish or at least moderate their dominance and move toward becoming more attentive and responsive listeners, encouraging patients to tell their stories instead of imposing the medical plot of illness and cure. In several studies, Young focused on the negotiations between various interpretative frames, for instance, between the world of medicine and the lifeworld. The patient visiting a medical doctor must relate in some way to the world of medicine (Young, 1997). Young stated that the patient can adopt at least two different strategies in relating to the medical dominance. The first strategy is – as Mishler pointed out – an attempt to break in against the voice of medicine and make the voice of the patient heard. This can be accomplished through ignoring, misunderstanding, or flouting medicine’s conventions. The second strategy involves inserting a narrative enclave consisting of lifeworld experiences into the realm of medicine (Young, 1997: 33). A case in point is Young’s analysis of the encounter between a medical doctor and an elderly Jewish professor who arrived in the United States via Auschwitz in 1945 (Young, 1989, 1997). By telling stories to the doctor during the physical examination about the bodily injuries he suffered during his imprisonment in Auschwitz, the elderly patient is able to sustain his own sense of identity – despite the fact that he is standing naked before his doctor. This narrative about his experiences in Auschwitz has the form of a narrative enclave inserted into the medical discourse. The narrative enclave is an alternate reality that makes it possible for the patient to reappear as a person in the midst of the medical world. Several authors including Kleinman (1988) have pointed out that the illness narrative ought to have a central position in clinical medicine.
He and several other researchers emphasized the importance of illness narratives as a tool by which doctors can acquire a more detailed clinical picture of the patient (Kleinman, 1988; Brody, 1987; Greenhalgh and Hurwitz, 1998). Narratives can also serve as active devices in therapeutic and clinical work. In her work on occupational therapy, American social anthropologist Cheryl Mattingly (1994, 1998) described how health professionals try to shape a progressive course of treatments and recovery processes into a coherent plot – what she referred to as “therapeutic emplotment.” A basic problem in clinical practice is how to insert therapy in some meaningful way into patients’ lives. Therapy as a clinical practice could be conceived of as a sequence of actions directed at the patient. This may work in certain contexts but in other contexts the clinician must work with the patient – especially with chronically ill patients. This means that the clinician together with the patient must give the therapeutic actions a shared and common meaning. Mattingly pointed out that this can be accomplished by the healers actively striving to shape the therapeutic events into a coherent action organized by a plot. In this sense, telling narratives in the clinical context becomes a form of social action. Placed in a narrative context, physical dysfunctions or the daily tedium of routine rehabilitation may take on new meaning and thus become endurable. This form of therapeutic emplotment may also influence or change the patient’s time horizon for the course of the illness, by establishing a link between the medical interventions and the trajectory of recovery and engendering hope of eventual cure. Rita Charon (2001) argued for the development of a “narrative medicine,” emphasizing the physician’s expertise in absorbing, interpreting, and responding to patients’ stories. The doctor must learn the process of close, attentive listening to the patient in order to hear the patient’s narrative questions, but also to recognize that there are often no clear answers to these questions. Through attentive listening, a relationship is created that allows the physician to arrive at a diagnosis, interpret physical findings, and involve the patient in obtaining effective care.

Narratives, Medical Knowledge, and Experience

It is not only those who are sick who use narratives to understand illness and suffering; narratives are central to the medical profession as well. The case stories published by Sigmund Freud are a well-known example. In the practical, clinical context, narratives and storytelling are also part of the creation and communication of knowledge about illness and suffering in the processes of “constructing cases” (Atkinson, 1995) and telling “doctor’s stories” (Hunter, 1991). It is probably not surprising that narratives play such an essential role in medical practice, since medicine is partly based on descriptions of people’s illnesses and suffering. Medical professionals use narratives to make sense of illness in their work. In discussions between colleagues, stories seem to be crucial to creating patterns and to defining and reaching an agreement on what kind of illness one is dealing with and the best way to describe it. This means that stories are part of medical practice as well as of the creation of medical knowledge. Thus, the study of both illness narratives and narratives about illness (Hydén, 1997) functions as a window into how people create an understanding of their own suffering as well as others’ suffering.

In many ways, medicine is a communicative enterprise, although the stories told by the patients constitute just a small part. British sociologist Paul Atkinson (1995: 90) wrote that “the cynical observer” might speculate that patients’ complaints about their suffering are rather “the pretext for medical talk.” Narratives become part of medical practice through such medical talk. Medical knowledge is created and re-created through a range of communicative activities – oral as well as written – between colleagues. The medical world is composed of, and exists partly through, reports, documents, clinical rounds, and conferences. It is possible to regard all these forms of collegial ‘talk’ from a narrative perspective since stories and storytelling are included in different ways. In his microsociological analysis of medical collegial talk, Atkinson (1995), drawing on what Elliot Mishler (1984) called voices, identified several voices of medicine representing various, and sometimes even competing, modes of medical knowledge. These are, for instance, the voice of science, which articulates knowledge “warranted by an appeal to research such as published scientific papers” (Atkinson, 1995: 131), and the voice of the eyewitness, which represents what a physician or a student physician has seen and done in relation to a certain case. In contrast to the latter, in which “the doctor stands in a personal relation to the knowledge” (1995: 131), Atkinson described the voice of personal experience, which is, at least when expressed in narrative ways, “almost exclusively the prerogative of senior physicians” (1995: 147). Even the nature of seniority, Atkinson argued, is partly established by the right to tell this kind of story. These different voices are used in the presentation of cases and thus become part of the stories told at clinical rounds and conferences.
In accordance with the struggle between the voice of medicine and the voice of the lifeworld that Mishler (1984) emphasized, Atkinson (1995) noted asymmetries between different voices of medicine. Not only are the narratives from the voice of personal experience told almost exclusively by senior physicians, but the voices of students and younger colleagues can be interrupted and fragmented by questions from senior physicians. Similarly, the voice of personal experience commonly includes laboratory findings and thereby also a more scientific voice of medicine. Atkinson’s study makes clear the crucial role of narratives and storytelling in voicing and managing different opinions and domains of credibility and in the learning of medical knowledge. Hunter (1991) studied how physicians use narrative forms to present cases in patient rounds, case conferences, and medical charts as modes of communicating medical knowledge to both colleagues and students. She argued that the narratives of medical doctors tend to focus on unusual or deviant instances of disease signs and processes, thus emphasizing the features of particular cases within the general framework of what is known about different types of disease and their typical forms of appearance. This narrative strategy expresses and supports a pragmatic and instrumental orientation to the treatment of patients.

Summary

Narratives and storytelling serve important purposes in creating knowledge about personal illness for clinical practice, in constructing medical knowledge, and in teaching about illness and suffering. With the help of illness narratives, the individual can explain, legitimize, and manage experiences and actions that are difficult to understand by using, for instance, a medical diagnosis as the frame of explanation. Likewise, narratives are used in collegial talk to construct cases and to deal with opposing opinions as well as various modes of medical knowledge. Through enacted or performed illness narratives, ill persons or their relatives can oppose the story that medicine tells about them by the way that they narrate their actions rather than by their experiences of illness. This means that narratives are crucial to creating a coherent story, whether it is the story of one’s life or the case story of a patient that is at stake. See also: Medical Communication: Professional–Lay; Medical Discourse: Doctor–Patient Communication; Medical Discourse: Sociohistorical Construction; Narrative and Discourse Impairments; Narrativity and Voice.

Bibliography

Atkinson P (1995). Medical talk and medical work: The liturgy of the clinic. London: Sage Publications. Bell S E (1988). ‘Becoming a political woman: The reconstruction and interpretation of experience through stories.’ In Todd A D & Fisher S (eds.) Gender and discourse: The power of talk. Norwood, NJ: Ablex Publishing Corporation. 97–123. Brody H (1987). Stories of sickness. New Haven: Yale University Press. Bülow P H & Hydén L-C (2003). ‘In dialogue with time: Identity and illness in narratives about chronic fatigue.’ Narrative Inquiry 13, 71–97. Bury M (2000). ‘On chronic illness and disability.’ In Bird C E, Conrad P & Fremont A M (eds.) Handbook of medical sociology. Englewood Cliffs, NJ: Prentice Hall. 173–183. Cain C (1991). ‘Personal stories: Identity acquisition and self-understanding in Alcoholics Anonymous.’ Ethos 19, 210–253. Capps L & Ochs E (1995). Constructing panic: The discourse of agoraphobia. London: Harvard University Press. Charmaz K (1983). ‘Loss of self: A fundamental form of suffering in the chronically ill.’ Sociology of Health and Illness 5, 168–195. Charon R (2001). ‘Narrative medicine: A model for empathy, reflection, profession, and trust.’ Journal of the American Medical Association 286, 1897–1902. Clark J A & Mishler E G (1992). ‘Attending patient’s stories: Reframing the clinical task.’ Sociology of Health and Illness 14, 344–371. Frank A W (1995). The wounded storyteller: Body, illness, and ethics. Chicago: The University of Chicago Press. Frank A W (1997). ‘Enacting illness stories. When, what, and why.’ In Nelson H L (ed.) Stories and their limits: Narrative approaches to bioethics. New York: Routledge. 31–49. Greenhalgh T & Hurwitz B (eds.) (1998). Narrative based medicine: Dialogue and discourse in clinical practice. London: BMJ Books. Hawkins A (1984). ‘Two pathographies: A study in illness and literature.’ The Journal of Medicine and Philosophy 9, 231–252. Hunter K M (1991). Doctors’ stories. Princeton, NJ: Princeton University Press. Hydén L-C (1997). ‘Illness and narrative.’ Sociology of Health and Illness 19, 48–69. Kleinman A (1988). The illness narratives: Suffering, healing, and the human condition. New York: Basic Books. Langellier K M (2001). ‘“You’re marked”: Breast cancer, tattoo, and the narrative performance of identity.’ In Brockmeier J & Carbaugh D (eds.) Narrative and identity: Studies in autobiography, self and culture. Amsterdam: John Benjamins. 145–184. Mattingly C (1994). ‘The concept of therapeutic “emplotment.”’ Social Science and Medicine 38, 811–822. Mattingly C (1998). Healing dramas and clinical plots: The narrative structure of experience. New York: Cambridge University Press. Mishler E G (1984). The discourse of medicine: Dialectics of medical interviews. Norwood, NJ: Ablex Publishing Corporation. Morson G S (1994). Narrative and freedom: The shadows of time. New Haven, CT: Yale University Press. Young K (1989). ‘Narrative embodiments: Enclaves of the self in the realm of medicine.’ In Shotter J & Gergen K J (eds.) Texts of identity. London: Sage Publications. 152–165. Young K (1997). Presence in the flesh: The body in medicine. Cambridge: Harvard University Press.

Medical Discourse: Non-Western Cultures
J Clarac de Briceño, University of the Andes, Mérida, Venezuela
© 2006 Elsevier Ltd. All rights reserved.

The term ‘shaman’ has its etymological root in the word ‘saman’ from a Manchurian language, meaning ‘one who knows.’ Originally the term was used in various areas of northern Asia and was later brought to the Americas, to the southeastern part of India, and to Australia and Africa. According to Mircea Eliade, shamanism or the socially specific role of the shaman was brought to the Americas by the first eastward waves of immigration from Asia. The origin of the shaman in America (the continent in which the role has been most studied) is well documented (see Eliade, 1974: 266; Lowie, 1934: 183–188). The role of the shaman is not limited to individual practice. It is currently portrayed in the literature as a category of social activity with a legitimate function as a health care phenomenon. Classifications include legal aspects, health program functions, adaptation in the migration of ethnic groups, and shamanic practice in the integration of tribal peoples and their customs

into the social mainstream. As such, the role of the shaman has become a subject of interdisciplinary study. Elaborations in anthropology and ethnopsychiatry have focused on therapeutic aspects and role models, which include the magical or nonlocal physical aspects of shamanic activity. Contributors include Géza Róheim (1950), Mircea Eliade (1951), Claude Lévi-Strauss (1958), Alfred Métraux (1967), George Devereux (1970), and others who treat aspects of hypnotic suggestion or trance as therapy.

Mircea Eliade: Ancient Techniques for Producing Ecstatic States

Eliade is the great classic authority on shamanism, and is recognized as such by those who wrote after him. His magnum opus (Shamanism and the ancient techniques of ecstasy, 1951) is the first attempt to comprehensively approach the subject from the point of view of a religious historian. His sources include works on what is described as classical shamanism in Siberia, the Americas, Indonesia, and Oceania. Mystical experience is central to his theme, in which he emphatically rejects the proposition of


Medical Discourse: Communication Skills and Terminally Ill Patients
I G Finlay and S Sarangi, Cardiff University, Cardiff, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Recent years have seen a strong drive for the teaching of communication skills at undergraduate and postgraduate levels of medicine in the United Kingdom. Many studies, particularly of doctor–patient interactions, have objectively supported the need for skills training, especially as health professionals’ attitudes have been changing toward patients’ psychosocial needs (Fallowfield et al., 1998; Royal Society of Medicine, 2000). The National Health Service, in seeking to better inform and understand patients’ needs, has sought to develop a paradigm of interaction that assumes of, or imposes upon, doctors and nurses the necessary ‘communication’ skills (SUPPORT, 1995). Problems with communication are recognized as being not only a major source of anxiety for patients and their relatives but also a common source of complaints and litigation; about 90% of the complaints that reach the NHS Ombudsman are attributed to poor communication between the health care professional and the patient and/or his family. Within a culture of blame and litigation, and with many patients having unrealistic expectations of clinical care, the doctor can also fear accusations of culpability, even if the patient’s disease has progressed beyond the bounds of clinical science.

The training and assessment of communication skills in the medical curricula has been consolidated since the General Medical Council produced its document ‘Tomorrow’s doctors,’ which specifically outlines the core requirements of any undergraduate curriculum. At the undergraduate level, the curriculum has become increasingly crowded as medical science advances and medical students need to be familiar with a bewildering amount of rapidly advancing knowledge. Some have claimed that medical training is being compromised as a result (Williams and Lau, 2004) and that communication skills are inherent in the student upon entry to medical school, with very little cost-effective change from teaching.
This recognition of the importance of effective communication has also driven the formal assessment of consultation skills in professional examinations, often using trained actors to simulate patients in an attempt to achieve a uniform assessment standard. The drive to teach communication skills at the postgraduate level has gathered momentum in the

secondary and tertiary care sectors in recent years. General practice training in the United Kingdom has incorporated communication skills training and assessment for over 20 years, yet only half of general practitioners report having had training to prepare them to communicate with the terminally ill (Barclay et al., 2003). The importance of the quality of information giving that is required to ensure that patients can make informed, competent, voluntary decisions free of coercion is now being recognized in law. In the United Kingdom, the forthcoming Mental Capacity Bill focuses on the wishes of the patient, with an assumption of capacity for decision making and a rigorous requirement for objective evidence to support the view that a person is not competent to be involved in making decisions about his care. The importance of effective communication and assessment of true need is nowhere more stark than when the patient requests death to be hastened or refuses consent to treatment. In the terminally ill, the desire to shield patients from the reality of their situation and avoid causing heightened anxiety often leads to poor truth telling and nondisclosure; importantly, this has been shown to usually create even greater difficulties for patients and their relatives, as it denies them the opportunity to reorganize and adapt their lives toward realistic goals (Fallowfield, 2002). Having better understanding of the nature of this important area of communication should not only improve the quality of the patients’ experience and clinical care but also improve the decision-making process, and hopefully might lead to a reduction in complaints (and possibly litigation). An additional benefit of good communication skills training may be a decrease in work-related stress, as staff often report feeling inadequately prepared to deal with patients who are dying and their families.

The Ethos of the Terminally Ill

Terminally ill patients are particularly vulnerable, often having a heightened awareness and sensitivity to those around them. They are vulnerable by dint of facing death, and they are also vulnerable precisely because of their disease (Christakis, 1999; Seale, 2000). Infection, metabolic disturbances, cerebral disturbance by metastases or ischemia, and the effect of medication can make patients confused and less able to cope with new information, thus impairing their competence. Their families are also distressed and keen to minimize the distress experienced by the person they love


– the patient. Clinicians are in a difficult position as well. They may have spent months or years advising and taking a patient through grueling investigations, treatments, and interventions in the hope of achieving a cure or disease control. When these efforts fail, the clinician can feel a sense of personal failure, making it difficult to remain strong and supportive of the patient. Attempts to keep up a strong front are then interpreted as indifference by the patient and/or his family. Distress in terminal illness is multifactorial, with overlap and interaction between the domains of concerns: physical, emotional, social, and spiritual. Problems intermingle, amplifying each other and enhancing the overall distress felt by the patient, to the point that the concept of Total Pain, coined by Cicely Saunders (Saunders, 2000), is experienced by those whose needs are not addressed. The physical assessment of symptoms will depend on taking a history, often in a semistructured way, examining the patient physically, and investigating further as appropriate with blood tests and so on. However, assessment of the concerns, fears, beliefs, and wishes of the patient requires a much more subtle interactional approach that depends on detecting cues, demonstrating empathy by reflecting comments in the language the patient uses, and presenting an overall nonverbal demeanor of active, engaged, and nonjudgmental listening. In the terminally ill, patients’ needs fall into the broad categories of: a decrease in functional status or activity; role change; symptoms, particularly pain; loss of control; financial burden; and conflict between wanting to know what is going on and fearing bad news. There have been studies of information giving in those with a poor prognosis, usually relating to the imparting of bad news (Kutner, 1999). 
These needs are broad in scope, not particularly associated with any specific patient characteristic, and therefore underline the need for careful individual assessment. For patients with a terminal illness, the diagnostic or prognostic news received in an outpatients’ clinic can be devastatingly traumatic; they are suddenly aware of what lies ahead of them, including the unavailability of cure, the uncertainties about how the illness will develop over time, the long-term and short-term side effects that any treatment regime might inflict, the impact of their situation on those they love, and the extent to which family relations might become affected, to name a few issues. Much research about the terminally ill has focused on the perceptions of relatives, their judgments and evaluations of the care given. Remarkably little has concentrated on the needs of children and teenagers as relatives. Children have different adapting styles

closely related to emotional development. Although beyond the scope of this article, the topic is in urgent need of good and sensitive research to guide clinical services. A cancer patient’s deficit in his quality of life has been defined as the relationship, in terms of a gap, between his expectations and the current reality experienced (Calman, 1985). Yet no health care professional can narrow this gap without really understanding a person’s hopes, expectations, and the complex nature of their current distress. Hence, the quality of interactions of the patient with various health professionals will inevitably become a key influence on his quality of life. Many of the communication issues explored in these studies, especially the tensions between the ‘medical’ and the ‘social’ in the delivery of bad news, communicating risk, decision making, and so forth, are important areas for future inquiry in hospice and palliative care patients.

Communication Issues Specific to the Terminally Ill

Although the quality of clinical interaction is seen as important in training professionals, the quality of interaction has not been a topic of inquiry in its own right. From a discourse/communication perspective, there is a large body of doctor–patient communication studies, but these have been mainly limited to general practice. Most health care studies in the sociolinguistic communication tradition have engaged with the general practice setting, with their analyses being doctor-oriented and focusing on the structural organization of the consultation process. Very few studies deal with cancer clinics and care centers or focus on the life history of patients or their cumulative interactional experiences. There is now widespread recognition that “the quality of clinical communication is related to positive health outcomes” (Simpson et al., 1991) and that the same holds true for cancer care: “inadequate communication may cause distress for patients and their families” (Fallowfield and Jenkins, 1999; Fallowfield et al., 1998).

Specific difficulties encountered in discussion with the terminally ill relate to prognosis. Before information is given, the clinician must assess how much the patient already knows, and what he expects will be the likely course of the disease and its accompanying problems. Some patients may have witnessed poor care or distressing death in others. Others may have guilt about issues in their own life, blaming themselves for the illness or viewing it as some type of retribution for past deeds. Family conflicts, varying


expectations from health care, and emotions such as anger, denial, or guilt are powerful influences on communication. Although to the dispassionate pragmatist the situation may feel hopeless, the vast majority of patients can cope better with reality if they have something to hope for. However, such hope must be realistic – it must be the patient’s own wish and arise from his aspirations rather than from the imposed values of others. An example is the patient who knows he is dying but hopes to be alive for his son’s wedding. Knowing that every effort will be made to achieve that goal will instill realistic hope, which is preferable to reiterating the obvious but more pessimistic view that in reality a sudden catastrophic event such as an embolism may occur at any time, and so he should not hold out too much hope!

Quality of Information

Many studies have been concerned with information giving (Davison et al., 2004; Friedrichsen and Strang, 2003; Walsh and Nelson, 2003), attempting to evaluate the patients’ understanding of their situation as a proxy measure of the quality of the communication, rather than assessing the interactional data within a consultation. Such studies fail to recognize the varied competencies of those who are frail and ill, the differing individual contexts of their lives and their diseases, and that emotional factors can act as a powerful block to the patient being able to retain information, even if the clinician is normally skilled at giving information. The studies also fail to assess the information needs of the patient, often omitting assessment of the patients’ perception of the issues they feel rightly concern clinicians and those they feel would be viewed as trivial by their doctor. Patients’ past experiences, beliefs, and background knowledge are powerful influences that must be recognized in order for information giving to be effective and sensitive. Accurate information is crucial to decision making. The powerful influence of the doctor should not be underestimated – hope, concern, and realistic optimism are powerful feelings that suffuse a consultation and are absorbed by patients. By contrast, a pessimistic outlook is rapidly detected and can lead to hopelessness, despair, loss of a sense of dignity, and demoralization of the patient (Chochinov et al., 2002; Kissane et al., 2001).

There are two groups of studies: (1) those dealing with information the patients need/want; and (2) those addressing the quality of interaction between patients/their families and health professionals, seen as information exchange. The studies have used

questionnaire and interview methods and have suggested that patients prefer interview, particularly in the home setting (Montazeri et al., 1996a; Montazeri et al., 1996b). Studies of information needs of cancer patients in general (Christakis and Lamont, 2000; The et al., 2000; Montazeri, 1998; Billings, 2000) confirm that most need clear information about diagnosis and treatment. According to Sell et al. (1993), 92% of patients required a truthful disclosure of diagnosis, but 26% of patients indicated a lack of information about prognosis. A recent large-scale questionnaire study (Jenkins et al., 2001) also showed that 87% of the 2331 patients wanted all possible information concerning their illness and treatment. Gamble (1998) elicited patients’ experiences and reported variations in individual patients’ need for information and their perceptions of ‘‘insufficient information about the side effects of treatment.’’ More information per se is not the solution. Information must be contextualized, and its understanding depends on the prior knowledge base of the recipients, who are usually the patients and their family members. There is some evidence for variable communication with cancer patients and for their dissatisfaction (Royal Society of Medicine, 2000). Quirt et al. (1997) used a combination of interview and questionnaire to elicit both cancer patients’ and professionals’ views about the extent of the cancer, the intent of treatment, and the risks and benefits of treatment. Despite some agreement, the researchers found significant discordance between the understanding of doctors and patients of the extent of disease and intent of treatment. There was an indication that patients’ lack of understanding of their condition could impair treatment decisions. Only a few studies have engaged explicitly with prognosis and quality of life issues. Leydon et al. 
(2000) used interviews to investigate information-seeking behavior in a small study of 17 patients with a variety of cancers. They found that, as part of the coping strategy, patients may choose not to obtain further information about their condition based on ‘faith,’ ‘hope,’ and ‘charity.’ ‘Hope,’ or ‘false optimism about recovery’ may be seen as an overriding principle that guides patients’ reluctance to talk about prognosis and its impact on their quality of life (Christakis and Lamont, 2000). Montazeri et al. (1998), in an interview study of 82 patients with newly diagnosed lung cancer, commented that “proper communication remains limited.” A case study by The and colleagues (2000) has looked at the whole trajectory of illness; this concentrated on patients with small cell lung cancer and the issues of collusion about prognosis and


of false optimism. However, the findings of Sell et al. (1993) and Jenkins et al. (2001) differed; these researchers found that patients and their family members may even interpret the term ‘treatment’ in a curative rather than palliative sense. In a follow-up study, The et al. (2003) explore the reasons underlying ‘false optimism’ and suggest how radiographic images as clinical evidence are given priority over interpretation of subjective bodily sensations as a way of coming to terms with the illness trajectory. This supports the change in clinical awareness that quality of life assessments should prioritize palliation of symptoms, psychosocial interventions, and understanding of patients’ feelings more than duration of survival (Billings, 2000; Girling et al., 1994). Most of these studies overlook the interactional dimensions of what actually happens in a given clinical setting.

Quality of Interaction

For thematic analysis, we will regard Quality of Interaction as constituted in psychosocial/relational and biomedical/procedural dimensions. The longstanding distinction between doctors inhabiting “the voice of medicine” and patients inhabiting “the voice of the lifeworld” may be a useful starting point (Mishler, 1984). With older patients, social and relational issues often assume greater significance. Nurses are more likely to embrace a social/relational style when dealing with patients (Fisher, 1991). The decision on the part of doctors to focus on the biomedical dimension (e.g., treatment options such as chemotherapy and the treatment schedule) at the expense of the psychosocial/relational implications of a diagnosis is very much embedded in Western medical practice, but it is driven by the need to ensure that patients’ consent to treatment is realistically informed by the choices available and the probable outcomes of each course of action.

Barton et al. (2005), in their study of end-of-life discussions in surgical intensive care units in the United States, identify four phases: opening, description of current status, holistic decision making, and logistics of dying. Of particular significance is the second phase, description of current status, where a shift from therapeutic to palliative care has to be accomplished discoursally. This also happens to be the phase where delivery of bad news (Buckman, 1992) has to be strategically managed (not in the diagnostic sense, though). Barton et al. (2005) thus complement Curtis et al. (2002), which is geared toward a content-based analysis of physician contributions in these end-of-life discussions.

Iedema et al. (2004), drawing upon data from Australian multiprofessional team meetings, focus on professional self-presentation and pay attention to the tension between the medical worldview and that of the dying patient. Studies in the UK context include the geriatric clinic (e.g., Coupland and Coupland, 1998), where the framing of family-relational issues becomes significant in the care of the elderly.

Styles of Involvement in Patient-Centered Palliative Care under Examination Conditions

In this section we outline a pilot study under examination conditions (Cardiff University Diploma in Palliative Medicine). We shift attention from labeling of itemized skills such as ‘open questioning’ or ‘active listening’ to an understanding of how consultations are interactionally achieved by using a discourse analytic perspective. In broad terms, discourse analysis focuses on the jointly constructed process of interaction as it pays attention to the multifunctional, context-specific nature of language use. Styles of involvement at the communicative level are manifest at various levels of linguistic choices and interactional trajectories. At the linguistic level, degrees of involvement can be signaled through selection of pronouns, preference for active or passive agents, choice of modalization and lexicalization, and so forth. At the interactional level, features such as overlapping talk, ellipsis, other-initiated turn-completions, and perspective display sequences are indicative of conversational involvement.

Our analytic focus will be on how doctor–candidates interactionally orient to their actor–patients, with particular reference to history taking and care/treatment management. We offer a distinction between ‘recycling of patients’ words’ and ‘reformulation of patients’ words’ and suggest that these styles elicit different accounts of patients’ experiences and expectations. The data consist of transcripts of audio-recorded sessions of simulated consultations within the Cardiff Diploma in Palliative Medicine. The analysis is carried out in two phases using the discourse analytic methodology of interactional mapping and thematic staging. In conclusion, we discuss the implications of such microanalysis for validating assessment and for raising professional awareness.
We adopt a discourse/communication perspective, focusing on information exchange structures, use of rhetorical strategies, management of interactional role-relationships, and so on (Sarangi and Roberts, 1999). For current purposes, we draw upon what may be called theme-oriented discourse analysis (Roberts and Sarangi, 2005) based on interactional


mapping (Roberts and Sarangi, 2002). Of particular interest in the palliative care context is the notion of topic, i.e., how topics are navigated following a schema in patterned ways (Chafe, 2001).

Thematic Maps in Palliative Care Consultation

Theme-oriented discourse analysis draws upon a number of concepts from linguistics and social sciences. The following list, suggested by Roberts and Sarangi (2005), can be applied to the palliative care setting.

• Interactive framing: Framing is a filtering process whereby issues of concern to each communicant are either brought into their ‘frame’ or are discarded from the focus of the interaction (Goffman, 1974; Goffman, 1981). The priorities of each party will determine the frame and its content – for example, the doctor may be focused on physical symptoms, so that descriptors of pain and other physical changes will remain in the frame, but expressions of concern for the distress that this illness is causing to the patient’s child may be discarded. By contrast, the main concern of the patient may be the child, with the awareness of physical symptoms acting as a reminder and marker of impending death and the herald to bereavement for the child. Within their shared frames there is little true overlap. When the problems are listed together, understood by both, and shared, the quality of the interaction is perceived to have improved (Roberts and Sarangi, 2003).

• Contextualization cues and inferences: These are the linguistic and prosodic features that invoke the context of each utterance to give it meaning. They channel the inferences in a particular direction by calling up the frames and affecting the footing of each moment of interaction (Gumperz, 1982; Gumperz, 1992). These features include intonation, stress, pausing, and rhythm, which tend to be used unconsciously and usually are unnoticed in their role of establishing or reinforcing social relationships. For example, the clinician encourages disclosure of a previously unsatisfactory consultation by use of ‘mmm’ in a nonjudgmental tone, and then follows with ‘that seems to have angered you.’

• Face and face-work: In an attempt to ‘save face’ in an interaction, the person perceived as weaker or more vulnerable – the patient – will use politeness as a strategy to establish closeness and engage the empathy of the more powerful person – the doctor (Brown and Levinson, 1987). Thus, the patient will be hesitant to disagree with the doctor, even when in fact the doctor has failed to grasp the patient’s priorities in the situation. This is starkly seen when the doctor uses closed or rhetorical questions, such as ‘is your pain better?’ or ‘your pain seems better, doesn’t it?’, to which the patient agrees before hesitantly adding a qualifier such as ‘yes – a bit – but it still gets bad as the day goes on.’

• Social identity: Gender, social status, regional and ethnic background, and many other influences affect a person’s social identity. These preexisting factors are brought to the consultation, but the way one person behaves to another will influence the social identity – the performed social identity (Erickson and Schultz, 1982) – of each during the course of the consultation. Thus, the patient who finds he comes from the same village as the doctor will feel closer and assume greater empathy than the patient who is racially different from the physician and who speaks with a very different accent. In both cases, such assumptions are reinforced or dispelled by the nature of the interaction itself, thereby increasing or decreasing the patient’s sense of comfort with and confidence in the doctor. An important factor is the time that each is allowed by the other to speak – as a general rule of thumb, the doctor does well to spend about 80% of the consultation time listening and only 20% speaking, thereby affording the patient a sense of respect and encouraging the patient to disclose anxieties.

• Rhetorical devices: Rhetoric is the language used to influence or persuade, usually in a political context. However, in the consultation, such devices can be parallel syntax, repetition of lexical items, metaphor, analogy, reported speech, and lists. For example, the patient may describe various symptoms, including pain that is ‘terribly sharp out the blue, makes me bend double . . . gets me down,’ to which the doctor may reply ‘sharp . . . like a knife’ or may summarize with a list of the symptoms just reported and then check back to ensure that nothing has been omitted. This makes the patient aware that the doctor has understood the patient’s agenda of problems to be attended to.

Analysis of the consultations showed that those deemed effective by validated scoring, when used for assessment, also had specific features in discourse analysis. High scoring (effective communicator) consultations demonstrated that the doctor verbally dominated for a much shorter time (about 20% of the time); used lexical reflective statements early to encourage disclosure of problems; summarized issues demonstrating coinciding frames from the patient’s concerns and the items afforded attention by the doctor; and used contextualization cues that encouraged the patient to disclose concerns.
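As a rough illustration of the ‘about 20% of the time’ observation, a speaker’s share of talk can be approximated from a numbered transcript by counting words per speaker. The sketch below is only a proxy and is not the validated scoring or the analytic method used in the study: word counts stand in for speaking time, the function names (word_counts, word_share) and the patterns for stripping transcriber comments and pauses are hypothetical, and the sample turns are adapted from the end of Extract A.

```python
import re
from collections import Counter

# Hypothetical patterns for this sketch (not part of the study's method).
TURN = re.compile(r"^\s*\d+\s+([A-Z]):\s*(.*)$")       # e.g. "34 P: not really"
COMMENT = re.compile(r"\(\(.*?\)\)")                   # e.g. ((crying voice))
PAUSE = re.compile(r"\((?:\d+(?:\.\d+)?|\.)\)")        # e.g. (.) or (1.0)
WORD = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")      # words, incl. "don't"

def word_counts(transcript: str) -> Counter:
    """Count spoken words per speaker, ignoring comments and timed pauses."""
    counts = Counter()
    for line in transcript.splitlines():
        match = TURN.match(line)
        if not match:
            continue  # skip headers and blank lines
        speaker, talk = match.groups()
        talk = PAUSE.sub(" ", COMMENT.sub(" ", talk))
        counts[speaker] += len(WORD.findall(talk))
    return counts

def word_share(transcript: str) -> dict:
    """Return each speaker's fraction of all counted words."""
    counts = word_counts(transcript)
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()} if total else {}

SAMPLE = """\
33 D: have you got anybody to talk to
34 P: ((crying voice)) not really
35 D: any support?
36 P: ((crying voice)) not really no well my family are supportive they are I just don't like
37 D: you don't like
38 P: talking to talk to them it's not their fault it's more it's more me not wanting to say anything
39 D: mhm
"""

if __name__ == "__main__":
    for speaker, share in sorted(word_share(SAMPLE).items()):
        print(f"{speaker}: {share:.2f}")  # D: 0.27, P: 0.73
```

Word share treats every word as equally long and ignores silence and overlap, so it can only suggest, not establish, how the floor was divided; timed recordings would be needed for a true measure of speaking time.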


Extract A [D = female doctor; P = female patient]

1 D: [. . .] your oncologist thought it would be good to come and see me and I just wondered what you why you were coming to see me
2 P: uhm (.) well basically it’s because my hip’s still giving me pain I’m getting problems with my breathlessness it seems it seems to be almost getting worse and uh walking is becoming difficult and oh I mean it’s just (.) I don’t know where to start really but .hh I ju – I need stronger painkillers the tylex isn’t doing particularly much you know taking away the pain (it’s almost making me worse) but then I get constipation and nothing else even gets rid of that (^^^^^^) even though I tried prune juice and you know the normal kind of remedy
3 D: can you tell me who started you on tylex
4 P: my GP I think yea
5 D: I see
6 P: yea it does take some of the pain away but I’m not taking them I (take probably) eight a day but it’s sti:ll it’s still becoming quite a strain to almost sort of get through the day and I don’t know what’s happening and there seems to be so much going on and I really had – when I went to Birmingham I was like so full of hope and everything else and then suddenly it’s like no and it’s just all come as a bit of a shock as well to me
7 D: so thinking about what you’ve just said you said it’s all come as a bit of shock you wanna just tell me a bit more about that
8 P: ((crying voice)) well I thought I was gonna get treated I thought I was gonna to be okay and now just from what I can gather I’m just not going to be okay and I don’t know how long I’ve got and what I’m doing or where I’m-I mean where I’m even going you know I just ((crying voice)) I even had hopes of going to university so ((crying voice)) (^^^^^) be realistic so we even uhm ((crying)) my my family and I were (^^^^^^) and I mean they know that I do have this pain and I know and dampen it down for them and I think they know that so they think it’s probably even worse than it is =
9 D: mhm
10 P: = and (.) oh it’s just ((crying voice)) I don’t even like talking about it with them because they just get worried and they (can do nothing either) then there’s no one to talk to and I don’t know what’s going on and it’s just becoming really really (.) ((crying)) just stressful
11 D: (^^^^^^^^^^^)
12 P: ((^^^crying voice renders words inaudible^^^)) and now my painkiller is not even doing very much for me so now I’m at a point where I’m rapidly going downhill
13 D: (^^^^^^^^^^^)
14 P: yes so it’s mainly about about the painkillers really [uhm if I can have any stronger ones
15 D: [is it about the breathing as well
16 P: yea I am quite breathless just in everyday life uhm walking becomes more and more difficult uhm because of the breathlessness and I I find it very difficult to exercise you know and I’ve got the pain in my hip as well so to be honest
17 D: hm
18 P: I never exercise
19 D: okay (1.0) (^^) when (.) when did this when did they say there was no treatment
20 P: ((crying voice)) it was about a month ago a month and a half ago when I went up to Birmingham
21 D: (and then you went)
22 P: ((angry voice)) well I had two days of tests up there and then they suddenly went well no actually we don’t – no we can’t do anything for you because you’ve got problems with your lung (2.0) and up until that point I really thought ((emotional voice)) I really thought that I was going to be okay
23 D: so it was a complete shock (2.0)
24 P: yea
25 D: what about your family
26 P: uhm (1.0) oh uhm (2.0) they know they know that I am not going to get treated they don’t know how long I’ve got and neither do I ((emotional)) you know there’s something inside me you know it’s just so: frightening to think it’s taking over my my life and my body and that its it’s gonna end [it’s just it’s just so
27 D: [it’s hard isn’t it to be (^^^^^^) isn’t it (1.0)
28 P: ((crying voice)) I don’t even know when I’ve got no idea and (^^^^^^^^^^) and uhm yeah my family do know you know they love me and they are very supportive and helpful but they are just so worried that I don’t wanna worry them more
29 D: mhm
30 P: and you know they’ve problems of their own to deal with not major ones but you know they’ve gotta go through life and it’s just
31 D: is there (^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^)
32 P: no I think it’s I think it’s my fault I know it’s a big strain on (my) marriage ((cries)) (^^^^^^^^^^^^^^^^^^^^^^) (and for me)
33 D: have you got anybody to talk to
34 P: ((crying voice)) not really
35 D: any support?
36 P: ((crying voice)) not really no well my family are supportive they are I just don’t like
37 D: you don’t like
38 P: talking to talk to them it’s not their fault it’s more it’s more me not wanting to say anything
39 D: mhm

Key analytic points: D seems to be following the flow of topic as P wishes to develop it; P is given adequate interactional space to cover her major concerns, ranging from coping with painkillers to frustration with the medical personnel to family

support. At crucial points D summarizes the key issues (turn 7, turn 15, turn 23). Most of the doctor’s questions are derived from what the patient has said (e.g., turn 19, turn 25, turn 27, turn 33). The minimal responses throughout help the patient to reveal all her concerns, unhindered. In a sense, this flexible approach allows the doctor to get a sense of P’s priorities and major concerns before responding in any definitive sense. Let us consider a second extract from the same setting, but with rather different thematic/interactional development.

Extract B [D = female doctor; P = male patient]

1 D: I’m Dr (first name last name) and I work with the (symptom) control team that work here at the hospital and Dr Bloggs has dropped me a short note just to ask me to (^^^) see you and give us an opportunity to look at how you are at present
2 P: yeah
3 D: what would really help me if you could give me a little more background about what’s been happening how you’ve come to where you are now
4 P: yea (^^^^^^^) yea about a year ago I uhm I was diagnosed with a cancer in the hip and uhm I get a lot of pain from that
5 D: hm
6 P: and uhm I was I had chemotherapy for six weeks a:nd I was all geared up to go for surgery (^^^^^^^^^^) a:::nd uhm I went up to Birmingham for that =
7 D: mhm
8 P: = and (.) and then on the day of the operation they went and canceled it =
9 D: right
10 P: = and uh said that I wasn’t that the problem was it was probably my lungs
11 D: it must have been a big shock for you to have got all geared up for your surgery
12 P: yes that’s right ˚it was˚
13 D: and what problem were they explaining you had with your lungs?
14 P: well uhm they were sort of saying that the cancer had spread into the lung uhm (.) that was it
15 D: you’d been expecting that to be a possibility? were you warned that that might happen?
16 P: well no
17 D: hmm okay how long ago was that that you should have had your surgery? that was actually very [recent was it?
18 P: [that was about a month ago
19 D: ri:ght (0.5) okay how have things been in the month since then
20 P: uhm well not too brilliant uhm (0.3) a lot of pain uhm (1.0) and I just—I don’t know what to (have) (0.5) what if (^^^^^) from that (0.5)
21 D: will you just (.) just hold on to that and let me find out just a little bit more about you that would just help me =
22 P: yea
23 D: = to to understand and how old are you now?
24 P: twenty
25 D: right and whereabouts are you living (^^^^)?
26 P: well I I live at home in XXX
27 D: right
28 P: uhm with my parents I am a student in uh in YYY uhm but I haven’t been there for the last (0.3) eight months or so
29 D: hm the last year has been very much taken up with being ill (I would think
30 P: yea
31 D: yes do you have any brothers and sisters?
32 P: yea I’ve got one sister
33 D: and she is at home as well is she? (^^^^^^^^^^^^) and how are your family been finding situation (.) have they been supportive to you or (^^^^^^^^^^^)
34 P: well they’re worried sick
35 D: hm
36 P: uhm (1.0) my mother is uhm (.) supportive (^^^^) taken it in (^^^^^^)
37 D: right
38 P: and my father is (.) he does not really talk about it much (^^^^^^^^^^)
39 D: mhmm that is his way (^^^^^)
40 P: yeah I think so he doesn’t really talk about it much
[Ten turns omitted]
51 D: yeah (2.0) and before you had the problem with your hip did you (^^^^^^^^^^^) or (^^^^^^^)
52 P: it was a shock
[5 min]
53 D: yes (2.0) oh (.) that’s helpful for me to see if you have an idea of (.) to find out more about you (.) is there anything else you think would be helpful for me to know just to (^^^^^^^) bigger picture (^^^^^^^)
54 P: I just wonder (what to do next)
55 D: yea right
56 P: it’s (^^^^^^^^)
57 D: yea (1.0) so shall we go back to (.) the pain which seems to be the main thing that Dr Bloggs asked me to see you about
58 P: yes
59 D: can you just give me a little bit more idea what’s been happening with that
60 P: yes uhm well it’s been getting worse really I’m taking all these painkillers =

Analytic points: D’s style of questioning borders on hyper-questioning (see turns 11–33), which does not allow P to nominate or elaborate on what concerns him most. D’s questions are partly unrelated to what P has just revealed. Some of the questions preempt the patient’s experience (e.g., turn 29), expecting the patient to confirm or disconfirm. This style of questioning is very different from what we saw in Extract A. Overall, one gets the sense that here D is pursuing her own history-taking agenda, as evidenced in turn 21 (“will you just hold on to that and let me find out just a little bit more about you”). Notice that D returns to this topic in turn 57 (“shall we go back to the pain”). In both extracts, patients’ circumstances have been framed in the wider family context, but there are significant interactional differences across the two styles of topic involvement.

We have chosen to present these two extracts because they show contrasting styles of interactional involvement. Space does not permit us to offer a detailed discourse analysis of all aspects of the interaction, so we limit ourselves to a theme-oriented approach as suggested earlier. Having decided on the theme of involvement, we can notice that extracts A and B have different involvement patterns. Whereas in A the patient is allowed generous interactional space to narrate her concerns, extract B is characterized by a sequence of question–answer turns mainly around the doctor’s agenda. Although both consultations explore the family support system, they do so very differently and give this theme differential priority in the interaction. Communication about the illness within the family setting receives due attention.

Quality of Life

Both the extracts above deal with quality of life issues, but they do so differently in interactional terms. Quality of interaction has been given a separate emphasis from quality of life, which is affected by the quality of interaction. Quality of life is seen as functions in specific domains, with the aim of minimizing distress through attention to symptoms, psychosocial, and existential concerns. Much of the patient disclosure relates to informing the physician of the factors that are impairing quality of life, and these factors form the basis of the areas to be addressed in the clinical management plan. The notion of ‘quality of life’ embraces almost everything, ranging from physical/functional and social/economic to psychological/spiritual dimensions. A useful operational definition of quality of life is ‘the relation between expectation and experience’ (Calman, 1984; Carr et al., 2001). We assume that both life expectations and lived experiences are seen at the level of interaction. It is then possible to look for quality of life issues at the interactional plane, while allowing for individual variations. In the clinical encounter, different views of both expectations and experiences are bound to surface. We will be looking for the interplay of experience and expectation (i.e., quality of life) in terms of how past, present, and future scenarios are talked about. One research task will be to see the extent to which health professionals orientate their communication to the relation between the expectations and experiences of individual patients and how patients respond to such critical moments. A distinction between quality-of-life outcome measures in clinical research and the management processes and communication issues that address quality of life in daily clinical practice is overdue. Given our focus on clinical practice, the following dimensions of communication need to be highlighted: identifying and prioritizing problems; facilitating communication; screening for hidden problems; facilitating shared clinical decision making; and monitoring changes or responses to treatment. We would suggest that a detailed study of quality of interaction can contribute to our understanding of quality of life measures. This is particularly relevant in the context of cancer patients, for whom quality of interaction with health professionals and family members proves critical to all aspects of personal and clinical decision making during their treatment and end-of-life care.

Concluding Remarks

In the introduction we challenged the possible contributions that language and communication research can make to health care practice. One option is what we have displayed here – collaborative interdisciplinarity (Sarangi, 2002). The other option is longstanding ethnographic involvement in the research site so as to understand the very setting and the professional roles and identities (Candlin, 2003; Clarke, 2003; Cicourel, 2003). It does seem true that communication skills cannot be taught or researched generically by nonclinical experts alone, as they do not have a comprehensive understanding of the complexities of medical conditions and the various aims of a consultation. Although they may have specialized knowledge of language and communication, they may have little knowledge and understanding of factual clinical content, which is necessary for their analysis to be useful to practitioners. On the other hand, communication skills are usually being taught in a prescriptive and formulaic way, with no recognition of language and communication as a discipline (Sarangi, 2004). Although the quality of clinical interaction is seen as important in training professionals, the quality of interaction is not a topic of inquiry in its own right – but there is a pressing need to combine linguistic theory with clinical practice in the context of care giving.

Bibliography

Barclay S, Wyatt P, Shore S, Finlay I, Grande G & Todd C (2003). ‘Caring for the dying: how well prepared are general practitioners? A questionnaire study in Wales.’ Palliative Medicine 17(1), 27–39. Barton E, Aldridge M, Trimble T & Vidovic J (2005). ‘Structure and variation in end-of-life discussions in the surgical intensive care unit.’ Communication and Medicine 2, 1. Billings J A (2000). ‘Recent advances: palliative care.’ British Medical Journal 321, 555–558. Buckman R (1992). How to break bad news: a guide for health care professionals. Baltimore, MD: Johns Hopkins University Press. Buckman R (1998). ‘Communication in palliative care: a practical guide.’ In Doyle D (ed.) Oxford textbook of palliative medicine. New York: Oxford University Press. 141–158. Calman K C (1984). ‘Quality of life in cancer patients – an hypothesis.’ Journal of Medical Ethics 10(3), 124–127. Candlin S (2003). ‘Issues arising when the professional workplace is the site of applied linguistic research.’ Applied Linguistics 24(3), 386–394. Carr A J (2001). ‘Is quality of life determined by expectations or experience?’ British Medical Journal 322, 1240–1242. Chafe W (2001). ‘The analysis of discourse flow.’ In Schiffrin D, Tannen D & Hamilton H (eds.) The handbook of discourse analysis. Malden, MA: Blackwell. 673–687. Chochinov H M, Hack T, Hassard T, Kristjanson L J, McClement S & Harlos M (2002). ‘Dignity in the terminally ill: a cross-sectional, cohort study.’ Lancet 360(9350), 2026–2030. Christakis N (1999). Death foretold: prophesy and prognosis in medical care. Chicago: University of Chicago Press. Christakis N A & Lamont E B (2000). ‘Extent and determinants of error in doctors’ prognoses in terminally ill patients: prospective cohort study.’ British Medical Journal 320, 469–473. Cicourel A V (2003). ‘On contextualising applied linguistic research in the workplace.’ Applied Linguistics 24(3), 360–373. Clarke A (2003).
‘On being an object of research: reflections from a professional perspective.’ Applied Linguistics 24(3), 374–385. Curtis J R, Engelberg R, Wenrich M, Nielsen E, Shannon S, Treece P, Tonelli M, Patrick D, Robins L, McGrath B & Rubenfeld G (2002). ‘Studying communication about end-of-life care during the ICU family conference: development of a framework.’ Journal of Critical Care 17, 147–160. Davison B J, Parker P A & Goldenberg S L (2004). ‘Patients’ preferences for communicating a prostate cancer diagnosis and participating in medical decision-making.’ British Journal of Urology Int. 93(1), 47–51. Erickson F & Schultze J J (1982). Counsellor as Gatekeeper: Social Interaction in Interview. Academic Press. Fallowfield L & Jenkins V (1999). ‘Effective communication skills are the key to good cancer care.’ European Journal of Cancer 35(11), 1592–1597. Fallowfield L, Lipkin M & Hall A (1998). ‘Teaching senior oncologists communication skills: results from phase I of a comprehensive longitudinal program in the

United Kingdom.’ Journal of Clinical Oncology 16(5), 1961–1968. Fallowfield L J, Jenkins V A & Beveridge H A (2002). ‘Truth may hurt but deceit hurts more: communication in palliative care.’ Palliative Medicine 16(4), 297–303. Friedrichsen M J & Strang P M (2003). ‘Doctors’ strategies when breaking bad news to terminally ill patients.’ Journal of Palliative Medicine 6(4), 565–574. Gamble K (1998). ‘Communication and information: the experience of radiotherapy patients.’ European Journal of Cancer Care 7(3), 153–161. Girling D J, Hopwood P & Ahmedzai A (1994). ‘Assessing quality of life in palliative oncology.’ Progress in Palliative Care 2, 80–86. Goffman E (1974). Frame analysis. Cambridge: Harvard University Press. Goffman E (1981). Forms of talk. Philadelphia: University of Pennsylvania Press. Iedema R, Sorensen R, Braithwaite J & Turnbull E (2004). ‘Speaking about dying in the intensive care unit, and its implications for multi-disciplinary end-of-life care.’ Communication and Medicine 1(1), 85–96. Jenkins V, Fallowfield L & Saul J (2001). ‘Information needs of patients with cancer: results from a large study in UK cancer centres.’ British Journal of Cancer 84(1), 48–51. Kissane D W, Clarke D M & Street A F (2001). ‘Demoralization syndrome – a relevant psychiatric diagnosis for palliative care.’ Journal of Palliative Care, 17(1), 12–21. Kutner J S, Steiner J F, Corbett K K, Jahnigen D W & Barton P L (1999). ‘Information needs in terminal illness.’ Social Science and Medicine 48(10), 1341–1352. Leydon G, Boulton M, Moynihan C, Jones A, Mossman J, Boudioni M & McPherson K (2000). ‘Cancer patients’ information needs and information seeking behaviour: in depth interview study.’ British Medical Journal 320, 909–913. Meredith C, Symonds P, Webster L, Lamont D, Pyper E, Gillis C R et al. (1996). ‘Information needs of cancer patients in West Scotland: a cross-sectional survey of patient’s views.’ British Medical Journal 313, 724–726. Mishler E G (1984). 
The discourse of medicine: dialectics of medical interviews. Norwood, NJ: Ablex. Montazeri A, Gillis C R & McEwen J (1998). ‘Quality of life in patients with lung cancer: a review of literature from 1970 to 1995.’ Chest 113(2), 467–481. Montazeri A, Milroy R, Gillis C R & McEwen J (1996a). ‘Interviewing cancer patients in a research setting: the role of effective communication.’ Supportive Care in Cancer 4(6), 447–454. Montazeri A, Milroy R, Macbeth F R, McEwen J & Gillis C R (1996b). ‘Understanding patients: let’s talk about it. A study of cancer communication.’ Supportive Care in Cancer 4(2), 97–101. Quirt C F, Mackillop W J, Ginsburg A D, Sheldon L, Brundage M & Dixon P (1997). ‘Do doctors know when their patients don’t? A survey of doctor–patient communication in lung cancer.’ Lung Cancer 18(1), 1–20.

Roberts C & Sarangi S (2002). ‘Mapping and assessing medical students’ interactional involvement styles with patients.’ In Spellman-Miller K & Thompson P (eds.) Unity and diversity in language use. London: Continuum. 99–117. Roberts C & Sarangi S (2005). ‘Theme-oriented discourse analysis.’ Medical Education (in press). Royal Society of Medicine (2000). Bulletin on the Effectiveness of Health Service Interventions for Decision Makers, 6(6). The Royal Society of Medicine Press. Sarangi S (2002). ‘Discourse practitioners as a community of interprofessional practice: some insights from health communication research.’ In Candlin C N (ed.) Research and practice in professional discourse. Hong Kong: City University of Hong Kong Press. 95–135. Sarangi S (2004). ‘Towards a communicative mentality in medical and healthcare practice.’ Communication and Medicine 1(1), 1–11. Saunders C (2000). ‘The evolution of palliative care.’ Patient Education and Counselling 41(1), 7–13. Seale C (2000). ‘Changing patterns of death and dying.’ Social Science and Medicine 51, 917–930. Sell L, Devlin B, Bourke S J, Munro N C, Corris P A & Gibson G J (1993). ‘Communicating the diagnosis of lung cancer.’ Respiratory Medicine 87(1), 61–63.

Simpson M, Buckman R, Stewart M, Maguire P, Lipkin M, Novack D & Till J (1991). ‘Doctor–patient communication.’ British Medical Journal 303, 1385–1387. The A-M, Hak T, Koeter G & van der Wal G (2000). ‘Collusion in doctor–patient communication about imminent death: an ethnographic study.’ British Medical Journal 321, 1376–1381. The A-M, Hak T, Koeter G & van der Wal G (2003). ‘Radiographic images and the emergence of optimism about recovery in patients with small cell lung cancer: an ethnographic study.’ Lung Cancer 41, 113–120. The SUPPORT Principal Investigators (1995). ‘A controlled trial to improve care of seriously ill hospitalized patients: the study to understand prognoses and preferences for outcomes and risks of treatments (SUPPORT).’ Journal of American Medical Association 274, 1591–1598. Walsh D & Nelson K A (2003). ‘Communication of a cancer diagnosis: patients’ perceptions of when they were first told they had cancer.’ American Journal of Hospice and Palliative Care 20(1), 52–56. Williams G & Lau A (2004). ‘Reform of undergraduate medical teaching in the United Kingdom: a triumph of evangelism over common sense.’ British Medical Journal 329, 92–94.

Medical Discourse: Developments, 16th and 17th Centuries M Gotti, Università di Bergamo, Bergamo, Italy © 2006 Elsevier Ltd. All rights reserved.

Introduction

The increasing need to use the English language for the expression of specialized texts caused a heated debate in 16th and 17th century England, as the adoption of other languages (Latin, in particular) was considered to be no longer sufficient for this purpose. The two centuries taken into consideration here are very important for the development of English medical discourse, as the years 1500–1700 mark a remarkable period of increasing use of the vernacular for medical and scientific writing. Indeed, at the beginning of this period, Latin still had a dominant role. Moreover, an analysis of the combination of Latin and English in late-medieval English medical manuscripts reveals code mixing as a widely exploited discourse strategy, one that implies a century of bilingual readership (Voigts, 1996). At the end of this period, English prevailed, and the process of vernacularization can be described as largely complete by 1700, when we can find a full range of sophisticated university treatises on medicine in English in which Latin plays little or no role. Indeed, of the 238 medical books published in the years 1640–1660, 207 were in English (Webster, 1975: 267).

Epistemological and Linguistic Changes

Great epistemological and methodological developments took place in that period, in both medicine and surgery; old scholastic thinking began to be replaced by new patterns of thought and new methodologies based on observation and interpretation of physical phenomena. These developments determined the need for corresponding changes both in the ways of communicating the new discoveries attained by means of innovative procedures and apparatus and in the expressive tool to be used to describe and argue about the new phenomena observed and analyzed. The evolution of the methods adopted in the study of medicine and the development of new surgical procedures indicated a change not only in the approach to the interpretation of the issues analyzed, but also in the way in which phenomena ought to be described and opinions expressed. Criticism was made both of how language was employed in the various processes of medical research and, in particular, of the suitability of the tool itself for an accurate, precise expression of the concepts

reported. Those specialists who intended to use the native language in the expression of medical phenomena often pointed out its deficiencies and inaccuracies. For example, Borde often inserted Latin quotations in his work “for [he] nor no man els [could] not in theyr maternall tonge expresse the whole termes of phisicke” (Borde, 1552: 38v). Similarly, Christopher Langton complained of the limited number of vernacular anatomical terms:

I can do no lesse then count the negligence of our Phisitions to be the cause of [the lack of English anatomical terms]: for yf they had written of theyr arte in theyr mother tunge, as they do in other places, why shulde we lacke englysh names more then we lacke eyther Latyn names or Greke names? and yet to saye the truthe, it is better for vs English men to haue English names, then eyther Latyn or Greke. (Langton, c. 1550: Diiir)

Another criticism often made by specialists was the polysemy characterizing many words, which often made texts ambiguous. The remedy suggested consisted of the coining of new terms providing a stricter delimitation of meaning. Indeed, the coining of native terms – particularly frequent in case of translation – is often explained with the motivation of greater clarity; this is the motivation provided by Andrew Borde for his translation of “obscure wordes and names into Englyshe”:

For as muche as olde, auncyent, and autentyke auctours or doctours of Phisicke in their bokes doth wryte many obscure terms, geuinge also to many and dyuerse infyrmities, darke termes, dyffycyle to vnderstande, some and mooste of all beyng Greeke wordes, some and fewe beynge Araby wordes, some beyng Latyn wordes, and some beyng Barbarus wordes. Therfore I haue translated all soche obscure wordes and names into Englyshe, that euery man openlye and apartly maye vnderstande them. (Borde, 1552: Avr)

Coining New Terms

The field in which the English language proved to be particularly inadequate was that of ‘names of art,’ that is, of the technical terms that made up the basic lexis of a subject. This lack of specialized terms often made the translation of works into English an arduous task or made it difficult for English specialists to write essays in their native tongue. In the following quotation, for example, Robert Recorde points out the difficulties encountered in choosing to write in his native tongue:

But now as touchyng myne entent in writyng this treatise in the english. Though this cause might seme sufficient to satisfy many men that I am an englysh man, and therefore may more easely and plainly write in my natyue tonge, rather then in any other: yet vnto them that know the hardness of the mater, this answer shuld seme vnlykely: considering that it is more harder to translate into such a tonge, wherein the arte hath not ben written before, then to write in those tongues that are accustomed, and (as I might say) acquainted with the termes of the science. (Recorde, 1547; quoted in Jones, 1953: 73)

The realization that the English language was inadequate for medical purposes led to its gradual amelioration, from both a quantitative and a qualitative point of view. Specialists made great efforts to increase the number of specialized terms and improve the exactness of their meanings. Two main principles were followed in coining new terms: that of using the resources of the native tongue, either to give a specialized meaning to an existing word or to form a new one, and that of borrowing a similar term from a foreign language. In defining a new concept, the specialist sometimes employed a word already existing in the language, adding a specialized meaning to its usual one(s). However, the most frequently adopted strategy was the borrowing of terms from other languages, particularly from Latin. Indeed, Latin and Greek were endowed with large and respected specialized lexicons, as too was – although less extensively – Arabic (Standard Arabic). The choice of a loan, rather than the specialization of an existing word or the coinage of a new term, was often suggested by the fact that the concept to be referred to was already expressed in a foreign language. The availability of a term was particularly evident in the case of translation of texts. In that case, when the translator came across a word with no equivalent in the tongue into which he was translating, he was obliged to use the original word, thus enriching the lexical load of the receiving language. The remarkable increase in the number of new medical terms is in line with the great expansion of the Early Modern English lexicon as a whole (Finkenstaedt et al., 1970), with intense borrowing and neological formations in all fields. The medical sciences seem to reach their peak in the first half of the 17th century, as shown by the numbers of new words provided by Wermser (1976); see Table 1. 
However, as they are mainly based on the entries reported in the Oxford English dictionary (OED), these statistical data should be treated with caution. Indeed, due to the criteria adopted by the compilers, the data used in compiling this dictionary are incomplete; moreover, many words may have been in use before the dates indicated in the OED, either in texts that have not survived into our times or in publications not taken into consideration by the compilers (for a discussion of this issue cf. Schäfer (1980)).

Table 1 Number of new medical terms

                     1510–1524   1560–1574   1610–1624   1660–1674
Medicine, anatomy    12          46          141         84

This increase in medical terminology was favored both by the large number of translations from other languages and by original works in English. At the end of many of these works, the authors added a glossary in which they defined some of the new words that they had coined or adopted. This is, for example, the case of the translation by Richard Tomlinson of Renodaeus’ Dispensatory (1657) to which the publisher adds “A Physical Dictionary. Or, an Interpretation of such crabbed Words and Terms of art, as are derived from the Greek or Latin,” in which words such as abstersive, buccellation, caliginous, cardiogmos, circumdated and commaculate are explained. These glossaries – appended to books and explaining ‘hard words’ excerpted from the texts they accompanied – are the forerunners of the hard word dictionaries (cf. below).

Lexical Borrowing

The most frequent policy adopted to increase the English medical lexicon was to take over Greek or Latin terms, sometimes without any change (e.g., delirium, diabetes, embolus, metacarpion, parenchyma, paresis, pus, uvea, virus). In the first use of these terms in a text written in English, the authors often added an explanation in vernacular. This is the practice commonly followed by Andrew Borde:

Cancer is the latyn word. In englyshe it is named a canker the whiche is a sore the whiche doth corode & eat the flesshe corruptynge ye Arters the vaynes & the sinewes corodyng or eatynge the bone and doth putryfy & corrupt it. And then it is seldome made whole. (Borde, 1552: 28r)

Colera is ye latyn worde. In greke it is named Cholae. In englysh it is named Coler, the whiche is one of the .iiii. humours. And it is hote and drye, lyenge or beyng in the stomake and is mouable. There be .v. kyndes of coler. [. . .] (Borde, 1552: 38r)

In general, however, in adopting the loan, the translator usually adapted the word he was borrowing to the morphological features of the receiving language, following the conventions in use concerning word formation. At times, the original termination was dropped (see examples in Table 2), but generally the original suffix was replaced by one commonly used in English. Thus, the suffixes -s and -y were commonly employed to naturalize foreign nouns referring to scientific branches, -able and -ible were used to render the Latin adjectival terminations -abilis and -ibilis, and -al and -ous were used to form adjectives from foreign nouns, with the suffix -ous also being used to adapt adjectives ending in Latin in -us. Latin words ending in -atio, -itio, and -utio were provided with the suffixes -ation, -ition, and -ution, the terminations -ic(k) and -ity were usually employed as adaptations of the French suffixes -ique and -ité, and -ate rendered the Latin endings -are and -atus; the Latin suffixes -arius and -orius were usually naturalized in the forms -ary and -ory, whereas -ence and -ency were meant as equivalents of the Latin ending -entia.

Table 2 Original medical terms and adopted terms created by changes to suffix

                     Original term           Adopted term
Suffix dropping      ligamentum              ligament
                     empiricus               empiric
Suffix substitution  anatomia/anatomie       anatomy
                     zootomia                zootomy
                     assimilabilis           assimilable
                     visibilis               visible
                     deligatio               deligation
                     embrocatio              embrocation
                     imminutio               imminution
                     remollitio              remollition
                     narcotique              narcotyke
                     insalubrité             insalubrity
                     conflare/conflatus      conflate
                     evaporare/evaporatus    evaporate
                     lymphatus               lymphate
                     ulcerare/ulceratus      ulcerate
                     antidotarius            antidotary
                     incisorius              incisory
                     eminentia               eminence
                     indecentia              indecency
Suffix addition      choler                  cholerous
                     pinea                   pineal

This process of borrowing involved the adoption not only of single words but also of prefixes and suffixes, which were used more and more often to create new terms. For example, if we consider combining forms of classical origin such as -ology and -meter, we can see many new English words created in the 17th century including these suffixes, e.g., pathology, ichthyology, zoology, osteology, psychology, micrometer, thermometer. Native and classical elements usually mixed in texts, as can be seen in Copeland’s (1541) definition of ulna: “The arme is devided in thre great partyes. One is called vlna, the other lytel arme.” Indeed, the existing native terms commonly referred to general parts (e.g., ankle, arm, knee, shoulder, elbow, hip) and the new terminology (mainly derived from classical sources) defined the component elements of these
general parts (e.g., ulna, radius, humerus as components of arm). Some native terms experienced a semantic transition from one referent to another, scientifically more appropriate: womb, for instance, passed from generalized ‘stomach’ to exclusively feminine ‘uterus.’ In some cases, two forms (general and medical) were neo-Latin; in these cases, the former was already present in the language, having been borrowed during the Middle English period, and the latter appeared in the language in the Early Modern English period; some examples of these doublets are gender / genus, prove / probe, spice / species, palsy / paralysis. Moreover, once a loan had been introduced, it was frequently used as a root from which further words could be formed by means of affixation. Examples include parenchyma > parenchymatic, anatomy > anatomical > anatomically. Some neologisms were obtained from existing words (either native or loan words themselves) by means of compounding: e.g., lung-sick (‘sick of a pulmonary complaint’), pine-glandule (‘pineal gland’). Although he thinks he is right in borrowing terms from other languages, the writer is aware of the difficulties of interpretation that readers might meet when they encounter the new terms. Therefore, he often tries to help in the form of a synonym when they first appear in the text; typical are cases where the author gives two expressions, a native and a foreign term, elucidating each other (‘stones or testicles’). On other occasions, instead of providing a concise synonym or antonym of the new term, the author gives a full paraphrase of it or an exemplification: Androtomy, as some of the moderns call the dissection of man’s body, to distinguish it from zootomy, as they name the dissection of the bodies of other animals. (Boyle, 1772: I. 68)
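The suffix substitutions described in this section (for instance Latin -abilis to -able, -atio to -ation, -entia to -ence) behave like an ordered list of rewrite rules, and can be illustrated with a minimal Python sketch. The function name `naturalize` and the rule list `SUFFIX_RULES` are hypothetical, introduced here only for illustration; the rules are a simplification of the correspondences listed above and do not model suffix dropping, suffix addition, or period spellings.

```python
# Illustrative only: a simplified rule list based on the suffix
# correspondences described in the article. Longer, more specific
# endings come first so that "-entia" is tried before "-ia".
SUFFIX_RULES = [
    ("abilis", "able"),
    ("ibilis", "ible"),
    ("entia", "ence"),   # the article also gives -ency as an outcome
    ("arius", "ary"),
    ("orius", "ory"),
    ("atio", "ation"),
    ("itio", "ition"),
    ("utio", "ution"),
    ("atus", "ate"),
    ("ique", "ic"),      # French -ique; period spellings also used -ick
    ("ité", "ity"),      # French -ité
    ("are", "ate"),
    ("ia", "y"),
]


def naturalize(term: str) -> str:
    """Return an anglicized form of a Latin or French term.

    The first matching ending in SUFFIX_RULES is replaced; terms with
    no matching ending are returned unchanged, like the unadapted
    loans delirium or virus.
    """
    for foreign, english in SUFFIX_RULES:
        if term.endswith(foreign):
            return term[: -len(foreign)] + english
    return term


if __name__ == "__main__":
    for word in ["visibilis", "deligatio", "antidotarius", "anatomia", "delirium"]:
        print(word, "->", naturalize(word))
```

The sketch deliberately picks a single outcome per ending, so it does not reproduce every form in Table 2: it yields indecence rather than indecency for indecentia, and narcotic rather than the period spelling narcotyke for narcotique. Historical usage was less regular than any such rule list suggests.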

‘Transparent’ Terms

Apart from the processes of word formation seen above, the complex operation of creating new terminology sometimes adopted other criteria, based mainly on the pragmatic principle of maximum transparency, which is extremely important in specialized discourse (Gotti, 2003). In the application of this principle, the writer created terms in such a way that their form clearly reflected the concept to which they referred. An example can be taken from Golding (1587), in which the author, in need of new terms, invented transparent forms such as fleshstrings for muscles and primetime to indicate an early period of world history. These terms may seem so strange as to suggest a desire for idiosyncrasy on the part of the author. Conformity to the criterion of transparency is instead confirmed by Ralph Lever:

Therfore (gentle reader) if thou doubt, what is ment, by any of our strange and new deuised termes, consider their partes, as they are taken by themselues alone: and the consideration of the partes, shall leade thee to the knowledge of the whole. (Lever The Arte of Reason, Rightly Termed, Witcraft, quoted in Jones, 1953: 129)

As we can see, these transparent terms were usually obtained by means of the juxtaposition of words already existing in the English language. This process of compounding was particularly favored by the brevity of English words; in fact, most were monosyllables and could therefore be easily linked to form compounds, which in turn were not too long. However, the recourse to this word-formation process based on native English lexemes was not very frequent, as the decoding of Latinate terms was not a real problem for the medical profession. Moreover, general use would make these new terms common and familiar to the profession as a whole: And althoughe perchaunce at the fyrst it may seeme somwhat obscure and harde [. . .] yet yf you doe accustomablye vse to reade them, and conferre either wyth the Apothecarie where as you doe not perfectly vnderstande the same, or elles vse the helpe of a Dictionarie, they wyll bee vnto you bothe familier and playne. (Gale, 1563: Aaaiiir)

Glossaries and Dictionaries

Medical terms were already included in dictionaries at an early stage. They constitute a relevant part of the lexemes contained in the 1538 edition of Thomas Elyot’s Latin–English dictionary, probably facilitated by the fact that the compiler had studied medicine. Also the reviser of this dictionary, Thomas Cooper, was trained in medicine and dealt with these terms carefully not only in his revision of Elyot’s dictionary published in 1548, but also in his own very successful Latin–English dictionary. The standards of these bilingual dictionaries were high and the treatment of medical terms in them was accurate. The first monolingual dictionaries, instead, presented only a limited and unsystematic selection of medical terms among those that their compilers considered hard words, i.e., semantically obscure and difficult to understand. Indeed, the rapid development in the various scientific branches had caused a dramatic expansion of the English vocabulary and the coining of thousands of new words, which, however, were not always interpreted correctly by readers and were often re-employed wrongly. Therefore, many hard word dictionaries were compiled to present the various Latinate expressions that had appeared so abundantly in English specialized texts
of those centuries, as well as the more obscure words appearing in literary works. The tradition of hard word dictionaries is generally considered to date back to the beginning of the 17th century. However, even before then, glossaries had been used for the explanation of hard words. In less than a century, these glossaries increased in number and size and gradually developed into separate publications exhibiting great comprehensiveness. It is not at all strange, therefore, that Robert Cawdrey’s Table Alphabeticall (1604), which is often quoted as the first hard word dictionary, should be published during that period, as his list of hard words could be considered an expansion of those glossaries. Two other very popular hard word dictionaries are John Bullokar’s An English Expositor (1616) and Henry Cockeram’s English Dictionarie (1623). (For a survey of monolingual printed glossaries and dictionaries of this period cf. Schäfer (1989) and McConchie (1997).) Some of the medical terms included by Cawdrey are as follows: adustion, confection, debility, entrails, epilepsis, fever, frenzy, fume, genitals, imbecility, judicial, lethargy, obstruction, ministration, ossicle, pest, sperm, tertian, vehement, ventricle. Also, the following terms belong to the medical field, although the examination of Cawdrey’s definitions does not always show a consideration of their medical senses: antidote, catarrh, decoction, gargarize, flux, instinct, nerve, putrefy, spleen, ulcerate, vapour. Some of these terms might not sound very innovative for those times, but the fact that they are included in this text means that – although some had been coined even a few centuries before – they had not become very popular among English speakers. These words, however, should only be considered as indicative of the popularity of the terms at that time.
Indeed, it should be remembered that the words included in hard word dictionaries were not collected meticulously after a thorough examination of all the specialized texts available; instead, in the selection of his entries, the author mainly borrowed the terms that he thought suitable from the dictionaries and specialized glossaries he had come across in his unsystematic search for appropriate sources for his dictionary. The picture that can be drawn from a reading of these works is that the medical terminology dealt with in them was involved in a process of rapid growth yet still needed some form of systematization. This impression is backed up by the existence of doublets within the dictionaries to express identical concepts. Indeed, there are often frequent cases in which two different words refer to the same human organ, e.g., kidnies or reines. It must be pointed out, however, that the picture offered by hard word dictionaries is more dated than the actual state of

medical research at that time, as the author often made use of glossaries and dictionaries published a few decades before. Indeed, if we look at a few medical terms coined in the decades preceding the publication of the hard word dictionaries, we can see that many of them are not included in those works. Table 3 lists a few terms belonging to the branch of medicine, with the date of their first publication as derived from the OED and the Supplement to the Oxford English dictionary (OEDS); each word is followed by either a plus or a minus sign: the plus sign indicates the presence of that entry in the dictionary mentioned at the top of each column and the minus sign indicates its absence. Table 3 shows the unsystematic approach to the selection of items adopted by the compilers of hard word dictionaries.

Table 3 The (non-)occurrence of some medical terms in the first three hard word dictionaries

                     Cawdrey (1604)   Bullokar (1616)   Cockeram (1626)
Spleen (c. 1300)     +                +                 +
Palate (1382)        −                +                 −
Genitals (1382)      +                −                 +
Sperm (1386)         +                +                 +
Artery (1398)        −                +                 +
Colon (1398)         −                −                 −
Trachaea (c. 1400)   −                −                 −
Uvula (c. 1400)      −                +                 −
Testicle (c. 1425)   −                −                 −
Embryo (1477)        −                +                 +
Fracture (1525)      −                +                 +
Nerve (1531)         +                +                 +
Muscle (1533)        −                +                 −
Abdomen (1541)       −                −                 −
Cartilage (1541)     −                −                 −
Cavity (1541)        −                +                 −
Rectum (1541)        −                −                 −
Tendon (1541)        −                −                 −
Ulna (1541)          −                −                 −
Cranium (1543)       −                −                 −
Vulva (1548–1577)    −                −                 −
Larynx (1578)        −                −                 −
Pancreas (1578)      −                −                 −
Scapula (1578)       −                −                 −
Skeleton (1578)      −                −                 −
Intestine (1597)     −                −                 −
Scrotum (1597)       −                −                 −

A comparison of the distribution of the items in Table 3 shows that Bullokar’s interest in medical ‘termes of art’ was more prominent (Bullokar himself was a physician), as is shown also by the lengthier explanations that accompany the various items:

Splene, milt. (Cawdrey, 1604)

Splene. The milte of man or beaste: which is like a long narrow tongue, lying vnder the shorte ribbes on the left side, and hath this office of nature, to purge the liuer of superfluous melancholicke blood: sometime it signifieth anger or choler. (Bullokar, 1616)

Spleene, The milt of man or beast. (Cockeram, 1626)

In this greater length, we may perceive a tendency toward a transition from mere verbal gloss to an encyclopedic article, which was gradually taking place in several hard word dictionaries, often making the distinction between a monolingual dictionary and an encyclopedia rather unclear. Another interesting feature of Bullokar’s dictionary is his inclusion of several terms referring to medicinal plants from the new world, which he had derived from John Frampton’s Ioyfull Newes out of the Newe Founde Worlde (1577), a translation of the original book in Spanish by Nicolás Monardes (1574). Such terms are not only reported in Bullokar’s dictionary, but they are also given substantial definitions, as shown by this quotation:

Sassafras. A tree of great vertue, which growth in the Florida of the West Indies: the rinde hereof hath a sweet smell like Cinnamome. It comforteth the lyuer, and stomack, and openneth obstructions of the inward parts, being hotte and dry in the second degree. The best of the Tree is the roote, next the boughes, then the body, but the principall goodnesse of all resteth in the ryndes.

The reading of the definitions of some of the entries in Bullokar’s dictionary shows that there was still a gap between the specialized terminology of other languages – particularly Latin – and the English tongue; the latter still lacked some of the termes of art that instead already existed in foreign codes. The following quotation provides a confirmation of this state of inferiority; in dealing with a medical term, Bullokar was not able to make exclusive use of English words to express his definition and had to borrow two Latin expressions to refer to the two types of meninges he was describing: Meninges. Thinne skins in which the braine is contained. There are two such skinnes: one called by Phisitians, Dura mater, which is the stronger of the two, and next vnto the scull. The other named Pia mater, is within this first, being more tender and fine, and close wrapping the braine it selfe. If any of these skinnes bee wounded, it causeth speedy death.

Syntactic and Textual Developments of Medical English

The 16th and 17th centuries also showed interesting developments in the syntactic features of medical English. Unlike the lexical field, changes took place in a nonexplicit way, as specialists were usually unaware of the syntactic modifications that they were
introducing into the language by means of their writings. The new perspective that characterized their research method conflicted with the constraints of the language and forced them to adapt its rules to their expressive needs. Sentences became quite long, with very lengthy noun phrases preceding and following the verb. Moreover, the verb (often represented by forms of to be, often accompanied by present participles) started assuming the copular function usual in modern scientific English, where the verb merely links the very long nominal phrases coming before and after it. Each sentence was structurally simple, with few or no subordinate clauses, complying with the modern preference for coordination rather than subordination in sentence structure. Moreover, there was a large number of words referring to processes and a distinct preference for the use of nouns deriving from verbs. This linguistic preference, which is commonly referred to by the term ‘grammatical metaphor’ within Systemic Functional Linguistics (cf. Halliday, 1994), has become one of the salient features of scientific discourse and can be explained by the fact that verb-derived nouns reflect the parallel process whereby results are inferred from experiments and objects from their construction process (cf. Halliday and Martin, 1993). An example of this preference can be seen in the use of processes such as inspection and dissection in the following definition drawn from the manuscripts of William Harvey’s lectures delivered at the Royal College of Physicians in London around 1616: Anatomy is that faculty which through inspection and dissection reveals the uses and actions of the parts. (Harvey, c. 1616: 22)

Indeed, the adoption of nominalized forms was becoming more and more popular as, by allowing the thematization of the actions commonly expressed by verbs and increasing the textual potentialities of those lexemes, it enabled the scientist to include more information in the same sentence and guaranteed a better flow of his discourse. Another advantage offered by the process of nominalization consists of allowing the writer to create concise noun phrases, which can be made to perform the various different syntactical functions required in specialized texts. The increase in the use of nominalization was part of a gradual tendency toward a loss of importance of the verb, compensated by a growth in importance of the noun. This change was mainly for textual reasons, as the process of nominalization promoted a better cohesion of the text by means of the recovery of the information given in the previous statement as the theme of the following sentence.

Also from the textual point of view, there is a great development in this period; the early printed works mainly consist of translations of treatises and new compositions of general guides to health and handbooks of medical instructions, often including accounts of illustrative and typical cases (Bennett, 1969). The normal mode of publication for scientific work was in a separate treatise and men of science kept abreast of the times through private manuscript correspondence. The 17th century shows a greater variety of genres, with the addition of new forms such as anatomical observations, book reviews, journal articles (Lefanu, 1937), and experimental essays (Gotti, 2000). The first English medical periodical was the Medicina Curiosa, of which the only two numbers extant appeared on June 17, 1684 and October 23, 1684, as part of a wide phenomenon taking place all over Europe (Garrison, 1934). After the formation of scientific societies (Rome, 1603; Paris, 1635; London, 1660; etc.), the serial publication of their Acta developed (cf. the issues of the Philosophical Transactions of the Royal Society). The experimental essay originated in the Early Modern English period as a result of a complex process of scientific evolution, which determined the need for a new expository genre to suit the new epistemic approach of 17th century ‘natural philosophers.’ The innovative characteristics of this new text type derived from the great importance attributed to the experimental process in the research programs of Early Modern English men of science, who – elaborating on Francis Bacon’s intuitions – shared the principle that the progress of knowledge could not be based on the servile observance of traditional theory, but should rely on the observation of natural phenomena and accurate experimental activity. 
Hence, there was a need for a form of expression that would offer the scientist the opportunity to report briefly experiments carried out, procedures followed, results obtained, and any personal comments. This genre was identified in the experimental essay, which became very popular also in the medical field. To appreciate the importance of the latter genre, we should consider the fact that, in the following centuries, whereas some forms of specialized writings – such as the dialogue – almost disappeared, the experimental essay survived and became an essential part of specialized literature, as the rapid diffusion of scientific journals made it an established genre commonly used by men of science. Although most of the structural parts of this textual genre originate from Early Modern English experimental essays, early essays showed more frequent use of the active form, the report of unsuccessful experiments, and less

emphasis on theoretical conclusions than nowadays. Moreover, they were characterized by first-person narration, subjective point of view, and expressions of low modality (Taavitsainen, 2001).

See also: Anatomical Nomenclature: History; Lexicography: Overview; Metaphors in English, French, and Spanish Medical Written Discourse; Nominalization; Systemic Theory.

Bibliography

Bennett H S (1969). English books and readers 1475–1557. Cambridge: Cambridge University Press.
Borde A (1552). The breuiary of helthe. London.
Boyle R (1772). The works (6 vols). Birch T (ed.) London: J. & F. Rivington. Reprinted (1965). Hildesheim: Georg Olms Verlag.
Bullokar J (1616). An English expositor. London: J. Legatt. Reprinted (1967). Menston: The Scolar Press. Facsimile (1971). Hildesheim: Georg Olms Verlag.
Burchfield R W (ed.) (1989). The Oxford English dictionary (2nd edn.). Oxford: Clarendon Press.
Cawdrey R (1604). A table alphabeticall. London. Reprinted (1966). Menston: The Scolar Press.
Cockeram H (1623). The English dictionarie or an interpreter of hard English words (1st edn.). London: N. Butter.
Cockeram H (1626). The English dictionarie or an interpreter of hard English words (2nd edn.). London. Reprinted (1970). Hildesheim: Georg Olms Verlag.
Copeland R (1541). The questyonary of chirurgens. London.
Finkenstaedt T, Leisi E & Wolff D (1970). A chronological English dictionary. Heidelberg: Winter.
Gale T (1563). Certaine workes of chirurgerie. London.
Garrison F H (1934). ‘The medical and scientific periodicals of the 17th and 18th centuries.’ Bulletin of the Institute of the History of Medicine 2(5), 285–343.
Golding A (1587). A woorke concerning the trewnesse of the Christian religion. London.
Gotti M (2000). ‘The experimental essay in Early Modern English.’ European Journal of English Studies 4(2), 145–163.
Gotti M (2003). Specialized discourse: Linguistic features and changing conventions. Bern: Peter Lang.
Halliday M A K (1994). An introduction to functional grammar (2nd edn.). London: Arnold.
Halliday M A K & Martin J R (1993). Writing science: Literacy and discursive power. London: Falmer Press.
Harvey W (c. 1616). Lectures on the whole of anatomy. O’Malley C D, Poynter F N L & Russell K F (1961) (eds.). Berkeley/Los Angeles: University of California Press.
Jones R F (1953). The triumph of the English language. Oxford: Oxford University Press.
Langton C (c. 1550). An introduction into phisycke, with an vniuersal diet. London.
Lefanu W R (1937). ‘British periodicals of medicine: A chronological list.’ Bulletin of the Institute of the History of Medicine 5(8), 735–761; 5(9), 827–855.
McConchie R W (1997). Lexicography and physicke: The record of sixteenth-century English medical terminology. Oxford: Clarendon.
Murray J A H, Bradley H, Craigie W A & Onions C T (eds.) (1933). The Oxford English dictionary (1st edn.). Oxford: Oxford University Press.
OEDS (1980). In Burchfield R W (ed.) A supplement to the Oxford English Dictionary. Oxford: Oxford University Press.
Recorde R (1547). The vrinal of phisick. London.
Schäfer J (1980). Documentation in the O.E.D.: Shakespeare and Nashe as test cases. Oxford: Clarendon.
Schäfer J (1989). Early Modern English lexicography. Oxford: Clarendon.
Taavitsainen I (2001). ‘Evidentiality and scientific thought styles: English medical writings in Late Middle English and Early Modern English.’ In Gotti M & Dossena M (eds.) Modality in specialized texts. Bern: Peter Lang. 21–52.
Voigts L E (1996). ‘What’s the word? Bilingualism in late-medieval England.’ Speculum 71, 813–826.
Webster C (1975). The great instauration: Science, medicine and reform 1626–1660. London: Duckworth.
Wermser R (1976). Statistische Studien zur Entwicklung des englischen Wortschatzes. Bern: Francke.

Medical Discourse: Doctor–Patient Communication
R Wodak, Lancaster University, Lancaster, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A number of new developments in the research on doctor–patient communication are discussed in this paper. First, I provide a brief overview of the most important theoretical and methodological paradigms in this field. Then, I provide some examples, based inter alia on a qualitative empirical study in an outpatients’ clinic (Lalouschek et al., 1990), as well as on innovative and recent research on the expression of pain (Vodopiutz et al., 2002). As doctor–patient communication takes place in different institutional settings, such as hospitals or private clinics, I also consider some features of the rules and norms governing institutional discourses (Wodak, 1996; Iedema, 2003; Sarangi and Roberts, 1999). Moreover, it is of particular importance to evaluate new work that utilizes virtual communication, in other words, the Internet. The impact of this kind of research on the medical profession and possible application of research results will also be of interest.

Doctor–Patient Communication

Communication in institutions is a major area of investigation in sociolinguistics and discourse analysis. In the domain of ‘critical linguistics’ (Wodak and Meyer, 2001), communicative interactions in relevant social fields need to be made transparent. Institutions incorporate such areas and manifest social structures, power allocations, hierarchies, and access to positions and information, and they define societies in
their subdivisions of work. Institutions have their own value systems, which crystallize in ideologies unique to particular institutions (Mumby, 1988). It is important to make a distinction between claims and expectations explicitly uttered in accordance with a professional ideology (such as the doctor is there for all patients in equal terms) and implicit, latent norms and rules (such as publications and empirical studies are more important for a clinical career than healing patients). The contradictions in practice that arise from such norms are often concealed and legitimized with myths, in the sense of Roland Barthes (1974). ‘Myth’ is understood to mean a secondary semiotic system that constitutes, as it were, a second layer of reality that all participants (are obliged to) believe. At the same time, there is an important applied aspect to institutional research that is relevant to practice: analyses of the status quo have made it possible to develop proposals for changes in communication, for example, in training programs for members of institutions or for a different approach on the part of clients (Azoulay et al., 2000; Kurtz et al., 1998). Sociolinguistic analysis has therefore proved to be an essential part of the investigation of institutions, since a purely sociological analysis is not capable of making transparent the details of dynamics and processes and is therefore obliged to adopt a high level of abstraction (Weick, 1985). The above-mentioned studies, carried out by medical researchers, are mostly quantitative and use content analysis or measurements of information acquired through questionnaires and interviews; these tools, however, mostly allow for self-assessment and are often oriented toward the investigators who might

Medical Discourse: Doctor–Patient Communication 681 Lefanu W R (1937). ‘British periodicals of medicine: A chronological list.’ Bulletin of the Institute of the History of Medicine 5(8), 735–761; 5(9), 827–855. McConchie R W (1997). Lexicography and physicke: The record of sixteenth-century English medical terminology. Oxford: Clarendon. Murray J A H, Bradley H, Craigie W A & Onions C T (eds.) (1933). The Oxford English dictionary (1st edn.). Oxford: Oxford University Press. OEDS (1980). In Burchfield R W (ed.) A supplement to the Oxford English Dictionary. Oxford: Oxford University Press. Recorde R (1547). The vrinal of phisick. London. Scha¨fer J (1980). Documentation in the O.E.D.: Shakespeare and Nashe as test cases. Oxford: Clarendon.

Scha¨fer J (1989). Early Modern English lexicography. Oxford: Clarendon. Taavitsainen I (2001). ‘Evidentiality and scientific thought styles: English medical writings in Late Middle English and Early Modern English.’ In Gotti M & Dossena M (eds.) Modality in specialized texts. Bern: Peter Lang. 21–52. Voigts L E (1996). ‘What’s the word? Bilingualism in latemedieval England.’ Speculum 71, 813–826. Webster C (1975). The great instauration: Science, medicine and reform 1626–1660. London: Duckworth. Wermser R (1976). Statistische Studien zur Entwicklung des englischen Wortschatzes. Bern: Francke.

Medical Discourse: Doctor–Patient Communication R Wodak, Lancaster University, Lancaster, UK ! 2006 Elsevier Ltd. All rights reserved.

Introduction A number of new developments in the research on doctor–patient communication are discussed in this paper. First, I provide a brief overview of the most important theoretical and methodological paradigms in this field. Then, I provide some examples, based inter alia on a qualitative empirical study in an outpatients’ clinic (Lalouschek et al., 1990), as well as on innovative and recent research on the expression of pain (Vodopiutz et al., 2002). As doctor–patient communication takes place in different institutional settings, such as hospitals or private clinics, I also consider some features of the rules and norms governing institutional discourses (Wodak, 1996; Iedema, 2003; Sarangi and Roberts, 1999). Moreover, it is of particular importance to evaluate new work that utilizes virtual communication, in other words, the Internet. The impact of this kind of research on the medical profession and possible application of research results will also be of interest.

Doctor–Patient Communication Communication in institutions is a major area of investigation in sociolinguistics and discourse analysis. In the domain of ‘critical linguistics’ (Wodak and Meyer, 2001), communicative interactions in relevant social fields need to be made transparent. Institutions incorporate such areas and manifest social structures, power allocations, hierarchies, and access to positions and information, and they define societies in their subdivisions of work. Institutions have their own value systems, which crystallize in ideologies unique to particular institutions (Mumby, 1988). It is important to make a distinction between claims and expectations explicitly uttered in accordance with a professional ideology (such as ‘the doctor is there for all patients on equal terms’) and implicit, latent norms and rules (such as ‘publications and empirical studies are more important for a clinical career than healing patients’). The contradictions in practice that arise from such norms are often concealed and legitimized with myths, in the sense of Roland Barthes (1974). ‘Myth’ is understood to mean a secondary semiotic system that constitutes, as it were, a second layer of reality that all participants (are obliged to) believe. At the same time, there is an important applied aspect to institutional research that is relevant to practice: analyses of the status quo have made it possible to develop proposals for changes in communication, for example, in training programs for members of institutions or for a different approach on the part of clients (Azoulay et al., 2000; Kurtz et al., 1998). Sociolinguistic analysis has therefore proved to be an essential part of the investigation of institutions, since a purely sociological analysis is not capable of making transparent the details of dynamics and processes and is therefore obliged to adopt a high level of abstraction (Weick, 1985). The above-mentioned studies, carried out by medical researchers, are mostly quantitative and use content analysis or measurements of information acquired through questionnaires and interviews; these tools, however, mostly allow for self-assessment and are often oriented toward the investigators who might


be seen as authorities. Yet the detailed microanalysis of conversations between doctors and patients in various settings provides more authentic insight into ongoing interactions, misunderstandings, satisfactions, and dissatisfactions as well as insight into compliance. Nevertheless, a textbook, such as that by Kurtz et al. (1998), is most certainly to be viewed as a first important step to be complemented by sociolinguistic and discourse analytic proposals (Lalouschek, 2004). Thus, the discourse analytical approach has made it possible to uncover the workings of explicit and latent institutional rules and norms in every specific text and to demonstrate how power structures, gender relations, etc. are always manifest and constantly (re)produced in interaction. To summarize briefly the most important sociolinguistic research on institutions, we find the following picture, which indicates tendencies but does not permit simple generalizations:
• Institutions are anonymous: the names of clients, for example, are known, but the names of insiders often remain secret. Written products, such as forms, rulings, and legal texts, are mostly written in the passive voice and characterized by vagueness.
• Institutions have their own specific rituals: in many institutions historically determined linguistic forms (such as forms of address, greetings, and oaths of loyalty) that have lost their specific function have developed. These also include dress regulations, rites of initiation, etc.
• Institutions have a quasi-military character: power and hierarchy are essential components of the internal dynamic. Many linguistic actions might take on the explicit character of requirements, requests, or even commands.
• Institutions strive to reflect harmony: contradictions and conflicts are often concealed and not discussed or exposed openly; this might lead to many latent tensions that affect the interaction with colleagues or with patients in the wards.
• Institutions are usually status oriented: members of the upper classes are often treated more quickly and more thoroughly than others, men better than women, nationals better than foreigners or migrants (always depending on the country of origin), and more experienced patients more efficiently than the inexperienced. Social class, gender, education, ethnicity, and so on combine in unique ways and play an important role in how individuals are perceived and treated by an institution. Typically, the most important positions are occupied by powerful, white males.
• Institutions often work inefficiently: inexperienced specialists frequently occupy the most important interfaces with the general public, such as in the outpatients’ clinic or in the reception area of hospitals. This means that often no decisions can be taken and problems must be referred to other experienced doctors or nurses (Coiera and Tombs, 1998). Moreover, experienced insiders in subordinate positions (such as nurses) sometimes must present their knowledge with great care so as not to constitute a threat to the more powerful specialists.

A Short Overview of Research Research on doctor–patient communication relies mainly on two approaches: the medical–sociological approach, which is oriented toward the institutional structures, and linguistic research, which focuses on microstructural aspects of communication and interaction. Thus far, there has been little success in unifying these two aspects. For example, Cicourel (1981), using a number of case histories, already showed the advantage of a conversation-analytical procedure as opposed to quantitative psychological categorizations. He cautions repeatedly that the structural framework must be included together with the different perspectives of the two principal protagonists – the doctor and the patient (doctors want to reach a diagnosis as swiftly as possible, whereas patients want to tell the story of their suffering; this results in conflicts of both time and interest that are sometimes difficult to resolve: ‘frame conflicts’) (see Medical Specialty Encounters). In the United States and United Kingdom, research focused more and more on the analysis of individual conversation patterns, such as question–answer sequences and other adjacency pairs. Since these are often isolated from the context of the complete ongoing discourse, they may ultimately be interpreted only to a limited extent (Fisher and Groce, 1990; West, 1990; Todd, 1983). Nevertheless, relevant insights from such research, albeit from 20 years ago, must be mentioned at this point: Todd (1983) described the collision between the institutional and lay worlds as a ‘frame conflict’: value systems, modes of knowledge structuring, and handed-down experiences diverge. The structural result of this kind of interaction is repeated misunderstanding and conflict. Mishler (1984) has shown that research remains in “the voice of medicine” and that scientific interpretation is therefore accomplished from the viewpoint of experts, but the signals from patients are ignored.
Cordella (2004) has also illustrated in her empirical


study that the ‘voices’ of patients are often ignored (see Doctors and Patients in Multilingual Settings). Erickson (1999) has carried out a particularly thorough and detailed investigation of the roles and identity conflicts of doctors that gives great insight into difficult decision-making situations. Specifically, the combination of training young doctors and treating patients at the same time leads to situations of conflicting interests (see also Menz, 1991). In German-speaking countries, research was originally dominated by medical sociology, which lacked a linguistic toolkit. The center of interest was case histories and conversations during doctors’ ward visits. Bliesener’s work (1982), however, on conversations during ward visits was certainly the first relevant attempt to capture a particular substructure of everyday life in a hospital from a discourse analytical viewpoint. Visits are broken down precisely into their features and phases, and the obstacles to successful patient communication are demonstrated in detail. However, the genuinely sociolinguistic aspect was overlooked: whether, for instance, women and men, old and young patients, are treated differently. Bliesener’s claim is that simply devoting more time to patients does not solve the problems of obstacles to doctor–patient communication. The quality of the conversation would have to be significantly changed to permit alternative types of communication to develop. In addition, Hein (1985) – in a then innovative pilot study on family doctors – was able to illustrate the presence of class-specific language barriers in communication with patients of different social backgrounds. The same family doctors prescribed psychopharmaceuticals for lower-class patients suffering from sleep disorders, whereas for middle-class patients (with the same symptoms) they recommended psychotherapy, valerian drops, or autogenic training.
In the first case, the doctor–patient conversation was typically short, full of closed questions, and often cynically derogatory, whereas in the second case, the conversations were twice as long, amicable, and full of anecdotes and digressions (Hein and Wodak, 1987) (see Medical Discourse: Illness Narratives). Lalouschek (1995) analyzed, in a longitudinal study, the training of psychosomatic specialists. Here, too, she found that because of a lack of self-reflection on the part of the doctor, little success was achieved. The change to a rather more therapeutic mode of conversational behavior was rarely made or made in an exaggerated fashion. Her conclusion was as follows: training must become considerably broader and more complex, and above all the results of discourse analytical research should be included. Even in these studies, however, the definition of the unit of

discourse was isolated and limited. The everyday life of the institution was only partially considered. New research has now developed the more holistic approach further with the aim of applying linguistic research in training seminars for doctors (Becker-Mrotzek and Brünner, 2004).

Sociolinguistic and Ethnographic Studies: Pain, Fear, and the Everyday Life of the Hospital With few exceptions, all of the studies mentioned thus far proceeded from an external perspective; i.e., certain discourses were arbitrarily extracted from the life of the hospital and analyzed according to prior assumptions. The relationship to the institution and to the life of the institution is essentially lost in such a process. Because of this limited knowledge of the context, the interpretations have only limited validity. In an investigation of the outpatients’ ward of a hospital in Vienna, it was the internal perspective that we were concerned with (i.e., the perspective from the viewpoint of the participants): we set out to observe, understand, and record mornings as a closed discourse unit and then to conduct conversations with both doctors and patients before and after this time (Lalouschek et al., 1990). We shared, as it were, the life of this clinic for a certain time. Because of this participant observation, it was possible to discover – alongside the familiar static social variables – some meaningful new dynamic categories. The outpatients’ clinic itself was perceived as an outpost – a “penal colony” in the words of one doctor – of low prestige and therefore a job for doctors in training. This means, as mentioned above, that at the most important interface with the outside world, inexperienced insiders are located instead of experienced ones. This leads to chain reactions: inexperienced doctors cannot make decisions and so senior doctors must be summoned. People must wait for them and this wastes valuable time. As a stereotypical answer to our questions, we learned that doctors have too little time and never make mistakes and that there is simply no better way. And in the everyday world, strategies had to be developed on the spot to cope individually with any problems that arose.
The discourse analysis always related to an entire morning (as a macrodiscourse); every individual conversation constituted a subdiscourse that should always be seen within the total context of the setting in the hospital and the specific morning. In what follows, I focus on a few categories used in analyzing some sample conversations between doctors, patients, and nurses.


The occurrence of patient initiatives (questions for information, stories, complaints, judgments) and the doctors’ way of dealing with initiatives (answers, interruptions, cutting off, ignoring, and so on) demonstrate forms of exercising power on the part of the doctors as well as the voices of the patients. The quality of a frame conflict can also be captured in this way. Every specific discursive coping with problems by the doctors demonstrates the assumed contradiction between explicit rules, myths, and actual events. (If doctors become aware of problems, do they deal with them or defer them? Do they turn the associated uncertainty and aggression against others? In particular, how do doctors deal with problems that are perceived by patients?) The specific creation of a relationship involves basic factors in relationship management. If it proves possible to create a personal relationship, perhaps by means of ‘social turns’, if important politeness rituals are observed, if particular forms of address are employed, and so on, does this lead to more compliance (Wodak, 1986)?

A Typical Morning On this particular morning, a male doctor and a female doctor are working together. Both are overtired; both have been on night duty. At 9:45, the second patient enters. She is considered to be a difficult patient, because of her age (87).

1 DF2: Right we’ll have to take off the gown too
  P:   (. . .) don’t
2 DF2: Why not? – We are in the hospital you know. Right –
3 DF2: now then let’s sit down here shall we?
  P:   /quietly/ (. . .)
4 DF2: RIGHT take off the gown please.
  P:   [Gown – but I’ve
5 DF2: [Take it off please – the gown. We’ve
  P:   got nothing under the gown.
6 DF2: got to do an ECG. Right No one’s
  P:   (. . .) Gown
7 DF2: looking – well – it’s only the doctor
8 DF2: – isn’t it. He’s allowed
  P:   [The doctor can look – but
9 DF2: to look isn’t he – right let’s sit down here
10 DF2: shall we. exactly
   P:   [Sometimes he even has to look (. . .)
11 DF2: Right – tell me, which was the broken arm?

64 DF2: She keeps wobbling around – NOW JUST LIE STILL
65 DF2: DON’T KEEP WOBBLING AROUND – OR THE ECG WON’T WORK
66 DF2: QUITE STILL – JUST RELAX OK. Good – right:
   P:   [All right – yes.
67 DF2: that’s fine.

221 DF2: But she’s sore EVERYWHERE – she’s sore
222 DF2: everywhere. DOES IT HURT THERE TOO?
    P:   Ah: yes / sighs / (. . .)
223 DF2: Ah, not there – only the back and there it hurts,
    P:   [it’s OK
224 DF2: right? Yes – and there?
    P:   It hurts there. [Well – I can feel it but
225 DF2: [not too bad.
    P:   it’s bearable.

First, the patient is asked to take off her blouse; she is then examined; and only then comes the first question: “what problems have you had, then, with your heart?” Next we hear the patient’s own account: she got herself worked up. “Old patients shouldn’t get themselves worked up,” says the female doctor, raising the matter of age for the first time. The social turn then comes to an abrupt end (“all right”), and there follows a closed question: “whereabouts in your stomach does it hurt?” Finally, toward the end, the patient seems cooperative. What is striking about this macrostructure is that instead of the usual questioning activities at the beginning, there are initially speech acts of demands and the consultation sequence is embedded into these. When the patient does not want to take off her blouse, after a first indirect speech act, linked to a childish form of address such as “we have to,” there is an unsatisfactory explanation. This form of address – pluralis hospitalis – is often employed with old people. After a structuring signal (“right”), the female doctor makes a second attempt (“right, would you take off your blouse, please”): this is more polite but also direct, with the unmistakable character of a request (almost a command), using the more distant form of second-person address and a sociophonological switch to standard pronunciation: “take it off, please, your blouse.” Finally, there is a simple explanation. The patient still quietly resists, and the next explanation follows: only the (male) doctor is there, and the particle “OK” serves as a closing signal and as reassurance. “He can look,” says the patient, half-questioning, half-submitting. This is repeated by the female doctor, and again a closer relationship develops, expressed in dialect and the word “OK,” to which is attached a further appeal and a concluding and reassuring “OK.” The old woman repeats: “he has to look sometimes,” as though she were trying to convince herself. “Exactly,” replies the female doctor in confirmation; she changes the subject, again with a structuring “so,” and the examination begins with the first closed questioning activity. This is realized as a


further command (the fourth command in this short text) in which impatience is clearly manifest. In the following part, the female doctor yells at her, no longer able to control her annoyance, and the seventh direct request is realized in standard language. This is immediately followed by four more direct requests. For this, she later expresses regret. The closing section of the conversation is polite and calm; we find the third form of address: they are talking about the patient (“everything hurts her”). This causes a great deal of uncertainty in patients, since it is not clear who is being referred to. Only the change to quite direct questioning and a different form of address makes this switch clear: the utterance was aimed at the other doctor. The patient is now well behaved. She has adapted to the institutional frame. The female doctor therefore oscillates, in her handling of the patient, between coming close, impatient distancing behavior, and brutal authority. This communicative behavior is, on the one hand, a kind of power display, but on the other hand it reflects her state of conflict, which may be attributed to a number of factors: it was her first consultation in the presence of a tape recorder; the patient was also ‘difficult,’ since she is very inexperienced and causes a delay in the sequence of events. This case is therefore perceived as a disturbance, as uncooperative behavior, and the doctor does not succeed in reducing the patient’s fear and establishing a reasonable relationship and trust. Explanations and information are denied precisely where they would be needed. In the end, the patient becomes completely silent. Here frame conflict and language barriers are obstacles to face-to-face communication. In the outpatients’ ward, the staff is not equipped for this kind of disturbance. The examination lasted 21 min.

The Rest of the Morning After 11:00 A.M., there was a loudspeaker announcement: “Attention please: here is an announcement. The Götzendorf field ambulance will be assisting in snow clearance in the car park from 1:00 P.M. onward. All staff are requested to remove their cars to make this work easier.” Initially the content of this announcement causes laughter: “Oh God, they can’t be serious. One o’clock this afternoon. We’ll never get out. That’ll be fun. . .” When the doctor gradually realizes, however, what the announcement means, she becomes menacing: “But look, but there’s the training at 1 o’clock, I haven’t – I think it’s 3 weeks since I last went for training. . . Where can I put it? We’ll just leave it. I can’t cope any more.” The doctor transfers this bad mood to the waiting patients: “They’re still there waiting. How many more are there, you hear me?” The strain apparently can no longer be tolerated, and any reasonable way of handling the situation is prevented. Here it becomes particularly clear how important it is to have a precise definition of a discourse unit and the context. Taken out of context and treated as a single sequence for investigation, the doctor’s behavior could only be interpreted as uncontrolled and irrational and, in general, as unwillingness to become involved with patients. In the total context of the morning, in contrast to the first consultations, this discursive phase takes on a different value, as an example of an inability to deal with conflict and a lack of problem-solving strategies (Van Dijk, 2003; Panagl and Wodak, 2004). For this reason, the final phase of the morning becomes hectic and full of tensions, the case consultations become significantly shorter, and postconsultation discussions are completely missing. Initiatives are no longer responded to and distance is indicated.

Dealing with Pain: Gender-Specific Differences Chest pains are a frequent cause of admission to hospitals, and they can be caused by a number of different conditions, ranging from a harmless pulled muscle to life-threatening coronary heart disease (Everts et al., 1996). Despite highly developed technology in the diagnosing of coronary diseases, there remain a number of uncertainties that complicate a diagnosis. Apart from the findings of such diagnostic tools as ECG and chest X-rays, together with laboratory investigation, consultation with a patient still plays a significant role in a diagnosis. In determining the cause of chest pain, the patient’s description of the pain is most relevant (see Medical Discourse: Communication Skills and Terminally Ill Patients). Medical investigations have shown that in women a dangerous coronary condition requiring treatment more frequently goes unrecognized, and instead an erroneous diagnosis of some harmless noncoronary cause is made. A number of medical studies have shown that these differences may be traced back to different descriptions of pain (Penque et al., 1998). For this reason, on the medical side, an interdisciplinary project was initiated at the interface between cardiology and discourse analysis. In this project, one of the guiding hypotheses was provided by the clinical experience: coronary conditions are more difficult to diagnose in women. Thus, are there gender-specific forms and linguistic behaviors in the description of pain?


Out of a complete corpus of 102 recorded interviews developed to elicit the most spontaneous and comprehensive descriptions of pain, a total of 24 interviews were selected according to criteria of gender, cause of illness, and age, and these were subjected to a discourse analytical investigation. Like other previous investigations, this study showed that in doctor–patient communication, gender-specific differences become salient. The linguistic differences in the description of chest pain that were discovered in the present discourse analytical investigation may possibly contribute to the fact that coronary problems are more frequently overlooked in female patients than in males and that consequently the mortality rate is higher (Vodopiutz et al., 2002). Differences were found in four different areas in particular:
• Women downgrade their pain (diminish it or talk instead about the psychosocial environment); men rate their pain highly (take it seriously, show themselves as informed and interested).
• Women regard themselves interactively as tolerant of pain (passive with respect to the pain and delegating the treatment to the institution ‘hospital’); men present themselves in the interaction as dealing with their pain (active with respect to the therapy).
• Men display more strongly than women a desire for explanation of causes.
• Men describe their pains very concretely (in the sense of the definition), in that they give very full symptomatic descriptions; women, conversely, give diffuse descriptions of pain (in the sense of the definition), in that they barely focus on symptomatic aspects and frequently use markers of diffuseness and meta-communicative utterances, in which they talk of the impossibility of describing their pains precisely.
Strikingly, no age-specific differences were found.
These differences have relevant implications in doctor–patient communication because the linguistic activity of men fulfills the expectations of the medical staff to a much greater extent than that of women with regard to the desired information, for doctors both need and require a symptomatic description of pain – that is, the most exact possible statement of location, intensity, and duration or frequency of the incidence of chest pain. The result of this is that patients who spontaneously pursue what they see as the relevant ‘clarification of causes’ and give predominantly symptomatic descriptions of pain are viewed by the doctors as more precise, informative, and cooperative. Since the major proportion of cardiologists are still men and the description of symptoms and the

clinical differential diagnosis of coronary heart disease were developed predominantly with male patients, this bias is possibly inherent in the system.

Final Remarks: Information for the Informed! Let us sum up briefly: the division between the beginning and the end of the morning in the outpatients’ clinic is also manifest in the degree of emotionality. Ultimately, ‘difficult’ patients are accused of causing the disturbances, and so the scapegoat is outside the outpatients’ clinic rather than anchored in the structure. The clients are made into an external enemy; the institution defends itself and functionalizes its contradictions both consciously and subconsciously. Only a precise context analysis, an understanding of everyday life in the institution, and a sequential analysis of the discourse make it possible to grasp what is happening, to reveal the contradictions and the use of power. Since what is not permitted is not done, myths and rationalizations are preserved, which often puts a stop to any reflection or possible change. In the second case, it becomes clear again how important the category of gender is in communication and how relevant the conscious perception, admission, and sharing of pain are. Not only must medical education give training in conversational competence, it must also include knowledge of emotions and the different socialization patterns of the two genders. Can a face-to-face conversation with a doctor be replaced with the use of new media, such as websites, debate forums, or the Internet? Could this help overcome staff shortages? On the basis of research on Internet communication and on the websites of individual doctors and group practices, this kind of claim must be seen as unrealistic (Danet, 2001; Herring, 2001). The Internet can be sensibly used only by those who know precisely what they are looking for. Initial barriers affect both expert knowledge and language. Medical experience is also essential, because one must be able to distinguish between commonplace advice and detailed suggestions. 
If one is really looking for precise help, one inevitably comes up against specialized knowledge, scientific articles, mostly in English, on individual medical problems. How can a lay person deal with this? And how can one assess whether particular diagnoses and therapies are applicable to one? A real face-to-face interaction and conversation with a doctor therefore remain indispensable. On the other hand, the Internet or some other multimodal channel functioning as advisor may sometimes be helpful in finding important starting points or in giving patients the feeling that others have similar problems and thereby reducing fear


and shame. Alternatively, after an established diagnosis, it enables chatroom conversations with other ‘fellow sufferers,’ and this helps to reduce loneliness and fear. A precondition, however, must still be a conversation – not ‘virtual’ – but face to face, because the problems of each patient remain unique and must be seen in the context of the patient’s everyday life. This will always require empathy, listening, and questioning skills as well as sufficient time to follow and understand the sometimes too brief or maybe too extended narratives of the patients, thus detecting the relevant points to come to an adequate diagnosis. See also: Doctors and Patients in Multilingual Settings; Medical Discourse: Communication Skills and Terminally Ill Patients; Medical Discourse: Illness Narratives; Medical Discourse: Psychiatric Interviews; Medical Specialty Encounters; Psychotherapy and Counselling.

Bibliography
Azoulay E, Chevret S, Leleu G & Pochard F (2000). ‘Half the families of intensive care unit patients experience inadequate communication with physicians.’ Critical Care Medicine 28, 3044–3049.
Barthes R (1974). Mythen des Alltags. Frankfurt/Main: Suhrkamp.
Becker-Mrotzek M & Brünner G (eds.) (2004). Analyse und Vermittlung von Gesprächskompetenz. Bern: Peter Lang.
Bliesener Th (1982). Die Visite – ein verhinderter Dialog. Tübingen: Narr.
Cicourel A V (1981). ‘Language and medicine.’ In Ferguson C & Heath S B (eds.) Language in the USA. Cambridge: Cambridge University Press. 403–430.
Coiera E & Tombs V (1998). ‘Communication behaviors in a hospital setting: An observational study.’ British Medical Journal 316, 673–677.
Cordella M (2004). The dynamic consultation: A discourse analytical study of doctor–patient communication in Chilean Spanish. Amsterdam: Benjamins.
Danet B (2001). Cyberpl@y: Communicating online. Oxford: Berg Publishers.
Erickson F (1999). ‘Appropriation of voice and of self as a fellow physician: Aspects of a discourse of apprenticeship in medicine.’ In Sarangi S & Roberts C (eds.) Talk, work and institutional order. Berlin: Mouton. 109–144.
Everts B, Karlson B W & Währborg P (1996). ‘Localization of pain in suspected acute myocardial infarction in relation to final diagnosis, age and site and type of infarction.’ Heart and Lung 25, 430–437.
Fisher S & Groce S B (1990). ‘Accounting practices in medical interviews.’ Language in Society 19, 225–250.
Hein N (1985). Gespräche beim praktischen Arzt. M.A. Thesis: Vienna.
Hein N & Wodak R (1987). ‘Medical interviews in internal medicine.’ Text 7, 37–66.
Herring S C (2001). ‘Computer-mediated discourse.’ In Tannen D, Schiffrin D & Hamilton H (eds.) Handbook of discourse analysis. Oxford: Blackwell. 612–634.
Iedema R (2003). Discourse in post bureaucratic organizations. Amsterdam: Benjamins.
Kurtz S, Silverman J & Draper J (1998). Teaching and learning communication skills in medicine. Abingdon: Radcliffe Medical Press.
Lalouschek J (1995). Ärztliche Gesprächsausbildung. Eine diskursanalytische Studie. Opladen: Westdeutscher Verlag.
Lalouschek J (2004). ‘Kommunikatives Selbst-Coaching im beruflichen Alltag. Ein sprachwissenschaftliches Trainingskonzept am Beispiel der klinischen Gesprächsführung.’ In Becker-Mrotzek M & Brünner G (eds.) Analyse und Vermittlung von Gesprächskompetenz. Bern: Peter Lang. 137–158.
Lalouschek J, Menz F & Wodak R (1990). Alltag in der Ambulanz. Tübingen: Narr.
Menz F (1991). Der geheime Dialog. Bern: Peter Lang.
Mishler E G (1984). The discourse of medicine. Dialectics in medical interviews. Norwood, NJ: Ablex.
Mumby D K (1988). Communication and power in organizations: Discourse, ideology and domination. Norwood, NJ: Ablex.
Panagl O & Wodak R (eds.) (2004). Text und Kontext. Würzburg: Königshausen & Neumann.
Penque S, Halm M & Smith M (1998). ‘Women and coronary disease: Relationship between descriptors of signs and symptoms and diagnostic and treatment course.’ American Journal of Critical Care 7, 175–182.
Sarangi S & Roberts C (eds.) (1999). Talk, work and institutional order. Berlin: Mouton.
Todd A (1983). ‘A diagnosis of doctor–patient discourse in the prescription of contraception.’ In Fisher S & Todd A (eds.) The social organization of doctor–patient communication. Washington, DC: Center for Applied Linguistics. 159–188.
Van Dijk T A (2003). ‘The discourse–knowledge interface.’ In Weiss G & Wodak R (eds.) Critical discourse analysis: Theory and interdisciplinarity. London: Palgrave/MacMillan. 85–109.
Vodopiutz J, Poller S & Schneider B (2002). ‘Chest pain in hospitalized patients: Cause-specific and gender-specific differences.’ Journal of Women’s Health and Gender Based Medicine 11(8), 1–9.
Weick K (1985). Der Prozess des Organisierens. Frankfurt/Main: Suhrkamp.
West C (1990). ‘Not just “doctors’ orders”: Directive–response sequences in patients’ visits to women and men physicians.’ Discourse and Society 1(1), 85–112.
Wodak R (1986). Language behavior in therapy groups. Los Angeles: University of California Press.
Wodak R (1996). Disorders in discourse. London: Longmans.
Wodak R & Meyer M (eds.) (2001). Methods in CDA. London: Sage.


Medical Discourse: Early Genres, 14th and 15th Centuries

I Taavitsainen, University of Helsinki, Finland

© 2006 Elsevier Ltd. All rights reserved.

Western science was initiated by Greek philosophers, mathematicians, and medical persons in their search for principles of nature, including human nature, and of argument itself (Crombie, 1995: 225). Knowledge disseminated into Latin via Arabic and from Latin to western vernaculars. Latin prevailed as the language of learning in Europe until the 18th century and served as the lingua franca of science until much later. In the late medieval period, scientific and medical ideas were expressed mainly in Latin, but texts started to be translated into various vernaculars and the circle of learning widened. The discourse world of medicine was multilingual: Greco-Roman texts provided the model for vernacular treatises and the standard against which the evolution of vernacular scientific writing can be projected. The influence of the Latin models on vernacular languages, and on the development of vernacular genres, has often been underestimated.

Medical Discourse and the Languages of Writing

In modern linguistics, ‘medical discourse’ refers collectively to the communicative practices of the medical profession, both written and spoken. In the late medieval period, the medical profession consisted of heterogeneous groups of practitioners, including physicians, surgeons, barbers, midwives, itinerant specialists (e.g., bonesetters and oculists), herbalists, apothecaries, wisewomen, and others. They can be roughly divided into clerical and elite practitioners and tradespeople or ordinary practitioners; literacy was restricted mostly to the elite group. The core genres of learned Latin writing were used by a small elite of learned physicians and surgeons, and included commentaries, compilationes (‘compilations’) and encyclopedic treatises, question-and-answer guides, pedagogical dialogues, consilia, and practica; some treatises are called ‘sermons,’ but without further specification of the term. The Latin situation provides a starting point for the assessment of vernacular genres, although the users of vernacular texts and the genre map are somewhat different.

Readership of Vernacular Texts

Genres are important operational tools and constitute dynamic systems that undergo change and variation. Over the course of time, sociocultural needs change and genres change accordingly: old genres become adapted to new functions, new genres are created, and genres that have lost their function cease to exist.

The question of audience is important. In earlier literature, language and audience were thought to correlate, so that Latin was associated with learned writing for professionals and vernacular languages were linked with more popular texts for lay audiences. Recent research has shown that this distinction does not hold and that the pattern is more complicated. For example, surviving evidence indicates that the address to ‘the poor’ and ‘unlearned’ in prologues and dedications is not to be taken at face value. The real consumers can be assessed through external evidence, such as library catalogues, wills, and ownership inscriptions. According to this evidence, owners and potential readers were mostly people of high rank in society and professionals in the field of medicine. For example, French-language copies of the popular verse encyclopedia Sidrak and Bokkus were owned by important figures in English political life, including Simon Burley, tutor to Richard II (Burton, 1998: 1:xxxiii).

Guy de Chauliac’s surgical writings constitute highly learned compilations, and these texts were also translated into vernacular languages. Their distribution shows how learned knowledge disseminated throughout Europe. The Latin originals (c. 1360) were targeted at doctors of three leading faculties of medicine, Montpellier, Bologna, and Paris, and at doctors and clerks of the papal court (McVaugh, 1997); thirty-three Latin manuscripts are extant. In the 15th century, Chauliac’s works were translated into several vernaculars, with at least three anonymous translations in Middle English. The audiences of the vernacular versions must have been more heterogeneous, but conclusive evidence is lacking. Some of the manuscripts show no signs of wear or use at all. Rather, they seem to have been display objects, with the vernacular serving in a new function in the prestige register of scientific writing.
Transfer of Latin Genres into Vernacular Languages

In the late medieval period, there was a gradual increase in vernacular writing in various realms of knowledge, so that by the 14th and 15th centuries, texts dealing with scientific, medical, and technological subjects were becoming increasingly common in Europe (Crossgrove, 1998). By the end of the 14th century, conventionalized discourse forms had been created to present and organise knowledge in Latin; they can well be called ‘genres,’ as genres are defined according to the functions that they fulfil in society. Genres of Latin writing had become legitimate modes for disseminating learning, but they developed and changed during the period in question. New writing forms and genres like consilia and practica were created, new diseases were treated, and new fields, such as astrological physic and medicinal alchemy, emerged.

The process of vernacularization started in the 14th century and took place on a broad front in the multilingual context of medieval Europe in several countries roughly simultaneously. The register was new in these languages, and the means to express scientific ideas did not exist; they had to be created. Vernacular texts occupied an intermediate position between the world of learning and the more popular attitudes, between ars and vulgus, as demonstrated by the German versions of Bernard of Gordon’s Lilium medicine (original 1305), which was also translated into Middle English, French, Castilian, Gaelic, and Hebrew (Demaitre, 1998: 88). Genre conventions of scientific writing in English show a great deal of fluctuation; features of vernacularization are likely to be parallel in different languages.

Traditions of Vernacular Writing

Some genres survive in written vernaculars even from earlier periods: handbooks with practical advice, recipes, prognostications, and charms are extant from late Old English. We can talk about different traditions of writing, defined as a continuing series of texts building on one another, with a great deal of intertextuality. The traditions are an underlying factor, with several genres within them. Learned translations of academic and surgical texts were new and emerged in the last quarter of the 14th century, whereas the remedy book tradition is longer and goes back to the 10th and 11th centuries in English (Voigts, 1984). Yet the division into three traditions (academic treatises, surgical texts, and remedy books) is not always clear. Surgical texts are easy to identify because of their subject matter, but academic writing is a difficult category because of the lack of knowledge of early authors and their audiences. Likewise, remedy books are a difficult category, as these texts also contain learned materials, which may have come to vernaculars earlier, and there are several chronological layers of influence and borrowing.

Institutional Settings of Medical Discourse

Medicine as a Discipline

The position of medicine among sciences is special because of the relationship between theory and practice; this problem occupied learned writers for centuries. Medicine was both a learned university discipline and an occupation involving technical skills. Medical education comprised both general theories and specialist doctrines, but also practical applications through apprenticeship. The development of learned genres from the 12th century onward took place within an institutional context, with the newly founded universities as centers of medical learning (Crisciani, 2000: 75–78). More practical, noninstitutional writings are extant in great numbers in both Latin and in vernacular languages.

Institutionalization of Medical Education

Before the founding of universities, medicine was taught in households of noble patrons or in medical schools attached to monasteries or cathedrals. From the 11th century onward, the Latin western world began to seek out a fuller version of ancient wisdom. Italy was in the forefront of the developments in the formation of writing conventions and establishing textual traditions, with new institutions and expanding scientific horizons. The 12th century has come to be seen as the starting point for a long medical Renaissance, because intellectual traditions, evolving over 400 years, eventually began to resolve a broad range of issues, thereby defining a new science of medicine, which grew out of a range of equally new areas of scholarly investigation (McVaugh and Siraisi, 1990: 9). The influence of the new translations from Arabic in natural philosophy and medicine started in the 12th century; the Constantinian translations, for example, date from this period. The center of medical knowledge was Salerno, Italy, and the most important product of the school of Salerno was the fully developed textbook known as the Articella, a collection of classical medical knowledge with Arab learning (French, 2003: 72; Ottosson, 1984: 60).

Universities became institutionalized in the 13th century, and there was a general agreement about the form and content of medical teaching throughout Europe. Oxford and Cambridge, like others, were monolingual in Latin and part of the pan-European educational network that paid no regard to the vernacular. Institutional developments promoted the creation of discourse forms in a more formalized direction. During the 14th and 15th centuries, the growth of towns in Europe was rapid, and various kinds of corporations started to develop. Teachers of medicine formed masters’ guilds and taught their pupils to a level of ‘master,’ thus controlling the continuity of the trade (French, 2003: 70).
An important step toward the institutionalization and full development of medical training took place in England, with the establishment of the guilds of practicing barbers and surgeons in the late 14th century. Guilds started to record their official documents in the vernacular; for example, the Barber-Surgeons of York gathered all kinds of useful information into their guild book from the year 1486 in English and Latin. It is likely that the medical treatises in this book were used in instruction, but otherwise, vernacular medical literature was probably non-institutional.

Latin Genres of Learning

Commentaries and Compilations

‘Commentaries’ and ‘compilations’ (compilationes) were at the heart of the intellectual mainstream of scholasticism. According to medieval literary theory, commentaries are distinguished from compilations by the fact that the commentator takes the responsibility for conclusions. His own materials are ‘‘annexed for the purpose of clarifying’’ the issues, whereas opinions are attributed to others by the compiler, who ‘‘writes the materials of others, adding but nothing of his own’’ (Minnis, 1979: 387; quotations from Bonaventura translated by Minnis). Commentaries were used both for research in reconciling ancient authorities and in teaching. The twofold nature of medicine shows in them, as, in addition to the truth and meaning of the work, the commentator had to point out its usefulness (Crisciani, 2000: 79). Each discipline had a canon of authoritative texts. The commentary tradition has several layers, from Greek to Latin and Arabic, and the developments continued after the high point of the 12th century (see Minnis et al., 1988). Textual transmission took the form of systematic and often heavily glossed and commented versions of ancient knowledge. Commentaries reflected the logocentric mode of scholastic science, with a reliance on axioms, i.e., statements accepted as being true, though not necessarily so. The source of knowledge was ‘that someone said so,’ i.e., the ‘Quotative.’ Reporting the opinions of various authorities was an evidential feature of scholastic texts. The layered nature of medical learning showed in the hierarchy of authorities referred to. Galen and Hippocrates were frequently mentioned in learned and specialised treatises; next in frequency came Arab authors, such as Avicenna, Rhases, Haly Abbas, and Averroes, and medieval Latin authors. General references to doctors, leeches, physicians, and masters prevailed in more popular layers of writing (Taavitsainen and Pahta, 1998: 169). Compilationes reflect logocentric science as well. 
This genre was important for the dissemination of knowledge and gained wide use in its different forms in the vernacular. Compilations had a twofold didactic function. First, they provided easy access to authoritative passages and convenient ways of finding important opinions. Second, they made the authorities available to readers not able to work their way through the originals (Minnis, 1979: 402–403).

The two top genres of scholasticism approached one another at the end of the medieval period, and it is often difficult to make a distinction and classify texts. Textual histories of several works, including the Articella components, are extremely complicated and may include several layers, with commentaries and compilations intertwined. This practice of building upon various discourse forms was probably widespread. Pseudo-authorial texts were written by near-contemporary authors and transmitted alongside the authentic ones; in addition, to elaborate the medieval genre map further, any important text in the Middle Ages could be subject to commentary.

Encyclopedias

The techniques of compiling reached a high level of sophistication in medieval encyclopedic treatises. They had multiple uses and audiences. The De proprietatibus rerum of Bartholomaeus Anglicus was originally composed in Latin (c. 1245) for friars to illustrate their sermons (Seymour et al., 1975/1988: 11). In addition to religious matters, it included natural history and medical lore; some copies, owned by high aristocracy, had handsome illustrations of medical practice and of the universe. A large part of the work overlapped with medical literature and seems to have served as a model for vernacular writing in the following centuries. A medical encyclopedia, the Breviarium Bartholomei, was produced toward the end of the 14th century by the London priest John of Mirfield. It recorded all medical authorities of the day, plus charms and prayers, perhaps reflecting hospital use (Getz, 1998: 50).

Questions-and-Answers Literature

‘Questions literature’ had several layers and traits with chronological developments. Teaching methods of universities, with their oral practices, contributed to the development of learned written genres. Conventions of scientific argumentation took shape: the lectio influenced the commentary, and oral disputations affected the question-and-answer pattern in written discourse. Theoretical and practical experience could be combined, e.g., Taddeo Alderotti’s commentary was the result of 10 years of teaching with a long professional practice (Crisciani, 2000: 79). Likewise, Guy de Chauliac combined a university career with a long experience in medical praxis.

Public disputes were central in communicating knowledge orally. Their written counterpart originally derived from Aristotelian treatises on scientific problems and became conventionalized at universities in the 13th and 14th centuries. The formula of oral disputes was very similar to that seen in the written genre, and the influences intertwined, so that the medieval commentary also contributed to the development of the questions genre (Siraisi, 2001: 144–148). The pattern was elaborate: the question was posed, considered with pros and cons, given an affirmative or a negative answer, then expanded by descriptions, definitions, and explanations as well as a review of authorities. At the end, the main argument was raised again and the problem was answered. This pattern developed a standard stock of questions and remained valid for centuries (Cadden, 1993: 114; see also Minnis et al., 1988: 212). Salernitan questions were in wide circulation and found in both verse and prose collections in Latin. The sources are complicated; some can be traced to Salerno, some to Montpellier and Paris, showing Arabic and Salernitan influences, and many seem to stem from classical collections of Greek problemata (Lawn, 1979: xiv–xxiii).

Pedagogical Dialogues

‘Pedagogical dialogues’ also derived from classical models but had different underpinnings. They were found in Greek literature, from Socrates through Plato and Aristotle, and continued throughout history, with Boethius’s De consolatione philosophiae as the most influential medieval text of this genre. This ‘‘schoolroom colloquy between a master and a student’’ (Lerer, 1985: 19) set the model for the stereotypical roles in pedagogical dialogues for centuries to come and was translated several times into vernaculars (e.g., into English in the Old, Middle, and Early Modern English periods).

Case Studies, Consilia, and Practica

In the medieval period and for centuries afterward, the core of medical instruction was based on typical cases of disease. The genres connected with case reports were consilia and practica. The genre of consilia was modeled on legal documents and originated in the 13th century. A consilium was a piece of medical advice on a particular case. It offered a diagnosis and suggested a therapy (French, 2003: 121). Practica dealt with particulars of a disease and treatment and became a university genre with growing intellectual attention; the plague literature, for example, is connected with this genre. Surgical case histories were probably influenced by Arabic models and were not used in institutional teaching.

Vernacularization of Medical Writing

The complex motivations for vernacularization have received attention recently (see Wogan-Browne et al., 1999). On the one hand, the processes seem to be connected with the rise of nationalistic feelings and the desire to enhance the status of the languages (e.g., Trevisa); on the other hand, it was a commonplace to apologize for the use of the vernacular for learned purposes in prefaces of early printed books. Latin scientific and medical texts provided the model for vernacular versions. Translations covered all levels, from academic texts to miscellaneous collections of useful materials such as recipes, rules of health, and applications of astrological doctrines, and some new texts were composed as well. Because vernacular languages were not used at the highest institutional level, the practical side became enhanced in them. In general, the transmission of surgical texts in the vernacular obeyed laws different from those applying to Latin (Jones, 1989: 88). To date, this area has not been studied in great detail.

Learned Genres in the Vernacular

The learned genres of commentaries, compilations, and questions-and-answers were institutional. In the vernacular, these genres lost their original functions, and they may have gained new applications; such new uses and functions were central for dissemination of knowledge to the wider public.

Commentaries and Compilations

There are severe problems with identification of genres in vernacular languages. A recent discovery (Tavormina, forthcoming) revealed that even classical Greek authors’ texts were translated into English in the late medieval period. The vernacular texts showed characteristics of the commentary genre but did not indicate their affiliation. The contrary seems to be even more common; some texts were explicitly called ‘commentaries,’ or the word ‘comment’ occurred in the title, but in contrast to genre expectations, the text contained explications of key terms and concepts, so that the discourse features of the ensuing text did not accord with the Latin genre named in the title. A survey of manuscripts indicated as commentaries in Voigts and Kurtz (1998) revealed that translations focused on the more utilitarian kind of academic texts, such as urinoscopies. Rather than quoting authorities and reconciling their views, these texts strove to make the contents easily accessible. The role of the translator varied; some seemed to have distanced the text (Benvenutus Grassus), and some acted as editors by selecting and excerpting materials, as in the case of John Arderne’s surgical texts. There was no stability, and it seems that the genre was not conventionalized in the vernacular. It is evident that the conventions of taking responsibility for one’s opinions, drawing conclusions, and conciliating conflicting views did not hold in vernacular texts, and many of them simply listed opinions of various authorities on a topic.

Chauliac’s top-level academic surgical treatises and the academic encyclopedia Of the properties of things represented the learned end of medical writing in the vernacular, and chronological developments can be detected in their transmission. Trevisa’s translation (1398–1399) represented the early phase of vernacularization, whereas Wynkyn de Worde’s edition was printed around 1495 and belonged to a different phase, when vernacular writing was already established. Adaptations were made for a broader audience; for example, pictures were added to act as visual texts, and the use of Latin in the rubrics was maintained as part of the scheme (Holbrook, 1998).

The scope of encyclopedic treatises was wide in the vernacular. The focus of the more popular kind was on astrological medicine and its applications. Differences are obvious: learned compilationes assembled loci communes into coherent wholes; otherwise the technique merged with normal book-making practices. It was customary to gather useful texts, which also circulated independently, into the same manuscripts in so-called ‘commonplace books.’ The distinguishing feature of a commonplace book was its purpose: miscellaneous items were copied for the compiler’s own personal use, interest, and amusement. The best example of this genre in medicine is the commonplace book of Thomas Fayreford (see Jones, 1998), but other commonplace books also included medical materials.

Questions-and-Answers and Pedagogical Dialogues

The elaborate patterns of Latin writing were not found in the vernacular, but because questions and answers were particularly suitable for instruction, the genre was adapted to vernacular teaching. The pattern was sporadically found in medical prose of the 15th century, but more consistently only in the 16th century. In these texts, the questions were straightforward, sometimes even blunt, and the answers came from some monological treatises, e.g., from Chauliac’s surgical treatises in Guido’s questions (1579). Questions and answers were found earlier in verse than in prose. Verse provided a more elementary mode of expression in Middle English, and prose was associated with philosophy and higher learning.

An interesting work combining various literary and nonliterary genres is Sidrak and Bokkus, which circulated widely in several vernaculars in the 14th and 15th centuries. This treatise was an encyclopedic compilation with questions and answers, a philosophical dialogue with the roles of a teacher and a learner, and a saint’s legend. It was written in rhyming couplets with common stock rhyme words and was thus easy to memorize. Unlike learned encyclopedias, Sidrak and Bokkus was not coherent, but the 362 scientific and medical questions were dispersed in the text. Its dialogue has been characterized as a ‘‘catechism’’ without conflict (Burton, 1998: 1:xxvii): the pupil accepts the teacher’s answers without argument or further queries. Topics of Salernitan questions were frequently dealt with in vernacular writing as well, but they survive mainly in references to the school of Salerne in late medieval English-language medical literature.

Case Reports

Narratives of the course of illness are commonly found embedded in academic and surgical treatises, e.g., Benvenutus Grassus’ learned ophthalmology and John Arderne’s texts. Recipes in remedy books may contain occasional case reports to prove the efficacy of the cure.

Remedy Book Material: Recipes, Charms, and Prognostication

Recipes were a well-defined procedural genre, but even their transmission was extremely complicated. Recipes were found on their own and embedded within a wide range of texts. Recipes in different traditions of medical writing in Middle English show different degrees of textual standardization: in remedy books, recipes followed a standardized format, but in academic and surgical texts they were more varied. The difference can be explained by the different functions of recipes in these traditions. In learned treatises they provided illustrations of healing practices, whereas remedy books served as handbooks for quick reference. The standardized format, with explicit titles and regular structure, served practical purposes and made consultation easier (Taavitsainen, 2001).

Conclusions

Vernacularization processes are complicated and need further study. The converging development of commentaries and compilations, for example, is likely to have taken place in Latin writing as the traditions of commentary and compilation approached one another. Medieval commentators were influenced by the practice of compilations; attitudes changed as works of revered authors were noticed to include compilations as well, and compilations acquired sophistication and refinement especially in the 13th and 14th centuries (Minnis, 1979: 386–387, 413). It is no surprise that commentaries and compilations overlap and merge in vernacular texts and that it is difficult, sometimes impossible, to distinguish between the genres. Perhaps it is not even necessary to do so. Compilations form a broad category in the vernacular. Some thematic compilations became fused with commentaries, and at the other end the process overlapped with the type of activity found in commonplace books.

In general, vernacular medical texts were biased toward instruction and practical knowledge. There was a great deal of variation between individual texts. More theoretical treatises were also found, but there were difficulties in making vernacular languages function in the new prestige register. The late medieval period was important in establishing genre conventions in vernacular writing. In the 15th century the scale of medical writing was wide: there were texts with theoretical considerations transferred from the Latin exemplars, but there were reduced adaptations and applications as well. The difference between learned and popular layers of writing was reflected in some pertinent features of scholasticism, for example, the scale of authorities quoted and the degree of specificity. It seems possible to trace the transmission of scientific ideas to various audiences with a detailed comparison of the contents and styles of writing.

More definitive conclusions need more studies on the Latin background and on features of vernacular writing. Although Latin genres of writing have not been studied in detail, the situation is improving. A major research project of revising the Thorndike and Kibre Catalogue into electronic form is forthcoming (Voigts), and recent editions of Latin medical texts provide excellent material for more detailed studies. The scope of vernacularization in different European languages is not fully known yet and new discoveries are possible. New research tools have made the task easier and new projects have been launched.
The database Scientific and medical writings in Old and Middle English: An electronic reference (Voigts and Kurtz, 1998) and George Keiser’s volume A manual of the writings in Middle English 1050–1500: Works of science and information (1998) are invaluable in charting the extent of English medical writing and provide a good overview of the situation around 1500 A.D. An electronic corpus Middle English Medical Texts (Taavitsainen, Pahta and Mäkinen, 2005) makes a wealth of vernacular medical texts of the period available to scholars.

See also: Medical Communication: Linguas Francas; Medical Discourse: Developments, 16th and 17th Centuries.

Bibliography

Burton T L (ed.) (1998–1999). Early English Text Society 311–312: Sidrak and Bokkus: A parallel-text edition from Bodleian Library, MS Laud Misc. 559 and British Library, MS Lansdowne 793. Oxford: Oxford University Press for the Early English Text Society.
Cadden J (1993). Meanings of sex difference in the Middle Ages: medicine, science and culture. Cambridge: Cambridge University Press.
Crisciani C (2000). ‘Teachers and learners in scholastic medicine: Some images and metaphors.’ History of universities 15, 75–101.
Crombie A C (1995). ‘Commitments and styles of European scientific thinking.’ History of science 33, 225–238.
Crossgrove W (1998). ‘Introduction.’ In Crossgrove W et al. (eds.). 81–87.
Crossgrove W, Schleissner M & Voigts L E (eds.) (1998). Early science and medicine: a journal for the study of science, technology and medicine in the pre-modern period 3(2). Special issue: The vernacularization of science, medicine, and technology in Late Medieval Europe.
Demaitre L E (1998). ‘Medical writing in transition: Between ars and vulgus.’ In Crossgrove W et al. (eds.). 88–102.
French R (2003). Medicine before science: The business of medicine from the Middle Ages to the Enlightenment. Cambridge: Cambridge University Press.
Getz F (1998). Medicine in the English Middle Ages. Princeton, NJ: Princeton University Press.
Holbrook S E (1998). ‘A medical scientific encyclopedia ‘‘Renewed by goodly printing’’: Wynkyn de Worde’s English De proprietatibus rerum.’ In Crossgrove W et al. (eds.). 119–156.
Jones P M (1989). ‘Four Middle English translations of John Arderne.’ In Minnis A J (ed.) Latin and vernacular: Studies in late-medieval texts and manuscripts. Cambridge: D. S. Brewer. 61–89.
Jones P M (1998). ‘Thomas Fayreford: An English fifteenth-century medical practitioner.’ In French R et al. (eds.) Medicine from the Black Death to the French disease. Aldershot: Ashgate. 156–183.
Keiser G (1998). A manual of the writings in Middle English 1050–1500 10: Works of science and information. New Haven: The Connecticut Academy of Arts and Sciences.
Lawn B (ed.) (1979). The prose Salernitan Questions, edited from a Bodleian manuscript (Auct. F. 3. 10): An anonymous collection dealing with science and medicine written by an Englishman c. 1200, with an appendix of ten related collections. London: Oxford University Press.
Lerer S (1985). Boethius and dialogue: Literary method in The consolation of philosophy. Princeton, NJ: Princeton University Press.
McVaugh M R (ed.) (1997). Guigonis de Caulhiaco (Guy de Chauliac): inventarium sive chirurgia magna 1: Text. Leiden, New York, and Cologne: Brill.
McVaugh M R & Siraisi N (eds.) (1990). Osiris 2nd series 6: Renaissance medical learning: Evolution of a tradition. Philadelphia: University of Pennsylvania Press.
Middle English Medical Texts (2005). CD-ROM. Compilers Irma Taavitsainen, Päivi Pahta and Martti Mäkinen. With software by Raymond Hickey. Amsterdam and Philadelphia: Benjamins.
Minnis A J (1979). ‘Late medieval discussions of compilatio and the rôle of the compilator.’ Beiträge zur Geschichte der deutschen Sprache und Literatur 101, 385–421.
Minnis A J, Scott A B & Wallace D (eds.) (1988). Medieval literary theory and criticism c. 1100–c. 1375: The commentary tradition (rev. edn.). Oxford: Clarendon Press.
Ottosson P-G (1984). Scholastic medicine and philosophy: A study of commentaries of Galen’s Tegni (ca. 1300–1450). Napoli: Bibliopolis.
Seymour M C et al. (eds.) (1975/1988). On the properties of things: John Trevisa’s translation of Bartholomaeus Anglicus De Proprietatibus Rerum, vol. 1. Oxford: Oxford University Press.
Siraisi N (1990). Medieval and Early Renaissance medicine: An introduction to knowledge and practice. Chicago and London: University of Chicago Press.
Siraisi N (2001). Medicine and the Italian universities, 1250–1600. Leiden: Brill.
Taavitsainen I (2001). ‘Middle English recipes: Genre characteristics, text type features and underlying traditions of writing.’ Journal of Historical Pragmatics 2, 85–113.
Taavitsainen I (2004). ‘Transferring classical discourse conventions into the vernacular.’ In Taavitsainen I & Pahta P (eds.). Medical and scientific writing in Late Medieval English. Cambridge: Cambridge University Press. 37–72.
Taavitsainen I & Pahta P (1998). ‘Vernacularisation of medical writing in English: A corpus-based study of scholasticism.’ In Crossgrove W et al. (eds.). 157–185.
Tavormina M T (ed.) (forthcoming). MS Trinity College Cambridge R.14.52: A medieval medical manuscript, its language and scribe. Tempe: Arizona Center for Medieval and Renaissance Studies.
Voigts L E (1984). ‘Medical prose.’ In Edwards A S G (ed.) Middle English prose: A critical guide to major authors and genres. New Brunswick, NJ: Rutgers University Press. 315–335.
Voigts L E (forthcoming). Electronic Thorndike Kibre Project (eTK): L. Thorndike and P. Kibre Catalog of Medieval scientific writings in Latin (1963, 1965, 1968).
Voigts L E & Kurtz P D (eds.) (1998). Scientific and medical writings in Old and Middle English: An electronic reference. Ann Arbor: University of Michigan Press [(eVK) CD-ROM.].
Wogan-Browne J, Watson N, Taylor A & Evans R (eds.) (1999). The idea of the vernacular: an anthology of Middle English literary theory, 1280–1520. Exeter: University of Exeter Press.

Medical Discourse: Hedges K Hyland, University of London Institute of Education, London, UK © 2006 Elsevier Ltd. All rights reserved.

Introduction Hedging is the expression of tentativeness and possibility, and it is central to academic writing, where statements are rarely made without subjective assessments of their reliability and where unproven propositions need to be presented with caution and precision. In medical writing, hedges play a critical role in gaining ratification for claims from a powerful peer group by allowing writers to present statements with appropriate accuracy, caution, and humility, expressing possibility rather than certainty and prudence rather than overconfidence. In a context where the accreditation of knowledge depends on the consensus of the research community and the need to evaluate evidence, to comment on its reliability, and to avoid potentially hostile responses, expressions such as might, perhaps, and possible can contribute to gaining the acceptance of research claims. Medical papers provide interesting and useful examples of the use of hedging in scientific discourse because they relate to matters impinging on significant issues of our lives. In the past decade, a growing literature has demonstrated the clear pragmatic importance of hedging as a resource for expressing uncertainty, skepticism, and open-mindedness about one’s propositions. Studies have revealed the significance of hedging in textbooks, scientific letters, science digests, and academic research papers (Hyland, 1998a, 1998b, 2000; Myers, 1989). Hedges have also been shown to play a major role in medical discourses by Salager-Meyer (1994), Skelton (1997), and Adams Smith (1984). These studies indicate how hedges represent a writer’s attitude within a particular context and help negotiate the perspective from which research conclusions can be accepted. In this article, I will briefly discuss the importance and role of hedging in academic medical papers and sketch a framework for understanding the functions that hedges perform for academic writers.

Hedging and the Construction of Knowledge Hedging has been a subject of interest to linguists since Lakoff (1972) first used the term to describe ‘words whose job it is to make things more or less fuzzy.’ Essentially, it represents an absence of certainty and is used to describe ‘any linguistic item or strategy employed to indicate either a) a lack of commitment to the truth value of an accompanying proposition or b) a desire not to express that commitment categorically’ (Hyland, 1998a: 1). The importance of hedging lies in the fact that transforming claims into accredited knowledge requires reader acceptance and therefore linguistic and rhetorical means of persuasion. Academic knowledge is now generally acknowledged to be a social accomplishment, the outcome of a cultural activity constituted by agreement between a writer and a potentially skeptical discourse community. As a result, the research paper is a rhetorically sophisticated artifact, crafted to display a careful balance of factual information and social interaction, set out using community-recognized and -accepted argument forms. Academic writers need to make the results of their research not only public, but also persuasive, and this involves them carefully weighing claims for the significance and plausibility of their work against the convictions and expectations of their readers. Successful academic writing, in other words, involves authors evaluating their material and acknowledging alternative views because all statements require ratification. This, at least in part, depends on the appropriate use of various rhetorical and interactive features, of which hedges are among the most important. Examples give an idea of how this is often achieved: they show how hedges can modify the factual status of a proposition to suggest that it is based on imperfect reckoning rather than certainty. The hedges in such examples indicate interpretations and allow writers to convey their attitude about the truth of the statements they accompany, thereby presenting unproven claims with prudence and softening categorical assertions. More than this, however, hedges open a discursive space in which readers are able to dispute the writer’s arguments and interpretations, thereby enabling writers to take a position with respect to an audience as well as to facts.

The Functions of Hedging Although medical research scientists gain and retain their academic credibility by securing readers’ acceptance of the most significant claims that their findings will support, such strong claims are risky. Few assertions in the scientific world survive for long before being replaced by those with greater explanatory efficacy, and there is always the possibility that claims will contradict or challenge the beliefs of one’s peers. The fact that readers are able to reject claims means they have an active and constitutive role in how writers construct them, and it is the writer’s anticipation of this potential opposition that makes hedging central to academic writing. This opposition can be divided into two types, and claims must address both to stand any chance of success. First, claims must correspond with what is believed to be true in the world. Hedges here are ‘content-oriented’ and concern the relationship between a proposition and a representation of reality. At a further level of delicacy, we can distinguish the obligation on writers to present their claims as accurately as possible and the equally pressing need to anticipate the dangers of overstatement. Second, a proposition that could be presented categorically from an objective perspective may be explicitly hedged because of reader considerations. Thus, ‘reader-oriented hedges’ incorporate an awareness of interpersonal factors. It is often difficult to pin down precisely what the writer intends, as indeterminacy is a widely recognized feature of modal semantics. However, although there is inevitably some overlap, it is useful to conceptually distinguish these three broad functions to understand the roles that hedges play in academic discourse and writers’ motivations for employing them (Hyland, 1998a).

First, hedges allow writers to express propositions with greater precision in areas often characterized by rapid reinterpretation. Hedging here is an important means of attesting to the degree of precision or reliability of a claim and of stating uncertain claims with appropriate caution.
In medicine, writing is necessarily a balance of fact and evaluation as the writer tries to present information as fully, accurately, and objectively as possible. Thus, writers often say ‘X may cause Y’ rather than ‘X causes Y’ to specify the actual state of knowledge on the subject. Hedges here distinguish the actual from the potential, or the known from the inferential, and imply that a proposition is based on the writer’s plausible reasoning rather than certain knowledge. Readers are expected to understand that the proposition is true as far as can be determined.

The second reason for using hedges concerns the writer’s desire to anticipate the possible negative consequences of being proved wrong and the eventual overthrow of a claim (Hyland, 1998a; Salager-Meyer, 1994). Academic reputations are built on making novel, interesting, and plausible contributions to knowledge, which means stating the strongest claims possible for any particular evidence. However, writers also need to protect themselves against the hazardous consequences of overstatement. Hedges here help writers avoid personal responsibility for statements in order to protect their reputations and limit the damage that may result from categorical commitments. This usage follows Lakoff in associating hedges with ‘fuzziness,’ but I am using the term fuzziness here not to describe connections between propositions, but the ways that hedges can blur the relationship between a writer and a proposition when referring to speculative possibilities. One way writers achieve this is to express claims using evaluative that structures with modal devices and non-agentive subjects (Hyland and Tse, 2005). Most commonly, this involves the use of dummy it (see Example 7) or ‘abstract rhetors,’ which attribute judgments to inanimate sources (Example 8). In the medical sciences, writers may hedge in this way because of preliminary results, small samples, doubtful evidence, uncertain predictions, imperfect measuring techniques, and other uncertainties in the experimental process or case histories.

Finally, hedges can be ‘reader-oriented’ in that they contribute to the development of a writer–reader relationship, addressing the need for respect and cooperation in gaining readers’ ratification of claims. Research writers must always consider both the reader’s role in accrediting knowledge and the need to conform to the expectations of the medical research community concerning limits of self-assurance. Most importantly, categorical assertions leave no room for dialogue and are inherently face-threatening as they suggest that the arguments need no feedback, thus relegating the reader to a passive role.
By explicitly referring to themselves as the source of the claim, often with a cognitive or discourse verb, writers are able to mark the statement as one possible position, an alternative view rather than a definitive statement of truth, and thereby indicate a personal opinion awaiting verification. Examples 10 and 11 provide illustrations. Here, hedges appeal to readers as intelligent colleagues, capable of deciding about the issues, and indicate that statements are provisional, pending acceptance by one’s peers. Thus, we can see that hedges help protect the writer against possible wrong interpretations or faulty results, but they also allow writers to demonstrate an awareness of the reader’s possible alternative viewpoint, displaying the conditional nature of statements out of strategic respect for readers and indicating the degree of confidence that the writer judges it prudent to attribute to statements. In sum, though there is considerable overlap in these functions, hedging looks three ways: toward the proposition, toward the writer, and toward the reader.

Extent and Distribution of Hedging This multifunctional importance means that academic medical writing is extensively hedged. The essentially rhetorical nature of these functions, however, means that there are considerable variations in the distribution of hedges across different genres, with greater concentrations in more argumentative and persuasive types of texts. In a study of different papers in the British Medical Journal, for instance, Adams Smith (1984) found that editorials and review articles are more heavily hedged than research papers and medical case reports. When qualifications are omitted, the result is both greater certainty and less professional deference, reflecting a different attitude to information and readers. Thus, genres that present information as accredited knowledge, such as undergraduate textbooks (Hyland, 2000) and popular science articles (Fahnestock, 1986), contain far fewer hedged propositions. The authors of such genres do not have to persuade an expert audience of a new interpretation or anticipate the consequences of being proved wrong because most claims already have factual status. Similarly, hedges are differently distributed across different parts of each genre, with their use especially marked in the more discursive sections. They are particularly prevalent in the introduction and discussion sections of research articles and the comment section of clinical case notes, for example (Adams Smith, 1984; Salager-Meyer, 1994). It is here that writers are seeking to establish the relevance and significance of their research and the plausibility of their interpretations, and it is therefore here that they need to be more tentative and circumspect in their assertions. There is also evidence that the types of hedges employed differ as well, with content-oriented hedges dominating the methods and results sections and more reader-oriented forms being found in introductions and discussions (Hyland, 1998a; Salager-Meyer, 1994). These patterns allow writers to convey approximations when discussing symptoms and methods and to evaluate, interpret, and comment on the evidential status of their information in the discursive sections.

Conclusion In this article, I have discussed some of the contextual factors that shape the ways writers say what they believe and want others to accept. I have tried to show that the expression of doubt and possibility is central to the negotiation of claims and that what counts as effective persuasion is influenced by the fact that evidence, observations, data, and flashes of insight must be shaped with due regard for the nature of reality and their acceptability to an audience. See also: Accessibility Theory; Corpora; Corpus Studies: Second Language; Genre and Genre Analysis; Medical Discourse and Academic Genres.

Bibliography Adams Smith D (1984). ‘Medical discourse: Aspects of author’s comment.’ English for Specific Purposes 3, 25–36. Fahnestock J (1986). ‘Accommodating science: The rhetorical life of scientific facts.’ Written Communication 3(3), 275–296.

Hyland K (1998a). Hedging in scientific research articles. Amsterdam: Benjamins. Hyland K (1998b). ‘Boosting, hedging and the negotiation of academic knowledge.’ TEXT 18(3), 349–382. Hyland K (2000). Disciplinary discourses: Social interactions in academic writing. London: Longman. Hyland K & Tse P (2005). ‘Hooking the reader: A corpus study of evaluative that in abstracts.’ English for Specific Purposes 24(2), 123–129. Lakoff G (1972). ‘Hedges: A study in meaning criteria and the logic of fuzzy concepts.’ Chicago Linguistic Society Papers 8, 183–228. Myers G (1989). ‘The pragmatics of politeness in scientific articles.’ Applied Linguistics 10(1), 1–35. Salager-Meyer F (1994). ‘Hedges and textual communicative function in medical English written discourse.’ English for Specific Purposes 13(2), 149–170. Skelton J (1997). ‘The representation of truth in academic medical writing.’ Applied Linguistics 18(2), 121–140.

Medical Discourse, Illness Narratives L-C Hydén and P H Bülow, Linköping University, Linköping, Sweden © 2006 Elsevier Ltd. All rights reserved.

Introduction During the past decades, the health field has become a battlefield where alternative concepts of illness, health, and treatment compete with the dominant traditions of Western scientific medicine. New health care practices such as complementary and alternative medicine have gained status. Etiological factors have come to include ‘lifestyle’ factors, as in the relationship between smoking and lung cancer or the relationship between stress and certain coronary diseases. Patients have access today via the Internet to sources of information that were previously unavailable to them and can approach their physicians with more knowledge and new demands. As part of these changes, researchers have become interested in questions having to do with the relationship between, on the one hand, the patient and his or her illness and, on the other hand, traditional scientific medicine and its concept of diseases. In this context, the study of illness narratives has gained prominent status. Research on the forms and functions of illness narratives has expanded rapidly since the early 1980s. Its development is marked by diversity in the theoretical perspectives and methods that are brought to bear on a variety of problems.

The study of illness narratives is a wide field encompassing interview studies of patients’ narratives of illnesses, studies of the way that narratives are used in the interaction between medical staff and patients, and clinical studies of how narratives could be used by medical professionals in encounters with patients. Another field of study that has emerged is the study of written and published illness narratives; Hawkins (1993) called these pathographies. Theoretically, researchers have been interested in the narrative as an opportunity to study the subjective experience of illness, the way in which identity is reconstructed narratively in the face of illness, and how the institutional context affects the relationship between medical professionals and patients. This is especially interesting in terms of power and the patient’s ability to make his or her voice heard in the medical encounter. Thus, researchers focus on narrative structure and coherence, as well as on the functions of narratives in various social contexts.

Two Voices A central problem area for many studies of medicine and illness narratives is the relationship between what is called the voice of the lifeworld and the voice of medicine. These concepts were introduced by Elliot Mishler in his book The discourse of medicine (1984). Mishler pointed out that in the discourse of ordinary medical interviews it was possible to discern two

Cain C (1991). ‘Personal stories: Identity acquisition and self-understanding in Alcoholics Anonymous.’ Ethos 19, 210–253. Capps L & Ochs E (1995). Constructing panic: The discourse of agoraphobia. London: Harvard University Press. Charmaz K (1983). ‘Loss of self: A fundamental form of suffering in the chronically ill.’ Sociology of Health and Illness 5, 168–195. Charon R (2001). ‘Narrative medicine: A model for empathy, reflection, profession, and trust.’ Journal of the American Medical Association 286, 1897–1902. Clark J A & Mishler E G (1992). ‘Attending patient’s stories: Reframing the clinical task.’ Sociology of Health and Illness 14, 344–371. Frank A W (1995). The wounded storyteller: Body, illness, and ethics. Chicago: The University of Chicago Press. Frank A W (1997). ‘Enacting illness stories. When, what, and why.’ In Nelson H L (ed.) Stories and their limits: Narrative approaches to bioethics. New York: Routledge. 31–49. Greenhalgh T & Hurwitz B (eds.) (1998). Narrative based medicine: Dialogue and discourse in clinical practice. London: BMJ Books. Hawkins A (1984). ‘Two pathographies: A study in illness and literature.’ The Journal of Medicine and Philosophy 9, 231–252. Hunter K M (1991). Doctors’ stories. Princeton, NJ: Princeton University Press. Hydén L-C (1997). ‘Illness and narrative.’ Sociology of Health and Illness 19, 48–69. Kleinman A (1988). The illness narratives: Suffering, healing, and the human condition. New York: Basic Books. Langellier K M (2001). ‘“You’re marked”: Breast cancer, tattoo, and the narrative performance of identity.’ In Brockmeier J & Carbaugh D (eds.) Narrative and identity: Studies in autobiography, self and culture. Amsterdam: John Benjamins. 145–184. Mattingly C (1994). ‘The concept of therapeutic “emplotment.”’ Social Science and Medicine 38, 811–822. Mattingly C (1998). Healing dramas and clinical plots: The narrative structure of experience. New York: Cambridge University Press. Mishler E G (1984). The discourse of medicine: Dialectics of medical interviews. Norwood, NJ: Ablex Publishing Company. Morson G S (1995). Narrative and freedom: The shadows of time. New Haven, CT: Yale University Press. Young K (1989). ‘Narrative embodiments: Enclaves of the self in the realm of medicine.’ In Shotter J & Gergen K J (eds.) Texts of identity. London: Sage Publications. 152–165. Young K (1997). Presence in the flesh: The body in medicine. Cambridge: Harvard University Press.

Medical Discourse: Non-Western Cultures J Clarac de Briceño, University of the Andes, Mérida, Venezuela © 2006 Elsevier Ltd. All rights reserved.

The term ‘shaman’ has its etymological root in the word ‘saman’ from a Manchurian language, meaning ‘one who knows.’ Originally the term was used in various areas of northern Asia and was later brought to the Americas, to the southeastern part of India, and to Australia and Africa. According to Mircea Eliade, shamanism or the socially specific role of the shaman was brought to the Americas by the first eastward waves of immigration from Asia. The origin of the shaman in America (the continent in which the role has been most studied) is well documented (see Eliade, 1974: 266; Lowie, 1934: 183–188). The role of the shaman is not limited to individual practice. It is currently portrayed in the literature as a category of social activity with a legitimate function as a health care phenomenon. Classifications include legal aspects, health program functions, adaptation in the migration of ethnic groups, and shamanic practice in the integration of tribal peoples and their customs into the social mainstream. As such the role of the shaman has become an interdisciplinary study. Elaborations in the area of anthropology and ethnopsychiatry have focused on therapeutic aspects and role models which include the magical or nonlocal physical aspects of shamanic activity. Authors of such studies include Geza Roheim (1950), Mircea Eliade (1951), Claude Lévi-Strauss (1958), Alfred Métraux (1967), George Devereux (1970), and others who treat aspects of hypnotic suggestion or trance as therapy.

Mircea Eliade: Ancient Techniques for Producing Ecstatic States Eliade is the great classic authority on shamanism, and is recognized as such by those who wrote after him. His magnum opus (Shamanism and the ancient techniques of ecstasy, 1951) is the first attempt to comprehensively approach the subject from the point of view of a religious historian. His sources include works on what is described as classical shamanism in Siberia, the Americas, Indonesia, and Oceania. Mystical experience is central to his theme, in which he emphatically rejects the proposition of such experience as neurotic, as has been held by some psychiatrists and ethnopsychiatrists. His concept of techniques for producing ecstasy is presented as being in the ancient magical tradition of the fire element and the symbol of the veil. He treats initiation into the mysteries in its various forms as well as the so-called secret societies which are concerned with such rites. The token death and resurrection of the candidate, including seclusion, the use of the funeral mask to indicate the passage of the shaman into the realms of the dead, token burial, and the descent into the nether regions called hells are also discussed. The hypnotic dream or trance and the obstacles which must be overcome to reach this state, such as the abandonment of previous lifestyles so that curative powers can be achieved, and the intonation of songs which tell of the adventures of the way into the spirit world and the spheres of the gods, are elaborated as part of shamanic experience everywhere.

Shamanic Trance Induced suggestive states are widely employed for both lay and religious purposes in all cultures, particularly in Native American and mestizo Latin American cultures. The ancient Greek Dionysian mysteries and the European transition into Christianity used trance and mesmeric techniques which are still in wide use. The so-called ‘epidemics of possession,’ such as outbreaks of demonic possession and the witch-hunts of the Middle Ages and Renaissance, are later developments. Judeo-Christianity has a tendency to reject religious trance as pythonic and for this reason all devotees of the current autochthonous religions are condemned as superstitious in the Celto-Germanic persuasions. This condemnation extends to members of the Catholic priesthood who have shown a tendency to stray toward induced states, although some religious practitioners have been given lay recognition and are treated as sainted by various cults. Trance is variously delineated in European languages as possession, hysteria, dissociation, altered consciousness, or elevated states, all of which may be researched as multidisciplinary anthropology.

The Deployment of Symbol in Shamanic Healing Lévi-Strauss (1961) uses the term ‘symbolic efficiency’ for effects produced by shamanic therapy, a term opposed by Jean Chiappino, who claims that the significance of actual physical treatment is underplayed (Chiappino, 2003: 39). Symbolic efficiency concerns psychological effects produced at the symbolic level as an isolated holon. Trance and the shamanic song are outstanding in psychological cure and include group participation. Contemporary anthropologists point out that there are mystical and mythical aspects in all therapeutic cure including the most scientific, and that both mysticism and the archetypal roots of myth constitute the paradigm and provide the dynamics required for all healing.

Shamanism and Neurophysiology Insofar as objective efficiency and scientific proof are concerned, since 1975 neurophysiology has demonstrated the existence of endogenous processes that inhibit nociceptive sensitivity and produce opiate substances that inhibit pain. Endorphins secreted by the pituitary gland and by cells in the brain act on pain receptors in the entire central nervous system so that pain is reduced. The endogenous interaction between receptors is accomplished by peptides called enkephalins, such as endorphin and dynorphin. Researchers conclude that there are processes that trigger receptors at a substrate level which also contribute toward inhibiting the transmission of pain information throughout the central nervous system. The human body is thus seen to be naturally equipped to reduce pain sensation. The neurophysiologist James L. Henry of the Department of Physiology and Psychiatry at McGill University has remarked that symptoms commonly associated with such states may include the activation of endorphins, and that it seems reasonable to suppose that this interaction involves a particular mechanism with an extension or global effect within the brain. Two shamanic symptoms appear outstanding to Henry: (a) detachment or separation from the physical world, that is to say dissociation which results in euphoria or ecstasy, and (b) immunity to physical pain, even the most severe. He has found that in states of altered consciousness endorphins secreted by the pituitary gland are present in the blood. This research opens the way to a multidisciplinary approach for anthropologists, neurophysiologists, and psychiatrists.

Shamanism and Ethnopsychiatry Geza Roheim, an outstanding ethnopsychiatrist, has concerned himself with an ontogenic theory of culture. Largely influenced by Sir James Frazer (The Golden Bough, 1890) and his unilineal evolutionary concepts, Roheim considers the shaman as a psychiatric pioneer: ‘‘the scientific precursors who held that the human being was capable of matching his

strength against the world’’ (Roheim, 1982: 80–81). He categorizes the shaman as a schizophrenic. After reviewing the case of a schizophrenic patient at Worcester State Hospital, Worcester, Massachusetts (1938–1939), he drew parallels between shamanic fantasies and hallucinations observed in Oceania and in the Yumas of California, as well as those of the Australian aborigines. He has said that the attitudes and behaviors of those concerned with magical practices are related to symptoms observed in states of neurosis. His theories of ontogenic culture are based on the effects of prolonged infancy in the human being and the trauma produced by the exaggerated dependence which is generated upon the relationship with the mother, making it impossible for the child to experience pleasure in sexual interaction. Culture then becomes a permanent search for a mother substitute because the mother archetype has not been individuated. Magical fantasy and schizophrenia are related although the former deals with an empirical world while schizophrenia deals only with substitutes. Special emphasis is placed on oral magic, which is defined as healing through suction, and this is particularly applicable to the therapeutic work of the shaman, whom Roheim describes as one who assaults the mother’s breast, the patient in this analogy being the mother, or the person who has never individuated the mother in the psyche, and the shaman being the suckling child who is the healer.

George Devereux: ‘Sacred Disorder’ Devereux proposes a concentrated psychotherapy which will result in a cure rather than in mere accommodation, which is according to him the usual outcome. As an ethnologist he insists that any given culture should not accept as normal a situation which is virulently pathological. At the same time, he suggests that the Western psychiatrist should not impose his views on cultures which are not Western. Devereux is somewhat contradictory since he recommends culture-specific values while treating the shaman as a universal archetype. ‘Sacred disorder’ is classified as neurosis or shamanic disorder and this is likened to schizophrenia. He affirms that it is diagnostic error to classify individuals as neurotic or psychotic who because of their ethnic conditioning behave in a manner that is considered pathogenic in Western psychology. At the same time, he discusses shamanism in terms of psychopathological disorder, claiming that the shaman is severely neurotic bordering on psychotic in temporary remission, and one who is often in conflict with his own culture. He goes on to say that the shaman

must himself be cured because uncured he is a social nuisance and that the only difference between a shaman and a psychotic is that the shaman’s conflicts are expressed at the level of a cultural model rather than at the level of unconscious idiosyncrasy. The curing of one shaman by another cannot be considered actual but is rather a restructuring of conflicts and symptoms according to their own convention without a conscious understanding of the nature of these conflicts and also without achieving any sort of true sublimation. Anthropologists claim that Devereux takes an ethnocentric position because he believes that the shaman must sublimate in order to be cured (as Freud points out). The question remains as to why this Western value should be applied in a non-Western community which sustains a belief in a shaman who has been cured or indoctrinated into Western processes. Ethnological aspects in a situation where there is cross-cultural exchange cease to be meaningful. Furthermore, Devereux fails to recognize the importance of the consensus of the cultural group, which has always been an anthropological consideration. The shaman cannot be limited to the mores and values of a marginal group in any given society or culture since adaptations have been made for reasons of efficiency in the protection of his own people against the constant inroads of Western value systems. Very often the non-Western group finds itself surrounded by political and economic forces which have invaded the community without its consent. This was often the case in the latter part of the 20th century in Latin American countries, where there are large unassimilated indigenous populations. Devereux’s concept of structuring mental disorder in accord with universal cultural mores is not practical since under actual conditions any cognitive apparatus employed must cover each particular ethnic difficulty. This cannot be done by resorting to the mores of an assumed metaculture.
Such an a priori model may well bear an inadequate correlation to the actual state of knowledge with respect to human potential as it exists in the specific culture and will be biased toward a Western viewpoint. Eliade and others reject the concept that magical therapeutic events are pertinent to categories of mental disorder.

The New Shaman It has been thought that shamanic lore is culturally specific and, since it is steeped in tradition, not affected by Western thought. This is not the case, since in the second half of the 20th century, communities which rely on the shaman had their geopolitical and cultural space invaded by large dominant cultures. Thus, in

order for the shamanic culture to prosper in the face of changes in political thought and modes of health care, the shaman has had to adopt methods of defense and to transform his role in order to harmonize with the cultural milieu propagated by the mass media. The first transformation is in the community proper, which adopts new techniques that have been employed by other shamanic communities. The new shaman is fashion conscious and is aware of communities which speak various languages. In Venezuela, voodoo from the Afro-Caribbean and Brazil mingles with the rites employed by the priestesses of Maria Lionza. A spiritual brotherhood is inaugurated and is maintained by travel as well as through electronic media. The shaman leaves his own community to impose his ideas and methods on other communities, including sections of the mainstream culture (which increasingly makes use of his knowledge). A feedback loop becomes established between cultures, particularly in Latin America. Thus a defense is established not only of shamanic lore but also of the traditions within the whole culture. The shaman can operate within the legal systems imposed upon medical services, which impose licensing and qualifications required for practice. In order to maintain social dignity within the body of a multiethnic and pluricultural society and to protect himself from aggression, the shaman has linked his role with organizations that support indigenous cultures at national and international levels. Such organizations acquire an ever-increasing amount of power and influence. Cults of possession, such as that of Maria Lionza, which has been popular in Caracas since the 1970s, adopt shamanic techniques and the leaders often use the title of ‘shaman.’ These cults proliferate throughout Latin America and Africa, although these new aspects of shamanism have so far been studied only superficially. 
The phenomenon which has come to be widely designated as possession uses trance-inducing techniques which stem from shamanic traditions. However, the various cults differ from the shamanic tradition, since the shaman uses the trance as a means of transcending himself so that he may enter the spirit world and become a medium for higher powers of inspiration. Inversely, cults of possession do not attempt to transcend but rather to bring the higher powers down or to invoke communication so that the spirit enters the body or inspires the mind of the shaman or medium.

Shamanism and the Cult of Possession In the cult of possession, which is again a distinct form of shamanism, any manifestation of disorder is

restructured so that communication is possible at a new level. Most psychiatrists consider any sort of possession to be psychopathological and related to hysteria. François Laplantine (1975: 23) defines possession as ‘‘psychosomatic conversion of an unconscious conflict into a symptom.’’ He understands possession as a realization of a conversion which is not hysteria but its contrary: ‘‘it expels the symptom and fills the body of the entranced dancer with a social and cultural language.’’ Because hysteria denotes a sort of suffering, he further denotes possession as ‘‘the fervent celebration of myth.’’ The ethnologist Louis Price-Mars considers possession to be a crisis which he terms ‘teolepsia,’ ‘‘a metamorphosis or normal psychological state which reproduces aspects and mannerisms of the gods within the context of a dramatic personification.’’ Like Laplantine he insists on the distinction between religious possession as it is seen in Africa and the Americas, and possession as hysteria, which is clinical. Hysteria is a common disorder while possession implies religious rites of some sort. Possession can also be described as a semiotic sign with the signified being the spirit or the god which manifests in a ceremonial context with percussion instruments, dance, song, aromatics, tobacco smoke, candles, and other appurtenances of the sacred.

Roger Bastide: Adorcism and Exorcism, Shamanism A and Shamanism B Bastide (1972: 72) makes the distinction between trances which occur through possession as adorcism, which is the imposition of a new soul or spirit into the person, and exorcism, which is the extraction of a spirit which is causing harm. Exorcism is a widely practiced shamanic technique (in some regions of the Americas called ‘bad air exorcism’) and consists of suction or a sucking action accompanied by the sounding of a sacred maraca or rattle made from a gourd filled with seeds or stones. Bastide describes shamanism A as adorcism, the possessive trance which transports the shaman to the spirit world to recover the soul of the patient which has been stolen by spirits. Exorcism is designated as shamanism B. The shaman (priest or medium) is capable of performing both A and B. Bastide includes what he calls the ‘savage trance’ and also the ‘baptized trance.’ This is a distinction which reflects attitudes of the African slave when presented with shamanic practices, the traditional and the introduced. The traditional savage entered into an uncontrolled trance which was involuntary and without any kind of social projection; the baptized trance was the result of a long and arduous

training which enabled the recipient or initiate to become socially useful in a new situation.

Classifications of the Trance Trance can be classified either as disassociative, or as euphoric or theatrical. Disassociative trance is asocial in character and indicative of biological or cultural states which are pathological, or again may be the first indications of a tendency toward shamanic initiation. The euphoric trance is achieved by the shaman after initiation, and permits a cultural channeling of the disassociative, enabling the practitioner to invoke the trance at will and give it a therapeutic character, which is expressed both biologically and psychologically in a socially therapeutic manner (biopsychosocial). The phenomenon is endogenous or introverted and is culturally accepted. It may be brought on with or without external stimuli such as hallucinogenic plants, tobacco rituals, incense, candles, flute music, the drum, the maraca, song, or prayer, and may include the employment of sacred objects such as cords, collars, crucifixes, sacred images, dolls, necklaces, capes, or gunpowder, and massage and hypnotic manipulations of various sorts. Euphoric trance is socially cultivated and is much venerated. Shamans who are in charge of these rites are highly respected and adulated with varying degrees of awe in respect of the sacred character of their office. Nevertheless, the shaman, or mojan as he is called in Spanish, or the American piache or priestess does not always truly ‘go under’ or enter into the trance but, being familiar with each step of the process, is able to imitate the behavior as part of being a therapist without the patients or the faithful being aware of any guile. Such histrionics may well be classified as theatrical trances and often imitate the euphoric trance perfectly. The transformations of the shaman are various, both in trance states and out of them, and these variations are often changed by apprentices, adepts, and priestesses, who are all aware of global changes, political and ethnic. 
Thus the student or the investigator should not attempt to categorize fixed shamanic practice, but rather observe the ever-changing scenario which is even now unfolding, and note carefully one’s own reactions to these new sociocultural conditions. Western medicine, including biomedicine, has shown a tendency to deprecate traditional or natural medicine. The shaman and folk doctors are often ignored or considered as quacks who lack education and information. Some scholars, such as O. Carl Simonton, think, rather as do the neoshaman and the folk medicine practitioner, that the devotion to

technology and the dogmatic principles of the medical profession have lost the sense of what it means to be human. Medical science and medical schools based upon mechanistic or material principles have become clinically impersonal, and ignore the cultivation of a gestalt approach. This is nothing new; Plato put it quite clearly: ‘‘So neither ought you attempt to cure the body without the soul, and this is the reason why the cure of many diseases is unknown to the physicians of Hellas, because they are ignorant of the whole, which ought to be studied also; for the part can never be well unless the whole is well’’ (Plato, Charmides, 156–157).

The Shaman, Myth, and the Cosmos The shaman in all forms and manifestations is respected and highly esteemed in his own community and in other similar ones as well as in mainstream society. In some cultures the shaman is seen as the representative on earth of the Cosmic Creator and so plays the role set out in the sagas for the culture hero at the conception of the world. In trance he enacts the creation of human beings and the relations which they had with the spirit world at the beginning of time. Because the shaman has the ability to communicate with spirits or motivating forces he is able to perceive more clearly the pathogenic agents which debilitate his patients and is able to neutralize them. Patients or the faithful are led to confront the psychological dangers to which they are exposed during the sessions at which they are cured. Shamanic societies interact through their myths with the whole region of cosmic space, where there are various heavens overarching various mountains and sacred lakes. In the Himalayas the gods of the people continue to dwell, as do the spirits they command and those of the dead. Places such as those known as the Houses of Thunder are domains in which the Sun and the Moon move in their roles as mythic heroes. In the Andes there is the home of the double rainbow, male and female, which is the symbol of the bridge across the chasms of the many cosmic spaces which yawn between the heavens and the Earth. These the shaman negotiates as he or she passes from one space to another or passes to the Houses of the Grandfathers to visit the Old People. There the first humans, who have continued to exist from the beginning of time, find their abode and thence upon occasion descend to dance with their offspring. There live their companions, the mythic animals and birds, the eagle, the anaconda, the rainbow serpent, the quetzal, and the bear. This belief in the Old Ones is common throughout the Americas from south to north. Viewing this area

as a cosmic space, according to Eliade, fire has a primordial role, particularly in South America. However, earlier research has shown that in South America the elements of air and fire take a preponderant role in the cosmology and shamanic ritual. Fire is described as a glimmering, or as a beam or flame seen to exist within the air, the primordial element from which emerged the first gods who came to Earth. The primordial elements are of the same physical constitution as the gods, the spirits, and the enchanters with which the shaman works and whose language he has learned as part of his initiation. Through this knowledge the shaman brings rain, commands the thunder, and calms the storm. The shaman knows the sacred language because he or she has become familiar with the primordial sounds, those of nature which cannot tolerate the noises of human beings. It is for this reason that the shamans of the Andes recommend silence when walking in the highland meadows and beside the lakes. The shamanic veil is also conceived as a cosmic element and has become the object of research, although not always mentioned as such. There is some discussion of spiritual helpers or allies with which the shamans work and which are often described as spirits of nature. The various airs, the spirits of the water, of the rivers, the spirits of the Sun and Moon and of the rainbow are all called upon. In the northern part of South America, in Colombia and Venezuela, metaphysical persons fly like eagles when they come from the realms of the spirits and like vultures when they fly from the lakes of the dead. There is a well-defined relationship between the shaman and certain birds. In the Venezuelan Andes, this alliance has been portrayed as a winged stone, usually of serpentine (hydrous magnesium silicate) although sometimes this symbol is worked in gold. The shaman hangs this symbol from the elbows or around the neck when dancing alone or with a group.
Flight often has to do with the search for the soul of a patient, a soul that has been stolen by a spirit or by someone who has died. This search is sometimes performed as evocation, the shaman calling back the soul of the patient, which is less dangerous than the shamanic journey in its quest. The process of drawing out or of suction, used traditionally to relieve illness in the body of the patient, and the object which is regurgitated as proof of the extraction vary in different geographical regions and from one shamanic society to another. In general these objects take the form of parasitical worms or tangles of blood-soaked hair, the object being spat into a bucket of water, since water, particularly in the Andes of Venezuela, is the principal ally of the South American shaman.

The shamanic methods for healing by means of the imagination may be summed up as trance, flight, song, imagery, hypnosis, meditation, dance, tobacco, regurgitation, and the use of hallucinogenic plants (peyote, the San Pedro cactus, ayahuasca, yopo, and fungi). The power of the symbols which are considered to be contained in these substances permits the mojan or shaman to intuitively hear messages from the combined body-mind-spirit, which has been disregarded by Western medicine. However, the scientific realm in the eyes of the common people has its own symbolic powers: the white lab coat and the stethoscope are seen by mediums in trance as apparatus borne by doctor spirits (particularly in the cult of Maria Lionza) which they use on their patients with cold-blooded aplomb. Every shamanic culture maintains certain objects and rites as ideals which have the potential for conveying physical forces, and the various objects function in proportion to the amount of mental or spiritual energy invested in each symbol. In every society there is that which has potential curative power and that which does not. In multiethnic societies such as those in Latin America, each group identifies with not just one complex of belief objects as having real authenticity, but the participation is with various beliefs and the paraphernalia pertinent to each one. There is no sense of logical or religious contradiction in respect to any participation. Seen thus, shamanism must be studied within the new sociocultural contexts to which it has had to adapt in order to survive.

Acknowledgments This article was translated by Alastair Beattie, University of the Andes.

Bibliography
Bastide R (1960). Les Religions africaines au Brésil. Paris: Presses Universitaires de France.
Bastide R (1972). Le Rêve, la transe et la folie. Paris: Flammarion.
Bautista F (2000). ‘La Culebrilla o Herpes Zoster: una enfermedad y su curación entre diversas visiones del mundo.’ In Clarac J, Rojas B & González-Ñ O (eds.) El discurso de la salud y la enfermedad en la Venezuela de fin de siglo. Mérida, Venezuela: Universidad de Los Andes.
Bouteiller M (1950). Chamanisme et guérison magique. Paris: Presses Universitaires de France.
Chiappino J (2003). ‘La cura chamánica yanomami y su eficacia.’ In Alès C & Chiappino J (eds.) Caminos cruzados: ensayos en antropología social, etnoecología y etnoeducación. Paris: IRD.
Chiappino J & Alès C (1997). Del microscopio a la maraca. Caracas: Ex Libris.
Clarac de Briceño J (1981). Dioses en exilio: representaciones y prácticas simbólicas en la cordillera de Mérida. Caracas: FUNDARTE.
Clarac de Briceño J (1992). La enfermedad como lenguaje en Venezuela. Mérida, Venezuela: Universidad de Los Andes.
Clarac de Briceño J (1996). Dioses en exilio: representaciones y prácticas simbólicas en la cordillera de Mérida (2nd edn.). Mérida, Venezuela: Universidad de Los Andes.
Devereux G (1970). Essais d’ethnopsychiatrie générale. Paris: Gallimard.
Doore G (ed.) (1993). El viaje del chamán: curación, poder y crecimiento personal. Barcelona: Kairós.
Eliade M (1951). Le Chamanisme et les techniques archaïques de l’extase. Paris: Payot.
Eliade M (1974). Le Chamanisme et les techniques archaïques de l’extase (3rd edn.). Paris: Payot.
Evans Shultes R & Hoffmann A (1984). Les Plantes des dieux: les plantes hallucinogènes, botanique et etnologie. Paris: Berger-Levrault.
Harner M (1993). ‘¿Qué es un chamán?’ In Doore (ed.).
Lapassade G (1976). Essai sur la trance. Paris: Jean-Pierre Delarge.
Laplantine F (1975). La culture du Psy. Toulouse: Eppsos Privat.
Lévi-Strauss C (1961). Anthropologie structurale. Paris: Plon.
Lowie R H (1934). ‘Religious ideas and practices of the Eurasiatic and North American areas.’ In Evans-Pritchard E E, Firth R, Malinowski B & Schapera I (eds.) Essays presented to C. G. Seligman. London: Kegan Paul.
Mehl L (1993). ‘El chamanismo moderno: integración de la biomedicina con las visiones tradicionales del mundo.’ In Doore (ed.).
Métraux A (1949). ‘Religion and shamanism.’ In Steward J H (ed.) Handbook of South American Indians 5: The comparative ethnology of South American Indians. Washington, DC: United States General Post Office.
Perrin M & Machado J U (1980). ‘El arte guajiro de curar frente a la medicina occidental.’ Boletín Indigenista Venezolano 19(16), 39–200.
Reichel-Dolmatoff G (1975). The Shaman and the Jaguar: a study of narcotic drugs among the Indians of Colombia. Philadelphia: Temple University Press.
Roheim G (1973). Psicoanálisis y antropología. Buenos Aires: Sudamericana.
Roheim G (1982). Magia y esquizofrenia. Buenos Aires: Paidós.
Simonton O C & Hensen R (1993). Sanar es un viaje: el poder de la mente y del espíritu en la superación de enfermedades graves (2nd edn.). Barcelona: Urano.
Wilbert J (1987). Tobacco and shamanism in South America. New Haven: Yale University Press.

Medical Discourse: Sociohistorical Construction B-L Gunnarsson, Uppsala University, Uppsala, Sweden © 2006 Elsevier Ltd. All rights reserved.

Introduction Scientific language and discourse emerge in a cooperative and competitive struggle among scientists to create the knowledge base of their field, to establish themselves in relation to other scientists and to other professional groups, and to gain influence and control over political and socioeconomic means. In every strand of human communication, language and discourse play a role in the formation of a social and societal reality and identity. This is also true both of the formation of the different professional and vocational cultures within working and public life and of the formation of different academic cultures. Historically, language has played a central role in the creation of different professions and academic disciplines, and it continues to play an important role in the development and maintenance of professional and institutional cultures and identities. Societal,

social, and cognitive factors all make important contributions to the construction of professional cultures. Professionals try to create a space for their field within society. They try to establish themselves in contact and competition with others within their group as well as with other groups. Their knowledge base and its linguistic forms are created in a societal and social framework.

Diachronic Studies on the Development of Scientific Genres A constructionist perspective on the emergence of scientific discourse and text genres is found in various traditions. Within the tradition of sociology of science, several studies have been devoted to analyzing the role of texts in the establishing of scientific fact. The scientific field is seen as a workplace, a laboratory, where social rules determine the establishing of facts and the rank order of the scientists. Knorr-Cetina (1981) was one of the first to describe the writing up of results as a process of tinkering with facts rather than a knowledge-guided search. Latour

Medical Discourse: Psychiatric Interviews B T Ribeiro, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil and Lesley University, Arlington, MA, USA D de Souza Pinto, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil © 2006 Elsevier Ltd. All rights reserved.

Psychiatry, defined as ‘‘the branch of medicine that focuses on the diagnosis and treatment of mental illness’’ (Andreasen and Black, 1995: xi), examines mental disturbances that afflict the patient/client in his or her relationship with himself or herself, family and friends, and society at large. It relies on the psychiatric interview as its key diagnostic instrument given that laboratory exams do not provide evidence for psychopathologic processes. The anamnesis – a procedure through which the doctor assesses and evaluates the patient’s situation – structures the psychiatric interview and its clinical interpretations (Sullivan, 1970; Shea, 1998). The purpose of the interview is to detect patterns of affect relations that may trouble the patient, to make a diagnosis, and to prescribe a treatment. The doctor has an active question-asking role: he or she follows a series of pre-established steps that make up the diagnostic exam. Such steps follow specific topics (eating habits, sleep and sleep functions, parenthood, and vocational history, among others). However, it is the quality (rather than the quantity) of the doctor’s questions that qualifies him or her as an expert in the patient’s perspective (Shea, 1998: 35). For the patient, certain questions or issues are of great importance. Some can be captured as requests for help and attention with an implicit message ‘Can this person help me? Will she listen to me?’ It is not surprising, therefore, that successful interviews may often close with an acknowledgment by the patient of ‘having been listened to.’ Compared with medicine in general, psychiatry derives its practice from clinical observations of patients’ behavior that take place in face-to-face encounters between patient/client and therapist/doctor (Shea, 1998) where language and communication disorders are closely investigated (Andreasen, 1979). It is also based on interviews with patients’ partners, relatives, and/or friends.
As key components of the mental status examination, language and cognitive behaviors are routinely assessed in the interview (Andreasen and Black, 1995: 49). Thus, psychiatric definitions related to thought, language, and communicative disorders (i.e., poverty of speech, pressure of speech, distractible speech, tangentiality, incoherence) derive from the clinician’s

observation and experience. Also, the doctor infers the patient’s thought disorders from the patient’s speech as thinking cannot be observed directly (1995: 68). Such understanding assigns a prominent role to language studies in psychiatry, where researchers in linguistics can contribute to more accurate and adequate descriptions for language and cognitive disorders. Given the absence of specific diagnostic laboratory exams (common in other medical specialties, such as oncology, heart disease, and diabetes), this field of medicine relies on the expert’s interpretive practices of the patient/client’s symptoms (Sullivan, 1970). As Kleinman (1988: 7) states, ‘‘a psychiatric diagnosis is an interpretation of a person’s experience . . . whose roots are deeply personal and physiological.’’ Thus, interpretation may differ for professionals with different orientations (for example, a psychiatric interview performed by a neuropsychiatrist contrasts with an interview conducted by a psychoanalyst). Such differences attest to the experience of illness as a culturally shaped phenomenon mediated by language. Discourse analysis of psychiatric interviews (Ribeiro, 1994; Pinto, 2000) provides relevant information on the use of language, communication disorders, and interpretive practices.

The Psychiatric Interview as a Speech Event

A psychiatric interview is a complex institutional encounter (Sullivan, 1970; Shea, 1998). For the psychiatrist, the interview aims at gathering the patient’s information to reach a diagnosis and establish a course of action for proper treatment. An agenda should be followed to address and assess topics related to why the patient has been hospitalized, the history of the illness, and its previous episodes, among other clinical concerns. The interview is framed as an inquiry and not as a conversation. For the patient, however, this social encounter is seen as an optimal moment to introduce topics on health and illness, where personal stories emerge, and the doctor is framed as a potential listener (Ribeiro, 2002). These differences in expectations are usually not resolved. More often than not, the institutional discourse prevails, with its key communicative goals: to diagnose and prescribe. Research on the clinical interview has developed mostly from anthropology, sociology, and sociolinguistics, related to concerns about the communicative practices that constitute human social life. To understand the social construction of the interview, the
concepts of speech event (Hymes, 1967) and speech activity (Gumperz, 1982) seem particularly interesting: ‘‘activities, or aspect of activities, that are directly governed by rules for the use of speech’’ (Hymes, 1967: 19). Research in these fields has shown that participants’ roles, their rights and responsibilities in the course of the interaction, expectations about ways of introducing information, and constraints on content are all based on culture-specific norms. Thus, a well-established set of cultural conventions governs this social encounter and allows both the doctor and the patient to face the interview as an acknowledged clinical practice. The sociolinguistic analysis of clinical discourse also resonates with Sullivan’s (1970: ix) definition of psychiatry as ‘‘the field of the study of interpersonal relations, emphasis placed on the interaction of the participants in a social situation.’’ He added that ‘‘the psychiatric interview is a special instance of interpersonal relations’’ (1970: ix). The purpose of the interview is to find out who the patient is by reviewing the course of events that he or she has gone through in order to be who he or she is.

The DSM – A Unidirectional Model for Communicative and Speech Disorders

The Diagnostic and Statistical Manual of Mental Disorders (DSM; American Psychiatric Association, 1994), a standard classification for mental health problems, provides a widely recognized reference for the psychiatric interview. It is based on the belief that mental disorders are universal and that they have a clear course of predictable symptoms. Additionally, its definitions focus solely on the patient, seldom taking the doctor–patient interaction and relationship into consideration. For example, a definition of a disorder known as flight of ideas, ‘‘instances of behavior where the patient may shift idiosyncratically from one topic to another and where things may be said in juxtaposition that lack a meaningful relationship’’ (Andreasen, 1979), does not take into account complex topic negotiation processes between the two participants; it focuses rather on the immediate prior context (i.e., a sequential response to a question).

Social and clinical researchers have questioned the DSM framework. Based on cross-cultural studies of mental disorders, anthropologists and psychiatrists (Kleinman, 1988; Good, 1994) claim that there are cultural variants in the expression, course, and outcome of mental disorders. Social responses to the illnesses reveal cultural differences in the way a disorder is interpreted and handled. According to this approach, ‘‘abnormality and pathology are inseparable from cultural interpretation’’ (Good, 1994: 35) and the standard notion of normality must be assessed within a cultural context. Language is then conceived as a fundamental tool for understanding how human beings experience mental disorders.

The Interactional Model for Communicative Studies: Implications for Psychiatry

The interactional model for communication studies (Schiffrin, 1994) brought to light aspects of communication that have implications for psychiatry. It states that both speaker and hearer play active roles in everyday face-to-face interactions, as they jointly produce meaning and discourse. The situated nature of communicative behavior (verbal, nonverbal, and paralinguistic) is also foregrounded. Everything that happens in an interactive situation displays information, regardless of the speaker’s intentions, and might be interpreted accordingly. The hearer then assigns meaning based on multileveled interpretations (semantic, pragmatic, conversational) to ‘‘whatever information becomes available’’ (1994: 401). The goal of an interactive situation then shifts from the interpretation of linguistic information encoded by the speaker to an interpretation of situated behavior. For psychiatry, this alternative model of communication has relevant contributions. It underscores that human communication relies on both the message – what is said – and how it is encoded. Pause, hesitation, gaze, and intonation are some of the cues the psychiatrist needs to attend to in order to assign interpretations to the patient’s behavior. Also, interpretations are negotiated, repaired, and changed through interaction rather than unilaterally conveyed. Coherence thus might be assessed based on the different levels of communication: semantic (attending to the message, to what is being said); conversational (focusing on the sequence of conversational moves, i.e., question/answer); pragmatic (whether certain utterances are appropriate and relevant at a certain point of the interview); and interactional (the discursive strategies used by participants in the joint construction of talk). 
The interactional model of communication also calls for a balance between the institutional agenda introduced and maintained by the doctor, on the one hand, and the personal topics brought up by the patient, on the other, allowing for a more conversational encounter, where the patient can introduce narratives and the doctor can align himself or herself as a listener of stories. Based on transcripts of audio- and videotaped psychiatric interviews, Ribeiro (1994, 2002) and Pinto (2000) have systematically analyzed the discourse of doctors and patients by using both topic analysis and narrative analysis. The former has pointed to different types of topic discontinuity, which reveal psychopathological problems regarding thought
disorders (such as tangentiality); the latter discusses common phenomena in everyday conversations such as digressions, often evaluated by clinicians as a progressive deviation from the topic in relation to the information flow of an assessment interview. Ribeiro (2002) suggested that a narrative might accomplish textual and conversational coherence, when the doctor aligns himself as both a conversationalist and an inquirer, attending both to his clinical agenda and to the patient’s story.

Language, Thought, and Communication Disorders

Psychiatry is driven by diagnostic procedures. The doctor must investigate each one of the patient’s symptoms and search for a standard definition. Part of this work is performing an assessment of thought, language, and communication disorders, as disturbances in the use of language are displayed in several psychiatric conditions, such as mania, severe depression, organic disorders, and schizophrenia (Andreasen, 1979; Shea, 1998). The psychiatric diagnosis investigates pathological processes that underlie the patients’ mental and psychological states. It is based on phenomenology, ‘‘the empathic method for eliciting symptoms’’ (Sims, 1995: 5). According to Jaspers (1963), we have only indirect access – the patient’s own descriptions – to what happens to a patient; these descriptions are then compared and interpreted according to our own experiences. The patient’s words (conveyed through statements, queries, complaints, etc.) are, therefore, reinterpreted in light of our own understandings. Hence, the task of providing a psychiatric diagnosis is an extremely complex activity. It involves mainly the doctor’s ability to unravel and understand the nature of the sufferer’s experience, given the absence of laboratory exams to sustain the diagnostic hypothesis. The psychiatric diagnosis is the key reference point for understanding the patients’ condition and related therapeutic procedures.

The psychopathological examination – which unfolds in the psychiatric interview – plays a key role in the psychiatric diagnosis. It assesses the patient’s attitude toward the interview, that is, whether he or she is being cooperative. It investigates memory, consciousness, attention, orientation, and language and thought processes, among other functions. Aspects related to the pressure of the patient’s speech, its rhythm, and the use of very long pauses are classified as speech disorders.
Another range of disturbances – thought disorders – relate to how thoughts are verbally expressed, how the patient connects ideas (topic development), and whether he or she expresses paranoid or delusional ideas.

According to Andreasen (1979), the term thought disorder has been widely used in psychiatry to describe a gamut of different issues regarding human cognition, ranging from the patient’s capacity to abstract from external stimuli to how he or she organizes his or her utterances in a piece of discourse, thereby communicating (in)coherently. Thus, the concept of thought disorder ‘‘is not a unitary construct, but rather encompasses several different components’’ (Berembaum and Barch, 1995: 350). In an attempt to distinguish the various connotations of this term, a distinction between disorders in content and form was established. Broadly speaking, the former refers to what the patient talks about (Taylor, 1981), whereas the latter is concerned with the speed and with the forms of related links between ideas, i.e., how ideas are expressed. Despite these distinctions, psychopathologists have not established a common understanding of the different phenomena labeled under thought disorders; neither do they all consider the distinction between content and form a productive one. The category of thought disorder has been under criticism for the past 30 years for various reasons. Andreasen (1979) pointed out that the current definitions of formal thought disorder in psychiatry are based on a close and mistaken correlation between speech and thought. Those who advocate this correlation assume that thought might be assessed through speech, disregarding the complex cognitive processes that underlie linguistic behavior. Pinto and Ribeiro (2001) added that the concept of thought has not been uniformly defined. Most definitions presented in different manuals of psychopathology are tautological, circular, and imprecise, referring to vague notions such as ‘ideas,’ ‘associations,’ and ‘symbols.’ As Chaika (1990: 57) aptly stated, ‘‘thought remains an undefined entity.
How then can we correlate speech with a concept so nebulous as thought?’’ Andreasen suggested the term ‘‘disorganized speech’’ instead, as it translates what can indeed be observed, that is, the patient’s speech. A list of operational and empirical categories deriving from clinical experience (such as derailment, poverty of content of speech, and tangentiality) describes the most frequent disorders. It distinguishes among the different facets of thought disorder and provides critical information about the relationship among specific symptoms (Andreasen, 1979). These categories represent a clinical assessment of thought, language, and communication disorders. Until the 1970s, specific thought disorders, such as ‘‘looseness of association’’ (Bleuler, 1950), were
believed to be a pathognomonic symptom (a symptom closely associated with a certain disorder or illness) of schizophrenia. That is, based entirely on this symptom, the clinician could develop a differential diagnosis between schizophrenics and bipolar patients (manic-depressives). However, in the past 30 years, studies have shown that this symptom ‘‘might as well occur in other diagnostic groups’’ (Andreasen, 1979: 1325), such as in patients with bipolar disorders. Thus, today the differential diagnosis between schizophrenia and mania is based on: (1) the course of the mental illness (i.e., its developmental traits) and (2) its prognosis.

Experimental studies were carried out to investigate patterns of disorganized speech in psychotics by psychiatrists (Taylor et al., 1994) and linguists (Chaika, 1990; Rochester and Martin, 1979). To control variables, these researchers employed structured instruments to trigger and elicit linguistic behavior that could function as evidence of pre-established categories. Rochester and Martin (1979) used summaries of short narratives to investigate types of failures in the speech of schizophrenic patients. The patients’ problematic use of cohesive ties was seen as an indication of different types of thought disorders expected to occur in psychotic patients. Chaika (1990) conducted an analogous study of psychotic narration, whose results do not support the view that psychotics fail to use cohesive ties competently. More recently, research based on discourse analysis has examined data collected in natural settings, focusing mostly on psychiatric interviews (Ribeiro, 1994; Pinto, 2000). These studies have shifted the analysis to the co-construction of doctor–patient communication, where meaning is conceived as jointly produced by both participants.

Coherence in Psychiatric Interviews: Topic, Schema, and Frame

‘‘Using language involves doing several things at once, any one of which can go wrong’’ (Becker, 1982: 127): one speaks, composes words and structures, refers to the world, evokes prior texts, and relates to one’s audience, all at the same time. When the integration is disrupted, an unsuccessful language and interactional experience results. In analyzing doctor–patient interactions in psychiatry, certain concepts have proved useful. First, topic – what talk is about – requires a sense of being attuned to the message. Any encounter presupposes an understanding among participants regarding when and where to initiate talk, among whom, and by means of what topics (Goffman, 1982: 34). Additionally, participants must share rules for topic introduction and development. In this way, topics
present boundaries that are marked either semantically or interactionally. ‘‘A single focus of thought and visual attention, and a single flow of talk, tends to be maintained and to be legitimated as officially representative of the encounter’’ (1982: 34). This understanding corroborates the participants’ sense of coherence. In psychotic discourse, however, some of these expectations are not met. Psychotic talk often presents a ‘‘loss of goal directedness’’ (Bleuler, 1950). The speaker ceases to contribute to the sequential (or cyclic) development of topics, thus thwarting topic continuity (Pinto, 2000). Referents may be introduced with no meaningful relationship to one another. Often, the speaker goes further and further off the track. Word associations tend to be governed by rhythmic relationships (sound play) rather than by semantic and pragmatic ties (Ribeiro, 1994). To complicate matters further, both doctor and patient fail to identify each other’s topics. As a result, the process of topic negotiation often breaks down, few topics are expanded, and others are recycled with no preestablished agenda. Topic provides a natural criterion for distinguishing instances of a speaker’s deviant discourse from a discourse that matches our expectations about coherence (Ribeiro, 1994; Pinto, 2000). The complex step-by-step process of topic introduction and topic development reveals the amount of effort that participants must undertake in interaction to ensure topic continuity. When doctors face inconsistencies in the patient’s discourse, they may use a number of strategies to keep the communicative channel open: they constantly summon the patient to the interview, they acknowledge the patient’s statements, and they address some of his or her questions (Ribeiro, 1994). In doing so, doctors are constantly reassessing their assumptions about the patient’s topic of talk. How do topics relate to knowledge schemas? 
Knowledge schemas are cognitive entities consisting of ‘‘participants’ expectations about people, objects, events and settings in the world’’ (Tannen and Wallat, 1993: 60). Schiffrin (1987: 12) pointed out that the concept of topic may help us ‘‘understand the role of shared knowledge in discourse,’’ despite its inherent difficulties. This is particularly significant in instances of pathological discourse where social norms and linguistic rules are in many ways violated. For example, in an interview with the patient in a psychotic episode, discrepant knowledge schemas on the part of the doctor and patient may result in reciprocal distancing (Ribeiro, 1994). In each case, the speaker’s topic selection follows divergent expectations regarding ‘what this talk needs to be about.’ Whereas the doctor proposes talk exclusively as ‘interview talk,’ the patient proposes talk as a ‘family encounter.’ He
or she may project an addressee (e.g., the father or mother), while the doctor remains mostly as an unratified participant (Goffman, 1982: 34). In this context, the patient introduces personal topics that belong to prior family schemas. Frequently, the referents lack specificity (i.e., they are encoded mostly by pronouns that presuppose a high degree of topic accessibility). In standard psychiatric interviews – such as discharge interviews – background knowledge related to the immediate needs of the communicative context is mostly shared by patient and doctor. It orients topic selection by promoting certain topics and inhibiting others. The doctor controls content; thus, talk focuses on the development of ‘official topics’ (i.e., identifying the patient’s name, age, marital status, assessing the patient’s chief complaint, present illness, and past history). The patient’s role is mostly confined to addressing the doctor’s topics. A joint discourse results, where speaker’s and listener’s roles alternate following social and institutional norms. As a consequence, it is fairly simple to identify topics in the psychiatric interview because the doctor follows a standard agenda (Sullivan, 1970); furthermore, it is usually the same participant, the doctor, who initiates topics. Whereas topics address the question ‘What is talk about?’ framing establishes a metamessage as ‘how this talk must be understood.’ Frames capture the dynamic way participants position themselves, relate to one another, and establish multiple contexts for talk. In Bateson’s ([1972] 1981: 186) terms a frame ‘‘is (or delimits) a class or set of messages (or meaningful actions).’’ A metamessage (an implicit message) constitutes instructions to the listener on how to understand the messages within the frame. 
As people speak and perform actions, they signal to one another what they believe they are doing (i.e., what speech act(s) they are performing, what activity they are engaged in) and in what way their words and gestures are to be understood. Frames are encoded and understood linguistically, paralinguistically (i.e., intonation, pitch, rhythm), and nonverbally. Topic is a key component in defining frames and part of the definition of an interactional situation. In clinical encounters, an institutional agenda often dictates topic selection. The doctor brings a topic agenda that helps establish the major interactional contexts (i.e., frames). In the psychiatric interview, the doctor plays a very active role in introducing topically relevant questions ‘‘to make sure that he knows what he is being told’’ (Sullivan, 1970: 21). Even such topical talk, however, can trigger more or less subtle frame shifting: a doctor’s request for information about the patient’s social history may trigger a personal story and thus a personal framing and a change in participation structure. The interrelation of frame and topic has implications for coherence (Ribeiro, 1994). Speakers and hearers have multiple options for coherence, as illustrated in Table 1.

Table 1 Excerpt of an initial sequence of a psychiatric interview
Doctor: What is your name?
Patient: Joana Garcia.
Doctor: And how old are you now?
Patient: Thirty-three.

Coherence is achieved within the exchange system as the doctor and the patient follow rules for turn-taking (A/B/A/B), while alternating the speaking role. It is achieved simultaneously in the action structure, when the patient provides the expected sequence (a request for information is followed by a statement). Finally, the propositional content coheres as participants jointly refer to the ongoing topic (i.e., name and age). To make sense of the above interaction, however, one needs to attend to the situation at hand. This is the opening of a standard medical (psychiatric) interview. For a psychiatric patient, making the proper choice from different linguistic and social options has definite implications for the doctor’s evaluation and diagnosis. The patient must understand and jointly construct with the doctor the frames of the psychiatric interview. When both participants share an understanding of the ongoing talk, its purposes, and ways of interaction, coherence is achieved (Ribeiro, 1994).

Narrative and Topic Coherence

The psychiatric interview, modeled after the standard medical interview, follows a Question/Answer chain format, where the doctor requests information or clarification; frequently he or she probes further for more information and expands the topic. Such speech actions result in a Q/A/Q/A sequence that structures the interview mostly as an inquiry. There are many moments, however, when the patient introduces topics related to his or her illness and thereby begins a narrative, as shown in Table 2.

Table 2 Example of a patient’s lead into a narrative as a response to a doctor’s question
Patient: I work, I study, I take courses, I do things that are out of the ordinary, sometimes I did things out there that even people with a lot of courage wouldn’t do . . . do, and that’s why I’m taking medication here and I’m afraid
Doctor: For instance, what types of things do you do?

Whereas the doctor structures talk as inquiry, the patient views the encounter differently. She has stories (or a life story) to tell. ‘‘Patients view the medical encounter as an opportunity to account for their life and health stories in their own terms’’ (Ribeiro, 2002: 199). In interviews with a patient in a psychotic episode, such as in the example above, narrative works to describe the patient’s personal topic agenda and its complex layering of references. Narrative analysis investigates ways that a story is told in bits and pieces and through multiple contextual embeddings. These bits of information may indicate to the psychiatrist relevant information about his patient’s social and family history, present illness, and chief complaint. Listening to and understanding such stories are essential to the psychiatrist, who has the task of using patients’ linguistic and nonlinguistic contributions as a source to assess their mental and psychological states.

Different frameworks for analyzing narratives are used in discourse studies in psychiatry. Labov’s structural/functional framework – a narrative is ‘‘a method of recapitulating a past experience by matching a verbal sequence of clauses to the sequence of events which, we infer, actually occurred’’ (1972: 359–360) – has been used by researchers (Ribeiro, 2000) to analyze discourse cohesion and coherence. Labov’s framework presents a schematic structure: stories would often have an abstract, an orientation(s), complicating actions, an evaluation(s), a coda, and a resolution. The evaluation component is considered most relevant as it is used by the narrator to indicate the point of the story: why it is being narrated. Also, temporal order constitutes the main thread of coherence. In psychiatric interviews, however, patients’ stories are often presented in a quite fragmented form and do not adhere to the temporal order principle. For example, Ribeiro (2000) described a particularly cohesive story (‘the Bozo narrative’) told by a patient in a fragmented yet still coherent discourse.
Following Labov’s framework (1972), the narrator presents an abstract ‘‘I was married to Bozo, Bozo the clown.’’ Several orientation parts follow: ‘‘Beto is the Bozo, he has been named ‘Bozo the clown’ by people at the hospital where he works as a physical therapist.’’ In the complicating action, the patient reports that she ‘‘has been possessed by some Beto’’ and that they ‘‘have been fooling around.’’ The notion of possession (of being possessed) implies lack of control (powerlessness) as well as irrationality – both social and cognitive traits related to mental illness. A narrative schema emerges for ‘Bozo the clown’ (a reference to Beto and to the patient herself), who shares attributes that play on ambiguity, signaling both euphoria (laughter and fun) and depression (sadness). This oscillating mood and related references are particularly interesting since the patient has a manic-depressive (bipolar) disorder.

Other research frameworks – that highlight narrative as a genre and the process of narration (Linde, 1993; Mishler, 1986) – provide ways of listening to (and analyzing) stories in clinical and research interviews. Such frameworks investigate how individuals share experiences through the telling of life stories. They also focus on the co-construction of participants’ social and conversational identities. Storytelling is viewed as a situated discursive practice, as a way to project who we are to our addressee, according to our communicative purposes for that specific interaction. In this process, the narrator uses a wide range of linguistic forms to express purpose and intentionality. Coherence emerges from the joint construction of meaning by the participants. It is not ‘‘an absolute property of a disembodied, unsituated text’’ (Linde, 1993: 12). Mishler (in press) added that narration time is not solely chronological; rather, people interpret activities according to the way they are experienced and narrated. This understanding subscribes to the relational concept of identity: human beings experience who they are along a continuum of tensions and contradictions in the many social worlds they live in. They are neither progressive nor coherent. The interpretation of past events is achieved in light of how they are conceived in the very moment of narration, that is, as the story unfolds for that specific audience in a specific context. Accordingly, unfolding stories in psychiatric interviews must also be considered as situated discursive practices and viewed as context specific.

Psychiatry relies on clinical practice where the patient’s sociolinguistic competence is being assessed.
A successful assessment of the patient’s mental state requires that the patient address the institutional topics of the psychiatric interview. Whereas some interactional processes may facilitate communication, others may inhibit it. Open-ended questions, for example, allow for the production of more spontaneous and lengthy responses, where stories may then emerge. And probing questions about the details of a narrative, for instance, might reveal specific information that would otherwise not be evaluated. Thus, the unfolding of a narrative in an interview situation constitutes an important tool for gathering relevant information so that the doctor may successfully reach a diagnosis. Moreover, listening to stories in an interview situation strongly enhances involvement between the doctor and the patient.

See also: Coherence: Psycholinguistic Approach; Discourse Processing; Frame Analysis; Identity and Language; Institutional Talk; Interactional Sociolinguistics; Narrative: Linguistic and Structural Theories; Schema Theory: Stylistic Applications; Topic and Comment.

Bibliography

American Psychiatric Association (1994). Diagnostic and statistical manual of mental disorders. Washington, DC: American Psychiatric Association.
Andreasen N C (1979). ‘The clinical assessment of thought, language and communication disorders: The definition of terms and evaluation of their reliability.’ Archives of General Psychiatry 36, 1375–1381.
Andreasen N C & Black D W (1995). Introductory textbook of psychiatry. Washington, DC: American Psychiatric Press.
Bateson G ([1972] 1981). Steps to an ecology of mind. New York: Ballantine.
Becker A L (1982). ‘Beyond translation.’ In Byrnes H (ed.) Contemporary perceptions of language: Interdisciplinary dimensions. Washington, DC: Georgetown University Press. 124–138. Georgetown University Round Table on Languages and Linguistics.
Berembaum H & Barch D (1995). ‘The categorization of thought disorder.’ Journal of Psycholinguistic Research 24(5), 349–371.
Bleuler E (1950). Dementia praecox or the group of schizophrenias. New York: International University Press.
Chaika E O (1990). Understanding psychotic speech: Beyond Freud and Chomsky. Springfield, IL: Charles C. Thomas.
Goffman E (1982). Interactional ritual. New York: Pantheon Books.
Good B (1994). Medicine, rationality, and experience: An anthropological perspective. London: Cambridge University Press.
Gumperz J J (1982). Discourse strategies. Cambridge: Cambridge University Press.
Hymes D (1967). ‘Models of the interaction of language and social setting.’ Journal of Social Issues 23(2), 8–28.
Jaspers K (1963). General psychopathology. Chicago: Chicago University Press.
Kleinman A (1988). Rethinking psychiatry. From cultural category to personal experience. New York: The Free Press.
Labov W (1972). ‘The transformation of experience in narrative syntax.’ In Language in the inner city. Philadelphia: University of Philadelphia Press. 354–396.
Linde C (1993). Life stories: The creation of coherence. New York: Oxford University Press.
Mishler E (1986). Research interviewing: Context and narrative. Cambridge: Harvard University Press.
Mishler E (in press). ‘Narrative and the paradox of temporal ordering: How ends beget beginnings.’ In De Fina A, Schiffrin D & Bamberg M (eds.) Discursive construction of identities. Cambridge: Cambridge University Press.
Pinto D (2000). A construção da referência no discurso de uma paciente psiquiátrica: análise lingüística para distúrbios de pensamento, fala e comunicação. Ph.D. diss., Institute of Psychiatry, Federal University of Rio de Janeiro.
Pinto D & Ribeiro B T (2001). ‘Manifestações de desorganização da fala ou transtornos na forma do pensamento? Uma abordagem linguística.’ Arquivos Brasileiros de Psiquiatria, Neurologia e Medicina Legal 76, 22–29.
Ribeiro B T (2002). ‘Erotic strategies in powerless women: Analyzing psychiatric interviews.’ In Sunderland J & Litosseliti L (eds.) Gender identity and discourse analysis. Amsterdam: John Benjamins. 193–219.
Ribeiro B T (2000). ‘Listening to narratives in psychiatric interviews.’ In Coulthard M, Cotterill J & Rock F (eds.) Dialogue analysis VII: Working with dialogue. Tübingen: Max Niemeyer Verlag. 283–292.
Ribeiro B T (1994). Coherence in psychotic discourse. New York: Oxford University Press.
Rochester S R & Martin J R (1979). Crazy talk. New York: Plenum Press.
Schiffrin D (1987). Discourse markers. Cambridge: Cambridge University Press.
Schiffrin D (1994). Approaches to discourse. Oxford: Blackwell.
Shea S C (1998). Psychiatric interviewing: The art of understanding. Philadelphia: W. B. Saunders Cia.
Sims A (1995). Symptoms in the mind. An introduction to descriptive psychopathology. London: Saunders.
Sullivan H (1970). The psychiatric interview. New York: W. W. Norton & Cia.
Tannen D & Wallat C (1993). ‘Interactive frames and knowledge schemas in interaction: Examples from a medical examination-interview.’ In Framing in discourse. Oxford: Oxford University Press. 57–76.
Taylor M A (1981). The neuropsychiatric mental status examination. New York: Spectrum.
Taylor M A, Reed R & Berembaum S (1994). ‘Patterns of speech disorders in schizophrenia and mania.’ The Journal of Nervous and Mental Disease 182, 319–326.

Clarac de Briceño J (1981). Dioses en exilio: representaciones y prácticas simbólicas en la cordillera de Mérida. Caracas: FUNDARTE. Clarac de Briceño J (1992). La enfermedad como lenguaje en Venezuela. Mérida, Venezuela: Universidad de Los Andes. Clarac de Briceño J (1996). Dioses en exilio: representaciones y prácticas simbólicas en la cordillera de Mérida (2nd edn.). Mérida, Venezuela: Universidad de Los Andes. Devereux G (1970). Essais d’ethnopsychiatrie générale. Paris: Gallimard. Doore G (ed.) (1993). El viaje del chamán: curación, poder y crecimiento personal. Barcelona: Kairós. Eliade M (1951). Le Chamanisme et les techniques archaïques de l’extase. Paris: Payot. Eliade M (1974). Le Chamanisme et les techniques archaïques de l’extase (3rd edn.). Paris: Payot. Evans Shultes R & Hoffmann A (1984). Les Plantes des dieux: les plantes hallucinogènes, botanique et ethnologie. Paris: Berger-Levrault. Harner M (1993). ‘¿Qué es un chamán?’ In Doore (ed.). Lapassade G (1976). Essai sur la trance. Paris: Jean-Pierre Delarge. Laplantine F (1975). La culture du Psy. Toulouse: Eppsos Privat. Lévi-Strauss C (1961). Anthropologie structurale. Paris: Plon.

Lowie R H (1934). ‘Religious ideas and practices of the Eurasiatic and North American areas.’ In Evans-Pritchard E E, Firth R, Malinowski B & Schapera I (eds.) Essays presented to C. G. Seligman. London: Kegan Paul. Mehl L (1993). ‘El chamanismo moderno: integración de la biomedicina con las visiones tradicionales del mundo.’ In Doore (ed.). Métraux A (1949). ‘Religion and shamanism.’ In Steward J H (ed.) Handbook of South American Indians 5: The comparative ethnology of South American Indians. Washington, DC: United States General Post Office. Perrin M & Machado J U (1980). ‘El arte guajiro de curar frente a la medicina occidental.’ Boletín Indigenista Venezolano 19(16), 39–200. Reichel-Dolmatoff G (1975). The Shaman and the Jaguar: a study of narcotic drugs among the Indians of Colombia. Philadelphia: Temple University Press. Roheim G (1973). Psicoanálisis y antropología. Buenos Aires: Sudamericana. Roheim G (1982). Magia y esquizofrenia. Buenos Aires: Paidós. Simonton O C & Hensen R (1993). Sanar es un viaje: el poder de la mente y del espíritu en la superación de enfermedades graves (2nd edn.). Barcelona: Urano. Wilbert J (1987). Tobacco and shamanism in South America. New Haven: Yale University Press.

Medical Discourse: Sociohistorical Construction
B-L Gunnarsson, Uppsala University, Uppsala, Sweden
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Scientific language and discourse emerge in a cooperative and competitive struggle among scientists to create the knowledge base of their field, to establish themselves in relation to other scientists and to other professional groups, and to gain influence and control over political and socioeconomic means. In every strand of human communication, language and discourse play a role in the formation of a social and societal reality and identity. This is true both of the formation of the different professional and vocational cultures within working and public life and of the formation of different academic cultures. Historically, language has played a central role in the creation of different professions and academic disciplines, and it continues to play an important role in the development and maintenance of professional and institutional cultures and identities. Societal, social, and cognitive factors all make important contributions to the construction of professional cultures. Professionals try to create a space for their field within society. They try to establish themselves in contact and competition with others within their group as well as with other groups. Their knowledge base and its linguistic forms are created in a societal and social framework.

Diachronic Studies on the Development of Scientific Genres

A constructionist perspective on the emergence of scientific discourse and text genres is found in various traditions. Within the tradition of sociology of science, several studies have been devoted to analyzing the role of texts in the establishing of scientific fact. The scientific field is seen as a workplace, a laboratory, where social rules determine the establishing of facts and the rank order of the scientists. Knorr-Cetina (1981) was one of the first to describe the writing up of results as a process of tinkering with facts rather than a knowledge-guided search. Latour and Woolgar (1986) described the social construction of scientific facts as an antagonistic struggle among scientists, leading to a purposeful diminishing of the results of others and a leveling up – to a generalized level – of one’s own results. Bazerman (1988) studied the rise of modern forms of scientific communication, focusing on the historical emergence of the experimental article. A social-constructivist approach in relation to written texts is also found in Bazerman and Paradis (1991), which examined the important role played by texts in profession building. Textual forms and definitions are found to impose structure on human activity and help shape versions of reality. Texts are shown to play powerful roles in staging the daily actions of individuals, and to be important factors in the rise of action. In Gunnarsson et al. (1997), which examined both professional written communication and spoken interaction, the central theoretical issue is how language, written genres, and spoken discourse are constructed as successive and continuous interplay between language and social realities. A sociorhetorical perspective on the scientific genre is also developed in Bhatia (1987), Swales (1990), MacDonald (1994), and Berkenkotter and Huckin (1995). The emergence of the English scientific text genres has been analyzed in several important works, including Atkinson (1999) and Valle (1999). Extensive studies have also been made into scientific written communication in other languages, e.g., German (Standard German) (Schröder, 1991) and Swedish (Gunnarsson, 1997, 1998, 2001a; Melander, 1991; Näslund, 1991). The differences between academic genres developed within different national cultures have also been analyzed from a socioconstructivist approach, e.g., Mauranen (1993).

The Construction of Professional Discourse

In this article an exemplary picture of the emergence and development of medical written discourse will be sketched. In order to understand the historical development of professional medical language and communication in its rich and varied totality, we must study the dynamic processes behind the construction of medical language. In this dynamic process, three main layers are distinguished: one relating to cognitive types of activities, one to social, and one to macrosocial or societal. Professional culture is built up via these layers, which means that all three must be considered in order to get a full picture. Written texts as well as spoken discourse are constructed as cognitive, social, and societal activities within the different professions and branches of working life. (The term ‘profession’ is here used in quite a general sense, including academic as well as nonacademic discourse.)

If we begin by considering the cognitive layer, we find that each profession has a certain way of viewing reality, a certain way of highlighting different aspects of the world around it. Socialization into a profession means learning how to discern the relevant facts, how to view the relations between different factors. We are taught how to construct and use a grid or a lens to view reality in a professionally relevant way. Language, texts, and spoken discourse help us in this construction process. We use language in the construction of professional knowledge. Medical terminology, medical text patterns, and medical text and discourse content have developed as a means of dealing with reality in a way that is appropriate for medical purposes. The way in which language is used is related to existing knowledge within the field and also to conceptions about what constitutes knowledge and the attitude that should be adopted to it. What is therefore important is what scientists consider they know about different sectors of reality at different periods, what knowledge they believe is relevant in this field, and how they consider data should be collected, observed, and analyzed. Attitudes and norms regarding what is professionally relevant and right are thus built into the cognitive structure. The knowledge base of a field has a network of relations with other fields. The cognitive structure of a professional language thus reveals its dependence on and relationship toward other knowledge domains, and this knowledge-based network can vary over time. In metaphors, in terminology, in modes of reasoning, and in diagrams, the contribution of adjacent fields to the construction of professional knowledge is revealed. For example, many fields owe a debt to statistics, psychology, mathematics, sociology, physics, economics, politics, and religion, and this debt can be seen in the language that is used.

Second, regarding the social layer, every professional group, like other social groups, is also formed by the establishment of an internal role structure, group identity, group attitudes, and group norms. The need for a professional identity, for a professional sense of ‘us-ness,’ for separation from the out-group, has of course played an important role in the construction of professional group language and constantly motivates people to adapt and be socialized into professional group behavior. Socialization into a group also means establishing distance from people outside the group.

The use of medical scientific language during different periods is thus related to the type and level of the scientific community (the social group), its size, structure, degree of professionalization, degree of internationalization, degree and nature of mutual contacts, existence of publications, etc. Third, as regards the macrosocial or societal layer, each professional group also stands in a particular relationship to the society in which it operates; it exerts certain functions and is given a certain place within that society. The members of a profession play a role in relation to other actors in society, and the professional group acts in relation to other groups. They play – or do not play – a role on the political scene, within the business world, the education system, in relation to the media, etc. And this cluster of societal functions is essential for language. It is through language that professional groups exert their societal function. If they are going to play a role on the political scene, they have to construct their communicative behavior in a way that is adequate for that purpose. Their relationship to written texts and spoken discourse and to different genres is also important. Professionals adapt to established genres, but are also involved in forming new genres. The way in which language is used within science during different periods is linked with the relationship of the scientist and the scientific community to society in general. Factors that are important are the status of the profession and its role in society, both the status of the scientific community and to what degree and in what way it is integrated into society. The societal layer is thus related to economic and political factors. It is related to power and status patterns in the particular society, i.e., the nation-state, as well as on the global scene. 
The three layers are strongly related to the emergence and continuous recreation of professional language, and they constitute a part of the construction of professional language and discourse. Historically, language is constructed in relation to all these layers. The cognitive establishment of the field takes place at the same time as the professions fight for their place in society and for the strengthening of their group in relation to other groups.

Stages in the Development of Medical Science

Medical knowledge and practice took a great step forward in the 17th and in particular in the 18th century. However, it was only gradually that it developed into a science in the modern sense. Since the 18th century, all societies have undergone radical change. Changes have also taken place within the medical scientific community:

1. Medical knowledge has grown immensely.
2. Science in general and the philosophy of science have undergone changes. Statistics and empirical methods have developed. Positivism has become the only accepted view in many sciences.
3. The medical profession has gradually become increasingly established and recognized. Today, doctors are considered highly valuable professionals, and medical scientists and medical research are considered highly important to society.
4. The medical scientific community has become much larger. The number of doctors, medical scientists and students of medicine has increased, as has the number of medical journals and conferences.

Important changes have thus taken place relating to medical science, science in general, the medical profession, and the medical-scientific community. This article will explore how language and discourse are essential elements in the construction of science, in profession building and in the shaping of the scientific community, and how academic genres play important roles in constructing scientific knowledge and the role of the scientist in society, and in the growth and strengthening of the social network among scientists.

Changes in language and text patterns will be discussed in relation to three scientific stages: the preestablishment stage, the establishing stage, and the specialized stage. For each layer the three stages can be summarized on a developmental axis:

Cognitive layer: Individual findings → Accumulation of findings → Integration into theory
Social layer: Isolated researchers → Academic grouping → Advanced scientific community
Macrosocial/societal layer: Scientists function within society → Scientists function within society and academic groupings → Scientists function within the scientific community

Scientificality in Swedish Medical Articles 1730–1985

The empirical results referred to are based on studies on Swedish medical language carried out at Uppsala University. The whole corpus analyzed comprises a total of 360 scientific and popular articles from three fields – economics, technology and medicine – and six periods (Gunnarsson, 1998). The medical subcorpus which will be focused on here consists of 60 scientific articles, 10 from each of the six periods 1730–1799, 1800–1849, 1850–1880, 1895–1905, 1935–1945, and 1975–1985. All these articles come from scientific journals and deal with pulmonary diseases (30 articles) or skin diseases (30 articles). Our analyses have focused on four text linguistic levels: the cognitive, the pragmatic, the macrothematic and the microsemantic. We have also made analyses relating to the vocabulary and terminology (Gunnarsson, 1998).

Content and Content Structuring of the Texts

The content and content structuring of medical articles will be related to the stage reached by the domain of medical science, in terms of degree and type of scientificality, and also the role of scientists in society. The cognitive analysis examines the content of the text in relation to five ‘cognitive worlds’: a scientific world, a practical world, an object world, a private world, and an external world (cf. Gunnarsson, 1992). Within each of these, two or three aspects were discerned:

World       Aspect
Scientific  Theory, Classification, Experiment
Practical   Work, Interaction
Object      Phenomenon-focused, Part-focused, Whole-focused
Private     Experience, Conditions
External    Conditions, Measures

Each proposition in the articles was categorized in relation to world and aspect. We could thus calculate the proportions of the total number of propositions representing each world and each aspect in texts from different periods. The analysis showed a very clear increase in the proportion of each text devoted to the scientific world, that is, to the presentation of ‘theory,’ ‘classification,’ and ‘experiment,’ over the periods. On the other hand, there was a clear decrease in the role of the external world in particular, that is in the proportions of texts dealing with ‘conditions’ and ‘measures’ of a political, economic, and social nature. There was also an increase in the proportion of ‘experiment/observation’ within the scientific world and a decrease in the proportion of ‘measures’ within the external world.
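The proportional calculation described above can be illustrated with a minimal Python sketch. It assumes that each proposition has already been coded by hand for period and cognitive world. The function name `world_proportions` and the few coded propositions are hypothetical, invented purely for illustration; they do not reproduce the Uppsala data or its results.

```python
from collections import Counter, defaultdict

# Hypothetical hand-coded data: one (period, world) pair per proposition.
# These values are invented for illustration and are not taken from the study.
coded_propositions = [
    ("1730-1799", "external"),
    ("1730-1799", "external"),
    ("1730-1799", "practical"),
    ("1730-1799", "scientific"),
    ("1975-1985", "scientific"),
    ("1975-1985", "scientific"),
    ("1975-1985", "scientific"),
    ("1975-1985", "object"),
]


def world_proportions(coded):
    """Return {period: {world: share of that period's propositions}}."""
    counts = defaultdict(Counter)
    for period, world in coded:
        counts[period][world] += 1
    proportions = {}
    for period, world_counts in counts.items():
        total = sum(world_counts.values())
        proportions[period] = {
            world: n / total for world, n in world_counts.items()
        }
    return proportions


result = world_proportions(coded_propositions)
for period in sorted(result):
    print(period, result[period])
```

The same tally could be repeated with aspect labels in place of world labels to obtain the finer-grained proportions mentioned in the text.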

The cognitive analysis also comprised an analysis of text content in relation to four time dimensions, ‘cause,’ ‘phenomenon,’ ‘process,’ and ‘result,’ which showed that the proportions of each text devoted to describing ‘causes’ and ‘phenomena’ have decreased over time, while the proportions devoted to ‘processes’ and ‘results’ have increased. Another analysis focused on the macrothematic structure. The content of the medical articles was categorized in relation to four superthemes, ‘introduction,’ ‘theme development,’ ‘discussion,’ and ‘conclusions.’ This analysis revealed an increase as regards how much of each text is devoted to the superthemes ‘introduction’ and ‘theme development’ (that is, to a description of materials, methods, results). The proportion devoted to ‘discussion’ and in particular to ‘conclusions,’ on the other hand, had decreased. A third analysis, which aimed at a description of the pragmatic character of the texts, e.g., the types of illocution present, pointed to an increase as regards ‘informative’ and ‘explicative’ illocutions and a decrease in ‘expressive,’ ‘argumentative,’ and ‘directive’ illocutions. To sum up, the changes in the content and content structuring of Swedish medical articles from 1730 to 1985 show the following tendencies: more ‘scientific world,’ less ‘external world’; more ‘experiment,’ fewer social, political, and economic ‘measures’; more ‘process’ and ‘results,’ less ‘cause’ and ‘phenomenon’; larger proportion of ‘theme development,’ smaller proportion of ‘discussion’ and ‘conclusions’; more ‘informative’ and ‘explicative’ illocutions, fewer ‘expressive’ and ‘directive’ illocutions. These findings relating to changes in the medical article genre point for one thing to a development of medical science. The knowledge structure of the texts appears to have changed, to include more emphasis on experiment and on process and results. 
There is also a trend toward a genre of a more purely descriptive character, in which the main part of the text is devoted to developing the theme, that is, to a description of the experiment, observations, etc. These are features which could be related to a positivist scientific ideal. All these results can be related to the cognitive layer of the construction of academic discourse.

These results also reveal the role played by scientists in society. In terms of text content, the proportion devoted to the external world and external measures has decreased, as has the proportion devoted to conclusions and directives. Such results can be discussed in light of the specialization and professionalization of society. Compared with earlier periods, scientists today act to a greater extent in a discourse community of their own. Science in general and medical science in particular are accepted and highly esteemed in modern society. Considerable funding is given to medical research. The role of large-scale experiments has increased. The discourse changes can be related to this endeavor among medical scientists to become specialists, and a profession of their own, with their own exclusive domain to deal with. A high degree of scientificality in spoken and written discourse provides prestige. A more purely scientific genre has emerged. Scientist-writers have turned toward their own group, and the medical article genre has become a within-science genre. The popularization of medical findings is taken care of by others – trained journalists. Scientists can write for their own group, and do not have to bother about a growing distance between the lay public and the experts. The article genre has become more exclusively internal, and less concerned with reaching out to other sectors of society. These results can also be interpreted in light of the interplay between the cognitive and societal layers. The role played by the medical profession in society interrelates with the presentation of scientific content.

Formal Organization of the Text

The formal organization of the texts and their rhetorical patterns will be related to the stage reached in the development of the medical scientific community. A strong scientific community reveals itself in firm genre conventions, that is, in more homogeneous texts, and also in explicit markings of belonging to a group. The number and types of headings in the modern Swedish articles vary over time. The use of section headings has increased dramatically. Also the type of heading has changed. In the early periods, headings relate to the content of the article, while in the modern article they relate to its structure: ‘Material,’ ‘Methods,’ ‘Results,’ ‘Discussion,’ ‘Conclusions.’ The modern headings thus structure the presentation in a general scientific way, which also reflects a more homogeneous organization of the texts.

An increasing homogeneity is also found in relation to the thematic article structure (Gunnarsson, 1993). The articles from the period around 1980 (1975–1985) were clearly more homogeneous in terms of their linear text structure than earlier articles. Also the introductions in articles from different periods revealed a gradually greater homogeneity. This homogeneity can also be seen in a contrastive perspective; that is, the Swedish pattern has come to resemble that of the English scientific article, as described in Swales (1990).

Another finding relates to the information flow of the text. An analysis of the connection between content structure and graphical disposition in articles from different periods showed that each sentence has become more independent with regard to the surrounding text. It introduces a new angle, which means that it becomes less integrated with its neighboring sentences (Melander, 1993). We thus find a change toward a fact-listing or ‘catalogue’ style in the modern article. This tendency toward a catalogue style can be seen as another feature reflecting firmer genre conventions. When texts are organized in a homogeneous and predictable way, there is less need to elaborate on the details. Readers know where in the text they will find different types of content. The tendency toward a more catalogue-like article could also be seen as indicating a strengthening of the scientific community. The knowledge of this community is well established among the specialist readers, and need not be elaborated on.

Other analyses reveal that the number of references per article has increased over time, and that the way of presenting them has become more homogeneous. Another tendency relates to the changed use of personal pronouns. In the articles from the 18th century, the pronoun ‘I’ was quite a frequent word, while it has more or less totally disappeared in the articles from the latter part of the 20th century. The pronoun ‘we,’ on the other hand, which was quite unusual during the first two periods, has since the middle of the 19th century had a similar rank. The author’s explicit marking of article relevance has also changed over time. Here we find a shift from a societal orientation in earlier periods to a more internal orientation in the last period (Gunnarsson, 1998).
To sum up, the changes as to the formal text organization and the rhetorical patterns show the following diachronic tendencies: more headings; more homogeneous text organization; more homogeneous article introductions; more fact listing; more references; less use of the personal pronoun ‘I’; more relevance marking relating to the ‘group.’

The medical article has developed toward greater homogeneity – relating to the use of headings, the superthematic text structure, the rhetorical structure of introductions, etc. – which indicates a strengthening of genre conventions. The medical article has become more established as a genre, and its genre conventions have become firmer and thus more homogeneous. This strengthening of the academic article genre, however, is also a sign of a growing and stronger medical discourse community. For the medical discourse community, as for most scientific discourse communities, writing plays an essential role as a group marker, and the establishing of firmer conventions for written text genres is part of the growth and strengthening of this community. The trend toward a more fact-listing and catalogue type of article can also be seen as a sign of a stronger discourse community, in the sense of being more homogeneous and closed. It is a well-known fact within sociolinguistics that communication within a dense group or network needs to be less explicit and elaborated than communication within one more open and less dense.

The modern habit of giving references to the works of colleagues is another sign of a strong discourse community, a discourse community with a clear group feeling. When the group is essential to its members, it becomes important to indicate one’s sense of belonging and one’s relationship to other group members. Problems relating to the group also become more important than those relating to the world outside the group. The modern tendency to list references, to use ‘we’ instead of ‘I,’ and to mark relevance in relation to one’s own group can be viewed from this social perspective. I would suggest that these text features are part of the construction of an increasingly close-knit (dense) medical discourse community. There is also a connection between these features and the role of medical scientists in society; that is to say, the strengthening of the professional group is paralleled by a process of gradual specialization of the professions. These features are thus also part of the construction of a role for the medical community within society. We can thus see how the social and societal layers interact. Strengthening of the internal group structure is interrelated with the underlining of a role for the group in society.

Linguistic Expressions of Evaluations

The linguistic expressions of evaluation and its variation over time will be related to the positioning of the scientist/author on the developmental axes for the three contextual dimensions. The study referred to was based on an analysis of 30 Swedish scientific medical articles from six periods. All the articles dealt with pulmonary diseases. The study, which comprised an analysis of evaluations linked to descriptions of the subjects studied, the diseases and treatments, the introduction of the author’s own initiatives, and descriptions of the research and findings of others, focused on three main aspects: (1) what is being evaluated, (2) through whom the evaluation is taking place, and (3) how the evaluation is being made (Gunnarsson, 2001a).

In articles from all the periods, the object of the study and the initiatives of other researchers were evaluated. The author also referred to his own initiatives in most articles. From a diachronic perspective, however, it is more interesting to consider the second aspect; that is, through whom the evaluation is taking place (author’s own voice, author through others, author through facts). A comparison of the medical texts from the 18th, 19th, and 20th centuries shows that in the earliest texts the evaluation is made by the author himself, using his own voice, whereas in later articles it is allowed to emerge indirectly via facts from others, e.g., in references to the other articles. A change over time is also found in relation to the third aspect, how the evaluation is being made. When articles from the 18th century are compared to articles from the 20th century, we find a weakening tendency; that is, the evaluations in articles from 1730 are expressed in a more severe way. A more obvious change, however, relates to the degree of certainty. Here we find a discernible increase in the use of hedges and other expressions of caution over time (cf. Salager-Meyer, 1994). We thus find a progressive moderation of the author’s own voice in the medical articles; in other words, increasingly the focus is placed on facts. There is another clear change in the author’s relationship to facts, which is revealed in an increase in the frequency of markers of epistemic modality. These tendencies were also found in analysis of word frequencies. As mentioned earlier, there is a change in the use of the personal pronouns jag ‘I’ and vi ‘we’ in the articles over time. The occurrences of the pronoun jag are reduced by half between the first period (18th century) and the fourth period (1895–1905) and disappear completely during the sixth period (1975–1985). 
To some extent the pronoun ‘I’ is replaced by the pronoun ‘we.’ In this case, however, the increased use of ‘we’ is mainly explained neither by the use of reader-inclusive ‘we’ nor by co-authorship. It could rather be linked to the progressive phasing out of authorial identity in scientific prose. A comparison of the frequencies of a number of markers of modality in the Swedish medical corpus revealed an increase over time. All nine markers – torde ‘is probably,’ tyder ‘suggests,’ tycks ‘seems,’ tänkbar ‘conceivable,’ tveksam ‘doubtful,’ sannolik ‘likely,’ sannolikhet ‘likelihood,’ möjlig ‘possible,’ möjlighet ‘possibility’ – revealed a linear increase in frequency over the six periods. This increasing tendency to be cautious can of course be seen as a sign of the progressive extension of medical knowledge; that is, it can be related to the author’s placement on the knowledge axis. The greater the body of collective knowledge, the more aware authors are of its relativity. But it could also be linked to circumstances within the social group, in this case the medical community. In order to survive in a competitive society, which is what the world of medical research undeniably is, one must be careful not to lose face, and take care not to threaten the face of others. Ideas of this kind are proposed in Myers (1989). Myers claims that in order to survive in the competitive academic world, modern scientists adopt pragmatic politeness strategies, and that Brown and Levinson’s concepts of ‘face saving’ and ‘face threatening’ are also relevant in the analysis of scientific texts (Brown and Levinson, 1987). Scientists tread a narrow path between the need to emphasize their own achievements on the one hand and to criticize those of their peers on the other. It may well be that the difference in the wording results from the increased knowledge scientists now possess about illnesses and their treatment, i.e., that the difference can be linked to the cognitive dimension. Or it may result from greater awareness of the importance of politeness in a large and well-developed scientific community, i.e., the difference can be linked to the social dimension; doctors/researchers admittedly make evaluations, but they avoid expressing them subjectively and straightforwardly and choose greater objectiveness, thus showing more caution.

Lastly, the variation in linguistic expressions of evaluations will be systematically related to the three scientific stages and the three layers distinguished earlier, and articles from the 18th, 19th, and 20th centuries will be placed on developmental axes. Table 1 illustrates the relationship between text and context for medical scientific articles during three centuries.

Table 1 Text and context during three centuries

Dimension    1700:1                         1800:2                                                1900:3
Cognitive    Individual findings            Accumulation of findings                              Theoretical integration
Social       Isolated researchers           Academic groupings                                    Developed scientific community
Macrosocial  Scientists act within society  Scientists act within society and academic groupings  Scientists act within the scientific community

In the articles from the 18th century, we encounter a number of different individuals – the author himself, his colleagues, and his patients – and their experiences and judgments are described.
The typical article is full of explicit, severe, and assured evaluations which concern the object of the study – the illness and method – and also the advocates of the method, its naïve practitioners. In the way the author writes, he places himself fairly obviously toward the left of all three contextual axes in Table 1; he treats individual findings as if they exist per se, he describes himself and his colleagues as isolated researchers, and he seems to act within society rather than the scientific community.

In the articles from the 19th century, the typical author adopts a considerably more analytical attitude to the research of others. The author himself figures as an evaluator. He also explicitly adduces the opinions of other researchers. The evaluations are of medium severity and the author marks his doubts in different ways. The author is squarely in the middle of the contextual axes.

In the articles from the 20th century (around 1980), the typical author does not express himself in his own voice or explicitly through others. Evaluations take the form of the presentation of facts, supported by references to other works. Summaries of the research of others form an integral part of the description of the illness/method. What characterizes this and other articles in the subcorpus from this period is above all the attitude adopted to facts. The evaluations are not few in number, but they are weak to medium in severity rather than severe, and they are presented throughout as less certain – in other words, these authors should be placed to the right on the contextual axes.

Conclusions

Language constructs science in relation to the cognitive layer (the scientific content), the societal layer (scientists’ role in society), and the social layer (relations within the group). This construction process has been in progress since the first doctors tried to establish themselves as medical scientists and it is still continuing. In Sweden this process began in the 17th century. However, it was not until the middle of the 18th century that Sweden became a national writing community. Before 1740, the language of the learned was Latin, but in the Era of Liberty, from the middle of the 18th century, Swedish was gradually accepted as a scientific language, and the construction of medical science and the medical scientific community was related over a long period to the development of the Swedish medical article as a genre. This article has focused on this phase in Swedish medical history, an account which ended in 1985.

What has taken place since then is an accelerating Anglicization of the academic writing community in Sweden. English is now used in medicine as the medium for Ph.D. theses, for conference abstracts and papers, and for articles presenting original research (Gunnarsson, 2001b). Läkartidningen, the Swedish medical journal, still exists, but is no longer the prime forum for presentations of new findings. The Swedish medical scientists of today choose to present their research in English in international medical journals. When they write in Läkartidningen, they do so for purposes other than presenting original research findings. The Swedish medical scientific community has thus turned diglossic; that is, English is used for certain purposes and Swedish for others. In the Swedish medical journal, articles give overviews and present research relating to basic diseases, but this is no longer a journal for the first presentation of new research (Gunnarsson et al., 1995).

A development of the kind described here is not country specific. The shift from Latin to the national language took place around the same time in most Western countries, and the modern spread of English as a scientific language is universal (Ammon, 2001). The Anglicization of the medical scientific community and the accelerating use of the Internet as a communicative tool have led to an intensified globalization and also homogenization of science and scientific language. From a sociohistorical perspective this development is most interesting and in the future will certainly lead to important investigations.
See also: Cognitive Linguistics; Cognitive Science and Philosophy of Language; Constructivism; Evaluation in Text; Genre and Genre Analysis; Macrostructure; Pragmatics and Semantics; Society and Language: Overview; Text and Text Analysis; Text World Theory.

Bibliography

Ammon U (ed.) (2001). The dominance of English as a language of science: effects on other languages and language communities. Berlin/New York: Mouton de Gruyter.
Atkinson D (1999). Scientific discourse in sociohistorical context: the Philosophical Transactions of the Royal Society of London 1675–1975. Mahwah, NJ: Lawrence Erlbaum Associates.
Bazerman C (1988). Shaping written knowledge: the genre and activity of the experimental article in science. Madison: University of Wisconsin Press.
Bazerman C & Paradis J (eds.) (1991). Textual dynamics of the professions: historical and contemporary studies of writing in professional communities. Madison: University of Wisconsin Press.
Berkenkotter C & Huckin T N (1995). Genre knowledge in disciplinary communication: cognition/culture/power. Hillsdale, NJ: Lawrence Erlbaum Associates.
Bhatia V K (1993). Analyzing genre: language use in professional settings. London/New York: Longman.
Brown P & Levinson S C (1987). Politeness: some universals in language usage. Cambridge: Cambridge University Press.
Collin J G (1942). ‘Underrättelser om Asthma thymicum.’ Hygeia 6, 256–271.
Gunnarsson B-L (1992). ‘Linguistic change within cognitive worlds.’ In Kellermann G & Morrissey M D (eds.) Diachrony within synchrony: language history and cognition. Frankfurt: Peter Lang. 205–228.
Gunnarsson B-L (1993). ‘Pragmatic and macrothematic patterns in science and popular science: a diachronic study of articles from three fields.’ In Ghadessy M (ed.) Register analysis: theory and practice. London/New York: Pinter. 165–179.
Gunnarsson B-L (1997). ‘On the sociohistorical construction of scientific discourse.’ In Gunnarsson B-L, Linell P & Nordberg B (eds.) The construction of professional discourse. London/New York: Longman. 99–126.
Gunnarsson B-L (1998). ‘Academic discourse in changing context frames: the construction and development of a genre.’ In Evangelisti Allori P (ed.) Academic discourse in Europe: thought processes and linguistic realisations. Rome: Bulzoni. 19–42.
Gunnarsson B-L (2001a). ‘Expressing criticism and evaluation during three centuries.’ Journal of Historical Pragmatics 2(1), 115–139.
Gunnarsson B-L (2001b). ‘Swedish, English, French or German: the language situation at Swedish universities.’ In Ammon U (ed.) The dominance of English as a language of science: effects on other languages and language communities. Berlin/New York: Mouton de Gruyter. 229–316.
Gunnarsson B-L, Bäcklund I & Andersson B (1995). ‘Texts in European writing communities.’ In Gunnarsson B-L & Bäcklund I (eds.) Writing in academic contexts. Uppsala: Uppsala Universitet. 30–53.
Gunnarsson B-L, Linell P & Nordberg B (eds.) (1997). The construction of professional discourse. London/New York: Longman.
Knorr-Cetina K (1981). The manufacture of knowledge. Oxford: Pergamon Press.
Latour B & Woolgar S (1986). Laboratory life: the construction of scientific facts. Princeton, NJ: Princeton University Press.
Lindström F & Schildt B (1980). ‘Förenklad behandling av pneumotorax med Heimlich-ventil.’ Läkartidningen, 999–1001.
MacDonald S P (1994). Professional academic writing in the humanities and social sciences. Carbondale: Southern Illinois University Press.
Mauranen A (1993). Cultural differences in academic rhetoric: a textlinguistic study. Frankfurt: Peter Lang.
Melander B (1991). Innehållsmönster i svenska facktexter. Uppsala: Uppsala Universitet.
Melander B (1993). From interpretation to enumeration of facts: on a change in the textual patterns of Swedish LSP texts during the 20th century. Uppsala: Uppsala Universitet.
Myers G (1989). ‘The pragmatics of politeness in scientific articles.’ Applied Linguistics 10(1), 1–35.
Näslund H (1991). Referens och koherens i svenska facktexter. Uppsala: Uppsala Universitet.
Salager-Meyer F (1994). ‘Hedges and textual communicative function in medical English written discourse.’ English for Specific Purposes 13(2), 149–170.
Schröder H (ed.) (1991). Subject-oriented texts: languages for special purposes and text theory. Berlin/New York: Walter de Gruyter.
Swales J M (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press.
Valle E (1999). A collective intelligence: the life sciences in the Royal Society as a scientific discourse community 1665–1965. Turku: University of Turku.

Medical Discourse: Structured Abstracts
F Salager-Meyer, Universidad de Los Andes and Hospital Universitario de Los Andes, Mérida, Venezuela
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Since the inception of serious scientific publication in 1665, there has been a continued attempt to improve both the content and the presentation of articles. The peer review process (which became widespread only after World War II and has been adopted by over two-thirds of biomedical journals) and the adoption of the IMRAD pattern (Introduction, Materials and Methods, Results, and Discussion) in the late 1960s represent two such attempts. Numerous voices have been raised lately, however, against the peer review process and the IMRAD formula. Peer review indeed carries the recognized danger of delay, bias, and expense, yet it remains the best method of evaluating scientific work thus far and it has survived for over 300 years (Lock, 1988). As regards the IMRAD pattern, it has been called a straitjacket around the author that impedes creativity and personality. Yet the IMRAD pattern allows readers to answer the fundamental questions.

In the 1960s, the Journal of the American Medical Association took an innovative step toward improving biomedical communication by moving the summary and conclusions of articles to the beginning. The Canadian Medical Association Journal also adopted this abstract format. The latest major development involving abstracts, in the late 1980s, was the introduction of the so-called structured abstract, currently required for all medical papers reporting clinical trials. This concept was worked out by a McMaster University team and the Annals of Internal Medicine. What is the motivation that lies behind such an initiative?

The Pivotal Role of Abstracts

No experienced editor believes that readers of his or her journal go through each issue article by article, word by word. Some readers probably scan the tables of contents, decide which articles, if any, merit their further attention, and scan the titles and abstracts of selected papers. Abstracts of research articles are indeed by far the most widely read parts of scientific papers. Much of the time, the abstract is the only part that is read. According to Stephen Lock (1988), former editor of the British Medical Journal, in practice, most readers of scientific papers are content to read the paper’s title and abstract, casting an eye over the remaining sections. The ubiquitous availability and widespread use of online literature search mechanisms, which provide an often untruncated abstract, have done nothing but increase this likelihood.

The abstract, then, has a pivotal role not only in answering the fundamental questions readers ask themselves when reading a scientific article (What is the study’s objective? What is the method used? What are the results obtained? And what are the conclusions reached?), but also in being able to stand on its own as a packet of information. As noted above, this latter function is all the more important now because many online databases and medical journals do not provide the full articles (unless one pays for them) but only the title, the authors’ names, some bibliographical details, and the abstract. In view of its importance, then, the accuracy of information provided by the abstract is critical.

Some of the abstracts that accompany medical papers are adequate for these purposes, but many are not. In her study on the structure of medical abstracts, for example, Salager-Meyer (1990) found that 48% of them were poorly structured. Poorly structured abstracts were defined as those that did not follow the structural guidelines for abstract writing indicated in the journal’s Instructions for Authors/Contributors. The most frequent structural flaws that Salager-Meyer and her specialist informants identified were the following: (1) lack of a fundamental move (e.g., the objective, the methods, the results, or the conclusions of the study); (2) conceptual overlapping (e.g., presentation of partial results, statement of methods, and presentation of some more results); (3) lack of two fundamental moves (e.g., the objective and the conclusions of the study); and (4) illogical structuring (e.g., the methods move precedes the statement of purpose, or the objective of the research is presented after the results move).

Using objective criteria, other investigators (Narine et al., 1991) assessed the quality of traditional or conventional abstracts of the 33 original research articles published in the Canadian Medical Association Journal in 1989. The mean overall score was 0.63 out of 1. Abstracts were found to be deficient in the reporting of technical descriptors of study design, study variables, and subject selection. Results were often reported without supporting data and many abstracts failed to address study limitations and recommendations for future study. It was against this background that the concept of the ‘structured abstract’ arose.

Definition and Objective of a Structured Abstract

The idea of a structured abstract, i.e., an abstract that describes a study using specified content headings rather than pure paragraph format, was suggested by the Ad Hoc Working Group for Critical Appraisal of the Medical Literature in 1987 to provide more information for articles reporting original research of medical care, to help health professionals to quickly assess the reliability and content of a clinical report, to facilitate peer review, and to aid accurate indexing and retrieval of reports from computerized databases such as MEDLINE and EMBASE.

Structured Abstract of Clinical Trials

Such abstracts concisely report key aspects of the purpose, methods, and results of a trial in a consistent way, using a standard glossary of terms (e.g., cohort, cost–benefit analysis, randomized). The structured abstract for clinical trials should mention seven key aspects by means of requisite content headings. We thus have a seven-part tabular format such as:

1. Objective: states the question(s) addressed in the paper;
2. Design: indicates the basic design of the study (blind, double-blind, randomized, cross-over, etc.);
3. Setting: mentions the place where the research was carried out;
4. Patients or participants: indicates the number of patients who were enrolled in the study, how they were selected, and how they were distributed per group;
5. Main outcome measures: explains the treatment and its administration route;
6. Results: mentions the study end-points;
7. Conclusions: refers to the main conclusions, including direct clinical applications.

Some medical journals, though, such as the Lancet, accept a five-part tabular format, in which each ‘move’ is likewise clearly preceded by an explicit heading:

Background (30 words). A summary of the general topic and the purpose or hypotheses of the study.
Methods (50 words). A description of the materials (generic names of drugs and equipment should be used, unless the particular brands are crucial to the study), the methods (including the type of study design), and the subjects (important eligibility criteria, number, and selection process).
Results (100 words). A statement of the primary results of the study; the types of analysis used should be indicated, as should levels of statistical significance and confidence intervals.
Conclusions (30 words). A statement of the conclusions (the answers to the hypotheses posed at the beginning of the study). Only the conclusions that are directly supported by the evidence provided by the study should be included. Any need for further study should be indicated.
Clinical Implications (30 words). A description of what the conclusions imply for clinical practice.

Rapid adoption of structured abstracts by journals resulted in an annual doubling, from 1989 through 1991, of the number of reports with structured abstracts appearing in MEDLINE; 15% of these were reports of clinical trials.

Structured Abstract of Review Articles

Mulrow et al. (1988) also proposed standardized guidelines for informative review abstracts, i.e., abstracts of review papers. The proposed structure is the following:

1. Context or background: brief statement of the state-of-the-art;
2. Objective: the primary objective of the review article;
3. Data sources: a succinct summary of data sources;
4. Study selection and data extraction: the number of studies selected for review, how they were selected, the type of guidelines used for extracting data, and how they were applied;
5. Data synthesis: the methods of data synthesis and key results;
6. Conclusions: key conclusions, including potential applications and research needs.

The following excerpts are taken from Mulrow et al.’s (1988) paper. They give an example of an original (unstructured) abstract and of its structured equivalent:

Original Abstract

Use of digitalis for the treatment of patients with congestive heart failure and sinus rhythm remains controversial. To ascertain the proper therapeutic role of digitalis, we have critically appraised the published clinical evidence of digitalis efficacy using standardized methodologic criteria. A search of the English literature between 1960 and 1982 identified 736 articles, of which 16 specifically addressed the clinical evaluation of digitalis therapy for patients with congestive heart failure and sinus rhythm. Only two double-blind, placebo-controlled trials provided clinically useful information. One study showed that digoxin therapy could be withdrawn successfully in elderly patients with stable congestive heart failure. The other showed that patients with chronic heart failure and an S3 gallop benefited from digoxin therapy.

Structured Abstract

Purpose: To ascertain the clinical benefits of digitalis treatment in patients with chronic congestive heart failure and sinus rhythm.
Data identification: An English-language literature search using MEDLINE (1966–1982), Index Medicus (1960–1965), and bibliographic reviews of textbooks and review articles.
Study selection: After independent review by three observers, 16 out of 736 originally identified articles were selected that specifically addressed the stated purpose.
Data extraction: Three observers independently assessed studies using explicit methodologic criteria for evaluating the quality of clinical trials.
Results of data synthesis: Because of deficient selection criteria and study methods in 14 studies, therapeutic efficacy could not be adequately assessed. Two randomized, double-blind, placebo-controlled studies suggested that digitalis could be successfully withdrawn from elderly patients with stable heart failure, whereas patients with an S3 gallop might benefit from digitalis.
Conclusions: The benefits of digitalis treatment for patients with congestive heart failure and sinus rhythm are not well established. To better delineate the therapeutic benefits of digitalis, investigators must conduct more rigorously designed trials involving patients with newly diagnosed failure and varying degrees of failure.

As the examples show, the original and the proposed structured abstract differ in both information and organization. The structured abstract is longer, by approximately 48%. Qualitatively speaking, structured abstracts present greater detail: the purpose is more clearly defined and the methods of identifying, selecting, and extracting information are better delineated.
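Because a structured abstract labels each move with an explicit heading, its layout can be checked mechanically. The Python sketch below is purely illustrative and is not part of the Mulrow et al. (1988) proposal: it splits an abstract into its labelled sections and reports which headings are absent. The function names (split_sections, missing_headings) and the decision to treat the headings of the worked example above as a fixed, case-insensitive list are assumptions made for this illustration.

```python
import re

# Headings taken from the structured review abstract quoted above.
# Treating them as a fixed list is an assumption of this sketch.
REVIEW_HEADINGS = [
    "Purpose",
    "Data identification",
    "Study selection",
    "Data extraction",
    "Results of data synthesis",
    "Conclusions",
]


def split_sections(abstract, headings):
    """Map each heading found in `abstract` to the text that follows it."""
    # Try longer headings first so a short heading cannot shadow a longer one.
    ordered = sorted(headings, key=len, reverse=True)
    alternatives = "|".join(re.escape(h) for h in ordered)
    pattern = re.compile(rf"\b({alternatives})\s*:", re.IGNORECASE)
    canonical = {h.lower(): h for h in headings}

    matches = list(pattern.finditer(abstract))
    sections = {}
    for i, match in enumerate(matches):
        # A section runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(abstract)
        name = canonical[match.group(1).lower()]
        sections[name] = abstract[match.end():end].strip()
    return sections


def missing_headings(abstract, headings):
    """Return the expected headings that do not occur in `abstract`."""
    found = split_sections(abstract, headings)
    return [h for h in headings if h not in found]


if __name__ == "__main__":
    sample = (
        "Purpose: To test. "
        "Study selection: 16 of 736 articles. "
        "Conclusions: Unclear."
    )
    print(split_sections(sample, REVIEW_HEADINGS))
    print(missing_headings(sample, REVIEW_HEADINGS))
```

A check of this kind only establishes that the headings are present; whether each section actually contains the information its heading promises still requires a human reader, which is the concern of the quality studies discussed in the next section.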

Evaluation of Structured Abstracts: Their Advantages over the Nonstructured Format

Comans and Overbeke (1990), using criteria from the Ad Hoc Working Group, analyzed structured abstracts of original research articles published in three major medical journals: the British Medical Journal, the New England Journal of Medicine, and the Annals of Internal Medicine. They found that the abstracts were clearly written, but lacked information about sample selection, patients’ demographics, and statistical analysis.

As a follow-up of the above studies, Taddio et al. (1994) set out to assess and compare the quality of a random sample of nonstructured and structured abstracts in the British Medical Journal, the Canadian Medical Association Journal, and the Journal of the American Medical Association over four selected years. The structured abstracts received significantly higher quality scores than the nonstructured abstracts, which suggests that the structured abstract is preferable to the conventional, nonstructured format in providing complete information. These authors note that the higher quality scores for structured abstracts may be a direct result of the design of structured abstracts. Indeed, the design provides a framework for the information that should be included and prompts the reader in retrieving this information. Taddio et al.’s study found that the following information was reported more frequently with the structured format than with the nonstructured format: study purpose, setting, number of dropouts, interventions, study variables, appropriate numeric and statistical values, and conclusions. The authors note, however, that although the structured abstracts more consistently met the assessment criteria, they did not meet all of them. For both abstract styles, there is room for improvement in this regard. Imperfect quality scores for structured abstracts may result from the limited space allotted to abstracts in medical journals (usually 250 words) and from researchers’ inability to summarize their research concisely.

In the case of review articles, the advantages of a structured abstract over a conventional abstract are the following (cf. Mulrow et al., 1988): First, readers can efficiently identify reviews that are relevant to their own interest and that are scientifically sound; potential sources of bias can be detected; and results and conclusions can be critically appraised by evaluating the methods of identifying, selecting, assessing, and integrating information. Second, authors are given a framework that will help them concisely organize and present the results of their investigations. A greater awareness of the vital elements of a good review article and clearer, more detailed summarizations of review articles may result.

With regard to reading comprehension, it has been shown that structured abstracts significantly improve the reading comprehension of nonnative English-speaking scientists (NNES), but only in a highly specialized context (Salager-Meyer, 1994). In other words, in a familiar context, the structure of the abstract does not have a direct bearing on reading comprehension: a conventional, nonstructured abstract will be as well understood as a structured abstract if the concepts expressed in the abstract are familiar to the reader. Conversely, if the concepts expressed are highly specialized, then NNES will understand a structured abstract better, and in less time, than a nonstructured one.

Conclusions

All in all, then, although there have been objections (based on length and aesthetic concerns) to the structured format, because of its numerous advantages the structured format has been adopted widely in one form or another, and there is general acceptance that it is more informative than the unstructured variety. Structured abstracts improve both the quality of the paper and the reader’s understanding; they result in easier recognition of relevant and valid articles, more precise and efficient computerized literature searches, and quicker, more consistent peer-review processes. Structured abstracts, moreover, unmask methodological problems that were left out in traditional abstracts.

See also: Discourse Studies: Second Language; Genre and Genre Analysis; Languages for Specific Purposes; Macrostructure; Rhetorical Structure Theory; Text and Text Analysis.

Bibliography

Ad Hoc Working Group for Critical Appraisal of the Medical Literature (1987). ‘A proposal for more informative abstracts of clinical articles.’ Annals of Internal Medicine 106, 598–604.
Comans M L & Overbeke A J (1990). ‘The structured summary: A tool for reader and author.’ Ned Tijdschr Geneeskd 134, 2338–2343.
Huth E J (1987). ‘Structured abstracts for papers reporting clinical trials.’ Annals of Internal Medicine 106, 626–627.
Lock S (1988). ‘Structured abstract: Now required for all papers reporting clinical trials.’ British Medical Journal 297, 156.
Mulrow C D, Thacker S B & Pugh J A (1988). ‘Proposal for more informative abstracts of review articles.’ Annals of Internal Medicine 108, 613–615.
Narine L, Yee D S & Einarson T R (1991). ‘Quality of abstracts of original research articles in CMAJ in 1989.’ Canadian Medical Association Journal 144, 449–453.
Salager-Meyer F (1990). ‘Discoursal flaws in medical English abstracts: A genre analysis per research and text-type.’ TEXT 10(4), 365–384.
Salager-Meyer F (1994). ‘Reading medical English abstracts: A genre study of the interaction between structural variables and the reader’s linguistico-conceptual competence (L2).’ Journal of Research in Reading 17(2), 120–146.
Taddio A, Pain T, Fassos F F, Boon H, Ilersich A L & Einarson T R (1994). ‘Quality of structured and unstructured abstracts of original research articles in the British Medical Journal, the Canadian Medical Association Journal, and the Journal of the American Medical Association.’ Canadian Medical Association Journal 150, 1611–1615.

Medical English: Conferencing
P Webber, Università degli Studi di Roma ‘La Sapienza,’ Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Background

Linguists have devoted more attention to the study of written science than to the oral communication of research. In the medical field, this is partly because reading abilities and providing access to the literature are seen widely as priorities for medical professionals, not only for research purposes but also for clinical practice. The topic has not been entirely neglected, however. Dubois (1980a, 1980b, 1982, 1985, 1987) published a series of articles on short biomedical presentations. These treated many aspects of conference language, from the genre and structure of biomedical talks and intonation to the use of imprecise numerical expressions. Dubois described both linguistic and nonlinguistic aspects of short papers presented at the Federation of American Societies for Experimental Biology. Among these is a study of nontechnical arguments found in the talks. Starting from the observation that the main purpose of this discourse is to persuade a skeptical audience, she noted the “pre-speech arguments,” such as the prestige of the speakers and the title and abstract of their paper, and the “speech arguments,” such as the display of confidence in the delivery of the talk, presentation of self as a reliable authority on the topic, and the judicious use of a variety of styles ranging from formal to informal. Finally, there are “post-speech arguments” brought to bear during the discussion phase if the speaker is competent at answering the questions.

Apart from paper presentations, Dubois also published an article on the poster session (1985). She mentioned the reasons for the growing popularity of poster sessions and described some principles of layout and presentation of posters, designed to attract attention and present experiments as clearly and concisely as possible. Presenters stand near their poster while the other participants mill around, stopping to ask questions.
This kind of session is less threatening and stressful than speaking from the podium, particularly for novices. Dubois described some typical linguistic realizations of the interchange and noted the great potential of the poster session for classroom use (see also Webber, 1986 and Shalom, 1996). The pioneering work by Dubois did not lead to any further research until some years later, when more studies appeared, written by Shalom (1993, 1995, 1996) and Ventola (1993, 1996). Although these are mostly on conferences in other fields, they are of interest because the characteristics described in monologues from other disciplines are often relevant to medical discourse. Shalom analyzed research process genres and participant roles in ecology and humanities conferences, describing the main characteristics of the discussion and the new genre of the poster session discussion. She noted that discussants seemed to focus on certain aspects of their research listed as ‘Focus Categories.’ An interesting problem in the analysis is the distinction between accepted knowledge and assumptions. At these events where data at the frontiers of research are presented, the experts may not always agree about what is “accepted knowledge.” The distinction proved to be so fuzzy that the two were collapsed into the single category of ‘assumptions.’ The role of the Chair (Shalom, 1995) is central to chaired sessions, such as paper presentations. The Chair opens and closes the session and introduces the next speaker, so that turns are partly pre-allocated. If there is a discussion phase, the Chair opens the paper to the floor and allocates discussants, who bid for turns to ask questions. This is very different from the joint construction of conversation. The Chair operates through a series of discourse moves that may be either procedural, serving to manage the discussion and keep speakers to a rigid time schedule, or social, fostering social cohesion in the group. Thompson (1994b) included some medical data in a study of three important aspects of cohesion in monologue: clause relations, lexico-grammatical cohesion, and intonation. These three aspects are interrelated and are drawn on by the speaker to signal the overall discourse pattern, thus aiding listeners in real-time processing of the text. The role of intonation is particularly important in spoken communication of science as it also conveys part of the meaning of the text.
This partly compensates for the lack of graphic features that are present in written texts, such as text layout. Ventola (1996) analyzed the discussion phase and the linguistic relations between paper presentations and their discussions as speech events, addressing the problem of classifying them as linked or separate speech events. Ventola et al. (2002) edited a book entirely devoted to conference language, in which the chapters by Rowley-Jolivet (2002) and Webber (2002) include descriptions of medical data. The latter provides an analysis of the question and answer session. This


post-presentation discussion phase presents difficulties for both the presenter and the discussant asking the question, because the development of the interaction is unpredictable and underlying conflicts may be brought into the open. Questions expressing criticism were found in this study to be quite frequent (30% of the total), showing that the discussion phase often represents an opportunity to challenge or attack other researchers. As noted by Rowley-Jolivet (2002), the work presented is often in a preliminary stage and may be considered an example of ‘‘science in the making.’’ She traced the construction of scientific knowledge claims from their early stages in the laboratory to their fully fledged stage as generally accepted facts published in refereed journals. In this development, she assigned the status of proto-claim to the knowledge claims put forward in conference presentations. The same author has published a study of visuals in conferences.

Session Types

The genres of conference discourse vary considerably from one discipline to another, but most include paper presentations of research, plenary lectures, and poster sessions. Some conferences also include state-of-the-art lectures, symposium talks, workshops, current issues, round-table talks, meet-the-professor sessions, fireside panels, and luncheon panels. The conditions of production of these different types also vary considerably. Thus, for paper presentations and posters, an abstract must be submitted to the organizing committee before the meeting. It must be written according to the association’s regulations and competes with other abstracts for admission. In fact, only a small percentage are accepted. Abstracts are not submitted for plenaries or symposia. Papers and symposia are followed by a discussion phase, whereas plenaries are not. This is summarized for the main session types in Table 1.

Table 1 Characteristics of main session types

Type                 Abstract   Discussion               Included in abstract book
Plenary              No         No                       Title only
Paper presentation   Yes        Yes                      Yes
Poster               Yes        (Personal interaction)   Yes
Symposium            No         Yes                      Title only

Papers are in some ways similar to written medical articles, studied by Adams Smith (1984), Salager Meyer (1993), and Nwogu (1997). They usually start from what is known on the topic, to position the study, and then indicate an unknown or gap before moving on to provide an explanation or solution, which is usually evaluated at the end of the text. However, oral presentations often devote less space to positioning and to the conclusions, whereas they focus more on the data presented in the slides. The speaker’s claims and commitment are stated explicitly and forcefully, often using the present tense, which gives the impression of a live issue. Table 2 shows an example of the schematic moves in a paper presentation. The plenary is different. Speakers have been invited, have submitted only the title of the lecture, and are free to follow any format of their choice. Moreover, the plenary does not have to compete for an audience as there are no concurrent sessions. The plenary is therefore likely to be given by an experienced lecturer of prestige. Lecturers will often give a chronological overview of the way the field has evolved over the years, ending with comments on the outlook for the future. They provide the emerging view of current issues. As a result of the less rigid text structure, they tend to use more overt text markers providing a clearly organized framework.

Features of Conference Language Some speakers prefer to read from a text written out before the conference and many rehearse their talk carefully. In fact, the talks retain features of writtenness generally considered characteristic of the scientific register, such as detachment, precision, and a high occurrence of specialist terminology. However, they differ from the written article owing to the

Table 2 Schematic move analysis of a paper presentation

1 Background
2 Definition
3 Rationale
4 Gap
5 Purpose and announcement of present study
6 Hypothesis
7 Methods
8 Subjects
9 Results
10 Details of equipment
11 More results
12 Claim and stance
13 Implications for research and practice


co-presence of the interactants. This leads to more expressions of uncertainty and hence modality, including modal auxiliaries such as might and subjective forms of modality such as I think or we are not sure. Epistemic devices and hedging (see Medical Discourse: Hedges) are used partly for reasons of politeness, to give an impression of modesty and caution. However, they also reflect more exactly the speaker’s confidence in the propositions and the reliability of the data. Webber (2004) compared a conference abstract, a paper presentation, and a research article on the same topic. The strategies used in the three text types are affected by the purpose of the text and by the limited space and time available for the two conference genres. The conference abstract is a very condensed version of the work presented. It is submitted before the conference and, to compete with other abstracts, it tends to stress the novelty and interestingness of the study described. It focuses on the purpose and main claims of the research and hence starts immediately with a statement of the purpose. The oral presentation includes features of politeness and opening and closing frames, a part of the interactional ritual of the conference. The opening and closing frames serve to greet the audience or express thanks, such as it is pleasure to be in this beautiful city and Thank you, Mr. Chairman. In the research article, themes are longer, more complex, and less suitable for oral communication. The results and conclusion sections are longer and devote more space to comparing the results with those of other researchers and accounting for discrepancies. Time constraints are important factors, especially in paper presentations, which are generally allotted only 10 to 15 minutes for the presentation and 5 minutes for discussion. 
Mastering conference skills thus presents difficulties for speakers and listeners, since much cognitively demanding information is presented in a short time and, apart from the discussion phase, listeners have no opportunity to check or ask for confirmation. This is particularly difficult for non-native speakers of English when faced with the task of understanding a range of different accents as presenters follow one after another in rapid succession, especially if the listeners have learned English in an idealized, perhaps written form.

Conclusion Certain features are typical of oral communication of medical research. There are frequent instances of language in action, as speakers refer to the slides. As already noted by Dubois, slides are ‘‘the backbone of the speech.’’ They are an important distinguishing

element in conferences, as they not only provide the evidence on which the researchers base their argument but also aid the listener in following the content, together with intonation, lexical cohesion, and explicit signaling of the discourse organization of the text using markers such as This talk will focus on . . . and now let me turn just briefly to . . . Each slide generally introduces a new unit of discourse. The co-presence of participants leads speakers to use features such as narrative elements, informal language, hesitancy markers, occasional anecdotes, jokes, and references to difficulties encountered in doing the research. These are partly a result of on-the-spot decisions, may be motivated by both conscious and unconscious choices, and realize the role of equals. Most of these would not be countenanced in written articles and are edited out of the final version if it eventually is published. Features of informality are also found. Among the informal features found by Dubois in her material are imprecise numerical expressions, such as instances of hedging (a little, a bit, around) and ranges (anywhere from 150 to 200), used not only because aural processing of conference talks is difficult but also because speakers wish to highlight certain data and downplay others that they consider less important (see also Medical Discourse: Hedges). Much of the preliminary research presented may not be published. This unpublished work is thus nipped in the bud and never reaches a wider audience. The conference may be the only opportunity to hear about this new research. Thus, novelty is one of the main characteristics of the conference. The choice of language in the titles and conference abstract is designed to try to make the contents appear newsworthy. Particularly in the many concurrent sessions there is competition to attract an audience. 
In addition to the competition, however, there is also a desire to create a sense of solidarity and in-group affiliation with the audience, where the speakers tend to present themselves as persons and not just researchers. Hence, we find instances of first-name familiarity and a frequent use of personalization, such as the use of the first-person singular and lexical phrases such as in my opinion, which lower interpersonal distance and create a sense of rapport in a situation that represents a challenge and involves risk. Webber (1997a) found many instances of conversational features, particularly in the discussion section. The discussion, however, is not a true conversation because serious business is being accomplished and it includes transactional as well as interactional work. The strategies used in different types of medical English vary in ways often characteristic of the genre that they represent. The linguistic behavior observed is partly


rule governed in the sense that members of the discourse community will expect a conventional form of discourse from participants. The discourse is also affected in part by the participants’ motives for attending these conferences. Participants wish to keep abreast of important developments in their field, but the conference is also a social occasion where people can meet the speakers and contact other researchers to exchange news and views. The coffee breaks and conference dinners are thus also important parts of the conference. All these aspects make the study of conference language complex but fascinating. See also: Lingua Francas as Second Languages; Medical Discourse: Hedges.

Bibliography

Adams Smith D (1984). ‘Medical discourse: Aspects of author’s comment.’ ESP Journal 3, 25–36.
Dubois B (1980a). ‘Genre and structure of biomedical speeches.’ Forum Linguisticum 5, 140–169.
Dubois B (1980b). ‘Nontechnical arguments in biomedical speeches.’ Perspectives in Biology and Medicine 24, 399–410.
Dubois B (1982). ‘And the last slide please: Regulatory function at biomedical meetings.’ World Language English 1, 263–268.
Dubois B (1985). ‘Popularization at the highest level: Poster sessions at biomedical meetings.’ International Journal of the Sociology of Language 56, 67–85.
Dubois B (1987). ‘Something on the order of around forty to forty-four: Imprecise numerical expressions in biomedical slide talks.’ Language in Society 16, 527–541.
Heino A, Tervonen E & Tommola J (2002). ‘Metadiscourse in academic conference presentations.’ In Ventola et al. (eds.). 127–146.
Nwogu K (1997). ‘The medical research paper: Structure and functions.’ English for Specific Purposes Journal 16(2), 118–138.
Rowley-Jolivet E (2002). ‘Science in the making: Scientific conference presentations and the construction of facts.’ In Ventola et al. (eds.). 95–125.
Rowley-Jolivet E (2004). ‘Visual textual patterns in scientific conference presentations.’ In Banks D (ed.) Text and texture. Paris: L’Harmattan. 383–410.
Salager Meyer F (1993). ‘A text-type and move analysis study of verb tense and modality distribution in medical English abstracts.’ English for Specific Purposes 11, 93–114.
Shalom C (1993). ‘Established and evolving spoken research process genres: Plenary lecture and poster session discussions at academic conferences.’ English for Specific Purposes 12, 37–50.
Shalom C (1995). ‘The discourse management role of the Chair in academic conference presentation sessions.’ Interface: Journal of Applied Linguistics 10(1), 47–62.
Shalom C (1996). ‘Poster presentations on the pre-sessional EAP course.’ In Hewings M & Dudley-Evans T (eds.) Evaluations and course design in EAP. London: Prentice Hall Macmillan. 96–104.
Thompson S (1994a). ‘Frameworks and contexts: A genre-based approach to analyzing lecture introductions.’ English for Specific Purposes 13(2), 171–186.
Thompson S (1994b). ‘Aspects of cohesion in monologue.’ Applied Linguistics 15(1), 58–75.
Ventola E (1993). “Any questions?” – Discourse features of discussion time. Paper delivered at AILA, Amsterdam.
Ventola E (1996). Discussing discussions. Paper presented at the International Conference of Pragmatics, Mexico City, Mexico.
Ventola E, Shalom C & Thompson S (2002). The language of conferencing. Frankfurt: Peter Lang.
Webber P (1986). ‘The poster session as a technique for integrating skills in an ESP course.’ Modern English Teacher 14(1), 46–48.
Webber P (1997a). Casual conversation features in scientific conference presentations. Unpublished M.Sc. diss., Aston University, Birmingham.
Webber P (1997b). ‘From argumentation to argument: Interaction in the conference hall.’ Asp GERAS, 15–18.
Webber P (2002). ‘The paper is now open for discussion.’ In Ventola et al. (eds.). 227–253.
Webber P (2004). ‘From spoken science to published research article: A comparative case study.’ In Banks D (ed.) Text and texture. Paris: L’Harmattan. 571–596.


Medical Journals, Letters to the Editor D Carnet and A Magnet, University of Burgundy, Dijon, France © 2006 Elsevier Ltd. All rights reserved.

Introduction

Letters to the editor are one of the seven genres identified in medical journals (Webber, 1994), alongside research papers, review articles, editorials, book reviews, case studies, and the news section. They are a tool offered to the community to react to other scientists’ research and, above all, to express personal opinions and disagreement. The experimental article, in contrast, is known as a constrained and highly codified mode of written communication and was set up as a genre following J. Swales’s (1990) archetypal studies. Thus, letters to the editor offer a much freer mode of expression than the classical scientific rhetoric, which is described as objective, purely referential, impersonal, and detached. Not only do they offer researchers an open forum in which to give full vent to sometimes harsh criticism, but they also contribute to the validation process of new research by ‘invisible colleges’ (Crane, 1972). Although research papers are evaluated before being published, much of the process of evaluation comes after publication. Thus, evaluation goes on formally through their citation in other papers that undergo the same process of review and informally in letters to the editor published in medical journals (Bloch, 2003) (see Medical Discourse and Academic Genres; Collocations; Discourse Domain).

Historical Background of the Genre

Origin of These Letters

Letters can be traced back to the very beginnings of scientific writing. The first scientific publications were letters enabling scholars interested in scientific observation and experiments to exchange information. At the end of the 17th century, researchers published exclusively through correspondence printed in journals. Indeed, letters have been described as “a longstanding tradition in scientific journals written in a predominantly personal style, with use of the first person” (Webber, 1994: 258); and in most medical journals, letter publishing began in the 19th century, generally at the same time as the birth of these journals. For instance, the first letters appeared in issue 2 of The Lancet, on October 12, 1823, in a section entitled ‘Miscellaneous.’ They first appeared in 1848 in the Transactions of the American Medical Association, which became The Journal of the American Medical Association (JAMA) in 1883; in the British Medical Journal (BMJ) in 1857; and in the New England Journal of Medicine (NEJM) in 1928. Letters to the editor can be considered a long-standing yet lively genre updated through time.

A Habit Inherited from the Quality Newspapers

This type of correspondence shares common features with letters published in the quality papers of the European press and can be defined as a paper or article presented as a letter, dealing with a contentious issue in a controversial tone written by a member of the public. A diachronic overview shows that over time the letters have increased in length, number, and regularity. However, the number of letters published in each issue may still vary on average from 5 to 20 letters, depending on the journal.

A Complement to Research Articles

The raison d’être of these letters also evolved over the years, as they progressively became a genuine and mature genre. From mere clarifications aiming to provide further knowledge on a given research topic, they gradually became a tool for questioning previously validated research. They have developed into a complementary, and sometimes alternative or transient, strategy used to build a niche, establish a position, and defend it in the scientific community.

Characterization of the Genre These letters are present in most medical journals. However, they do not always bear the same name. Investigating the top medical journals following the impact factor as established by the Science Citation Index (SCI) shows that the most common title used to label this genre is ‘Correspondence.’ It is called this, for example, in Nature Medicine, The Lancet, The New England Journal of Medicine, The Journal of the National Cancer Institute, and Gastroenterology. However, other titles can also be observed, such as ‘Letters,’ in The Journal of the American Medical Association and ‘Reader’s comments’ in The American Journal of Cardiology. Letters to the editor do not belong to the epistolary genre. They are not really letters but a way to express one’s opinion or to set something straight. They often start with the characteristic head shared with the epistolary genre, ‘Sir’ or ‘Dear Sir.’ Interestingly enough, it can be observed that ‘Dear Sir’ typifies American journals, whereas ‘Sir’ is the standard form in British ones. Quite a lot of other journals publish this correspondence with the heading ‘To


the Editor,’ and, as in any scientific writing, communication actually builds up between a scientist and his or her community and not between two individuals. This communication mode should, in fact, be considered as deriving from ‘the open letter’ genre, thus borrowing some of the polemic journalistic features, and not be viewed as life writing, characteristic of the epistolary genre.

Writing Conventions Medical editorial boards very often provide specific guidelines for potential authors. These indicate the maximum length (between 300 and 1000 words), the deadline to react to a published article (between 1 and 6 months), and the nature of the issues that can be tackled. They can either be comment letters on what has been previously published in the journal or letters of general interest unrelated to earlier items in the journal. Letters can be modified by the editor without any prior agreement from the authors, and letter writers are generally not consulted before publication. The letter must not have been published (including on the Internet) nor be under consideration for publication elsewhere, and letters that contain unpublished data are refused by some journals. Correspondence letters are not usually peer reviewed, but the journal may invite replies from the authors of the original publication. In a few journals, it is even specified that only one table or figure is permitted and that no more than five references and five authors should be included. Authors are also required to suggest a title.
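Where a journal states such limits explicitly, they amount to a short checklist that a submission either meets or fails. The following Python sketch is purely illustrative: the names LetterGuidelines, Letter, months_between, and check_letter are hypothetical, and the default values are assumptions picked from within the ranges quoted above, not the policy of any actual journal.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LetterGuidelines:
    """Hypothetical per-journal limits; real journals set their own values."""
    max_words: int = 400           # assumed; reported limits range from 300 to 1000 words
    max_months_after: int = 3      # assumed; reported deadlines range from 1 to 6 months
    max_tables_or_figures: int = 1
    max_references: int = 5
    max_authors: int = 5


@dataclass(frozen=True)
class Letter:
    title: str
    body: str
    authors: tuple
    references: int
    tables_or_figures: int
    submitted: date
    article_published: date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def check_letter(letter: Letter, rules: LetterGuidelines) -> list:
    """Return the guideline violations found (an empty list if none)."""
    problems = []
    if not letter.title.strip():
        problems.append("no title suggested")
    if len(letter.body.split()) > rules.max_words:
        problems.append("too many words")
    if months_between(letter.article_published, letter.submitted) > rules.max_months_after:
        problems.append("deadline to react has passed")
    if letter.tables_or_figures > rules.max_tables_or_figures:
        problems.append("too many tables or figures")
    if letter.references > rules.max_references:
        problems.append("too many references")
    if len(letter.authors) > rules.max_authors:
        problems.append("too many authors")
    return problems
```

Because the limits differ from one journal to another, they are kept in a separate object here rather than written into the checks themselves.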

Objectives and Strategies

In the scientific community at large, scientific claim is essentially expressed by the research paper through the use of validated protocols, by the publishing of experimental findings in journals ranked according to their impact factor, and eventually through the reviewing process. Letters to the editor represent a form of contradiction per se of the research paper. They appear to be on the fringes of the community-validated discourse, but they may be interpreted as a dynamic process leading to discourse production as well. Controversy generally concerns the experimental method selected, the duration of the experiment (most of the time too short for financial reasons), the number of experimental subjects (most of the time too few for the same reasons), and results too flimsy or too frail to be exploited. Indeed, they do not only occur as a reaction to the published discourse. They can also precede the publication of scientific papers because they very often relaunch new research

on a given theme. This type of discourse can therefore be seen as constructive contradiction. Authors’ Objectives

The objective for an author is not merely to write a letter and let other scientists know his or her opinion about a specific topic. If this were the case, such letters would have disappeared long ago and would perhaps have been replaced by e-mails, which are currently a much more direct and rapid way to communicate. Conversely, correspondence in scientific journals does not aim to resolve private issues. The letter writers require the entire community to witness and even take part in their public debate. The strategy of questioning can be understood as an explicit and implicit mode of criticism leveled at an established scientific fact. It is the quickest way for them to position themselves on a research theme considered to be ‘hot science’ on which they are also working, because having a paper published takes several months at least. To submit an article, scientists must have gathered enough evidence through their results, whereas a letter will allow them to stake out their territory before all the conclusive evidence is available for publishing. It may also be a means for researchers to highlight or rehighlight their own published research, which otherwise might be discredited or ignored due to the publication of an article that contradicts their own results. Writing letters to the editor may appear to be an irrepressible need for some scientists because it allows them to react swiftly, personally, and sometimes contentiously to issues about which they feel strongly concerned.

A Community’s Strategy

It is noteworthy that editors still grant space to correspondence in their journals even though the publishers of medical journals generally complain about the increasing size, and hence the publishing cost, of scientific journals. The editor acts as a vector, and the letters are used as tools in the general process of scientific validation; they can be considered as the aftermath of the construction of the scientific fact. Indeed, in contrast to scientific conference forums or discussion forums on the net, their journals offer permanent records of a debate. They are public, can always be consulted (either in the paper version or the online version of the journals), and can even be quoted as references in bibliographies. Readers’ Motivations

Most researchers read letters not systematically but regularly in one or several journals. It is a matter of


keeping informed about the hot issues in their fields and learning about the contentious debates within their specialist community. Researchers do not systematically read the correspondence section after the publication of an article they have written. This shows that they are not particularly interested in the reactions their papers may trigger or that the validation granted by the reviewers brings them enough satisfaction and confidence in their research. Skimming letters seems to be a quick and efficient means for researchers to keep up to date with the new paradigms, to hear about current controversies in their communities, and to learn about various conflicting reactions to previously published articles.

An Author-Reader Relationship

Such letters reveal underlying conflicts in the community that are totally ignored or hidden in the experimental research paper. Several linguists (Hyland, 2000; Magnet and Carnet, 2001; Bloch, 2003) suggest that this major difference necessarily induces specific features in the writing.

The research paper focuses on the object of the research and, hence, any form of subjectivity tends to be erased. It almost looks as if there were no subjects in charge of the experiment. On the contrary, letters to the editor are presented as a pseudodialog in which addressers speak in their own names. Addressees are supposedly the editors of the journals, but, in reality, letters are sent to the community of specialists at large. This personalized form of communication is therefore characterized by the recurrent use of the first-person pronouns we and, to a lesser extent, I; by the intensive use of the possessive our; and by the frequent opposition between we and you, we and they, or we and he. These are clear signs of the opposition between the scientific fact as reported in the primary article and that proposed by the authors of the letter. With this opposition, the strategy aims to weaken the credibility of the scientific fact as established by the experimental paper. This process allows the authors of these letters to prove that there is not only one established fact but at least two.

A Chronological Approach

In the research paper, chronology is totally erased. In contrast, in letters to the editor, the authors are relatively time-conscious. Generally, the letter follows a chronological approach linked to the contentious character of the genre. It mimics more closely, in a way, the prosecution and plea process observed in courts; here, the board of editors and the scientific community at large act as jurors. Letters can thus play a role in the questioning process of the scientific community. They offer the scientist a way to react publicly, almost in real time. They can, in some cases, be a step toward publishing results in the form of an experimental paper.

A Researcher-Centered Genre

Contrary to the primary scientific article, which aims to establish scientific facts, letters to the editor tend to be focused on the researchers themselves and their position in the community. They take into account the authors’ reputation in the community, for example, and disagreement on scientific issues may disguise clashes of interests. From research papers it may look as if the sole aim of researchers is scientific progress, whereas from letters to the editor conflicts and competition become explicit. The experimental paper can therefore be described as research-centered, whereas letters to the editor tend to be researcher-centered. This is why self-mentioning and self-quoting are common practices in this genre. Moreover, it is worth noting that most of the time letters are signed by only one scientist, unlike research articles, which most of the time are the fruit of collaboration in the medical field. This change in focus influences linguistic choices and induces cultural specificities.

Discursive Features

A Modelized Macrostructure

A basic feature of letters to the editor is their underlying macrostructure, whose most common pattern can be described in four moves. Adapting J. Swales’s famous descriptive analysis mode of the conceptual organization of the research paper (Swales, 1981), the following recurrent model can be suggested. The first move is a reminder of the contested published results. The other three reproduce the logical argumentative pattern: thesis – antithesis – synthesis. Thus, the second move aims at expressing the challenge. The third develops arguments backed by the author’s own research. Finally, in the fourth move, the author urges his or her colleagues to reconsider the initial findings of the study under attack before reaching their own conclusion.

Linguistic Features

Specific Verbal Forms

An interesting point is the very low occurrence of passive structures in letters to the editor. Scientific discourse is generally characterized by a heavy use of the passive, especially in the ‘Materials and Methods’ and the ‘Results’ sections of papers. The reason for this is to focus on the object of the experiment, rather than on the subject, to give an objective value to the


published research. In letters to the editor, on the contrary, the emphasis is put on the choices made by the criticized authors and thereby on the authors of the letters themselves. Thus, selecting active forms to build an argument strongly reinforces the contentious mode. The most common grammatical tense used is the simple present (approximately 50% of the cases), and this again contrasts sharply with the research article, in which 80% of the verbs are in the simple past. In letters to the editor, the simple past is used in only 35–45% of the cases to report the experiments carried out by the criticized authors or by the authors themselves. The simple present is chosen to express the reality of the article in question. It is neither the historical present nor narrative present; it refers to the established scientific fact. This peculiar use of the present tense is close to a journalistic style. It has no connection with chronological time and follows Quirk’s (1997) interpretation: ‘‘The implication of the present tense seems to be that although the communication event took place in the past, its results – the information communicated – is still operative. Thus the notion that the past can remain alive in the present also explains the optional use of the present tense in sentences referring to writers and composers and their extant works.’’ Modality

Modals represent only 10% of the verbal structures, lower than the percentage in research articles’ ‘Discussion’ sections, in which they represent 14% of all verbal forms (Magnet, 1992) and in which they most often have an epistemic value.

Epistemic Modality or Hedging

Scientific discourse is generally used to weigh evidence and draw conclusions from data. Thus, uncertainty and doubt are necessarily present at least in the ‘Discussion’ section of experimental papers. This is expressed through hedges, which account for various degrees of probability (Salager-Meyer, 1994, 1995). Scientific discourse deals with the problem of what is true or false; the particular value of this modality is that it expresses the utterer’s lack of certainty regarding the validation of the predicative relation. In letters to the editor, in contrast, epistemic modality (hedging) has a low occurrence. The most frequently used modals are should, could, may, and would and to a lesser extent can, must, will, and might.

Root Modality

In contrast to epistemic modals, root modals may convey a deontic meaning to indicate a form of moral advice, expressing, in fact, strong pressure from the utterer on the criticized authors. Linguistically, it thus concerns the relationship between the grammatical subject of the utterance and the predicate. Root modals are not concerned with the truthfulness of the subject-predicate relation but, rather, serve to express orders, wishes, suggestions, causality, or capacity. Indeed, the most widely accepted view is that hedging is the process whereby authors tone down their statement in order to reduce the risk of opposition, so it is understandable that, in letters to the editor, authors should express themselves instead through root modality. This type of moral advice through which the utterer exerts pressure on the criticized authors applies to objects and hence makes distancing possible. However, root modality can also bear on the animate subject, here, the criticized scientist.

(1a) ‘‘First the degree of oxidant stress should be assessed to choose an effective antioxidant regimen.’’ (The Lancet, 2001: 631)
(1b) ‘‘Respectfully, I must suggest we should all beware of inaccurate invectives.’’ (The Lancet, 2001: 647)

Modality of Assertion

Positive assertion needs no particular marker, whereas any manipulation of the assertion calls for a specific marker. One of the most recurrent forms of manipulation in letters to the editor is contradiction, linguistically expressed through negation. The aim of the authors is to deny the original utterance or rebut its scientific relevance. Negation may be total or partial. It may apply to verbs or nouns, to the criticized research, or even to the researchers themselves.

(2) ‘‘It makes no sense to me that it is possible to reliably detect the so-called independent associations of individual constituents of this group, when . . .’’ (AJCN, 2000: 849)

Qualifying Modality

Qualifying modality concerns judgments about the content of the predicative relation, for example, whether it is good, bad, normal, abnormal, fortunate, or unfortunate. This is essentially qualitative modality, and in letters to the editor it mostly conveys a derogatory judgment.

(3) ‘‘The method they use does not seem to be suited to the end results.’’ (The Lancet, 2001: 553)

Adjectives and Adverbs

In contrast to the depersonalized style observed in the experimental paper, giving vent to direct criticism in letters to the editor is accepted by the community. These critiques borrow from the derogatory style, illustrated by the use of specific figures of speech. One of the common ways to belittle one’s opponents is to use highly disparaging terms to qualify their work, such as poorly, mistakenly, biased, emotive, confusing, too simplistic, old and outmoded, artificial, vague, speculative, and sad.

Nouns and Verbs

Letters also reveal a massive use of certain nouns and verbs that are totally absent from the research paper because these derogatory terms usually belong to the critical style and reflect strong subjectivity. Nouns such as critique, rebuttal, borderline, reductionism, and blurring and verbs such as refute, rebut, fail to, contend, disagree, reject, challenge, and invalidate are common in letters.

Negative Prefixes in Adjectives

In standard English, the adjective is a linguistic tool mainly used to specify and qualify the notion expressed by the noun. In scientific discourse at large, adjectives mostly express a quantitative value, whereas in letters to the editor, an extensive use of qualitative adjectives can be observed. Interestingly enough, most of them carry a negative prefix whose aim is to weaken the arguments set out in the targeted paper. Examples of these prefixes are in- (inappropriate, inaccurate, inconsistent, incomplete, intemperate, incorrect, implausible, etc.), un- (unreliable, unexpected, unproven, unsupported, unclear, unaware, unfounded, unfortunate, etc.), out- (outmoded, etc.), under- (underpowered, understated, etc.), and mis- (misleading, misused, misdirected, etc.).

Specific Markers

Specific markers are necessary to build an argument. In letters to the editor, these markers fall into four groups that all express disagreement but with different levels of intensity. These markers correspond to four discursive strategies.

• Concession. The weakest markers used to contradict somebody’s opinion express concession. They are common to all forms of discourse, including the research article. The most recurrent forms present in letters to the editor are, for example, although, however, but, yet, nevertheless, nonetheless, even if, and even though. These markers are used to diminish or belittle the impact of published observations and conclusions.
• Antithesis. Authors may wish to express their opposition in a stronger way. In order to do so, they select markers such as but, while, whereas, conversely, by contrast, in contrast, otherwise, instead, unlike, and opposite. This linguistic device aims to oppose two ideas, two protocols, or two methodologies to demonstrate the superiority of the supported team.
• Rewording. Some markers are used to reformulate a previous statement and incite the criticized authors to change their minds and possibly their methods or conclusions. It is, in fact, an illustration of the pragmatics of politeness (Myers, 1989) generally observed in scientific discourse. Examples of these markers are rather, better, more accurately, and in other words.
• Doubt. The most subtle way of explicitly questioning a method is to raise doubts concerning the validity of the study. The most common words and expressions used in letters to the editor to mark this are maybe, perhaps, probably, highly unlikely, wonder whether, and far from verified.

Understatements

The use of implicit disagreement can be considered to be a less direct way to modulate contradiction. This weak contentious mode tends to express the utterer’s disagreement with the work under debate through forms that invariably involve a reference back to the utterer. Examples of these forms are we find it surprising that, therefore we strongly suggest, therefore we think, I have several comments, I showed clearly, In my opinion, we believe, we are aware, and we advocate doing this. By asserting their own results, letter writers actually contradict and criticize the results of the studies they are referring to. It can be understood as implicit criticism. This implicit mode of criticism is more frequently observed in American journals than in British ones, and it may be interpreted as a good example of the American tendency toward political correctness.

Medical Terminology

Letters to the editor, along with the other six genres in medical writing, display specific linguistic characteristics, the most salient and obvious feature of which is the strong presence of a medical nomenclature used for labeling and description. The great majority of medical terms are nouns derived from Greek or Latin roots. This recurrent presence of a highly specialized vocabulary makes the letters to the editor as difficult for the layman to understand as the research papers, which confirms that this mode of communication truly belongs to the world of scientific discourse.

(4) ‘‘The suggestion by Ozols that the dose of carboplatin could explain our results is implausible, because the same carboplatin regimens were used in the control arm and the test arm.’’ (The Lancet, 2002: 2088)


Cultural Features

In spite of the common features shared with general scientific discourse, letters to the editor in Anglo-American medical journals allow their authors some freedom of speech and style, which is never found in research papers. Whereas research articles are always built in the same way, with the same apparent objectivity and neutrality, the style of the letters may vary from one journal to another, from one author to another, and even from one country to another.

Differences between American and British Letters

The tone and style used in American and British journals are often in stark contrast. In American journals, considerations of political correctness lead to the focusing of criticism on the object (the experiment or the selected methodology) and not on the subject (the research team). This attitude induces the letters’ authors to use understatement and hedging through modalized forms. In contrast, British journals abound with overtly polemical phrases, which may even be aggressive or sexist (e.g., inappropriate, inaccurate, emotive, poorly designed, retrospective, and biased). This type of overt criticism illustrates both the researcher-centered attitude previously mentioned and the lack of restraint observed in some British letters.

(5a) ‘‘I think that Sandra Simkin’s report is inappropriate, inaccurate, and emotive.’’ (The Lancet, 2001: 641)
(5b) ‘‘We are astounded that such frivolous allegations should have been made by academics from a reputable school of public health.’’ (The Lancet, 2002: 1982)

Titles Borrowing Cultural References

Some letters bear a title that could never be found in a research article. These titles may conjure up a cultural background supposedly shared by the whole scientific community but that actually is familiar only to native speakers. References can be cultural and/or humorous.

Bible Quotations

In some cases, English-speaking writers disguise their biting critique as cultural references by including quotations from the Bible. Even though the Bible is a universal cultural heritage, it is well known that it is more prevalent in Anglo-Saxon culture. It would therefore be difficult for a nonnative speaker to understand the hint and even more so to borrow from the cultural background to express a criticism in such a way. It is through irony that a phrase borrowed from the New Testament ‘‘Ask and it shall be given you; seek and ye shall find’’ (Matthew VII:7; Luke XI:9) is to be interpreted as particularly disparaging toward the methodology of the scrutinized team.

(6) ‘‘Seek and thou shalt find Patino et al. doubt the presence of endotoxin in their study cohort, although we know it to be present in the general population. Higher endotoxin levels in patients with diabetes mellitus and arterial disease may be expected and should be sought.’’ (AJCN, 2000: 1054)

Literary Quotations

English-speaking scientists also share a common literary heritage that they enjoy borrowing from, thus mimicking newspaper headlines. Shakespeare is, of course, a favorite, and, because some quotations now belong to universal culture, it can be assumed that they will be understood by the whole community of researchers, native and non-native speakers alike.

(7) ‘‘Acute hepatitis virus infection: to treat or not to treat.’’ (Gastroenterology, 2004: 1219)

Neologizing

The use of impish neologisms to reveal the inadequacy of a terminology commonly used in the scientific community is truly meant to ridicule their authors. For example, the use of the word vegetarian, considered too vague and inaccurate, is criticized. This leads the authors in (8) to all sorts of ludicrous and wild variations on the theme. This is indeed a humorous criticism of scientific popularization of terminology used by a community that usually claims a highly specialized vocabulary.

(8) ‘‘In an attempt to add more specificity in the scientific literature, various qualifying terms for vegetarian have been used, such as pescovegetarian and lactoovovegetarian. For one person I know who considers himself a vegetarian, an appropriate label might be lactoovopescopoulo-steak-only-when-I-eat-out vegetarian.’’ (AJCN, 2000: 1211)

Sayings

Authors sometimes borrow sayings that can be, for instance, aphorisms, as in (9a). The authors of the letter criticize their colleagues’ lack of professionalism because they were unable to detect subtle changes that are likely to have dramatic health consequences. Others borrow from popular songs, as in (9b).

(9a) ‘‘ ‘Small is beautiful’: α-linolenic acid and eicosapentaenoic acid in man [. . .] Minor changes in dietary lipids may be most significant in thrombosis prevention.’’ (AJCN, 1999: 1169)
(9b) ‘‘Mechanical circulatory support—a long and winding road.’’ (AJCN, 2000: 1222)

Reference to a common non-scientific cultural background characterizes the genre letters to the editor, as opposed to the research article, and may explain why letters built on such cultural references are seldom written by non-native writers. All these techniques are based on irony and humor through the use of phrases borrowed from other genres, generally out of reach for non-native speakers.

Status of the Genre: Diffusion and Scope of Letters

One of the reasons researchers still spend time writing letters to the editor is that their diffusion is much easier and faster than that of research papers. For example, 33% of the letters received by The Lancet are actually published, whereas only 10% of submitted articles end up being published (Mullan, 2003). The refusal of a letter is often justified by the editors as being due to their readers’ lack of interest, but in fact less politically or scientifically correct reasons may also explain rejection (Behe, 2002). For the community, a mere letter on a hot research theme in a highly ranked journal may be the first step toward questioning a well-established paradigm. Indeed, a letter may be the favored means to start open exchange with a competing team. This sort of publishing may influence the standing of some researchers in the community. However, this influence will only be effective if the authors already have a good reputation among the specialists and if their laboratory results are brought as evidence to support their point in the debate. This relational mode is deemed satisfactory and rather efficient because it enables scientists to inform more people, and to do so more quickly, than a mere presentation at a conference would. This may allow researchers to establish a niche before submitting more substantial results for publication. Finally, it also allows some researchers to defend an experimental protocol that has been much criticized by peers.

Trends and Prospects for Letters to the Editor

Letters can be seen as a useful, and even necessary, but not self-sufficient communication tool within the scientific community. They reflect tensions in this community. Their most interesting role is to provide researchers with an outlet for frustrations, oppositions, controversies, and disagreements. Letters enable them to escape from the formal and impersonal constraints of the research article style. Above all, they may constitute a breeding ground for new research by bringing up new ideas, new paths to explore, or new prospects for future collaborations. That is why these letters still play a key role among the scientific community at large, and this also explains why numerous online scientific journals have kept a section devoted to correspondence, which thus remains a genuine and enduring genre.

However, letters to the editor usually do not represent some sort of true validation because, unlike the research paper, there is no reviewing process. This genre does not offer an equal opportunity to the international community of researchers. It is a mode that non-native speakers find hard to master. Irony, humor, and cultural references that are difficult to tackle in a foreign language may explain why native English speakers represent the vast majority of the authors of letters to the editor in English-language journals.

See also: Collocations; Discourse Domain; Genre and Genre Analysis; Medical Discourse and Academic Genres.

Bibliography

Adams Smith D (1984). ‘Medical discourse: aspects of author’s comment.’ ESP Journal 3, 25–36.
Behe M J (2002). ‘Correspondence with science journals: response to critics concerning peer-review.’ Access Research Network. Available at: http://www.arn.org
Bloch J (2003). ‘Creating materials for teaching evaluation in academic writing: using ‘‘letters to the editor’’ in L2 composition courses.’ English for Specific Purposes 22(4), 365–385.
Carnet D & Magnet A (2002). ‘‘‘Letters to the Editor’’: Stratégies d’utilisation par une communauté de chercheurs francophones et tentative de caractérisation du genre.’ ASp (Anglais de Spécialité) 35–36, 89–102.
Crane D (1972). Invisible colleges: diffusion of knowledge in scientific communities. Chicago, IL: University of Chicago Press.
Crystal D (1987). The Cambridge encyclopedia of language. Cambridge, UK: Cambridge University Press.
Crystal D (2003). The Cambridge encyclopedia of the English language. Cambridge, UK: Cambridge University Press.
Halliday M A K & Martin J R (1993). Writing science: literacy and discursive power. Pittsburgh, PA: University of Pittsburgh Press.
Harnad S (1983). Peer commentary on peer review: a case study in scientific quality control. Ann Arbor, MI: Michigan University Press.
Hyland K (1994). ‘Hedging in academic writing and EAP textbooks.’ English for Specific Purposes 13(3), 239–256.
Hyland K (2000). Disciplinary discourses: social interactions in academic writing. Harlow, UK: Longman.
Hyland K (2001). ‘Humble servants of the discipline? Self-mention in research articles.’ English for Specific Purposes 20(3), 207–226.
Latour B (1993). We have never been modern. Cambridge, MA: Harvard University Press.
Magnet A (1992). ‘La discussion de l’article scientifique: quelques aspects linguistiques et discursifs dans trente-six articles de nutrition.’ University of Bordeaux 2, France (unpublished dissertation).
Magnet A & Carnet D (2001). ‘Quelques aspects de la contradiction et de la remise en cause dans le genre ‘‘Letters to the Editor.’’’ ASp (Anglais de Spécialité) 31–34, 51–62.
Magnet A & Carnet D (in review). ‘Letters to the editor: still vigorous after all these years? A presentation of the genre and a survey of its use in a French speaking scientific community.’ English for Specific Purposes.
Mullan Z (2003). ‘Lancet correspondence: old letters, new rules.’ The Lancet 361, 12.
Myers G A (1989). ‘The pragmatics of politeness in scientific articles.’ Applied Linguistics 10, 1–35.
Quirk R, Greenbaum S, Leech G & Svartvik J (1985, 1997). A comprehensive grammar of the English language. London: Longman.
Salager-Meyer F (1994). ‘Hedges and textual communicative function in medical English written discourse.’ English for Specific Purposes 13(2), 149–170.
Salager-Meyer F (1995). ‘I think that perhaps you should: a study of hedges in written scientific discourse.’ Journal of Tesol-France 2(2), 127–143.
Swales J (1990). Genre analysis: English in academic and research settings. Cambridge, UK: Cambridge University Press.
Webber P (1994). ‘The function of questions in different medical journal genres.’ English for Specific Purposes 13(3), 257–268.

Medical Specialty Encounters
E Barton, Wayne State University, Detroit, MI, USA
© 2006 Elsevier Ltd. All rights reserved.

Primary Care Medical Encounters

The majority of research in linguistics on the discourse of medicine has focused on primary care encounters, that is, routine encounters in pediatrics, family medicine, gynecology, and internal medicine that take place in offices or clinics. This research has contributed a great deal to our understanding of how the phases of a routine medical encounter are structured, how its interactional sequences are organized, and how its discourse creates and reflects a variety of dynamic contextual dimensions.

The Structure of Medical Encounters

Following Byrne and Long (1976), ten Have (1989) described the phases in the structure of a routine medical encounter as follows:

• Opening
• Complaint (discovering the reason for the visit)
• Examination (medical history and physical exam)
• Diagnosis
• Treatment or Advice
• Closing.

Each of these phases is associated with particular interactional sequences: social talk in openings and closings, although this small talk is not inconsequential; question–answer for discovering the reason for the visit and conducting the history and physical; medical assessments and explanations for diagnosis; and imperatives and instructions for treatment. This linguistic description of the structure of a medical encounter corresponds roughly to the structure of the medical interview as described in the research and teaching literature in the field of medicine. (There is an enormous amount of research and teaching literature on the topic of medical communication in medicine itself, which I draw on but do not review systematically here. I rely heavily on Aldridge (1999) and Roter and Hall (1992) for this purpose.) Aldridge (1999), for instance, lists the following:

• Introduction
• Chief Complaint
• History
• Review of Systems and Symptoms
• Physical Examination
• Diagnosis
• Treatment Plan
• Closing.

Within the history, Aldridge identifies several areas, including history of the present illness, past medical history, family history, and social-psychological history. Aldridge also notes that the review of systems and inventory of symptoms is typically organized by working head down as the physical examination takes place (HEENT (head, eyes, ears, neck, throat), chest, heart, abdomen, genitalia, skin and extremities, neurological), although the currently recommended practice is to organize this review by organ systems as a separate step in the history (skin, blood and lymph, respiratory, cardiac, vascular, gastrointestinal, urinary, reproductive, musculoskeletal, neurological, endocrine). As in most sources in the medical literature, Aldridge also recommends paying special attention to the psychosocial aspects of the interview, what he euphemistically calls ‘‘embarrassing topics,’’ including substance abuse, alcohol use, and financial affairs.

The Organization of Interactional Sequences in the Medical Encounter

In linguistics, researchers have considered the organization of interactional sequences within different phases of the routine medical encounter in detail. In one of the most cited articles in medical communication, Beckman and Frankel (1984) found that physicians interrupt the patient’s account of the reason for the visit after only 18 seconds, on average, usually truncating the patient’s account of the first reason for coming to the doctor, even though the first reason mentioned is often not the patient’s most important reason (Rost and Frankel, 1993). This quick interruption within the complaint phase is often critiqued as one of the problematic means by which physicians establish interactional dominance in the medical encounter, sometimes to the point of actively marginalizing the patient’s topics, questions, and concerns (Roter and Hall, 1992). However, in a recent analysis of the complaint phase, Halkowski (in press) argued that the chief function of this phase in the encounter is establishing the ‘‘doctorability’’ of the reason for the visit: in other words, patients must legitimate their need for medical attention, what the sociologist Talcott Parsons called access to the sick role. Heritage (in press) further notes that physicians are oriented to establishing the reason for the visit with their first question and look for patients to indicate interactionally that their account is complete before they move on to the next phase of the encounter, which can happen very quickly, especially in a routine encounter. Heritage argues that the length of this phase is less important than the interactional coconstruction of the chief complaint to the satisfaction of both the physician and the patient. In the medical literature, a distinction is made between open-ended and closed-ended questions in what is called the medical interview. 
Physicians most often use an open-ended question to elicit the chief complaint, a question form that allows the patient to formulate an explanatory or narrative response, as in the following examples. [For the convenience of readers from a variety of backgrounds, I present transcribed data in very broad form, regularizing spelling and punctuation for intonation, and not including all features transcribed by the original authors. Three transcription conventions that I should note here are the following: (1) contextual material in double parentheses, e.g., ((physician entering room)), (2) ellipses (. . .) to indicate deleted material from the original source, and (3) double dashes (--) to indicate a break in an utterance, e.g., (you know--).]

DOC: How have you been?
DOC: What brings you in today?
DOC: Tell me about your leg pain. What seems to be the problem?

Roter and Hall (1992) note that physicians use open-ended questions to orient themselves to the purpose of the visit and to gather information in order to begin hypothesis formation toward the differential diagnosis. Physicians then typically move to closed-ended questions for history taking, review of systems, and hypothesis testing of the differential diagnosis, as in the following examples:

DOC: Any heart problems?
DOC: Any shortness of breath?
DOC: Are your leg symptoms worse after standing for a few minutes?

Closed-ended questions are much more frequent than open-ended questions in the routine medical encounter, by a factor of 2 to 3 or more (Roter and Hall, 1992). Although the pedagogical literature on what is called ‘‘patient-centered communication’’ recommends more frequent use of open-ended questions, especially in the complaint phase but also in the history and physical, many physicians continue to regard closed-ended questions as the most efficient means to conduct the medical interview (Roter and Hall, 1992; Frankel and Beckman, 1993; Aldridge, 1999). Linguists have looked at question–answer sequences in the routine medical encounter in great detail. As Ainsworth-Vaughn (2001) pointed out, questions are a powerful means of establishing and maintaining interactional dominance in an encounter: questions can establish and maintain the frame of the encounter as an asymmetrical interview in which the role of the patient is simply to answer questions posed by the physician, and the form of questions can further restrict patients’ contributions, as in a series of closed-ended questions that ask for yes–no or brief answers. In West’s (1984) study of 21 family care encounters, she found that 91% of the questions were asked by physicians and only 9% were asked by patients. Noting further that physicians’ questions are almost always answered, whereas patients’ questions are sometimes ignored in medical encounters

734 Medical Specialty Encounters

(98% of physicians’ questions were answered, whereas only 87% of patients’ questions were answered in her study), West (1984) concluded that the asymmetrical form of the medical interview is achieved in large part through the organization of question–answer sequences. Similarly, looking for question-answer sequences where patients initiate new topics, Frankel (1990) found that only 1% of such questions were posed by patients in a set of primary care encounters and argued that medical encounters show a distinct dispreference for patient-initiated questions. Other researchers have found similar patterns of physician dominance through questions and the ways that this interactional dominance establishes overarching schemas and frames as well as control of topics in the primary care encounter (Fisher and Todd, 1993). This interactional dominance in the discourse of medical encounters has been critiqued as promoting the interests and agenda of the physician within a biomedical model focused solely on disease rather than eliciting the understandings and concerns of the patient within a broader biopsychosocial model of disease as well as the experience of illness, in what the sociologist Elliot Mishler famously termed the conflict between the voice of medicine versus the voice of the lifeworld (Roter and Hall, 1992). Recently, however, researchers have looked more closely at interactional coconstruction in question–answer sequences in the medical encounter, describing the ways that physicians and patients coconstruct the discourse of medicine within an encounter, a coconstruction of the context that is typical of institutional encounters in general (ten Have, 1991; Maynard, 1991; Heritage, 1997).
With respect to question-answer sequences in medical encounters, patients, it seems, use a variety of interactional means not only to answer the question, but also, as Stivers and Heritage (2001) put it, ‘‘answer more than the question’’ and thereby raise their own topics and concerns. For example, in the following exchange, the physician is in the middle of a history sequence with closed-ended questions designed to elicit yes–no answers, but the patient’s answer to one question is an elaborated one: DOC: You have your gall bladder? ... PAT: Well I had a tubular pregnancy once, an’ I was too afraid to even ask ‘em anything about it. An’ so I don’t know what they did.

Stivers and Heritage argued that this expanded answer not only provides information but also signals that the patient is afraid of surgical procedures and afraid to talk about them. Boyd and Heritage (in

press) noted that patients resist the implications of physicians’ questions about lifestyle by answering closed-ended questions in elaborated form. In the following exchange, for example, the patient breaks away from a series of yes–no answers to provide her own characterization of her alcohol use: DOC: D’you smoke? PAT: Hm mm. DOC: Alcohol use? PAT: Hm: : moderate I’d say.

These studies show that although physicians design questions to serve their purposes in the encounter, patients design answers to serve their purposes as well. Another phase of the primary care encounter that has received considerable attention in the linguistic literature is the delivery of diagnostic news. Heath (1992) found that although physicians design their delivery of diagnostic news with pauses and other features in order to invite patient response, patients typically respond with silence or only a minimal acknowledgment token, as in the following sequence (pauses measured in seconds and placed in parentheses): DOC: You’ve got, erm (0.8) bronchitis. PAT: Er. (4.5) DOC: ((begins to write prescription)) I’ll give you antibiotics to take for a week.

This kind of sequence is a coconstruction not only of medical authority, with patients appearing to accept the physician’s diagnosis without challenge, but also of the structure of the medical encounter: with no elaboration of the topic of diagnosis, the encounter moves directly to the treatment phase. Heath noted that the coconstructed brevity of the diagnosis phase of the encounter contributes to patients receiving less than optimal amounts of information about their medical conditions, which is another of the key problems identified in research in medical communication (Roter and Hall, 1992). Heath went on to show, however, that patients have their own interactional means of disagreeing with the physician’s diagnosis, but disputing a diagnosis typically takes place in a different phase of the encounter, either during the treatment recommendations or the closing. Patients dispute a diagnosis by recycling the topic, repeating symptoms, emphasizing their severity, or questioning the diagnosis. In the following sequence, for example, Heath observed that the patient initially agreed with the diagnosis that his condition (angina) is the same, and the physician began to turn to treatment management, but the patient then produced an account of the recent severity of his symptoms:

DOC: It’s the same as you’ve had all along real[ly isn’t it [with this PAT: [Yes [Yes it is ((soft voice)) DOC: Yeah. PAT: Only if this last couple of months it’s been a bit – whether it be due to cold weather – I don’t know whether that’s anything to do with it. DOC: Can I just listen in around your heart and things right?

In this sequence, the patient achieves his apparent goal of disputing the initial diagnosis; and the physician returns to the physical examination and ultimately orders more diagnostic testing. Not all recyclings of the diagnosis are successful, however, as Heath showed in the following sequence, where the physician delivered a diagnosis and the patient disputed it by offering an account of the severity of his symptoms as the physician wrote a prescription: DOC: I’m sure Doctor Mckay’s right. I’m sure these headaches yer gettin are associated with a bit of arthritis in yer – in yer neck. PAT: It is. ((soft voice)) . . . ((physician begins writing prescription)) PAT: It’s the headaches was the thing that’s got me. More than anything else – More than the devil in hell because they were gettin’ more or less permanent yer know ... DOC: Right well I’ll tell what we’ll do Mister Tarett. I’ll give you . . . .

In this case, the physician did not move back to the examination phase or change his original medical assessment and diagnosis of arthritis in the neck; the patient simply received the prescription being written as part of the treatment phase of the encounter. A recent special issue of the journal TEXT (Beach, 2001) took up the topic of diagnosis in the routine encounter in more detail, arguing for the importance of the lay diagnosis a patient might bring to the medical encounter.

The Context of the Medical Encounter

In the research literature on the discourse of the medical encounter and on the discourse of institutional encounters in general, there is an ongoing debate about how to take account of context in an analysis (Sarangi and Roberts, 1999). The linguistic anthropologist Bronislaw Malinowski originally called attention to the context of situation, defining it ethnographically as the larger sociocultural frameworks that surround language and speech events. Through the work of Dell Hymes, John Gumperz, and other linguistic anthropologists, sociolinguists, and discourse analysts, this approach works to discover and describe the complex connections between

the organization of discourse and its larger social and cultural contexts, considering discourse sequences that reflect and perform gender, ethnicity, class, and the sociopolitical power and prestige of professions and institutions (Schiffrin et al., 2001). Another approach that has been influential in research on the discourse of medicine defines context ethnomethodologically as the social actions participants coconstruct locally through the turn-by-turn organization of interactional sequences. Through the work of Harold Garfinkel, Erving Goffman, Emanuel Schegloff, and other conversation analysts, this approach works to discover the local management of the social and institutional order by describing how participants themselves are observably oriented to this local context, looking closely at turn-taking, turn design, lexical choice, sequence organization, structural organization, and asymmetry and accountability in professional and institutional encounters (Heritage, 1997). Both approaches are based on the close analysis of naturally occurring data, and both approaches see context as both created and reflected in discourse, particularly in institutional encounters. Both approaches have been critiqued and debated extensively. 
In discourse analysis, the concept of context can seem too uncontrolled: van Dijk (1997) once noted that ‘‘there is no a priori limit to the scope and level of what counts as relevant context,’’ an analytic situation Michael Silverstein questioned by asking, ‘‘When is enough enough?’’ In conversation analysis, the concept of context can seem too narrow: Gumperz (in Sarangi and Roberts, 1999) noted that the primary focus on what is overtly present and oriented to can miss important aspects of interpretation that are more indirect but still important to understanding and interpretation, an analytic situation that does not give a complete answer to Michael Agar’s question, ‘‘What is going on here?’’ Silverman (in Sarangi and Roberts, 1999) characterized the difference between the approaches as an emphasis on the how (ethnomethodological conversation analysis) in comparison to an emphasis on the why (ethnographic discourse analysis). An oversimplified, but perhaps useful, way to characterize these two approaches is to note that ethnographic discourse analysis focuses on the context of the medical encounter, with attention to the sociocultural context of situation, whereas ethnomethodological conversation analysis focuses on the context in the medical encounter, with attention to the observable orientation of the participants. Many researchers agree that a mixed approach to the analysis of the medical encounter is optimal, one that combines ethnographic and ethnomethodological approaches (Sarangi and Roberts, 1999); however,

there is not space to discuss fully these sometimes competing and sometimes complementary theoretical and methodological approaches of discourse analysis and conversation analysis, although this topic is taken up in a number of the references cited earlier and in other entries in this encyclopedia (for more on the discourse of the routine medical encounter, see Medical Discourse: Doctor–Patient Communication; Medical Communication: Professional–Lay; Medical Discourse: Sociohistorical Construction; for more on discourse analysis generally, including the conflicts over context, see Text and Text Analysis).

Specialty Care Medical Encounters

The research literature reviewed earlier shows that the primary care encounter in medicine is complex at a variety of levels, from the microlevel of the organization of interactional sequences to the macrolevel of coconstructing medicine as a social institution. One aspect of medicine as a profession and institution that is largely ignored in this work, however, is the distinction between primary care and specialty care. In Western medicine, as practiced in the United States and other developed countries, physicians typically complete a residency within a specialty, some of which have a primary care focus (specialties include internal medicine, family medicine, obstetrics–gynecology, pediatrics, general surgery, orthopedic surgery, emergency medicine, psychiatry), and some physicians go on to complete a fellowship within a subspecialty that has a narrower focus (internal medicine, for example, has subspecialties in cardiology, nephrology, endocrinology, gastroenterology, urology, and hematology–oncology, and others; surgery has subspecialties in neurosurgery, thoracic surgery, urology, colorectal and more). Subspecialty care focuses on particular diseases and conditions, both acute and chronic, in hospital and clinic settings, and physicians who practice in subspecialties are called specialists. [There is a systematic terminological difference between the medical and lay language of the specialty/subspecialty system. In medicine, physicians complete a residency in a specialty (internal medicine, general surgery) and may complete a fellowship in a subspecialty (medical oncology, colorectal surgery). Within this terminology, the title of this entry should perhaps be the discourse of the subspecialty encounter. In lay language, however, subspecialists are simply called specialists, and this is a terminology that physicians use in medical encounters with patients and in informal conversations with medical staff.
I thus use the latter terminology here.] Patients typically enter specialty care by illness or trauma, sometimes by means of a referral from a primary care physician upon diagnosis

or suspicion of diagnosis and sometimes by means of complications in an illness or hospitalization. In many instances, entry into the specialty care system of medicine is a surprise or a shock, often unwelcome and typically disorienting (see Medical Discourse: Illness Narratives). In the institutional and professional context of health care, the line between primary care and specialty care is not an absolute one. Some conditions, such as diabetes or congestive heart failure, are managed by primary care physicians as well as specialists. The line can be contested within the profession: specialists sometimes accuse primary care physicians of not referring patients to them appropriately, and primary care physicians similarly accuse specialists of stealing their patients. Access to specialty care can be contested within institutions as well: one principle of managed care in a health maintenance organization is the restricted access to consultations with specialists, not without controversy. Within this complicated context, the discourse of the specialty care encounter is just beginning to emerge as a focus for research in linguistics, with a research agenda that has many unanswered questions: Is there a generic structure of the specialty medical encounter? How is it similar to or different from the structure of the primary care encounter? Does the structure of a specialty medical encounter differ across specialties? How are the interactional sequences within the phases of a specialty medical encounter organized? How are these organizations of interactional sequences similar to or different from the organization of sequences in primary care encounters, institutional encounters in general, and ordinary conversation? How does the discourse of a specialty care encounter reflect the context of the medical specialty? How do physicians and patients come to coconstruct the context of a medical specialty? 
How does research on specialty encounters contribute to knowledge about the discourse of medicine? How does research on specialty encounters contribute to improved medical communication for physicians and patients? Although research has barely begun to consider these and other questions about the specialty medical encounter, there are interesting points of comparison to the phases and sequences of the primary care medical encounter appearing in the literature, including the reason for the visit, the medical history, the understanding of diagnosis, and the treatment discussion.

The Reason for the Visit in Specialty Care Encounters

In specialty encounters, the reason for the visit, which is typically solicited from the patient with an open-ended question in the primary care encounter, appears to be organized interactionally across a series

of encounters with a focus on building an ongoing context of combined medical–lay expertise. Working from a database of 120 medical encounters between pediatric specialists and families who have a child with a disability, Barton (1996) found that in first encounters with pediatric specialists, physicians typically present the reason for the visit to families, thereby describing the particular medical problem that their specialty will manage, as in the following sequence with a pediatric neurologist: DOC: Now the question is – The reason why you’re seeing me is because of the EEG and what should we do about it. ... DOC: Now I will take a look at the EEG – you know – again – just to be sure that – you know – how much of activity – of seizure-like activity was there, because the question remains whether we should treat her or not. MOM: Um-hmm.

In this sequence, the neurologist explains the reason why you’re seeing me in terms of the specialty of pediatric neurology, identifying its chief diagnostic tool, the EEG; the chief medical problem managed by neurology, seizure-like activity in the brain; and the treatment options within the specialty. The mother’s responses in this sequence are primarily acknowledgments of the receipt of information. Through initial encounters where physicians pay considerable interactional attention to describing their own medical specialty as the reason for the visit, families appear to develop a sophisticated lay understanding of the institutional organization and professional domains of medical specialties in the care of their child, including the relevant diagnostic tests and the treatment options for management. In follow-up visits, both physicians and families seem to orient to this construction of medical–lay expertise by organizing the sequences of this phase of the encounter in terms of the management of the specific medical condition of the specialty. Physicians sometimes retain control of the discourse of this phase of the encounter by providing the reason for the visit themselves, as in this sequence, again from pediatric neurology: DOC: ((entering room)) So what’s been happening? His last levels were pretty good. In December it was 8.8, since they increased the dose, which is where we would like to maintain him at. MOM: OK. DOC: Uh – so that if he’s – you know – uh – we don’t have – you know – at least ah – Even if he’s thinking of having a seizure, that should prevent him from having one.

MOM: We haven’t seen any and school hasn’t reported any of them either so –

Here the physician presents the reason for the follow-up visit as treatment management, reporting the relevant test results from her review of the chart and claiming that maintaining the current level of medication should prevent the child from having seizures. The mother seems observably oriented to management of the seizure disorder as the reason for a pediatric neurology visit, providing both observed and reported information about seizure activity. Physicians do not always retain control of this phase of the encounter, however. In the following sequence, again from pediatric neurology, it is the mother who presents the reason for the current visit in terms of the specialty: MOM: ((after physician asks the child how he is doing)) I don’t know – uh – we’re not sure what’s going on. Um – his Depakote levels got weird, and as they got weird, he increasingly got hyper. When – when you added the other Depakote, for about four days he – the hyper was gone again. It’s starting to come back. I’m wondering if his levels are not too high.

Here the mother gives her impression of the levels of antiseizure medication first and then introduces her concern about its possible side effects. Experienced families thus routinely orient to the domain of the specialty when they initiate this phase of the encounter in follow-up visits. It appears, then, that specialists do interactional work to orient families to the domain of the specialty in initial encounters, and then physicians and families orient to this construction of medical–lay expertise in follow-up visits. With this orientation to the domain of the specialty, the encounter can be coconstructed as problem driven in terms of the specialty, with foregrounded attention to treatment management. This joint focus on a particular problem and its course of treatment in the specialty encounter is somewhat different from the open-ended focus on the patient’s chief complaint in the primary care encounter (see the recent special issue of Research on Language and Social Interaction (Candlin and Candlin, 2002) on expertise in a variety of health-care encounters). It is also of note here that specialty encounters seem to have an interactional orientation toward a series of encounters in an ongoing physician–patient/family relationship. Ainsworth-Vaughn (1998) pointed out that the context of an established relationship can significantly affect the discourse of medical encounters. In comparison to the studies of primary care mentioned earlier, where patients asked few questions

(West, 1984; Frankel, 1990), Ainsworth-Vaughn’s study of encounters within the specialty of medical oncology found that patients ask significantly more questions – almost 40% of the questions were asked by patients in a set of 40 encounters balanced for gender of patient and physician. Ainsworth-Vaughn argued that the context of established relationships allows both physicians and patients to claim power in the discourse of the encounter, with power defined not only as control over the emerging discourse but also as control over medical decision making. Silverman (1987) similarly noted that question–answer sequences seem to reflect the course of treatment within specialty medical care. In his study of cardiologists and parents of children with heart conditions, he found that parents ask different kinds and different numbers of questions at different points in the child’s course – before treatment, after diagnostic catheterization, and after surgery. In his study of adolescents in diabetes clinics, Silverman (1993) found that physicians and patients developed a more negotiated style of interaction as the relationships become more established and physicians extend more autonomy to teenage patients in the management of their diabetes.

The Medical History in Specialty Care Encounters

In medical encounters, question–answer sequences are particularly prominent in the history phase. In the primary care encounter, the medical history typically elicits patients’ own reports of their past illnesses and present symptoms. The medical history in the specialty care encounter, however, also reflects the complex and ongoing context of specialty care. Barton (2000) noted that children with disabilities are routinely seen by multiple specialists, including a developmental pediatrician, a neurologist, an orthopedist, a physiatrist, a gastroenterologist, an ENT, an ophthalmologist, and more. Families move back and forth among specialists in a complex web of referrals and visits for attention to new or ongoing medical problems and conditions, a situation that is sometimes criticized as an unfortunate fragmentation of medical care. This expanded context of care is reflected in the medical history, where families are often asked to provide accounts of their encounters with other pediatric specialists, as in the following sequence: DOC: Who sees him, Dr. L or Dr. B? MOM: Dr. L. DOC: And what does Dr. L say? The last time you saw him. MOM: His X-rays came out just fine.

The specific specialty (orthopedics) is not even mentioned here, but when asked for an account of what does Dr. L say, the mother answers in terms of

her lay expertise, citing evidence from the primary diagnostic tool of orthopedics (X-rays) in order to report that the child is just fine. When families deliver what the physician apparently judges to be a competent account of a referral with an appropriate construct of lay expertise, like the one given, the history moves along smoothly, and parents’ accounts can become the basis for decision making. When families deliver a more problematic account of a referral in the history, however, one that does not reflect an appropriate construct of lay expertise from a medical perspective, interactional problems arise that can affect the relationship between the family and the physician as well as the medical decision making. In the following sequence with a developmental pediatrician, for example, the family is asked for an account of their child’s visit with the ENT (in the following sequence, the term stridor refers to a harsh vibrating sound during breathing due to obstruction of the air passages): DOC: Now, does he see the ear, nose and throat doctor still? MOM: Um-hmm. DOC: Who does he see? MOM: Uh, Dr. N. Paul N. DOC: Do they think that he may outgrow some of this stridor too, with time? MOM: Yeah. DOC: [OK. DAD: [I do. I don’t know what the doctors say, but I know he is. Unless there’s so many kids who have the signs that he has, but he is so much like a normal child. It’s just that his muscles, it’s going to take time to – to develop. Right now, to me, he’s at a stage that a eight month old baby would be at. DOC: [Uh-huh. DAD: [And see, I’m looking at it like that’s going to take time. By the time he five or six, he going to be able to talk. He going to be able to crawl, he going to [be – DOC: [At school with the right interventions, he may be able to do some of those things. We’ll have to see. ((change to motherese voice)) Who’s on your bib there sweetie?

In this sequence, the physician’s question asking for the ENT’s opinion is designed for either a brief yes-answer or else a longer no-answer with elaboration. The mother provides the optimal yeah answer, and the physician acknowledges it with an OK, but the father speaks at the same time to offer a lengthy diagnostic and prognostic description of the child. His account, however, is problematic with respect to a combined medical–lay expertise based on knowledge of and experience in the specialty medicine

system: his I do is not backed up either by an account of a medical visit with another specialist or an account demonstrating substantive lay expertise, because his description is in conflict with the medical prognosis for his severely impaired child. The developmentalist gives him no encouragement to continue this problematic account, providing only one minimal back-channel and then abruptly interrupting to end the father’s talk by addressing the nonverbal child in the pitch and patterns of motherese. The father’s account is asymmetrically received here, and his contribution is completely ignored by the physician. Later in the encounter, the father expressed considerable dissatisfaction with the visit to the Child Care Clinic, criticizing the staff for ‘‘asking the same questions,’’ and after the encounter, the staff expressed considerable disdain for the father’s perceived ignorance about the actual condition of his child. In this case, the interactional trouble contributed to the increasingly adversarial relationship between the medical staff and the family that ultimately moved to hierarchical decision making by the staff without incorporating the contributions and concerns of the family (this encounter is discussed in more detail in Barton, 1999, 2000).

The Understanding of Diagnosis in Specialty Care Encounters

As the excerpts and discussion earlier indicate, lay expertise in the domain of a medical specialty encompasses a number of aspects, including diagnosis, prognosis, and treatment management of a medical problem or condition. Understanding of diagnosis, however, is a particularly important aspect of specialty medical care, and the delivery of diagnostic news is constructed very carefully by medical professionals in specialty encounters. Maynard (1992, 2003) has worked extensively on the delivery of diagnostic news in both primary and specialty encounters. Based on fieldwork in clinics for the evaluation of developmental disabilities in children, Maynard (1992) identified the perspective-display sequence as a crucial means by which specialists create alignment to the medical view of a diagnosis. A perspective–display sequence first asks the parents for their view of the child, which, if the view is in alignment with the medical diagnosis, allows the professionals to agree with the parents on the diagnosis, as in the following sequence: DOC: What do you see? As – as his difficulty. MOM: Mainly his uhm – the fact that he doesn’t understand everything and also the fact that his speech is very hard to understand what he’s saying lot[s of time

DOC: [Right ... DOC: Okay. I – you know we basically in some ways agree with you insofar as we think that Dan’s main problem you know does involve you know language

A perspective–display sequence thus coimplicates agreement between the parents and the specialists on the diagnosis of the child, which interactionally constructs alignment as the basis for the rest of the encounter. Maynard noted that medical specialists sometimes have to do considerable interactional work to bring about this alignment; in the following sequence, for example, the physician reformulates the mother’s description of her daughter: MOM: . . . and I have seen no progress, from September to June. For her learning ability, she is slow. DOC: That’s what we uh also found on – on psychological testing. That she was per- not performing like a normal uh six and a half year old uh should. MOM: Mm-hmm. DOC: And that she was performing more uh what we call as a borderline rate of retardation.

The physician here initially agrees with the mother that that’s what we also found on psychological testing, but he interactionally reformulates and upgrades slow from not performing like a normal six and a half year old to what we call as a borderline rate of retardation. Through perspective–display sequences and their elaboration and development, medical specialists work to coimplicate a problematic diagnosis that brings participants into alignment. Maynard noted that the perspective display sequence can create alignment nonproblematically or problematically, as shown in the contrasting sequences above, although he argued that it is ultimately the medical view that becomes the official diagnosis for the child.

The Treatment Discussion in Specialty Care Encounters

There has been relatively little research on the treatment phase of the medical encounter, either in primary care or in specialty care. Barton (2004) looked at treatment discussions in medical oncology encounters, describing the structure within this phase as having four parts: first, foregrounding the relevant diagnostic information; second, presenting treatment information; third, asking and answering questions; and fourth, arranging treatment logistics. Using Erving Goffman’s distinction between the front-stage and back-stage of institutional encounters, Barton also noted a systematic difference in prognosis as a

topic. On the front-stage with patients and families, prognosis is an unstable topic, not always a predictable part of the treatment discussion, and a topic that is often interactionally elided and backgrounded in indirect language, as in the following sequence: DOC: When they did the liver biopsy in March, it showed that there were multiple lesions of the liver. And that means that we can’t cut it out. It has to be treated by some sort of chemotherapy to try to control it. Now, we have a number of different drugs that we use for that kind of treatment.

Here, the information on multiple lesions of the liver, which means that we can’t cut it out does not explicitly state the diagnosis of widely metastasized cancer, nor the prognosis of incurable disease that will run a fatal course. Further, the treatment recommendation (it has to be treated by some sort of chemotherapy) ends with the euphemistic expression to try to control it, an expression that does not explicitly note that treatment will ultimately fail. In the back-stage talk immediately after this encounter, the physician and researcher had the following exchange: RES: Sweet lady. DOC: Bad disease. RES: Because they didn’t catch it before it went to the liver? DOC: Yeah . . . . She’s an older, frail lady. But she has bad disease.

On the back-stage with medical colleagues and even researchers, prognosis is a predictable topic, especially in cases of expected negative outcomes, and a topic that is foregrounded with direct language in the talk of insiders (in the back-stage discourse of medicine, the expression ‘bad disease’ carries the implications of incurable, fatal disease). Barton suggested that the front-stage presentation of prognosis is organized to preserve hope for the patient and the family, whereas the back-stage discussion of prognosis is organized to reflect the preponderance of negative experience in medical oncology and its resultant frustration.

In another study of treatment discussions, Barton et al. (2005) looked at a set of end-of-life discussions in a surgical intensive care unit, where the aim of the discussion is to change the goals of treatment from aggressive measures aimed at cure to palliative measures aimed at comfort. The structure of this discussion has four parts: first, openings, in which physicians as well as families coconstruct the traditional role structure of a medical encounter, with the physician in the interactional and decision-making lead; second, description of current status, in which physicians present medical information leading to a summary statement that implies that the patient’s status is terminal; third, decision making, in which a holistic decision to end life support is made indirectly and inferentially; and fourth, logistics of dying in the ICU, where physicians and families discuss the details of death and dying. Barton et al. argue that the description of current status is actually the crucial part of the end-of-life discussion, because it is here that physicians and families, through an inferential summary statement, interactionally establish a consensus that the patient is dying, which makes it reasonable that continuing further treatment, including life support, would be futile and only prolong the patient’s suffering, as in the following decision-making sequence:

FAM: The body is in shock because of the infection.
DOC: So that’s where we are. I am absolutely sure that you’ve had discussions and discussions among yourselves and other family members . . . . What I’d like to do now would be to open up the discussion with regards to the thoughts that people have had or any unanimity or is there any unanimous opinion with regard to what you think she might want under these conditions. Have you had a chance to talk about this?
FAM: Yes. If I – I don’t think she would want it.
DOC: I know I can remember she told me once, you know, she wouldn’t want it, because if she felt like they cut on her once, they would continue to cut, and I don’t think she would want it.
FAM: We all agree.

Here, the family’s acknowledgment that ‘the body is in shock because of the infection’ allows the physician to sum matters up again (‘so that’s where we are’) and then move to the frame of decision making as ‘what she might want under these conditions.’ The family then seems to make a holistic and inferential decision to withdraw futile life support by saying that ‘she wouldn’t want it.’ Barton et al. note that the description of current status and the decision-making sequences use language drawn from the indirect discourse of death and dying in ordinary language (as in the family’s use of the antecedentless pronoun ‘it’), but one of the functions of the fourth and final part of the discussion is to develop a more medical discourse of death by making specific the inferences of terminal status and the decision to withdraw life support in a combined medical–lay discourse that situates death more technically within the process of palliative care. In the earlier discussion, for example, the physician specified that ‘we have a protocol, a set of rules we follow to keep people so sedated that they would not feel any discomfort.’ However, in discussions where there is no interactional consensus on the patient’s status, decision making to end life support does not take place, as in this sequence:

DOC: So when someone in the – experiences all of these, um, complications of the sepsis in the face of the strong medications that we are using, um, we feel like it is important to try to touch base with the family and clarify the goals of treatment and what you imagined the goals of treating her to be.
FAM: OK. ((silence))
DOC: Does that make sense? What questions can I answer to help that make sense? ((silence))

Here, the physician sums up the patient’s terminal condition as ‘complications of the sepsis’ and tries to move to the frame of decision making, but the family member’s ‘OK’ seems to be more an acceptance of the topic of treatment goals than an acknowledgment of or agreement with the inference offered by the physician’s summary statement. With no uptake of her summary statement or her move to decision making, the physician resorts to asking whether there are any questions. With her continued silence, however, the family member provides no interactional indication of alignment, and the discussion falters.

Future Research on Specialty Care Encounters

From the literature reviewed, it appears that the general structure of a specialty care encounter is roughly similar to the structure of a primary care encounter, with similar phases – opening, reason for the visit, history and physical, treatment discussion, and closing. The specialty encounter, however, is problem driven in terms of the domain of the specialty – the medical condition it manages with its diagnostic tools and treatment options – within a complex context of medical–lay expertise created through multiple encounters in an ongoing physician–patient relationship. In comparison to the primary care encounter, the interactional sequences within the phases appear to be organized with respect to this context of expertise in the specialty care encounter: the reason for the visit is interactionally constructed in terms of the specialty, initially by physicians and subsequently by families; the history draws on accounts of care within the specialty medical system, which displays the family’s expertise in this context; and the understanding of diagnosis appears crucially important, with considerable interactional attention to creating alignment between specialists and families. Perhaps reflecting the focus on managing medical problems and conditions, the treatment discussion appears to display considerable variation across specialties, with complex internal structures for different kinds of discussions. Although research in this area is just beginning, it is based on an important question for linguistic research on the discourse of medicine: How do physicians, patients, and families coconstruct the professional and institutional discourse of specialty care?

See also: Medical Communication: Professional–Lay; Medical Discourse: Doctor–Patient Communication; Medical Discourse: Illness Narratives; Medical Discourse: Sociohistorical Construction; Text and Text Analysis.

Bibliography

Ainsworth-Vaughn N (1998). Claiming power in doctor–patient talk. Oxford: Oxford University Press.
Ainsworth-Vaughn N (2001). ‘The discourse of medical encounters.’ In Schiffrin D, Tannen D & Hamilton H (eds.) The handbook of discourse analysis. Oxford: Blackwell. 453–469.
Aldridge C K (1999). The medical interview: gateway to the doctor–patient relationship (2nd edn.). New York: Parthenon.
Barton E (1996). ‘Negotiating expertise in discourses of disability.’ Text 16, 299–322.
Barton E (1999). ‘The social work of diagnosis: evidence for judgments of competence and incompetence.’ In Kovarsky D, Duchan J & Maxwell M (eds.) Constructing (in)competence: disabling evaluations in clinical and social interaction. Hillsdale, NJ: Lawrence Erlbaum. 257–290.
Barton E (2000). ‘The interactional practices of referrals and accounts in medical discourse: compliance and expertise.’ Discourse Studies 2, 259–281.
Barton E (2004). ‘Discourse methods and critical practice in professional communication: the front-stage and back-stage discourse of prognosis in medicine.’ Journal of Business and Technical Communication 18, 67–111.
Barton E, Aldridge M, Trimble T & Vidovic J (2005). ‘Structure and variation in end-of-life discussions in the surgical intensive care unit.’ Communication and Medicine 2, 3–20.
Beach W (ed.) (2001). ‘Special issue: lay diagnosis.’ Text 21.
Beckman H & Frankel R (1984). ‘The effect of physician behavior on the collection of data.’ Annals of Internal Medicine 101, 692–696.
Boyd E & Heritage J (in press). ‘Taking the patient’s medical history: questioning during comprehensive history taking.’ In Heritage J & Maynard D (eds.) Communication in medical care: interaction between primary care physicians and patients. Cambridge: Cambridge University Press.
Byrne P & Long B (1976). Doctors talking to patients: a study of the verbal behaviours of doctors consulting in their surgeries. London: HMSO.
Candlin C & Candlin S (eds.) (2002). ‘Special issue: expert talk and risk in health care.’ Research on Language and Social Interaction 35.
Fisher S & Todd A (eds.) (1993). The social organization of doctor–patient communication (2nd edn.). Norwood, NJ: Ablex.
Frankel R (1990). ‘Talking in interviews: a dispreference for patient-initiated questions in physician–patient encounters.’ In Psathas G (ed.) Interaction competence: studies in ethnomethodology and conversation analysis. Lanham, MD: University Press of America. 231–262.
Frankel R & Beckman H (1993). ‘Teaching communication skills to medical students and house officers: an integrated approach.’ In Clair J & Allman R (eds.) Sociomedical perspectives on patient care. Lexington: University Press of Kentucky. 211–222.
Halkowski T (in press). ‘Realizing the illness: patients’ reports of symptom discovery in primary care visits.’ In Heritage J & Maynard D (eds.) Communication in medical care: interaction between primary care physicians and patients. Cambridge: Cambridge University Press.
Heath C (1992). ‘The delivery and reception of diagnosis in the general-practice consultation.’ In Drew P & Heritage J (eds.) Talk at work: interaction in institutional settings. Cambridge: Cambridge University Press. 235–267.
Heritage J (1997). ‘Conversation analysis and institutional talk: analyzing data.’ In Silverman D (ed.) Qualitative analysis: issues of theory and methods. Thousand Oaks, CA: Sage. 161–182.
Heritage J (in press). ‘Accounting for the visit: patients’ reasons for seeking medical care.’ In Heritage J & Maynard D (eds.) Communication in medical care: interaction between primary care physicians and patients. Cambridge: Cambridge University Press.
Maynard D (1991). ‘Interaction and asymmetry in clinical discourse.’ American Journal of Sociology 97, 448–495.
Maynard D (1992). ‘On clinicians co-implicating recipients’ perspective in the delivery of diagnostic news.’ In Drew P & Heritage J (eds.) Talk at work: interaction in institutional settings. Cambridge: Cambridge University Press. 331–358.
Maynard D (2003). Bad news, good news: conversational order in everyday talk and clinical settings. Chicago: University of Chicago Press.
Rost K & Frankel R (1993). ‘The introduction of the older patient’s problems in the medical visit.’ Journal of Health and Aging 5, 387–401.
Roter D & Hall J (1992). Doctors talking to patients/patients talking to doctors: improving communication in medical visits. Westport, CT: Auburn Press.
Sarangi S & Roberts C (eds.) (1999). Talk, work and institutional order: discourse in medical, mediation and management settings. Berlin: Mouton de Gruyter.
Schiffrin D, Tannen D & Hamilton H (eds.) (2001). The handbook of discourse analysis. Oxford: Blackwell.
Silverman D (1987). Communication and medical practice. Thousand Oaks, CA: Sage.
Silverman D (1993). ‘Policing the lying patient: surveillance and self-regulation in consultations with adolescent diabetics.’ In Fisher S & Todd A (eds.) The social organization of doctor–patient communication (2nd edn.). Norwood, NJ: Ablex. 213–242.
Stivers T & Heritage J (2001). ‘Breaking the sequential mold: answering ‘more than the question’ during comprehensive history taking.’ Text 21, 151–185.
ten Have P (1989). ‘The consultation as a genre.’ In Torode B (ed.) Text and talk. Dordrecht, Holland: Foris. 115–135.
ten Have P (1991). ‘Talk and institution: a reconsideration of the ‘asymmetry’ of doctor–patient interaction.’ In Boden D & Zimmerman D (eds.) Talk and social structure. Cambridge: Polity Press. 138–163.
van Dijk T (ed.) (1997). Discourse studies: a multidisciplinary introduction (2 vols). Thousand Oaks, CA: Sage.
West C (1984). Routine complications: troubles with talk between doctors and patients. Bloomington: Indiana University Press.

Relevant Website

http://www.ama-assn.org – The American Medical Association website; consult for an overview of medical specialties and subspecialties in American medicine.

Medical Writing, Revising and Editing
M Pilegaard, Aarhus School of Business, Aarhus, Denmark
© 2006 Elsevier Ltd. All rights reserved.

Background

Medical writing, editing, and revision is an emerging field in linguistic research and a thriving field in linguistic practice. It falls between several different disciplines, linguistic as well as nonlinguistic, and enjoys the attention of both medical and linguistic communities of interest. The medical discourse communities began to take interest in the linguistic quality of papers in the late 1980s with the growing globalization of research. The international nature of the readership requires that research should be written clearly and concisely, and the need for better quality in medical communication is now widely recognized and has given rise to a growing number of suggestions, checklists, and guidelines for quality improvements targeting the structure, style, and rhetoric of medical communication.


This globalization does not imply that all researchers have equal access to publication. There is considerable cultural bias with about 95% of all papers published in English coming from the Western, particularly Anglo-American, industrialized countries (Gibbs, 1995) and a preference for U.S.-based articles in U.S.-based journals (Link, 1998). Medical approaches to medical writing, revision, and editing center on the issues of uniformity of structure and clarity, precision and economy of language, and the need to ease non-native English speakers’ access to international publication. Linguistic approaches are varied and draw on disciplines such as genre theory, functional sentence analysis, and contrastive linguistics. Genre theory, which is concerned with global linguistic patterns that have developed in a linguistic community for fulfilling specific communicative tasks in specific situations, has provided particularly useful insights into medical text genres by creating a link between purpose at various levels of discourse and the situationally appropriate choice of linguistic means with which to realize these purposes. This article focuses mainly on the genre of the medical research paper.

Medical Writing

Medical writing serves different purposes. The medical textbook, for example, covers a range of topics, whereas the traditional medical paper is more focused. Most journal articles are designed to disseminate the results of a discrete topic of inquiry. Apart from honesty in reporting the results of the study, the most important element in medical scientific writing is clarity: the reader should be told why the study was performed (introduction), what the research question is (introduction), what was done (material and method), what was found (results) and what the results mean (discussion). To enforce this presentation style – known as the IMRaD structure – medical journals require that papers are divided into text sections with uniform headings serving the above purposes. This uniform style is described in Uniform requirements for manuscripts submitted to biomedical journals and the Council of Biology Editors’ manuals of style.

General Rules of Medical Writing

General rules of medical writing increasingly mirror the advice of a group of editors of medical journals who first met informally in Vancouver in 1978 to establish guidelines for the format of manuscripts submitted to their journals. This group became known as the Vancouver Group, and its requirements were first published in 1979. The group is now known as the International Committee of Medical Journal Editors. More than 500 journals have agreed to use its Uniform requirements, the so-called Vancouver style, which are instructions to authors on how to prepare manuscripts, not to editors on publication style. Authors must therefore also follow the target journal’s own instructions to authors as to what topics are suitable for that journal and what types of papers may be submitted. The journal’s instructions are likely to contain requirements unique to that journal, such as the number of copies of a manuscript, acceptable language, size of articles, and approved abbreviations, among others. The home page of the Mulford Library of the Medical College of Ohio provides links to instructions to authors for over 3500 journals in the health and life sciences.

Specific Rules of Medical Writing

Specific rules of medical writing are formulated by the individual professional communities, which set up guidelines to be followed for the production of particular text genres. These guidelines are sometimes based on empirical studies of the qualities and shortcomings of such genres. For example, evidence that the quality of reporting of randomized, controlled trials (RCTs) was suboptimal prompted a group of scientists and editors to develop the Consolidated Standards of Reporting Trials (CONSORT). The objective of CONSORT is to facilitate critical appraisal and interpretation of RCTs by providing guidance to authors about how to report their trials. A revised CONSORT statement has recently been issued with checklist items for the contents of the title, abstract, introduction, methods, results, and discussion and with accompanying flow diagrams depicting information from four stages of a trial: enrolment, intervention allocation, follow-up, and analysis. The statement is available on the Internet, and it includes text examples, for instance on the generalizability (external validity) of trial findings, from which valuable linguistic information may be derived.

The case for structure in scientific papers is made not only for the macro level, but also for the micro level (i.e., what linguists would call the move or step structure). Thus, most abstracts are highly structured and certain prestigious journals advocate a specific structure even for the discussion; for instance, the British Medical Journal suggests that the discussion should be structured as follows:

• statement of principal findings
• strengths and weaknesses of the study
• strengths and weaknesses in relation to other studies
• meaning of the study: mechanisms and implications for clinicians and policy makers
• unanswered questions and future research.

Reviews are also expected to be written according to specific guidelines. An analysis of reviews included in the Cochrane Library, a regularly updated collection of evidence-based medicine databases, showed that a sample of Cochrane reviews had problems in terms of both structure and style. Stylistic problems were encountered in 23% of 53 Cochrane reviews published during 1998. They included many spelling, grammatical, and typographical errors, and reviewers’ comments contained statements like ‘seems to be an unfinished draft’ and ‘needs to be edited to be more readable and comprehensible’ (Olsen et al., 2001). Cochrane reviews are, on average, more systematic and less biased than other systematic reviews, but such linguistic flaws have a negative effect on the validity of the scientific contents communicated.

Medical Editing

Medical editing and medical revision are two sides of the same coin. Their domains may be visualized as overlapping circles. Editing is concerned mainly with the contents of the text. The most important thing in the editing process is to check the validity, i.e., the extent to which the study suffers from bias or errors. A common error, for example, is to confuse no evidence of effect with evidence of no effect. Editing also seeks to identify the major weaknesses of the paper; in particular, whether the manuscript requires additional work in terms of originality and importance, adequacy of design or approach, adequacy of materials studied, accuracy of interpretation of results, statistical analysis, relevance of discussion, soundness of conclusion, appropriateness of references, tables, and figures, or clarity of presentation. A good rule is to shorten the introduction and discussion and to lengthen the methods and results. Half the manuscripts submitted to the British Medical Journal, for example, are rejected for lack of importance to a general medical audience, serious scientific flaws, or lack of originality. Editing is often achieved through so-called peer review, which denotes a number of processes, most commonly the gathering of opinion on the quality of the paper from other experts, either in-house or external. The review process starts when a manuscript is received at the editorial office. There are generally two kinds of review systems: the ‘board review,’ where all reviewers are members of an editorial board, and the ‘pool review,’ where the editorial board draws on a pool of expert reviewers. Most journals have developed their own review strategies, but generally reviewers receive review forms, instructions, and occasionally associated materials like instructions to authors and sample reviews. The reviewers are usually asked to provide text comments detailing their critique and suggestions (Shea et al., 2001).

Medical Revision

Text revision is used here not in the traditional sense in English of modification of the text for contents, but in a narrower sense of reviewing, editing, and proofreading with a focus on the form and style of the text. Revision incorporates the checking of the macrostructure and microstructure of the text, language and style issues, and judging suitability for the target reader or client. It embraces both the monolingual editing that borders on technical editing and, where translators and language revisers are included, the checking of the target text against the source for accuracy, style, and register. Revision ideally falls into three steps.

The first step is to make sure that the manuscript is written according to the general guidelines set out in Uniform requirements and the specific instructions to authors on how to prepare manuscripts. It must follow the target journal’s format for style. Most journals, for example, spell out numbers less than ten, unless they appear with percentages or units of time or measure, and most but not all use upper-case P for ‘P values’ and C for ‘Chi-square.’

The second step is to revise for structure, i.e., to ascertain that the move and step structure of the text serves the purpose of the text maximally, both at the overall (macro) level and at deeper (micro) levels. This can be done first by analyzing the move structure of the text section by section, second by comparing the move structure with the prototypical move structure for the genre in question (e.g., the IMRaD structure), and, third, by revising the move structure where necessary to achieve model fit. The aim is to make sure that the ‘superficial’ levels of argumentation, or the types of argument that lie in each section of the text, are supported by the ‘deep levels’ of argument, viz. the persuasive power of each sentence, which depends on elements such as use of active or passive, the position of adverbs and adjectives, the use of if-conditionals, hedges, and other forms of modality.

The third step in the revision process is to analyze the text for correctness of grammar (active/passive, subject–verb agreement, position of modifiers, dangling participles, correct use of tense), word choice (jargon, bias-free language, good verb choice), and correctness of spelling and punctuation. This stage should also include checking all other rules of good English usage, including avoiding starting sentences with such phrases as ‘it has been shown that,’ ‘it was found that,’ ‘based on the fact that’ (change to ‘because’), and with ‘there’ (as ‘There has been an increase in the number of patients’ [change to ‘More patients are’] and ‘There is evidence to suggest that those who cease smoking’ [change to ‘Those who cease smoking may’]). Finally, where the target text is produced in a foreign language, it is necessary to check for source language interference at levels of lexis (word level), syntax (sentence structure, sender–receiver relationship), and text (cohesion/coherence).

See also: Macrostructure; Medical Discourse: Structured Abstracts; Speech Acts; Translation of Scientific and Medical Texts.
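The sentence-opener advice in the third revision step lends itself to a simple automated check. The following Python sketch is only an illustration under stated assumptions: the rule table `WEAK_OPENERS`, the helper names, and the naive sentence splitter are all hypothetical and are not part of any published style guide or tool. It flags sentences that begin with the openers named in the text and repeats the suggested remedy; it does not rewrite anything.

```python
import re

# Hypothetical rule table: each pattern matches one of the sentence openers
# named in the text; the advice string paraphrases the suggested remedy.
WEAK_OPENERS = [
    (re.compile(r"^it has been shown that\b", re.IGNORECASE), "state the finding directly"),
    (re.compile(r"^it was found that\b", re.IGNORECASE), "state the finding directly"),
    (re.compile(r"^based on the fact that\b", re.IGNORECASE), "use 'because'"),
    (re.compile(r"^there\s+(is|are|was|were|has been|have been)\b", re.IGNORECASE),
     "lead with the real subject"),
]


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_weak_openers(text):
    """Return (sentence, advice) pairs for sentences starting with a listed opener."""
    flagged = []
    for sentence in split_sentences(text):
        for pattern, advice in WEAK_OPENERS:
            if pattern.match(sentence):
                flagged.append((sentence, advice))
                break  # one piece of advice per sentence is enough
    return flagged


if __name__ == "__main__":
    draft = (
        "There has been an increase in the number of patients. "
        "Mortality fell in both groups. "
        "It was found that the dose was too low."
    )
    for sentence, advice in flag_weak_openers(draft):
        print(f"{sentence} -> {advice}")
```

A check of this kind can only point a reviser at candidate sentences; deciding whether ‘More patients are’ or some other rewording is appropriate remains a human judgment.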

Bibliography

Alderson P, Green S & Higgins J P T (eds.). Cochrane Reviewers’ Handbook 4.2.2. Cochrane Library, 2004(1).
American Psychological Association (1995). Publication manual of the American Psychological Association (4th edn.). Washington, DC: American Psychological Association.
Atlas M C (2002). Author’s handbook of styles for life science journals. New York: CRC Press.
Council of Biology Editors (1994). Scientific style and format: the CBE manual for authors, editors, and publishers (6th edn.). Cambridge: Cambridge University Press.
Gibbs W W (1995). ‘Trends: lost science in the Third World.’ Scientific American 273, 92–99.
Huckin T N & Olsen L A (1991). Technical writing and professional communication for non-native speakers of English. New York: McGraw-Hill.
Hyland K (2000). Disciplinary discourses: social interactions in academic writing. Harlow, England: Longman.
International Committee of Medical Journal Editors (2003). Uniform requirements for manuscripts submitted to biomedical journals: writing and editing for biomedical publication. Philadelphia: ICMJE.
Iverson C (1998). American Medical Association manual of style: a guide for authors and editors (9th edn.). Baltimore: Williams & Wilkins.
Link A M (1998). ‘US and non-US submissions.’ Journal of the American Medical Association 280, 246–247.
Matthews J R, Bowen J M & Matthews R W (2003). Successful scientific writing: a step-by-step guide for the biological and medical sciences. Cambridge: Cambridge University Press.
Moher D, Schulz K F & Altman D G (2001). ‘The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials.’ Lancet 357, 1191–1194.
Olsen O et al. (2001). ‘Quality of Cochrane reviews: assessment of sample from 1998.’ British Medical Journal 323, 829–832.
Shea J A, Caelleigh A S & Pangaro L (2001). ‘Review process.’ Academic Medicine 76, 911–914.
Swales J (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press.
Swales J & Feak C B (2003). English in today’s research world: a writing guide. Ann Arbor: University of Michigan Press.

Relevant Websites http://www.icmje.org – International Committee of Medical Journal Editors. http://www.mco.edu – Medical College of Ohio. http://www.consort-statement.org.

Medicine and Health: Inter- and Intra-professional Communication R Iedema, University of New South Wales, Sydney, Australia ! 2006 Elsevier Ltd. All rights reserved.

Introduction This article summarizes work that addresses the ways in which the practice of medicine is changing; how these changes are affecting medical discourse, and how they point to medical discourse becoming increasingly dependent on and coconstituted by what

may be seen as ‘nonmedical’ discourses. In reviewing this work on the changing status of medicine in contemporary societies, this article emphasizes medicine’s dependence on and accountability to nonmedical professionals, such as nurses, allied health clinicians, administrators, policy makers, managers, information technology specialists, and, last but by no means least, health care ‘consumers’ and their caregivers. The article makes the point that medical discourse is no longer appropriately studied in vacuo. Although medical talk may still occur in settings where we find

Medicine and Health: Inter- and Intra-professional Communication 745

should also include checking all other rules of good English usage, including avoiding starting sentences with such phrases as ‘it has been shown that,’ ‘it was found that,’ ‘based on the fact that’ (change to ‘because’), and with ‘there’ (as ‘There has been an increase in the number of patients’ [change to ‘More patients are’] and ‘There is evidence to suggest that those who cease smoking’ [change to ‘Those who cease smoking may’]). Finally, where the target text is produced in a foreign language, it is necessary to check for source language interference at levels of lexis (word level), syntax (sentence structure, sender–receiver relationship), and text (cohesion/coherence). See also: Macrostructure; Medical Discourse: Structured Abstracts; Speech Acts; Translation of Scientific and Medical Texts.

Bibliography

Alderson P, Green S & Higgins J P T (eds.). ‘Cochrane Reviewers’ Handbook 4.2.2.’ Cochrane Library, 2004(1).
American Psychological Association (1995). Publication manual of the American Psychological Association (4th edn.). Washington, DC: American Psychological Association.
Atlas M C (2002). Author’s handbook of styles for life science journals. New York: CRC Press.
Council of Biology Editors (1994). Scientific style and format: the CBE manual for authors, editors, and publishers (6th edn.). Cambridge: Cambridge University Press.
Gibbs W W (1995). ‘Trends: lost science in the Third World.’ Scientific American 273, 92–99.
Huckin T N & Olsen L A (1991). Technical writing and professional communication for non-native speakers of English. New York: McGraw-Hill.
Hyland K (2000). Disciplinary discourses: social interactions in academic writing. Harlow, England: Longman.
International Committee of Medical Journal Editors (2003). Uniform requirements for manuscripts submitted to biomedical journals: writing and editing for biomedical publication. Philadelphia: ICMJE.
Iverson C (1998). American Medical Association manual of style: a guide for authors and editors (9th edn.). Baltimore: Williams & Wilkins.
Link A M (1998). ‘US and non-US submissions.’ Journal of the American Medical Association 280, 246–247.
Matthews J R, Bowen J M & Matthews R W (2003). Successful scientific writing: a step-by-step guide for the biological and medical sciences. Cambridge: Cambridge University Press.
Moher D, Schulz K F & Altman D G (2001). ‘The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials.’ Lancet 357, 1191–1194.
Olsen O et al. (2001). ‘Quality of Cochrane reviews: assessment of sample from 1998.’ British Medical Journal 323, 829–832.
Shea J A, Caelleigh A S & Pangaro L (2001). ‘Review process.’ Academic Medicine 76, 911–914.
Swales J (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press.
Swales J & Feak C B (2003). English in today’s research world: a writing guide. Ann Arbor: University of Michigan Press.

Relevant Websites

http://www.icmje.org – International Committee of Medical Journal Editors.
http://www.mco.edu – Medical College of Ohio.
http://www.consort-statement.org.

Medicine and Health: Inter- and Intra-professional Communication
R Iedema, University of New South Wales, Sydney, Australia
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article summarizes work that addresses the ways in which the practice of medicine is changing, how these changes are affecting medical discourse, and how they point to medical discourse becoming increasingly dependent on and coconstituted by what may be seen as ‘nonmedical’ discourses. In reviewing this work on the changing status of medicine in contemporary societies, this article emphasizes medicine’s dependence on and accountability to nonmedical professionals, such as nurses, allied health clinicians, administrators, policy makers, managers, information technology specialists, and, last but by no means least, health care ‘consumers’ and their caregivers. The article makes the point that medical discourse is no longer appropriately studied in vacuo. Although medical talk may still occur in settings where we find doctors interacting with patients or other doctors, and although medical writing still answers to the demands of disciplinary specialization, this article takes issue with analysts of discourse rendering invisible medical dependencies on other professional and nonprofessional voices and practices by constructing them as merely contextual, taken-for-granted, and therefore marginal to the analysis of medical language and discourse ‘proper.’ Before setting this argument out in greater detail, let me illustrate some of these dependencies with an example.
Consider the segment of medical discourse below. It is part of what is called a ‘critical incident report’ (CIR), a form used to report events that are judged to be ‘out of order.’ This segment was written by a doctor who was in charge of moving a patient from a rural to a tertiary hospital setting. The doctor wrote the report because he found that something was amiss with the way the patient had been treated by the other doctors (the patient had been intubated incorrectly) and felt that this unduly hastened the patient’s death. A CIR, then, is an organizational device by means of which clinicians communicate about adverse or critical events in order to learn from such events and prevent them in the future. The headings are standard in the free text portion of the critical incident form.

Story
Patient with severe [internal] bleed. GCS 4 on admission. Intubated by doctor in Emergency Department. Left spontaneously breathing. Retrieval team arrived. Breath sounds heard within usual total assessment. Readied for transfer, paralysed for transport. Subsequent demise, presumed to be of neurological cause. Later review of CXR (not looked at by retrieval team) showed ETT in oesophagus.

Outcome
Cardiac arrest.

Steps Taken or Treatment Required
As presumed to be neurological cause, patient already elderly and moribund, aggressive resuscitation not attempted.

Was This Incident Preventable?
Yes – always check CXR when available, re-check airway if paralysing.

For those interested in medical discourse, this brief text is rich for a range of reasons. First, there is a degree of mimesis between the criticality of the event and its multiple acronyms (its abbreviated semantics) and its staccato syntax. In that sense, the CIR is reminiscent of the medical chart or the patient medical record produced in hospitals. Second, the text is a standardized ‘genre,’ insofar as critical incident reporting has crystallized as a particular textual construct across various types of organizations. Yet this
genre is complex, as it is a curious combination of narrative fragments: Story (what happened), Steps Taken (what did you do), as well as more technical components: Outcome (an assessment about what this incident produced), and Was This Incident Preventable? (a judgment about how avoidable the event was). Genre as ‘standardized meaning making ritual’ has been deployed as a perspective for making sense of medical discourse, but largely only in relation to traditional constructs such as dyadic medical consultations, medical notes, monodisciplinary ward rounds, and ‘case presentations.’ Yet another dimension of the text is its context. Context is commonly mentioned in analyses such as those cited above but remains little explored on the assumption that it merely enables the production of the text that is the ultimate focus of attention. From the CIR above, however, it is evident that it was the considerable contextual or organizational complexity that played an important role in bringing about the occurrence described and in producing the CIR in the first place. We can go further than this and conclude from the introduction of CIRs into medicine more generally that it is clinical–organizational complexity that is increasingly at the heart of medical work, to the point where all kinds of new genres and discourses are now seen to be necessary to handle these organizationally complex facets of medicine. As one among a number of such emergent discourses in and around medicine (such as clinical guidelines, clinical benchmarks, and so on), this critical incident report provides a means to reflect on how the work performed by people in one department affects what others do elsewhere, whether clinical documentation and communication facilitate such care trajectories, how other professionals fit into them, and whether adverse events arise as a consequence of care trajectories’ lack of explicit, proactive organization. 
The CIR as discourse phenomenon thus enables reflection on the changing nature of medicine more broadly and on how emergent medical genres and discourses now increasingly serve to foreground the integration of medicine with those alternative dimensions of patient care on which it increasingly depends: information and communication technologies; medical scientific evidence and treatment guidelines that are continually updated; patient and family preferences, concerns, and complaints; proactive interfacing of medicine with nursing and allied health practices, and so forth. Perhaps the dearth of linguistic attention to these contextual-organizational dimensions of medicine can be explained by pointing to the intimate sociological knowledge of situated medical practice that it requires. As Cicourel noted many years ago,

Participants of discourse or readers of a text are always engaged in selective use of the information that they can attend to. Limited capacity processing constraints imposed by a complex setting and the participants’ knowledge base can sharply reduce the participants’ comprehension of what is taking place. The researcher faces similar problems. We normally examine single utterances or connected discourse that run for a few lines, giving careful attention to lexical items, pronominal usage and repetition, the repetition of clauses that give referential prominence to a person, object or event, WH-cleft constructions, IT-cleft constructions, relative clauses, rhetorical questions, and the like; yet we may avoid or be unaware of information that presupposes organizational constraints and complex social relationships that can be obtained only by ethnographic field research. (Cicourel, 1982: 53)

We could say that we are interested only in the linguistic details of specific people’s discourse practices and not in the multivariate complexities that professionals confront when faced with organizing their expert work with others. This article suggests, however, that downplaying ‘organizational constraints’ as pointed to by Cicourel poses not just an analytical problem, but also a political quandary. Much linguistic analysis has delineated medical discourse in such a way as to conform with and thereby reinforce medicine’s own preferred self-image as an autonomous profession accountable only to itself – an image that in current circumstances is not just becoming unrepresentative of what medicine does, but also untenable in view of contemporary ethics. It is on those grounds that this article seeks to shift the spotlight toward research that has investigated the ‘conditions of possibility’ of medicine; that is, the organizational and professional practices that in effect afford the execution of medicine. The article profiles a body of research that operates on the interstices of a number of disciplines (linguistics, discourse analysis, health services research, nursing research, medical sociology, and clinical ethnography) and that questions the view of medicine as ‘discourse of all discourses.’ The research reviewed here challenges the closure that medicine itself imposes on what medicine is and does, and it defines its object of research in a way that exceeds the boundaries currently observed by the majority of linguistic and discourse studies of medical language. 
If we ask how the linguistic sciences have constituted medicine as their object to date, we could start by perusing the contents of prominent publications associated with the disciplines of linguistics and discourse analysis: Journal of Sociolinguistics, Journal of Pragmatics, Discourse and Society, Discourse Studies, Text, Research on Language and Social Interaction, as well as overview publications such as the Handbook of linguistics, Sociolinguistics: An international handbook of the science of society, the Handbook of pragmatics, the Handbook of discourse analysis, or more specialized collections of papers such as The construction of professional discourse (Gunnarsson et al., 1997), Talk, work and the institutional order (Sarangi and Roberts, 1999) and Talk at work (Drew and Heritage, 1992). What confronts us in our survey is a dearth of work on health care discourse and practice generally. In addition, we notice an emphasis on doctor talk, whether with patients or about patients. Then, of course, there are specialized domains such as ‘clinical linguistics,’ which centers on language disorders and speech pathology; a domain of analysis concerned with medical records and clinical documentation; a field of study that targets medical scientific discourse, its metaphors, and its technicalities; and one that considers the representation of health and medicine in the public domain. The literature further reveals that there is research on intraprofessional or doctor–doctor communication, but considerably less that systematically addresses interprofessional communication (doctors talking to nurses or allied health clinicians). There is even less research that addresses more complex stakeholder configurations involving interactions between doctors and health care managers, health policy makers, special interest groups, or community representatives. Little to no research targets the practices and discourses of health organizational change. This article problematizes these cut-off points inherent in the distinctions that the research just cited brings to bear on the study of medicine and medical practice. It does so in two ways. First, it points to evidence that modern medicine is now so complex that the distinctions just reviewed are increasingly difficult to maintain epistemologically and ontologically, and therefore analytically.
Second, this article reviews work on medical discourse that has operated in the margins of existing research, by addressing the multidisciplinary, organizational, and technological dimensions of health care provision through which and thanks to which medicine is increasingly being reconfigured. To realize these two aims, this article first presents a sketch of the complexity of 21st century Western health care by teasing out how the relationships between medicine, the other health professions, the State, the public, and policy have changed over the past 30 or so years. Second, and as presaged above, the article considers work that has begun to extend beyond established analytical parameters, in ways that do not delimit medical discourse and communication purely within the doctor–patient dyad, the technicalities of medical notation, or the peculiarities of medical science. This alternative work approaches medicine from a more ‘ecological’ perspective to encompass increasingly pervasive features of medical discourse and practice, such as team communication, multidisciplinary procedure mapping and treatment execution, risk management and practice review processes, organizational coordination and management, information technology and electronic drug prescribing, test ordering, computerized patient records, and so forth.

21st Century Medicine: Social, Cultural, Political, Economic, and Organizational Complexities

For some, questions may remain about why we as linguists who are interested in the study of the languages of medicine need to concern ourselves with the ways in which medical practices have changed over the past three or four decades. Is there any basic change to the fact that individual doctors predominantly treat and communicate with individual patients and caregivers? Aren’t hospitals much the same wherever we go, give or take a few new technologies and drugs, with similar medical specialties, common dress and uniform codes, analogous ward divisions and nursing populations, and related forms of administration and management? Even if there is evidence of medical subspecialization, organizational restructuring, and some task shifting between nursing and medicine, can it not be said that the same medicine – whether practiced as doctor talk or as medical writing – is still the mainstay of medical care? We can put this more assertively and say that, given the growing technological sophistication of medicine, its growing control over the boundary between life and death, and the increasingly multicultural constitution of medicine’s client base, should we not set more store by understanding how doctors (are to) communicate with their patients and with one another and by clarifying how the discourse of medical science is unfolding? Of course, the importance of work that has focused on illuminating the social values that underpin doctor–patient communication, medical professional communication, and historical or other aspects of medical discourse is incontestable. These initiatives are important for locating medicine as specialized practice in modern society. But part of this picture should be analytical attention paid to the changing face of medicine and to the implications of these changes for the practising doctor.
As part of illuminating the changes in medicine in contemporary society, we need not go back more than
approximately 30 years. Medicine’s pre-1970 history is far from irrelevant here, but suffice to say that, despite its heterogeneous origins, medicine gained a solid foothold in many industrialized and industrializing nation states in the course of the 20th century, combining science, specialized education and certification procedures, public institutionalization, and commercial investment. As a result of state support and institutionalization in the form of public hospitals, medicine was able to provide increasingly sophisticated treatments to growing numbers of people. These treatments forced up public investment and insurance, but medicine did little to manage its practices or its expenditures. Medicine’s ‘rise and rise’ was slowed somewhat with the advent of the 1970s, when, as a consequence of the oil crisis and other socioeconomic pressures, its profligacy with public monies was gradually called into question. The close of the 20th century has seen medicine coming under further pressure due to a number of events that shook public confidence in its ability to control and maintain its members’ standards of practice. Reports of serious medical failures in the United States, the United Kingdom, and Australia, in addition to findings suggesting that medical practice commonly results in approximately 10% of patients incurring injuries and deaths, raise critical questions about the trust and hope that societies have invested in doctors to act as people who practise medical science for the benefit of their patients in rational, expert, and judicious ways. The importance of these developments for linguistics and discourse analysts lies in the change of status that medicine has had to confront over the last two to three decades in most countries. Clearly, the autonomy and status built up by medical professionals during most of the 20th century have not just begun to be questioned but also eroded. 
In even the most routine facets of medical practice, doctors are affected by a host of constraining influences such as guidelines, clinical pathways, policies encouraging multidisciplinary approaches to teamwork, resource utilization checks, lengthy hospital accreditation surveys, ‘clinical pertinence’ reviews (which focus on whether medical records are properly filled out), practice guidelines, and so on. Furthermore, analyses of medical discourse can no longer ignore the interests, involvements, or influences of alternative stakeholders, even if they are not always physically copresent in the consultation room, on the ward, or in the laboratory. These stakeholders have made an indelible mark on the profile of medical discourse and practice, by requiring doctors to engage in ‘informed consent’ procedures (ensuring that patients are aware of the implications of treatments provided), ‘critical incident reporting and monitoring,’ ‘root cause analysis’ processes, complaints commissions’ inquiries, health care consumer organizations’ initiatives, nursing unions’ requirements for practice standardization and for nurses to increase their skills or cross-train, policy makers’ and medical colleges’ guidelines as to how to dispense medical services and treatments, managements’ directives with regard to resource expenditure and budget cuts, patients’ families’ requests for extended care or special drugs, litigants’ legal claims following substandard care, Internet group campaigns for the legalization of specific drugs, issue groups’ objections to genetic predetermination and experimentation, pharmaceutical companies’ pressure to prescribe brand-name drugs in place of their generic counterparts, and so on. In view of the impact of these phenomena on daily medical practice and the stresses that they cause for many doctors, it would be remiss of analysts interested in medical discourse to portray it as if these constraining influences remained peripheral to what doctors do and say, or without consequences for who they talk to and what they write about. Put more succinctly, contemporary medical discourse, whether spoken by the single doctor to a single patient or produced during a critical incident review procedure, is überhaupt (‘always already’) shot through with multiple competing interests. In this day and age, medicine consults, diagnoses, and researches in a ‘crowded space.’ In addition to being inevitable in the modern-day hospital, medical discourse that creates a space for different disciplinary and stakeholder views is central to producing good medical outcomes. In reality and in practice, then, medicine is fast becoming one voice among a host of others, offering one set of expertise and relying on others for different and complementary views, knowledge, and insights.
The question that arises at this point is whether there is research that has addressed any of the linguistic-discursive characteristics of these emergent facets of modern medicine.

Linguistic and Discourse Studies of Medicine as Crowded Space

As should now be more than evident, the view put forth in this article is that linguistic and discourse research needs to investigate medical language and communication so as to capture the ‘crowded clinical spaces’ in which it is enacted. This is not only crucial for doing justice to the complexity of contemporary medical discourse, it also prevents us from becoming unwitting players in the struggle over the changing definition of ‘profession’ in medicine and from
finding ourselves on the side of a medicine that tries to defend its traditional boundaries in the face of inevitable reform (Dent, 1998). Let us turn then to an overview of a heterogeneous body of work that confronts medical language and communication in a way that cuts across the neat confines of interactions occurring between doctors and patients or among doctors themselves. This body of work sets store by whether and how interprofessional and organizational coordination and communication are achieved, maintained, and, in some cases, enhanced. A good number of the investigations cited pay little attention to the linguistic details of medical representation, whereas in those that do, technical linguistic-interactive constructs, such as turn-taking, topicalization, hedging, and coherence, and so forth, have become subjugated to the need to answer questions about intraprofessional and interprofessional practices and processes (Cicourel, 1987: 220). It is important to begin, then, with acknowledging that a considerable amount of work on intra- and interprofessional clinical communication has been performed in disciplines that are not concerned with producing linguistic or discourse analytical findings. This includes work of a more instrumental or solution-oriented sort as well as (critical) sociological, anthropological, and ethnographic enquiries. Such studies are important, even if judged from a linguistic perspective they provide little insight into the syntactic, semantic, and generic features of the discourse practices under description. That said, there is a small body of work that uses discourse analytical methods for understanding and enhancing clinical–organizational communication. 
Especially interesting for the purposes of the present article is work that uses linguistic and discourse analyses to study multidisciplinary team meetings, with the aim of understanding the interprofessional dynamics between doctors and nurses, and how these affect care planning (Engstrom, 1986). Related findings range from the extent to which professional habitus enables or constrains cross-professional communication (Lingard et al., 2002), to how professional identities appear to determine the character of work meetings, resulting in highly variable impacts of executive-level decisions (Iedema et al., 1999), to how clinicians structure their communications as complex ‘polylogues’ (Grosjean, 2004). Also working across clinical–professional boundaries, Mellinger reported an interactive analysis of how paramedics and nurses coordinate their practices in environments filled with high levels of urgency and uncertainty (Mellinger, 1994). Focusing specifically on the logic and complexity of doctors’ practices and their relationship with clinical documentation, Pettinari (1988) wrote an extensive study about the ways in which surgeons work and how they make notations about their work, illuminating the degrees of tacit knowledge and presumed understanding that come into play there. For her part, Hobbs has also performed some important work on elucidating the medical assumptions underpinning the technical codes and symbols of clinical notation in residents’ progress notes (Hobbs, 2002). Others have addressed the work–documentation relationship, but from a sociotechnical angle, including Timmermans and Berg (1997). The complexities of various kinds of medical treatment are described in the work by Mol (2002) and Fox (1993), both of whom use versions of discourse-based ethnography. With regard to analyses of talk by doctors as professional identities, Montgomery-Hunter (1991) studied a variety of dimensions of medical discourse, such as case presentations and medical records as ‘doctors’ stories,’ and Jordens and Little (2004) described how doctors frequently position themselves in interview talk by enunciating a genre that is agnate to the Directive. Surprisingly, given the enormous changes that it has undergone in recent decades, there is limited work that has addressed the relationship between doctors and managers (Griffiths and Hughes, 2000) or, more broadly, between clinicians and nonclinicians (Smith and Preston, 1996). It is these domains of investigation in particular that have least been broached by linguistic and discourse approaches, with some exceptions. Iedema et al. (2004) have addressed the delicate interactions occurring between clinicians and managers in a health care organization that faced hospital reform, whereas Pope et al. (2004) have studied the discursive tensions among policy statements, managers’ views, and clinicians’ uptake of the shift toward ‘elective treatment only’ centers.
These analyses in effect contribute to redesigning the substance of the ‘conversations’ that currently characterize and constitute the clinical–managerial interface and to restructuring the evaluation of health care delivery based on those emerging conversations (Degeling et al., 2004).
Prominent for our purposes too is research arising from within nursing. The work of Crawford et al. (1998) analyzes nursing discourse and offers insight into a number of facets of nursing care, including into how clinical notation relates to what is spoken about among nursing clinicians and into the socially constructed nature of clinical knowledge, its documentation, and its implications for how care is organized. Crawford and colleagues’ work belongs to a vibrant tradition of nursing research that applies discourse analysis as methodology for illuminating general nursing issues (Boutain, 1999; Lawson, 2002). A thorny issue for nursing research is how the nurse–doctor relationship impacts on what the nurse does. Manias, Street, and Cheek are among some nursing researchers doing critical discourse ethnographies of nurse–doctor interactions (Manias and Street, 2001) as well as of nurse–doctor communications enacted through chart notations and the like (Cheek and Gibson, 1996). Consonant with the concerns of these investigations, there is research emerging from within the discourse analytical tradition that also deals with doctors’ and nurses’ talk from the perspective of what it says about how they work together as members of multidisciplinary clinical teams (Iedema et al., 1999).
Finally, innovative work on engendering new clinical practices involving different professionals as well as patients is emerging from the University of Helsinki under the guidance of Yrjö Engeström (see Engeström, 2003). Engeström’s work is better categorized as belonging to what he terms ‘Activity Theory’ than to language or discourse research, and this is not least because he uses complex video data to describe and reflect on what clinicians do with and around patients. Examples of highly innovative practices designed in this way are reported in both Yrjö and Ritva Engeström’s publications, centering on creating continuity of care across health caregivers and services by involving an acute-care specialist, a general practitioner, and the patient in his/her care planning meetings (Engeström, 2003). It is studies like these that are truly beginning to reinvent relationships and services in health care systems that for too long have relied on the image of the autonomous doctor epitomized by the doctor–patient consultation.
Here, new conversations such as that presaged in Degeling et al.’s (2004) work cited above are beginning to be proactively configured and investigated, in a move toward recognizing that the organization of doctoring is no longer a peripheral issue to contemporary health care and medicine.

Conclusion

In closing, let us return to the theme we started with. Above, we cast a brief glance at the contents of the major linguistics and discourse analysis publications, including journals, overview publications, and collected papers. It was stated that emphases in these publications were placed on how doctors communicate with patients (and vice versa) and on how medical discourse has written itself in the past or writes itself in the present. The article proceeded to argue that the orientations of medical discourse research are


out of step with the face of 21st century medicine. Only a minority of researchers have taken steps not just to contextualize the study of medical and clinical discourses and practices with emerging consumer, managerial, policy, and organizational issues in health care, but to shift the focus of description from the medical consultation (which stopped being the apotheosis of health care with the shift toward medical and information technologies) toward the multidisciplinary team and focus on the ways in which it plans, communicates, and delivers care. This article suggested that, if medicine operates in complex social, cultural, organizational, and political environments, few of which can be relied on to ‘spring spontaneously from the discourse’, linguistic and discourse researchers may need to rethink the basis of their methodology as well as the focus of their approach to medicine as a disciplinary regime. With regard to methodology, we reiterate Cicourel’s (1982) call for a more ethnographically informed kind of linguistic–discourse analysis. As for linguistics’ focus on the discourse of medicine, I have argued that the task of a socially and organizationally aware and practice-oriented analysis is to understand and clarify the dynamics of, and the struggles that choreograph, the contexts of which its objects of investigation are part. Put differently, analysis should not fall victim to the notion that the empirics of situated language production can substitute for an analysis of social life. Particularly at a time when health care is changing on an almost daily basis and is increasingly under pressure from more and more parties claiming interest and influence, our engagement, as linguists and discourse analysts, with processes through which the new medicine is increasingly being constituted is no longer a luxury option. 
Finally, several publications that combine delicate linguistic analyses with penetrating commentaries on the changing nature of medicine and health care have been cited. On that score, the present article bears witness to the fact that meaningful social engagement and linguistic delicacy are far from a zero-sum game. On the contrary, the subtext driving this article is that linguistic and discursive analysis is central not just to understanding and clarifying health professional and organizational change for ourselves as analysts, but also to enabling professionals to come to terms with the increasingly challenging complexities of their clinical work.

See also: Anatomical Nomenclature: History; Genre and Genre Analysis; Health and the Media; Medical Discourse: Doctor–Patient Communication; Medical Specialty Encounters; Organizational Discourse; Speech Genres in Cultural Practice.

Bibliography

Boutain D M (1999). ‘Critical language and discourse study: Their transformative relevance for critical nursing enquiry.’ Advanced Nursing Science 21(3), 1–8.
Cheek J & Gibson T (1996). ‘The discursive construction of the role of the nurse in medication administration: An exploration of the literature.’ Nursing Inquiry 3(2), 383–390.
Cicourel A (1982). ‘Language and belief in a medical setting.’ In Byrnes H (ed.) Contemporary perceptions of language: Interdisciplinary dimensions. Washington, DC: Georgetown University Press. 48–78.
Cicourel A (1987). ‘The interpenetration of communicative contexts: Examples from medical encounters.’ Social Psychology Quarterly 50(2), 217–226.
Crawford P, Brown B & Nolan P (1998). Communicating care: The language of nursing. Cheltenham: Stanley Thornes.
Degeling P, Maxwell S & Iedema R (2004). ‘Restructuring clinical governance to maximize its development potential.’ In Gray A & Harrison S (eds.) Governing medicine: Theory and practice. Maidenhead: Open University Press. 163–179.
Dent M (1998). ‘Hospitals and new ways of organising medical work in Europe: Standardisation of medicine in the public sector and the future of medical autonomy.’ In Warhurst C & Thompson P (eds.) Workplaces of the future. Basingstoke: Macmillan. 204–224.
Drew P & Heritage J (1992). Talk at work: Interaction in institutional settings. Cambridge: Cambridge University Press.
Engeström R (2003). Hybridity and responsibility: Opening to new practices of health communication and health care? Paper given to the communication, medicine and ethics conference. Cardiff, UK. June 26–29, 2003.
Engeström Y, Engeström R & Kerosuo H (2003). ‘The discursive construction of collaborative care.’ Applied Linguistics 24(3), 286–315.
Engstrom B (1986). ‘Communication and decisionmaking in a study of a multidisciplinary team conference with the registered nurse as conference chairman.’ International Journal for Nursing Studies 23(4), 299–314.
Fox N (1993). ‘Discourse, organisation and the surgical ward round.’ Sociology of Health and Illness 15(1), 16–42.
Griffiths L & Hughes D (2000). ‘Talking contracts and taking care: Managers and professionals in the British Health Service internal market.’ Social Science and Medicine 51(2), 209–222.
Grosjean M (2004). ‘From multi-participant talk to genuine polylogue: Shift-change briefing sessions at the hospital.’ Journal of Pragmatics 36, 25–52.
Gunnarson B L, Linell P & Nordberg B (1997). The construction of professional discourse. London: Longman.
Hobbs P (2002). ‘Islands in a string: The use of background knowledge in an obstetrical resident’s notes.’ Journal of Sociolinguistics 6(2), 267–274.
Iedema R, Degeling P & White L (1999). ‘Professionalism and organisational change.’ In Wodak R & Ludwig C (eds.) Challenges in a changing world: Issues in critical discourse analysis. Vienna: Passagen Verlag. 127–155.
Iedema R, Degeling P, Braithwaite J & White L (2004). ‘“It’s an interesting conversation I’m hearing”: The doctor as manager.’ Organization Studies 25(1), 15–34.
Jordens C F C & Little M (2004). ‘In this scenario, I do this, for these reasons: Narrative, genre and ethical reasoning in the clinic.’ Social Science and Medicine 58(9), 1635–1645.
Lawson M T (2002). ‘Nurse practitioner and physician communication styles.’ Applied Nursing Research 15(2), 60–66.
Lingard L, Reznick R, DeVito I & Espin S (2002). ‘Forming professional identities on the health care team: Discursive constructions of the “other” in the operating room.’ Medical Education 36, 728–734.
Manias E & Street A (2001). ‘Nurse–doctor interactions during critical care ward rounds.’ Journal of Critical Care Nursing 10, 442–450.
Mellinger W M (1994). ‘Negotiated orders: The negotiation of directives in paramedic–nurse interaction.’ Symbolic Interaction 17(2), 165–185.
Mol A (2002). ‘Cutting surgeons, walking patients: Some complexities involved in comparing.’ In Law J & Mol A (eds.) Complexities: Social studies of knowledge practices. Durham/London: Duke University Press. 218–257.
Montgomery-Hunter K (1991). Doctors’ stories: The narrative structure of medical knowledge. Princeton, NJ: Princeton University Press.
Pettinari C (1988). Task, talk and text in the operating room: A study in medical discourse. Norwood, NJ: Ablex Publishing Company.
Pope C, Robert G, Bate P, LeMay A & Gabbay J (2004). Metamorphosis of meaning and discourse in organizational innovation and change processes: A multi-level case study of NHS treatment centres. Paper given to the 6th international conference on organizational discourse. Amsterdam, July 28–30, 2004.
Sarangi S & Roberts C (1999). Talk, work and the institutional order: Discourse in medical, mediation and management settings. Berlin/New York: Mouton de Gruyter.
Smith A J & Preston D (1996). ‘Communications between professional groups in an NHS trust hospital.’ Journal of Management in Medicine 10(2), 31–37.
Timmermans S & Berg B (1997). ‘Standardization in action: Achieving local universality through medical protocols.’ Social Studies of Science 27(2), 273–305.

Medicine: Use of English
M Á Alcaraz Ariza, University of Alicante, Alicante, Spain
F Navarro, Cabrerizos (Salamanca), Spain
© 2006 Elsevier Ltd. All rights reserved.

English: Its Role as a Lingua Franca in the Techno-scientific Field of Medicine

A lingua franca is a common language serving as a regular means of communication relating to scientific, technological, and academic information between different linguistic groups in a multilingual speech community. Historically speaking, English first evolved as a lingua franca in the late 19th and early 20th centuries as a result of the British Empire, which exported its own language to all corners of the earth. In the second half of the 20th century, the American Empire replaced the British Empire throughout the world. English has reasserted itself as a medium for universal communication in an increasingly interconnected world. English is now the dominant or official language in over 60 countries and is represented on every continent (Crystal, 1997). Moreover, English is the dominant voice in international politics, banking, the press, news agencies, broadcasting, the recording industry, motion pictures, travel, technology, knowledge management, and communications. It is the main language of the Internet, not to mention books, newspapers, airports and air traffic control, international business and academic conferences, sports, international competitions, pop music, and advertising. English has equally become the common language of international experts in a wide range of subjects, such as medicine, the natural sciences, and the social sciences. In addition to this, an estimated three-quarters of the world’s mail is written in English and most of the information stored in electronic retrieval systems is in English. Consequently, English is now needed in order to adjust to the world trend of keeping pace with scientific, technological, economic, and social advances.

English as a Medium of International Communication in the Medical Sciences: Extralinguistic Factors

The rise of English as the lingua franca of science and technology in the second half of the 20th century seems to be determined by cultural and extralinguistic factors rather than by inherent linguistic characteristics. The origin of this phenomenon does not lie in any inherent suitability of English for adoption as an international language, but in the sociopolitical


changes that, mainly due to the Anglo-American influence, have taken place since World War II. All the countries in Western Europe have been affected to a greater or lesser degree by this dominant role of the United States of America, which is related to several well-known factors, such as its military, political, economic, scientific, and technological leadership, as well as the creation of the Atlantic Alliance and the diffusion of the culture, lifestyles, and behaviors of the English-speaking world. English has equally become a prime vehicle for the transmission of information, which explains its nearly absolute dominance in most scientific fields, because not only the world’s most widely cited medical journals but also most of the best contributions in science and medicine are published in English in international European or American journals. To this overall presence of English in traditional written communication systems, we must add the World Wide Web and the Internet, whose predominantly English voice has been rapidly exported from and imported into many languages. Therefore, non-English-speaking scientists, researchers, and practicing doctors have no other option but to learn English if they want to be informed of the latest developments in their fields. As English has turned into the primary medium of international specialized publication, many non-English-speaking scientists, being aware of the relevance of medical literature in English to their work and wanting to obtain responses to it, find it more effective to publish in English than in their native language. In this respect, it is interesting to note that many nations measure the productivity of their top international scientists and scholars by the number of times their works are quoted in English-language publications with an impact factor by the Science Citation Index.
The increasing tendency toward publishing the most important contributions in English and in top general-science journals, usually published in English, runs parallel to the standardization of research lines and the uniformity of writing style dictated by the editorial boards of scientific journals whose members, in a large proportion, have English as their mother tongue. Apart from being the primary medium of scientific publication, English has likewise emerged as the main language of international gatherings of specialists and of international scientific exchanges. In fact, the high level of technical and scientific knowledge, the necessity of collaboration among several specialists in order to establish a common base for work, and the complexity of the organization of production and of services in today’s society are all factors that foster the use of the same technical terms contemporaneously.

This trend toward the increasing use of one lingua franca, and of relatively few journals for each science, favors smoother communication between scientists and, consequently, rapid progress in science. The continually increasing contact between non-English-speaking scientists and the English-speaking scientific world, mainly through reading and, to a lesser extent, through writing and attending conferences, even reaches meetings of the national type and daily informal conversations between colleagues and national journals, where many English terms tend to creep in. The publishing scientists are equally given stylistic and rhetorical recommendations to write academic papers in their own native language, which follow and imitate the norms characteristic of the foreign English culture. Despite the obvious advantages of the existence of English as a lingua franca, its achievement nevertheless runs parallel to a series of interrelated drawbacks. One of the disadvantages caused by the supremacy of English in the world of science would be that the body of medical knowledge published in other languages is not taken into account. This fact indicates ignorance of the role played by these languages in the different phases of creation, invention, and innovation, and this may lead to uniformity of thought. Also at issue is the scarce participation of non-English-speaking scientists in the editorial policies of the major internationally recognized journals.

Borrowing: Linguistic and Extralinguistic Factors

Borrowing is one of the normal neological processes that every language has at its disposal to enrich its lexis. This neological process consists of two consecutive operations. First, a speaker is put in contact with a foreign word and adopts it in his or her language. This does not mean that the adopted word is used as it is in the original language. Usually, the speaker adapts the foreign word to the phonetics of the language he or she speaks. Then comes the diffusion and establishment in the recipient language of the generally phonetically adapted loanword. Thus, the former transfer is essentially an individual act, which can be completed by the latter, diffusion and establishment, which involves collective acceptance, and results in integration. Since all loans are neologisms, the underlying cause of borrowing, from either internal or external sources, is fundamentally the same as that of neologisms in general, i.e., the need for language, as a means of communication and expression, to take account of changes in the nonlinguistic world. But


loanwords differ from other types of neologisms in that their components or models are taken from external, and not from internal, sources. As far as the external source is concerned, drawing loanwords from any foreign language, which is traditionally known as ‘lexical borrowing’ (Haugen, 1950; Hope, 1971), is a direct consequence of a response to the internal lexical needs of a recipient language. The extension of the frontiers of knowledge in science and technology in particular has been accompanied by the creation of a vast terminology necessary to describe the discoveries made and to express the concepts that evolved in the course of this development. As English has become the international language used in science and technology, it is not surprising that it has become the primary source for the creation of new concepts and their corresponding denominations. These new concepts tend to carry their names into the adopting languages, becoming on many occasions integral parts of their word stocks. No language (unless it is a dead language, such as Latin) can avoid interference from other countries and other cultures, especially when the source language has become a lingua franca in several fields of knowledge. This is the case for English, which had been a receiving language for centuries and is now supplying other languages with the necessary English loans or Anglicisms. Sometimes the speed of development of new knowledge, which requires new words for new concepts, does not allow enough time to find a suitable translation, or it may be that laziness prevents this translation from being found.
The tendency toward economy of expression and the law of least effort can also be responsible for borrowing because the borrowed element is frequently a short term that the speaker and, above all, the writer are inclined to adopt, or it is often quicker and easier to borrow the foreign terminology along with the science, behavior, or product than to mine one’s own language for a suitable expression. Economy of expression is not the only purpose that prompts the use of English loanwords. Often the employment of a new and shorter term aims at precision and clarity if the recipient languages lack an unequivocal equivalent. The transfer of Anglicisms can also be accounted for by the scientists’ extensive reading of English-written medical literature. Another common feature, such as the greater facility with which English produces noun compounds, equally favors the adoption of straight English loanwords. Although the borrowed terms can be rendered by means of glosses and paraphrases, the lack of specificity and economy of some of the proposed solutions may lead to the conservation of their original form. The avoidance of

homonymic clash may likewise induce keeping the borrowed term unchanged, for instance, the English term shock, which has come into Spanish in order to avoid being identical in form and sound with the already existing Spanish word choque. These linguistic factors are not solely responsible for the adoption of Anglicisms. Social and psychological pressures also enter strongly into the promotion or assurance of their acceptability. The new foreign loanwords, associated with the strongest nation in the world, i.e., the United States of America, quite naturally take on certain general attributes of prestige, as they may evoke an image of quality, efficiency, reliability, and modern living. Also important is the in-talk of certain professional groups, who may rely on a borrowed terminology to mark their status while excluding others. Usually quoted examples are the medical and legal professions. Sometimes the adoption of English loanwords is prompted by the desire for a euphemistic term to express in a less direct way concepts regarded as painful or embarrassing. Pretentiousness and snobbery can equally cause words containing grouping of letters impossible in non-English languages to have an attraction for their speakers, who may also be tempted to borrow terminologies wholesale from English either for international comprehensibility or for greater acceptance among their peers.

Medical Anglicisms: Their Presence in Non-English European Languages

Medical terminologies, at least in Western Europe, rest on a fundamentally Latin nomenclature and on neologisms built up with roots, prefixes, and suffixes drawn from Greek and Latin, especially in the fields of anatomy and physiology (Dirckx, 1983). From the Middle Ages to the recent past, Latin, German (Standard German), and French all served traditionally as international languages in the field of medicine. Traces from other European languages, such as Italian and Spanish, may be found as well. Non-European languages have contributed, although to a lesser degree, to the formation of international medical terminology. But on the whole, the current growth of this melting pot of words from different origins is mostly due to the clear supremacy of the United States of America, which, especially after World War II, has been very strong (Maher, 1986). As English has turned into the most powerful medium of medical and scientific communication in Europe, it is not surprising to find hundreds of English words whose use has become common and standard in other European languages. A large proportion of these English words are even now commonly found in medical


dictionaries written in languages other than English, which is proof of their acceptance within non-English-speaking medical communities. The influence exerted by medical English on other European medical languages has affected all levels of their linguistic systems, ranging from lexis and semantics to syntax and rhetorico-pragmatics, with the borrowing of vocabulary items being nevertheless by far the most common.

Lexico-semantic Level

A technical term is defined on the basis of a direct relationship with the thing signified. This is why, in the contact between two cultures and languages, technical terms generally pass from one to the other along with the things they denominate. The vast majority of the borrowed sequences reveal themselves as foreign because of their spelling pattern, which may contravene the rules of syllable structure or phonemic distribution of the recipient languages. Examples of univerbal lexical Anglicisms with geminated vowels and consonants, consonant clusters, unusual graphs, or graphs in unusual positions found in many non-English European medical terminologies would be, for instance, words such as those found in Table 1. Scores of multiverbal lexical Anglicisms with consonant groupings are listed in Table 2. The form of a word sometimes encourages the preference for an unadapted Anglicism. Compounds that contain a noun and a particle are generally considered difficult to translate or impossible to translate adequately. This is why they can be found in various domains related to medicine (see Table 3). The economy of expression given by the -ing form also makes it an extremely popular structure with specialist writers, and it is not strange to find it in a

wide series of simple and compound terms, such as those found in Table 4. Although the vast majority of graphically unassimilated Anglicisms do reveal themselves as foreign, because of either their orthographic pattern or the lack of relation between pronunciation and spelling, there are several borrowed terms the spelling of which fits perfectly into the morphological patterns of the adopting language. Among these we could include words that are seldom pointed out by inverted commas or italics (see Table 5). In some cases, the borrowed words acquire native status by the degree of adaptation they undergo. This adaptation allows the foreign word to be adjusted to the prosodic, phonetic, or spelling norms of the borrowing language. Examples of words once considered to be foreign but fully adapted into a language such as Spanish would be the adjective standard and the noun stress, which have become estándar and estrés, respectively, in order to sustain the Spanish principle that a word should be written as it is pronounced. The role as a donor language played by English is also especially noted in the field of abbreviations, which are simply adopted by the recipient languages. Particularly common in medical writing are the initialisms, which consist of the first letters of the words that compose a phrase and which are especially popular for the names of diseases and of diagnostic and therapeutic procedures. These initialisms are not only understood within the profession, but they are also almost invariably substituted, even in print, for the full expressions. For example, corticotropin is probably better known by the initialism ACTH (adrenocorticotropic hormone) than by its full native name. The same happens to other compressed forms (see Table 6).

Table 1 Lexico-semantic level
Box        Flush      Pattern    Scatter
Buffer     Flutter    Plug       Scratch
Cast       Insight    Pool       Shunt
Clamp      Linkage    Prick      Stapler
Distress   Loop       Punch      Stent
Flash      Patch      Rash       Thrill

Table 2 Lexico-semantic level
Black tongue    Cross-match    Glossy skin          Odds ratio     Stem cell
Breakthrough    Crown glass    Half-life            Open door      Step and shot
Buffy-coat      End-point      Heat shock protein   Second look    Threshold
Bulldog         Feedback       Lag phase            Setpoint
Checklist       Flint glass    Mismatch

Table 3 Lexico-semantic level
Acting-out   Check-up        Down-regulation   Output     Up-regulation
Blackout     Crossover       Flare-up          Run-in     Upward creep
Burnout      Crossing-over   Follow-up         Run-out    Washout
Bypass       Cut-off         Input             Turnover

Table 4 Lexico-semantic level
Banding       Clubbing          Hardening   Screening    Stripping
Binding       Clumping          Kindling    Smoldering   Taping
Capping       Coping            Mapping     Smoothing    Walking
Casefinding   Counseling        Panning     Splicing     Wasting
Clamping      Doping            Priming     Splitting
Clapping      Dumping           Quenching   Strapping
Clinging      Flapping tremor   Scaling
              Freestanding

Table 5 Lexico-semantic level
Antisense    Core         Primer
Borderline   Covariance   Rate
Clearance    Deviance     Template
Cluster      Helper       Variance
Compliance   Hospice

Table 6 Lexico-semantic level
ATP (adenosine triphosphate)
BUN (blood urea nitrogen)
DNA (deoxyribonucleic acid)
HDL (high-density lipoprotein)
LDL (low-density lipoprotein)
MRI (magnetic resonance imaging)
PCR (polymerase chain reaction)
RNA (ribonucleic acid)
TSH (thyroid-stimulating hormone)

Another type of abbreviated form is the ‘acronym,’ i.e., an initialism that can be, and customarily is, pronounced like a word. Very common English medical acronyms include those found in Table 7. The bulk of loanwords borrowed from English usually belong to the nominal category, whereas the adjectival category plays a very small part. Exceptions in the language of medicine, and mainly used orally, would be adjectives such as low, high, or light, which, on the other hand, are not technical terms but words used in both common and specialized languages. A rare case of importation of adjectives in the written form would be the expressions slow-low and rapid-high. Worth mentioning is the fact that when an English word is taken into a foreign language, an adaptation of its meaning usually takes place. This adaptation follows a general tendency to narrowing of meaning, a fact especially obvious with items that are polysemic in the source language, and it is therefore usual for them to be borrowed from one specific context and with one meaning. The same term, together with its specific meaning in each case, may also be borrowed from other contexts and may be found in different medical branches and even in other fields outside medicine. For instance, scaling belongs to dermatology, odontology, and statistics, whereas loop has entered not only medicine but also aeronautics, computer science, and economics.

Table 7 Lexico-semantic level
ACE (angiotensin-converting enzyme)
ELISA (enzyme-linked immunosorbent assay)
NEFA (nonesterified fatty acids)
PEEP (positive end-expiratory pressure)
PUFA (polyunsaturated fatty acids)
SPET (single-photon emission tomography)

Semantic Level

Another quite frequent means of expressing new concepts in a recipient language is through semantic borrowing, which consists of applying a foreign meaning to a native word and which sometimes cannot be easily identified because a native element is involved. Two main word-formational devices can be distinguished under this type of borrowing: (1) semantic loanword, for which a native word acquires a new meaning under the influence of the corresponding English cognate, i.e., a word derived from the same, generally Latin, root; and (2) loan translation or calque, which is the result of the translation of an English collocation into the recipient language. It is significant that many examples of semantic loanwords, once numbered among the so-called faux-amis or false friends, i.e., words that are spelled the same, or almost the same, in two different languages but have different meanings, have today become current usage in medical literature. Good cases of semantic loanwords would be those of the Spanish verb asumir, which has taken on the additional English meaning of ‘to assume,’ or of the Spanish nouns evidencia, which is used as the equivalent of ‘sign,’ ‘proof,’ or ‘testimony’ under the influence of evidence, and ocurrencia, which, along with the meanings of ‘idea’ and ‘funny remark,’ is also used in the sense of ‘happening,’ according to occurrence. Examples of Spanish calques on English expressions are shown in Table 8. Further examples of prepositional phrases in Spanish would be the eponymous expressions, which have always been widely diffused in the language of medicine. It is sufficient to mention just a few of these expressions composed of a proper noun and a common noun (see Table 9).

Table 8 Semantic level
Buena práctica clínica ‘good clinical practice’
Célula asesina ‘killer cell’
Célula huésped ‘host cell’
Dosis umbral ‘threshold dose’
Efecto techo ‘ceiling effect’
Estudio doble ciego ‘double-blind study’
Línea basal ‘baseline’

Table 9 Semantic level
Ciclo de Ross ‘Ross cycle’
Enfermedad de Parkinson ‘Parkinson’s disease’
Ley de Allen ‘Allen’s law’
Prueba de Addis ‘Addis test’
Reacción de Porter ‘Porter’s reaction’
Signo de Hoover ‘Hoover’s sign’
Síndrome de Cushing ‘Cushing’s syndrome’

Syntactic Level

Although the impact on grammatical structure is less than that on the lexical and semantic levels, certain syntactic Anglicisms have definitively taken root. In Spanish, one of these Anglicisms would be an increased use of the passive voice instead of the more usual reflexive. The use of the gerund with a sense of posteriority and consequence or effect would be another example of syntactic Anglicism. English equally appears to have encouraged the attributive use of nouns where Spanish requires adjectives or prepositions (see Table 10). The greater flexibility of the adverbial position in English would have also led to the placement of adverbs and adverbials in new positions in other European languages. Moreover, there seems to be an increasing tendency in written scientific Spanish toward the word order subject–verb–object, which might be owed to the influence of English, in which this word order is virtually fixed. It is equally interesting to note the frequent appearance of the particle like, which can be combined in all sorts of compounds (see Table 11). All these compounds show the influence of English at the lexical and syntactic levels. On the one hand, the adopting lexicon is enriched with one English word, and, on the other hand, the preposition is placed after the noun and not before it, as Spanish syntax, for example, requires. Like may also be added to words of the recipient language, thereby illustrating the linguistic phenomenon called ‘loan blend,’ i.e., formations between loanwords and calques (see Table 12). Likewise, the mixture of calques and lexical Anglicisms should be noted in the Spanish expressions célula helper ‘helper cell’ and célula natural killer ‘natural killer cell,’ to mention just two of the many mixed syntagms that are characteristic of specialized languages where a feeling of urgency drives scientists to coin expressions with words from different origins. Other mixed words due to English influence would be the Spanish verbs deletear, mapear, printear, reportar, and testar, which conserve the first component of the English verbs ‘to delete,’ ‘to map,’ ‘to print,’ ‘to report,’ and ‘to test,’ respectively, and add to them the Spanish verbal morpheme -(e)ar.

Table 10 Syntactic level
Campaña anti-aborto ‘anti-abortion campaign’ instead of campaña antiabortista
Carcinoma célula pequeña ‘small-cell carcinoma’ instead of carcinoma microcítico or carcinoma de células pequeñas
Depresión posparto ‘postpartum depression’ instead of depresión puerperal or depresión del posparto
Diabetes insulina-dependiente ‘insulin-dependent diabetes’ instead of diabetes dependiente de la insulina
Estudio caso-control ‘case-control study’ instead of estudio de casos y controles
Linfoma no-Hodgkin ‘non-Hodgkin’s lymphoma’ instead of linfoma no hodgkiniano
Variabilidad interanálisis ‘interassay variability’ instead of variabilidad interanalítica

Table 11 Syntactic level
5-HTL-like
Heparin-like
HLTV-III-like
ICS-like
Inhibin-like
Insulin-like
Myelin-like
PDGF-like
PUU-like

Table 12 Syntactic level
Anfetamina-like ‘amphetamine-like’
Colágeno-like ‘collagen-like’
Eczemática-like ‘eczema-like’

Phonemic Level

Language contact situations do not bring about phonemic importation, as the original English phonemes are replaced in all cases by native phonemes, which are approximately equivalent and follow the native rules governing syllable structure and stress. For instance, stripper /ˈstrɪpə/, whose initial ‘s’ plus consonant does not exist in Spanish, is pronounced /esˈtriper/, even though the original spelling does not undergo any change. Another word with unmodified spelling is screening /ˈskriːnɪŋ/, which is pronounced /esˈkrinin/. Although there is no phonemic importation, the phonemic system does undergo change in that phonemic redistribution may take place as a result of the Anglicism, as occurs with plosive consonants, which may occupy new distributional paradigms, especially in the final position.

Typographic Level

An example of typographic Anglicism would be the attachment of prefixes to a noun by means of a hyphen, such as the neoclassical anti- and the temporal or sequential pre- and post- (see Table 13). The use of English inverted commas instead of the Latin ones, the absence of graphic accents, and wrong spellings would exemplify further cases of typographic Anglicisms in Spanish (see Table 14).

Table 13 Typographic level
Anti-alérgico ‘anti-allergic’
Anti-núcleo ‘anti-core’
Pre-núcleo ‘pre-core’
Pre-test ‘pre-test’
Post-test ‘post-test’

Table 14 Typographic level
Catatonia ‘catatonia’ instead of catatonía
Osteitis ‘osteitis’ instead of osteítis
Colorectal ‘colorectal’ instead of colorrectal
Halucinación ‘hallucination’ instead of alucinación
Linfokina ‘lymphokine’ instead of linfocina
Hematopoiesis ‘hematopoiesis’ instead of hematopoyesis
Amfotericina ‘amphotericine’ instead of anfotericina

Rhetorico-pragmatic Level

The extensive reading of English-written sources and of style manuals for writing academic papers, which are mostly translations of English papers, has led scientists worldwide to imitate some of the preferred English attenuating rhetorical patterns (Salager-Meyer et al., 2003). The usual English tendency toward hedging may be seen in Spanish in the epistemic use of modal auxiliary verbs expressing possibility, such as puede(n) ‘can/may’ and podría(n) ‘could/might,’ in a greater use of the semi-auxiliary verb parece(n) ‘to appear/to seem,’ or in the presence of adjectives, adverbs, and nouns related to the modals (see Table 15). The influence of English-speaking thought patterns may be equally responsible for another rhetorical strategy used abundantly today, i.e., the responsibility shifting that allows the researchers to ‘defocus’ the agents (researchers and readers) involved in the communication act and to focus on the information being transmitted. The exclusive use of a correct term in the adopting languages to the detriment of possible synonyms might equally be owed to English influence. This would be the case of the Spanish verb sugerir, which, due to the influx of its English cognate suggest, has a greater frequency of occurrence than some of its synonyms, which could better express the conclusion of a scientific work (see Table 16).

Table 15 Rhetorico-pragmatic level
Probabilidad ‘probability’
Probable ‘probable’
Probablemente ‘probably’
Posibilidad ‘possibility’
Posible ‘possible’
Posiblemente ‘possibly’
Quizás/quizá ‘perhaps/maybe’

Table 16 Rhetorico-pragmatic level
Aconsejar
Apuntar a
Colegir
Dar a entender
Dar una opinión
Dejar entrever
Demostrar
Denotar
Desprenderse
Evidenciar
Evocar
Hacer pensar
Indicar
Inferir
Insinuar
Mostrar
Permitir suponer
Poner de manifiesto
Poner de relieve
Probar
Proponer
Recomendar

Pseudo-Anglicisms

Although borrowing in a strict sense consists of the transfer into a recipient language of a word together with one of its meanings, some words may acquire semantic extensions beyond their original meanings when taken in by other languages. Examples of pseudo-Anglicisms found in Spanish would be the English verb relax, which has turned into a noun with the senses of ‘break’ and ‘relaxation,’ and the English noun scanner, referring to medical computed-tomography scanning, which in its already hispanicized form escáner has expanded to encompass not only the meaning of escanografía ‘scanography’ but also other nonmedical applications such as the scanners used in supermarkets to read barcodes. Other pseudo-Anglicisms that must count as fully native in Spanish would be blister and gating, which have never been used in British English or American English to refer to blister pack and cardiac gating, respectively.

A Word in Closing

Political and military might, economic power, and techno-scientific superiority seem to be the main factors that have brought about the role of English as a lingua franca. In addition, its overall presence in the content posted on the Internet and its nearly absolute dominance in scientific publications have turned English into a prime vehicle for the transmission of information. The vastly greater amount of research in English-speaking countries has equally led many European and Asian languages to be somewhat behind the times with respect to the plethora of English terminology being created every day. This is why much of this terminology has found a degree of acceptance in these recipient languages. Although the development of English as a world language may be seen with concern because it gives the originating cultures, namely, the British Isles and the United States of America, a certain advantage, especially in the world of science, medical loanwords may nevertheless be contemplated as a reflection of the ongoing progress of medicine, which needs an appropriate increase in its vocabulary to provide fluid scientific communication, not to mention the fact that many of the borrowed Anglicisms are eventually translated into the adopting languages or assimilated to their genuine linguistic rules.

See also: English: World Englishes; Lexical Semantics: Overview; Lexicology; Lingua Francas as Second Languages; Medical Discourse: Hedges.

Bibliography

Alcaraz Ariza M Á (2000). Anglicismos en el lenguaje de las ciencias de la salud. Alicante: Universidad de Alicante.
Alcaraz Ariza M Á (2000). ‘Exploring medical Spanish.’ International Journal of Translation 12(1/2), 71–80.
Ammon U (1998). ‘Englisch als Wissenschaftssprache der deutschsprachigen Länder.’ In Ist Deutsch noch internationale Wissenschaftssprache? Berlin: De Gruyter. 205–286.
Crystal D (1997). English as a global language. Cambridge: Cambridge University Press.
Dirckx J A (1983). The language of medicine: its evolution, structure and dynamics. New York: Praeger.
Haugen E (1950). ‘The analysis of linguistic borrowing.’ Language 26, 210–231.
Hope T E (1971). Lexical borrowing in the Romance languages: A critical study of Italianisms in French and Gallicisms in Italian from 1100 to 1900. Oxford: Blackwell.
Maher J (1986). ‘The development of English as an international language of medicine.’ Applied Linguistics 7(2), 206–218.
Murube J (1998). Influjo de la lengua inglesa en el español usado por los oftalmólogos. Madrid: Tecnimedia.
Navarro F (2000). Diccionario crítico de dudas inglés–español de medicina. Madrid: McGraw-Hill/Interamericana de España.
Navarro F A (2001). ‘El inglés, idioma internacional de la medicina: Causas y consecuencias de un fenómeno actual.’ Panace@ – Boletín de Medicina y Traducción 2(3), 35–51. Available at http://www.medtrad.org/pana.html.
Phillipson R (1992). Linguistic imperialism. Oxford: Oxford University Press.
Salager-Meyer F, Alcaraz Ariza M Á & Zambrano N (2003). ‘The scimitar, the dagger and the glove: Intercultural differences in the rhetoric of criticism in Spanish, French and English medical discourse (1930–1999).’ English for Specific Purposes 22(3), 223–247.
Tournier J (1998). Les mots anglais du français. Paris: Belin.
Weinrich H, Markl H, Wickler W, Heckhausen H, Lippert H, Schwabl W & Karger T (1986). ‘Die Spitzenforschung spricht Englisch – oder etwa nicht?’ In Kalverkämper H & Weinrich H (eds.) Deutsch als Wissenschaftssprache. Tübingen: Gunter Narr. 15–94.

Meeussen, Achille Emile (1912–1978)
P Swiggers, Katholieke Universiteit Leuven, Leuven, Belgium
© 2006 Elsevier Ltd. All rights reserved.

Born on April 6, 1912 in Sint-Pieters-Jette, Achille Emile Meeussen studied classical philology in Leuven. He obtained his Ph.D. in 1938 with a thesis in the field of Indo-European comparative grammar (more specifically, on the criterion for ablaut phenomena). In the 1940s, he started publishing extensively on Flemish (Vlaams) dialects, Dutch, and also on African languages, a field that he had then recently discovered for himself. In 1950, after spending some time at the School of Oriental and African Studies in London, Meeussen became a staff member of the Musée royal du Congo belge (later: Musée royal de l’Afrique centrale) in Tervuren, and in 1952 he was also appointed professor in the African Institute at the University of Leuven, where he was able to form a generation of well-trained Africanists and descriptive linguists. When the African Institute was closed, Meeussen continued his teaching at the University of Leiden (1964–77); he also taught sporadically in Lyon. He remained attached to the linguistic section of the Tervuren museum until his death on February 5, 1978.

Meeussen did fieldwork on a wide variety of Bantu languages, and initiated a comprehensive project on the comparative grammar and the reconstruction of Bantu (at present the project is continued by staff members of the Tervuren museum). He also studied African literature, music, folklore, and customs. In addition, Meeussen wrote insightful papers dealing with problems of phonology, morpho(pho)nology, and syntax, concerning Dutch (his native language), Japanese, and a number of Algonquian languages (Unami or Delaware, Cheyenne, Blackfoot, Mistassini or Inland Eastern Cree; cf. Swiggers, 1983). His most important achievements, however, are in the field of Bantu linguistics: a very impressive synchronic realization is his grammar of the Rundi language (1959); and his most important comparatist publications are his Bantu grammatical (1967) and posthumously republished lexical reconstructions (1980).

Meeussen’s approach was a structuralist one, and his work represents a very coherent, systematic, and highly authentic brand of structuralism, characterized by methodological consciousness, and an acute sense of rigor and descriptive economy. He developed a formally based approach to Bantu tonological structures and morphosyntactic structures, anticipating aspects of autosegmental phonology (Meeussen’s name is attached to a rule by which a sequence of two high tones is changed into a sequence of high + low, or by which two accents are reduced to a single accent). In later years, Meeussen used the formalist framework of generative linguistics (especially generative phonology), without subscribing to its psycholinguistic and metatheoretical claims.

See also: Bantu Languages.

Bibliography

Coupez A (1977). ‘A. E. Meeussen.’ Africa-Tervuren 23(III–IV), 57–63.
Coupez A (1980). ‘A. Meeussen (1912–1978).’ Africana Linguistica 8, 3–22.
Meeussen A E (1952). Esquisse de la langue ombo. Tervuren: Musée royal du Congo belge.
Meeussen A E (1954). Linguïstische schets van het Bangubangu. Tervuren: Musée royal du Congo belge.
Meeussen A E (1959). Essai de grammaire rundi. Tervuren: Musée royal du Congo belge.
Meeussen A E (1965). Ethnolinguïstiek en taaltheorie. Leiden: Universitaire Pers.
Meeussen A E (1967). ‘Bantu Grammatical Reconstructions.’ Africana Linguistica 3, 79–121.
Meeussen A E (1971). Eléments de grammaire lega. Tervuren: Musée royal de l’Afrique centrale.
Meeussen A E (1980). Bantu Lexical Reconstructions. Tervuren: Musée royal de l’Afrique centrale.
Swiggers P (1983). ‘A. E. Meeussen (1912–1978).’ International Journal of American Linguistics 49, 428–429.

Meigret, Louis (?1500–1558)
D A Kibbee, University of Illinois, Urbana, IL, USA
© 2006 Elsevier Ltd. All rights reserved.

Louis Meigret is best known today for his Tretté de la grammère françoeze (1550) and his work on orthographic reform (from 1531 to his death). He was born into an important family of jurists in Lyon and pursued the typical Humanist career of translator. He translated primarily from Latin, but also from Greek and Italian, focusing on works about the art of war and the natural sciences.

Meigret’s first foray into linguistic analysis came in his formulation of a new orthography for the French language. Many complained of the lack of correlation between the spoken language and the written; Meigret supplied some concrete suggestions in 1531, although these were not published until 1542. For Meigret, the battle for an orthography in which each letter stood for one sound, and each sound was represented by only one letter, was part of the greater Humanist program, in which reason rather than superstition would reign. Ironically for this dedicated scholar of Greek and Latin, reason meant abandoning real or imagined connections between the spelling of the ancient languages and the spelling of French.

The battle cry for orthographic reform appeared in Champfleury (1529) by the printer Geofroy Tory. In the same year Tory mentioned his intention to publish ‘General Rules for the Spelling of the French Language,’ but this work never appeared. An anonymous treatise on French orthography was published later in 1529, with the goal of bringing French writing back to its former ‘integrity.’ While the author of that treatise and Jacques Dubois thought that re-establishing the integrity of French meant reverting to Latinate forms and morphological distinctions, Meigret took a radically different tack. Meigret’s orthographic reform promoted spelling based on pronunciation, rather than on derivation and etymology. He was immediately attacked by Guillaume des Autels, with whom he had an extended public debate, and then by Jacques Peletier du Mans and Théodore de Bèze. The publication in 1548 of his translation of Lucian of Samosata’s Philopseudes (Le menteur) in his reformed orthography, as well as his own Tretté de la Grammère Françoeze in 1550, demonstrated the scientific interest and the practical failure of his reform. Readers found his system virtually impossible to decipher.

In spite of the difficulties presented by its writing system, the Tretté de la Grammère Françoeze is arguably the most original and most interesting work on the French language in the 16th century. It is divided into 11 books, some quite short. The first deals with the sounds of French, and the articles. The next seven are devoted to the parts of speech (noun, pronoun, verb and participle, adverb, preposition, conjunction, interjection), the last three to prosody and punctuation. Throughout, Meigret grapples with the question of whether reason or usage is the best guide to grammatical rules, and provides glimpses of regional and social variation in mid-16th-century French.

See also: Dubois, Jacques (Sylvius) (1478–1555); French; Grammar; Greek, Modern; Italian; Latin.

Bibliography

Cameron K (1979). Traite touchant le commun vsage de l’Escriture Françoise (1545). Exeter: University of Exeter Press.
Freyssinet G (1998). ‘Écriture du français et projets humanistes: Meigret, Peletier et quelques autres.’ Nouvelle Revue du Seizième Siècle 17, 37–54.
Hausmann F J (1980). Louis Meigret humaniste et linguiste. Tübingen: Gunter Narr.
Hausmann F J (1980). Louis Meigret. Le Traité de la Grammaire Française (1550). Le Menteur de Lucien. Aux Lecteurs (1548). Tübingen: Gunter Narr.
Kibbee D A (2003). ‘Louis Meigret, le parler lyonnais et les politiques de la langue française à la Renaissance.’ In Dufaux G (ed.) Lyon et la défense et illustration de la langue française. Lyon: ENS Éditions. 63–75.
Meigret L (1972). Traite touchant le commun vsage de l’Escriture Francoise (1542). Le tretté de la grammere françoeze (1550). Defenses de Louis Meigret touchant son Orthographie Françoeze (1550). La Reponse de Louis Meigret a l’apolojie de Iaqes Pelletier (1550). Reponse de Louis Meigret a la dezesperée repliqe de Glaomalis de Vezelet (1551). Genève: Slatkine.
Swiggers P (1997). ‘Le Tretté de la grammere françoeze (1550) de Louis Meigret: la description et la terminologie du nom.’ In Lieber M & Hirdt W (eds.) Kunst und Kommunikation: Betrachtungen zum Medium Sprache in der Romania. Tübingen: Stauffenburg. 311–325.

Meillet, Antoine (Paul Jules) (1866–1936)
T C Christy, University of North Alabama, Florence, AL, USA
© 2006 Elsevier Ltd. All rights reserved.

Antoine Meillet, whose teachers included Ferdinand de Saussure, Louis Havet, Victor Henry, James Darmesteter, and Michel Bréal (whom he succeeded at the Collège de France), is widely known for his contributions to the fields of comparative Indo-European linguistics, classical studies, the Slavic languages, Armenian, and general linguistics. Among his many publications, those included in the Bibliography below have gone through numerous editions and translations and stand out in terms of both their contemporaneous impact and their enduring relevance. Of perhaps greatest importance in terms of contemporary relevance is the collection of his major linguistics articles published under the title Linguistique historique et linguistique générale (1958; originally published as vol. 1, 1921, vol. 2, 1936). Two articles in this collection merit special attention. In his ‘Comment les mots changent de sens’ (1906, 1958: 230–271) Meillet considers language to be first and foremost a social fact, and accordingly seeks social causes for semantic change. In ‘L’évolution des formes grammaticales’ (1912, 1958: 130–148) Meillet coins the term “grammaticalization” (1958: 133) to describe the process whereby words, through combination and habituation, lose their principal meaning to become accessories, grammatical markers. Grammaticalization is now a central topic in linguistic research (cf., e.g., Hopper and Traugott, 1993; Ramat and Hopper, 1998).

See also: Armenian; Bréal, Michel Jules Alfred (1832–1915); Cultural and Social Dimension of Spoken Discourse; Evolution of Semantics; Evolution of Syntax; Grammaticalization; Morphologization; Saussure, Ferdinand (-Mongin) de (1857–1913).

Bibliography
Auroux S et al. (eds.) (1987). Archives et documents de la Société d’Histoire et d’Épistémologie des Sciences du Langage, 8. Paris: S.H.E.S.L.
Auroux S (ed.) (1988). Antoine Meillet et la linguistique de son temps (vol. 2, 2 of Histoire, Épistémologie, Langage). Paris: S.H.E.S.L.
Bouquet S (1987). ‘Les archives d’Antoine Meillet au Collège de France.’ In Auroux S et al. (eds.) Archives et documents de la Société d’Histoire et d’Épistémologie des Sciences du Langage. Paris: S.H.E.S.L. 113–140.
Hopper P J & Traugott E C (1993). Grammaticalization. Cambridge: Cambridge University Press.
Meillet A (1903). Introduction à l’étude comparative des langues indo-européennes. Paris: Hachette.
Meillet A (1908). Les dialectes indo-européens. Paris: Champion.
Meillet A (1925). La méthode comparative en linguistique historique. Oslo: Aschehoug & Co.
Meillet A (1958). Linguistique historique et linguistique générale (first published as vol. 1, 1921, and vol. 2, 1936). Paris: Champion.
Ramat A G & Hopper P J (eds.) (1998). The limits of grammaticalization. Amsterdam: Benjamins.

Sebeok T A (ed.) (1966). Portraits of linguists: a biographical source book for the history of western linguistics, 1746–1963 (2 vols). Bloomington: Indiana University Press.
Sommerfelt A (1966). ‘Antoine Meillet, the scholar and the man.’ In Sebeok T A (ed.), vol. 2, 241–249.
Swiggers P (1996). ‘Meillet, Antoine.’ In Stammerjohann H (ed.) Lexicon grammaticorum: who’s who in the history of world linguistics. Tübingen: Niemeyer. 622–624.
Vendryes J (1966). ‘Antoine Meillet.’ In Sebeok T A (ed.), vol. 2, 201–240.

Meinhof, Carl Friedrich Michael (1857–1944) S Pugach, Ohio State University, Lima, OH, USA © 2006 Elsevier Ltd. All rights reserved.

Along with Diedrich Westermann, Carl Meinhof is usually considered one of the main founders of the discipline of African linguistics – Afrikanistik – in Germany (see Westermann, Diedrich Hermann (1875–1956)). He was born on July 23, 1857, in Barzwitz, a village in the Pomeranian county of Schlawe that is now part of Poland, and died of old age on February 11, 1944, in Hamburg. He was descended from a long line of Swabian pastors, who included his father, Friedrich. His mother, Clara Giesebrecht, came from a noted intellectual family: Meinhof’s grandfather, Karl Ludwig, was a poet and director of the Grauen Kloster high school in Berlin; his great-uncle Ludwig was a prominent 19th-century poet and philosopher; and his uncle Wilhelm was an historian of the Holy Roman Empire. Meinhof was to follow both his father’s and mother’s families, becoming a minister and a scholar. He had 11 siblings, two of whom were among his first teachers as he began his education at home in Barzwitz. In 1868 Meinhof moved in with a relative in Halle, where he first attended Gymnasium, and in 1875 progressed to the University of Halle. Meinhof’s concentration was theology, and he worked under Luther expert Julius Köstlin. After transferring to the University of Erlangen, Meinhof started to study Germanistik and to conduct linguistic research. He worked briefly but intensely with philologist Rudolf von Raumer (see Raumer, Rudolf von (1815–1876)), an author who wrote on the theories of August Schleicher (see Schleicher, August (1821–1868)) and was concerned with such issues as standard orthography, phonetic change, and language origins. Raumer was interested in demonstrating a genetic connection between Indo-European and Semitic languages (Raumer, 1863, 1876), and his theories had an undeniable impact on Meinhof, who in his later career focused on proving relationships among the Semitic, Hamitic, Bantu (Bantoid), and Nigritic language families in Africa.

Meinhof returned to religious studies in 1877, transferring yet again to the University of Greifswald, where he passed a series of theological exams in 1878 and 1879. After sitting for his examinations, Meinhof worked briefly as a high school teacher in Wolgast and Stettin, and in 1886 took a post as pastor in the small Pomeranian town of Zizow, not far from his family home in Barzwitz. All the while he continued his studies, passing more examinations, learning Assyrian, Aramaic, and Arabic, and expanding an already solid background in Hebrew. In Zizow, Meinhof also discovered his passion for Africa and chose to abandon the study of Asian languages for African ones in order to serve Germany’s colonial interests (Pugach, 2001). He studied Duala and related Cameroonian languages with a Cameroonian named Njo Dibone, and began to teach these and other African languages to missionaries. He wrote several articles on Duala and related dialects that appeared in the Zeitschrift für Afrikanische Sprachen, and in 1899 published the Grundriss einer Lautlehre der Bantusprachen (An introduction to Bantu phonology), which soon became a standard in the field of African linguistics. The Grundriss offered phonological descriptions of six Bantu languages, including Swahili and Duala. The book also described an Urbantu, which Meinhof distilled from knowledge of the vocabulary, morphology, and grammar of the contemporary Bantu languages (Meinhof, 1899, 1910, 1932). The concept of Urbantu was similar to that of ‘Urindogermanisch,’ a term that August Schleicher had proposed to describe the parent of the Indo-European or Indo-Germanic language family (Schleicher, 1866). Meinhof also went beyond Urbantu in the Grundriss, providing a description of Bantu noun classes that was considered the most complete in existence until the publication of Malcolm Guthrie’s work in the 1970s (see Guthrie, Malcolm (1903–1972)).
After the publication of the Grundriss, Meinhof was hired to teach African languages at the Seminar für Orientalische Sprachen in Berlin. He continued his research on Bantu there and at the Hamburg Kolonialinstitut, where he transferred in 1909. Meinhof also started to research African languages that were not Bantu, and published an article on West African Fulbe (Adamawa Fulfulde) that posited the language as a bridge between Africa’s Hamitic and Nigritic languages (Meinhof, 1911). This was controversial, because Meinhof’s understanding of the differences between the Hamitic and Nigritic language families was part linguistic and part racial, a fact that was not lost on linguists of the time (Schuchardt, 1912; Sapir, 1913). Those groups that spoke Hamitic languages – a categorical designation that linguists have since dismissed – were commonly classified as racially ‘superior’ to the Nigritic speakers, who were said to have darker skin than the Hamites (Sanders, 1969). Nonetheless, Meinhof insisted that his work was purely linguistic and continued to publish on the Hamitic languages for much of his career, most notably in the Sprachen der Hamiten (Languages of the Hamites) (Meinhof, 1912). However, because of his racialized discussion of African languages, Meinhof’s relevance waned after his death, when Joseph Greenberg demonstrated that many of his theories were false and relied more on pseudoscientific biological evidence than linguistic proof (Greenberg, 1966).

See also: Africa as a Linguistic Area; Bantu Languages; Guthrie, Malcolm (1903–1972); Proto-Bantu; Raumer, Rudolf von (1815–1876); Schleicher, August (1821–1868); Westermann, Diedrich Hermann (1875–1956).

Bibliography
Bernhardt H (ed.) (1990). Beiträge zur Geschichte der Humboldt-Universität zu Berlin. Berlin: Der Rektor, Humboldt-Universität zu Berlin.
Bourquin W, Doke C, Eiselen W W M, Lestrade G P & van Eeden B I C (1946). ‘Meinhof’s contributions to our knowledge of African languages.’ African Studies 5, 73–77.
Brauner S (1995). ‘Carl Meinhof und Wilhelm Wundt.’ In Fleisch A & Otten D (eds.) Sprachkulturelle und historische Forschungen in Afrika: Beiträge zum 11. Afrikanistentag, Köln, 19–21 September 1994. Köln: Rüdiger Köppe Verlag. 59–69.
Dammann E (1999). 70 Jahre erlebte Afrikanistik: ein Beitrag zur Wissenschaftsgeschichte. Berlin: D. Reimer.
Doke C (1943). ‘The growth of comparative Bantu philology.’ African Studies 2(1), 41–64.
Gerhardt L (1995). ‘The place of Carl Meinhof in African linguistics.’ Afrika und Übersee 78, 163–175.
Greenberg J (1966). The languages of Africa. Bloomington: Indiana University Press.
Hering R (2000). ‘Meinhof, Carl Friedrich Michael.’ In Biographisch-Bibliographisches Kirchenlexikon, Band XVII. Herzberg, Germany: Verlag Traugott Bautz.
Lukas J (1965). ‘Afrikanische Sprachen und Kulturen – der Hamburger Beitrag zu ihrer Erforschung.’ Mitteilungen der Geographischen Gesellschaft in Hamburg 56, 149–179.
Meyer-Bahlburg H & Wolff E (eds.) (1986). Afrikanische Sprachen in Forschung und Lehre: 75 Jahre Afrikanistik in Hamburg (1909–1984). Berlin, Hamburg: Dietrich Reimer.
Meinhof C F M (1910). Grundriss einer Lautlehre der Bantusprachen nebst Anleitung zur Aufnahme von Bantusprachen (2nd edn.). Berlin: Dietrich Reimer (Ernst Vohsen).
Meinhof C F M (1911). ‘Das Ful in seiner Bedeutung für die Sprachen der Hamiten, Semiten und Bantu.’ Zeitschrift der Deutschen Morgenländischen Gesellschaft 65, 177–200.
Meinhof C F M (1912). Die Sprachen der Hamiten. Berlin: L. Friedrichsen & Co.
Miehe G (1983). ‘Meinhof, Carl Friedrich Michael.’ In Jungraithmayr H & Möhlig W J G (eds.) Lexikon der Afrikanistik: Afrikanische Sprachen und ihre Erforschung. Berlin: Dietrich Reimer Verlag. 161–162.
Miehe G (1996). ‘Vom Verhältnis zwischen Afrikanistik und allgemeiner Sprachwissenschaft.’ Paideuma: Mitteilungen zur Kulturkunde 42, 267–284.
Möhle H (1999). Branntwein, Bibeln, und Bananen: der deutsche Kolonialismus – eine Spurensuche (in Hamburg). Hamburg: Libertäre Assoz.
Pugach S (2001). Afrikanistik and colonial knowledge: Carl Meinhof, the missionary impulse, and African language and culture studies in Germany, 1887–1919. Ph.D. diss., University of Chicago.
Raumer R von (1863). ‘Die Urverwandschaft der semitischen und indoeuropäischen Sprachen.’ In Raumer R von (ed.) Gesammelte sprachwissenschaftliche Schriften. Frankfurt a. M. und Erlangen: Heyder & Zimmer.
Raumer R von (1876). Sendschreiben an Herrn Professor Whitney, über die Urverwandschaft der semitischen und indogermanischen Sprachen. Frankfurt a. M.: Heyder & Zimmer.
Reineke B & Dodt W (1986). ‘Sprache und Kultur im Werk von Carl Meinhof.’ Ethnographisch-Archäologische Zeitschrift 27, 455–471.
Reineke B (1990). ‘Afrikanische Sprachen am Seminar für Orientalische Sprachen.’ In Bernhardt H (ed.) Beiträge zur Geschichte der Humboldt-Universität zu Berlin. Berlin: Der Rektor, Humboldt-Universität zu Berlin. 64–73.
Sanders E R (1969). ‘The Hamitic hypothesis: its origins and functions in time perspective.’ Journal of African History X(4), 521–532.
Sapir E (1913). ‘Review of Carl Meinhof, Die Sprachen der Hamiten.’ Current Anthropological Literature II. Lancaster, PA: American Anthropological Association and the American Folklore Society. 21–27.
Schleicher A (1866). Compendium der vergleichenden Grammatik der indogermanischen Sprachen. Kurzer Abriss einer Laut- und Formenlehre der indogermanischen Ursprache, des Altindischen, Alteranischen, Altgriechischen, Altitalischen, Altkeltischen, Altslawischen, Litauischen und Altdeutschen. Weimar: H. Böhlau.
Schuchardt H (1912). ‘Meinhof, Carl: Die Sprachen der Hamiten, nebst einer Beigabe: Hamitische Typen von Felix von Luschan. Mit 33 Abbildungen auf 11 Tafeln und 1 Karte. Hamburg: L. Friedrichsen, 1912. Großoktav, 256 S.’ Wiener Zeitschrift für die Kunde des Morgenlandes 26, 407–408.

Meinong, Alexius (1853–1920) E Shay, University of Colorado, Boulder, CO, USA © 2006 Elsevier Ltd. All rights reserved.

Alexius Meinong (Alexius von Meinong, Ritter von Handschuchsheim) was born July 17, 1853, in Lemberg, Austria (now L’viv, Ukraine), to a family of minor German nobility. He attended the Vienna Academic Gymnasium and enrolled in 1870 at the University of Vienna. After receiving a degree in history in 1874, he turned to the study of philosophy under positivist philosopher Franz Brentano. Meinong’s first academic appointment, as Privatdozent (lecturer), was at the University of Vienna in 1878. In 1882 he accepted an associate professorship in philosophy at the University of Graz. He was promoted in 1889 to full professor, the position he held until his death on November 27, 1920. In 1894 he established Austria’s first laboratory for experimental psychology at Graz. Like Brentano, Meinong viewed philosophy not as a potential answer to metaphysical questions but as an empirical science. He proposed that if relationships among cognition, perception, language, and reality could be clarified, then topics such as morality, ethics, and emotions could be discussed in empirical terms, ultimately resulting in a synthesis of philosophical thought. To this end, Meinong developed a theory of assumptions, a theory of evidence, a theory of value, and a theory of objects (Gegenstandstheorie). The first theory has to do with the nature of assumptions and the role they play in social phenomena, including communication. The theory of evidence concerns types of evidence used in reasoning and communicating: direct vs. indirect evidence, a priori vs. a posteriori evidence, and evidence for certainty vs. evidence for presumption. The theory of value involves emotional reactions associated with the existence of certain objects. The theory of objects concerns the ability to think about and describe objects that do not exist, e.g., round squares, and its ramifications for the notion of meaning: “What a speaker wants to ‘say,’ or, more exactly, what he wants to speak about, is not that which his words express, but that which they mean, and that is not the content, but the object of the idea expressed by the word” (Meinong 1899; translated in Lindenfeld 1980, 135–136). Meinong’s work, which lies at the intersection of philosophy, psychology, and linguistics, was cited in contemporary and later works by Frege, Saussure, and Russell. His discussion of relationships among thought, symbol, and meaning has application in the fields of cognitive linguistics and formal semantics.

See also: Cognitive Linguistics; Formal Semantics; Frege, Gottlob (1848–1925); Russell, Bertrand (1872–1970); Saussure, Ferdinand (-Mongin) de (1857–1913).

Bibliography
Chisholm R M (1967). ‘Alexius Meinong.’ In Edwards P (ed.) The encyclopedia of philosophy, vol. 5. New York: Macmillan. 261–263.
Findlay J N (1963). Meinong’s theory of objects and values (2nd edn.). Oxford: Clarendon Press.
Lindenfeld D F (1980). The transformation of positivism: Alexius Meinong and European thought, 1880–1920. Berkeley: University of California Press.
Meinong A (1899). ‘Über Gegenstände höherer Ordnung und deren Verhältnis zur inneren Wahrnehmung.’ In Haller R (ed.) Alexius Meinong, Gesamtausgabe vol. II: Abhandlungen zur Erkenntnistheorie und Gegenstandstheorie. Graz: Akademische Druck- u. Verlagsanstalt. 377–480.
Russell B (1904). ‘Meinong’s theory of complexes and assumptions’ (three articles). Mind 13, 204–219, 336–354, 509–524.
