Lerdahl - 2 Ways in Which Music Relates to World
Essay: Two Ways in Which Music Relates to the World
Fred Lerdahl
Of all the arts, music possesses the most technical vocabulary. This state of affairs gives music theorists the ability to speak and write about music with enviable precision, but it also isolates us. Technical training in music theory is a specialized endeavor. Nonmusicians, and even musicians who are not theoretically inclined, do not easily understand us. From our isolation and their incomprehension comes the tendency to regard music as existing in a bubble, unrelated to anything else in the world. This view is surely mistaken. Here, I shall discuss two respects in which music relates to the world beyond itself: its common origin and shared structures with language, and its projection of intuitions of tension, attraction, and agency through the internalization of motion. Both aspects are fundamental to musical emotion.
Music exists in complex form only in the human species, and it appears in all human societies. How did it arise? Early ethnomusicologists were concerned with this question, but in recent decades the issue has largely been neglected. A sign of recent reengagement is a rather speculative book, The Origins of Music, in which biologists, paleontologists, evolutionary psychologists, and anthropologists propose that music-making conferred an evolutionary advantage upon our distant ancestors.1 The hypothesized causes for the musical capacity include Darwinian sexual selection, synchronized group behavior, social bonding during grooming, mother-infant interplay, and expressive gestural communication. These causes are not mutually exclusive. One suggestive idea in this volume, by Steven Brown, is that music first emerged together with language in a “musilanguage” that eventually split into the two modalities that we recognize today. The notion that music and language have the same source goes back at least to Jean-Jacques Rousseau, who wrote:
With the first voices came the first articulations or sounds formed according to the respective passions that dictated them . . . Thus verse, singing, and speech have a common origin. The first discourses were the first songs. The periodic recurrences and measures of rhythm, the melodious modulations of accents, gave birth to poetry and music along with language.2
Wallin, et al., 2001.
music theory spectrum 25 (2003)
Brown’s evolutionary argument is very general, but it can be supported by two lines of contemporary evidence. The first comes from the brain sciences. The neuropsychologist Isabelle Peretz has reached some telling conclusions based on patterns of behavioral deficits in patients with brain lesions.3 First, musical processing divides into two broad components, rhythm and pitch. Second, musical and linguistic processing share certain deficits but not others. On one hand, rhythmic processing takes place in the same areas of the brain for both language and music. On the other, lexical retrieval and syntax in language and pitch processing in music are activated in different areas of the brain. Contour recognition appears to take place in a different brain area than interval recognition and to precede it in processing, so that tone-deaf people are usually able to speak with normal contour but contour-deaf people are necessarily tone-deaf. These conclusions are supported in part by new imaging techniques that track local brain activation.
The second line of evidence comes from theoretical accounts of linguistic and musical cognitive capacities. The linguistic capacity has three broad components: semantics, syntax, and phonology. Music does not, except peripherally, have semantics in a linguistic sense, which includes lexical items as well as concepts such as reference and entailment. Nor does music have a specifically linguistic syntax, which includes parts of speech, labeled phrase structures, negation, anaphora, and so forth. Rather, the linguistic component that most resembles music is phonology, which, like music, concerns the organization of sound in time. The sounds of sentences break up into units of phrases and words; these units decompose into patterns of stressed and unstressed sounds and of long and short sounds, and they form rising and falling contours. All of these phonological features have musical counterparts.
In a recent article I develop these parallels through a treatment of the sounds of a short poem, “Nothing Gold Can Stay” by Robert Frost, entirely as if they were musical sounds, ignoring their meaning and syntax.4 The sounds of the poem are put through the grouping, metrical, and reductional components of Lerdahl & Jackendoff 1983 (hereafter GTTM), and through a newly devised method for the derivation of contour. The analytic procedure relies on aspects of generative phonological theory, specifically the prosodic hierarchy, stress theory, and contour theory.5
Briefly, the prosodic hierarchy describes the grouping of speech sounds into the levels of the syllable, word or clitic group, phonological phrase, intonational phrase, and utterance. The stress theory uses a notation similar to GTTM’s metrical grid and represents hierarchical patterns of syllabic stress. Stresses are assigned cyclically over the prosodic groupings. After these structures are established, the model assigns metrical structure by finding the optimal match between a permissible metrical grid and the stress pattern, essentially as in the musical case. Syllables are placed not only to match stress and grid but also to maximize, through relative distances between attack points, the perceptual projection of the constituents of the prosodic hierarchy. In this way long and short durations are assigned to syllables.
It may be objected that language and even metered poetry are not spoken with periodicity between metrical accents. However, limericks and many short verses are recited with great metrical regularity,6 and music is never played by human performers with complete isochrony.7 The difference is one of degree. Periodic meter is an idealized mental construct for both music and poetry.
The derivation of contour follows largely from the stress grid, since the perception of relative stress is primarily a result of relative pitch height, not of intensity, as one might suppose.8 Following intonational theory and data,9 which establish focal pitches usually near the onset of syllables even though pitch height continuously modulates, the model posits four levels of tone height, with glides assumed between levels. In other languages the treatment of pitch height might vary. Within the four-level framework, pitch height is assigned via the stress grid from global to local levels, guided by a few paradigmatic shapes.10 The addition of contour to the metrical and durational assignments yields the normative realization of the poem in musical notation shown in Example 1. Contained within this seemingly transparent notation are the structures of the prosodic hierarchy, phonological stress, the metrical grid, duration, and pitch height.11
Rousseau 1966, 50. Peretz 1993 and Patel & Peretz 1997.
Lerdahl 2001a. For prosodic hierarchy, see Hayes 1989; for stress theory, Liberman & Prince 1977; and for contour theory, Pierrehumbert 1980 and Ladd 1996.
Oehrle 1989. Gabrielsson 1999. Handel 1989. Reviewed in Ladd 1996. This method bears comparison to the pitch-contour tradition in music theory, in particular the contour reduction algorithm in Morris 1993. The phonologist William Idsardi recently apprised me of Frost’s reading of this poem, recorded in Paschen and Mosby 2001. Frost’s rendition is extremely close to that represented in Example 1.
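The matching of a metrical grid to a stress pattern, described above, can be illustrated with a small sketch. It is only a toy under stated assumptions: the candidate grids (simple duple and triple periods), the scoring rule (mean stress on strong beats minus mean stress on weak beats), the function names, and the stress values are all invented for illustration and are not the procedure of GTTM or of the analysis of the Frost poem.

```python
# Hypothetical sketch: pick the metrical grid that best fits a stress pattern.
# The grid shapes and the scoring rule are assumptions made for illustration;
# they are not the procedure of GTTM or of the analysis discussed above.

def strong_positions(period, offset, length):
    """Indices of syllables that fall on a strong beat of a periodic grid."""
    return [i for i in range(length) if (i - offset) % period == 0]

def grid_score(stresses, period, offset):
    """Mean stress on strong beats minus mean stress on weak beats."""
    strong = set(strong_positions(period, offset, len(stresses)))
    on = [s for i, s in enumerate(stresses) if i in strong]
    off = [s for i, s in enumerate(stresses) if i not in strong]
    if not on or not off:
        return None  # grid does not split the syllables into two classes
    return sum(on) / len(on) - sum(off) / len(off)

def best_grid(stresses, periods=(2, 3)):
    """Return (score, period, offset) for the best-fitting candidate grid."""
    best = None
    for period in periods:
        for offset in range(period):
            score = grid_score(stresses, period, offset)
            if score is not None and (best is None or score > best[0]):
                best = (score, period, offset)
    return best

# Invented stress levels (0 = unstressed, 1 = stressed): weak-strong alternation.
print(best_grid([0, 1, 0, 1, 0, 1]))
```

For the invented weak-strong pattern, the sketch selects a duple grid whose strong beats fall on the second syllable of each pair (period 2, offset 1). A fuller model would also weigh the prosodic groupings and the spacing of attack points, which this toy ignores.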
Example 1. Metrical, durational, and contour realization of “Nothing Gold Can Stay,” by Robert Frost. (Musical notation with the poem’s syllables as text underlay.)
From a musical perspective there is another step to take. Traditional poetic analysis treats verbal recurrences as simple rhyming patterns: aabb, abab, and so forth. Music theory, in contrast, has a highly developed approach to recurrence in the form of prolongational structure. As in my theory of timbral prolongations,12 prosodic prolongational structure is derived from global to local time-span reductional levels of syllabic prominence. Example 2 illustrates this for the first couplet. Noteheads signify relative structural importance. In an adaptation of GTTM’s threefold classification, dashed slurs represent the strong prolongation of rhyme, dotted slurs the weak prolongation of alliteration or assonance, and solid slurs the progression of nonrepetition. The graph shows not isolated instances of alliteration and rhyme, as in standard poetic analysis, but the richer relationship of partial repetitions nested within rhymes: “green” is to “gold” as “hue” is to “hold.” Note that it is timbral similarity rather than pitch that is connected prolongationally, for “gold” is in the highest pitch category while “hold” is in the lowest. This approach can be extended to the poem as a whole.
Example 2. Prolongational structure of the first couplet of “Nothing Gold Can Stay,” by Robert Frost.
Incidentally, text setting is a rich source of evidence for the interface between music and poetry.13 The present approach can contribute in turn to the study of text setting. It should also be remarked that this treatment extends to the analysis of nonpoetic speech, with the proviso that in ordinary speech there is little regularity in phonological stress and syllabic repetition, so that its metrical and prolongational structures are attenuated.
Lerdahl 1987. Ruwet 1972, Jackendoff 1989, Halle & Lerdahl 1994, and Hayes & Kaun 1996.
To summarize the preceding, music and the sounds of language share more organization than has commonly been recognized. The subcomponents of this organization correspond to the pattern of neuropsychological evidence mentioned earlier. Example 3 gives the hypothesized overall picture: those brain modules that process rhythm, contour, and timbral relationships are the same in music and language, while those that process purely pitch-intervallic structures and purely linguistic syntax and semantics occupy different parts of the brain. The convergence between cognitive theory and neuroscientific evidence calls for further investigation.
Example 3. Hypothesized brain organization of musical and linguistic structures.
Exclusively musical structures: pitches & intervals; scales; harmony & counterpoint; tonality; pitch prolongations; tonal tension & attraction
Common structures: durational patterns; grouping (prosodic hierarchy); stress (contextual salience); metrical grids; contour; timbral prolongations
Exclusively linguistic structures: syntactic categories & relations; word meaning (lexicon); semantic structures (reference, truth conditions . . .); phonological distinctive features (etc.)
The most plausible explanation for this convergence is that music and language share the same evolutionary roots, in the form of pre-musical and pre-linguistic communicative and expressive auditory gestures involving shapes of grouping, stress, duration, contour, and timbre. We still communicate with infants and higher mammals in this manner. These elementary shapes appear to lie at the basis of expressive utterance in language and of musical expression. With evolution came specialization. Music and language diverged in their most characteristic features: pitch organization in music, and word and sentence meaning in language. Poetry straddles this evolutionary divergence by projecting, through the addition to ordinary speech of metrical and timbral patterning, its common heritage with music.
Let us now consider a second way in which music relates to the world beyond itself. Drawing on philosophical and linguistic work by Lakoff & Johnson (1980), Jackendoff (1982), and others, the cognitive scientist Steven Pinker writes:
Location in space is one of the two fundamental metaphors in language . . . The other is force, agency, and causation . . . Many cognitive scientists have concluded from their research that a handful of concepts about places, paths, motions, agency, and causation underlie the literal or figurative meanings of tens of thousands of words and constructions, not only in English but in every other language that has been studied . . . These concepts and relations appear to be the vocabulary and syntax of mentalese, the language of thought.14
I have reached a similar conclusion about music from a different line of reasoning. In the early 1980s, it was established empirically that music listeners of varied training and background make essentially identical judgments about perceived distances of pitches, chords, and regions from a given tonic.15 This striking discovery motivated me both to seek a theoretical explanation for regular patterns in the data and to develop a model that would quantify the qualitatively stated stability conditions in GTTM’s conception of prolongational analysis. This research agenda culminated in Tonal Pitch Space,16 which presents an algebraic model of the perceived distances of pitches, chords, and regions from one another. These distances are mapped onto multidimensional geometries that have precedent in the music-theoretic literature. In addition to providing an explanatory framework for the data, the model is used to trace event locations and paths at multiple prolongational levels, thereby conveying in a specific manner the otherwise vague intuition that listening to a piece of music is like taking a journey. When allied to words, pitch-space paths take on a narrative dimension as well.
Pinker 1997, 354–5.
Krumhansl 1990. Lerdahl 2001b.
The pitch-space theory also enables the prediction of patterns of tension and relaxation as events unfold. Four conditions are needed to make valid predictions. First, there must be a component that derives and represents hierarchical event structure, since tension is judged hierarchically more than sequentially.17 This goal is accomplished by an improved version of GTTM’s prolongational analysis. Second, there must be a calculation of the perceived distance between any two chords, something the model does with great accuracy. Third, there must be a treatment of surface or sensory dissonance. Although this topic has been studied extensively by psychoacousticians, its behavior in musical contexts is complex, and here the theory settles for an approximate implementation. Fourth, there must be a model of melodic and harmonic attractions. The theory succeeds in this goal, subject to computational fine-tuning from experimental evidence that is only beginning to become available.18
Carol Krumhansl and I have undertaken an ongoing empirical study of the predictions of the tension model over a wide range of diatonic and chromatic music. The correlations between predictions and data are generally very high, and they permit detailed and illuminating interpretations about listeners’ responses.19
This conclusion is sustained by empirical data on hierarchical and sequential predictions, as reported in Lerdahl, et al., 2000. See also Larson 2002 and Margulis 2003. A preliminary version of this research appears in Lerdahl and Krumhansl 2003. For a historical review of music theories of tonal motion, tension, and attraction, see Rothfarb 2002.
According to this theoretical and empirical perspective, then, not only the linguistic but also the musical capacity employs space and motion in a constitutive way. This employment is not just cognitive in a disembodied sense but is a cause of the visceral sense of the ebb and flow of musical tension.20 Recall Pinker’s statement: “Location in space is one of the two fundamental metaphors in language. The other is force, agency, and causation.” The theory of tonal attraction brings force into the picture of musical space and motion. Like a spaceship moving among the moons of Jupiter, a melody or chord progression moves in a certain direction but is affected in its velocity and direction by the relative gravitational or attractive force of other pitches and chords. A neighboring ornament may have little effect on its motion, but a tonic has considerable mass and may bring the tonal spaceship to rest.
But what of agency and causation? Pinker refers to a classic experiment by Heider & Simmel (1944), in which they made a cartoon film using three dots that were perceived by subjects as moving not as inanimate objects but as animate agents. Pinker writes:
Agents are recognized by their ability to violate intuitive physics by starting, stopping, swerving, or speeding up without an external nudge, especially when they persistently approach or avoid some other object. The agents are thought to have an internal and renewable source of energy, force, impetus, or oomph, which they use to propel themselves, usually in the service of a goal.21
Similarly, a melody or chord progression does not simply follow the inertial path of least resistance. It would be dull and would quickly come to a stop unless enlivened by motion away from places that pull it toward rest. Such motion works against inertia and seems to be caused by an animate agent. Furthermore, such motion causes an emotional response. Echoing Pinker, the neurologist Antonio Damasio writes:
You can find the basic configurations of emotions in simple organisms, even in unicellular organism . . . You can do the same thing with a simple chip moving about on a computer screen. Some jagged fast movements will appear “angry,” harmonious but explosive jumps will look “joyous,” recoiling motions will look “fearful.” A video that depicts several geometric shapes moving about at different rates and holding varied relationships reliably elicits attributions of emotional state from normal adults and even children. The reason why you can anthropomorphize the chip or an animal so effectively is simple: emotion, as the word indicates, is about movement, about externalized behavior, about certain orchestrations of reactions to a given cause, within a given environment.22
See Brower 2000. Pinker 1997, 322.
Here is a central source of musical emotion. We internalize the motion of pitches and chords in reaction to contextual tonal forces in musical space. We attribute agency and causation to musical motions that violate intuitive physics and inevitability to motions that yield to musical inertia and force. The character of the musical motions, which is shaped also by their temporal realization, mirrors equivalent motions in the “real” physical world. We map specific musical motions onto specific emotional qualities, again in reflection of real-world equivalences. This argument about musical space, motion, force, agency, and emotion rejoins the earlier discussion about the origin of “musilanguage” in expressive auditory gestures. But language lacks pitch structure except in the most rudimentary sense. Perhaps music is the quintessentially emotional art because its elaborate pitch structures so richly and precisely reflect motion, force, and agency, and therefore emotions, in the outer world.
Damasio 1999, 70.
References
Brower, C. 2000. “A Cognitive Theory of Musical Meaning.” Journal of Music Theory 44: 323–79.
Brown, S. 2001. “The ‘Musilanguage’ Model of Music.” In The Origins of Music. Edited by N. L. Wallin, B. Merker, and S. Brown. Cambridge: MIT Press.
Damasio, A. 1999. The Feeling of What Happens: Body and Emotion in the Making of Consciousness. New York: Harcourt Brace.
Gabrielsson, A. 1999. “The Performance of Music.” In The Psychology of Music. Edited by D. Deutsch. Second edition. New York: Academic.
Halle, J., and F. Lerdahl. 1994. “A Generative Textsetting Model.” Current Musicology 55: 3–26.
Handel, S. 1989. Listening. Cambridge: MIT Press.
Hayes, B. 1989. “The Prosodic Hierarchy in Poetry.” In Phonetics and Phonology: Rhythm and Meter. Edited by P. Kiparsky and G. Youmans. New York: Academic.
Hayes, B., and A. Kaun. 1996. “The Role of Phonological Phrasing in Sung and Chanted Verse.” The Linguistic Review 13: 243–303.
Heider, F., and M. Simmel. 1944. “An Experimental Study of Apparent Behavior.” American Journal of Psychology 57: 243–59.
Jackendoff, R. 1982. Semantics and Cognition. Cambridge: MIT Press.
Jackendoff, R. 1989. “Rhythmic Structures in Music and Language.” In Phonetics and Phonology: Rhythm and Meter. Edited by P. Kiparsky and G. Youmans. New York: Academic.
Krumhansl, C. L. 1990. Cognitive Foundations of Musical Pitch. New York: Oxford University Press.
Ladd, D. R. 1996. Intonational Phonology. Cambridge: Cambridge University Press.
Lakoff, G., and M. Johnson. 1980. Metaphors We Live By. Chicago: University of Chicago Press.
Larson, S. 2002. “Musical Forces, Melodic Expectation, and Jazz Melody.” Music Perception 19: 351–85.
Lerdahl, F. 1987. “Timbral Hierarchies.” Contemporary Music Review 1: 135–60.
Lerdahl, F. 2001a. “The Sounds of Poetry Viewed as Music.” In The Biological Foundations of Music. Edited by R. J. Zatorre and I. Peretz. Annals of the New York Academy of Sciences 930: 337–54.
Lerdahl, F. 2001b. Tonal Pitch Space. New York: Oxford University Press.
Lerdahl, F., and R. Jackendoff. 1983. A Generative Theory of Tonal Music. Cambridge: MIT Press.
Lerdahl, F., and C. L. Krumhansl. 2003. “La teoría de la tensión tonal y sus consecuencias para la investigación musical.” In Los últimos diez años en la investigación musical. Edited by J. Martín Galán & C. Villar-Taboada. Valladolid: Servicio de Publicaciones de la Universidad de Valladolid.
Lerdahl, F., C. L. Krumhansl, J. Fineberg, and E. Hannon. 2000. “Modeling Tonal Tension and Attraction.” Paper given at Toronto 2000: Musical Intersections.
Liberman, M., and A. Prince. 1977. “On Stress and Linguistic Rhythm.” Linguistic Inquiry 8: 249–336.
Margulis, E. H. 2003. “Melodic Expectation: A Discussion and Model.” Ph.D. dissertation, Columbia University.
Morris, R. D. 1993. “New Directions in the Theory and Analysis of Musical Contour.” Music Theory Spectrum 15: 205–28.
Oehrle, R. T. 1989. “Temporal Structures in Verse Design.” In Phonetics and Phonology: Rhythm and Meter. Edited by P. Kiparsky and G. Youmans. New York: Academic.
Paschen, E., and R. P. Mosby, eds. 2001. Poetry Speaks. Naperville, IL: Sourcebooks.
Patel, A. D., and I. Peretz. 1997. “Is Music Autonomous from Language? A Neuropsychological Appraisal.” In Perception and Cognition of Music. Edited by I. Deliège and J. A. Sloboda. Hove, UK: Psychology Press.
Peretz, I. 1993. “Auditory Agnosia: A Functional Analysis.” In Thinking in Sound: The Cognitive Psychology of Human Audition. Edited by S. McAdams and E. Bigand. Oxford: Oxford University Press.
Pierrehumbert, J. 1980. “The Phonology and Phonetics of English Intonation.” Ph.D. dissertation, Massachusetts Institute of Technology.
Pinker, S. 1997. How the Mind Works. New York: Norton.
Rothfarb, L. 2002. “Energetics.” In The Cambridge History of Music Theory. Edited by T. Christensen. Cambridge: Cambridge University Press.
Rousseau, J.-J. 1966. Essai sur l’origine des langues. In On the Origin of Language. Translated by J. H. Moran and A. Gode. Chicago: University of Chicago Press, 1966.
Ruwet, N. 1972. Langage, musique, poésie. Paris: Seuil.
Wallin, N. L., B. Merker, and S. Brown, eds. 2001. The Origins of Music. Cambridge: MIT Press.