Relational Information in Memory for Music:

The Interaction of Melody, Rhythm, Text and Instrument

Kate Stevens†, J. Devin McAuley††, & Michael S. Humphreys††

† Department of Psychology, University of Western Sydney, Macarthur (kj.stevens@uws.edu.au)

†† School of Psychology, The University of Queensland

Abstract

A controversial issue in the music cognition literature is whether the melody and lyrics of a song form an integrated representation in memory. A related issue concerns the nature of the representation of melody and rhythm (i.e., is the rhythm of a familiar melody stored separately from the pattern of pitches that make up that melody?). Research examining recognition memory for relational information has led to seemingly disparate conclusions and, upon close examination, one realizes that the opposing claims of separate versus integrative representation of musical components are often based on similar data. This confusion in the literature partially stems from the lack of an adequate theory to interpret the data. In this paper, we demonstrate that music recognition data can be usefully re-interpreted within a theoretical framework proposed by Humphreys (1976) to assess context effects in verbal episodic memory. Using this framework, we discover that many of the music recognition studies have omitted critical experimental conditions which are necessary to distinguish between separate and integrative representation. Moreover, it becomes apparent that there is a naive use of the terms separate and integrative that pervades the literature and hinders clear interpretation of the data. As an alternative, we propose an associative model of melody recognition which distinguishes between composite, conjunctive, and integrative representation of musical components.

Introduction

During the last 25 years there has been increasing interest in the study of music from a psychological perspective (Aiello & Sloboda, 1994; Deliège & Sloboda, 1996; Krumhansl, 1991; Tighe & Dowling, 1993). There are a number of reasons for this, including the hierarchical and definable structure of Western tonal music, our expansive knowledge of the conventions and rules of musical composition, the lack of ambiguity in music relative to linguistic stimuli, the importance of expectancy in temporal patterns in general and music in particular, and the relative ease of construction and manipulation of musical patterns. In short, music is an ideal stimulus to use in studies of memory for complex, novel, hierarchical, structured, temporal patterns. Yet there have been relatively few studies of musical memory. This is particularly evident when one surveys the vast field of human memory research and discovers mainly studies of word recognition and recall. Most notably, those studies that have examined musical memory have rarely drawn on contemporary models of human memory to help explain the processes which mediate recognition or recall of musical events (e.g. Cuddy, Cohen & Miller, 1979; Dowling, 1978; Wallace, 1994). Recent work examining relational properties of musical memory provides an important bridge between the music and non-music areas of human memory research.

Relational information in music recognition can take a variety of forms including the relationship in which text and a pitch sequence or melody stand to one another. For example, Crowder, Serafine & Repp (1990) have demonstrated the importance of melody when attempting to recognize (or recall) the lyrics of a song and, inversely, the importance of lyrics, even nonsense lyrics, when recognizing a melody. In a similar way, the relation between a melody and the name or label associated with that melody can be manipulated and assessed; for example, the association between the opening melodic theme of a particular musical composition and knowledge of the label, Trumpet Voluntary. The instrument on which a melody or theme is played can provide yet another instance of relational and contextual information. Finally, the rhythm through which a melody is expressed or, indeed, the melody through which a rhythm is expressed, are also examples of relational information (e.g. Handel, 1989; Jones, 1993; Longuet-Higgins, 1976). From this perspective, musical components include pitch, rhythm, text as lyric, text as label, and instrument timbre. The control and flexibility afforded by music composed according to Western tonal conventions permits the combination and re-combination of the various components and, subsequently, the representation of musical components in auditory memory can be examined.

How to Examine Relational Information in Human Memory

When we recognize a familiar melody there is always some change in the context in which that melody is remembered compared with the context in which it was first heard. For example, Twinkle, Twinkle, Little Star can be sung, played on a piano, listened to on the radio, and so on. Similarly, it can be experienced outdoors, in a car, or in a recital hall. An obvious question, with respect to the aims of the present paper, concerns our ability to remember contextual (relational) information that is episodic. We can ask the question: how does relational information influence the ability to recognize and make episodic judgments about familiar melodies? Is it easier to recognize the melody of Twinkle, Twinkle, Little Star when presented with the correct lyrics than when presented with different lyrics? Is it easier to recognize that we heard it yesterday when it is played on the same instrument as it was played on yesterday?

Laboratory studies which have addressed the precise role of relational information in recognition have used mainly verbal materials (e.g., Humphreys, 1976; Light & Carter-Sobell, 1970; Tulving & Thomson, 1971). We can use these studies to assess the role of relational information in music. Specifically, Humphreys (1976) used an episodic recognition paradigm to gauge the extent to which memory representations distinguish the paired presentation of two words (relational information) from the separate occurrence of each word (item information). The hypothesis that subjects use relational information about pairs of words in making yes/no recognition-memory decisions about single words is called the relational information hypothesis.

To test the relational hypothesis, subjects are presented with a list of to-be-remembered target words paired with some contextual material (usually other words). Memory for the to-be-remembered word can then be tested with some of the material which comprised the study trial context, with new material not previously presented to the participant, or without any material (Humphreys, 1976, p. 221). In a seminal study, Light and Carter-Sobell (1970) paired adjectives with target nouns, as in "traffic jam", and then embedded the adjective-noun combination in a sentence. At test, recognition of the target nouns was tested either in the same context as at study ("traffic jam") or in a new context ("strawberry jam"). Light and Carter-Sobell (1970) found that recognition of the target nouns was best when they were paired with the same adjective as at study. Humphreys (1976) observed that the context effect reported by Light and Carter-Sobell did not seem to depend on the meaning of the new adjective-noun combination (as would be predicted by a differential-encoding hypothesis), but instead was due to pairing the target noun with a new adjective (independent of the resulting meaning). Thus, the context effect in word recognition seems to be due to subjects' use of relational information.

It is possible to classify and organize music recognition studies according to the same paradigm that Humphreys has used to test the relational hypothesis. Moreover, the melody recognition data can be re-interpreted to address the controversial issue of separate versus integrated representation of melody and lyrics (and melody and rhythm). The nature of instructions imparted to participants in the study phase of music recognition experiments is one dimension along which the studies can be organized. There are at least three different types of study instructions. The first possibility is that participants listen to each sequence and judge whether the components of the pattern belong or fit together; for example, is Label a an appropriate title for Melody A? This emphasises the existence of components but requires an assessment of the way they relate to one another (e.g. Stevens & Humphreys, 1994, Expt 2). The second possibility is that participants attend to the whole pattern (Lyric a + Melody A) and rate some quality of the sequence such as familiarity (e.g. Crowder et al., 1990; Serafine et al., 1984, Expt 1; Stevens & Humphreys, 1994, Expt 1). A third possibility is that participants ignore one component or item, such as rhythm, and attend to the other component (e.g. Jones & Ralston, 1991; Serafine et al., 1984, Expt 2).

At test, there are at least two different kinds of retrieval tasks which have been used to provide information about the nature of the memory representation for the components. Figure 1 summarises the way in which musical components have been presented in one form at study and re-combined at test. For example, Label a can be presented originally with Melody A and, in a subsequent test phase, re-combined with Melody B. Text can refer to a label or lyric; alternatively the first component could be rhythm or instrument.

Figure 1: Components such as rhythm and melody paired originally in a study phase can be recombined at test to examine the effect of relational information and context on episodic memory for auditory sequences.

The first retrieval task taps relational information and involves pair recognition: participants judge whether patterns are intact (Label a as presented with Melody A) or rearranged (Label a re-combined with Melody B). Where the response is "rearranged", participants make a further judgment about the pair and identify which components are old and which are new (not presented at study). A second retrieval task involves simply making independent old/new item judgments which are then used to generate pair judgments.

In order to assess the issue of separate versus integrated representation of musical components, studies of melody recognition have essentially focussed on three questions. First, can listeners make accurate intact pair judgments? Second, do listeners show an intact pair advantage in melody recognition? Finally, what is the effect of the "oldness" of one component on the recognition of the other? It has been argued by Serafine and colleagues (Crowder et al., 1990; Serafine et al., 1984; 1986) that the observed intact pair advantage in melody recognition and listeners' ability to make intact pair judgments provides evidence for the integrated representation of melody and lyric. Similar arguments have been made for the integrated representation of melody and rhythm (Jones & Ralston, 1991). However, these conclusions are subject to re-interpretation in light of precise definitions of what is meant by, and required for, integrated representation of musical components.

An Associative Model of Melody Recognition

The issue of separate versus integrative representation can be re-examined using an associative model of melody recognition based on the class of global matching models of verbal memory (Gillund & Shiffrin, 1984; Humphreys, Bain, & Pike, 1989; Raaijmakers & Shiffrin, 1981). In the proposed model, it is assumed that long-term memory for music contains information about specific musical components (melodies, rhythms, instrumental timbres, text as lyric, and text as label) as well as information about associations between components (pair information). For simplicity, we assume that componential and associative information are distinct in memory and can be represented as local codes. That is, there are distinct memory units for musical components and learned associations between components. Thus, during the study phase of a music recognition experiment, memory units are formed for components (e.g., a, b, c, A, B, C) and possibly for associations between pairs of components (aA, bB, and cC). At test, subjects are presented with a combination of memory cues, such as a melody paired with a label (xX), and they have to make a recognition judgment based on those cues. In the model, it is assumed that old/new decisions are based on the integration of activity over all of the memory units in response to the set of cues used to retrieve information from memory. The integrated activity over the memory units represents a global familiarity score. If the familiarity score is above a criterion value (or threshold), the model responds "old", otherwise it responds "new". Familiarity, and consequently whether the model responds old or new, is determined by the strength of the connections between the cues and the memory units.
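The retrieval assumptions above can be made concrete in a minimal sketch. The unit strengths, the criterion value, and the function names are illustrative assumptions rather than parameters reported here; the code simply sums the strengths of the memory units matched by the test cues and compares that sum with a criterion.

```python
# Minimal sketch of the global familiarity computation described above.
# Unit strengths (1.0) and the criterion (1.5) are illustrative assumptions.

def familiarity(cues, memory):
    """Sum the strengths of all memory units activated by the test cues.

    cues   -- tuple (component, melody), e.g. ("a", "A")
    memory -- dict mapping unit names to strengths; item units use a single
              name ("a", "A"), association units use the joined pair ("aA")
    """
    component, melody = cues
    activated = [component, melody, component + melody]
    return sum(memory.get(unit, 0.0) for unit in activated)


def decide(cues, memory, criterion=1.5):
    """Respond "old" if global familiarity exceeds the criterion."""
    return "old" if familiarity(cues, memory) > criterion else "new"


# Study phase: the pairs aA, bB, and cC are presented.
memory = {}
for component, melody in [("a", "A"), ("b", "B"), ("c", "C")]:
    memory[component] = 1.0            # unit for the label, lyric, or rhythm
    memory[melody] = 1.0               # unit for the melody
    memory[component + melody] = 1.0   # unit for the association (pair)

print(decide(("a", "A"), memory))  # intact pair
print(decide(("x", "X"), memory))  # entirely new pair
```

With local codes of this kind, removing the association units or the item units from `memory` is all that is needed to move between the representations distinguished in the sections that follow.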

Figure 2: Schematic diagram of a simple connectionist network depicting three types of memory representations, composite, conjunctive, and integrative.

Composite Representation

The five test conditions in the melody recognition studies (intact, rearranged, old component/new melody, new component/old melody, and new component/new melody) can be regarded as different combinations of cues used to retrieve information from memory. One possibility is that relational information is only represented compositely. That is, memory units are formed for the separate components (a, b, c, d, A, B, C, D) but no item-to-item associations (aA, bB, cC, and dD) are formed. As a result, the familiarity score for the intact (aA) and rearranged (aB) pairs is the same. Since the components are stored independently, the presence of an old label increases the familiarity score regardless of whether the melody is old or new. Thus, the effect of one component on the recognition of the other is additive and results in a bias shift. If relational information is represented compositely, then intact pair judgments should be poor, there should be no intact pair advantage in melody recognition, and the "oldness" of one component should only bias old/new judgments of the other component.

Recent experiments provide some empirical support for composite representation (see Stevens & Humphreys, Experiment 2, Table 1). Tables 1 and 2 provide a summary of data from music recognition studies that have examined relational information. Table 1 shows studies reporting the mean proportion of intact pair judgments in associative recognition for the five test conditions: aA (intact), aB (rearranged), aX (old Component/new Melody), xA (new Component/old Melody), and xX (new Component/new Melody). Table 2 consists of data reporting the mean proportion of "old melody" responses for the same five test conditions. Bold values in both tables refer to hit rates.

Table 1: Proportion of "intact" responses

Columns:
(1) Serafine et al. (1984), Expt 1; study instruction: "Listen, later tested"
(2) Stevens & Humphreys (1994), Expt 1; study instruction: "Rate familiarity of melody"
(3) Stevens & Humphreys (1994), Expt 2; study instruction: "Rate how well label and melody match"

Components                          (1)    (2)    (3)
Intact (old pair)                   .85    .67    .60
Rearranged (old pair recombined)    .39    .43    .54
Old component / New melody          .25    .38    .39
New component / Old melody          .06    .12    .02
New component / New melody          .07    .04    .02

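Table 2: Proportion of "old melody" responses

Columns:
(1) Serafine et al. (1984), Expt 2; study instruction: "Listen to melody"
(2) Jones & Ralston (1991), Expt 1; study instruction: "Identify melody, ignore rhythm"
(3) Jones & Ralston (1991), Expt 2; study instruction: "Identify melody, ignore rhythm"
(4) Stevens & Humphreys (1994), Expt 2; study instruction: "Rate how well label and melody match"
(5) Samson & Zatorre (1993), Controls; study instruction: "Listen to melody"
(6) Samson & Zatorre (1993), LT Lesion

Components     (1)    (2)        (3)        (4)    (5)    (6)
Intact         .84    .86        X          .91    .86    .80
Rearranged     .64    .58        X          .91    .73    .66
Old/New mel    .57    .11-.32    X          .76    .64    .65
New/Old mel    .23    X          .63-.81    .79    .65    .53
New/New mel    .20    X          .18-.41    .47    .35    .55

X: test condition not included in that experiment.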

In Stevens & Humphreys (1994) Experiment 2, short, tonal, novel melodies were paired with labels such as Babbling Brook and participants rated how well each label fit its melody; this instruction forced a relational judgment. Intact, rearranged and new melody/label pairings were then presented in a recognition test involving both pair and item judgments. The results are reported under the heading Stevens & Humphreys (1994), Expt 2, in Table 1. Consistent with composite representation, intact pair recognition was poor: the proportion of "intact" judgments was only 0.60 for intact pairs compared with 0.54 for rearranged pairs. Similarly, intact pairs did not seem to facilitate the recognition of old melodies (see Table 2). For both the intact and rearranged pairs, the hit rate for old (studied) melodies was 0.91. Overall, the effect of an old label on melody judgments was to increase subjects' tendency to respond that the melody was old (a bias shift). In general, subjects had a strong bias to respond "old", and this bias was strengthened by the presence of an old label: the false alarm rate to new melodies was 0.76 in the context of an old label, compared with 0.47 in the context of a new label.
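One conventional way to separate a bias shift from a change in discrimination is the equal-variance signal detection calculation sketched below. It is not an analysis reported in this paper; it is applied here, as an assumption, to the group-mean proportions for Stevens & Humphreys Experiment 2 in Table 2 (hit rates of .91 and .79 for old melodies presented with old and new labels, and false alarm rates of .76 and .47 for new melodies), so the resulting indices are only rough.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Equal-variance signal detection indices from a hit and false alarm rate.

    Returns (d_prime, criterion). A more negative criterion indicates a more
    liberal tendency to respond "old".
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion


# Melody judgments, Stevens & Humphreys Expt 2 (Table 2, group means).
d_old, c_old = dprime_and_criterion(hit_rate=0.91, false_alarm_rate=0.76)
d_new, c_new = dprime_and_criterion(hit_rate=0.79, false_alarm_rate=0.47)

print(f"old label: d' = {d_old:.2f}, c = {c_old:.2f}")
print(f"new label: d' = {d_new:.2f}, c = {c_new:.2f}")
```

Under these assumptions the criterion estimate is lower (more liberal) when the label is old, which is the direction expected of a bias shift.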

Conjunctive Representation

A second possibility is that a pair of components forms an association or binding in memory. Item information still exists, but memory retrieval is assisted by relational information such as the pairing of Lyric a with Melody A. In the associative model, the presence of item-to-item associations means that, in response to an intact pattern (aA), the familiarity score reflects the activation of units a, A, as well as the binding aA. In contrast, the familiarity score for a rearranged pair (aB) includes only the activity of the units a and B, since no association has been formed between a and B.

If relational information is represented conjunctively, then subjects should be able to distinguish intact and rearranged pairs and there should be an intact advantage in melody recognition. A good example of an intact advantage is provided by Serafine, Crowder & Repp (1984), Experiments 1 and 2. The data from Experiment 1 (see Table 1) show a clear ability to distinguish old songs (intact lyric/melody pairs) from new songs (rearranged), as evidenced by the 0.85 hit rate for intact songs compared with the 0.39 false alarm rate for rearranged songs. Similarly, the data from Experiment 2 (see Table 2) demonstrate a distinct advantage for recognizing a melody when it is paired with the same lyric as during study. Compare the hit rate of 0.86 for intact pairs with the hit rate of 0.64 for rearranged pairs. As discussed earlier, no such intact advantage was observed in the Stevens & Humphreys study.

Integrative Representation

A third possibility involves the joint (integrated) representation of components, with item information absent or attenuated. The representation formed in memory for pairs of components, thus, is a merged representation, with little or no componential information retained or retrievable. There is the possibility, however, that one component contributes more to the joint representation of the pair than the other. In the most extreme case this corresponds to the formation of memory units for item-to-item associations but no units for the separate items. For example, only a single memory unit (aA) is formed for the pair, Label a and Melody A. We can ask the question: what would constitute evidence for an integrative representation? In terms of the associative model, we would expect first, that listeners would be unable to recognize a melody unless it was paired with the same component as during study. For example, listeners would be unable to recognize Melody A unless it was presented with Lyric a. Therefore, listeners should be able to make intact judgments, demonstrate a strong intact pair advantage in melody recognition, and be able to recognize an old component only in the presence of its companion.
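The contrast between the three representations can be sketched by computing the familiarity of each test condition from the memory units that each representation is assumed to form. The unit strengths of 1.0 and the extreme form of integrative representation (association units only) are simplifying assumptions made for illustration.

```python
# Sketch: predicted familiarity for the five test conditions under composite,
# conjunctive, and integrative representation. Strengths of 1.0 are assumed.

STUDIED_PAIRS = [("a", "A"), ("b", "B")]


def build_memory(representation):
    """Return the set of memory units formed at study."""
    units = set()
    for component, melody in STUDIED_PAIRS:
        if representation in ("composite", "conjunctive"):
            units.update([component, melody])   # item information
        if representation in ("conjunctive", "integrative"):
            units.add(component + melody)       # pair information
    return units


def familiarity(component, melody, units):
    """Count the memory units matched by the two test cues and their pairing."""
    probes = [component, melody, component + melody]
    return sum(1.0 for probe in probes if probe in units)


TEST_CONDITIONS = {
    "intact (aA)": ("a", "A"),
    "rearranged (aB)": ("a", "B"),
    "old component / new melody (aX)": ("a", "X"),
    "new component / old melody (xA)": ("x", "A"),
    "new component / new melody (xX)": ("x", "X"),
}

predictions = {}
for representation in ("composite", "conjunctive", "integrative"):
    units = build_memory(representation)
    predictions[representation] = {
        name: familiarity(component, melody, units)
        for name, (component, melody) in TEST_CONDITIONS.items()
    }
    print(representation, predictions[representation])
```

In this sketch, composite representation gives intact and rearranged pairs the same familiarity, conjunctive representation adds an intact advantage while old items remain familiar on their own, and the extreme integrative case leaves only the intact pair familiar.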

Serafine et al. (1986) and Crowder et al. (1990) conclude that the results from their studies provide strong evidence that melody and lyric have an integrated representation. In terms of the proposed model, however, these data are most consistent with conjunctive representation of relational information, realized as simple item-to-item associations. The poor recognition of an old melody when paired with a new lyric (a hit rate of .23 compared with a false alarm rate of .20) does suggest that item information about the melodies is absent. However, rather than being due to integrative representation, it is possible that this finding reflects the low familiarity of the melodies, as each was presented only once during study. This interpretation is supported by a study by Samson & Zatorre (1993) who, using the same stimuli as Serafine and colleagues, did not find comparably poor recognition of an old melody when paired with a new lyric (hit rate of .65 and false alarm rate of .35).

Discussion

The investigation of whether musical components are integrated is problematic. The study by Samson and Zatorre provides some relevant insights. They compared melody recognition in normal controls with that of patients with unilateral left- and right-temporal lobe lesions. The data for normal controls and the left-temporal lobe (LT) subjects are shown in Table 2. Both the LT and control subjects exhibited a clear intact advantage in melody recognition. The most striking difference concerned the recognition of the old melodies in the context of new lyrics. The LT subjects' recognition performance was near chance in the presence of new lyrics (a hit rate of 0.53 compared with a false alarm rate of 0.55), yet they were able to recognize old melodies in the context of the same lyric as at study. In contrast, the controls were able to discriminate old and new melodies regardless of whether the lyric was "old" or "new". Samson & Zatorre interpret these data as evidence for separate encoding of melody and lyric, since the LT group can selectively lose the ability to discriminate melodies. However, Serafine and colleagues would interpret the intact advantage in melody recognition observed for both the normal controls and the LT group as evidence for integrative representation of melody and lyric. Using the associative model, these conflicting claims can be resolved. The data are most consistent with a conjunctive representation, and the result from the LT group suggests diminution or loss of memory units coding melody components without a loss of memory units coding item-to-item associations.

Studies of melody recognition manipulating rhythmic components have produced similar findings to those investigating the integration of melody and text. Specifically, Jones & Ralston (1991) Experiment 1 required participants during the study phase to identify one of three possible target melodies whilst trying to "ignore" the rhythm. In the test phase, participants judged whether melodies were old or new. Only three test conditions of the possible five were used, namely intact, recombined (target melody recombined with another rhythm) and old rhythm-new melody. New melodies were either the same or different in melodic contour to the targets. The results of this experiment are shown in Table 2. Consistent with conjunctive representation of melody and rhythm, Jones & Ralston found an intact advantage in the recognition of the target melodies. The new melodies that were more similar to targets resulted in an increased false alarm rate. In a second experiment, Jones & Ralston examined melody recognition for the two remaining test conditions: new rhythm-old melody, new rhythm-new melody. New rhythms at test were either similar or dissimilar to the rhythms used at study. Similar rhythms increased both hit and false alarm rates in melody judgments.

The associative model illustrates the possible confounding effects of component similarity. Consider the situation where an old component-new melody pair (aX) contains a new melody which is very similar to the originally-paired Melody A. In the associative model, the similar melody would activate both the memory unit for A and the memory unit for the association between Melody A and Rhythm a. Another possibility is that the new Melody X is similar to another studied melody, B. In this case, only the memory units for Rhythm a and the similar Melody B would become activated, since no association was formed between Rhythm a and Melody B at study. This highlights a difference between composite and conjunctive representations of relational information. Poor discrimination of new melodies may not be due to item information being absent. Rather, new components may be similar to original components and participants may be responding to these pairs as if they are old, resulting in an increased false alarm rate.
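The two similarity cases can be sketched with graded activation. The similarity value of 0.8 and the multiplicative rule for activating association units are illustrative assumptions, chosen only so that an association unit is activated to the extent that both of its components are matched.

```python
# Sketch of similarity-based activation under conjunctive representation.
# The similarity values and the multiplicative binding rule are assumptions.

STUDIED_PAIRS = [("a", "A"), ("b", "B")]  # (rhythm, melody) pairs at study


def familiarity(rhythm, melody, similarity):
    """Global familiarity of a test pair with graded unit activation.

    similarity -- dict mapping (test_item, studied_item) to a value in [0, 1];
                  identical items count as 1.0 and unlisted pairs as 0.0
    """
    def sim(test_item, studied_item):
        if test_item == studied_item:
            return 1.0
        return similarity.get((test_item, studied_item), 0.0)

    total = 0.0
    for studied_rhythm, studied_melody in STUDIED_PAIRS:
        rhythm_match = sim(rhythm, studied_rhythm)
        melody_match = sim(melody, studied_melody)
        total += rhythm_match                  # rhythm unit
        total += melody_match                  # melody unit
        total += rhythm_match * melody_match   # association unit
    return total


# Test pair aX: old Rhythm a with a new Melody X.
unrelated = familiarity("a", "X", {})                     # X unlike any studied melody
similar_to_A = familiarity("a", "X", {("X", "A"): 0.8})   # X resembles Melody A
similar_to_B = familiarity("a", "X", {("X", "B"): 0.8})   # X resembles Melody B

print(unrelated, similar_to_A, similar_to_B)
```

In this sketch a new melody resembling the originally-paired Melody A raises familiarity more than one resembling Melody B, because only the former also activates an association unit.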

Conclusions

The results from a number of melody recognition studies have been organised and evaluated in terms of an associative model which distinguishes composite, conjunctive, and integrative representation of musical components in order to assess the use of relational information. An outstanding challenge for research on melody recognition is to design studies which test directly the integrative coding of auditory information. Although evidence has been cited for the integration in memory of musical components, we have shown that current data are consistent with composite or conjunctive representation.

Author Notes

The research was supported by an ARC Postdoctoral Research Fellowship granted to the first author and a University of Queensland Postdoctoral Fellowship awarded to the second author.

References

Aiello, R., & Sloboda, J. A. (Eds.). (1994). Musical perceptions. New York: Oxford University Press.

Crowder, R. G., Serafine, M. L., & Repp, B. (1990). Physical interaction and association by contiguity in memory for the words and melodies of songs. Memory & Cognition, 18, 469-476.

Cuddy, L. L., Cohen, A. J., & Miller, J. (1979). Melody recognition: The experimental application of musical rules. Canadian Journal of Psychology, 33, 148-157.

Deliège, I., & Sloboda, J. (1996). Musical beginnings: Origins and development of musical competence. Oxford: Oxford University Press.

Dowling, W. J. (1978). Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 85, 341-354.

Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1-67.

Handel, S. (1989). Listening: An introduction to the perception of auditory events. Cambridge, Mass.: MIT Press.

Humphreys, M. S. (1976). Relational information and the context effect in recognition memory. Memory & Cognition, 4, 221-232.

Humphreys, M. S., Bain, J. D., & Pike, R. (1989). Different ways to cue a coherent memory system: A theory for episodic, semantic, and procedural tasks. Psychological Review, 96, 208-233.

Jones, M. R. (1993). Dynamics of musical patterns: How do melody and rhythm fit together? In T. J. Tighe & W. J. Dowling (Eds.), Psychology and music: The understanding of melody and rhythm (pp. 67-92). Hillsdale, NJ: Erlbaum.

Jones, M. R., & Ralston, J. T. (1991). Some influences of accent structure on melody recognition. Memory & Cognition, 19, 8-20.

Krumhansl, C. L. (1991). Music psychology: Tonal structures in perception and memory. Annual Review of Psychology, 42, 277-303.

Light, L. L., & Carter-Sobell, L. (1970). Effects of changed semantic context on recognition memory. Journal of Verbal Learning and Verbal Behavior, 9, 1-11.

Longuet-Higgins, H. C. (1976). The perception of melodies. Nature, 263, 646-653.

Raaijmakers, J. G. W., & Shiffrin, R. M. (1981). Search of associative memory. Psychological Review, 88, 93-134.

Samson, S., & Zatorre, R. J. (1993). Recognition memory for text and melody of songs after unilateral temporal lobe lesion: Evidence for dual encoding. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 793-804.

Serafine, M. L., Crowder, R. G., & Repp, B. H. (1984). Integration of melody and text in memory for songs. Cognition, 16, 285-303.

Serafine, M. L., Davidson, J., Crowder, R. G., & Repp, B. H. (1986). On the nature of melody-text integration in memory for songs. Journal of Memory & Language, 25, 123-135.

Steedman, M. J. (1977). The perception of musical rhythm and metre. Perception, 6, 555-569.

Stevens, C. J., & Humphreys, M. S. (1994). Item and relational information with verbal and nonverbal stimuli. Paper presented at the 35th Annual Meeting of the Psychonomic Society, St Louis, Missouri.

Tighe, T. J., & Dowling, W. J. (Eds.). (1993). Psychology and music: The understanding of melody and rhythm. Hillsdale, NJ: Erlbaum.

Tulving, E., & Thomson, D. M. (1971). Retrieval processes in recognition memory: Effects of associative context. Journal of Experimental Psychology, 87, 116-124.

Wallace, W. T. (1994). Memory for music: Effect of melody on recall of text. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1471-1485.