Joseph Tyler attended ETAP 2 (Experimental and Theoretical Approaches to Prosody), held Sept. 23-25 at McGill University in Montreal and organized by Duane Watson, Michael Wagner, and Ted Gibson. He presented two posters. The first, a solo project based on research for his dissertation prospectus, analyzed coherence judgments in written and spoken modalities and assessed prosodic effects on those judgments. The second, co-authored with Jason Kahn and Jennifer Arnold, PhD, of the University of North Carolina (Psychology), looked at the production and comprehension of ambiguous discourses. On the last day he was happy to have time to climb Mount Royal for a view over Montreal and the surrounding area, and to walk along the river and see the old city.
Discourse coherence in spoken vs. written discourse, and prosodic effects on coherence judgments
Joseph Tyler, University of Michigan
This paper explores two questions about the relationship between discourse and prosody. First, do people judge the coherence of a discourse differently when it is presented in written vs. spoken form? Second, does manipulating the prosody of a spoken discourse along the lines of discourse production patterns affect coherence judgments?
In work on the coherence structure of discourse, the data that serve as the basis of analysis have almost exclusively been written. As a result, it is unclear whether theories constructed on the basis of coherence judgments (Asher & Lascarides, 2003; Kehler, 2002) generalize to spoken discourse. To address this question, one study solicited coherence judgments of written discourses in which a discourse marker matched the discourse content, mismatched the content, or was absent (1a-f). The same study was then re-run with the discourses presented in spoken form. Results showed that participant judgments of written and spoken discourses patterned similarly, although spoken discourses received better ratings overall.
To test whether patterns of discourse prosody affect discourse perception, two kinds of discourses were crossed with two prosodic manipulations. One consistent finding of discourse prosody production studies is that speakers pause longer between sentences at stronger boundaries in discourse (den Ouden, Noordman, & Terken, 2009; Hirschberg & Grosz, 1992). To test whether this production pattern affects perception, one set of three-sentence discourses was created with a larger break before the third sentence (2a) and another with a smaller break there (2b). Pause durations between the sentences were then manipulated so that longer pauses either occurred at the stronger breaks in the discourse or did not. The hypothesis was that listeners would prefer discourses where stronger boundaries matched longer pause durations. Results showed that listeners rated discourses with a smaller boundary as more coherent than those with a larger boundary, but there was no effect of the pause manipulation. This study shows that when participants are not made explicitly aware of the pause manipulation and are free to choose any coherence judgment, the pause manipulation does not affect their judgments.
In addition to the above studies, two pilot studies were run in which participants were forced to choose between two discourses that contrasted only in the pause durations between sentences. Participants were told this was the only difference and were asked to choose which one they found more coherent. In the first study, the pause manipulation trended towards significance. In the second study, the pause durations were made more contrastive, and the manipulation had a significant effect. These less natural presentations seem more likely to yield an effect, suggesting participants may not draw on discourse prosody unless they are forced to.
1a. Narration-Match: Jim slept all night. Then, he got up and took a shower.
1b. Narration-NoCoherenceMarker: Jim slept all night. He got up and took a shower.
1c. Narration-Mismatch: Jim slept all night. For example, he got up and took a shower.
1d. Elaboration-Match: There are many delicious cheeses. For example, gouda and cheddar are both tasty.
1e. Elaboration-NoCoherenceMarker: There are many delicious cheeses. Gouda and cheddar are both tasty.
1f. Elaboration-Mismatch: There are many delicious cheeses. Then, gouda and cheddar are both tasty.
2a. Discourse Pop: Stacy spent the morning at the gym. She swam fifty laps in the pool. Then she picked up some groceries at Whole Foods.
2b. Discourse Subordination: Stacy spent the morning at the gym. She swam fifty laps in the pool. She swam them faster than she ever had before.
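The 2 (relation) x 3 (marker condition) design behind examples 1a-1f can be sketched in a few lines of Python. This is only an illustrative reconstruction from the six example items: the names `BASE`, `MARKER`, `build_item`, and `build_design` are hypothetical, and the sketch says nothing about how the actual stimulus set was assembled.

```python
from itertools import product

# Sentence pairs copied from examples 1a-1f (second sentence lowercased
# so a discourse marker can be placed in front of it).
BASE = {
    "Narration": ("Jim slept all night.", "he got up and took a shower."),
    "Elaboration": ("There are many delicious cheeses.",
                    "gouda and cheddar are both tasty."),
}

# Marker that matches each relation, as in 1a and 1d.
MARKER = {"Narration": "Then,", "Elaboration": "For example,"}

CONDITIONS = ("Match", "NoCoherenceMarker", "Mismatch")


def build_item(relation, condition):
    """Return the two-sentence discourse for one cell of the design."""
    first, second = BASE[relation]
    if condition == "NoCoherenceMarker":
        # No marker: the second sentence simply starts with a capital.
        return f"{first} {second[0].upper()}{second[1:]}"
    if condition == "Match":
        marker = MARKER[relation]
    else:
        # Mismatch: borrow the marker that belongs to the other relation.
        other = next(r for r in MARKER if r != relation)
        marker = MARKER[other]
    return f"{first} {marker} {second}"


def build_design():
    """Map condition labels such as 'Narration-Match' to their discourses."""
    return {
        f"{relation}-{condition}": build_item(relation, condition)
        for relation, condition in product(BASE, CONDITIONS)
    }


if __name__ == "__main__":
    for label, text in build_design().items():
        print(f"{label}: {text}")
```

Treating the mismatch condition as "use the other relation's marker" is an inference from 1c and 1f, where the Narration and Elaboration markers are swapped.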
References
Asher, N., & Lascarides, A. (2003). Logics of Conversation. Cambridge, UK: Cambridge University Press.
den Ouden, H., Noordman, L., & Terken, J. (2009). Prosodic realizations of global and local structure and rhetorical relations in read aloud news reports. Speech Communication, 51(2), 116-129.
Hirschberg, J., & Grosz, B. (1992). Intonational Features of Local and Global Discourse Structure. In Proceedings of the Speech and Natural Language Workshop.
Kehler, A. (2002). Coherence, Reference, and the Theory of Grammar. Stanford, California: Center for the Study of Language and Information.
Speakers use prosody to communicate discourse structure, and listeners use that prosody in comprehending discourse structure
Joseph Tyler, University of Michigan; Jason Kahn, University of North Carolina; Jennifer Arnold, University of North Carolina
In response to the question "What did you do last weekend?", one could say, "I went into town. I stopped by the post office. And I bought some socks at the mall." This discourse is ambiguous, in that the post office and mall could be in town or somewhere else. Two experiments tested whether (1) speakers use prosody to indicate which interpretation they intend and (2) listeners use the available prosodic information to inform judgments about which interpretation was intended.
At the sentential level, research has demonstrated that under certain conditions, speakers use prosody to disambiguate syntactic structure (e.g. Snedeker & Trueswell, 2003), and listeners use prosodic information to create an appropriate syntactic parse (Schafer et al., 2000). Discourse has structure as well, and theoretical work (e.g. Asher & Lascarides, 2003) tends to model that structure by segmenting it into elementary units, taxonomizing the relations that hold between those units, and constructing a hierarchy. When the post office and mall are in town in the discourse above, sentences 2 and 3 are subordinated to sentence 1; when they are outside of town, the sentences are coordinated. If speakers represent discourse relations this way during production, they may use prosody to mark the difference.
In the production experiment, 12 speakers saw an ambiguous three-sentence discourse and two pictures, one corresponding to each of its meanings (subordinate vs. coordinate). First, they paraphrased each meaning verbally. Then, they read the discourse text verbatim, attempting to convey one of the meanings. Participants saw all 14 discourses twice, once in each relation condition (order randomized). This resulted in 28 utterances per participant in a within-subjects design, holding all lexical and syntactic information constant. A multilevel model revealed prosodic differences between coordinated and subordinated structures.
We extracted the duration of the pauses between the sentences, as well as the duration of "And" at the beginning of the third sentence. Subordinating relations led to shorter pauses between the first and second sentences. Further, when the second sentence was subordinated to the first but the third was instead coordinated with the first, the pause before the third sentence was longer than any other pause (interaction: p < .0001). The duration of "And" was shorter when both the second and third sentences were subordinated.
In the comprehension experiment, listeners saw each discourse and picture pair and paraphrased the two meanings. Afterward, they heard recordings from the production experiment, chose the meaning they thought the speaker intended, and rated their confidence in their choice. A multilevel model showed that when listeners chose the correct meaning, the durations of both the first and second pauses predicted their confidence.
This set of results indicates that speakers vary their prosody in order to disambiguate discourse relations, and that listeners make use of this information during comprehension. More specifically, speakers tended to use short first pauses to signal subordination, very short second pauses and very short productions of "And" to signal continued subordination, and long pauses to indicate coordination after subordination. This information made listeners more confident in their correct choices.
References
Asher, N., & Lascarides, A. (2003). Logics of Conversation. Cambridge, UK: Cambridge University Press.
Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93, 283-321.
Schafer, A., Speer, S., Warren, P., & White, S. D. (2000). Intonational Disambiguation in Sentence Production and Comprehension. Journal of Psycholinguistic Research, 29(2), 169-181.
Snedeker, J., & Trueswell, J. C. (2003). Using Prosody to Avoid Ambiguity: Effects of Speaker Awareness and Referential Context. Journal of Memory and Language, 48, 103-130.
Figure 1: Duration in milliseconds of the sentence and pause regions, broken down by condition (Subordinated vs. Coordinated) and discourse type (second and third sentences both subordinated/coordinated vs. only the second sentence subordinated/coordinated).
Figure 2: Duration in milliseconds of the "And" region at the beginning of each Sentence 3, again broken down by condition and discourse type (see Figure 1).
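As a rough illustration of the production experiment's within-subjects design (12 speakers, 14 discourses, each read once per relation condition), the trial lists could be generated along the following lines. The abstract says only that order was randomized, so the full shuffle of all 28 trials, the seeding scheme, and the names `trial_list` and `full_design` are assumptions made for this sketch.

```python
import random

N_SPEAKERS = 12    # speakers in the production experiment
N_DISCOURSES = 14  # ambiguous three-sentence discourses
RELATIONS = ("subordinate", "coordinate")


def trial_list(speaker_id, seed=0):
    """Return one speaker's 28 (discourse, relation) trials in shuffled order.

    Assumption: every trial is shuffled freely; the actual randomization
    scheme is not described in the abstract.
    """
    rng = random.Random(seed * 1000 + speaker_id)  # reproducible per speaker
    trials = [
        (discourse, relation)
        for discourse in range(1, N_DISCOURSES + 1)
        for relation in RELATIONS
    ]
    rng.shuffle(trials)
    return trials


def full_design(seed=0):
    """Map each speaker id (1..12) to that speaker's trial list."""
    return {s: trial_list(s, seed) for s in range(1, N_SPEAKERS + 1)}


if __name__ == "__main__":
    design = full_design()
    per_speaker = len(design[1])
    total = sum(len(trials) for trials in design.values())
    print(f"{per_speaker} utterances per speaker, {total} planned in total")
```

With no missing data, this design yields 14 x 2 = 28 utterances per speaker and 12 x 28 = 336 utterances overall. Because every speaker produces every discourse in both conditions, lexical and syntactic content is held constant across the comparison.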