Exploring intonational interfaces with other linguistic domains
There has been remarkable progress on the phonology of English intonation, especially over the past four decades. At the same time, there is much left to uncover in the nature of intonational models and representations (even for relatively well-studied languages like English), both with respect to intonation proper and with respect to its relations to other domains of language. This major branch of my research program aims to make progress in understanding intonation from a variety of perspectives. I pursue this work in collaboration with others in a research collective known as the Intonation and its Interfaces (int²) group.
I start from the premise that morphosyntax builds abstract structures while being blind to how they manifest in phonological form (i.e., “late insertion”/“realizational” models). On this view, some abstract morphemes could (in principle) map onto intonational forms via the same mechanisms that map abstract morphemes onto segmental forms; that is, we expect that a morpheme X⁰ could just as easily map onto /L*/ as onto /-o/. Determining whether this is true raises many more questions about intonation and its many interfaces.
Some of those questions are about:
phonology (What are the atoms of form, in the abstract sense, for English?),
the phonetics-phonology interface (How do atoms of form get expressed as acoustic cues?),
annotation (How should we annotate intonation?),
semantics/pragmatics (What are the atoms of meaning that intonation is associated with in English?),
sociolinguistics (How do we tease out the effects of social variables on intonation?),
syntax (How are those atoms of form and atoms of meaning related, in the syntax?), and
morphology (Are there constraints on what intonation can/cannot manifest?).
A portion of my work in this domain is an NSF-supported project on intonational morphology, for which I am a lead PI, in collaboration with Nanette Veilleux, Stefanie Shattuck-Hufnagel, Sunwoo Jeong, Alejna Brugos, and a group of undergraduate research assistants.
I have also worked in this domain on the intonation of American English polar questions (with Z.L. Zhou) and on the intonational features of American English newscaster speech (with Emily Gasser, Donna Jo Napoli, and Z.L. Zhou).
Some work in this project
Ahn, Byron, Nanette Veilleux, Stefanie Shattuck-Hufnagel & Alejna Brugos. Accepted. Embarking on PoLaR Explorations: A Framework for Intonational Annotation and Analysis.
This monograph lays out an extensive set of guidelines that define the PoLaR framework for prosodic labelling, whose name is based on Points, Levels, and Ranges. The aim of this system is to provide a means of annotating the core components of prosody (especially with respect to intonation) on separate tiers. This includes information such as pitch turning points, pitch ranges, prominence and phrases. A hallmark of PoLaR is that these features can be annotated separately from one another and systematically, in a way that is not offered by other popular systems of intonational annotation; at the same time, PoLaR also provides advanced labellers with optional labels that can annotate analyses of how these features are related to one another. PoLaR’s shallow learning curve, its flexibility, and its extensibility make PoLaR useful and accessible to a wide variety of researchers, from a wide variety of backgrounds, working on a wide variety of languages and varieties. The structure of this monograph is as follows. After providing background context for PoLaR in Chapter 1, Chapters 2 and 3 comprise the annotation guidelines. After the official guidelines, we discuss details about how PoLaR has been explicitly designed to be customized or expanded upon (Chapter 4), as well as some usage cases for PoLaR and some of its advantages (Chapter 5).
Ahn, Byron, Nanette Veilleux, Beth Sturman, Alejna Brugos, Sunwoo Jeong & Stefanie Shattuck-Hufnagel. 2022, March. How Meaningful These Intonational Contours Are! Poster presented at 35th Annual Conference on Human Sentence Processing (formerly the ‘CUNY Conference on Human Sentence Processing’). University of California, Santa Cruz.
As noted in works like Rett and Sturman 2021, exclamative expressions in English can take on a variety of specialized syntactic forms, which have been argued to share a syntactic core (responsible for word-order effects) and a semantic core (a mirativity operator). Like Rett and Sturman, this work aims to identify an intonational core for exclamatives, and successfully replicates their results using an entirely different methodology: a non-phonological annotation system (PoLaR) alongside statistical analysis and machine learning techniques facilitated by those labels.
Constructing a grammatical analysis this way has advantages (including practical ones like PoLaR’s ease of use). Moreover, developing these models can serve as a stepping stone towards computationally identifying more complex models of other sem-prag and/or paralinguistic meanings.
Ahn, Byron, Nanette Veilleux & Stefanie Shattuck-Hufnagel. 2019. Annotating Prosody with PoLaR: Conventions for a Decompositional Annotation System. In Sasha Calhoun, Paola Escudero, Marija Tabain, & Paul Warren (eds.), Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019, 1302–1306. Canberra, Australia: Australasian Speech Science and Technology Association Inc.
An utterance’s intonational characteristics vary according to linguistic meaning; even controlling for meaning, they may vary across speakers, and even across utterances for a given speaker. Phonological annotation systems typically bundle together disparate characteristics, such as f0 scaling, alignment, and prominence (e.g., labels like H*). This can make it difficult to document variation, and also to determine which aspects of the signal (i) result from the phonology-phonetics interface, (ii) vary according to abstract linguistic features, (iii) depend on dialect, social context, or emotional state, or (iv) may be simply noise. This has motivated a new annotation system: Points, Levels, and Ranges (PoLaR). PoLaR takes inspiration from (and can be used alongside) existing theoretical (AM theory) and transcriptional (IPO, IViE, RPT) systems, but is neither a phonological/cognitive model, nor an acoustic/physical model. It isolates individual prosodic characteristics, using labels that transparently correspond to aspects of the acoustics and/or native speaker perception.
Zhou, Z.L. & Byron Ahn. 2019. Is this in the phonology? Examining the intonational phonetics-phonology interface with American English polar questions. In Sasha Calhoun, Paola Escudero, Marija Tabain, & Paul Warren (eds.), Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019, 2450–2454. Canberra, Australia: Australasian Speech Science and Technology Association Inc.
In this paper, we discuss a priori unexpected pitch movements (spurious pitch movements; SPMs) preceding the L* in rising MAE polar questions. In experimental data, we find two non-canonical contours: a rise-fall SPM with a peak, and a steady-high-and-fall SPM with a plateau. In both, the alignment and scaling of SPMs are more variable than might be predictable with MAE_ToBI as currently defined.
We present a linear mixed effects model which shows that SPMs reflect fine-grained detail related to fluency and emotional state, but also the semantico-pragmatics of the discourse context: they appear to be correlated with fewer expectations about possible answers. Our results necessitate a model in which this type of phonetic variation can be understood as linguistically structured and motivated.
Gasser, Emily, Byron Ahn, Donna Jo Napoli & Z.L. Zhou. 2019. Production, perception, and communicative goals of American newscaster speech. Language in Society 48(2). 233–259.
Listeners often have the intuition that the speech of broadcast news reporters somehow ‘sounds different’; previous literature supports this observation and has described some distinctive aspects of newscaster register. This article presents two studies further describing the characteristic properties and functions of American English newscaster speech, focusing specifically on prosody. In the first, we investigate the production of newscaster speech. We describe the measurable differences in pitch, speed, intensity, and melodic features between newscaster and conversational speech, and connect those traits to perceptions of authority, credibility, charisma, and related characteristics. In the second, we investigate the perception of newscaster speech. Our experiments demonstrate that listeners can distinguish newscaster from conversational speech given only prosodic information, and that they use a subset of the newscasters’ distinguishing features to do so.
Gasser, Emily, Byron Ahn, Donna Jo Napoli & Z.L. Zhou. 2018, October. Prosodic features of newscaster intonation: production, perception, and communicative use. Talk presented at Experimental and Theoretical Advances in Prosody 4, UMass Amherst.
Listeners often have the intuition that the speech of broadcast news reporters somehow ‘sounds different’; previous literature supports this observation and has described some distinctive aspects of newscaster register. We present two studies further describing the characteristic properties and functions of American English newscaster speech, focusing specifically on prosody. In the first, we investigate the production of newscaster speech. We describe the measurable differences in pitch, speed, intensity, and melodic features between newscaster and conversational speech, and connect those traits to perceptions of authority, credibility, charisma, and related characteristics. In the second, we investigate the perception of newscaster speech. Our experiments demonstrate that listeners can distinguish newscaster from conversational speech given only prosodic information, and that they use a subset of the newscasters’ distinguishing features to do so.
Our results raise important questions about speech features, their conversational functions, and their perceived relationship to traits such as credibility and authority. In a follow-up survey, we asked radio newscasters about their priorities and intentions in crafting their on-air voice; their responses conflicted with our findings, indicating that the prosodic features characteristic of newscaster speech are likely instantiated on a subconscious level as a feature of the speech genre. Further, listeners appear to be paying attention to sub-phonemic features to identify it. When characterizing a prosodic genre, we must look past phonological representations to the phonetic level.
Ahn, Byron, Stefanie Shattuck-Hufnagel & Nanette Veilleux. 2016. Evidence and Intonational Contours: An Experimental Approach to Meaning in Intonation. Proceedings of the Sixteenth Australasian International Conference on Speech Science and Technology 189–192.
One proposed function of prosody is conveying speakers’ stances on the evidence for their statements and assessments of listeners’ beliefs. Testing this is challenging: specifying evidential status is difficult, and speakers vary intonationally for a given context. A novel task of reading lines from comic strips elicits relatively consistent intonation, suggesting both the method’s usefulness and the efficacy of speaker beliefs in governing prosodic contours. Preliminary results suggest H* accents are used when speakers believe they have evidence the hearer lacks, and L* accents for the flipped situation.