How people learn language continues to puzzle many researchers in linguistics, but a team of College of Arts and Sciences linguists received a $500,000 grant from the National Science Foundation in 2025 to get closer to an answer. Volya Kapatsinski in the Department of Linguistics and Kaori Idemaru in the Department of East Asian Languages and Literatures secured the grant and are currently conducting the research.
The project, “The Role of Learned Selective Attention in Accent Adaptation,” asks a central scientific question: how do people learn to understand speech better by adjusting what they pay attention to in spoken language?
“The ultimate goal is to understand how language learning works. Linguistics has long been concerned with the mechanisms that allow humans to learn language,” said Volya Kapatsinski, professor and head of the Department of Linguistics.
In everyday interactions, a listener trying to understand someone who speaks with an unfamiliar accent will adjust what they listen for when making out an unfamiliar-sounding word. The small details listeners pay attention to are technically referred to as perceptual dimensions and include things like duration or pitch. This adjustment is called dimensional reweighting, and there is no definitive answer yet for how people learn to do it.
To further explain this, Kapatsinski offers the words “beer” and “peer” as an example, explaining that they differ on a dozen distinct perceptual dimensions. “An example of a perceptual dimension is the duration of the little puff of air that escapes the mouth when the lips open to produce a ‘p’ or ‘b.’ It is longer in ‘p.’ This dimension is called voice onset time (VOT). The other main difference in perceptual dimensions of the two words is the pitch at the beginning of the following vowel, which is higher in ‘p,’ and called fundamental frequency (F0).”
Dimensions themselves are universal, but their weights vary from language to language. Kapatsinski explained that native English listeners may pay more attention to VOT than to F0, but some native Korean listeners pay more attention to F0 than to VOT, even when they are listening to English. The research likens this shift in attention to upweighting and downweighting in machine learning, placing more or less importance on a given piece of data.
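To make the analogy concrete, here is a minimal, hypothetical sketch of dimensional reweighting in Python. It is not the researchers’ model; the cue boundaries, numbers, and listener profiles are purely illustrative. A listener’s “p”-versus-“b” decision is treated as a weighted combination of two cues, VOT and F0, so shifting the weights changes which cue decides an ambiguous word.

```python
# Hypothetical illustration of dimensional reweighting (not the researchers' model).
# A "p"-vs-"b" decision is modeled as a weighted combination of two perceptual
# dimensions: voice onset time (VOT) and fundamental frequency (F0).

def classify(vot_ms, f0_hz, weights):
    """Return 'p' or 'b' from a weighted sum of roughly normalized cues."""
    vot_cue = (vot_ms - 30) / 30.0   # positive values favor "p" (illustrative boundary)
    f0_cue = (f0_hz - 220) / 30.0    # positive values favor "p" (illustrative boundary)
    score = weights["VOT"] * vot_cue + weights["F0"] * f0_cue
    return "p" if score > 0 else "b"

# A listener who weights VOT heavily (as English listeners often do)...
english_like = {"VOT": 0.9, "F0": 0.1}
# ...versus a listener who weights F0 heavily (as some Korean listeners do).
korean_like = {"VOT": 0.1, "F0": 0.9}

# An ambiguous token: short VOT (favoring "b") but high pitch (favoring "p").
print(classify(vot_ms=20, f0_hz=260, weights=english_like))  # "b" -- VOT wins
print(classify(vot_ms=20, f0_hz=260, weights=korean_like))   # "p" -- F0 wins
```

With the same acoustic token, the two hypothetical listeners reach opposite answers simply because they distribute attention differently across the dimensions.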
“This explains some puzzling cases in which second language learners seem to have persistent difficulties in second language speech perception,” Kapatsinski said.
To conduct the research, the team is using a computer model they previously developed, combined with behavioral experiments, to test key predictions about accent adaptation. The model is built on the ideas of learned selective attention theory and reinforcement learning.
“Learned selective attention is a popular idea in category learning that has been investigated extensively in vision,” said Kapatsinski. “Reinforcement learning is the dominant approach in AI and robotics. We are bringing these ideas together and investigating them in the domain of speech perception where they have not been investigated.”
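As a rough, hypothetical sketch of how a reinforcement-style learning rule could drive dimensional reweighting (again, illustrative code rather than the team’s published model), the snippet below upweights dimensions that supported a correct identification and downweights those that supported an incorrect one, so attention drifts toward whichever cue is reliable for a given talker.

```python
# Hypothetical sketch of feedback-driven dimensional reweighting (illustrative only).
# After each trial, dimensions that supported a correct response are upweighted and
# dimensions that supported an incorrect response are downweighted.

def update_weights(weights, cues, predicted, correct, lr=0.1):
    """Nudge per-dimension attention weights based on trial feedback.

    weights: dict of dimension name -> attention weight (non-negative, summing to 1)
    cues: dict of dimension name -> signed cue value (positive favors "p")
    """
    choice_sign = 1.0 if predicted == "p" else -1.0
    reward = 1.0 if predicted == correct else -1.0
    for dim, cue in cues.items():
        support = 1.0 if cue * choice_sign > 0 else -1.0  # did this cue favor the choice?
        weights[dim] = max(weights[dim] + lr * reward * support, 0.0)
    total = sum(weights.values()) or 1.0  # renormalize to keep a distribution of attention
    return {dim: w / total for dim, w in weights.items()}

weights = {"VOT": 0.5, "F0": 0.5}
# Imagine a talker whose "p" tokens have unusually short VOT but reliably high F0:
# VOT is misleading, F0 is informative, so feedback gradually shifts attention to F0.
for _ in range(10):
    cues = {"VOT": -0.3, "F0": 1.2}  # a "p" token from this talker
    predicted = "p" if sum(weights[d] * cues[d] for d in cues) > 0 else "b"
    weights = update_weights(weights, cues, predicted, correct="p")
print(weights)  # attention has shifted toward F0
```

In this toy version, the “reward” is simply whether the listener identified the word correctly; over repeated trials the attention weights migrate toward the dimension that predicts the talker’s categories.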
Along with principal investigators Kapatsinski and Idemaru, graduate students Carina Ahrens and Nadia Clement in linguistics and Yeojin Jung in East Asian languages and literatures are researchers on the project. They plan to test approximately 600 English-speaking subjects in the US and Korean-speaking subjects in Korea.
“Research is important for graduate students to train us in research methods, concept formulation and collaboration,” said Clement.
—By Jenny Brooks, College of Arts and Sciences