Pseudo-Dynamic Preservation and Elucidation of Neural Processing of Endangered Languages Based on Natural Discourse Corpora with Physiological Indices
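Project source
Principal investigator
Funded institution
Year of approval
Date of approval
Project number
Project level
Research period
Funding amount
Discipline
Discipline code
Fund category
Keywords
Participating institutions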
1. Pupillometric evidence for perceptual simulation in language comprehension: Sensory and emotional meanings of Japanese adjectives
- Keywords:
- pupillometry; perceptual simulation; language comprehension; embodied cognition; Japanese; literal and metaphorical meaning; PUPIL; EYE; EMBODIMENT; WORDS
- Niikuni, Keiyu;Sato, Manami
- 《PERCEPTION》
- 2026
- Volume:
- Issue:
- Journal
Previous research has demonstrated that words associated with brightness (e.g., "sun") elicit smaller pupil diameters than those related to darkness (e.g., "night"). The present study aimed to determine whether these language-induced pupillary responses are driven by the luminance of the mentally simulated content (referred to here as sensory interpretation) or by the conceptual brightness linked to the words' emotional valence (termed emotional interpretation). To address this question, we utilized the Japanese adjectives akarui and kurai, which can denote both luminance, as in the noun phrase akarui/kurai gamen ("bright/dark screen"), and emotional valence, as in akarui/kurai seikaku ("cheerful/gloomy personality"). Participants were presented with noun phrases composed of these adjectives and various nouns (akarui/kurai + noun). A significant main effect of the adjective indicated that phrases containing akarui yielded smaller pupil diameters than those containing kurai. Furthermore, although the interaction effect did not reach significance, the adjective effect was observed only when the adjectives conveyed luminance, not when they conveyed emotional valence. These findings suggest that sensory, rather than emotional, interpretation better explains language-induced changes in pupil size. The use of pupillometry as a measure of perceptual simulation offers more direct and compelling evidence in support of the central claim of embodied language theories: that during language comprehension, readers and listeners spontaneously generate sensorimotor simulations of the described content. Future studies are warranted to examine whether these findings extend to sentence- and discourse-level processing, as well as to simulations of information conveyed implicitly or indirectly through language.
2. The Lingering Effect as Memory Persistence Has Distinct Predictors From the Garden-Path Effect
- Keywords:
- sentence processing; garden-path sentence; garden-path effect; lingering effect; Japanese; WORD-LENGTH; COMPREHENSION; RECOVERY; YOUNGER; DECAY
- Emura, Rei;Kawachi, Yousuke;Sugawara, Saku;Koizumi, Masatoshi
- 《JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION》
- 2025
- Volume:
- Issue:
- Journal
We investigated the mechanism of the lingering effect in relation to the garden-path effect based on self-paced reading and comprehension experiments in Japanese, which shows higher reanalysis success rates than English does. The lingering effect is a phenomenon whereby an initial misinterpretation persists in the final comprehension even after disambiguation. Through self-paced reading (Experiment 1) and comprehension tasks (Experiments 2 and 3), this study explored how the length and head position of ambiguous regions influence the garden-path and lingering effects. Our results indicate that the length and head position influenced the garden-path and lingering effects in different ways. In particular, a longer initial misparse strengthened the garden-path effect in a linear manner but weakened the lingering effect in a nonlinear manner. Additionally, surprisal affected the garden-path effect but not the lingering effect. These results support the notion that the garden-path and lingering effects are correlated but operate through different underlying processes. Specifically, the garden-path effect pertains to parsing, whereas the lingering effect relates to short-term memory.
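The surprisal predictor mentioned in this abstract is conventionally defined as the negative log probability of a word given its preceding context. The abstract does not say which language model or estimation method the authors used, so the following is only a minimal sketch of the standard definition; the function name `surprisal_bits` and the example probabilities are hypothetical.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return surprisal in bits: -log2 P(word | context).

    `probability` is assumed to come from some language model;
    the abstract does not specify which one was used.
    """
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


if __name__ == "__main__":
    # A less predictable word (lower probability) has higher surprisal.
    for p in (0.5, 0.25, 0.01):
        print(f"P = {p}: {surprisal_bits(p):.2f} bits")
```

Under this definition, a disambiguating word that the context makes unlikely carries high surprisal, which is the quantity the abstract reports as affecting the garden-path effect but not the lingering effect.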
3. Speakers of Verb-Initial Languages and Verb-Medial Languages Interpret the World Differently: A Comparative Study of Truku Seediq and English
- Keywords:
- psycholinguistics; verb-object-subject word order; pantomime; verb-initial language; cognitive saliency; CONSTITUENT ORDER; COMMUNICATION-SYSTEMS; ADAPTIVE MEMORY; ANIMACY; CONSTRAINTS; ACCOUNT; VOICE
- Sato, Manami;Luo, Yingyi;Schafer, Amy J.;Tang, Apay Ai-yu;Ono, Hajime;Sakai, Hiromu;Koizumi, Masatoshi
- 《JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION》
- 2025
- Volume:
- Issue:
- Journal
Recent gesture studies investigating how speakers linearize events in which one entity acts on another have claimed that the preferred order is [subject/agent]-[object/patient]-[verb/action] (SOV/APV) irrespective of language background (Schouwstra et al., 2022; Goldin-Meadow et al., 2008). However, these studies have only tested speakers of languages in which the subject/agent preferentially precedes the object/patient. We provide a stronger test of this cognitive-universal hypothesis using elicited pantomime (plus a spoken-language comparison task) with speakers of Truku Seediq, which favors the typologically rare VOS/VPA word order, and English-speaking controls. While the English speakers' pantomimes largely employed the expected SOV/APV and SVO/AVP orders, the Truku Seediq speakers produced almost no APV sequences. The results strengthen the evidence for processing effects that promote SVO/AVP order under certain conditions, and further support the claim that the habitual use of a language may cumulatively influence speakers' cognitive activities as they are interpreting the world. The divergent preferences for the two typologically different languages suggest that language experience can change conceptual accessibility, especially in terms of action saliency, in speakers' cognition.
4. The crucial role of the left inferior frontal gyrus (BA44) in synergizing syntactic structure and information structure during sentence comprehension
- Keywords:
- Syntactic structure; Information structure; fMRI; Left inferior frontal gyrus; WORD-ORDER; NEURAL BASIS; MOVEMENT; CONTEXTS
- Jeong, Hyeonjeong;Kim, Jungho;Yano, Masataka;Cui, Haining;Kiyama, Sachiko;Koizumi, Masatoshi
- 《BRAIN AND LANGUAGE》
- 2025
- Volume 262
- Issue:
- Journal
This study examines the neural mechanisms behind integrating syntactic and information structures during sentence comprehension using functional Magnetic Resonance Imaging. Focusing on Japanese sentences with canonical (SOV) and non-canonical (OSV) word orders, the study revealed distinct neural networks responsible for processing these linguistic structures. The left opercular part of the inferior frontal gyrus, left premotor area, and left posterior superior/middle temporal gyrus were primarily involved in syntactic processing. In contrast, the right inferior frontal sulcus, bilateral intraparietal sulci, and the left triangular part of the inferior frontal gyrus were linked to information structure processing. Importantly, the left opercular part of the inferior frontal gyrus (BA44) played a crucial role in integrating these structures during the later stages of comprehension, particularly when processing the second noun phrase. These findings enhance our understanding of the complex interplay between syntactic and information structures in language comprehension.
5. Evaluation of Different Training Strategies and Recognizers in Low Resource Speech Recognition Using Wav2vec2.0
- Keywords:
- Decoding;Learning algorithms;Learning systems;Self-supervised learning;Signal encoding;Speech coding;Speech communication;Supervised learning;Automatic speech recognition;Character error rates;Learning frameworks;Learning strategy;Low resource languages;Low-resource speech recognition;Minority languages;Training strategy;Transformer;Wav2vec
- Koshikawa, Takaki;Ito, Akinori;Nose, Takashi
- 《17th International Conference on Machine Learning and Computing, ICMLC 2025》
- 2025
- February 14, 2025 - February 17, 2025
- Guangzhou, China
- Conference
Automatic Speech Recognition (ASR) is crucial for preserving minority languages, promoting inclusivity, and supporting education. The wav2vec2.0 model, pre-trained through self-supervised learning, is effective for low-resource language speech recognition. This study therefore investigates different learning strategies, recognizers, and frameworks to improve ASR performance for low-resource languages. First, we compared five learning strategies for low-resource language speech recognition using the wav2vec2.0 model. The Freeze-Transformer strategy, which keeps the CNN and the lower Transformer blocks fixed, achieved the lowest Character Error Rate (CER). Next, we evaluated five types of recognizers: fully connected layers, MLP, RNN, LSTM, and GRU. The bi-GRU recognizer performed best, achieving the lowest CER. Finally, we tested an Encoder-Decoder model with wav2vec2.0 as the encoder and a Transformer decoder as the decoder. The results showed that recognition performance did not improve with this model, even with a large amount of training data. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
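The comparisons in this abstract are all scored by Character Error Rate. As a minimal sketch, assuming the usual definition (character-level edit distance between the recognizer output and the reference transcript, divided by the reference length), CER can be computed as below. The abstract does not describe the authors' scoring script, so the function names and sample strings are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance counted over characters."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(character_error_rate("abcd", "abxd"))
```

Because the denominator is the reference length, CER can exceed 1.0 when the hypothesis contains many insertions.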
6. Producing non-basic word orders in (in)felicitous contexts: evidence from pupillometry and functional near-infrared spectroscopy (fNIRS)
- Keywords:
- Filler-gap dependency; discourse; Japanese; pupillometry; functional near-infrared spectroscopy (fNIRS); SENTENCE PRODUCTION; SYNTACTIC STRUCTURE; PROCESSING LOAD; LANGUAGE PRODUCTION; AUDIENCE DESIGN; WORKING-MEMORY; INFORMATION; COMPLEXITY; ERPS; COMPREHENSION
- Yano, Masataka;Niikuni, Keiyu;Shimura, Ruri;Funasaki, Natsumi;Koizumi, Masatoshi
- 《LANGUAGE COGNITION AND NEUROSCIENCE》
- 2024
- Volume:
- Issue:
- Journal
The present study examined why speakers of languages with flexible word orders are more likely to use syntactically complex non-basic word orders when they provide discourse-given information earlier in sentences. This may be because they are more efficient for speakers to produce (the Speaker Economy Hypothesis). Alternatively, speakers may produce them to help listeners understand sentences more efficiently (the Listener Economy Hypothesis), given that previous studies showed that the processing of non-basic word orders was facilitated when the felicitous context was provided (i.e. a displaced object refers to discourse-given information). We addressed this issue by conducting a picture-description experiment, in which participants uttered sentences with syntactically basic Subject-Object-Verb (SOV) or non-basic Object-Subject-Verb (OSV) in felicitous or infelicitous contexts while cognitive load was tracked using pupillometry and functional near-infrared spectroscopy. The results showed that the felicitous context facilitated the filler-gap dependency formation of OSVs in production, supporting the Speaker Economy Hypothesis.
7. Improving Speaker Consistency in Speech-to-Speech Translation Using Speaker Retention Unit-to-Mel Techniques
- Keywords:
- Semantics;Speech enhancement;Translation (languages);End to end;French-english;Semantic content;Semantics Information;Speaker specific informations;Speech-to-speech translation;Synthesized speech;Voice quality;Waveforms
- Zhou, Rui;Ito, Akinori;Nose, Takashi
- 《2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2024》
- 2024
- December 3, 2024 - December 6, 2024
- Macau, China
- Conference
We propose a Speaker-Consistent Speech-to-Speech Translation (SC-S2ST) system that effectively retains speaker-specific information. The paradigm of Speech-to-Unit Translation (S2UT) followed by a Unit-to-Waveform vocoder has become mainstream for end-to-end S2ST systems. However, because discrete units carry substantial semantic content, this approach primarily captures semantic information and often results in synthesized speech that lacks speaker-specific characteristics such as accent and individual voice qualities. Existing S2UT systems with style transfer face the issue of high inference latency. To address this limitation, we introduce a Speaker-Retention Unit-to-Mel (SR-UTM) framework designed to capture and preserve speaker-specific information. We conducted experiments on the CVSS-C and CVSS-T corpora for Spanish-English and French-English translation tasks. Our approach achieved BLEU scores of 16.10 and 21.68, which are comparable to those of the baseline S2UT system. Furthermore, our SC-S2UT system excelled in preserving speaker similarity. The speaker similarity experiments showed that our method effectively retains speaker-specific information without significantly increasing inference time. These results confirm that our approach successfully achieves speaker-consistent speech-to-speech translation. © 2024 IEEE.
