
Pseudo-Dynamic Preservation and Elucidation of Neural Processing of Endangered Languages Based on Natural Discourse Corpora with Physiological Indices

Funding source

Japan Society for the Promotion of Science (JSPS)

Principal investigator

小泉政利

Host institution

東北大学

Award year

2024

Award date

Not disclosed

Project number

24H00085

Project level

National

Research period

Unknown / Unknown

Funding amount

47,580,000 JPY

Discipline

Literature, linguistics, and related fields

Discipline code

Not disclosed

Grant category

Grant-in-Aid for Scientific Research (A) (基盤研究(A))

Keywords

Austronesian languages; Mayan languages; Japonic languages; endangered languages; brain function measurement

Project members

伊藤彰則; 那須川訓也; 大塚祐子; 小野創; 大滝宏一; 里麻奈美; 木山幸子; 安永大地; 山田真寛; 大関洋平; 新国佳祐; 矢野雅貴; 宮川創; 遊佐麻友子

Participating institutions

東北学院大学；上智大学；津田塾大学；中京大学；沖縄国際大学；金沢大学；大学共同利用機関法人人間文化研究機構国立国語研究所；東京大学；新潟青陵大学；東京都立大学；筑波大学；弘前学院大学

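Project proposal abstract (Outline of Research at the Start): The majority of the world's existing languages are in danger of extinction, and preserving and revitalizing languages and cultures is an urgent task. To address this problem, this research proposes and implements a novel method, "pseudo-dynamic preservation of endangered languages," which makes use of "natural discourse corpora with physiological indices" and an "AI dialogue system." In addition, it draws on "natural discourse corpora with physiological indices" together with "behavioral, eye-tracking, and brain function measurement experiments" to elucidate how the languages of ethnic minorities are processed in the brain.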


1.Pupillometric evidence for perceptual simulation in language comprehension: Sensory and emotional meanings of Japanese adjectives

Keywords:
pupillometry; perceptual simulation; language comprehension; embodied cognition; Japanese; literal and metaphorical meaning; PUPIL; EYE; EMBODIMENT; WORDS

Niikuni, Keiyu;Sato, Manami
《PERCEPTION》
2026
Journal article

Previous research has demonstrated that words associated with brightness (e.g., "sun") elicit smaller pupil diameters than those related to darkness (e.g., "night"). The present study aimed to determine whether these language-induced pupillary responses are driven by the luminance of the mentally simulated content (referred to here as sensory interpretation) or by the conceptual brightness linked to the words' emotional valence (termed emotional interpretation). To address this question, we utilized the Japanese adjectives akarui and kurai, which can denote both luminance, as in the noun phrase akarui/kurai gamen ("bright/dark screen"), and emotional valence, as in akarui/kurai seikaku ("cheerful/gloomy personality"). Participants were presented with noun phrases composed of these adjectives and various nouns (akarui/kurai + noun). A significant main effect of the adjective indicated that phrases containing akarui yielded smaller pupil diameters than those containing kurai. Furthermore, although the interaction effect did not reach significance, the adjective effect was observed only when the adjectives conveyed luminance, not when they conveyed emotional valence. These findings suggest that sensory, rather than emotional, interpretation better explains language-induced changes in pupil size. The use of pupillometry as a measure of perceptual simulation offers more direct and compelling evidence in support of the central claim of embodied language theories: that during language comprehension, readers and listeners spontaneously generate sensorimotor simulations of the described content. Future studies are warranted to examine whether these findings extend to sentence- and discourse-level processing, as well as to simulations of information conveyed implicitly or indirectly through language.

...

2.The Lingering Effect as Memory Persistence Has Distinct Predictors From the Garden-Path Effect

Keywords:
sentence processing; garden-path sentence; garden-path effect; lingering effect; Japanese; WORD-LENGTH; COMPREHENSION; RECOVERY; YOUNGER; DECAY

Emura, Rei;Kawachi, Yousuke;Sugawara, Saku;Koizumi, Masatoshi
《JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION》
2025
Journal article

We investigated the mechanism of the lingering effect in relation to the garden-path effect based on self-paced reading and comprehension experiments in Japanese, which shows higher reanalysis success rates than English does. The lingering effect is a phenomenon whereby an initial misinterpretation persists in the final comprehension even after disambiguation. Through self-paced reading (Experiment 1) and comprehension tasks (Experiments 2 and 3), this study explored how the length and head position of ambiguous regions influence the garden-path and lingering effects. Our results indicate that the length and head position influenced the garden-path and lingering effects in different ways. In particular, a longer initial misparse strengthened the garden-path effect in a linear manner but weakened the lingering effect in a nonlinear manner. Additionally, surprisal affected the garden-path effect but not the lingering effect. These results support the notion that the garden-path and lingering effects are correlated but operate through different underlying processes. Specifically, the garden-path effect pertains to parsing, whereas the lingering effect relates to short-term memory.

...

3.Speakers of Verb-Initial Languages and Verb-Medial Languages Interpret the World Differently: A Comparative Study of Truku Seediq and English

Keywords:
psycholinguistics; verb-object-subject word order; pantomime; verb-initial language; cognitive saliency; CONSTITUENT ORDER; COMMUNICATION-SYSTEMS; ADAPTIVE MEMORY; ANIMACY; CONSTRAINTS; ACCOUNT; VOICE

Sato, Manami;Luo, Yingyi;Schafer, Amy J.;Tang, Apay Ai-yu;Ono, Hajime;Sakai, Hiromu;Koizumi, Masatoshi
《JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION》
2025
Journal article

Recent gesture studies investigating how speakers linearize events in which one entity acts on another have claimed that the preferred order is [subject/agent]-[object/patient]-[verb/action] (SOV/APV) irrespective of language background (Schouwstra et al., 2022; Goldin-Meadow et al., 2008). However, these studies have only tested speakers of languages in which the subject/agent preferentially precedes the object/patient. We provide a stronger test of this cognitive-universal hypothesis using elicited pantomime (plus a spoken-language comparison task) with speakers of Truku Seediq, which favors the typologically rare VOS/VPA word order, and English-speaking controls. While the English speakers' pantomimes largely employed the expected SOV/APV and SVO/AVP orders, the Truku Seediq speakers produced almost no APV sequences. The results strengthen the evidence for processing effects that promote SVO/AVP order under certain conditions, and further support the claim that the habitual use of a language may cumulatively influence speakers' cognitive activities as they are interpreting the world. The divergent preferences for the two typologically different languages suggest that language experience can change conceptual accessibility, especially in terms of action saliency, in speakers' cognition.

...

4.The crucial role of the left inferior frontal gyrus (BA44) in synergizing syntactic structure and information structure during sentence comprehension

Keywords:
Syntactic structure; Information structure; fMRI; Left inferior frontal gyrus; WORD-ORDER; NEURAL BASIS; MOVEMENT; CONTEXTS

Jeong, Hyeonjeong;Kim, Jungho;Yano, Masataka;Cui, Haining;Kiyama, Sachiko;Koizumi, Masatoshi
《BRAIN AND LANGUAGE》
2025
Volume 262
Journal article

This study examines the neural mechanisms behind integrating syntactic and information structures during sentence comprehension using functional Magnetic Resonance Imaging. Focusing on Japanese sentences with canonical (SOV) and non-canonical (OSV) word orders, the study revealed distinct neural networks responsible for processing these linguistic structures. The left opercular part of the inferior frontal gyrus, left premotor area, and left posterior superior/middle temporal gyrus were primarily involved in syntactic processing. In contrast, the right inferior frontal sulcus, bilateral intraparietal sulci, and the left triangular part of the inferior frontal gyrus were linked to information structure processing. Importantly, the left opercular part of the inferior frontal gyrus (BA44) played a crucial role in integrating these structures during the later stages of comprehension, particularly when processing the second noun phrase. These findings enhance our understanding of the complex interplay between syntactic and information structures in language comprehension.

...

5.Evaluation of Different Training Strategies and Recognizers in Low Resource Speech Recognition Using Wav2vec2.0

Keywords:
Decoding;Learning algorithms;Learning systems;Self-supervised learning;Signal encoding;Speech coding;Speech communication;Supervised learning;Automatic speech recognition;Character error rates;Learning frameworks;Learning strategy;Low resource languages;Low-resource speech recognition;Minority languages;Training strategy;Transformer;Wav2vec

Koshikawa, Takaki;Ito, Akinori;Nose, Takashi
《17th International Conference on Machine Learning and Computing, ICMLC 2025》
2025
February 14, 2025 - February 17, 2025
Guangzhou, China
Conference paper

Automatic Speech Recognition (ASR) is crucial for preserving minority languages, promoting inclusivity, and supporting education. The wav2vec2.0 model, pre-trained through self-supervised learning, is effective for low-resource language speech recognition. Thus, this study investigates different learning strategies, recognizers, and frameworks to improve ASR performance for low-resource languages. First, we compared five learning strategies for low-resource language speech recognition using the wav2vec2.0 model. The Freeze-Transformer strategy, which fixes the CNN and low-layer Transformer blocks, achieved the lowest Character Error Rate (CER). Next, we evaluated five types of recognizers, including fully connected layers, MLP, RNN, LSTM, and GRU. The bi-GRU recognizer performed the best, achieving the lowest CER. Finally, we tested an Encoder-Decoder model with wav2vec2.0 as the encoder and a Transformer-decoder as the decoder. The results showed that the recognition performance did not improve with this model, even with a large amount of training data. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
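
Editorial note on the metric: the abstract compares systems by Character Error Rate (CER) but does not say how it was computed. The sketch below assumes the conventional definition, character-level Levenshtein distance divided by the reference length. The function names `edit_distance` and `character_error_rate` are illustrative and not taken from the paper, and any text normalization the authors applied is unknown.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between reference and hypothesis."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters (assumed definition)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One dropped character out of four reference characters.
    print(character_error_rate("abcd", "abd"))
```

Because the denominator is the reference length, CER can exceed 1.0 when the hypothesis contains many insertions.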

...

6.Producing non-basic word orders in (in)felicitous contexts: evidence from pupillometry and functional near-infrared spectroscopy (fNIRS)

Keywords:
Filler-gap dependency; discourse; Japanese; pupillometry; functional near-infrared spectroscopy (fNIRS); SENTENCE PRODUCTION; SYNTACTIC STRUCTURE; PROCESSING LOAD; LANGUAGE PRODUCTION; AUDIENCE DESIGN; WORKING-MEMORY; INFORMATION; COMPLEXITY; ERPS; COMPREHENSION

Yano, Masataka;Niikuni, Keiyu;Shimura, Ruri;Funasaki, Natsumi;Koizumi, Masatoshi
《LANGUAGE COGNITION AND NEUROSCIENCE》
2024
Journal article

The present study examined why speakers of languages with flexible word orders are more likely to use syntactically complex non-basic word orders when they provide discourse-given information earlier in sentences. This may be because they are more efficient for speakers to produce (the Speaker Economy Hypothesis). Alternatively, speakers may produce them to help listeners understand sentences more efficiently (the Listener Economy Hypothesis), given that previous studies showed that the processing of non-basic word orders was facilitated when the felicitous context was provided (i.e. a displaced object refers to discourse-given information). We addressed this issue by conducting a picture-description experiment, in which participants uttered sentences with syntactically basic Subject-Object-Verb (SOV) or non-basic Object-Subject-Verb (OSV) in felicitous or infelicitous contexts while cognitive load was tracked using pupillometry and functional near-infrared spectroscopy. The results showed that the felicitous context facilitated the filler-gap dependency formation of OSVs in production, supporting the Speaker Economy Hypothesis.

...

7.Improving Speaker Consistency in Speech-to-Speech Translation Using Speaker Retention Unit-to-Mel Techniques

Keywords:
Semantics;Speech enhancement;Translation (languages);End to end;French-english;Semantic content;Semantics Information;Speaker specific informations;Speech-to-speech translation;Synthesized speech;Voice quality;Waveforms

Zhou, Rui;Ito, Akinori;Nose, Takashi
《2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2024》
2024
December 3, 2024 - December 6, 2024
Macau, China
Conference paper

We propose a Speaker-Consistent Speech-to-Speech Translation (SC-S2ST) system that effectively retains speaker-specific information. While the paradigm of Speech-to-Unit Translation (S2UT) followed by a Unit-to-Waveform Vocoder has become mainstream for End-to-End S2ST systems, due to the substantial semantic content carried by discrete units, this approach primarily captures semantic information and often results in synthesized speech that lacks speaker-specific characteristics such as accent and individual voice qualities. Existing S2UT systems with style transfer face the issue of high inference latency. To address this limitation, we introduced a Speaker-Retention Unit-to-Mel (SR-UTM) framework designed to capture and preserve speaker-specific information. We conducted experiments on the CVSS-C and CVSS-T corpora for Spanish-English and French-English translation tasks. Our approach achieved BLEU scores of 16.10 and 21.68, which are comparable to those of the baseline S2UT system. Furthermore, our SC-S2ST system excelled in preserving speaker similarity. The speaker similarity experiments showed that our method effectively retains speaker-specific information without significantly increasing inference time. These results confirm that our primary approach successfully achieves speaker-consistent speech-to-speech translation. © 2024 IEEE.

...
