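Functional Spectrum Analysis for Elucidating the Multimodal Synergy of Interlocutors' Nonverbal Behaviors (対話者の非言語行動のマルチモーダル相乗作用解明のための機能スペクトラム解析)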
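Project source
Principal investigator
Funded institution
Approval year
Approval date
Project number
Project level
Research period
Funding amount
Discipline
Discipline code
Fund category
Keywords
Participants
Participating institutions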
1. Disentangling Perceptual Ambiguity in Multifunctional Nonverbal Behaviors in Conversations via Tensor Spectrum Decomposition
- Keywords:
- Behavioral research;Decomposition;Factorization;Human computer interaction;Human engineering;Matrix algebra;Tensors;Base matrix;Conversation;Functionals;Head movement;Head movements;Label aggregation;Non-verbal behaviours;Nonverbal behavior;Tensor factorization
- Tamura, Issa;Tajima, Momoka;Kumano, Shiro;Otsuka, Kazuhiro
- 《27th International Conference on Multimodal Interaction, ICMI 2025》
- 2025
- October 13, 2025 - October 17, 2025
- Canberra, ACT, Australia
- Conference
A framework named perceptual functional spectrum analysis (pFSA) for analyzing how people perceive the multifunctional nonverbal behaviors that emerge in conversations is proposed. The goal is to elucidate the intrinsic nonverbal properties, called functional multiplicity and interpretational ambiguity, in a separable way. The former property is that a single behavior could imply multiple meanings, and the latter is that different observers could interpret the same behaviors differently. In the pFSA framework, the labels of multiple raters across multiple functions over time are represented as a third-order tensor. This study then formulated a semiorthogonal nonnegative tensor factorization (SO-NTF) that approximates the input tensor as a linear combination of the functional basis matrix, perceptual basis matrices, and perceptual coefficient matrices. The functional basis matrix consists of functional spectra that represent fundamental functionalities in conversations. The perceptual basis matrices represent the perceptual tendencies, which consist of the sensitivities of the raters to the fundamental functionalities. The perceptual coefficient matrices represent the temporal activations of the perceptual tendencies. The pFSA framework constructs the perceptual basis matrices to characterize both label reliability and diversity. This study targeted 32 head movement functions labeled by ten raters. The experimental results confirmed that pFSA could successfully analyze the levels of ambiguity for multiple functionalities, such as low ambiguity for addressing and backchannel functions and high ambiguity for thinking functions. © 2025 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
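The input representation described in this abstract (labels of multiple raters across multiple functions over time, arranged as a third-order tensor) can be sketched as follows. This is only an illustration of the data layout, not the SO-NTF decomposition itself. The function names, the axis order (raters, functions, frames), the binary labels, and the `p * (1 - p)` disagreement score are all assumptions made for this sketch and do not come from the paper.

```python
import numpy as np

def build_label_tensor(labels, n_raters, n_functions, n_frames):
    """Build a (raters x functions x frames) tensor from positive-label triples.

    `labels` is an iterable of (rater, function, frame) index triples.
    The axis order and binary encoding are assumptions of this sketch.
    """
    tensor = np.zeros((n_raters, n_functions, n_frames), dtype=float)
    for rater, function, frame in labels:
        tensor[rater, function, frame] = 1.0
    return tensor

def disagreement_per_function(tensor):
    """Naive ambiguity proxy: time-average of p * (1 - p) for each function,
    where p is the fraction of raters who marked the function at a frame.
    This is a stand-in for illustration, not the pFSA ambiguity measure.
    """
    p = tensor.mean(axis=0)              # shape: (functions, frames)
    return (p * (1.0 - p)).mean(axis=1)  # shape: (functions,)

# Toy data: 3 raters, 2 functions, 2 frames.
toy_labels = [
    (0, 0, 0), (1, 0, 0), (2, 0, 0),  # all raters mark function 0 at frame 0
    (0, 1, 1),                        # only rater 0 marks function 1 at frame 1
]
tensor = build_label_tensor(toy_labels, n_raters=3, n_functions=2, n_frames=2)
scores = disagreement_per_function(tensor)
```

In this toy case the unanimous function gets a score of zero, while the function marked by only one of three raters gets a positive score, which loosely mirrors the low-versus-high ambiguity contrast reported in the abstract.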
2. Analyzing Multimodal Multifunctional Interactions in Multiparty Conversations via Functional Spectrum Factorization
- Keywords:
- Beam plasma interactions;Behavioral research;Human computer interaction;Human engineering;Interactive computer systems;Matrix algebra;Spectrum analysis;Functionals;Group conversation;Interaction;Multi-modal;Multifunctionals;Multimodal nonverbal behavior;Non-verbal behaviours;Nonnegative matrix factorization;Spectra analysis;Spectra's
- Tajima, Momoka;Tamura, Issa;Otsuka, Kazuhiro
- 《27th International Conference on Multimodal Interaction, ICMI 2025》
- 2025
- October 13, 2025 - October 17, 2025
- Canberra, ACT, Australia
- Conference
An analytic framework named interactional functional spectrum analysis (iFSA) is proposed to reveal how people interact with each other via multimodal nonverbal behaviors in multiparty conversations, focusing on their interactional functional aspects. Based on the representation called the functional spectrum, which is the distribution of perceptual intensities over multiple functions of nonverbal behaviors, this study extends this approach to analyze multiparty multimodal multifunctional interactions. More specifically, the iFSA introduces three key extensions: i) nonverbal modalities consisting of facial expressions, head movements, and gaze behaviors; ii) group-level interactions consisting of a speaker, addressee, and other listeners; and iii) temporal spectrum pooling to account for reaction time. From the multiparty multimodal functional spectra, the iFSA conducts spectrum decomposition via semiorthogonal nonnegative matrix factorization (SO-NMF), which approximates the input spectra as the product of a basis matrix called the interactional functional basis and a coefficient matrix called the interactional functional spectrum. The former represents fundamental patterns of multimodal interactions, and the latter indicates the temporal activation of each basis vector, i.e., each interaction pattern. The experiments targeting four-party conversation data revealed several essential interactions, such as the speaker’s full-modal addressing response with attentive listening by the addressee and other listeners. © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
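The abstract mentions temporal spectrum pooling to account for reaction time but does not say how it is computed. The sketch below shows one possible reading: a forward-looking sliding-window maximum, so that a speaker's behavior at one frame and a listener's delayed reaction a few frames later end up in the same pooled frame. The max operator, the forward window, and the name `temporal_max_pool` are assumptions of this sketch, not details confirmed by the paper.

```python
import numpy as np

def temporal_max_pool(spectra, window):
    """Pool each frame with the frames that follow it.

    spectra: array of shape (frames, dims), nonnegative intensities.
    window:  number of frames pooled together (hypothetical parameter).
    Frame t becomes the element-wise max over frames t .. t + window - 1,
    with the window clipped at the end of the sequence.
    """
    pooled = np.empty_like(spectra)
    for t in range(spectra.shape[0]):
        pooled[t] = spectra[t:t + window].max(axis=0)
    return pooled

# Toy data: column 0 is a speaker function active at frame 0,
# column 1 is a listener function that reacts two frames later.
spectra = np.array([
    [1.0, 0.0],
    [0.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
])
pooled = temporal_max_pool(spectra, window=3)
```

With a window of three frames, the pooled first frame contains both the speaker's behavior and the listener's delayed reaction, so a later factorization step can treat them as co-occurring.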
3. Exploring Interlocutor Gaze Interactions in Conversations based on Functional Spectrum Analysis
- Keywords:
- Convolutional neural networks;Matrix algebra;Non-negative matrix factorization;Spectrum analyzers;Speech analysis;Convolutional neural network;Functional basis;Functionals;Gaze interaction;Multi-modal;Multimodal nonverbal behavior;Non-verbal behaviours;Nonnegative matrix factorization;Spectra analysis;Spectra's
- Tashiro, Ayane;Imamura, Mai;Kumano, Shiro;Otsuka, Kazuhiro
- 《26th International Conference on Multimodal Interaction, ICMI 2024》
- 2024
- November 4, 2024 - November 8, 2024
- San Jose, Costa Rica
- Conference
A novel framework named gaze interactional functional spectrum analysis (GI-FSA) is proposed to explore the functional aspects of gaze interactions among interlocutors in conversations. It aims to reveal the primary and distinctive interactional functionalities that emerge via the gaze behaviors of the speaker and the listener at whom the speaker looks. To capture the intrinsic nature of gaze functions, such as multiple functionalities and ambiguity, this study introduces a novel representation called a gaze functional spectrum representing the distribution of perceptual intensity of multiple gaze functions and presents a gaze functional spectrum corpus that targets 43 gaze functions covering various speech-related, listening-related and other functions. Then, semiorthogonal nonnegative matrix factorization (SO-NMF) is employed to decompose the concatenated speaker-listener functional spectra into an interactional functional spectrum in a lower-dimensional functional space spanned with functional bases, each of which represents a distinct aspect of interactional functionalities. Targeting four female conversations, the GI-FSA revealed interpretable functional bases such as addressing-listening and joint positive emotion. In addition, this paper proposes convolutional neural networks (CNNs) that can recognize the binary level of the interactional functional spectrum from observable multimodal nonverbal behaviors, including head pose, utterance status, eyeball direction and facial expressions. These experimental findings validate the potential of the GI-FSA as a promising framework for analyzing gaze interactions among interlocutors and understanding communication dynamics. © 2024 Copyright held by the owner/author(s).
4. Exploring Multimodal Nonverbal Functional Features for Predicting the Subjective Impressions of Interlocutors
- Keywords:
- Facial expression;feature selection;group meeting;head movement;multimodal recognition;nonverbal communication;social signal;subjective impression;FEATURE-SELECTION;CONVERSATION;PERSONALITY;JAPANESE;GAZE
- Ito, Koya;Ishii, Yoko;Ishii, Ryo;Eitoku, Shin-Ichiro;Otsuka, Kazuhiro
- 《IEEE ACCESS》
- 2024
- Volume 12
- Issue
- Journal
This paper proposes models for predicting the subjective impressions of interlocutors in discussions according to multimodal nonverbal behaviors. To that end, we focus mainly on the functional aspects of head movement and facial expressions as insightful cues. For example, head movement functions include the speaker's rhythm and the listener's back channel and thinking processes, as well as their positive emotions. Facial expression functions include emotional expressions and communicative functions such as the speaker addressing the listener and the listener's affirmation. In addition, our model employs synergetic functions, which are jointly performed with head movements and facial expressions, assuming that the simultaneous appearance of head and face functions could strengthen the results or lead to multiplexing. On the basis of these nonverbal functions, we define a set of functional features, including the rate of occurrence and composition balance among different functions that emerge during conversation. Then, a feature selection scheme is used to identify the best combinations of intermodal and intramodal features. In the experiments, an SA-Off corpus of 17 groups of discussions involving 4 female participants was used, including interlocutors' self-reported scores for 16 impression items felt during the discussion, such as helpfulness and interest. The experiments confirmed that our models' predictions were significantly correlated with the self-reported scores for more than 70% of the impression items. These results indicate the effectiveness of multimodal nonverbal functional features for predicting subjective impressions.
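The two kinds of functional features named in this abstract, the rate of occurrence and the composition balance among functions, could be computed along the following lines. The exact definitions used in the paper are not given here, so the formulas below (fraction of active frames, and each function's share of all active cells) are assumptions, as are the function names and the boolean frame-by-function input layout.

```python
import numpy as np

def occurrence_rates(active):
    """Fraction of frames in which each function is active.

    active: boolean array of shape (frames, functions); this layout
    is an assumption of the sketch.
    """
    return active.mean(axis=0)

def composition_balance(active):
    """Share of each function among all active (frame, function) cells.

    Returns all zeros when no function is ever active.
    """
    counts = active.sum(axis=0).astype(float)
    total = counts.sum()
    if total == 0:
        return np.zeros_like(counts)
    return counts / total

# Toy data: 4 frames, 3 hypothetical functions.
active = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
], dtype=bool)
rates = occurrence_rates(active)
balance = composition_balance(active)
```

Feature vectors of this kind, computed per participant and per modality, would then be the candidates that a feature selection scheme chooses among when predicting each impression item.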
5. Synergistic Functional Spectrum Analysis: A Framework for Exploring the Multifunctional Interplay Among Multimodal Nonverbal Behaviours in Conversations
- Keywords:
- Convolution;Matrix algebra;Non-negative matrix factorization;Regression analysis;Spectrum analyzers;Vector spaces;Conversation;Convolutional neural network;Functionals;Multi-modal;Multifunctionals;Multimodal nonverbal behavior;Non-verbal behaviours;Nonnegative matrix factorization;Spectra analysis;Spectra's
- Imamura, Mai;Tashiro, Ayane;Kumano, Shiro;Otsuka, Kazuhiro
- 《IEEE Transactions on Affective Computing》
- 2024
- Volume
- Issue
- Journal
A novel framework named the synergistic functional spectrum analysis (sFSA) is proposed to explore the multifunctional interplay among multimodal nonverbal behaviours in human conversations. This study aims to reveal how multimodal nonverbal behaviours cooperatively perform communicative functions in conversations. To capture the intrinsic nature of nonverbal expressions, namely functional multiplicity and interpretational ambiguity (e.g., a single head nod could imply listening, agreeing, or both), a novel concept named the functional spectrum, which is defined as the distribution of perceptual intensities of multiple functions by multiple observers, is introduced in the sFSA. Based on this concept, this paper presents functional spectrum corpora, which target 44 facial expression and 32 head movement functions. Then, spectrum decomposition is conducted to reduce the multimodal functional spectrum to a synergetic functional spectrum in a lower-dimensional functional space that is spanned by functional basis vectors representing primary and distinctive functionalities across multiple modalities. To that end, we propose a semiorthogonal nonnegative matrix factorization (SO-NMF) method, which assumes the additivity of multiple functions and aims to balance the distinctiveness and expressiveness of the factorization. The results confirm that some primary functional bases can be identified, which can be interpreted as the listener’s backchannel, thinking, and affirmative response functions, the speaker’s thinking and addressing functions, and their positive emotion functions. In addition, regression models based on convolutional neural networks (CNNs) are presented to estimate the synergistic functional spectrum from the head poses and facial action units measured from conversation data. The results of these analyses and experiments confirm the potential of the sFSA and may lead to future extensions. © 2010-2012 IEEE.
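The idea of a nonnegative factorization that trades off reconstruction quality (expressiveness) against near-orthogonal basis vectors (distinctiveness) can be illustrated with a small NumPy sketch. This is not the authors' SO-NMF formulation, which the abstract does not spell out. It assumes the objective `||X - WH||^2 + ortho_weight * ||W^T W - I||^2` with a soft penalty on the basis matrix `W`, and uses heuristic multiplicative updates. The function name `so_nmf_sketch`, the parameter `ortho_weight`, and the choice to penalize `W` rather than `H` are all assumptions.

```python
import numpy as np

def so_nmf_sketch(X, rank, ortho_weight=0.1, n_iter=200, seed=0, eps=1e-9):
    """Nonnegative factorization X ~ W @ H with a soft orthogonality
    penalty on the columns of W (assumed objective, see lead-in).

    Returns W, H, and the reconstruction error before and after each update.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    errors = [np.linalg.norm(X - W @ H)]  # error at random initialization
    for _ in range(n_iter):
        # Standard multiplicative update for the coefficient matrix.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Multiplicative update for the basis; the ortho_weight terms come
        # from splitting the penalty gradient into negative and positive parts.
        numerator = X @ H.T + 2.0 * ortho_weight * W
        denominator = W @ (H @ H.T) + 2.0 * ortho_weight * W @ (W.T @ W) + eps
        W *= numerator / denominator
        errors.append(np.linalg.norm(X - W @ H))
    return W, H, errors

# Toy spectra: 4 functions x 4 frames, generated from two disjoint bases.
W_true = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
H_true = np.array([[1.0, 0.0, 1.0, 0.5], [0.0, 1.0, 1.0, 0.5]])
X = W_true @ H_true
W, H, errors = so_nmf_sketch(X, rank=2)
```

Raising `ortho_weight` pushes the basis vectors toward non-overlapping sets of functions at some cost in reconstruction accuracy, which is one way to read the balance between distinctiveness and expressiveness mentioned in the abstract.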
