Research on an Intelligent Detection and Evaluation System for Physiological Function Based on Geometric Algebra
Project source
Principal investigator
Funded institution
Approval year
Approval date
Project number
Research period
Project level
Funding amount
Discipline
Discipline code
Fund category
Keywords
Participants
Participating institutions
Funded province
Project final report (full text)
1. Dual Knowledge-Aware Guidance for Source-Free Domain Adaptive Fundus Image Segmentation
- Keywords:
- Balancing;Calibration;Domain Knowledge;Knowledge management;Knowledge transfer;Semantics;Boundary information;Domain adaptation;Domain-invariant knowledge;Domain-specific knowledge;Fundus image;Images segmentations;Pseudo-label calibration;Source models;Source-free domain adaptation;Target domain
- Chen, Yu;Wang, Hailing;Wu, Chunwei;Cao, Guitao
- 《28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025》
- 2026
- September 23, 2025 - September 27, 2025
- Daejeon, Republic of Korea
- Conference
Source-free domain adaptation (SFDA), where only a pre-trained source model is available to adapt to the target domain, has gained widespread application in the medical field. Most existing methods overlook low-quality pseudo-labels, i.e., pseudo-labels with boundary semantic confusion, when learning target domain-specific knowledge, leading to the loss of crucial boundary information. Furthermore, focusing solely on the specific knowledge can drive the model to shift in an uncontrollable direction, resulting in model degradation. To address these issues, we propose Dual Knowledge-aware Guidance (DKG), a novel SFDA method that integrates domain-specific knowledge with domain-invariant knowledge to improve transfer performance. Specifically, a pseudo-label calibration scheme is proposed to reduce semantic bias in high-uncertainty pixels, preserving the boundary information of target domain-specific knowledge. To ensure stable training, we propose a domain-invariant knowledge-based loss strategy, leveraging a confidence-guided mechanism and a consistency constraint. Additionally, we introduce a dynamic balancing loss to address class imbalance. Extensive experiments on cross-domain fundus image segmentation show that DKG achieves state-of-the-art performance. Code is available at https://github.com/Hanshuqian/DKG. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
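The abstract above does not spell out how "high-uncertainty pixels" are identified. A common generic recipe in SFDA segmentation pipelines (not necessarily DKG's own calibration scheme) is to score each pixel by the normalized entropy of its predicted class distribution and exclude high-entropy pixels from the pseudo-labels; `threshold` and `ignore_index` below are illustrative values:

```python
import numpy as np

def uncertain_pixel_mask(probs, threshold=0.5):
    """Flag high-uncertainty pixels by the normalized entropy of the
    per-pixel class distribution (1.0 = uniform, 0.0 = one-hot).

    probs: (H, W, C) softmax probabilities; returns an (H, W) bool mask.
    """
    entropy = -(probs * np.log(probs + 1e-8)).sum(axis=-1)
    normalized = entropy / np.log(probs.shape[-1])
    return normalized > threshold

def pseudo_labels(probs, threshold=0.5, ignore_index=255):
    """Argmax pseudo-labels with high-uncertainty pixels marked ignored."""
    labels = probs.argmax(axis=-1)
    labels[uncertain_pixel_mask(probs, threshold)] = ignore_index
    return labels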
2. SS-Mixer: MLP-Based 3D Human Motion Prediction with Spatial-Spectral Attention
- Keywords:
- Complex networks;Convolution;Dynamics;Forecasting;Low pass filters;Mixer circuits;Mixers (machinery);Mixing;Motion capture;Motion estimation ;Network layers;Active motion;Convolutional networks;Graph convolutional network;Human motions;Mixing mechanisms;Motion generation;Motion prediction;Multilayers perceptrons;Spatial-spectral mixing mechanism;Spectral mixing
- Zhang, Jianhua;Zhong, Jianqi;Cao, Wenming
- 《6th Asia-Pacific Conference on Image Processing, Electronics and Computers, IPEC 2025》
- 2025
- May 16, 2025 - May 18, 2025
- Dalian, China
- Conference
Traditional graph convolutional network (GCN)-based methods for 3D human motion prediction have demonstrated great potential. However, these methods face two critical limitations: first, they require a large number of trainable parameters due to their complex network structure; second, they fail to differentiate between active motion regions and static regions, leading to suboptimal feature extraction. To address these issues, we propose Spatial-Spectral MLPs (SS-Mixer), a novel architecture designed to efficiently capture spatial and spectral features for human motion prediction. SS-Mixer introduces an attention-based segmentation mechanism to distinguish active motion regions from static regions, allowing the network to prioritize critical features. Furthermore, we decompose the input skeleton into multiple scales, modeling the dynamics of each part independently to enhance feature diversity. By incorporating a hybrid spatial-spectral mixing mechanism, SS-Mixer captures the diversity in motion sequences across both spatial and spectral domains, improving prediction performance. The integration of spectral decomposition into the mixing process addresses the low-pass filtering issue in GCNs, ensuring robust representation learning for dynamic motions. Extensive experiments on challenging benchmarks, including Human3.6M and 3DPW, demonstrate the superiority of SS-Mixer: it achieves outstanding 3D mean per joint position error (MPJPE) across these datasets, with significant improvements over state-of-the-art methods. These results highlight the effectiveness of SS-Mixer in balancing computational efficiency and predictive accuracy while addressing the limitations of existing GCN-based approaches. © 2025 IOS Press.
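MPJPE, the metric reported above, is the standard measure for this task: the mean Euclidean distance between predicted and ground-truth 3D joint positions. A minimal reference implementation:

```python
import numpy as np

def mpjpe(pred, target):
    """Mean per joint position error: average Euclidean distance between
    predicted and ground-truth 3D joints.

    pred, target: arrays of shape (..., num_joints, 3), e.g. in millimeters.
    """
    return np.linalg.norm(pred - target, axis=-1).mean()

# Toy example: 2 frames, 3 joints; prediction offset by 10 mm along x.
target = np.zeros((2, 3, 3))
pred = target.copy()
pred[..., 0] += 10.0
print(mpjpe(pred, target))  # 10.0
```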
3. Counterfactual Thinking Driven Emotion Regulation for Image Sentiment Recognition
- Keywords:
- Emotion Recognition;Feature Selection;Psychology computing;Quality control;Affective Computing;Counterfactuals;Effective tool;Emotion predictions;Emotion regulations;Generation tools;Psychological theory;Recognition methods;Region-based;Regulation networks
- Zhang, Xinyue;Wang, Zhaoxia;Wang, Hailing;Cao, Guitao
- 《34th International Joint Conference on Artificial Intelligence, IJCAI 2025》
- 2025
- August 16, 2025 - August 22, 2025
- Montreal, QC, Canada
- Conference
Image sentiment recognition (ISR) facilitates the practical application of affective computing on rapidly growing social platforms. Nowadays, region-based ISR methods that use affective regions to guide emotion prediction have gained significant attention. However, existing methods lack a causality-based mechanism to guide affective region generation and effective tools to quantitatively evaluate their quality. Inspired by the psychological theory of Emotion Regulation, we propose a counterfactual thinking driven emotion regulation network (CTERNet), which simulates the Emotion Regulation Theory by modeling the entire process of ISR based on human causality-driven mechanisms. Specifically, we first use multi-scale perception for feature extraction to simulate the stage of situation selection. Next, we combine situation modification, attentional deployment, and cognitive change into a counterfactual thinking based cognitive reappraisal module, which learns both affective regions (factual) and other potential affective regions (counterfactual). In the response modulation stage, we compare the factual and counterfactual outcomes to encourage the network to discover the most emotionally representative regions, thereby quantifying the quality of affective regions for ISR tasks. Experimental results demonstrate that our method outperforms or matches the state-of-the-art approaches, proving its effectiveness in addressing the key challenges of region-based ISR. © 2025 International Joint Conferences on Artificial Intelligence. All rights reserved.
4. C2BA: Cross-Domain Consistency and Bidirectional Alignment for Cross-Modal Domain-Incremental Learning
- Keywords:
- Computer vision;Domain Knowledge;Learning systems;Modal analysis;Cross-domain;Cross-modal;Cross-modal attention;Domain consistency;Domain-incremental learning;Global knowledge;Incremental learning;Language model;Modal domain;Vision-language model
- Huang, Weiyi;Xi, Xidong;Wang, Hailing;Cao, Guitao
- 《2025 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2025》
- 2025
- October 5, 2025 - October 8, 2025
- Vienna, Austria (hybrid)
- Conference
In cross-modal domain-incremental learning, the primary challenge lies in learning from varying data distributions while maintaining performance on prior domains. However, existing methods often overlook the importance of shared knowledge across domains, and the interaction between modalities remains insufficient. To address these issues, we propose Cross-Domain Consistency and Bidirectional Alignment (C2BA), a novel framework that enhances the model's generalization ability and improves cross-modal integration in vision-language models (VLMs) through two key components. We design a Cross-domain Global Consistency Constraint (CGCC) to stabilize domain-invariant representations during incremental training, preventing excessive shifts of shared distributions toward new domains. In addition, we design a Bidirectional Cross-Modal Attention (BCMA) module, which enables effective interaction between visual and textual features through a bidirectional attention mechanism, thereby reducing cross-modal discrepancies. Experiments on three benchmark datasets demonstrate that our method outperforms state-of-the-art exemplar-free and even exemplar-based approaches, achieving superior generalization and cross-modal interaction. © 2025 IEEE.
5. Refining Long-Term Predictions: Two-Stage Spatial-Temporal Feature Learning for 3D Human Motion Prediction
- Keywords:
- 3D skeleton;Auto-regressive;Feature learning;GCN;Human motion prediction;Human motions;Hybrid regressive mechanism;Long-term prediction;Motion prediction;Spatial-temporal features
- Cao, Wenming;Yang, Yixin;Zhong, Jianqi;Zhang, Yicha
- 《2025 IEEE International Conference on Big Data and Smart Computing, BigComp 2025》
- 2025
- February 9, 2025 - February 12, 2025
- Kota Kinabalu, Malaysia
- Conference
3D skeleton-based human motion prediction is critical for human-machine interaction but remains challenging. Recent RNN-based approaches achieve good performance but suffer from error accumulation due to their sequential prediction. To overcome this, we propose a Hybrid Regressive Network with Better Guesses Decision, combining autoregressive and non-autoregressive strategies to improve accuracy. The Better Guesses Decision unit enhances long-term forecasting through Better Guess Learning and Better Prediction Decision. Our Multimapping Parsing Unit maps motion sequences into geometric algebra and Euclidean spaces, providing comprehensive modeling of motion dependencies. Experiments on the Human3.6M dataset show that our method achieves state-of-the-art performance. © 2025 IEEE.
6. A Novel Framework for Inverse Problems: Fixed-Point Iteration Using Consistency Models
- Keywords:
- Wang, Xinke;Cao, Guitao;Wang, Hailing
- 《2025 International Joint Conference on Neural Networks, IJCNN 2025》
- 2025
- June 30, 2025 - July 5, 2025
- Rome, Italy
- Conference
Inverse problems play a crucial role in science and engineering, especially in the field of computer vision, where tasks such as deblurring, super-resolution, and colorization can be formally modeled as inverse problems. Consistency models excel in generation speed while maintaining high quality, making them a promising family of generative models. However, existing sampling methods struggle to achieve high-quality results when applying consistency models to image inverse problems. To address this limitation, we propose the Consistency Inverse Reconstruction Sampling (CIRS) framework, which incorporates two modes: CIRS-Hybrid and CIRS-Pure. In CIRS-Hybrid, the posterior formula of inverse problems is utilized by estimating the prior term using a diffusion denoiser and the likelihood term with a consistency model, enabling reconstruction under dual-model guidance. To overcome the complexities of dual-model tuning and inefficiencies caused by employing a diffusion denoiser, we introduce CIRS-Pure, which relies solely on a consistency model. By eliminating the iterative noise addition and denoising steps, the iterative procedure is transformed into a fixed-point iteration, achieving efficient and high-quality restoration. Extensive experiments demonstrate that CIRS-Pure outperforms state-of-the-art methods in zero-shot image restoration tasks such as image deblurring and colorization while achieving competitive performance in super-resolution. © 2025 IEEE.
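The abstract's CIRS-Pure mode casts restoration as a fixed-point iteration, i.e. repeatedly applying an update operator until the iterate stops changing. The paper's actual operator is not reproduced here; the underlying generic scheme is the classic x_{k+1} = g(x_k) loop, sketched on a scalar toy problem:

```python
import math

def fixed_point(g, x0, tol=1e-10, max_iter=1000):
    """Iterate x_{k+1} = g(x_k) until successive iterates agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Classic scalar example: the unique solution of x = cos(x), about 0.739085.
root = fixed_point(math.cos, 1.0)
print(round(root, 6))  # 0.739085
```

In CIRS-Pure the scalar map is replaced by an image-space operator built from a consistency model, but the convergence loop has the same shape.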
7. A Generic Autoregressive Predictive Feedback Framework for Skeleton-Based Action Recognition
- Keywords:
- Feedback;Action recognition;Auto-regressive;Autoregressive predictive;Global models;Long-term temporal dependence;Motion sequences;Ordering constraints;State-space models;Stationary components;Temporal dependence
- Yin, Xinpeng;Hu, Jing;Cao, Wenming
- 《17th Asian Conference on Computer Vision, ACCV 2024》
- 2025
- December 8, 2024 - December 12, 2024
- Hanoi, Vietnam
- Conference
Prior works in skeleton-based action recognition have struggled to overcome sequence order constraints while achieving comprehensive global modeling of temporal dependencies. Moreover, most focus on enhancing the model's fitting ability across different temporal scales, overlooking the non-stationary temporal characteristics inherent in motion sequences. In this paper, we explore the adaptation of state-space modeling (SSM), typically suited for stationary sequences, to motion sequences. Addressing the challenge posed by the trending nature of motion sequences and the stability requirement of SSM, we integrate SSM into a generalized Autoregressive Predictive Feedback (APF) framework. Our approach involves segmenting motion sequences into trend and stationary components. We introduce the Non-Independent Multi-channel Processing (NiMc-P) module to capture implicit relationships among 3D coordinates and propose the Independent Multi-joint SSM (IMj-S) module to model temporal dependencies within the stationary components. Throughout this process, state-space matrices drive the feedback mechanism. Experiments conducted on the NTU-RGB+D 60 and NTU-RGB+D 120 datasets demonstrate the efficiency and versatility of APF. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
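The abstract's split of motion sequences into trend and stationary components is not detailed; one minimal decomposition in that spirit (an assumption for illustration, not the paper's NiMc-P/IMj-S design) is a moving-average trend plus a residual:

```python
import numpy as np

def decompose(x, window=5):
    """Split a 1D sequence into a moving-average trend and a residual
    (approximately stationary) component. Edge padding keeps the trend
    the same length as the input; window should be odd."""
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    return trend, x - trend
```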
8. Self-supported Prototype Rectification for Few-shot Medical Image Segmentation
- Keywords:
- Electric rectifiers;Medical imaging;Structured Query Language;Few-shot learning;Intra class;Labeled images;Many to many;Medical image segmentation;Prototype rectification;Query images;Self-support;Semantic segmentation;Semantics Information
- Li, Zhaoxu;Wang, Hailing;Cao, Guitao
- 《2024 International Joint Conference on Neural Networks, IJCNN 2024》
- 2024
- June 30, 2024 - July 5, 2024
- Yokohama, Japan
- Conference
Few-shot semantic segmentation aims to quickly adapt pixel-wise prediction to novel classes with only a few labeled images. Recent works rely on prototypical learning, where prototypes obtained from support images are applied to the segmentation of query images. However, there are inherent intra-class appearance differences between support images and query images, and the prototypes extracted from a small number of support images contain limited deep semantic information, which makes it difficult to accurately guide the segmentation of query images. To alleviate this problem, we propose a Self-Supported Prototype Rectification Network. Specifically, we introduce a Pseudo Mask Generation (PMG) module to generate a pseudo query mask by means of many-to-many prototype matching. We design a Prototype Rectification (PR) module with a learnable parameter to balance the self-supported rectified prototype between the support prototype obtained from the support image and the query prototype extracted from query features with the pseudo query mask. Furthermore, we introduce a prototype-based multi-class segmentation approach to mitigate confusion-area prediction among different organs for query images in multi-organ segmentation scenarios. Our method outperforms other SOTAs on two widely used datasets: CHAOST2 and MS-CMR. © 2024 IEEE.
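Prototypical few-shot segmentation of the kind described above typically builds a class prototype by masked average pooling over the support features and scores query pixels by cosine similarity to it. The following generic sketch illustrates those two standard steps (not the paper's PMG/PR modules):

```python
import numpy as np

def masked_average_prototype(features, mask):
    """Masked average pooling: average the feature vectors of pixels
    inside a binary foreground mask to obtain a class prototype.

    features: (H, W, C) feature map; mask: (H, W) binary mask.
    """
    weights = mask[..., None]
    return (features * weights).sum(axis=(0, 1)) / (mask.sum() + 1e-8)

def cosine_score_map(features, prototype):
    """Per-pixel cosine similarity between query features and a prototype."""
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
    p = prototype / (np.linalg.norm(prototype) + 1e-8)
    return f @ p
```

Thresholding (or softmaxing over per-class) score maps then yields the segmentation of the query image.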
9. DLE: Document Illumination Correction with Dynamic Light Estimation
- Keywords:
- Image enhancement;Photodegradation;Photomasks;Adversarial networks;Background light;Document images;Down-stream;Illumination correction;Image degradation;Light estimations;Multi-modal;Natural environments;Subnetworks
- Quan, Jiahao;Wang, Hailing;Wu, Chunwei;Cao, Guitao
- 《2024 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2024》
- 2024
- October 6, 2024 - October 10, 2024
- Kuching, Malaysia
- Conference
Document images captured with mobile devices in natural environments are often affected by various types of illumination degradation. This degradation diminishes the clarity and readability of document images, thereby complicating their use in downstream OCR tasks. Existing methods typically address only one or a few degradation types and do not consider the diversity of image degradation. Additionally, they often rely on a pre-trained, fixed sub-network to estimate background light or shadows, which lacks flexibility and adaptability. To overcome these challenges, this study proposes a novel framework named DLE, which comprises a two-loop generative adversarial network and a multi-modal discriminator. Specifically, to improve the quality of image representation, a mask extractor is embedded before the image input generator. This forces the model to focus on the distinct features in the image, enhancing the representation of anomalously illuminated and degraded regions. The mask extractor generates a luminance mask to evaluate the difference in illumination between the input and target images. Subsequently, the consistency loss computation incorporates a dynamic optimization of the mask extractor, strengthening its ability to estimate the illumination-degraded parts. Moreover, a pre-trained vision-language model is introduced into the multi-modal discriminator, leveraging its robust cross-modal alignment capability to improve the semantic consistency of the generated images with the preset input text. Extensive experiments demonstrate that our approach achieves SOTA performance in terms of edit distance (ED) and character error rate (CER). © 2024 IEEE.
10. HWSformer: History Window Serialization Based Transformer for Semantic Enrichment Driven Stock Market Prediction
- Keywords:
- Commerce;Costs;Electronic trading;Financial markets;Marketplaces;Natural language processing systems;Prediction models;Semantics;Time series;Performance;Price index;Semantic enrichment;Stock index forecasting;Stock market prediction;Stock price;Stock price index forecasting;Time-series data;Transformer modeling;Transformer-based
- Hu, Yisheng;Cao, Guitao;Cheng, Dawei
- 《2024 International Joint Conference on Neural Networks, IJCNN 2024》
- 2024
- June 30, 2024 - July 5, 2024
- Yokohama, Japan
- Conference
After the Transformer model demonstrated excellent performance in natural language processing (NLP) and computer vision tasks, researchers began to explore Transformer models for time-series prediction. Because of the significant role of the stock market in the global economy, stock market prediction is of paramount importance for investors. Stock index forecasting is one branch of stock market forecasting, and researchers have also turned to the Transformer there. However, given the limited semantic information in time-series data and the characteristics of the self-attention mechanism, the Transformer model has not gained widespread adoption in stock index forecasting. In this paper, we propose a history window serialization based Transformer model (HWSformer) specifically designed for predicting stock price indices. Our innovation is to introduce a historical window serialization layer to address the limited semantic richness of time-series data, which undermines the effectiveness of self-attention. Additionally, to capture the original distribution accurately and retain valuable non-stationary information, we incorporate the Reversible Instance Normalization (RevIN) method. We conducted experiments on 12 stock price index datasets collected from multiple countries and demonstrated that HWSformer outperforms traditional Transformer models by approximately 20%, with varying degrees of improvement over other recent Transformer variants. © 2024 IEEE.
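RevIN, which the abstract incorporates, normalizes each input window by its own mean and standard deviation and re-applies the same statistics to the model's output, so the forecaster sees scale-free inputs while predictions return to the original scale. A minimal sketch that omits RevIN's optional learnable affine parameters:

```python
import numpy as np

class RevIN:
    """Reversible instance normalization: per-window standardization on the
    way in, inverted with the stored statistics on the way out."""

    def __init__(self, eps=1e-5):
        self.eps = eps

    def normalize(self, x):
        # x: (batch, length) windows of a univariate series.
        self.mean = x.mean(axis=-1, keepdims=True)
        self.std = x.std(axis=-1, keepdims=True)
        return (x - self.mean) / (self.std + self.eps)

    def denormalize(self, y):
        # Invert with the statistics saved by the matching normalize() call.
        return y * (self.std + self.eps) + self.mean
```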
