Research on an Intelligent Detection and Evaluation System for Physiological Function Based on Geometric Algebra
Project source
Principal investigator
Funded institution
Approval year
Approval date
Project number
Project level
Research period
Funding amount
Discipline
Discipline code
Fund category
Keywords
Participants
Participating institutions
Funded province
Project completion report (full text)
1. Dynamic prototype with discriminative representation for rapid adaptation in new organ segmentation
- Keywords:
- Image segmentation;Learning systems;Attention mechanisms;Discriminative representation;Few-shot segmentation;Medical domains;Organ segmentation;Prototype learning;Prototype-based learning;Rapid adaptation;Self-attention mechanism;Shot segmentation
- Wang, Hailing;Chen, Yu;Zhang, Xinyue;Cao, Guitao;Cao, Wenming
- Pattern Recognition
- 2026
- Vol. 173
- Issue
- Journal
Recent work in label-efficient prototype-based learning has demonstrated significant potential for rapid adaptation in new organ segmentation. However, a prevalent challenge in prototype extraction within the medical domain is semantic bias. To address this issue, we propose a Dynamic Prototype with Discriminative Representation Network (DPDRNet) to enhance the effectiveness of semantic class prototypes for new organs. Specifically, we introduce a self-attention mechanism to generate dynamic prototypes, enhancing the efficient utilization of local information. This is accomplished by capturing interdependencies among pixel-level prototypes from limited labeled samples. Subsequently, we design a prototype contrastive learning method to maintain the discriminative representation of the dynamic prototypes in the high-level feature space. This method enhances the correlation between dynamic prototypes and foreground features while simultaneously increasing their distinction from background features. By incorporating a self-attention mechanism with contrastive learning, the proposed dynamic prototypes exhibit enhanced generalization capabilities, facilitating more precise segmentation of new organ structures. Experimental results demonstrate that our method achieves effective performance on Cardiac and Abdominal MRI segmentation tasks. © 2025 Elsevier Ltd
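The prototype pipeline the abstract builds on, averaging support features inside the labeled mask and labeling query pixels by similarity to the resulting prototype, can be sketched as below. This is a minimal illustration of generic prototype-based few-shot segmentation, not the DPDRNet architecture; the function names and the cosine threshold are assumptions for illustration.

```python
import math

def masked_average_prototype(features, mask):
    # Average the feature vectors of the pixels inside the support mask.
    selected = [f for f, m in zip(features, mask) if m == 1]
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def segment(query_features, prototype, threshold=0.5):
    # Label a query pixel as foreground when it is close to the prototype.
    return [1 if cosine(f, prototype) > threshold else 0 for f in query_features]
```

The dynamic-prototype idea in the paper would refine this static average with self-attention over the pixel-level prototypes before the comparison step.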
2. Dual-decoder collaborative learning with multi-hybrid view augmentation for self-supervised 3D action recognition
- Keywords:
- Skeleton-based action recognition; Self-supervised representation learning; Contrastive learning; Masked autoencoder; Masked skeleton modeling
- Cao, Wenming;Wu, Yingfei;Yin, Xinpeng
- Pattern Recognition
- 2026
- Vol. 172
- Issue
- Journal
Self-supervised methods, including contrastive learning and masked skeleton modeling, have demonstrated considerable potential in the field of skeleton-based action recognition. While contrastive learning captures fine-grained details at the instance level, masked skeleton modeling emphasizes joint-level features. Recent studies have begun to combine these two approaches. However, existing combination methods primarily focus on integrating the tasks within the skeleton space. Moreover, existing contrastive learning methods often fail to exploit the comprehensive interaction information in skeletal structures, resulting in suboptimal performance when recognizing actions involving multiple individuals. To overcome these limitations, we introduce the Dual-Decoder Collaborative Learning (DDC) with Multi-Hybrid View Augmentation (MHGNA) method, which connects these two tasks across multiple spaces. Specifically, the masked skeleton modeling task provides diverse views for the contrastive learning task in the skeleton space, while the contrastive method aligns the features generated by both tasks within the feature space. We further present an innovative view augmentation method that enhances the model's capacity to understand human interaction relationships by shuffling and replacing data across temporal, spatial, and personal dimensions. Extensive experiments on four downstream tasks across three large-scale datasets demonstrate that DDC exhibits stronger representational capabilities compared to state-of-the-art methods. Our code is available at https://github.com/Yingfei-Wu/DDC.
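The contrastive side of such methods is commonly realized with the InfoNCE objective: the anchor view should score higher against its positive view than against the negatives. Below is a generic single-anchor sketch, not DDC's actual loss; the temperature value and cosine similarity are common but assumed choices.

```python
import math

def info_nce(anchor, candidates, pos_index, temperature=0.1):
    # Single-anchor InfoNCE: -log softmax(sim(anchor, positive) / T)
    # over all candidate views (the positive plus the negatives).
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))
    logits = [cos(anchor, c) / temperature for c in candidates]
    m = max(logits)  # stabilise the softmax numerically
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[pos_index]
```

When the anchor matches its positive view, the loss is near zero; a mismatched positive drives it up, which is what pushes the two tasks' features to align in the shared space.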
3. Dual Knowledge-Aware Guidance for Source-Free Domain Adaptive Fundus Image Segmentation
- Keywords:
- Balancing;Calibration;Domain Knowledge;Knowledge management;Knowledge transfer;Semantics;Boundary information;Domain adaptation;Domain-invariant knowledge;Domain-specific knowledge;Fundus image;Images segmentations;Pseudo-label calibration;Source models;Source-free domain adaptation;Target domain
- Chen, Yu;Wang, Hailing;Wu, Chunwei;Cao, Guitao
- 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
- 2026
- September 23, 2025 - September 27, 2025
- Daejeon, Korea, Republic of
- Conference
Source-free domain adaptation (SFDA), where only a pre-trained source model is available to adapt to the target domain, has gained widespread application in the medical field. Most existing methods overlook low-quality pseudo-labels, i.e., pseudo-labels with boundary semantic confusion, when learning target domain-specific knowledge, leading to the loss of crucial boundary information. Furthermore, focusing solely on this specific knowledge can drive the model to shift in an uncontrollable direction, resulting in model degradation. To address these issues, we propose Dual Knowledge-aware Guidance (DKG), a novel SFDA method that integrates domain-specific knowledge with domain-invariant knowledge to improve transfer performance. Specifically, a pseudo-label calibration scheme is proposed to reduce semantic bias in high-uncertainty pixels, preserving the boundary information of target domain-specific knowledge. To ensure stable training, we propose a domain-invariant knowledge-based loss strategy, leveraging a confidence-guided mechanism and a consistency constraint. Additionally, we introduce a dynamic balancing loss to address class imbalance. Extensive experiments on cross-domain fundus image segmentation show that DKG achieves state-of-the-art performance. Code is available at https://github.com/Hanshuqian/DKG. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
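A common way to realize a confidence-guided pseudo-label scheme like the one described is to rank pixels by predictive entropy and route only the most uncertain ones to a calibration step. The sketch below illustrates that generic idea; the `ratio` cutoff and function names are hypothetical, not taken from DKG.

```python
import math

def entropy(probs):
    # Shannon entropy of one pixel's predicted class distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def calibration_mask(prob_maps, ratio=0.2):
    # Flag the top-`ratio` highest-entropy pixels as uncertain; only those
    # would be re-labelled by a calibration step, while the remaining
    # pixels keep their argmax pseudo-label.
    ents = [entropy(p) for p in prob_maps]
    k = max(1, int(len(ents) * ratio))
    cutoff = sorted(ents, reverse=True)[k - 1]
    return [e >= cutoff for e in ents]
```

High-entropy pixels concentrate at organ boundaries, which is why targeting them preserves the boundary information the abstract emphasizes.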
4. Spectral–spatial representation progressive learning via segmented attention for 3D skeleton-based motion prediction
- Keywords:
- Algebra;Arts computing;Bone;Extraction;Motion estimation;Spectrum analysis;Three dimensional computer graphics;3D skeleton;Feature information;Motion generation;Motion prediction;Progressive learning;Recombination factors;Soft attention;Spatial representations;Spectra's;Spectral–spatial representation
- Cao, Wenming;Zhang, Jianhua;Zhong, Jianqi
- Applied Soft Computing
- 2025
- Vol. 184
- Issue
- Journal
Recently, GCN-based methods have demonstrated impressive performance in human behavior prediction tasks. We believe that human motion modeling can be explained as motion correlation extraction from the combined analysis of active and static motion parts. However, existing methodologies fail to address the issue that feature information associated with static regions may overshadow feature information from dynamic regions, ultimately affecting the extraction of network features. Moreover, the unique low-pass feature pre-retention processing mechanism of GCNs on the spectrum causes the pose in some sequences to remain unchanged during the prediction process, further hurting the prediction. In this paper, we propose a Spectral–Spatial Representation Progressive Learning network to solve the problems above. First, we propose a segmented attention block that compares the input observation sequence with a static contrast standard to obtain the motion region and the rest region. Then, we design the Spectrum Deconstruction Recombination Factor (SDRF) block to extract the global bandpass spectrum of human bone joints. The joint features of different regions are encoded by graph convolution and by high-frequency feature filter coding based on geometric algebra. Specifically, a spectral–spatial interaction block is presented in each SDRF, focusing on the diversity of the motion sequence in the frequency domain and the spatial domain map, and realizing fine extraction of historical pose sequence features at both the spatial and spectral levels. Experimental results demonstrate that our approach outperforms state-of-the-art algorithms by 2.4%, 5.3% and 4.7% in terms of 3D mean per joint position error on the Human3.6M, CMU Mocap and 3DPW datasets, respectively. © 2025
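The low-pass behavior of GCNs that the abstract discusses can be made concrete with the graph Laplacian: applying L = D - A to a per-joint signal extracts its high-frequency (neighbor-deviating) component, which vanishes for signals that are smooth over the skeleton graph. This is a minimal sketch of generic spectral graph filtering, not the paper's SDRF block.

```python
def graph_laplacian(adj):
    # Combinatorial Laplacian L = D - A of an undirected joint graph.
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[(deg[i] if i == j else 0) - adj[i][j] for j in range(n)] for i in range(n)]

def high_frequency(adj, signal):
    # L @ x gives each node's deviation from its neighbourhood: a smooth
    # (low-frequency) signal over the graph maps to (near-)zero, so a GCN
    # that suppresses this component tends to freeze the predicted pose.
    L = graph_laplacian(adj)
    n = len(signal)
    return [sum(L[i][j] * signal[j] for j in range(n)) for i in range(n)]
```

On a three-joint chain, a constant signal yields a zero high-frequency component, while an oscillating signal survives the filter, which is the distinction a bandpass design exploits.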
5. SS-Mixer: MLP-Based 3D Human Motion Prediction with Spatial-Spectral Attention
- Keywords:
- Complex networks;Convolution;Dynamics;Forecasting;Low pass filters;Mixer circuits;Mixers (machinery);Mixing;Motion capture;Motion estimation;Network layers;Active motion;Convolutional networks;Graph convolutional network;Human motions;Mixing mechanisms;Motion generation;Motion prediction;Multilayer perceptrons;Spatial-spectral mixing mechanism;Spectral mixing
- Zhang, Jianhua;Zhong, Jianqi;Cao, Wenming
- 6th Asia-Pacific Conference on Image Processing, Electronics and Computers, IPEC 2025
- 2025
- May 16, 2025 - May 18, 2025
- Dalian, China
- Conference
Traditional graph convolutional network (GCN)-based methods for 3D human motion prediction have demonstrated great potential. However, these methods face two critical limitations: first, they require a large number of trainable parameters due to their complex network structure; second, they fail to differentiate between active motion regions and static regions, leading to suboptimal feature extraction. To address these issues, we propose Spatial-Spectral MLPs (SS-Mixer), a novel architecture designed to efficiently capture spatial and spectral features for human motion prediction. SS-Mixer introduces an attention-based segmentation mechanism to distinguish active motion regions from static regions, allowing the network to prioritize critical features. Furthermore, we decompose the input skeleton into multiple scales, modeling the dynamics of each part independently to enhance feature diversity. By incorporating a hybrid spatial-spectral mixing mechanism, SS-Mixer captures the diversity in motion sequences across both spatial and spectral domains, improving prediction performance. The integration of spectral decomposition into the mixing process addresses the low-pass filtering issue in GCNs, ensuring robust representation learning for dynamic motions. Extensive experiments on challenging datasets, including Human3.6M and 3DPW, demonstrate the superiority of SS-Mixer: our model achieves strong performance in terms of 3D mean per joint position error (MPJPE), with significant improvements over state-of-the-art methods. These results highlight the effectiveness of SS-Mixer in balancing computational efficiency and predictive accuracy while addressing the limitations of existing GCN-based approaches. © 2025 IOS Press.
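The MLP-based mixing idea rests on Mixer-style layers: one matrix multiplication mixes information across joints (spatial tokens), another across channels. A minimal sketch under that assumption follows; it is not the actual SS-Mixer layer, which additionally includes spectral decomposition and attention-based segmentation.

```python
def matmul(A, B):
    # Plain dense matrix product on lists of lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mixer_layer(x, w_token, w_channel):
    # One Mixer-style layer on a joints x channels matrix: left-multiplying
    # mixes information across joints (spatial), right-multiplying mixes
    # across channels -- no convolution or attention required.
    return matmul(matmul(w_token, x), w_channel)
```

The appeal over GCNs is parameter economy: the two small weight matrices replace a stack of graph convolutions while still letting every joint interact with every other joint.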
6. Progressively deeper attention networks for 3D human motion prediction
- Keywords:
- Human motion prediction; Transformer; GCNs; Motion dependencies learning
- Huang, Jiangtao;He, Dong;Cao, Wenming;Zhong, Jianqi
- Multimedia Systems
- 2025
- Vol. 31
- Issue 5
- Journal
Human motion prediction is a significant challenge with broad applications in fields such as robotics, human-computer interaction, and healthcare. Despite the progress achieved by recent deep learning approaches, existing methods often struggle to effectively capture the complex spatial relationships and long-term temporal dependencies inherent in human motion. To address the issue, we propose the Progressive Deeper Attention Network (PDANet), which incorporates multiple GCN-Attention modules of varying depths. This architecture enables the model to extract more comprehensive information from sequential data. Additionally, we enhance the model's performance through two key improvements: (1) the introduction of joint-relative velocity and temporally perturbed features to distinguish complex motion semantics between dynamic and static joints; and (2) the design of a Multi-Dimensional Joint Fusion (MDJF) module, which employs the Gumbel Softmax method to dynamically learn the optimal fusion strategy for multi-semantic sequences. Extensive experiments demonstrate the effectiveness of our model. The proposed approach outperforms state-of-the-art methods by 2.8%, 4.7%, and 18.8% in terms of MPJPE for human motion prediction on the Human3.6M, AMASS, and 3DPW datasets, respectively.
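The Gumbel Softmax method mentioned for the MDJF module samples near-one-hot fusion weights while remaining compatible with gradient-based training. Below is a minimal standalone sketch of the trick itself; the temperature value and usage are illustrative, not PDANet's settings.

```python
import math, random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    # Perturb the logits with Gumbel(0, 1) noise, then apply a tempered
    # softmax: low temperatures push the output toward a one-hot vector
    # while keeping the operation differentiable with respect to logits.
    eps = 1e-12  # guard against log(0) when rng.random() returns 0.0
    noise = [-math.log(-math.log(max(rng.random(), eps))) for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, noise)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a fusion module, the returned weights would select which semantic sequence (e.g., positional, velocity, or perturbed features) dominates the combined representation.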
7. FLCL: Feature-Level Contrastive Learning for Few-Shot Image Classification
- Keywords:
- Contrastive learning; Few shot learning; Training; Feature extraction; Measurement; Metalearning; Data augmentation; Vectors; Data models; Adaptation models; few-shot learning; data augmentation; image classification
- Cao, Wenming;Zeng, Jiewen;Liu, Qifan
- IEEE Transactions on Emerging Topics in Computing
- 2025
- Vol. 13
- Issue 3
- Journal
Few-shot classification is the task of recognizing unseen classes using a limited number of samples. In this paper, we propose a new contrastive learning method called Feature-Level Contrastive Learning (FLCL). FLCL conducts contrastive learning at the feature level and leverages the subtle relationships between positive and negative samples to achieve more effective classification. Additionally, we address the challenges of requiring a large number of negative samples and the difficulty of selecting high-quality negative samples in traditional contrastive learning methods. For feature learning, we design a Feature Enhancement Coding (FEC) module to analyze the interactions and correlations between nonlinear features, enhancing the quality of feature representations. In the metric stage, we propose a centered hypersphere projection metric to map feature vectors onto the hypersphere, improving the comparison between the support and query sets. Experimental results on four few-shot classification benchmark datasets demonstrate that our method, while simple in design, outperforms previous methods and achieves state-of-the-art performance. A detailed ablation study further confirms the effectiveness of each component of our model.
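The centered hypersphere projection metric can be pictured as shifting each feature by a center estimate, scaling it onto the unit sphere, and comparing support and query features by angular distance. This is a minimal sketch of that geometric idea; the choice of center here is an assumption, not FLCL's exact formulation.

```python
import math

def project_to_hypersphere(v, center):
    # Shift the feature by the center estimate, then scale it to unit norm
    # so that all comparisons happen on the surface of the hypersphere.
    shifted = [a - c for a, c in zip(v, center)]
    norm = math.sqrt(sum(a * a for a in shifted)) or 1.0
    return [a / norm for a in shifted]

def sphere_distance(u, v):
    # Geodesic (angular) distance between two unit vectors.
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.acos(dot)
```

Projecting onto the sphere removes magnitude differences between support and query embeddings, so the metric responds only to direction, which tends to stabilize comparisons when samples are scarce.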
8. Linearformer: Tri-Net Multi-Layer DVF Medical Image Registration
- Keywords:
- Angiography;Deep neural networks;Electroencephalography;Functional neuroimaging;Image registration;Linearization;Mammography;Multilayer neural networks;Transillumination;Accurate registration;Brain MRI;Convolutional neural network;Deep learning;Deformable medical image registration;Linearformer;Medical image registration;Multi-layers;Similarity measure;Transformer modeling
- Anwar, Muhammad;Yan, Zhiyue;Cao, Wenming
- Expert Systems
- 2025
- Vol. 42
- Issue 7
- Journal
In medical imaging, accurate registration is crucial for reliable analysis. While transformer models demonstrate potential, their application to large datasets like OASIS is constrained by substantial memory requirements, quadratic complexity and the challenge of managing complex deformations. To overcome these challenges, Linearformer is introduced, an efficient transformer-based model with Linear-ProbSparse self-attention for optimised time and memory, along with TNM DVF, a Pyramid-based framework for unsupervised non-rigid registration. Evaluated on OASIS and LPBA40 brain MRI datasets, the model outperforms state-of-the-art methods in Dice score and Jacobian metrics, surpassing TransMatch by 0.6% and 1.9% on the two datasets while maintaining a comparable voxel folding percentage. © 2025 John Wiley & Sons Ltd.
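Linear-complexity attention variants such as the one named here generally exploit associativity: with a suitable kernel feature map, Q(KᵀV) equals (QKᵀ)V but costs O(n·d²) instead of O(n²·d) in sequence length n. The sketch below shows only this associativity idea with an identity kernel and no softmax; the actual Linear-ProbSparse self-attention is more involved.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def quadratic_attention(Q, K, V):
    # (Q K^T) V: materialises the n x n attention matrix, O(n^2 * d).
    return matmul(matmul(Q, transpose(K)), V)

def linear_attention(Q, K, V):
    # Q (K^T V): same result for a linear kernel, but the intermediate
    # K^T V is only d x d, giving O(n * d^2) time and memory.
    return matmul(Q, matmul(transpose(K), V))
```

For volumetric registration, where n grows with the number of voxels, this reordering is what makes transformer-style attention tractable on datasets like OASIS.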
9. Asymmetric Context-Guided Adaptive Alignment Network for Skeleton-Based Action Recognition
- Keywords:
- Skeleton; Image reconstruction; Transformers; Three-dimensional displays; Data models; Adaptation models; Feature extraction; Computational modeling; Solid modeling; Representation learning; Self-supervised learning; skeleton-based action recognition; masked modeling; alignment
- Cao, Wenming;Qian, Liangxi;Zhang, Yicha;Li, Xuelong;Yin, Xinpeng
- IEEE Transactions on Circuits and Systems for Video Technology
- 2025
- Vol. 35
- Issue 6
- Journal
In skeleton-based action recognition, self-supervised pre-training paradigms have been extensively investigated. In particular, masked-autoencoder-like methods based on masked target reconstruction, which are committed to choosing a better reconstruction target, have pushed pre-training performance to a new height. In this work, we propose an asymmetric context-guided adaptive alignment network (ACA(2)Net) for self-supervised skeleton-based action recognition, utilizing a transformer-based teacher encoder to guide the student encoder toward richer action contextual information. To tackle the misalignment arising from this asymmetry, we devise an adaptive alignment module to better align the student representations with the teacher's. Additionally, considering that the differential operation for temporal motion might lose prior information related to changes of direction, we propose a motion compass-aware masking strategy with a fusion prior supplemented by motion and direction intensity. Extensive experiments on the NTU-60, NTU-120, and PKU-MMD datasets demonstrate that our proposed ACA(2)Net outperforms previous MAE-like methods.
10. Progressive Feature Reconstruction Network for Zero-Shot Learning
- Keywords:
- Visualization; Semantics; Image reconstruction; Feature extraction; Zero-shot learning; Whales; Training; Vectors; Data models; Benchmark testing; Zero-shot learning; feature reconstruction; attribute information
- Hu, Linchun;Cao, Wenming;Zhang, Zhenqi;Liang, Yuchuang
- IEEE Transactions on Circuits and Systems for Video Technology
- 2025
- Vol. 35
- Issue 6
- Journal
Zero-shot learning (ZSL) aims to transfer the knowledge learned on the seen classes to the unseen classes through semantic knowledge. However, to ensure a model's versatility on different datasets, existing methods divide the image into blocks of the same size, resulting in the loss of information between attributes. More importantly, existing methods ignore that not every image contains all attributes corresponding to its class. In this paper, we propose a progressive feature reconstruction network, called PFRN. PFRN consists of an attribute relation sub-net and an attention-based feature reconstruction sub-net. Specifically, the attribute relation sub-net first adopts the attribute-related region module to obtain the attribute-related regions in the visual features, which are input to the attribute relation discovery module to find the relationships between attributes. The attention-based feature reconstruction sub-net obtains fine-grained attribute-based features via the attribute attention module and uses the feature reconstruction module to randomly drop some attributes and reconstruct new visual features for the missing attributes. The new visual features are fed back into the network for training. Finally, the attribute information learned by the attribute relation sub-net is fused into the visual embedding learned by the attention-based feature reconstruction sub-net, and an ideal visual-semantic interaction is performed with the semantic vector used for ZSL classification. Extensive experiments on three ZSL benchmark datasets demonstrate the significant generalization performance of our proposed method over the state-of-the-art methods.
