Research on an Intelligent Detection and Evaluation System for Physiological Function Based on Geometric Algebra
Project source
Principal investigator
Funded institution
Approval year
Approval date
Project number
Project level
Research period
Funding amount
Discipline
Discipline code
Fund category
Keywords
Participants
Participating institutions
Funded province
Project completion report (full text)
1. Dynamic prototype with discriminative representation for rapid adaptation in new organ segmentation
- Keywords:
- Image segmentation; Learning systems; Attention mechanisms; Discriminative representation; Few-shot segmentation; Medical domains; Organ segmentation; Prototype learning; Prototype-based learning; Rapid adaptation; Self-attention mechanism; Shot segmentation
- Wang, Hailing; Chen, Yu; Zhang, Xinyue; Cao, Guitao; Cao, Wenming
- Pattern Recognition
- 2026
- Vol. 173
- Issue
- Journal
Recent work in label-efficient prototype-based learning has demonstrated significant potential for rapid adaptation in new organ segmentation. However, a prevalent challenge in prototype extraction within the medical domain is semantic bias. To address this issue, we propose a Dynamic Prototype with Discriminative Representation Network (DPDRNet) to enhance the effectiveness of semantic class prototypes for new organs. Specifically, we introduce a self-attention mechanism to generate a dynamic prototype, enhancing the efficient utilization of local information. This is accomplished by capturing interdependencies among pixel-level prototypes from limited labeled samples. Subsequently, we design a prototype contrastive learning method to maintain the discriminative representation of the dynamic prototype in the high-level feature space. This method enhances the correlation between the dynamic prototype and foreground features while simultaneously increasing its distinction from background features. By incorporating a self-attention mechanism with contrastive learning, the proposed dynamic prototype exhibits enhanced generalization capabilities, facilitating more precise segmentation of new organ structures. Experimental results demonstrate that our method achieves effective performance on cardiac and abdominal MRI segmentation tasks.
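The abstract gives no equations, but the core idea it describes — refining a class prototype with self-attention over pixel-level support features, then matching query pixels by similarity — can be sketched generically. The code below is an illustrative simplification with invented shapes and random data, not DPDRNet's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention(P):
    """One-head self-attention over pixel-level prototypes P of shape (n, d)."""
    d = P.shape[1]
    scores = P @ P.T / np.sqrt(d)                 # pairwise interdependencies
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)          # row-wise softmax
    return w @ P                                  # attended prototypes

# Toy support set: 20 labeled foreground pixels with 8-dim embeddings
support_fg = rng.normal(size=(20, 8))
attended = self_attention(support_fg)
proto = attended.mean(axis=0)                     # dynamic class prototype

# Query pixels are segmented by cosine similarity to the prototype
query = rng.normal(size=(5, 8))
cos = (query @ proto) / (np.linalg.norm(query, axis=1) * np.linalg.norm(proto))
mask = cos > 0.0                                  # predicted foreground pixels
```

The self-attention step is what makes the prototype "dynamic": each pixel-level vector is re-weighted by its relations to the others before pooling, rather than being averaged blindly, which matches the interdependency-capturing role the abstract assigns to it.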
2. Dual-decoder collaborative learning with multi-hybrid view augmentation for self-supervised 3D action recognition
- Keywords:
- Skeleton-based action recognition; Self-supervised representation learning; Contrastive learning; Masked autoencoder; Masked skeleton modeling
- Cao, Wenming; Wu, Yingfei; Yin, Xinpeng
- PATTERN RECOGNITION
- 2026
- Vol. 172
- Issue
- Journal
Self-supervised methods, including contrastive learning and masked skeleton modeling, have demonstrated considerable potential in the field of skeleton-based action recognition. While contrastive learning captures fine-grained details at the instance level, masked skeleton modeling emphasizes joint-level features. Recent studies have begun to combine these two approaches. However, existing combination methods primarily focus on integrating the tasks within the skeleton space. Moreover, existing contrastive learning methods often fail to exploit the comprehensive interaction information in skeletal structures, resulting in suboptimal performance when recognizing actions involving multiple individuals. To overcome these limitations, we introduce the Dual-Decoder Collaborative Learning (DDC) with Multi-Hybrid View Augmentation (MHGNA) method, which connects these two tasks across multiple spaces. Specifically, the masked skeleton modeling task provides diverse views for the contrastive learning task in the skeleton space, while the contrastive method aligns the features generated by both tasks within the feature space. We further present an innovative view augmentation method that enhances the model's capacity to understand human interaction relationships by shuffling and replacing data across temporal, spatial, and personal dimensions. Extensive experiments on four downstream tasks across three large-scale datasets demonstrate that DDC exhibits stronger representational capabilities compared to state-of-the-art methods. Our code is available at https://github.com/Yingfei-Wu/DDC.
3. Spectral–spatial representation progressive learning via segmented attention for 3D skeleton-based motion prediction
- Keywords:
- Algebra; Arts computing; Bone; Extraction; Motion estimation; Spectrum analysis; Three dimensional computer graphics; 3D skeleton; Feature information; Motion generation; Motion prediction; Progressive learning; Recombination factors; Soft attention; Spatial representations; Spectra; Spectral–spatial representation
- Cao, Wenming; Zhang, Jianhua; Zhong, Jianqi
- Applied Soft Computing
- 2025
- Vol. 184
- Issue
- Journal
Recently, GCN-based methods have demonstrated impressive performance in human behavior prediction tasks. We believe that human motion modeling can be explained as motion correlation extraction from the combined analysis of active and static motion parts. However, existing methodologies fail to address the issue that feature information associated with static regions may overshadow feature information from dynamic regions, ultimately affecting the extraction of network features. Moreover, the unique low-pass feature pre-retention mechanism of GCNs on the spectrum causes the pose of some sequences to remain unchanged during the prediction process, further hurting the prediction. In this paper, we propose a Spectral–Spatial Representation Progressive Learning network to solve the problems above. First, we propose a segmented attention block to compare the input observation sequence with a static contrast standard, separating the motion region from the rest region. Then, we design the Spectrum Deconstruction Recombination Factor (SDRF) block to extract the global bandpass spectrum of human bone joints. The joint features of the different regions are encoded by graph convolution and by high-frequency feature filter coding based on geometric algebra. Specifically, a spectral–spatial interaction block within each SDRF focuses on the diversity of the frequency-domain and spatial-domain maps of the motion sequence, and realizes fine extraction of historical pose sequence features at both the spatial and spectral levels. Experimental results demonstrate that our approach outperforms state-of-the-art algorithms by 2.4%, 5.3% and 4.7% in terms of 3D mean per joint position error on the Human3.6M, CMU Mocap and 3DPW datasets, respectively.
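The SDRF block's "global bandpass spectrum" is described only at a high level; as background, a generic graph-spectral bandpass filter over a toy skeleton graph (the standard graph Fourier construction, without the paper's geometric-algebra coding, and with an invented 4-joint chain) looks like this:

```python
import numpy as np

# Toy skeleton graph: 4 joints in a chain 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))
L = D - A                                     # combinatorial graph Laplacian

evals, evecs = np.linalg.eigh(L)              # graph spectrum, ascending order

def bandpass(x, lo, hi):
    """Keep only graph frequencies in [lo, hi); x is a per-joint signal (4,)."""
    coeffs = evecs.T @ x                      # graph Fourier transform
    keep = (evals >= lo) & (evals < hi)       # select the frequency band
    return evecs @ (coeffs * keep)            # inverse transform of that band

x = np.array([1.0, 0.5, -0.5, -1.0])          # per-joint feature (e.g., velocity)
mid = bandpass(x, 0.5, 3.0)                   # suppress DC and the highest mode
```

Discarding the DC (lowest) mode while keeping mid-band modes is one way to stop static-pose energy from dominating, which is the spectral failure mode the abstract attributes to plain low-pass GCN filtering.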
4. Progressively deeper attention networks for 3D human motion prediction
- Keywords:
- Human motion prediction; Transformer; GCNs; Motion dependencies learning
- Huang, Jiangtao; He, Dong; Cao, Wenming; Zhong, Jianqi
- MULTIMEDIA SYSTEMS
- 2025
- Vol. 31
- Issue 5
- Journal
Human motion prediction is a significant challenge with broad applications in fields such as robotics, human-computer interaction, and healthcare. Despite the progress achieved by recent deep learning approaches, existing methods often struggle to effectively capture the complex spatial relationships and long-term temporal dependencies inherent in human motion. To address this issue, we propose the Progressive Deeper Attention Network (PDANet), which incorporates multiple GCN-Attention modules of varying depths. This architecture enables the model to extract more comprehensive information from sequential data. Additionally, we enhance the model's performance through two key improvements: (1) the introduction of joint-relative velocity and temporally perturbed features to distinguish complex motion semantics between dynamic and static joints; and (2) the design of a Multi-Dimensional Joint Fusion (MDJF) module, which employs the Gumbel-Softmax method to dynamically learn the optimal fusion strategy for multi-semantic sequences. Extensive experiments demonstrate the effectiveness of our model. The proposed approach outperforms state-of-the-art methods by 2.8%, 4.7%, and 18.8% in terms of MPJPE for human motion prediction on the Human3.6M, AMASS, and 3DPW datasets, respectively.
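The MDJF module's use of Gumbel-Softmax is named but not specified; the standard Gumbel-Softmax relaxation it presumably builds on can be sketched as follows. The three "streams" and their logits are invented for illustration and are not PDANet's actual inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=1.0):
    """Differentiable approximation to sampling a one-hot choice from softmax(logits)."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + g) / tau                                # lower tau -> more one-hot
    e = np.exp(y - y.max())
    return e / e.sum()

# Logits scoring three candidate semantic streams (e.g., position, velocity, perturbed)
logits = np.array([2.0, 0.5, 0.1])
weights = gumbel_softmax(logits, tau=0.5)     # near one-hot selection weights

streams = rng.normal(size=(3, 6))             # three feature streams of dim 6
fused = weights @ streams                     # soft, differentiable fusion
```

The point of the relaxation is that the discrete choice of fusion strategy stays differentiable, so the selection weights can be trained end-to-end with the rest of the network.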
5. FLCL: Feature-Level Contrastive Learning for Few-Shot Image Classification
- Keywords:
- Contrastive learning; Few-shot learning; Training; Feature extraction; Measurement; Metalearning; Data augmentation; Vectors; Data models; Adaptation models; few-shot learning; data augmentation; image classification
- Cao, Wenming; Zeng, Jiewen; Liu, Qifan
- IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING
- 2025
- Vol. 13
- Issue 3
- Journal
Few-shot classification is the task of recognizing unseen classes using a limited number of samples. In this paper, we propose a new contrastive learning method called Feature-Level Contrastive Learning (FLCL). FLCL conducts contrastive learning at the feature level and leverages the subtle relationships between positive and negative samples to achieve more effective classification. Additionally, we address the challenges of requiring a large number of negative samples and the difficulty of selecting high-quality negative samples in traditional contrastive learning methods. For feature learning, we design a Feature Enhancement Coding (FEC) module to analyze the interactions and correlations between nonlinear features, enhancing the quality of feature representations. In the metric stage, we propose a centered hypersphere projection metric to map feature vectors onto the hypersphere, improving the comparison between the support and query sets. Experimental results on four few-shot classification benchmark datasets demonstrate that our method, while simple in design, outperforms previous methods and achieves state-of-the-art performance. A detailed ablation study further confirms the effectiveness of each component of our model.
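The centered hypersphere projection metric is not spelled out in the abstract; a plausible minimal reading — center the features, project them onto the unit hypersphere, then compare support prototypes with queries by cosine similarity — can be sketched as below. All shapes, the random data, and the choice of the support mean as the center are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def hypersphere_project(X, center):
    """Center feature vectors, then project them onto the unit hypersphere."""
    Z = X - center
    return Z / np.linalg.norm(Z, axis=-1, keepdims=True)

support = rng.normal(size=(5, 16))            # 5 class prototypes (5-way, dim 16)
query = rng.normal(size=(3, 16))              # 3 query samples

center = support.mean(axis=0)                 # shared center from the support set
S = hypersphere_project(support, center)
Q = hypersphere_project(query, center)

sims = Q @ S.T                                # cosine similarities on the sphere
pred = sims.argmax(axis=1)                    # nearest prototype = predicted class
```

Projecting onto the sphere makes the metric scale-invariant, so the comparison between support and query sets depends only on direction, not on feature magnitude.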
6. Linearformer: Tri-Net Multi-Layer DVF Medical Image Registration
- Keywords:
- Angiography; Deep neural networks; Electroencephalography; Functional neuroimaging; Image registration; Linearization; Mammography; Multilayer neural networks; Transillumination; Accurate registration; Brain MRI; Convolutional neural network; Deep learning; Deformable medical image registration; Linearformer; Medical image registration; Multi-layers; Similarity measure; Transformer modeling
- Anwar, Muhammad; Yan, Zhiyue; Cao, Wenming
- Expert Systems
- 2025
- Vol. 42
- Issue 7
- Journal
In medical imaging, accurate registration is crucial for reliable analysis. While transformer models demonstrate potential, their application to large datasets like OASIS is constrained by substantial memory requirements, quadratic complexity, and the challenge of managing complex deformations. To overcome these challenges, we introduce Linearformer, an efficient transformer-based model with Linear-ProbSparse self-attention for optimised time and memory, along with TNM DVF, a pyramid-based framework for unsupervised non-rigid registration. Evaluated on the OASIS and LPBA40 brain MRI datasets, the model outperforms state-of-the-art methods in Dice score and Jacobian metrics, surpassing TransMatch by 0.6% and 1.9% on the two datasets while maintaining a comparable voxel-folding percentage.
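The abstract does not define Linear-ProbSparse self-attention; for context, the generic kernel-feature-map trick that reduces attention from O(n²) to O(n) time and memory (as in linear transformers broadly, not necessarily the authors' exact variant) can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def elu1(x):
    """Positive feature map phi(x) = elu(x) + 1, a common linear-attention kernel."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """O(n) attention: phi(Q) (phi(K)^T V), instead of softmax(Q K^T) V."""
    Qp, Kp = elu1(Q), elu1(K)
    KV = Kp.T @ V                             # (d, d_v): independent of sequence length
    Z = Qp @ Kp.sum(axis=0)                   # per-query normalizer, shape (n,)
    return (Qp @ KV) / Z[:, None]

n, d = 64, 8                                  # toy sequence of 64 tokens, dim 8
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))
out = linear_attention(Q, K, V)
```

Because the (d, d) matrix `KV` is computed once and reused for every query, cost grows linearly in sequence length, which is exactly the property that makes transformer registration feasible on large volumes.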
7. Asymmetric Context-Guided Adaptive Alignment Network for Skeleton-Based Action Recognition
- Keywords:
- Skeleton; Image reconstruction; Transformers; Three-dimensional displays; Data models; Adaptation models; Feature extraction; Computational modeling; Solid modeling; Representation learning; Self-supervised learning; skeleton-based action recognition; masked modeling; alignment
- Cao, Wenming; Qian, Liangxi; Zhang, Yicha; Li, Xuelong; Yin, Xinpeng
- IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
- 2025
- Vol. 35
- Issue 6
- Journal
In skeleton-based action recognition, self-supervised pre-training paradigms have been extensively investigated. In particular, masked-autoencoder-like methods based on masked target reconstruction, which aim to choose a better target for reconstruction, have pushed pre-training performance to a new height. In this work, we propose an asymmetric context-guided adaptive alignment network (ACA²Net) for self-supervised skeleton-based action recognition, utilizing a transformer-based teacher encoder to guide the student encoder toward richer action contextual information. To tackle the misalignment arising from the asymmetry, we devise an adaptive alignment module to better align the student representations with the teacher's. Additionally, considering that the differential operation for temporal motion might lose priors related to changes of direction, we propose a motion compass-aware masking strategy with a fused prior supplemented by motion and direction intensity. Extensive experiments on the NTU-60, NTU-120, and PKU-MMD datasets demonstrate that our proposed ACA²Net outperforms previous MAE-like methods.
8. Progressive Feature Reconstruction Network for Zero-Shot Learning
- Keywords:
- Visualization; Semantics; Image reconstruction; Feature extraction; Zero-shot learning; Whales; Training; Vectors; Data models; Benchmark testing; Zero-shot learning; feature reconstruction; attribute information
- Hu, Linchun; Cao, Wenming; Zhang, Zhenqi; Liang, Yuchuang
- IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
- 2025
- Vol. 35
- Issue 6
- Journal
Zero-shot learning (ZSL) aims to transfer knowledge learned from seen classes to unseen classes through semantic knowledge. However, to ensure the model's versatility on different datasets, existing methods divide the image into blocks of the same size, resulting in the loss of information between attributes. More importantly, existing methods ignore that not every image contains all attributes corresponding to its class. In this paper, we propose a progressive feature reconstruction network, called PFRN. PFRN consists of an attribute relation sub-net and an attention-based feature reconstruction sub-net. Specifically, the attribute relation sub-net first adopts the attribute-related region module to obtain the attribute-related regions in the visual features, which are input to the attribute relation discovery module to find the relationships between attributes. The attention-based feature reconstruction sub-net obtains fine-grained attribute-based features via the attribute attention module and uses the feature reconstruction module to randomly drop some attributes and reconstruct new visual features for the missing attributes. The new visual features are fed back into the network for training. Finally, the attribute information learned by the attribute relation sub-net is fused into the visual embedding learned by the attention-based feature reconstruction sub-net, and visual-semantic interaction is performed with the semantic vector used for ZSL classification. Extensive experiments on three ZSL benchmark datasets demonstrate the significant generalization performance of our proposed method over state-of-the-art methods.
9. Optimizing human motion prediction through decoupled motion spatio-temporal trends
- Keywords:
- 3D human motion forecasting; Deep learning; Time series
- Pan, Huan; Ji, Ruiya; Cao, Wenming; Huang, Zhao; Zhong, Jianqi
- MULTIMEDIA SYSTEMS
- 2025
- Vol. 31
- Issue 2
- Journal
Recent advancements in deep learning and artificial intelligence have underscored the importance of human motion prediction in fields such as intelligent robotics, autonomous driving, and human-computer interaction. Current human motion prediction methods primarily focus on network structure and feature extraction innovations, often overlooking the underlying logic of spatio-temporal changes in motion data. This oversight can result in potential conflicts within the coupled modeling of spatial and temporal dependencies, potentially obscuring the spatio-temporal logic of human motion. In this paper, we address this issue by decoupling the spatio-temporal features, employing time series modeling for preliminary prediction, and introducing velocity data as a learning branch to capture joint dependencies. This velocity-based information more clearly represents quantitative indices related to human movement, enhancing the model's pattern recognition capability. We map the trajectory change rules to the joint change trends for future moments, thereby refining the prediction results. Additionally, we enhance local semantic information through a patching method and ensure the independence of multi-scale representations of spatial and temporal dimensions using a two-branch framework. We propose a multi-layer perceptron (MLP)-based network structure, DCMixer, designed to learn multi-scale dynamic information and perform internal feature extraction. Our approach achieves spatio-temporal fusion with greater kinematic logic, significantly improving model performance. We evaluated our method on three public datasets, demonstrating superior prediction performance compared to state-of-the-art methods. The code is publicly available at https://github.com/Dabanshou/STTSN.
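The "patching method" for local semantic information is only named in the abstract; a generic non-overlapping temporal patching of a motion sequence, alongside the velocity branch the abstract mentions, might look like the sketch below. Patch lengths, dimensions, and the random data are illustrative, not DCMixer's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def patch(series, patch_len):
    """Split a (T, d) sequence into non-overlapping (T // patch_len, patch_len * d) patches."""
    T, d = series.shape
    n = T // patch_len                        # drop any trailing remainder frames
    return series[:n * patch_len].reshape(n, patch_len * d)

T, d = 50, 3                                  # 50 frames of 3-D joint coordinates
seq = rng.normal(size=(T, d))
vel = np.diff(seq, axis=0)                    # velocity branch: frame-to-frame deltas

patches = patch(seq, patch_len=10)            # (5, 30): local temporal tokens
vel_patches = patch(vel, patch_len=7)         # (7, 21): velocity tokens
```

Each patch becomes one token carrying a local window of motion, so a downstream MLP can mix information within and across patches while the position and velocity branches stay decoupled, in the spirit of the two-branch framework described above.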
10. STHRA: selective transformer hierarchical reciprocal attention-based deformable medical image registration
- Keywords:
- Similarity measures; Deformable medical image registration; Convolutional neural networks; Transformer; Deep learning
- Anwar, Muhammad; Yan, Zhiyue; Cao, Wenming; Hussain, Naeem
- MULTIMEDIA SYSTEMS
- 2025
- Vol. 31
- Issue 2
- Journal
Deformable medical image registration is an essential process that requires extracting and aligning characteristics from two images to establish exact correspondence, a necessity for accurate registration. Recent experiments demonstrate that the Transformer can improve predictive capacities. Nevertheless, there are significant obstacles when directly applying it to large datasets like OASIS, including substantial memory needs, quadratic temporal complexity, and intrinsic limitations of the encoder-decoder architecture. Even with the development of advanced registration models, achieving precise and effective deformable registration remains difficult, particularly in situations with significant volumetric deformations. We use a Selective Transformer (ST) and Hierarchical Reciprocal Attention (HRA) to address these challenges. To minimize computational complexity and optimize resource allocation for more effective processing, ST estimates the diversity of voxels and selects those with a broad range of diversity. Using an encoder-decoder architecture, HRA uses high-level features to link layers, allowing information to flow from a high level to a lower level and vice versa. We use Reciprocal Attention (RA) instead of skip connections to facilitate the flow of information between the feature extractor and the feature reconstructor. This method maximizes the model's capacity to accurately capture and anticipate deformations by thoroughly integrating complex spatial data and abstract representations. We benchmarked our model against established registration techniques on two well-known pre-aligned brain MRI datasets, OASIS and LPBA40. Our evaluations consistently demonstrate that our network surpasses state-of-the-art methods across various metrics, including Dice score and Jacobian.
