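Research on Zero-Shot Image Classification Based on Attribute Learning

Project source

National Natural Science Foundation of China (NSFC)

Principal investigator

Cheng Yuhu (程玉虎)

Funded institution

China University of Mining and Technology

Project number

61772532

Approval year

2017

Approval date

Not disclosed

Project level

National

Research period

Unknown / Unknown

Funding amount

CNY 620,000 (62.00万元)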

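Discipline

Information Sciences - Artificial Intelligence - Pattern Recognition and Data Mining

Discipline code

F-F06-F0605

Funding category

General Program (面上项目)

Keywords

Zero-shot image classification; Attribute learning; Deep neural network; Multi-task learning; Domain adaptation

Participants

Pan Jie (潘杰); Zhang Qian (张倩); Liu Jian (刘健); Kong Yi (孔毅); Lü Enhui (吕恩辉); Li Dongqing (李冬青); Gu Yang (顾扬); Bao Achun (宝阿春)

Participating institutions

China University of Mining and Technology; Xuzhou Medical University; National Space Science Center, Chinese Academy of Sciences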

Project proposal abstract (Application Abstract): Because the training and test data follow different distributions, zero-shot image classification is a very difficult learning task. This project aims to solve image classification under the zero-shot learning scenario by describing all object classes with common attributes and transferring the attribute knowledge learned from known classes to new classes. The research covers three aspects. First, a deep attribute learning model is built on deep neural networks to learn deep-level image feature representations and train attribute classifiers at the same time; this avoids the subjectivity of manually selecting features and addresses the weak generalization of attribute classifiers trained with shallow learning methods. Second, borrowing the idea of multi-task learning, the attribute classifiers (or attribute ranking functions) and the image classifier are learned jointly from shared low-level image features, so that attribute prediction (or relative attribute ranking) accuracy and image classification accuracy improve together. Third, domain adaptation techniques are used to address the domain shift problem of attribute learning at different levels (single-source and multi-source) and from different perspectives (classifier adaptation and feature representation adaptation), so that attribute classifiers trained on seen-class images can accurately predict the attributes of unseen-class images. The results are expected to enrich and develop existing machine learning theory and to extend to many areas related to pattern recognition.
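
To make the attribute-transfer idea in the abstract concrete, the following is a minimal sketch, not the project's model. It assumes attribute scores have already been predicted by classifiers trained on seen classes, and it assigns an unseen class by nearest attribute signature. The class names, the attribute list, and the Euclidean matching rule are all illustrative assumptions.

```python
import math

# Hypothetical attribute signatures for unseen classes.
# Attribute order (assumed): striped, four_legged, aquatic.
UNSEEN_CLASS_ATTRIBUTES = {
    "zebra": [1.0, 1.0, 0.0],
    "dolphin": [0.0, 0.0, 1.0],
    "tiger_shark": [1.0, 0.0, 1.0],
}


def classify_zero_shot(predicted_attributes, class_attributes):
    """Return the class whose attribute signature is closest to the prediction."""

    def distance(signature):
        return math.sqrt(
            sum((p - s) ** 2 for p, s in zip(predicted_attributes, signature))
        )

    return min(class_attributes, key=lambda name: distance(class_attributes[name]))


# Attribute scores as they might come from classifiers trained on seen classes.
scores = [0.9, 0.8, 0.1]
label = classify_zero_shot(scores, UNSEEN_CLASS_ATTRIBUTES)
print(label)
```

No image of an unseen class is needed at training time in this setup; only its attribute signature is required, which is what lets knowledge transfer from seen to unseen classes.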

Funded province

Jiangsu Province

Project final report (full text)

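Because labeled samples are scarce, the labeled categories cannot cover all object classes. This zero-shot learning scenario arises widely in fields such as computer vision, image classification, and face and speech recognition. Using deep/broad learning, multi-task learning, and transfer learning, this project describes all object classes with shared attributes and transfers previously learned attribute knowledge to new object classes, thereby effectively solving image classification under the zero-shot learning scenario. The research was carried out in the following two areas.

(1) Deep networks can automatically extract descriptive image features from unlabeled raw images. Compared with deep networks, broad learning systems have a simple structure and are easy to combine with other models. The project team therefore studied lightweight deep networks and the construction of new broad networks, and proposed: a self-attention-based generative adversarial network; an adaptive multi-scale graph convolutional network; a multi-path ensemble convolutional network; a weight-sharing multi-stage multi-scale ensemble convolutional network; a deep convolutional network based on deconvolution feature extraction; a convolutional network based on supervised hypergraphs and sample augmentation; a domain-adaptation CycleGan network; a domain-adaptive broad network; and a multi-stage convolutional broad network with block-diagonal constraints.

(2) Using the constructed deep and broad networks, the team proposed the following for zero-shot image classification: zero-shot learning based on deep weighted attribute prediction; multi-source-domain attribute adaptation learning based on adaptive multi-kernel calibration; zero-shot learning based on graph-regularized feature selection; zero-shot learning based on multi-task extended attribute groups; zero-shot learning based on multi-task hybrid attribute relations and intrinsic attribute features; zero-shot learning based on feature prototypes; zero-shot learning based on coupled autoencoders and a Gaussian mixture model; an attribute tri-factorization model with relational directed-graph regularization; a generated-feature domain adaptation model based on an attribute kernel matrix; zero-shot learning based on hybrid attributes; a zero-shot image classification model based on weighted reconstructed hybrid attribute groups; and a broad attribute prediction model based on enhanced attribute-features.

The outcomes of the project are: 1 monograph published by Science Press; 43 papers published or accepted in international and domestic academic journals; 7 granted invention patents; 8 doctoral and master's students trained; and 1 Jiangsu Province Outstanding Master's Thesis Award.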

  • 1.Broad learning systems: An overview of recent advances, applications, challenges and future directions

    • Keywords:
    • Deep learning;Broad learning system;Feature learning;Feature nodes;Functional link neural network;Multi-layer feature learning;Multi-layers;Networks learning;Random vector functional link neural network;Random vectors;Training framework
    • Chu, Yonghe;Guo, Yanlong;Ding, Weiping;Cao, Heling;Ping, Peng
    • 《Neurocomputing》
    • 2025
    • Vol. 641
    • Journal

    The broad learning system (BLS) is a novel training framework derived from the random vector functional link neural network (RVFLNN). Unlike RVFLNN, which directly applies raw data to network learning, BLS first transforms input data into feature nodes through feature mapping. These feature nodes are then nonlinearly transformed into enhanced nodes. Both feature nodes and enhanced nodes are concatenated and connected to the output layer, and the corresponding output weights are derived via pseudo-inverse. Due to its shallow architecture and the need to train only the output weights, BLS learns very efficiently. Moreover, when adding new nodes, BLS does not require retraining from scratch; it only adjusts the weights associated with the new nodes. Unlike deep learning, BLS aims to expand the breadth rather than the depth of the neural network to solve complex problems. It not only overcomes the long training process of deep learning but also enables the rapid incremental construction of network models. BLS has therefore received extensive attention from both the academic and industrial communities. In view of this, this paper conducts a systematic review of BLS. First, we outline the research background of BLS. Then, we elaborate on the relevant concepts and definitions of BLS. Furthermore, recent advances in BLS are introduced. In addition, we present the extensive applications of BLS in various fields. Finally, several possible development directions for BLS are proposed. © 2025 Elsevier B.V.

    ...
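
The BLS pipeline described in the abstract above (feature nodes, enhanced nodes, concatenation, pseudo-inverse output weights) can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: both mappings use fixed random weights, the pseudo-inverse is ridge-regularised, and the function names `train_bls` and `predict_bls`, the node counts, and the toy data are all hypothetical. Incremental node addition is not shown.

```python
import numpy as np


def train_bls(X, Y, n_feature=20, n_enhance=40, reg=1e-3, seed=0):
    """Fit a minimal broad learning system; only the output weights are learned."""
    rng = np.random.default_rng(seed)
    # Random linear feature mapping: input -> feature nodes.
    Wf = rng.standard_normal((X.shape[1], n_feature))
    bf = rng.standard_normal(n_feature)
    Z = X @ Wf + bf
    # Random nonlinear mapping: feature nodes -> enhanced nodes.
    We = rng.standard_normal((n_feature, n_enhance))
    be = rng.standard_normal(n_enhance)
    H = np.tanh(Z @ We + be)
    # Concatenate both node groups and solve for the output weights
    # with a ridge-regularised pseudo-inverse.
    A = np.hstack([Z, H])
    W_out = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return Wf, bf, We, be, W_out


def predict_bls(model, X):
    """Map inputs through the fixed random layers and the learned output weights."""
    Wf, bf, We, be, W_out = model
    Z = X @ Wf + bf
    H = np.tanh(Z @ We + be)
    return np.hstack([Z, H]) @ W_out


# Toy data (assumed): two well-separated Gaussian blobs with one-hot labels.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (50, 2)), rng.normal(2.0, 0.5, (50, 2))])
labels = np.repeat([0, 1], 50)
Y = np.eye(2)[labels]

model = train_bls(X, Y)
accuracy = float(np.mean(predict_bls(model, X).argmax(axis=1) == labels))
print(f"training accuracy: {accuracy:.2f}")
```

Because the hidden weights are never updated, training reduces to a single linear solve, which is the source of the efficiency the abstract refers to.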
  • 2.Global-local graph convolutional broad network for hyperspectral image classification

    • Keywords:
    • Laplace transforms;Broad learning system;Global manifold structure;Global-local;HyperSpectral;Hyperspectral image;Hyperspectral image classification;Local manifold structure;Manifold learning;Manifold structures;Nonlinear features
    • Chu, Yonghe;Cao, Jun;Huang, Jiashuang;Ju, Hengrong;Liu, Guangen;Cao, Heling;Ding, Weiping
    • 《Applied Soft Computing》
    • 2025
    • Vol. 170
    • Journal

    The conventional broad learning system (BLS) struggles to represent the complex nonlinear features of hyperspectral images (HSI) due to its reliance on linear sparse feature extraction methods. Additionally, traditional BLS models focus primarily on class separability, ignoring the manifold structure that characterizes relationships between samples. To address these issues, previous research has incorporated graph convolutional networks (GCNs) and manifold learning into the BLS framework, but these methods often emphasize only local manifold structures, overlooking global structural information. In this paper, we propose a Global-Local Graph Convolutional Broad Network (GLGBN) for HSI classification. GLGBN addresses both global and local manifold structures, optimizing the classification boundary by minimizing local scatter and maximizing global scatter. It uses linear discriminant analysis (LDA) to preserve global manifold structure and locality preserving projections (LPP) to model local relationships via a Laplacian graph. This dual approach ensures that similar samples remain close while dissimilar samples are separated, enhancing classification accuracy. The proposed GLGBN model demonstrated outstanding overall accuracy across multiple public datasets: 95.31% on Indian Pines, 97.67% on Pavia University and 98.37% on Salinas, surpassing several classical and state-of-the-art approaches. © 2025

    ...
  • 3.RFBLS: A robust rough fuzzy broad learning system with local neighborhood structure

    • Keywords:
    • Fuzzy rules;Group theory;k-nearest neighbors;Broad learning system;Class Centers;Classification boundary;Fuzzy theory;K-near neighbor;Local neighborhood structures;Membership degrees;Nearest-neighbour;Neighborhood rough sets;Optimal classification
    • Chu, Yonghe;Guo, Yanlong;Li, Peng;Ding, Weiping;Pedrycz, Witold;Cao, Heling
    • 《Neurocomputing》
    • 2025
    • Vol. 647
    • Journal

    As an extension of random vector functional link neural network (RVFLNN), the broad learning system (BLS) has gradually been applied in various fields due to its advantages of simple network structure, rapid training, and fast model updating. Despite its remarkable achievements, there are still some difficulties and challenges that restrict its further application in practical tasks. In BLS, all samples contribute equally to constructing the optimal classification boundary. However, when training samples contain noise or outliers, the resulting classification boundary is no longer the true optimal one. In other words, BLS is sensitive to noise points and outliers. To address this, we propose a robust rough fuzzy broad learning system with local neighborhood structure (RFBLS). In RFBLS, fuzzy sets are utilized by associating a membership degree with each sample. This ensures that different samples contribute differently during the minimization of the objective function. Noisy samples and outliers are assigned smaller weights, which suppresses their influence on the learning process. We employ rough set theory to explore the influence-decision degree of each conditional attribute on the decision attribute, aiming to eliminate the impact of redundant or interfering features. Subsequently, we comprehensively consider the distance between samples and class centers, as well as the local neighborhood structure relationship between samples. First, we calculate the initial membership degree of each sample based on the distance to class centers. Then, utilizing fuzzy K-nearest neighbors, we calculate and rank the membership degrees of K nearest neighbors for each sample. These rankings are then weighted with the initial membership degree, achieving the fusion of relationships and membership degree calculations between samples. Finally, extensive comparative experiments are conducted on various UCI datasets and image datasets, and the results indicate that our method exhibits superior noise resistance and classification performance. © 2025

    ...
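
The first membership step in the abstract above (an initial membership degree based on distance to the class center) can be illustrated as follows. The abstract does not state the formula, so the linear-decay form used here, one minus the distance divided by the class radius, is an assumption borrowed from common fuzzy-classifier practice. The rough-set and fuzzy K-nearest-neighbor steps are not shown, and `initial_memberships` is a hypothetical name.

```python
import math


def initial_memberships(samples, labels, delta=1e-6):
    """Membership in (0, 1]: samples far from their class centre get smaller values."""
    # Class centres as per-dimension means.
    centres = {}
    for c in set(labels):
        members = [s for s, l in zip(samples, labels) if l == c]
        centres[c] = [sum(col) / len(members) for col in zip(*members)]
    # Distance of each sample to its own class centre.
    dists = [math.dist(s, centres[l]) for s, l in zip(samples, labels)]
    # Largest distance inside each class, used for normalisation.
    radius = {c: max(d for d, l in zip(dists, labels) if l == c) for c in centres}
    # Assumed linear decay; delta keeps the farthest sample strictly above zero.
    return [1.0 - d / (radius[l] + delta) for d, l in zip(dists, labels)]


# Toy class with one obvious outlier at (5, 5).
samples = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [5.0, 5.0]]
labels = [0, 0, 0, 0]
memberships = initial_memberships(samples, labels)
print([round(m, 3) for m in memberships])
```

In a weighted objective, these values would scale each sample's error term so that the outlier contributes almost nothing.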
  • 5.FDBFN: Fuzzy discriminative broad fusion network for hyperspectral image classification

    • Keywords:
    • Fuzzy sets;Network theory (graphs);Class-distance;Classification methods;Convolutional networks;Feature discrimination;Global discriminative information;Hyperspectral image classification;Inter class;Inter-class distance;Intra class;Intra-class distance
    • Chu, Yonghe;Cao, Jun;Ding, Weiping;Huang, Jiashuang;Ju, Hengrong;Cao, Heling
    • 《Expert Systems with Applications》
    • 2025
    • Vol. 266
    • Journal

    Hyperspectral image (HSI) classification methods based on graph convolutional networks (GCNs) have gained attention due to their ability to process irregular regions using graph encoding techniques. Most existing GCN-based HSI classification methods use multilayer perceptrons (MLPs) for classification, relying primarily on aligning predicted values with actual sample values. However, these methods often overlook inter-class separability and intra-class compactness, limiting their ability to achieve effective inter-class separation and intra-class aggregation, which compromises feature discrimination. Additionally, HSIs exhibit complex spectral characteristics where different substances can share similar spectra, and identical substances may present varying spectra, creating sample uncertainty. Addressing these challenges, we propose a fuzzy discriminative broad fusion network (FDBFN) for HSI classification. FDBFN leverages fuzzy set theory and manifold learning to calculate a membership matrix, capturing the global structure and discriminative information of samples. This matrix enables samples to be classified across categories, capturing their distributional uncertainty. Using this information, we construct inter-class and intra-class scatter matrices and design a loss function that minimizes intra-class distances while maximizing inter-class distances to enhance feature discrimination. FDBFN further employs broad learning in the classification layer, integrating features through feature nodes and enhanced nodes layers for full utilization of network-extracted features. Experimental results show that FDBFN achieves classification accuracies of 97.45%, 98.34%, and 99.50% on the Indian Pines, Pavia University, and Salinas datasets, respectively. Compared to several state-of-the-art methods, FDBFN enhances classification accuracy, robustness, and efficiency in HSI, demonstrating its superiority and adaptability across diverse datasets. © 2024 Elsevier Ltd

    ...
  • 6.Hyperspectral image classification using feature fusion fuzzy graph broad network

    • Keywords:
    • Fuzzy clustering;Fuzzy set theory;Hyperspectral imaging;Image fusion;Network theory (graphs);Class graphs;Convolutional networks;Features fusions;Graph convolutional network;HyperSpectral;Hyperspectral image;Hyperspectral image classification;Inter class;Inter-class graph;Intra-class graphs
    • Chu, Yonghe;Cao, Jun;Ding, Weiping;Huang, Jiashuang;Ju, Hengrong;Cao, Heling;Liu, Guangen
    • 《Information Sciences》
    • 2025
    • Vol. 689
    • Journal

    In recent years, graph convolutional networks (GCNs) have shown strong performance in hyperspectral image (HSI) classification. However, traditional GCN methods often use superpixel-based nodes to reduce computational complexity, which fails to capture pixel-level spectral-spatial features. Additionally, these methods typically focus on matching predicted labels with ground truth, neglecting the relationships between inter-class and intra-class distances, leading to less discriminative features. To address these issues, we propose a feature fusion fuzzy graph broad network (F3GBN) for HSI classification. Our method extracts pixel-level attribute contour features using attribute filters and fuses them with superpixel features through canonical correlation analysis. We employ a broad learning system (BLS) as the classifier, which fully utilizes spectral-spatial information via nonlinear transformations. Furthermore, we construct intra-class and inter-class graphs based on fuzzy set and manifold learning theories to ensure better clustering of samples within the same class and separation between different classes. A novel loss function is introduced in BLS to minimize intra-class distances and maximize inter-class distances, enhancing feature discriminability. The proposed F3GBN model achieved impressive overall accuracy on public datasets: 96.73% on Indian Pines, 98.29% on Pavia University, 98.69% on Salinas, and 99.43% on Kennedy Space Center, outperforming several classical and state-of-the-art methods, thereby demonstrating its effectiveness and feasibility. © 2024

    ...
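  • 7.Multidimensional emotional EEG signal recognition based on multivariate joint motif entropy of horizontal visibility graphs

    • Keywords:
    • EEG;Multiplex horizontal visibility graph;Multivariate joint motif entropy;Emotion recognition;Multidimensional analysis
    • Yang Xiaodong (杨小冬);Ma Zhiyi (马志怡);Ren Yanlin (任彦霖);Chen Meihui (陈梅辉);He Aijun (何爱军);Wang Jun (王俊)
    • 《Scientia Sinica Informationis》 (《中国科学:信息科学》)
    • 2023
    • Journal

    Many algorithms based on deep learning and neural networks have been applied to emotion recognition from electroencephalogram (EEG) signals. However, most existing studies extract features from single-dimensional EEG signals. As multi-sensor technology advances, there is a growing need for more comprehensive and systematic multidimensional signal feature extraction. This paper applies complex network analysis to multidimensional emotional EEG recognition and proposes an emotion recognition algorithm based on the multivariate joint motif entropy of horizontal visibility graphs. The method avoids the influence of manually selected features on experimental results and preserves the nonlinear dynamical characteristics of the original sequence. First, the horizontal visibility graph algorithm converts the multidimensional emotional EEG signals into multiplex visibility graph networks, and motif entropy features are extracted to identify the key frequency bands and key channels in emotional EEG research. On this basis, the horizontal visibility graph networks are combined pairwise, and multivariate horizontal joint motif entropy vectors are extracted as input parameters for recognizing emotional EEG signals. Because the length of the emotional EEG sequence affects recognition performance, the EEG signals are cut into windows of different sizes and the effect of window size on classification accuracy is compared. The experimental results show that with a 10 s window, multivariate horizontal joint motif entropy gives the best recognition of emotional EEG signals: the classification accuracies for positive/negative, positive/neutral, and negative/neutral EEG reach 95.07%, 97.73%, and 90.26%, respectively, outperforming other two-dimensional connectivity parameters. The three-class accuracy is 93.67%. Compared with existing algorithms, the proposed algorithm improves considerably in both recognition complexity and accuracy.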

    ...
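
The first stage of the method above converts each signal into a horizontal visibility graph. The sketch below shows only that construction, using the standard definition: samples i and j are linked when every value strictly between them is lower than both. The motif entropy and pairwise joint features from the abstract are not implemented, and the function name is hypothetical.

```python
def horizontal_visibility_graph(series):
    """Return the edge set {(i, j)} of the horizontal visibility graph of a series."""
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            # i and j see each other if every value strictly between them
            # is lower than both endpoints.
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            # Once a value at least as high as series[i] appears,
            # nothing further to the right is visible from i.
            if highest_between >= series[i]:
                break
    return edges


edges = horizontal_visibility_graph([3, 1, 2, 4])
print(sorted(edges))
```

For a multichannel recording, one such graph per channel would form the layers of a multiplex network, which is the structure the abstract builds its features on.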
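  • 8.Research on image classification based on convolutional neural networks

    • Keywords:
    • Image classification;Convolutional neural network;Vanishing gradient;Network redundancy;Network performance degradation
    • Funding: Research on Zero-Shot Image Classification Based on Attribute Learning, National Natural Science Foundation of China project, No. 61772532
    • Album: Information Technology; Topic: Computer Software and Computer Applications, Automation Technology; Classification numbers: TP391.41, TP183
    • Supervisor: Wang Xuesong (王雪松)
    • Journal

    Image classification is one of the fundamental research topics in computer vision recognition. Given an image and a set of class labels, the goal is to predict the class label of the input image. In the classification pipeline, feature extraction and selection are together referred to as feature representation, and a good feature representation is critical to improving classification accuracy. For this reason, deep learning, with its strong feature extraction ability, has attracted wide attention and application in image classification. The convolutional neural network is one of the most important deep learning models; its weight-sharing structure greatly reduces model complexity and the number of parameters, avoiding the complex feature extraction and data reconstruction of traditional recognition algorithms. Some problems remain, however. (1) Vanishing gradients. Existing deep neural networks are usually trained by gradient descent. As the number of layers grows, the back-propagated gradient gradually vanishes during training, and accuracy quickly saturates and then drops rapidly, degrading network performance. (2) Network redundancy. Very deep networks can be built by simply stacking more layers, but as the structure grows the network accumulates a large number of parameters, which increases redundancy and in turn degrades performance. To address these problems and improve image classification accuracy, more efficient deep convolutional network models are designed. The main research contents are as follows.

    First, when a convolutional neural network is trained, the initial values of the convolution kernels are usually assigned randomly, which can trap learning in a local optimum. In addition, as network depth increases, gradient-descent-based parameter learning typically leads to vanishing gradients. To address these problems, a deep convolutional network model based on deconvolution feature extraction is proposed. An unsupervised two-layer stacked deconvolutional network first learns feature mapping matrices from the raw images; these matrices are then used as the convolution kernels of the deep convolutional network, which performs layer-by-layer convolution and pooling on the raw images; finally, mini-batch stochastic gradient descent with a momentum coefficient fine-tunes the network to avoid vanishing gradients.

    Second, adding layers to a deep convolutional network can improve classification accuracy, but as depth increases the gradient gradually vanishes during training, and accuracy quickly saturates and then drops rapidly, degrading network performance. To address this, a deep convolutional network model based on a pyramid structure is proposed. Each network layer gradually increases the feature-map dimension so that the burden concentrated on the structural units affected by downsampling is spread evenly over all units; then, by examining the order of the stacked elements inside a structural unit, a pyramid structural unit is designed; finally, mini-batch stochastic gradient descent is used for parameter training to further avoid vanishing gradients.

    Third, deep fusion networks can learn multi-scale representations and improve information flow, which eases the training of deep networks. However, the depth of a deep fusion network does not maximize overall performance: as depth increases, accuracy quickly saturates and then drops rapidly, degrading network performance. To address this, a deep convolutional network model based on multi-path fusion is proposed. Two network structures are first combined by concatenation fusion to produce a deep fusion network ensemble structure; this is then combined with an embedded learning mechanism to form an ensemble unit, improving the model's feature representation ability; finally, grouped convolution is introduced to improve computational efficiency without loss of performance.

    Fourth, increasing the depth and width of a deep convolutional network can improve classification accuracy. However, as the structure grows, the network accumulates a large number of parameters, which increases redundancy. To address this, a deep convolutional network model based on interleaved fused groups is proposed. A template module is first built from a common network topology and a split-transform-concatenate strategy, and a deep network is formed by stacking template modules; small convolution kernels and grouped convolution are then introduced to build more efficient kernels; finally, the structured sparse kernels and the deep network are combined into a highly modular, lightweight network structure. Comparative experiments on standard image classification datasets show that the proposed deep convolutional networks generalize well and effectively improve image classification accuracy.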

    ...
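
The abstract above relies on grouped convolution and small kernels to cut parameters. As a hedged arithmetic illustration, the helper below counts the weights of a 2-D convolution with and without groups. The channel and group sizes are arbitrary examples rather than values from the thesis, and `conv_params` is a hypothetical helper.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 2-D convolution layer without bias terms."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * kernel * kernel


standard = conv_params(256, 256, 3)            # 256 * 256 * 9
grouped = conv_params(256, 256, 3, groups=32)  # 8 * 256 * 9
print(standard, grouped, standard // grouped)
```

With the layer width fixed, the weight count falls by a factor equal to the number of groups, which is why grouping is a common route to lighter networks.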
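  • 9.Multi-label image recognition based on adaptive multi-scale graph convolutional networks

    • Keywords:
    • Adaptive relation graph;Multi-scale graph convolutional network;Multi-label image recognition;Block Krylov subspace
    • Funding: National Natural Science Foundation of China projects (61772532, 61976215); DOI: 10.13195/j.kzyjc.2021.0179
    • Album: Information Technology; Topic: Computer Software and Computer Applications, Automation Technology; Classification numbers: TP391.41, TP183
    • Wang Xuesong (王雪松);Rong Xiaolong (荣小龙);Cheng Yuhu (程玉虎);Chen Zhengsheng (陈正升)
    • Journal

    Exploring the relationships among class labels with first-order spectral graph convolution is a common approach in multi-label image recognition. However, stacking many graph convolution layers tends to cause over-smoothing, which limits this approach. To address this, a multi-label image recognition method based on an adaptive multi-scale graph convolutional network is proposed. The main idea is to use spectral graph convolution in block Krylov subspace form to mine the correlations among class labels, to concatenate multi-scale information in each graph convolution layer and extend it to deep structures, and to learn classifiers on the relation graph built by an adaptive label relation graph module, so that multi-label image recognition is performed more effectively. Experiments on two public datasets, PASCAL VOC 2007 and MS-COCO 2014, verify the effectiveness of the proposed method.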

    ...
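
One way to read "concatenating multi-scale information in block Krylov subspace form" is to stack the blocks X, ÂX, Â²X along the feature axis, where Â is a normalised label graph. The sketch below illustrates that reading only. The symmetric normalisation with self-loops, the order of 3, and both function names are assumptions; the paper's adaptive relation graph and learned weights are not reproduced.

```python
import numpy as np


def normalized_adjacency(A):
    """Symmetric normalisation with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def krylov_block(A_norm, X, order=3):
    """Concatenate [X, A X, A^2 X, ...] along the feature axis."""
    blocks = [X]
    for _ in range(order - 1):
        # Each extra block aggregates information from one hop further away.
        blocks.append(A_norm @ blocks[-1])
    return np.hstack(blocks)


# Toy label graph (assumed): three labels on a path, one-hot label features.
A = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
X = np.eye(3)
A_norm = normalized_adjacency(A)
features = krylov_block(A_norm, X, order=3)
print(features.shape)
```

Keeping the zero-hop block X next to the smoothed blocks is what lets a layer retain fine-scale information, which is one common argument for why such designs suffer less from over-smoothing.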
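  • 10.Complex network recognition of electrocardiogram signals from healthy subjects and myocardial infarction patients based on multiplex visibility graphs

    • Keywords:
    • Myocardial infarction;Multivariate time series;Multiplex visibility graph;Complex network
    • Ma Zhiyi (马志怡);Yang Xiaodong (杨小冬);He Aijun (何爱军);Ma Lu (马璐);Wang Jun (王俊)
    • 《Acta Physica Sinica》 (《物理学报》)
    • 2022
    • Issue 05
    • Journal

    The visibility graph (VG) algorithm has proven to be a simple and efficient way to convert time series into complex networks, and the resulting network inherits the dynamical characteristics of the original time series in its topology. Visibility graph analysis of univariate time series is now mature, but when it is applied to complex systems a single variable often cannot describe the global characteristics of the system. This paper proposes a new multivariate time series analysis method. The 12-lead electrocardiogram (ECG) signals of myocardial infarction patients and healthy subjects are converted into multiplex visibility graphs and mapped to a complex network in which each lead is a node and the interlayer mutual information between the visibility graphs of two leads is the edge weight. Because the fully connected networks of different groups have identical topology and cannot uniquely characterize the dynamics of different individuals, the network is reconstructed according to the magnitude of the interlayer mutual information, and the weighted degree and weighted clustering coefficient are extracted to recognize the 12-lead ECG signals of different groups. To assess the effect of sequence length on recognition, multiscale weighted-degree distribution entropy is introduced. Because healthy subjects have a higher average weighted degree and average weighted clustering coefficient, their mapped networks show a more regular structure with higher complexity and connectivity, and can be distinguished from those of myocardial infarction patients; the recognition accuracy of both parameters reaches 93.3%.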

    ...
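
The visibility graph construction that this method starts from can be sketched directly from the standard natural-visibility criterion: two samples are linked when the straight line between them passes above every intermediate sample. The code covers only that single-series step. The multiplex layers, interlayer mutual information, and network reconstruction from the abstract are not implemented, and the function name is hypothetical.

```python
def visibility_graph(series):
    """Return the edge set {(a, b)} of the natural visibility graph of a series."""
    edges = set()
    n = len(series)
    for a in range(n - 1):
        for b in range(a + 1, n):
            # a and b are linked if every intermediate sample lies strictly
            # below the straight line joining (a, series[a]) and (b, series[b]).
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


vg_edges = visibility_graph([2, 1, 3, 1, 2])
print(sorted(vg_edges))
```

This direct check costs O(n³) in the worst case, which is acceptable for short windows; faster divide-and-conquer constructions exist for long recordings.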