Research on the Propagation Information Bottleneck (IB) Method

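Funding source

National Natural Science Foundation of China (NSFC)

Principal investigator

Ye Yangdong (叶阳东)

Funded institution

Zhengzhou University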

Project number

61772475

Approval year

2017

Approval date

Not disclosed

Project level

National

Research period

Unknown / Unknown

Funding amount

CNY 620,000

Discipline

Information Science - Computer Science - Information Security

Discipline code

F-F02-F0206

Funding category

General Program

Keywords

Propagation Information Bottleneck; Information Measurement; Propagation Mechanism; Hierarchical Model; Multiple Heterogeneous Data

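Participants

Ji Bo (姬波); Lu Hongxing (卢红星); Zhu Zhenfeng (朱真峰); Lou Zhengzheng (娄铮铮); Wu Yunpeng (吴云鹏); Yan Xiaoqiang (闫小强); Wu Bin (吴宾); Hu Shizhe (胡世哲); Shi Zenglin (时增林)

Participating institutions

Zhengzhou University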

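Project proposal abstract: To address the limitations of existing IB methods in handling multi-source heterogeneous data, this project proposes the propagation IB method. It aims to solve key problems including determining the related models, constructing the propagation mechanism, tuning the combined balance parameters, determining the deep measurement function, and assessing applicability. Based on multi-information and interaction-information measures, the relationships among variables in the models of the propagation IB method are modeled, and a strategy for determining the pattern parameters is constructed. With the factor graph structure at its core, an information propagation mechanism is built so that the propagation IB method fully accounts for the correlations among heterogeneous data objects and the hierarchy of the pattern structure. Adaptive LASSO is used to solve for the combined balance parameters of the propagation IB method. To measure hierarchical models of complex data with the propagation IB method, K-nearest-neighbor estimation is used to compute the mutual information between layers, and between each layer and the relevant variables, which improves the robustness of the deep measurement method. Application studies on hidden-information analysis, multi-sensor monitoring, and information recommendation are carried out to discover the characteristics and regularities of the problems to which the propagation IB method is suited. The project's research on propagation mechanisms and on measuring complex data models is original, and its research on multi-source heterogeneous data processing will further extend the application scope of IB methods. The research aims to advance the IB method to a new stage.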

Application Abstract: This project proposes a propagation Information Bottleneck (IB) method that aims to remedy the limitations of current solutions on multiple heterogeneous data. It intends to solve key problems such as determining the related models, building the propagation mechanism, adjusting a series of balance parameters, selecting the deep measurement function, and establishing practical applicability. Based on criteria such as multi-information and interaction information, we model the relationships among the latent variables in the propagation IB method and propose a framework for determining the pattern parameters. By constructing information propagation mechanisms on the factor graph structure, propagation IB can exploit the correlation of heterogeneous data objects and the hierarchy of the pattern structure. The adaptive LASSO method is used to obtain the values of the balance parameters in propagation IB. To measure hierarchical models of complex data with propagation IB, the K-nearest-neighbor estimation method is used to compute the mutual information between layers and between each layer and the relevant variables, which improves the robustness of the deep measurement function. We will apply propagation IB to several application fields, including hidden-information analysis, multi-sensor surveillance warning, and information recommendation, in order to find the common patterns in problems that propagation IB can solve. The original contributions include the research on the propagation mechanism and on the measurement of complex data models. The research on multiple heterogeneous data will extend the application field of the IB method. The work in this project aims to fill a research gap in the literature and move the IB method to a new stage.
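For readers unfamiliar with the IB method that the abstract builds on, the following is a minimal sketch of the classic (non-propagation) IB trade-off for a hard clustering of a discrete variable X with respect to a relevance variable Y. It is not the project's propagation IB model: the function names `mutual_information` and `ib_objective`, the toy joint distribution, and the choice of a plain plug-in estimate (rather than the K-nearest-neighbor estimator mentioned in the abstract) are all assumptions made for illustration.

```python
import math


def mutual_information(joint):
    """Plug-in mutual information (in nats) of a joint distribution.

    `joint` is a list of rows; joint[i][j] = p(a=i, b=j).
    """
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    mi = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:
                mi += p * math.log(p / (row[i] * col[j]))
    return mi


def ib_objective(p_xy, assignment, n_clusters, beta):
    """Classic IB Lagrangian I(X;T) - beta * I(T;Y) for a hard clustering.

    `assignment[x]` is the cluster index t of item x (illustrative setup).
    """
    n_y = len(p_xy[0])
    p_ty = [[0.0] * n_y for _ in range(n_clusters)]   # p(t, y)
    p_xt = [[0.0] * n_clusters for _ in p_xy]         # p(x, t)
    for x, row in enumerate(p_xy):
        t = assignment[x]
        p_xt[x][t] = sum(row)          # all of p(x) goes to its cluster
        for y, p in enumerate(row):
            p_ty[t][y] += p            # cluster inherits the item's mass
    return mutual_information(p_xt) - beta * mutual_information(p_ty)


# Toy joint distribution: items 0-1 co-occur with y=0, items 2-3 with y=1.
p_xy = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
good = ib_objective(p_xy, [0, 0, 1, 1], n_clusters=2, beta=2.0)
bad = ib_objective(p_xy, [0, 1, 0, 1], n_clusters=2, beta=2.0)
```

A lower value is better under this formulation, so the clustering that groups items with the same co-occurrence pattern scores lower than the one that mixes them.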

Funded province

Henan Province

Project final report (full text)

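The project carried out in-depth research on the propagation IB method and related algorithms, exceeded the tasks set out in the proposal, and produced a substantial body of results.

(1) On information propagation models and related algorithms in the propagation IB method, the project proposed a correlation-based propagation IB model, a two-level correlation propagation IB model, a collaborative IB model that fuses heterogeneous features, a propagation IB model that combines individual and shared information, a visual context IB model, a multi-task joint IB algorithm, and an interaction IB model for high-dimensional co-occurrence data, and studied the corresponding optimization algorithms.

(2) On weight learning in the propagation IB method, the project introduced several weight-learning mechanisms and proposed a cluster-weighted multi-view IB algorithm, a dynamic auto-weighted multi-view joint IB clustering algorithm, a content- and context-based weighted multi-view IB clustering algorithm, and a dual-weighted multi-view IB clustering algorithm. In these algorithms automatic weighting and algorithm optimization reinforce each other, which improves the effectiveness and flexibility of the propagation IB method.

(3) On information measurement and mutual information maximization in the propagation IB method, the project proposed a deep mutual information max-min method, a group-constrained information maximization clustering method, a deep correlation mining method for multi-task image clustering, a heterogeneous dual-task clustering method, an information-maximization-based multi-task video clustering algorithm, and an algorithm for determining clustering pattern parameters.

(4) On the applicability of the propagation IB method, the project extended the method to recommender systems, crowd counting, and multimodal data analysis, and proposed corresponding models and algorithms, which verified the effectiveness and adaptability of the propagation IB method.

The results were published in major domestic and international conferences and journals, including CVPR 2018, AAAI 2021, SDM 2020, IEEE ICASSP 2020, IEEE Transactions on Image Processing, IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Cybernetics, IEEE Transactions on Industrial Informatics, IEEE Transactions on Multimedia, Information Fusion, Pattern Recognition, ACM Transactions on Knowledge Discovery from Data, Information Sciences, Expert Systems with Applications, Knowledge-Based Systems, Applied Soft Computing, SCIENTIA SINICA Informationis (中国科学:信息科学), and the Chinese Journal of Computers (计算机学报).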

  • 1. SWE: A novel method with semantic-weighted edge for measuring gene functional similarity

    • Keywords:
    • Gene expression;Semantics;Integrated circuits;Proteins;Biological fields;Biological pathways;Functional classification;Functional similarity;Gene functional prediction;Human perspectives;Information contents;Protein-protein interactions
    • Tian, Zhen;Fang, Haichuan;Ye, Yangdong;Zhu, Zhenfeng
    • 《2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020》
    • 2020
    • December 16, 2020 - December 19, 2020
    • Virtual, Seoul, Republic of Korea
    • Conference

    In recent years, functional similarity has played an independent role in some biological fields such as gene clustering, gene functional prediction, and evaluation of protein-protein interaction. On this premise, some effective methods have already been proposed based on Gene Ontology (GO). Although these mainstream methods achieve the purpose of measuring gene functional similarity, they may have some deficiencies when calculating the Information Content (IC) of GO terms. Consequently, measuring the functional similarity accurately is still a meaningful objective of research. In this paper, a novel method called SWE is proposed for measuring gene functional similarity based on the GO graph. Firstly, an algorithm to measure terms' semantics based on their information in the GO graph is put forward. The information of GO terms mainly contains their depth, ancestors and descendants. Secondly, we calculate the IC of a term set by retrieving the inherited relationships between terms in the set. Finally, the functional similarity between two genes is computed based on the IC overlap ratio of the term sets annotating the two genes respectively. Results demonstrate that SWE is superior to existing methods in experiments such as functional classification of genes in a biological pathway, protein-protein interaction, and gene expression. Further analysis demonstrates that SWE takes into account not only the specificity of terms but also their information in the GO graph, both of which are shown to be consistent with human perspectives. © 2020 IEEE.
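The idea of scoring two genes by the IC overlap ratio of their annotating term sets can be sketched as follows. This is not the SWE algorithm: it is a generic IC-weighted Jaccard ratio over ancestor-closed term sets, and the toy ontology `PARENTS`, the annotation frequencies `FREQ`, and all function names are hypothetical values chosen for illustration only.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "a1": ["a"],
    "a2": ["a"],
}

# Hypothetical fraction of genes annotated to each term or its descendants.
FREQ = {"root": 1.0, "a": 0.5, "b": 0.5, "a1": 0.25, "a2": 0.25}


def with_ancestors(terms):
    """Return the given terms together with all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def information_content(term):
    """IC of a term as the negative log of its annotation frequency."""
    return -math.log(FREQ[term])


def overlap_similarity(terms_a, terms_b):
    """IC of the shared terms divided by IC of all terms (0.0 if no IC)."""
    a = with_ancestors(terms_a)
    b = with_ancestors(terms_b)
    union_ic = sum(information_content(t) for t in a | b)
    if union_ic == 0:
        return 0.0
    return sum(information_content(t) for t in a & b) / union_ic


siblings = overlap_similarity({"a1"}, {"a2"})   # share the ancestor "a"
unrelated = overlap_similarity({"a1"}, {"b"})   # share only the root
```

In this toy setup, sibling terms under the same parent score higher than terms that share only the root, because the root carries no information content.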

  • 3. Content Vs Context: How about Walking Hand-In-Hand for Image Clustering?

    • Keywords:
    • Unsupervised learning;Context information;Hand in hands;Image clustering;Image clusters;Image distance;Information loss;Intrinsic characteristics;Sequential methods
    • Hu, Shizhe;Hou, Zhenquan;Lou, Zhengzheng;Ye, Yangdong
    • 《2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020》
    • 2020
    • May 4, 2020 - May 8, 2020
    • Barcelona, Spain
    • Conference

    Image clustering has been one of the most important issues in the field of pattern recognition. However, most existing methods focus on utilizing either the content or the context information of images and fail to consider both. In fact, powerful algorithms can be realized by combining the rich content and context information. This paper proposes a novel content-context information bottleneck (C2IB) algorithm, which simultaneously explores and exploits the content and context information for discovering image clusters. The content describes the intrinsic characteristics contained in each image, such as the appearance feature, and the context depicts the close correlations between images, such as inter-image distance or similarity. We then formulate the problem as an information loss function that maximally preserves the content and context information while compressing the images. Finally, we design a new sequential method for the optimization. Experimental results show the superiority of the proposed method. © 2020 IEEE.
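A sequential optimization of an information-theoretic clustering objective can be illustrated with a simplified sketch. It is not the C2IB algorithm: it uses a single relevance variable Y instead of separate content and context terms, fixes the number of clusters, and greedily moves one item at a time to the cluster that most increases I(T;Y). The function names and the toy joint distribution are assumptions made for illustration.

```python
import math


def cluster_relevance(p_xy, assignment, k):
    """I(T;Y) in nats for a hard assignment of the rows of p(x, y) to k clusters."""
    n_y = len(p_xy[0])
    p_ty = [[0.0] * n_y for _ in range(k)]
    for x, row in enumerate(p_xy):
        for y, p in enumerate(row):
            p_ty[assignment[x]][y] += p
    p_t = [sum(r) for r in p_ty]
    p_y = [sum(c) for c in zip(*p_ty)]
    return sum(
        p * math.log(p / (p_t[t] * p_y[y]))
        for t, r in enumerate(p_ty)
        for y, p in enumerate(r)
        if p > 0
    )


def sequential_cluster(p_xy, assignment, k, max_sweeps=20):
    """Greedy one-item-at-a-time reassignment until no move improves I(T;Y)."""
    assignment = list(assignment)
    for _ in range(max_sweeps):
        changed = False
        for x in range(len(p_xy)):
            best_t = assignment[x]
            best_score = cluster_relevance(p_xy, assignment, k)
            for t in range(k):
                # Score the partition obtained by moving item x to cluster t.
                trial = assignment[:x] + [t] + assignment[x + 1:]
                score = cluster_relevance(p_xy, trial, k)
                if score > best_score + 1e-12:
                    best_t, best_score = t, score
            if best_t != assignment[x]:
                assignment[x] = best_t
                changed = True
        if not changed:
            break  # local optimum: no single move helps
    return assignment


# Toy data: items 0-1 co-occur with y=0, items 2-3 with y=1.
p_xy = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
result = sequential_cluster(p_xy, [0, 1, 0, 1], k=2)
```

Like other greedy sequential schemes, this sketch only reaches a local optimum, so in practice it would typically be restarted from several initial partitions.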

  • 4. Heterogeneous dual-task clustering with visual-textual information

    • Keywords:
    • Information theory;Data mining;Space division multiple access;Clustering techniques;High level semantics;Learning tasks;Local optimal solution;Multiple modalities;Progressive optimization;State of the art;Textual information
    • Yan, Xiaoqiang;Mao, Yiqiao;Hu, Shizhe;Ye, Yangdong
    • 《2020 SIAM International Conference on Data Mining, SDM 2020》
    • 2020
    • May 7, 2020 - May 9, 2020
    • Cincinnati, OH, United States
    • Conference

    Existing visual-textual cross-modal clustering techniques focus on finding a clustering partition of different modalities by dealing with each modality independently or by integrating multiple modalities into a shared space, which may result in unsatisfactory performance due to the heterogeneous gap between modalities. To address this problem, we propose a novel heterogeneous dual-task clustering (HDC) method, which is capable of exploring high-level relatedness between visual and textual data to improve the performance of each individual task. Our intuition is that although the visual and textual data are heterogeneous to each other, they may share related high-level semantics and rich latent correlations, which can lead to improved performance if we treat the clustering of visual and textual data as different but related learning tasks. Specifically, the problem of heterogeneous dual-task clustering is formulated as an information-theoretic function, in which the low-level information in each modality and the high-level relatedness between multiple modalities are maximally preserved. Then, a progressive optimization method is proposed to ensure a local optimal solution. Extensive experiments show noticeable performance of the HDC approach in comparison with several state-of-the-art baselines. © 2020 by SIAM.
