FET: Small: AlignMEM: Fast and Efficient DNA Sequence Alignment in Non-Volatile Magnetic RAM

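Funding source

U.S. National Science Foundation (NSF)

Principal investigator

Deliang Fan

Awardee institution

Arizona State University

Award number

2528723

Fiscal year

2025, 2020

Start date

Not disclosed

Project level

National

Research period

Unknown / Unknown

Award amount

USD 613,079.00

Discipline

Not disclosed

Discipline code

Not disclosed

Award type

Standard Grant

Keywords

FET-Fndtns of Emerging Tech; FET:Foundations of Emerging Technologie; DES AUTO FOR MICRO&NANO SYST

Participants

Not disclosed

Participating institutions

ARIZONA STATE UNIVERSITY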

Project abstract: State-of-the-art DNA sequencing technologies can generate terabytes of DNA sequence data in a single run, and their throughput is expected to increase 3-5 times each year in the coming years. Before this data can be applied to follow-up complex disease diagnostics and prognostics, such as cancer risk assessment, tailored patient treatment, and prenatal testing, it must first be aligned to a human reference genome 3.2 billion bases long. However, existing software tools for this purpose may need hours or days to align such a large amount of DNA sequence data, even on today's very powerful computing systems, because of the "memory wall" challenge in state-of-the-art computing architecture: the speed mismatch between memory units and computing units. To this end, this project leverages innovations from non-volatile nano-magnet based Magnetic Random Access Memory (MRAM) technology and in-memory computing architecture. If successful, it can achieve up to two orders of magnitude higher computing performance, speed, and energy efficiency for next-generation DNA sequence analysis systems, enabling large-scale, fast genomic data analytics to support research on various disease studies and biomedical applications. This project will develop new undergraduate- and graduate-level course modules on in-memory computing architecture and bioinformatics.

This project will follow two main research tracks. The first explores how to leverage the intrinsic non-volatile MRAM device property to efficiently develop the ultra-parallel, reconfigurable in-memory logic required by DNA alignment computation, together with its big DNA-data Processing-in-Memory (PIM) accelerator architecture. The second investigates how to develop a fast DNA alignment-in-memory algorithm based on the Burrows-Wheeler Transform that matches the proposed MRAM-based PIM platform, and its large-scale genomic analysis application in disease phenotype prediction. The alignments generated will be used to estimate gene expression and to identify single-nucleotide mutation events in patient samples, leading to molecular signatures for disease risk assessment.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

This project explores leveraging innovations from both post-CMOS non-volatile nano-magnet based Magnetic Random Access Memory (MRAM) device technology and in-memory computing architecture to develop a revolutionary DNA sequence alignment-in-memory (AlignMEM) system. It advances the next-generation ultra-fast, high-throughput DNA short-read AlignMEM paradigm and aims for two orders of magnitude higher speed, throughput, and energy efficiency than existing CPU/GPU computing systems.

Intellectual merits: Over the life of this project, the PIs' team completed the two proposed research tracks. For the first track, the team designed different types of non-volatile-memory-based, ultra-parallel, reconfigurable in-memory logic required by DNA alignment computation, together with its big DNA-data Processing-in-Memory (PIM) accelerator architecture. For the second track, the team developed a fast DNA alignment-in-memory algorithm based on the Burrows-Wheeler Transform to match the proposed MRAM-based PIM platform and its large-scale genomic analysis application in disease phenotype prediction. The team further used the processed data to estimate gene expression and for many other types of genome processing. Our fabricated, world-first genome processing-in-memory chip prototype achieved the targeted energy efficiency, around two orders of magnitude higher than its state-of-the-art counterpart, a strong indication of the success of this project.

Research publications: These research outcomes have led to 10+ IEEE/ACM international journal and conference publications from the PIs' team, in venues such as JSSC, SSCL, TCAD, JLPEA, ENM, CICC, DAC, GLSVLSI, and ISQED.

PhD thesis: A PhD student graduated with the thesis "Compute-in-memory Circuits and Architectures for Efficient Acceleration of AI and Data Centric Workloads".

Genome processing chip prototype fabrication: The PIs' team designed and fabricated the world's first genome processing-in-memory chip prototype. The prototype is designed to accelerate two key types of genome processing applications: state-of-the-art (SOTA) Burrows-Wheeler transform (BWT)-based DNA short-read alignment and alignment-free mRNA quantification. It achieves 2.12 G suffixes/J (suffixes per joule) at 1.0 V, which is the most energy-efficient solution to date for genome processing.

Broader impacts: To promote the project's broader impacts, the PI carried out the following activities.

Student training: Four PhD students were partially supported through this project at ASU and UCF, conducting research on in-memory computing circuit hardware and genome processing algorithms. Multiple master's and undergraduate students from the PI's classes were trained in state-of-the-art non-volatile memory design and circuit design. The PI also supervised several senior design teams on topics related to this project to train undergraduate students.

Outreach: The PI organized and chaired an in-memory computing workshop associated with the community's top-tier conference, the Design Automation Conference. The workshop attracted 100+ attendees each year, serving as a great platform to promote the research outcomes of this project.

Open-source tools/models: Multiple open-source software packages and tools were produced and shared on GitHub for public use. These tools are free for the public to download.

Last modified: 12/11/2025
Modified by: Deliang Fan
Submitted by: Deliang Fan
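
The BWT-based short-read alignment named in the summary above can be illustrated in software with the textbook FM-index backward search. The Python sketch below is only a minimal illustration under stated assumptions, not the project's in-memory algorithm or its hardware mapping: the function names `build_fm_index` and `exact_match`, the naive suffix-array construction, and the toy reference string are all hypothetical choices made for readability.

```python
from collections import Counter


def build_fm_index(reference):
    """Build a suffix array, C table, and Occ table for a small reference.

    Naive O(n^2 log n) construction, acceptable only for toy inputs.
    """
    text = reference + "$"  # '$' is a sentinel that sorts before A/C/G/T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around for i == 0)
    bwt = "".join(text[i - 1] for i in suffix_array)

    # c_table[ch]: number of characters in text strictly smaller than ch
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for key in occ:
            occ[key][i + 1] = occ[key][i] + (1 if key == ch else 0)
    return suffix_array, c_table, occ


def exact_match(read, suffix_array, c_table, occ):
    """Return sorted reference positions where `read` occurs exactly."""
    low, high = 0, len(suffix_array)  # half-open suffix-array interval
    for ch in reversed(read):  # backward search: last character first
        if ch not in c_table:
            return []
        low = c_table[ch] + occ[ch][low]
        high = c_table[ch] + occ[ch][high]
        if low >= high:
            return []
    return sorted(suffix_array[low:high])


if __name__ == "__main__":
    sa, c_table, occ = build_fm_index("GATTACAGATTACA")  # toy reference
    print(exact_match("TTAC", sa, c_table, occ))
```

Each backward-search step needs only two table lookups per read character, which is one reason this family of algorithms is a common target for memory-centric acceleration.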

Personnel

Deliang Fan (Principal Investigator): dfan@asu.edu

Institution information

[Arizona State University (Performance Institution)] Street address: 660 S MILL AVENUE STE 204, TEMPE, Arizona, United States / ZIP code: 852813670
[ARIZONA STATE UNIVERSITY] Street address: 1475 N SCOTTSDALE RD STE 200, SCOTTSDALE, Arizona, United States / Phone number: 4809655479 / ZIP code: 852573538

Managing division

Directorate for Computer and Information Science and Engineering (CSE) - Division of Computing and Communication Foundations (CCF)

Program officer

Sankar Basu (Email: sabasu@nsf.gov; Phone: 7032927843)


1. Dichotomous intronic polyadenylation profiles reveal multifaceted gene functions in the pan-cancer transcriptome.

Sun, Jiao; Kim, Jin-Young; Jun, Semo; Park, Meeyeon; de Jong, Ebbing; Chang, Jae-Woong; Cheng, Sze; Fan, Deliang; Chen, Yue; Griffin, Timothy J; Lee, Jung-Hee; You, Ho Jin; Zhang, Wei; Yong, Jeongsik
Experimental & molecular medicine
2024
Journal

Alternative cleavage and polyadenylation within introns (intronic APA) generate shorter mRNA isoforms; however, their physiological significance remains elusive. In this study, we developed a comprehensive workflow to analyze intronic APA profiles using the mammalian target of rapamycin (mTOR)-regulated transcriptome as a model system. Our investigation revealed two contrasting effects within the transcriptome in response to fluctuations in cellular mTOR activity: an increase in intronic APA for a subset of genes and a decrease for another subset of genes. The application of this workflow to RNA-seq data from The Cancer Genome Atlas demonstrated that this dichotomous intronic APA pattern is a consistent feature in transcriptomes across both normal tissues and various cancer types. Notably, our analyses of protein length changes resulting from intronic APA events revealed two distinct phenomena in proteome programming: a loss of functional domains due to significant changes in protein length or minimal alterations in C-terminal protein sequences within unstructured regions. Focusing on conserved intronic APA events across 10 different cancer types highlighted the prevalence of the latter cases in cancer transcriptomes, whereas the former cases were relatively enriched in normal tissue transcriptomes. These observations suggest potential, yet distinct, roles for intronic APA events during pathogenic processes and emphasize the abundance of protein isoforms with similar lengths in the cancer proteome. Furthermore, our investigation into the isoform-specific functions of JMJD6 intronic APA events supported the hypothesis that alterations in unstructured C-terminal protein regions lead to functional differences. Collectively, our findings underscore intronic APA events as a discrete molecular signature present in both normal tissues and cancer transcriptomes, highlighting the contribution of APA to the multifaceted functionality of the cancer proteome. © 2024. The Author(s).

2. Aligner-D: Leveraging In-DRAM Computing to Accelerate DNA Short Read Alignment

Keywords:
DNA; Random access memory; Task analysis; Genomics; Bioinformatics; Throughput; Sequential analysis; DNA short read alignment; processing-in-memory; DRAM; accelerator

Zhang, Fan; Angizi, Shaahin; Sun, Jiao; Zhang, Wei; Fan, Deliang
IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS
2023, Vol. 13, No. 1
Journal

The DNA short-read alignment task has become a major sequential bottleneck for the humongous amounts of data generated by next-generation sequencing platforms. In this paper, an energy-efficient and high-throughput Processing-in-Memory (PIM) accelerator based on DRAM (named Aligner-D) is presented to execute DNA short-read alignment with the state-of-the-art BWT alignment algorithm. We first present the PIM design that utilizes DRAM's internal high parallelism and throughput. It converts each DRAM array to a potent processing unit for alignment tasks. The proposed Aligner-D can efficiently execute the bulk bit-wise XNOR-based matching operation required by the alignment task with only 3-transistor/col overhead. We then introduce a highly parallel and customized read alignment algorithm based on BWT that supports both exact and inexact match tasks. Next, we present how to map the correlated data of the alignment task to make maximal use of the parallelism offered by both the new hardware and the algorithm. The experimental results demonstrate that Aligner-D obtains $\sim 4\times$, $\sim 2.45\times$, $\sim 3.26\times$, and $\sim 1.65\times$ improvement, respectively, compared with other in-memory computing platforms: Ambit (Seshadri et al., 2017), DRISA-1T1C (Li et al., 2017), DRISA-3T1C (Li et al., 2017), and ReDRAM (Angizi and Fan, 2019). As for DNA short read alignment, Aligner-D boosts the alignment throughput per Watt by $\sim 20104\times$, $\sim 3522\times$, $\sim 927\times$, $\sim 88\times$, $\sim 5.28\times$, and $\sim 2.34\times$ over ReCAM, CPU, GPU, FPGA, Ambit, and DRISA, respectively.
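
To make the idea of XNOR-based matching concrete, the following Python sketch compares two equal-length DNA strings with a single bitwise XNOR over a packed representation. It is a software analogy only, not the Aligner-D circuit or its data mapping: the 2-bit encoding table `ENCODING` and the helper names `pack` and `xnor_match` are assumptions introduced for this example.

```python
# Hypothetical 2-bit encoding of the four bases (an assumption of this sketch).
ENCODING = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack(seq):
    """Pack a DNA string into one integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | ENCODING[base]
    return value


def xnor_match(a, b):
    """Return one boolean per position: True where the bases of a and b agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    width = 2 * len(a)
    mask = (1 << width) - 1
    # One bulk XNOR over the whole packed word; a set bit means "bits equal".
    xnor = ~(pack(a) ^ pack(b)) & mask
    flags = []
    for i in range(len(a)):
        shift = width - 2 * (i + 1)
        # A base matches only if both of its bits are equal.
        flags.append(((xnor >> shift) & 0b11) == 0b11)
    return flags


if __name__ == "__main__":
    print(xnor_match("ACGT", "ACCT"))
```

In hardware the same comparison would run across many columns at once; the loop over positions here only unpacks the result for inspection.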

3. MeF-RAM: A New Non-Volatile Cache Memory Based on Magneto-Electric FET

Keywords:
Magneto-electric FETs; non-volatile memory; memory bit-cell; cache design; PERFORMANCE; BENCHMARKING; OPTIMIZATION; CIRCUIT; ENERGY; WSE2

Angizi, Shaahin; Khoshavi, Navid; Marshall, Andrew; Dowben, Peter; Fan, Deliang
ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS
2022, Vol. 27, No. 2
Journal

Magneto-Electric FET (MEFET) is a recently developed post-CMOS FET, which offers intriguing characteristics for high-speed and low-power design in both logic and memory applications. In this article, we present MeF-RAM, a non-volatile cache memory design based on 2-Transistor-1-MEFET (2T1M) memory bit-cell with separate read and write paths. We show that with proper co-design across MEFET device, memory cell circuit, and array architecture, MeF-RAM is a promising candidate for fast non-volatile memory (NVM). To evaluate its cache performance in the memory system, we, for the first time, build a device-to-architecture cross-layer evaluation framework to quantitatively analyze and benchmark the MeF-RAM design with other memory technologies, including both volatile memory (i.e., SRAM, eDRAM) and other popular non-volatile emerging memory (i.e., ReRAM, STT-MRAM, and SOT-MRAM). The experiment results for the PARSEC benchmark suite indicate that, as an L2 cache memory, MeF-RAM reduces Energy Area Latency (EAT) product on average by ~98% and ~70% compared with typical 6T-SRAM and 2T1R SOT-MRAM counterparts, respectively.
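
For readers unfamiliar with the metric, an energy-area-latency product is simply the product of the three quantities, so a design that improves each one modestly can show a large combined reduction. The Python sketch below shows that arithmetic; the function names and every numeric value are made-up placeholders chosen for illustration, not measurements from the paper.

```python
def eat_product(energy_j, area_mm2, latency_s):
    """Energy-area-latency product: lower is better."""
    return energy_j * area_mm2 * latency_s


def reduction_percent(baseline, candidate):
    """Percentage by which `candidate` is lower than `baseline`."""
    return 100.0 * (1.0 - candidate / baseline)


if __name__ == "__main__":
    # Placeholder values only (hypothetical, not taken from any publication).
    baseline = eat_product(energy_j=2.0, area_mm2=1.0, latency_s=1.0)
    candidate = eat_product(energy_j=1.0, area_mm2=0.5, latency_s=0.8)
    print(f"EAT reduction: {reduction_percent(baseline, candidate):.1f}%")
```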

4. APA-Scan: detection and visualization of 3'-UTR alternative polyadenylation with RNA-seq and 3'-end-seq data.

Keywords:
3' Untranslated Regions; MicroRNAs; Protein Isoforms; RNA Precursors; RNA, Messenger; 3'-End-seq; Alternative polyadenylation; RNA-seq; Transcriptome

Fahmi, Naima Ahmed; Ahmed, Khandakar Tanvir; Chang, Jae-Woong; Nassereddeen, Heba; Fan, Deliang; Yong, Jeongsik; Zhang, Wei
BMC bioinformatics
2022, Vol. 23, No. Suppl 3
Journal

BACKGROUND: The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3'-untranslated region (3'-UTR) of mRNA produces transcripts with shorter or longer 3'-UTR. Often, 3'-UTR serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3'-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3'-UTR APA events due to incomplete annotations and low-resolution analyzing power: widely available bioinformatics pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3'-UTR APA only using RNA-seq read coverage, causing false positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3'-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations.

METHODS: APA-Scan utilizes either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3'-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3'-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance of the events between two biological conditions; (iii) produce a graphical representation of a user-specific event with 3'-UTR annotation and read coverage on the 3'-UTR regions. APA-Scan is implemented in Python3. Source code and a comprehensive user's manual are freely available at https://github.com/compbiolabucf/APA-Scan .

RESULT: APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3'-UTR APA identification compared to the other baselines. The performance of APA-Scan was also validated by 3'-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3'-UTR APA events and improve genome annotation.

CONCLUSION: APA-Scan is a comprehensive computational pipeline to detect transcriptome-wide 3'-UTR APA events. The pipeline integrates both RNA-seq and 3'-end-seq data information and can efficiently identify the significant events with high-resolution short-read coverage plots. © 2022. The Author(s).

5. Computational Methods to Study Human Transcript Variants in COVID-19 Infected Lung Cancer Cells

Keywords:
COVID-19; transcript variants; alternative splicing; alternative polyadenylation; RNA-seq; 3'-UTR; GENE; RESPONSES; DATABASE

Sun, Jiao; Fahmi, Naima Ahmed; Nassereddeen, Heba; Cheng, Sze; Martinez, Irene; Fan, Deliang; Yong, Jeongsik; Zhang, Wei
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
2021, Vol. 22, No. 18
Journal

Microbes and viruses are known to alter host transcriptomes by means of infection. In light of recent challenges posed by the COVID-19 pandemic, a deeper understanding of the disease at the transcriptome level is needed. However, research about transcriptome reprogramming by post-transcriptional regulation is very limited. In this study, computational methods developed by our lab were applied to RNA-seq data to detect transcript variants (i.e., alternative splicing (AS) and alternative polyadenylation (APA) events). The RNA-seq data were obtained from a publicly available source, and they consist of mock-treated and SARS-CoV-2 infected (COVID-19) lung alveolar (A549) cells. Data analysis results show that more AS events are found in SARS-CoV-2 infected cells than in mock-treated cells, whereas fewer APA events are detected in SARS-CoV-2 infected cells. A combination of conventional differential gene expression analysis and transcript variants analysis revealed that most of the genes with transcript variants are not differentially expressed. This indicates that no strong correlation exists between differential gene expression and the AS/APA events in the mock-treated or SARS-CoV-2 infected samples. These genes with transcript variants can be applied as another layer of molecular signatures for COVID-19 studies. In addition, the transcript variants are enriched in important biological pathways that were not detected in the studies that only focused on differential gene expression analysis. Therefore, the pathways may lead to new molecular mechanisms of SARS-CoV-2 pathogenesis.

6. PIM-Assembler: A Processing-in-Memory Platform for Genome Assembly

Keywords:
ALGORITHMS; TOOL

Angizi, Shaahin; Fahmi, Naima Ahmed; Zhang, Wei; Fan, Deliang
PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE
2020
Journal

In this paper, for the first time, we propose a high-throughput and energy-efficient Processing-in-DRAM-accelerated genome assembler called PIM-Assembler, based on an optimized and hardware-friendly genome assembly algorithm. PIM-Assembler can assemble large-scale DNA sequence datasets from all-pair overlaps. We first develop the PIM-Assembler platform, which harnesses DRAM as computational memory and transforms it into a fundamental processing unit for genome assembly. PIM-Assembler can perform efficient X(N)OR-based operations inside DRAM, incurring low cost on top of commodity DRAM designs (~5% of chip area). PIM-Assembler is then optimized through a correlated data partitioning and mapping methodology that allows local storage and processing of DNA short reads to fully exploit the algorithm-level parallelism of genome assembly. The simulation results show that PIM-Assembler achieves on average 8.4x and 2.3x higher throughput for performing bulk bitwise XNOR-based comparison operations compared with CPU and recent processing-in-DRAM platforms, respectively. For the comparison/addition-extensive genome assembly application, it reduces the execution time and power by ~5x and ~7.5x compared to GPU.
