ATAC-seq signal processing and recurrent neural networks can identify RNA polymerase activity (Journal Article)

Overview

abstract

  • Nascent transcription assays are the current gold standard for identifying regions of active transcription, including markers for functional transcription factor (TF) binding. Here we present a signal processing-based model to determine regions of active transcription genome-wide using the simpler assay for transposase-accessible chromatin, followed by high-throughput sequencing (ATAC-seq). The focus of this study is twofold: First, we perform a frequency space analysis of the “signal” generated from ATAC-seq experiments’ short reads, at a single-nucleotide resolution, using a discrete wavelet transform. Second, we explore different uses of neural networks to combine this signal with its underlying genome sequence in order to classify ATAC-seq peaks on the presence or absence of bidirectional transcription. We analyze the performance of different data encoding schemes and machine learning architectures, and show how a hybrid signal/sequence representation classified using recurrent neural networks (RNNs) yields the best performance across different cell types. Contact: robin.dowell@colorado.edu
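
As a rough illustration of the discrete wavelet transform step named in the abstract, the sketch below decomposes a per-nucleotide coverage vector into approximation and detail coefficients. The abstract does not say which wavelet family or decomposition depth the authors used, so the Haar wavelet, the function names `haar_dwt_level` and `haar_dwt`, and the toy `coverage` values are all assumptions made for this example; it is not the authors' pipeline.

```python
import math


def haar_dwt_level(signal):
    """One level of the orthonormal Haar DWT; returns (approximation, detail)."""
    values = list(signal)
    if len(values) % 2:
        # Assumption: pad odd-length input by repeating the last value.
        values.append(values[-1])
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level decomposition; returns [cA_n, cD_n, ..., cD_1]."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break  # nothing left to split
        approx, detail = haar_dwt_level(approx)
        details.insert(0, detail)  # coarsest detail band ends up first
    return [approx] + details


# Hypothetical read coverage at eight consecutive nucleotides of one peak.
coverage = [0, 1, 3, 7, 8, 8, 4, 1]
coeffs = haar_dwt(coverage, levels=2)
for band in coeffs:
    print([round(c, 3) for c in band])
```

The coarse approximation band summarizes the overall shape of the peak, while the detail bands capture finer, higher-frequency changes in coverage. Coefficients like these could be paired with an encoding of the underlying sequence to form a hybrid signal/sequence input, though how the study builds that representation is not described in this record.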

publication date

  • January 26, 2019

has restriction

  • green

Date in CU Experts

  • November 4, 2020 1:06 AM

Full Author List

  • Tripodi IJ; Chowdhury M; Dowell R

author count

  • 3

Other Profiles