RNA Transcript Assembly Using Inexact Flows

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

12 Scopus citations

Abstract

RNA-Seq technology allows for high-throughput, low cost measurement of gene expression. An important step in this process is the assembly of mRNA transcript short reads into full transcripts. The problem can be viewed as a flow decomposition problem in which the objective is to minimize the number of path flows needed to represent a given flow. In this work we relax the edge flow constraints to allow for some uncertainty in their measurement. We formulate this as the Inexact Flow Decomposition problem and propose an algorithmic strategy to solve it. In practice, real biological data has measurement errors and so experimentally-derived edge-weighted splice graphs are often not flows. The proposed method is the first approach to this problem that explicitly controls the error allowed on each edge in these graphs in order to achieve a flow. In an intermediate step, the method solves an exact flow decomposition instance; if a greedy method is used for this step, the overall running time is O(\vert E\vert^{2}\vert V\vert^{2}+\vert P\vert^{3}), where P is the solution found to the flow decomposition instance. Preliminary results on simulated biological data sets show that in many cases the ground truth paths can be recovered at approximately correct abundances, even with noisy input data.

Original languageEnglish
Title of host publicationProceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
EditorsIllhoi Yoo, Jinbo Bi, Xiaohua Tony Hu
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1907-1914
Number of pages8
ISBN (Electronic)9781728118673
DOIs
StatePublished - Nov 2019
Event2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 - San Diego, United States
Duration: Nov 18 2019Nov 21 2019

Publication series

NameProceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019

Conference

Conference2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Country/TerritoryUnited States
CitySan Diego
Period11/18/1911/21/19

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being
    SDG 3 Good Health and Well-being

Keywords

  • flow networks
  • RNA sequencing
  • RNA splicing
  • transcript assembly

Fingerprint

Dive into the research topics of 'RNA Transcript Assembly Using Inexact Flows'. Together they form a unique fingerprint.

Cite this