In this work we develop a general optimization framework to more accurately
recover structural variants (SVs) in low-coverage sequencing data from
genomes of related individuals. In previous work, the framework incorporated
biological constraints that reflect relatedness between individuals and enforced
sparsity to model the rarity of SVs. That framework assumed
that the genomes were haploid, meaning that each individual had one
copy of the genetic material. This thesis makes two main contributions.
First, we propose an approach that allows the child signal to possess variants
that are not present in either parent (i.e., novel SVs) under the assumption of
haploid signals. Second, we propose an approach to reconstruct the signals of
two parents and a child under the assumption of diploid genomes. We tested
the effectiveness of these approaches on both simulated data and data from the
1000 Genomes Project.