Refphase: Multi-sample reference phasing reveals haplotype-specific copy number heterogeneity

Tom L. Kaufmann
Marina Petkovic
Thomas B. K. Watkins
Emma C. Colliver
Sofya Laskina
Nisha Thapa
Darlan C. Minussi
Nicholas Navin
Charles Swanton
Peter Van Loo
Kerstin Haase
Maxime Tarabichi
Roland F. Schwarz

November 14, 2022

Aneuploidy, chromosomal instability, somatic copy-number alterations, and whole-genome doubling (WGD) play key roles in cancer evolution and provide information for the complex task of phylogenetic inference. We present MEDICC2, a method for inferring evolutionary trees and WGD from haplotype-specific somatic copy-number alterations in single-cell or bulk data. MEDICC2 eschews simplifications such as the infinite-sites assumption, thereby allowing multiple mutations and parallel evolution, and does not treat adjacent loci as independent, thereby allowing overlapping copy-number events. Using simulations and multiple data types from 2780 tumors, we demonstrate that MEDICC2 accurately infers phylogenies, clonal and subclonal WGD, and ancestral copy-number states.