Users of Segway will probably be interested in our new review of
segmentation and genome annotation (SAGA) algorithms in general, now
available on arXiv:

Libbrecht MW*, Chan RCW*, Hoffman MM. “Segmentation and genome annotation
algorithms.” arXiv preprint, 2020.

Abstract: Segmentation and genome annotation (SAGA) algorithms are widely
used to understand genome activity and gene regulation. These algorithms
take as input epigenomic datasets, such as chromatin
immunoprecipitation-sequencing (ChIP-seq) measurements of histone
modifications or transcription factor binding. They partition the genome
and assign a label to each segment such that positions with the same label
exhibit similar patterns of input data. SAGA algorithms discover categories
of activity such as promoters, enhancers, or parts of genes without prior
knowledge of known genomic elements. In this sense, they generally act in
an unsupervised fashion like clustering algorithms, but with the additional
simultaneous function of segmenting the genome. Here, we review the common
methodological framework that underlies these methods, review variants of
and improvements upon this basic framework, catalogue existing large-scale
reference annotations, and discuss the outlook for future work.


Michael M. Hoffman, PhD
Senior Scientist, Princess Margaret Cancer Centre
Associate Professor, Department of Medical Biophysics, University of Toronto
Associate Professor, Department of Computer Science, University of Toronto
Faculty Affiliate, Vector Institute
Princess Margaret Cancer Research Tower 11-311
101 College St
Toronto, ON M5G 1L7