CpG islands’ clustering uncovers early development genes in the human genome

Vladimir N. Babenko, Anton G. Bogomolov, Roman O. Babenko, Elvira R. Galieva, Yuriy L. Orlov

We address the problem of annotating CpG island (CGI) clusters in the human genome. Analyzing the gene content of CGI clusters, we found piRNA-, tRNA-, and miRNA-encoding genes in addition to the previously reported CpG-rich homeobox genes. Chromosome-wide CGI density is positively correlated with replication timing, confirming that CGIs may serve as open chromatin markers. KRAB-ZNF genes expressed at early embryonic stages, which are abundant on chromosome 19, were found to be interlinked with CGI clusters. We also found that a number of long CGIs and CGI clusters are, in fact, tandem copies with multiple annotated macrosatellites and paralogous genes. This finding implies that tandem expansion of CGIs may serve as a substrate for non-homologous recombination events.