To obtain a general view of human genome replication, we are developing high-throughput analyses to map replication origins and their activation timing (Figure 1).
Figure 1. Representative features of early and late replication domains
UCSC Genome Browser visualization of a 3 Mb genomic region of avian chromosome 1. Data were obtained in the DT40 avian cell line. Tracks of nascent strand (NS) enrichment in the four S-phase fractions, from early to late, are shown separately (S1 to S4). NS-enriched and NS-depleted regions for each fraction are shown in red and blue, respectively. These data provide information on the temporal program of DNA replication. Single reads from nascent RNA-seq data and aligned SNS (spatial program of DNA replication) are shown between the tracks of annotated genes (RefSeq genes) and CpG islands. The bottom track shows GC content (GC percent).
Collaboration with statisticians and bioinformaticians has allowed us to link these maps to genomic data on chromatin structure and gene expression. We also use an avian cell model, the DT40 cell line, which has the unique property of performing homologous recombination very efficiently. This powerful genetic model allows us to rapidly test hypotheses derived from genomic analyses. We were able to show that G-quadruplexes play a key role in defining replication origins. We have recently identified a complex motif that is found in half of the strong origins in humans, mice and chickens. Future projects will focus on understanding how this motif ensures the efficient recruitment of the replication machinery.
In a second line of research, we are analyzing how the origin signal carried by this complex motif interacts with transcription, which takes place on the same DNA substrate. These analyses will allow us to test a well-supported hypothesis: that transcription can shift the starting points of replication and thus generally promotes replication initiation in intergenic regions.
Common Fragile Sites (CFSs) are recurrent sites of chromosomal rearrangements in cancers and some neurological diseases. They are found within large (> 300 kb) genes that are transcribed and replicated at the end of S phase. One hypothesis regarding their formation is that the lack of replication initiation events along these genes results in incomplete replication of these regions before mitosis. Under this hypothesis, transcription would remove the pre-replication complexes located within the bodies of these genes, thereby depleting them of replication origins. We seek to test the ability of RNA polymerase II to displace pre-replication complexes by inserting a strong or inducible promoter upstream of a highly efficient minimal model origin lacking transcriptional activity. This project will explore an important hypothesis on the formation of CFSs. In the longer term, we will test the hypothesis that recurrent break sites observed during the proliferation of neuronal precursors have the characteristic properties of CFSs. For this purpose, we will use human brain organoids as a model system.
Understanding how eukaryotic genomes are duplicated is essential: replication not only ensures the maintenance of genome integrity, but also coordinates the establishment of expression programs during development.
Keywords: DNA replication, G-quadruplex, Chromatin, Common Fragile Sites, Cortical Organoid.