Identification of transcription factor binding sites as sequencing targets, using Diamond-Blackfan Anaemia as a model
- Programme: STP
- Specialty: Clinical Bioinformatics - Genomics
- Author: Clodagh McGuire
- Training location: King's College Hospital NHS Foundation Trust
Background
Diamond-Blackfan Anaemia (DBA) is a rare inherited bone marrow failure syndrome. Most patients have a heterozygous pathogenic variant in a ribosomal protein (RP) gene. In rare cases, DBA is caused by a variant in GATA1, an important transcription factor that is known to act downstream of the RP genes in the erythroid pathway but is also suspected to act upstream to regulate them. A third of patients do not receive a molecular diagnosis after these genes are sequenced, so we suspect they may have variants in regulatory regions.
Aims
To investigate GATA1 binding in RP genes and identify targets in regulatory regions to sequence in undiagnosed patients.
Method
We used publicly available ChIP-seq data to investigate GATA1 binding in erythroblasts. ChIP-seq peaks were assigned to genes using ChIPseeker, and histone modifications were used to confirm whether the GATA1-bound genes were transcriptionally active. We then scanned the sequences beneath the GATA1 peaks using FIMO to identify GATA1 binding motifs, and designed primers to target these regions for future sequencing.
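To illustrate the idea behind the peak-to-gene assignment step, the sketch below assigns each peak to the gene with the nearest transcription start site (TSS). It is a simplified stand-in written in plain Python, not ChIPseeker (an R/Bioconductor package), and it does not reproduce that package's annotation logic. The gene names, coordinates and the 3 kb distance cut-off are hypothetical values chosen for the example.

```python
from typing import NamedTuple, Optional, Sequence


class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int  # transcription start site position


class Peak(NamedTuple):
    chrom: str
    start: int
    end: int


def nearest_gene(peak: Peak, genes: Sequence[Gene], max_dist: int = 3000) -> Optional[str]:
    """Return the name of the gene whose TSS is closest to the peak midpoint.

    Returns None if no TSS on the same chromosome lies within max_dist bp.
    The 3000 bp default is an assumption for illustration only.
    """
    midpoint = (peak.start + peak.end) // 2
    best_name: Optional[str] = None
    best_dist = max_dist + 1
    for gene in genes:
        if gene.chrom != peak.chrom:
            continue
        dist = abs(gene.tss - midpoint)
        if dist < best_dist:
            best_name, best_dist = gene.name, dist
    return best_name


# Hypothetical genes and peaks, not real annotation or ChIP-seq data.
genes = [Gene("GENE_A", "chr1", 10_000), Gene("GENE_B", "chr1", 50_000)]
near_peak = Peak("chr1", 9_800, 10_400)
far_peak = Peak("chr1", 30_000, 30_200)

print(nearest_gene(near_peak, genes))
print(nearest_gene(far_peak, genes))
```

The first peak sits within the cut-off of one TSS and is assigned to that gene; the second is too far from any TSS and is left unassigned. A dedicated annotation tool also distinguishes promoters, exons, introns and intergenic regions, which this sketch ignores.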
Results/Conclusion
We identified 66 RP genes bound by GATA1 and confirmed that these genes were actively transcribed in erythroblasts, demonstrating that GATA1 acts upstream to regulate these genes. In total, 558 GATA1 motifs were identified in RP genes and their regulatory regions; each motif, extended by 30 bp on either side, will be targeted for sequencing. This methodology could be used in future to investigate the regulatory regions of genes implicated in other rare diseases.
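The 30 bp extension on either side of each motif can be sketched as a small interval operation. The Python example below pads each motif hit by 30 bp and then merges windows that overlap, so that closely spaced motifs yield a single target. The merging step, the 0-based BED-style coordinates and the example hits are assumptions made for illustration; the abstract does not state how overlapping windows were handled.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), BED-style coordinates


def pad_and_merge(hits: Iterable[Interval], pad: int = 30) -> List[Interval]:
    """Extend each motif hit by `pad` bp on both sides, then merge overlaps.

    Start positions are clamped at 0 so padding never runs off the
    beginning of a chromosome.
    """
    padded = sorted((chrom, max(0, start - pad), end + pad) for chrom, start, end in hits)
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous window on the same chromosome: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical motif hits, not results from the study.
motif_hits = [
    ("chr1", 100, 106),
    ("chr1", 150, 156),
    ("chr1", 500, 506),
    ("chr2", 10, 16),
]

targets = pad_and_merge(motif_hits)
for chrom, start, end in targets:
    print(chrom, start, end)
```

Here the first two hits on chr1 are 44 bp apart, so their padded windows overlap and collapse into one target, while the third hit remains separate. Merging keeps the number of primer pairs down, at the cost of longer amplicons where motifs cluster.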