Edition November 2023.
Prokaryotic microbes - Assembly, annotation, pangenome analysis and GWAS
This practical is part of the Microbial Genomics practical course of Utrecht University.
- Working with genome assemblies, QC and annotation of genomes
- Pangenome analysis, genomic phylogenetic trees and GWAS
Data
We will use pneumococcal genomes from patients who have experienced invasive pneumococcal disease. The dataset is described in the following papers: https://www.ncbi.nlm.nih.gov/pubmed/29788414 and https://www.ncbi.nlm.nih.gov/pubmed/28096486
In these lessons you will learn how to assemble genomes, check assemblies for problems, annotate genomes and work with the annotations, and determine the pangenome and the core genome. Finally, you will associate gene presence/absence with phenotypes using contingency testing (“GWAS”).
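The contingency testing mentioned above can be illustrated with a small sketch. It assumes a single gene, a binary phenotype and a two-sided Fisher's exact test on the resulting 2x2 table. The tools used in the practical may work differently, and the function name and the counts below are hypothetical, chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: gene present / gene absent.
    Columns: phenotype yes / phenotype no.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = [prob(x) for x in range(lowest, highest + 1)]
    p_observed = prob(a)
    # Sum the probabilities of all tables that are no more likely than
    # the observed one (small tolerance for floating-point ties).
    p_value = sum(p for p in probs if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts, not taken from the course dataset:
# gene present: 3 patients died, 1 survived
# gene absent:  1 patient died,  3 survived
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p-value: {p:.4f}")
```

In a pangenome-wide analysis such a test is repeated for every gene, which is why correction for multiple testing and for population structure matters.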
Setup | Download files required for the lesson | ||
Day 1 | 09:00 | 1. Welcome | |
09:15 | 2. Introduction |
Where does the dataset come from?
How do you log in? Where are the files located? |
|
09:45 | 3. Sequence Read Quality Lecture |
How does sequencing work?
Where do the errors come from? |
|
10:15 | 4. Morning break | ||
10:30 | 5. Sequence assembly |
How can the information in the sequencing reads be reduced?
What are the different methods for assembly? |
|
11:40 | 6. Sequence Assembly Lecture |
How can the information in the sequencing reads be reduced?
What are the different methods for assembly? How can we assess the quality of an assembly? |
|
12:40 | 7. Lunch break | ||
13:25 | 8. Sequence Quality |
What is the N50?
What are single-copy chromosomal marker genes? |
|
14:25 | 9. Inspecting sequence graphs | Can we find out which scaffolds or contigs are connected? | |
15:55 | 10. Afternoon break | ||
16:10 | Finish | ||
Day 2 | 09:00 | 11. Introduction day 2 | |
09:30 | 12. Annotation | How are proteins predicted from a DNA sequence? | |
10:30 | 13. Morning break | ||
10:45 | 14. Bacterial GWAS Lecture |
Can we associate the presence of genes with phenotypes?
What is population structure correction? How do we deal with false positives? |
|
11:45 | 15. Pangenome analysis | How to determine a pangenome from a collection of isolate genome sequences? | |
12:45 | 16. Lunch break | ||
13:30 | 17. Phylogenetic trees from the core genome |
Which is better, a gene presence/absence tree or a tree from core genes/proteins?
Is there a specific clone associated with patient mortality? |
|
14:30 | 18. Bacterial GWAS | Which genes are associated with patient mortality? | |
16:00 | 19. Wrap-up | Do your findings match the published data? | |
16:20 | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.