Microbial Genomics course 2022

Edition November 2023.

Prokaryotic microbes - Assembly, annotation, pangenome analysis and GWAS

This practical is part of the Microbial Genomics practical course at Utrecht University. It covers two parts:

  1. Working with genome assemblies, QC and annotation of genomes
  2. Pangenome analysis, genomic phylogenetic trees and GWAS

Data

We will use pneumococcal genomes from patients who experienced invasive pneumococcal disease. The dataset is described in the following papers: https://www.ncbi.nlm.nih.gov/pubmed/29788414 and https://www.ncbi.nlm.nih.gov/pubmed/28096486.

In these lessons you will learn how to assemble genomes, check assemblies for problems, annotate genomes and work with the resulting annotations. You will then determine the pangenome and the core genome of the collection, and finally associate gene presence/absence with phenotypes using contingency testing (“GWAS”).
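To make the contingency-testing idea concrete, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of gene presence/absence against a binary phenotype. It is an illustration only and not necessarily the method or tool used in the practical: the function name `fisher_exact_two_sided` is hypothetical, the counts in the example are invented rather than taken from the course dataset, and the sketch ignores population structure correction and multiple-testing correction.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: isolates with the gene,    phenotype positive (e.g. patient died)
    b: isolates with the gene,    phenotype negative
    c: isolates without the gene, phenotype positive
    d: isolates without the gene, phenotype negative
    """
    row1 = a + b          # isolates carrying the gene
    col1 = a + c          # isolates with the positive phenotype
    n = a + b + c + d     # all isolates
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # given that the row and column totals are fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    p_observed = prob(a)

    # Sum the probabilities of all tables that are no more likely than
    # the observed one; the small tolerance guards against rounding.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Invented counts for one hypothetical gene:
# gene present: 8 deceased, 2 survived; gene absent: 1 deceased, 5 survived.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}")
```

In a real analysis this test would be repeated for every gene in the presence/absence matrix, so the resulting p-values need correction for multiple testing before any gene is called associated with the phenotype.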

Schedule

Setup: Download files required for the lesson

Day 1

  09:00   1. Welcome
  09:15   2. Introduction
             Where does the dataset come from?
             How to log in?
             Where are the files located?
  09:45   3. Sequence Read Quality Lecture
             How does sequencing work?
             Where do the errors come from?
  10:15   4. Morning break
  10:30   5. Sequence assembly
             How can the information in the sequencing reads be reduced?
             What are the different methods for assembly?
  11:40   6. Sequence Assembly Lecture
             How can the information in the sequencing reads be reduced?
             What are the different methods for assembly?
             How can we assess the quality of an assembly?
  12:40   7. Lunch break
  13:25   8. Sequence Quality
             What is the N50?
             What are single-copy chromosomal marker genes?
  14:25   9. Inspecting sequence graphs
             Can we find out which scaffolds or contigs are connected?
  15:55  10. Afternoon break
  16:10      Finish

Day 2

  09:00  11. Introduction day 2
  09:30  12. Annotation
             How are proteins predicted from a DNA sequence?
  10:30  13. Morning break
  10:45  14. Bacterial GWAS Lecture
             Can we associate the presence of genes with phenotypes?
             What is population structure correction?
             How do we deal with false positives?
  11:45  15. Pangenome analysis
             How to determine a pangenome from a collection of isolate genome sequences?
  12:45  16. Lunch break
  13:30  17. Phylogenetic trees from the core genome
             Which is better: a gene presence/absence tree or a tree from core genes/proteins?
             Is there a specific clone associated with patient mortality?
  14:30  18. Bacterial GWAS
             Which genes are associated with patient mortality?
  16:00  19. Wrap-up
             Do your findings match the published data?
  16:20      Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.