Microbial Genomics course 2019

The lessons of the first two days use the Software Carpentry and Data Carpentry lesson template and are based on the Data Carpentry Genomics lesson.

Schedule week 1: Prokaryotic microbes (bacteria)

Days 1-2 are taught by Anita Schuerch. Days 3-4 are taught by Aldert Zomer and Bas Dutilh.

  1. Tuesday, 3 April: Intro to data and the command line, shell scripting, quality control, workflows
  2. Wednesday, 4 April: Workflows continued, genome visualization, metadata manipulation with R
  3. Thursday, 5 April: Trimming, QC, assembly, assembly lecture
  4. Friday, 6 April: Annotation, pangenome analysis, association scoring, wrap-up

Schedule week 2: Eukaryotic microbes (fungi)

Days 5-8 are taught by Jerome Collemare, Ronnie de Jonge, and Robin Ohm.

  5. Monday, 9 April: Assembly
  6. Tuesday, 10 April: RNAseq and gene prediction
  7. Wednesday, 11 April: Functional annotation
  8. Thursday, 12 April: Comparative genomics
  9. Friday, 13 April: Exam

Data

In days 1-2 we will use data from a long-term evolution experiment published in 2012: "Genomic analysis of a key innovation in an experimental Escherichia coli population" by Blount ZD, Barrick JE, Davidson CJ, and Lenski RE (doi: 10.1038/nature11514).

We are using this collaborative document.

The etherpad of Day 1 is archived here. The etherpad of Day 2 is archived here.

In week 2 we will use publicly available genome and RNA data from the organism Zymoseptoria tritici and related fungi. The outline and instructions (to be made available at the start of week 2) can be found here: https://drive.google.com/drive/folders/1_iTtzbKqk7j5uzMeN1R_lFgPGuF6KURl?usp=sharing

Schedule

Setup: Download files required for the lesson

Day 1

09:00  1. Introduction
       What are the goals of this practical training?
       What data do we work on?
09:10  2. Logging onto Cloud
       How do I connect to an AWS instance?
09:20  3. Introducing the Shell
       What is a command shell and why would I use one?
       How can I move around on my computer?
       How can I see what files and directories I have?
       How can I specify the location of a file or directory on my computer?
09:50  4. Navigating Files and Directories
       How can I perform operations on files outside of my working directory?
       What are some navigational shortcuts I can use to make my work more efficient?
10:40  5. Working with Files and Directories
       How can I view and search file contents?
       How can I create, copy and delete files and directories?
       How can I control who has permission to modify a file?
       How can I repeat recently used commands?
11:25  6. Redirection
       How can I search within files?
       How can I combine existing commands to do new things?
12:10  7. Writing Scripts
       How can we automate a commonly used set of commands?
12:50  8. Project Organization
       How can I organize my file system for a new bioinformatics project?
       How can I document my work?
13:20  9. Assessing Read Quality
       How can I describe the quality of my data?
14:10  10. Trimming and Filtering
       How can I get rid of sequence data that doesn’t meet my quality standards?
15:05  11. Variant Calling Workflow
       How do I find sequence variants between my samples and a reference genome?
16:05  Finish

Day 2

09:00  12. Automating a Variant Calling Workflow
       How can I make my workflow more efficient and less error-prone?
10:15  13. R for microbial genomics
       How can I use R to manipulate and visualize microbial genomic data?
17:00  Finish

Day 3

09:00  14. Introduction Day 3
       Where does the dataset come from?
       How do I log in?
       Where are the files located?
09:50  15. Sequence assembly
       How can the information in the sequencing reads be reduced?
       What are the different methods for assembly?
10:50  16. Sequence Quality
       What is the N50?
       What are the different methods for assembly?
11:20  17. Inspecting sequence graphs
       Can we find out which scaffolds or contigs are connected?
12:20  Finish

Day 4

09:00  18. Annotation
       How are proteins predicted from a DNA sequence?
10:00  19. Pangenome analysis
       How can a pangenome be determined from a collection of isolate genome sequences?
11:00  20. Phylogenetic trees from the core genome
       Which is better: a gene presence/absence tree or a tree from core genes/proteins?
       Is there a specific clone associated with patient mortality?
12:00  21. Bacterial GWAS
       Which genes are associated with patient mortality?
12:40  22. Wrap-up Days 3 and 4
       Do your findings match the published data?
13:00  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.