Introduction Day3
Overview
Teaching: 20 min
Exercises: 30 min
Questions
Where does the dataset come from?
How do I log in?
Where are the files located?
Objectives
Understand the data
Choose login details
Familiarize yourself with the environment
Introduction
We will use a shared Google Doc: https://docs.google.com/document/d/19RkKurPgTVAOrzZINQBKUjb2CkMRrsxjejERBvUJCq0/edit. The Google Doc contains the link to the shared documents, the IP address of the server, etc.
Dataset
An introduction to the dataset is given here: https://www.dropbox.com/s/hi7ivyml0vle2k1/Bacterial%20GWAS%20DEC%202018.pptx?dl=0. In total, we will analyze 62 genomes, one of which is a closed reference genome (OXC141).
How to login
Follow the instructions at https://aschuerch.github.io/Microbial-Genomics-2018/02-intro-cloud/index.html. Use the hostname listed in the Etherpad. We will be using a shared server because the tools used in this part of the course require more computing power than a small, single Amazon instance can provide. As a consequence, everyone has their own username and password (see the Google Docs file).
Where are the files located
In your home folder (~/), you will find the directory “reads”, which contains symlinks to the read files used in this study. Because assembling all the genomes in this study would be too time-consuming, each person will assemble only two genomes. We will combine everyone's outputs later on.
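To check what the “reads” directory contains, a small shell helper along the following lines may be useful. This is only a sketch: the function name list_reads is made up for illustration, and it assumes the directory is ~/reads unless another path is given. It prints each symlink together with the file it points to.

```shell
# Hypothetical helper (not part of the course material):
# print every symlink in a directory and the target it points to.
list_reads() {
    # Default to ~/reads, as described above; another path can be passed as $1.
    dir="${1:-$HOME/reads}"
    for f in "$dir"/*; do
        # Report symbolic links only; regular files and an unmatched glob are skipped.
        if [ -L "$f" ]; then
            printf '%s -> %s\n' "$(basename "$f")" "$(readlink "$f")"
        fi
    done
}

# Show the links in your own reads directory (prints nothing if it is absent).
list_reads "$HOME/reads"
```

Running ls -l ~/reads gives similar information. The helper just reduces the output to the link names and their targets.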
Key Points
Sequencing S. pneumoniae patient isolates to determine associations of bacterial genes with disease severity