introduction_to_gwas
Material for the Course “Introduction to genome-wide association studies (GWAS)”
Instructors: Filippo Biscarini, Oscar Gonzalez-Recio, Christian Werner
This course will introduce students, researchers and professionals to the steps needed to build an analysis pipeline for Genome-Wide Association Studies (GWAS). The course will describe all the steps involved in a typical GWAS, which will then be used to build a reusable and reproducible bioinformatics pipeline.
Each day the course will start at 14:00 and end at 20:00 (CET).
As a general rule, we’ll have a longer break (30 minutes) at 16:00 and two shorter breaks (10-15 minutes) later on during the day (to be decided flexibly depending on the sessions).
Day 1
- Lecture 0 General Introduction / Overview of the Course [Filippo, Oscar, Christian]
- [General Introduction]
- [GWAS Workflow (short)]
- Lecture 1 GWAS Overview: Case Studies / Examples from Literature [Oscar]
- [GWAS Overview]
- Lecture 2 Introduction to GWAS: Linkage Disequilibrium and Linear Regression [Oscar]
- Lecture 3 - Basic Linux and the Shell [Christian]
- Lab 1 - Practicalities and Set-up (Server, GitHub repo, R, etc.) and Description of Datasets [Christian]
- [Description of Datasets]
- [Unix Cheatsheet]
- [Course Manual]
- [GWAS Workflow]
Day 2
- Lab 2 (Demonstration) GWAS: Basic Models (Linear and Logistic Regression) [Oscar]
- Lecture 4 Data Types & Formats [Christian]
- [Common Data Types and Formats]
- Lecture 5 Initial Data Analysis, Exploratory Data Analysis and Data Pre-Processing [Christian]
- [IDA, EDA & Data Pre-Processing]
- Lab 3 EDA & IDA [Christian]
- Lab 4 Data Pre-Processing [Christian]
- Lecture 6 The Multiple Testing Issue [Oscar]
- Lecture 7 Statistical Power, Population Stratification and Experimental Design [Oscar]
Day 3
- Lecture 8 Imputation of Missing Genotypes [Christian]
- Lab 5 Imputation of Missing Genotypes using Beagle [Christian]
- Lecture 9 KNN Imputation
- Lab 6 (Demonstration) KNN Imputation [Filippo]
- Lab 7 GWAS: The Stand-Alone Script(s) for the Full Model [Filippo]
- Lab 8 Revising the Steps involved in GWAS [Filippo]
- Brief Intermission:
- Lab 9 Introducing the Exercise [Filippo]
- Collaborative Exercise: let’s build our own GWAS workflow on new data [Filippo, Oscar, Christian]
- Part 1: Individual/Group Break-Out Sessions to give it a try independently
- Part 2: Whole-Group Revision of the Exercise
Day 4
- Lecture 10 Bioinformatics Pipelines: a super-elementary Introduction [Filippo]
- [A bioinformatics pipeline for GWAS]
- Lab 10 Building a Pipeline with Snakemake [Filippo]
- Lab 11 The GWAS pipeline for Continuous Phenotypes [Filippo]
- Plug-In for Mean or KNN Imputation
- The GWAS pipeline for Binary Phenotypes (Guided Exercise) [Filippo]
- Q&A on building Pipelines for GWAS [Filippo, Oscar, Christian]
- Lecture 11 A light Touch on Post-GWAS Analysis: Inferring Functionality [Oscar]
Day 5
- Lecture 12 GWAS Model Extensions [Filippo]
- Lecture 13 A Glimpse at ROH-based Alternatives [Filippo, optional]
- [ROH-based and Resampling Methods as alternative Approaches]
- [Other gene actions]
- Kahoot Quiz on what we learned about GWAS! [Filippo, Oscar, Christian]
- Conclusions and Wrap-Up Discussion on GWAS [Filippo, Oscar, Christian]
Organization of the Code for the practical Sessions
- preparatory_steps: download and prepare the data
- preprocessing: filter the data
- imputation: imputing missing genotypes
- gwas: run the GWAS models
- power_and_significance: designing GWAS experiments
- steps: identifying the individual steps involved in a GWAS
- pipeline: assembling the individual steps into a bioinformatics pipeline for GWAS
- collaborative exercise: trying out what we learnt on new data
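
To give a feel for how the preprocessing, imputation and gwas steps listed above fit together, here is a minimal, self-contained Python sketch on simulated data. It is not the course code: all function names, the thresholds (call rate 0.95, MAF 0.05) and the simulated dataset are hypothetical and chosen for illustration only. The association test is a simple per-SNP linear regression whose p-value uses a normal approximation rather than the exact t-distribution, and significance is assessed with a Bonferroni correction.

```python
import math
import random
from statistics import NormalDist


def call_rate(genos):
    """Fraction of non-missing genotypes (missing values are None)."""
    return sum(g is not None for g in genos) / len(genos)


def minor_allele_freq(genos):
    """Minor allele frequency from 0/1/2 allele counts, ignoring missing values."""
    observed = [g for g in genos if g is not None]
    p = sum(observed) / (2 * len(observed))
    return min(p, 1 - p)


def preprocess(snps, min_call_rate=0.95, min_maf=0.05):
    """Keep SNPs that pass both filters (thresholds are illustrative assumptions)."""
    return {
        name: g
        for name, g in snps.items()
        if call_rate(g) >= min_call_rate and minor_allele_freq(g) >= min_maf
    }


def impute_mean(genos):
    """Replace each missing genotype with the mean of the observed ones."""
    observed = [g for g in genos if g is not None]
    avg = sum(observed) / len(observed)
    return [avg if g is None else g for g in genos]


def single_snp_regression(genos, pheno):
    """Regress phenotype on genotype; return (slope, two-sided p-value).

    The p-value uses a normal approximation to the t-statistic, which is
    reasonable for large samples only.
    """
    n = len(pheno)
    g_bar = sum(genos) / n
    y_bar = sum(pheno) / n
    sxx = sum((g - g_bar) ** 2 for g in genos)
    sxy = sum((g - g_bar) * (y - y_bar) for g, y in zip(genos, pheno))
    syy = sum((y - y_bar) ** 2 for y in pheno)
    beta = sxy / sxx
    residual_ss = syy - beta * sxy
    se = math.sqrt(residual_ss / (n - 2) / sxx)
    z = beta / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return beta, p_value


# --- Simulated data (hypothetical): 200 individuals, 20 SNPs, snp0 is causal ---
random.seed(42)
n = 200
true_genos = {
    f"snp{j}": [sum(random.random() < 0.3 for _ in range(2)) for _ in range(n)]
    for j in range(20)
}
pheno = [1.0 * g + random.gauss(0, 1) for g in true_genos["snp0"]]

# Mask a few genotypes per SNP as missing (4 per SNP, i.e. call rate 0.98)
snps = {
    name: [None if (i + j) % 50 == 0 else g for i, g in enumerate(genos)]
    for j, (name, genos) in enumerate(true_genos.items())
}
snps["mono"] = [0] * n  # monomorphic SNP: should fail the MAF filter
snps["sparse"] = [None if i % 2 == 0 else g for i, g in enumerate(true_genos["snp1"])]

# --- The steps: preprocessing -> imputation -> gwas -> multiple-testing threshold ---
kept = preprocess(snps)
imputed = {name: impute_mean(g) for name, g in kept.items()}
results = {name: single_snp_regression(g, pheno) for name, g in imputed.items()}
threshold = 0.05 / len(results)  # Bonferroni correction
hits = [name for name, (_, p) in results.items() if p < threshold]

print(f"SNPs kept after filtering: {len(kept)} of {len(snps)}")
print(f"Bonferroni threshold: {threshold:.4g}")
print(f"Significant SNPs: {hits}")
```

Each function corresponds to one step, which is the property that makes it straightforward to later wrap the steps as separate rules in a workflow manager such as Snakemake.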