Course unit details:
Bioinformatics, Interpretation, Statistics and Data Quality Assurance
Unit code | BIOL67981 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Semester 1 |
Offered by | School of Biological Sciences |
Available as a free choice unit? | No |
Overview
Genetics/Genomics
- Introduction to the history and scope of genomics
- The Genome Landscape
- Nucleic Acid structure and function, including the structure and function of coding and non-coding DNA
- The central dogma: From DNA, to RNA and proteins
- Noncoding regulatory sequence: promoters, transcription factor binding sites, splice site dinucleotides, enhancers, insulators
- Genetic variation and its role in health and disease
- Genomic technology and role of the genome in the development and treatment of disease
Sequencing
- Types of sequencing, applications and limitations; Sanger versus short read
- Analysis, annotation and interpretation
- Panel versus exome versus whole genome resequencing
- Aligning genome data to a reference sequence using up-to-date alignment programs (e.g. BWA)
Statistics
- Basic statistics applied to clinical genetics/genomics
- Hardy-Weinberg equilibrium, Bayes' theorem, risks in pedigrees
- Assessment of data quality through application of quality control measures
- How to determine the analytical sensitivity and specificity of genomic tests
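As an illustration of the kind of calculation covered under the Statistics topics above, the following is a minimal sketch rather than course material. The function names and all numbers are hypothetical, chosen only to show Hardy-Weinberg genotype frequencies, a Bayesian update of carrier risk in a pedigree, and the sensitivity and specificity of a test.

```python
def hardy_weinberg(q):
    """Expected genotype frequencies (AA, Aa, aa) for allele frequency q of allele a."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


def bayes_posterior(prior, p_obs_if_true, p_obs_if_false):
    """Posterior probability of a hypothesis after one observation (Bayes' theorem)."""
    joint_true = prior * p_obs_if_true
    joint_false = (1.0 - prior) * p_obs_if_false
    return joint_true / (joint_true + joint_false)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical recessive condition with disease allele frequency q = 0.02:
# the carrier frequency is the heterozygote term 2pq.
_, carrier_freq, affected_freq = hardy_weinberg(0.02)

# Hypothetical X-linked pedigree: prior carrier risk 1/2, two unaffected sons.
# P(two unaffected sons | carrier) = 1/4, P(two unaffected sons | non-carrier) = 1.
posterior_carrier = bayes_posterior(0.5, 0.25, 1.0)

# Hypothetical validation counts for a genomic test.
sens, spec = sensitivity_specificity(tp=95, fn=5, tn=990, fp=10)
```

In the pedigree example the two unaffected sons lower the carrier risk from the prior of 1/2, which is the typical pattern of conditional evidence in risk calculations.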
Bioinformatic Fundamentals
- Introduction to the history and scope of bioinformatics
- Primary biological sequence resources, including INSDC (GenBank, EMBL, DDBJ) and UniProt (Swiss-Prot and TrEMBL)
- Genome browsers and interfaces; including Ensembl, UCSC Genome Browser, Entrez
- Similarity/homology, theory of sequence analysis, scoring matrices, dynamic programming methods for pairwise alignment (e.g. Smith-Waterman, Needleman-Wunsch), heuristic search tools (BLAST, BLAT), and multiple sequence alignment (e.g. ClustalW, T-Coffee, MUSCLE)
- Feature identification including SNP analysis and transcription factor binding sites and their associated TF binding sequence motifs
- Ontologies – in particular GO, Human Phenotype Ontology (HPO)
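To make the dynamic programming idea behind global pairwise alignment concrete, here is a minimal sketch of Needleman-Wunsch scoring. It is an illustration only: the scoring scheme (match +1, mismatch -1, linear gap penalty -1) is an assumption chosen for simplicity, not a substitution matrix used in practice, and the function returns only the optimal score, not the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b (linear gap penalty)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


identical = needleman_wunsch_score("ACGT", "ACGT")
one_deletion = needleman_wunsch_score("ACGT", "AGT")
```

Keeping only two rows of the matrix is enough for the score. Recovering the alignment itself would require storing the full matrix and tracing back from the final cell.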
Clinical application of bioinformatics
Introduction to the clinical application of bioinformatic resources, including their role and use in a medical context in molecular genetics, cytogenetics and next generation sequencing for data manipulation and analysis, and in genotyping microarrays (also used to predict CNVs).
Use of tools to call sequence variants (e.g. GATK) and annotation of variant call format (VCF) files using established databases. Variant filtering strategies in the context of clinical data, using publicly available control data sets. Use of multiple database sources, in silico tools and literature for pathogenicity evaluation, and familiarity with the statistical programs that support this.
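A filtering strategy of this kind can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a clinical pipeline: it assumes biallelic records, it assumes an earlier annotation step has written a control-population allele frequency into a hypothetical INFO key named `CONTROL_AF`, and the thresholds and example records are invented.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=40;CONTROL_AF=0.001' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def filter_rare_variants(vcf_lines, max_af=0.01, min_qual=30.0):
    """Keep PASS variants with adequate QUAL that are rare or absent in controls."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
        if flt != "PASS" or qual == "." or float(qual) < min_qual:
            continue
        af_text = parse_info(info).get("CONTROL_AF")  # hypothetical annotation key
        control_af = float(af_text) if af_text else 0.0  # absent means not seen in controls
        if control_af <= max_af:
            kept.append((chrom, int(pos), ref, alt))
    return kept


# Invented example records (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\tCONTROL_AF=0.0005",
    "1\t2000\t.\tC\tT\t60\tPASS\tCONTROL_AF=0.25",
    "2\t3000\t.\tG\tA\t10\tPASS\tCONTROL_AF=0.0001",
    "2\t4000\t.\tT\tC\t99\tLowQual\t.",
    "3\t5000\t.\tG\tC\t45\tPASS\tDP=30",
]
candidates = filter_rare_variants(example)
```

In practice the frequency threshold depends on the inheritance model and disease prevalence, and the surviving variants would still need pathogenicity evaluation against databases and the literature.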
Background and application of specialist databases and browsers:
- dbSNP, DECIPHER, Orphanet, DMuDB / NGRL Universal Browser, ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/intro/), OMIM, ECARUCA, DGV, ExAC, NHLBI-GO
- LOVD/UMD database software and scientific literature
- HGMD
- Specific clinical analysis software
- CNV analysis
- Gene Prioritisation (e.g. ToppGene, Endeavour, GeCCO)
- Missense analysis (e.g. Align GVGD, SIFT, PolyPhen, Panther, PhD-SNP, MAPP)
- Splicing analysis applications (e.g. GeneSplicer, MaxEntScan, NNSplice, SSFL, HSF, NetGene2)
- Commercially available software (e.g. NextGENe, Alamut, Cartegenia)
- Capture and representation of phenotype data
- Development of a simple application for clinical bioinformatic use
Aims
By the end of this compulsory module the student will be able to:
1. Analyse the principles applied to quality control of sequencing data, alignment of sequence to the reference genome, calling and annotating sequence variants, and filtering strategies to identify pathogenic mutations in sequencing data.
2. Interrogate major data sources of genomic sequence, protein sequence, variation and pathways (e.g. EVS, dbSNP, ClinVar) and integrate them with clinical data to assess the pathogenic and clinical significance of the genome result.
3. Acquire relevant basic computational skills and an understanding of statistical methods for handling and analysing sequencing data for application in both diagnostic and research settings.
4. Gain practical experience of the bioinformatics pipeline through the Genomics England programme.
5. Justify and defend the place of Professional Best Practice Guidelines in the diagnostic setting for the reporting of genomic variation.
Teaching and learning methods
This unit is delivered entirely via distance learning, including assessment. The course runs over 10 weeks, with a nominal 15h/week of student work.
Each week consists of:
- An overview of the material, presenting the learning objectives for the week.
- Explanatory material (~3h of student activity/week) in the form of video lectures, papers/articles, the course text, and links to further resources.
- Exercises (~4h/week). These are formative, with feedback for them given in the tutorial or via discussion boards.
- Discussion (~2h/week). Students are encouraged to discuss the exercises and material in the forums where tutors will facilitate peer learning, providing feedback/input where necessary.
- Formative Questionnaire. This is to gather students’ questions and highlight misconceptions ready for the tutorial.
- Tutorial (~6h/unit). Students will video conference with their tutor to discuss the material and to give and receive feedback.
There is also private study of ~5h/week consisting of:
- Revision
- Coursework
- Further practice (after the tutorials)
- Independent/further study
The times specified will vary greatly from week to week. For example, there might be no private study in the first week because no assessments have been set, but much more in week 9, when students will need to prepare for their presentation and coursework submission.
Knowledge and understanding
1. Critically evaluate the governance and ethical frameworks in place within the NHS and how they apply to bioinformatics.
2. Discuss and justify the importance of standards, best practice guidelines and standard operating procedures: how they are developed, improved and applied to clinical bioinformatics.
3. Describe the structure of DNA and the functions of coding and non-coding DNA.
4. Discuss the flow of information from DNA to RNA to protein in the cell.
5. Describe transcription of DNA to mRNA and the protein synthesis process.
6. Discuss the role of polymorphisms in Mendelian and complex disorders and give examples of polymorphisms involved in genetic disease.
7. Describe appropriate bioinformatics databases capturing information on DNA, RNA and protein sequences.
8. Explain the theory of sequence analysis and the use of genome analysis tools.
9. Describe secondary databases in bioinformatics and their use in generating metadata on gene function.
10. Explain fundamental bioinformatic principles, including the scope and aims of bioinformatics and its development.
11. Explain fundamental genomic principles, including the scope and aims of genomics and its development.
12. Discover resources linking polymorphism to disease processes and discuss and evaluate the resources that are available to the bioinformatician and how these are categorised.
13. Discuss metadata and how it is captured in bioinformatics resources.
14. Interpret the metadata provided by the major bioinformatics resources.
15. Describe the use of ontologies in metadata capture and give examples of the use of ontologies for capturing information on gene function and phenotype.
16. Identify appropriate references where published data are to be reported.
17. Describe the biological background to diagnostic genetic testing and clinical genetics, and the role of bioinformatics.
18. Describe the partnership of Clinical Bioinformatics and Genetics to other clinical specialisms in the investigation and management of genetic disorders and the contribution to safe and effective patient care.
Intellectual skills
1. Critically analyse scientific and clinical data
2. Present scientific and clinical data appropriately
3. Formulate a critical argument
4. Evaluate scientific and clinical literature
5. Critically evaluate the knowledge of clinical bioinformatics to address specific clinical problems
Practical skills
1. Present information clearly in the form of verbal and written reports.
2. Communicate complex ideas and arguments in a clear, concise and effective manner.
3. Work effectively as an individual or part of a team.
4. Use conventional and electronic resources to collect, select and organise complex scientific information
5. Perform analysis on DNA data and protein sequence data to infer function.
6. Perform sequence alignment tasks.
7. Select and apply appropriate bioinformatic tools and resources from a core subset to typical diagnostic laboratory cases, contextualised to the scope and practice of a clinical genetics laboratory.
8. Compare major bioinformatics resources for clinical diagnostics, and how their results can be summarised and integrated with other lines of evidence to produce clinically valid reports.
9. Interpret evidence from bioinformatic tools and resources and integrate this into the sum of genetic information for the interpretation and reporting of test results from patients.
10. Record the build or version numbers of resources used on a given date, including those of linked data sources, and understand the clinical relevance of this data.
Transferable skills and personal qualities
1. Present complex ideas in simple terms in both oral and written formats.
2. Consistently operate within sphere of personal competence and level of authority.
3. Manage personal workload and objectives to achieve quality of care.
4. Actively seek accurate and validated information from all available sources.
5. Select and apply appropriate analysis or assessment techniques and tools.
6. Evaluate a wide range of data to assist with judgements and decision making.
7. Interpret data and convert into knowledge for use in the clinical context of individual and groups of patients.
8. Work in partnership with colleagues, other professionals, patients and their carers to maximise patient care.
Assessment methods
Method | Weight |
---|---|
Other | 30% |
Oral assessment/presentation | 70% |
Discussion Board contribution 30%
10-minute individual presentation 70%
Feedback methods
Formative and summative feedback are given.
Recommended reading
This is an example reading list; it will be updated yearly.
Suggested Reading for ‘Introduction to genetics and genomics’ and ‘DNA sequencing’
Molecular Biology/Genetics textbooks – look for the latest edition
- Human Molecular Genetics, Tom Strachan and Andrew Read, Garland Science, Chapters 1, 2 and 13
- New Clinical Genetics, Andrew Read and Dian Donnai, Scion Publishing
- Essential Medical Genetics, JM Connor and MA Ferguson-Smith, Blackwell Science
- Genomes, TA Brown, Bios Scientific Publishers
- Human Genetics and Genomics, Bruce R Korf, Blackwell Publishing
- Instant Notes in Bioinformatics, Hodgman, French and Westhead, Bios (2009)
Journal papers
- What is a gene, post ENCODE? History and updated definition
Gerstein, MB et al (2007) Genome Research 17:p669
- Non-coding RNAs: key regulators of mammalian transcription
Kugel, JF and Goodrich, JA (2012) Trends Biochem Sci 37(4):p144
- Long non-coding RNAs and enhancers
Ørom, UA and Shiekhattar, R (2011) Curr Opin Genet Dev 21(2):p194
- Human genetics and genomics a decade after the release of the draft sequence of the human genome
Naidoo, N et al (2011) Human Genomics 5(6):p577
- Identifying Disease mutations in genomic medicine settings: current challenges and how to accelerate progress
Lyon, GJ and Wang, K (2012), Genome Medicine 4:58
- Implementing genomic medicine in the clinic: the future is here
Manolio T et al (2013) Genetics in Medicine 15(4):p258
Genome Project websites
- The Human Genome Project
UK: http://www.sanger.ac.uk/about/history/hgp/
USA: http://www.genome.gov/10001772
- 1000 Genomes Project
http://www.1000genomes.org/
- 10,000 Genomes Project
http://www.uk10k.org/
- Genomics England
http://www.genomicsengland.co.uk/
Professional Practice Guidelines
- USA: American College of Medical Genetics and Genomics (ACMG)
https://www.acmg.net/
- UK: Association for Clinical Genetic Science
http://www.acgs.uk.com/quality-committee/best-practice-guidelines/
Nomenclature Guidelines
- http://www.hgvs.org/mutnomen/recs.html#general
Study hours
Scheduled activity hours | |
---|---|
Practical classes & workshops | 120 |
Independent study hours | |
---|---|
Independent study | 30 |
Teaching staff
Staff member | Role |
---|---|
Michael Cornell | Unit coordinator |
Angela Davies | Unit coordinator |