
Course unit details:
Bioinformatics, Interpretation, Statistics and Data Quality Assurance
Unit code | BIOL67981 |
---|---|
Credit rating | 15 |
Unit level | FHEQ level 7 – master's degree or fourth year of an integrated master's degree |
Teaching period(s) | Semester 1 |
Available as a free choice unit? | No |
Overview
Genetics/Genomics:
- Introduction to the role of next generation sequencing in clinical diagnosis
- Introduction to the history and scope of genomics
- Nucleic Acid structure and function
- The central dogma: From DNA, to RNA and proteins
- Noncoding regulatory sequence: promoters, transcription factor binding sites, splice site dinucleotides, enhancers
- Types of genetic variation and its role in health and disease
Sequencing:
- Types of sequencing, applications and limitations; Sanger versus short read, and short read versus long read.
- Panel versus exome versus whole genome resequencing
- Overview of the stages of an NGS bioinformatic pipeline, including QC, mapping and variant calling.
- Analysis, annotation and interpretation of whole exome sequence data.
Statistics:
- Assessment of data quality through application of quality control measures
- How to determine the analytical sensitivity and specificity of missense predictions.
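As a rough illustration of the last point, sensitivity and specificity can be computed by comparing tool predictions against variants of known status. The sketch below is a minimal example, not the unit's own method: the function name, the boolean encoding (True = pathogenic) and the example data are all hypothetical.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for paired boolean calls.

    predicted: what a missense predictor called (True = pathogenic).
    truth:     the accepted classification for the same variants.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # true positives
    fn = sum(1 for p, t in pairs if not p and t)      # false negatives
    tn = sum(1 for p, t in pairs if not p and not t)  # true negatives
    fp = sum(1 for p, t in pairs if p and not t)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: six variants, purely for illustration.
calls = [True, True, False, False, True, False]
known = [True, False, False, False, True, True]
sens, spec = sensitivity_specificity(calls, known)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```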
Bioinformatic Fundamentals:
- Introduction to the history and scope of bioinformatics for NGS
- Genome browsers and interfaces; including Ensembl and UCSC Genome Browser.
- Allele frequency and the gnomAD database.
- Clinical variant databases e.g. ClinVar
- Disease-specific databases, e.g. CardioDB and odds ratio calculation.
- Missense effect prediction using a range of tools.
- Feature identification including splice site analysis using recent developments in deep learning based tools.
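The odds ratio calculation mentioned in the list above can be sketched from a 2x2 table of variant carriers and non-carriers in cases and controls. This is a minimal example under stated assumptions: the function and parameter names are hypothetical, the counts are invented, and the confidence interval uses the standard log (Woolf) approximation, which requires every cell to be non-zero.

```python
import math


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower CI bound, upper CI bound).

    z=1.96 gives an approximate 95% confidence interval.
    """
    a, b, c, d = (case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive for this method")
    estimate = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical counts: 10 of 100 cases and 5 of 100 controls carry the variant.
estimate, lower, upper = odds_ratio(10, 90, 5, 95)
print(f"OR={estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that spans 1 means the data do not distinguish enrichment in cases from no effect.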
Clinical application of bioinformatics
Introduction to the clinical application of bioinformatic resources, including their role and use in a medical context in molecular genetics, cytogenetics and next generation sequencing for data manipulation and analysis, and genotyping microarrays (also used to predict CNVs).
Use of tools to call sequence variants, e.g. GATK, and annotation of variant call format (VCF) files using established databases. Filtering strategies for variants, in the context of clinical data, and using publicly available control data sets. Use of multiple database sources, in silico tools and literature for pathogenicity evaluation, and familiarity with the statistical programmes to support this.
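One common filtering strategy is to discard variants that are frequent in control populations. The sketch below is illustrative only: it assumes biallelic VCF records whose INFO column already carries a control allele frequency under the key `AF`, and the function names and the 1% threshold are hypothetical choices. Variants with no `AF` annotation are kept, on the assumption that they are absent from the control data.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'AF=0.002;DP=40' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info


def rare_variants(vcf_lines, max_af=0.01):
    """Return (chrom, pos, ref, alt, af) for PASS variants with AF <= max_af."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, flt, info = fields[:8]
        if flt not in ("PASS", "."):  # drop records that failed a filter
            continue
        af_text = parse_info(info).get("AF")
        # Assumption: biallelic records, so only the first AF value is used.
        af = float(af_text.split(",")[0]) if af_text else None
        if af is None or af <= max_af:
            kept.append((chrom, int(pos), ref, alt, af))
    return kept


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\tAF=0.0002;DP=40",
    "1\t2000\t.\tC\tT\t60\tPASS\tAF=0.25;DP=38",
    "2\t3000\t.\tG\tA\t12\tLowQual\tAF=0.0001;DP=5",
    "2\t4000\t.\tT\tC\t70\tPASS\tDP=44",
]
for variant in rare_variants(example):
    print(variant)
```

In practice a dedicated VCF parser and population-specific frequencies would be preferable; the point here is only the shape of the filter.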
Background and application of specialist databases and browsers:
- dbSNP, DECIPHER, Orphanet, DMuDB / NGRL Universal Browser, ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/intro/), OMIM, ECARUCA, DGV, ExAC, NHLBI-GO
- LOVD/UMD database software and scientific literature
- HGMD
- Specific clinical analysis software
- CNV analysis
- Gene Prioritisation (e.g. ToppGene, Endeavour, GeCCO)
- Missense analysis (e.g. Align GVGD, SIFT, PolyPhen, PANTHER, PhD-SNP, MAPP)
- Splicing analysis applications (e.g. GeneSplicer, MaxEntScan, NNSplice, SSFL, HSF, NetGene2)
- Commercially available software (e.g. NextGENe, Alamut, Cartegenia)
- Capture and representation of phenotype data
- Development of a simple application for clinical bioinformatic use
- Standards and governance
- Data standards and formats
- IUPAC codes, FASTA, GenBank, FASTQ, SAM/BAM/CRAM, VCF
- HGVS variant nomenclature
- HGNC gene nomenclature
- RefSeq/RefSeqGene, LRG
- Role and development of Standard Operating Procedures
- Relevant standards (clinical, genetic, bioinformatic) for data representation and exchange
- Principles of integration of laboratory and clinical information, and place of best-practice guidelines for indicating the clinical significance of results.
Aims
By the end of this compulsory module the student will be able to:
1. Analyse the principles applied to quality control of sequencing data, alignment of sequence to the reference genome, calling and annotating sequence variants, and filtering strategies to identify pathogenic mutations in sequencing data
2. Interrogate major data sources, e.g. of genomic sequence, protein sequences, variation and pathways (e.g. EVS, dbSNP, ClinVar), and be able to integrate these with clinical data to assess the pathogenic and clinical significance of the genome result
3. Acquire relevant basic computational skills and understanding of statistical methods for handling and analysing sequencing data for application in both diagnostic and research settings
4. Gain practical experience of the bioinformatics pipeline
5. Justify and defend the place of Professional Best Practice Guidelines in the diagnostic setting for the reporting of genomic variation.
Teaching and learning methods
Knowledge and understanding
- Discuss and justify the importance of standards, best practice guidelines and standard operating procedures: how they are developed, improved and applied to clinical bioinformatics.
- Describe the structure of DNA and the functions of coding and non-coding DNA.
- Discuss the flow of information from DNA to RNA to protein in the cell.
- Describe transcription of DNA to mRNA and the protein synthesis process.
- Describe appropriate bioinformatics databases capturing information on DNA, RNA and protein sequences.
- Explain the theory of sequence analysis and the use of genome analysis tools.
- Describe secondary databases in bioinformatics
- Explain fundamental bioinformatic principles, including the scope and aims of bioinformatics and its development.
- Discover resources linking polymorphism to disease processes and discuss and evaluate the resources that are available to the bioinformatician and how these are categorised.
- Discuss metadata and how it is captured in bioinformatics resources.
- Interpret the metadata provided by the major bioinformatics resources.
- Identify appropriate references where published data are to be reported.
- Describe the biological background to diagnostic genetic testing and clinical genetics, and the role of bioinformatics.
- Describe the partnership of Clinical Bioinformatics to other clinical specialisms in the investigation and management of genetic disorders and the contribution to safe and effective patient care.
Intellectual skills
- Critically analyse scientific and clinical data
- Present scientific and clinical data appropriately
- Formulate a critical argument
- Evaluate scientific and clinical literature
- Critically evaluate the knowledge of clinical bioinformatics to address specific clinical problems
Practical skills
- Present information clearly in the form of verbal and written reports.
- Communicate complex ideas and arguments in a clear, concise and effective manner.
- Work effectively as an individual or part of a team.
- Use conventional and electronic resources to collect, select and organise complex scientific information
- Perform analysis on DNA data and protein sequence data to infer function.
- Perform sequence alignment tasks.
- Select and apply appropriate bioinformatic tools and resources from a core subset to typical diagnostic laboratory cases, contextualised to the scope and practice of a clinical genetics laboratory.
- Compare major bioinformatics resources for clinical diagnostics, and how their results can be summarised and integrated with other lines of evidence to produce clinically valid reports.
- Interpret evidence from bioinformatic tools and resources and integrate this into the sum of genetic information for the interpretation and reporting of test results from patients.
- Record the build or version numbers of resources used on a given date, including those of linked data sources, and understand the clinical relevance of this information.
Transferable skills and personal qualities
- Present complex ideas in simple terms in both oral and written formats.
- Consistently operate within sphere of personal competence and level of authority.
- Manage personal workload and objectives to achieve quality of care.
- Actively seek accurate and validated information from all available sources.
- Select and apply appropriate analysis or assessment techniques and tools.
- Evaluate a wide range of data to assist with judgements and decision making.
- Interpret data and convert into knowledge for use in the clinical context of individual and groups of patients.
- Work in partnership with colleagues, other professionals, patients and their carers to maximise patient care.
Assessment methods
Method | Weight |
---|---|
Other | 30% |
Written assignment (inc essay) | 70% |
Group presentation 30%
Individual written assignment 70%
Feedback methods
Formative and summative feedback is given.
Recommended reading
This is an example reading list; it will be updated yearly.
Suggested Reading for ‘Introduction to genetics and genomics’ and ‘DNA sequencing’
Molecular Biology/Genetics textbooks – look for the latest edition
- Human Molecular Genetics, Tom Strachan and Andrew Read, Garland Science, Chapters 1, 2 and 13
- New Clinical Genetics, Andrew Read and Dian Donnai, Scion Publishing
- Essential Medical Genetics, JM Connor and MA Ferguson-Smith, Blackwell Science
- Genomes, TA Brown, Bios Scientific Publishers
- Human Genetics and Genomics, Bruce R Korf, Blackwell Publishing
- Instant Notes in Bioinformatics, Hodgman, French and Westhead, Bios (2009)
Journal papers
- What is a gene, post ENCODE? History and updated definition
Gerstein, MB et al (2007) Genome Research 17:p669
- Non-coding RNAs: key regulators of mammalian transcription
Kugel, JF and Goodrich, JA (2012) Trends Biochem Sci 37(4):p144
- Long non-coding RNAs and enhancers
Ørom, UA and Shiekhattar, R (2011) Curr Opin Genet Dev 21(2):p194
- Human genetics and genomics a decade after the release of the draft sequence of the human genome
Naidoo N et al (2011) Human Genomics 5(6):p577
- Identifying Disease mutations in genomic medicine settings: current challenges and how to accelerate progress
Lyon, GJ and Wang, K (2012), Genome Medicine 4:58
- Implementing genomic medicine in the clinic: the future is here
Manolio T et al (2013) Genetics in Medicine 15(4):p258
Genome Project websites
- The Human Genome Project
UK: http://www.sanger.ac.uk/about/history/hgp/
USA: http://www.genome.gov/10001772
- 1000 Genomes Project
http://www.1000genomes.org/
- UK10K (10,000 Genomes) Project
http://www.uk10k.org/
- Genomics England
http://www.genomicsengland.co.uk/
Professional Practice Guidelines
- USA: American College of Medical Genetics and Genomics (ACMG)
https://www.acmg.net/
- UK: Association for Clinical Genetic Science
http://www.acgs.uk.com/quality-committee/best-practice-guidelines/
Nomenclature Guidelines
- http://www.hgvs.org/mutnomen/recs.html#general
Study hours
Scheduled activity hours | |
---|---|
Practical classes & workshops | 30 |
Independent study hours | |
---|---|
Independent study | 120 |
Teaching staff
Staff member | Role |
---|---|
Michael Cornell | Unit coordinator |