[SBI] Structural Bioinformatics

Structural bioinformatics deals with those aspects of structural biology that are best handled by computer. The topics range from databases of known structures and data handling to structure computation, modelling and the manipulation of structural data.

In addition to the lectures, students will complete a number of computer practicals ("pracs"), which are mainly demonstrations of particular features of the software. The software demonstrated will include: PyMol, Chimera, O, VMD, CHARMM, Modeller and the Accelrys software (Discover, Insight and Affinity).

Lecturers

Lecturers involved are Trevor Sewell, David Pugh, Arvind Varsani, Hugh Patterton, Kevin Naidoo, Michelle Kuttel, Muhammed Sayed and Colin Kenyon.


Main Outcomes

The ability to:

  • obtain structural insights from biomolecular databases
  • predict protein secondary structure
  • predict the three-dimensional structure of proteins that have a homologue whose structure is known
  • display proteins and other biomolecules
  • display and interpret experimental structural data
  • dock protein models into experimentally determined maps

This module descriptor can also be downloaded as an MS Word document.

Main Content/Syllabus

Introduction

  • The observed structure of proteins - amino acids, alpha helices, beta sheets, beta bends
  • The forces that shape proteins - van der Waals forces, charge interactions, hydrogen bonds (modelled as charge interactions), the hydrophobic effect

Force field modelling

  • The concept of energy and potential functions - what the physically real ones look like, and the empirical approximations used in practice. The limitations of our models and how we believe they could be improved by moving to quantum mechanics (QM)
  • Energy minimization, molecular dynamics (MD) and Monte Carlo. Normal modes.
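
As a rough illustration of the two points above, the sketch below evaluates an empirical pair potential (a Lennard-Jones 12-6 term plus a Coulomb term) and minimises it by steepest descent along the interatomic distance. The function names and the values of epsilon and sigma are illustrative assumptions, not parameters from CHARMM or any other package, and the Coulomb constant is approximate. Real programs minimise over all atomic coordinates and include bonded terms as well.

```python
# Minimal sketch: empirical pair potential and steepest-descent minimisation.
# epsilon, sigma and all function names are illustrative assumptions.

COULOMB_K = 332.06  # approximate, kcal*Angstrom/(mol*e^2)


def pair_energy(r, epsilon=0.2, sigma=3.4, q1=0.0, q2=0.0):
    """Lennard-Jones 12-6 plus Coulomb energy for two atoms r Angstrom apart."""
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_K * q1 * q2 / r
    return lennard_jones + coulomb


def pair_gradient(r, epsilon=0.2, sigma=3.4, q1=0.0, q2=0.0):
    """Analytical derivative dE/dr of pair_energy."""
    sr6 = (sigma / r) ** 6
    d_lennard_jones = 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r
    d_coulomb = -COULOMB_K * q1 * q2 / (r * r)
    return d_lennard_jones + d_coulomb


def minimise(r0, step=0.1, tol=1e-8, max_iter=100000):
    """Steepest descent: repeatedly step against the gradient until it vanishes."""
    r = r0
    for _ in range(max_iter):
        grad = pair_gradient(r)
        if abs(grad) < tol:
            break
        r -= step * grad
    return r


if __name__ == "__main__":
    r_min = minimise(4.5)
    # For uncharged atoms the analytical minimum lies at 2**(1/6) * sigma.
    print("numerical minimum :", r_min)
    print("analytical minimum:", 2 ** (1 / 6) * 3.4)
    print("energy at minimum :", pair_energy(r_min))
```

For two uncharged atoms the minimum found numerically can be checked against the analytical result, r = 2^(1/6) sigma, where the energy equals -epsilon.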

Empirical prediction

  • Chou-Fasman
  • Psipred
  • PSST's
  • Secondary structure databases, protein classification
  • Structural alignment
  • Fugue
  • GenThreader

Fusion Ideas

  • Modeller
  • The ideas of David Baker

Bringing in Experimental Data

  • The nature of data from NMR, EM and X-ray crystallography
  • Maps and resolution
  • Combining modelling with experimental data
  • Docking
  • Refinement - Refmac, CNS, real-space refinement
  • Matching constraints in NMR
  • Impact on proteomics - protein-protein docking
  • Impact on drug discovery - databases of small molecules - active site docking - cavity fitting - in silico screening

Computational Strategies

Representations and interactive tools

  • PyMol
  • Chimera
  • O
  • VMD

Structural Databases

  • How they are structured
  • How they are accessed
  • Responsibility for their maintenance
  • Concept of a portal
  • Examples of databases of primary and derived data, in particular the PDB
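
To give a feel for how primary data in the PDB are laid out, here is a minimal sketch that reads the fixed-column ATOM/HETATM records of the legacy PDB file format. The function name `parse_atom_line` is a hypothetical helper, and the sample record is built in the code with made-up coordinates rather than taken from a real entry. Only a few of the fields are extracted.

```python
# Minimal sketch: reading one ATOM/HETATM record of the legacy PDB format.
# parse_atom_line is a hypothetical helper; the sample values are made up.


def parse_atom_line(line):
    """Return selected fields of an ATOM/HETATM record, or None for other records."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),         # columns 7-11: atom serial number
        "name": line[12:16].strip(),       # columns 13-16: atom name
        "res_name": line[17:20].strip(),   # columns 18-20: residue name
        "chain": line[21],                 # column 22: chain identifier
        "res_seq": int(line[22:26]),       # columns 23-26: residue number
        "x": float(line[30:38]),           # columns 31-38: x (Angstrom)
        "y": float(line[38:46]),           # columns 39-46: y (Angstrom)
        "z": float(line[46:54]),           # columns 47-54: z (Angstrom)
    }


# Build an illustrative record with the correct column widths.
SAMPLE = "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
    "ATOM", 1, " N", "", "MET", "A", 1, "", 27.340, 24.430, 2.614
)

if __name__ == "__main__":
    print(repr(SAMPLE))
    print(parse_atom_line(SAMPLE))
    print(parse_atom_line("HEADER    EXAMPLE"))
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in a PDB record may touch without any separating space.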

Representation of Structure

  • The nature and design of structural representation tools
  • The capabilities of some implementations e.g. PyMol, Chimera, VMD

Home Department: Molecular and Cell Biology, UCT
Module description (Header): Structural Bioinformatics and Molecular Modelling
Generic module name: Structural Biology
Alpha-numeric code: STB705
Credit Value: 10 Credits
Duration: 3 Weeks
Module Type: P
Level: 8
Prerequisites: None
Co-requisites: None
Prohibited combinations: None
Learning time breakdown (hours):

  • Contact with lecturer/tutor: 55
  • Assignments & tasks: 15
  • Tests & examinations: 0
  • Practicals: 20
  • Self-study: 10
  • Total learning time: 100

Methods of Student Assessment: Students will be required to submit a practical write-up and complete an assignment. Assignments: 100%.

Online Lectures

Arvind Varsani

Colin Kenyon

Hugh Patterton

Mohammed Sayed

Protein Topology (from the European Bioinformatics Institute)

Pymol Tutorial

Links

Databases

  • PDB - Database of 3-D biological macromolecular structure data
  • Swissprot - protein sequence database
  • Prosite - Database of protein families and domains at Expasy
  • SLoop - Database of Super Secondary Fragments
  • Homstrad - Homologous Structure Alignment Database
  • Campass - Cambridge database of Protein Alignments organised as Structural Superfamilies

Fold recognition servers

  • Fugue - FUGUE uses environment-based fold profiles that are created from structural alignments. Gap penalties are environment dependent.
  • 3D-PSSM - Based on sequence profiles, solvation potentials and secondary structure.
  • SAM-T02 - The query is checked against a library of hidden Markov models. This is NOT a threading technique; it is sequence based.
  • GenTHREADER - Combines profiles and sequence-structure alignments. A neural network-based jury system calculates the final score based on solvation and pair potentials.
  • Metaserver - The structure prediction Meta Server offers a gateway to many high-quality fold recognition servers and provides an infrastructure and main interface to several highly reliable consensus methods.

Protein structure & alignment analysis

Homology Modelling Servers

  • MODELLER - Homology or comparative modelling of protein three-dimensional structures by satisfaction of spatial restraints
  • SWISS-MODEL - SWISS-MODEL is an automated comparative modelling server
  • SDSC1 - Protein structure homology modeling server (San Diego, USA)
  • 3D-JIGSAW - Automated system for 3D models for proteins (Cancer Research)
  • WHATIF - WHAT IF Web interface: homology modelling, drug docking, electrostatics calculations, structure validation and visualisation.

Ab initio Modelling

  • RAPPER - Restraint-based protein modelling
  • HMMSTR/Rosetta - Prediction of protein structure from sequence

Predicting protein loops

  • CODA - a combined algorithm for predicting protein loops

Structural validation Tools

  • RAMPAGE - Structural validation by assessment of the Ramachandran plot
  • PROCHECK - Checks the stereochemical quality of a protein structure, producing a number of PostScript plots analysing its overall and residue-by-residue geometry.
  • WHATIF - Protein structure analysis program for mutant prediction, structure verification, molecular graphics
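
A Ramachandran plot of the kind these tools assess is built from the backbone torsion angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). The sketch below shows one common way to compute a torsion angle from four atomic positions. The function names are illustrative assumptions, and this is not how RAMPAGE or PROCHECK are implemented internally.

```python
# Minimal sketch: torsion (dihedral) angle from four points, as needed for
# the phi/psi angles of a Ramachandran plot. Names are illustrative.
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range -180 to 180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / length, b1[1] / length, b1[2] / length)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    s0 = _dot(b0, b1)
    s2 = _dot(b2, b1)
    v = (b0[0] - s0 * b1[0], b0[1] - s0 * b1[1], b0[2] - s0 * b1[2])
    w = (b2[0] - s2 * b1[0], b2[1] - s2 * b1[1], b2[2] - s2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four points with the outer bonds at right angles about the central bond.
    print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

For phi, the four points would be the C atom of residue i-1 followed by N, CA and C of residue i; for psi, N, CA and C of residue i followed by N of residue i+1.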

Molecular visualization tools

Protein-Protein interaction

  • Protein-Protein Interaction Server - The server is a tool to analyse the protein-protein interface of any protein complex. You can submit the coordinates of a protein structure of your choice and then view tables describing the nature of the protein-protein interface.

References

Berman, HM, J Westbrook, Z Feng, G Gilliland, TN Bhat, H Weissig, IN Shindyalov, and PE Bourne. 2000. "The Protein Data Bank." Nucleic Acids Research, 28 (1): 235-242.

Bhat, TN, P Bourne, Z Feng, G Gilliland, S Jain, V Ravichandran, B Schneider, K Schneider, N Thanki, H Weissig, J Westbrook, and HM Berman. 2001. "The PDB data uniformity project." Nucleic Acids Research, 29 (1): 214-218.

Bourne, PE, KJ Addess, WF Bluhm, L Chen, N Deshpande, Z Feng, W Fleri, R Green, JC Merino-Ott, W Townsend-Merino, H Weissig, J Westbrook, and HM Berman. 2004. "The distribution and query systems of the RCSB Protein Data Bank." Nucleic Acids Research, 32 (Database issue).

Bourne, PE and H Weissig. 2005. Structural Bioinformatics. Hoboken, NJ, USA: John Wiley & Sons, Inc., Methods of Biochemical Analysis, Vol 44, 1st ed.

DePristo, MA, PIW de Bakker, and TL Blundell. 2004. "Heterogeneity and inaccuracy in protein structures solved by X-ray crystallography." Structure (London, England: 1993), 12 (5): 831-838.

Greer, DS, JD Westbrook, and PE Bourne. 2002. "An ontology driven architecture for derived representations of macromolecular structure." Bioinformatics (Oxford, England), 18 (9): 1280-1281.

Leach, AR. 2001. Molecular Modelling: Principles and Applications. Harlow, England: Prentice Hall, 2nd ed.

Richmond, TJ and CA Davey. 2003. "The structure of DNA in the nucleosome core." Nature, 423 (6936): 145-150.

Sowadski, JM, LF Epstein, L Lankiewicz, and R Karlsson. 1999. "Conformational diversity of catalytic cores of protein kinases." Pharmacology & Therapeutics, 82 (2–3): 157-164.

Tate, J. 2003. "Molecular Visualisation." In Structural Bioinformatics, 135-158. Hoboken, New Jersey, USA: Wiley-Liss.

Westbrook, JD and PE Bourne. 2000. "STAR/mmCIF: an ontology for macromolecular structure." Bioinformatics (Oxford, England), 16 (2): 159-168.

Westbrook, J, Z Feng, S Jain, TN Bhat, N Thanki, V Ravichandran, GL Gilliland, W Bluhm, H Weissig, DS Greer, PE Bourne, and HM Berman. 2002. "The Protein Data Bank: unifying the archive." Nucleic Acids Research, 30 (1): 245-248.

Westbrook, J, Z Feng, L Chen, H Yang, and HM Berman. 2003. "The Protein Data Bank and structural genomics." Nucleic Acids Research, 31 (1): 489-491.

Other useful notes

Tate (2003) Summary of some of the currently available visualisation software. In: Structural Bioinformatics