Venue: Chilford Conference Centre
Event Date/Time: Aug 07, 2002
Keynote address: Where are we at in the post-genomics era?
Professor Sydney Brenner, Distinguished Professor, The Salk Institute
This afternoon's session will assess the elementary level of bioinformatics, looking at gene and protein sequencing and at transcriptome analysis.
For each session there will be back-to-back presentations: one from the science/user point of view, with an emphasis on what the scientists are trying to achieve and what their needs are in terms of bioinformatics, and the other from a solution provider.
At the end of each session there will be a question-and-answer period, which will help both parties to understand each other and from which possible solutions can be drafted.
Managing collection, analysis and interpretation of data from functional genomics and proteomics
Effective and efficient methods for managing the data generated in functional genomics
Statistical input into the design, analysis and interpretation of functional genomics and proteomics studies
Achieving synergy between bioinformatics and statistics
Linking analysis of pharmacogenomic data into drug discovery and development
David Lovell Pfizer
Using genomics and bioinformatics for drug discovery
Dr Tim Bonnert Merck
Selecting and validating druggable targets starting with proteome-wide analyses
Analysis of known drugs and their molecular targets shows that launched pharmaceuticals act through approximately 200 distinct proteins spread over about 100 domain families. Occurrence of a druggable target would thus appear to be a relatively rare event in the genome, with candidates scattered across many enzyme and receptor classes. Given the stringent physicochemical requirements placed on orally-active small molecules if they are to have good pharmacokinetic properties (cf Lipinski rules and the like), coupled to the precondition of good potency and selectivity, it follows that only a limited subset of active sites in proteins will be suitable for binding drugs. Furthermore only a narrow subset of these druggable targets will be biologically valid points of intervention in disease pathways. It is therefore of enormous importance to concentrate our efforts on progressing those targets that have the highest probability of being both valid and druggable, and to scan the proteome for these in order to minimise downstream attrition in drug discovery. We have constructed an in silico chemogenomics platform, PharmaCartaTM, for the purpose of optimised druggable target and lead selection using genomic, protein structural and SAR information, and are progressing this paradigm of informatics-driven drug discovery through in-house laboratories and partnership.
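As a point of reference, the Lipinski-style physicochemical filter alluded to above can be sketched in a few lines. The thresholds are the standard rule-of-five values; in practice the property values would come from a chemoinformatics toolkit rather than being supplied by hand.

```python
def passes_lipinski(mol_weight, logp, h_donors, h_acceptors):
    """Lipinski's 'rule of five' screen for oral bioavailability.
    A compound is flagged as problematic if it violates more than
    one of the four criteria."""
    violations = sum([
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # calculated logP above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ])
    return violations <= 1

# A drug-like compound vs. an oversized, lipophilic one (invented values)
print(passes_lipinski(350, 2.1, 2, 5))   # True
print(passes_lipinski(720, 6.3, 4, 12))  # False
```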
Dr Malcolm Weir InPharmatica
Integration of data from proteomics and genomics
Comparing gene expression and protein expression in a simple system.
Coordinating patterns of expression on the gene and protein sides.
Christopher Ahlberg Spotfire
A consensus procedure for predicting the location of alpha-helical transmembrane segments in proteins.
To aid in the development of three-dimensional models of membrane-bound proteins, a consensus procedure for predicting alpha-helical transmembrane segments from amino acid sequence is presented. The algorithm combines the results of six individual prediction methods and some basic properties of membrane-spanning helices to obtain a final consensus prediction. Comparison with experiment and several other recently developed methods shows that the consensus procedure performs quite well in comparison to other recent methods. A FORTRAN program has been developed which takes an input file containing an amino acid sequence in one-letter code and outputs a list of the alpha-helical transmembrane segments predicted by the consensus algorithm.
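A minimal sketch of the consensus idea (not the published six-method FORTRAN procedure, which also uses additional helix properties): take per-residue predictions from several methods, majority-vote them, and keep only segments long enough to span the bilayer.

```python
def consensus_tm_segments(predictions, min_len=15, threshold=None):
    """Majority-vote consensus over per-residue transmembrane calls.

    predictions: equal-length strings, one per method, with 'M'
    marking residues predicted as membrane-spanning. Segments
    shorter than min_len (a typical minimum for an alpha-helix to
    cross the bilayer) are discarded. Illustrative sketch only.
    Returns 1-based, inclusive (start, end) positions.
    """
    if threshold is None:
        threshold = len(predictions) // 2 + 1  # strict majority
    length = len(predictions[0])
    votes = [sum(p[i] == 'M' for p in predictions) for i in range(length)]
    segments, start = [], None
    for i, v in enumerate(votes):
        if v >= threshold and start is None:
            start = i                          # segment opens
        elif v < threshold and start is not None:
            if i - start >= min_len:           # segment closes
                segments.append((start + 1, i))
            start = None
    if start is not None and length - start >= min_len:
        segments.append((start + 1, length))   # segment runs to the end
    return segments
```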
Luis Parodi Pharmacia and Upjohn
Developing and exploring technologies
Bringing bioinformatics tools closer to experimental needs
Dr. Frédérique Lisacek GeneBio SA
Microbial evolution against the marketing forces of the pharmaceutical industry.
Our laboratory coordinates the largest applied structural genomics project currently running in France. It consists of identifying new antibacterial target genes through comprehensive bioinformatics analysis of all known bacterial genomes, emphasizing those of bio-medical importance. The candidate genes are then put into a pipeline of high-throughput expression, biochemical characterization, crystallization and 3-D structure determination, using original or state-of-the-art methods. After giving a technical overview of the whole project, we will show how the current R&D strategies of the large pharmaceutical industry appear to clash seriously with the inescapable evolutionary patterns of our microbial pathogens. This raises the following question: whatever the great tools now at our disposal, is the pharmaceutical industry ready to take advantage of the genomic revolution?
Jean-Michel Claverie Research Director, CNRS Structural & Genetics Information Laboratory, CNRS-AVENTIS
This session will present the secondary bioinformatics tools, as we move towards higher orders of cellular organisation and function, with presentations on data mining, artificial intelligence, data fusion and correlation. The types of data produced, the reasons for integrating them, and why scientists are struggling for solutions will be presented in a highly interactive forum.
Finding Disease Genes Using Automated Mass Spectrometry
SEQUENOM has developed a fully automated genotyping platform that allows up to 1 million SNPs to be measured in a single day. It is also possible to study allele frequencies of SNPs in population pools, providing quick scans for SNPs of particular interest that may warrant individual genotyping.
SEQUENOM has developed a set of validated mass-spectrometric SNP assays that altogether number about 200,000 different polymorphic SNPs. These assays contain both gene-based and evenly spaced SNPs across the entire human genome. Allele frequency differences in phenotypically stratified populations can reveal genes with strong associations to phenotypes. We have used an age-stratified healthy population as our principal gene discovery tool. We test the hypothesis that age is the major risk factor in complex disease; alleles that confer susceptibility to disease should thus decline in frequency as a function of age in a healthy population. This is true for some known disease markers and, from a scan of about 50% of the human genome, we have discovered many new genes that appear to be linked to complex disease. We use additional highly phenotyped populations for validation studies, and in this way the specific disease association of some of these new genes has been revealed. We also use more traditional case-control studies for full genome scans, and a number of intriguing leads for breast cancer and melanoma have already been revealed this way.
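The age-stratification hypothesis lends itself to a simple sketch: compute a candidate allele's frequency in each age stratum of a healthy population and look for a declining trend. The stratum counts below are invented for illustration; real analyses would use pooled-DNA frequency estimates and proper association statistics.

```python
def allele_frequency_trend(strata):
    """For a candidate susceptibility allele, compute its frequency
    per age stratum and fit a least-squares slope of frequency
    against age. A clearly negative slope is consistent with the
    allele being depleted in older healthy individuals.
    strata: list of (midpoint_age, allele_count, total_chromosomes).
    """
    ages = [a for a, _, _ in strata]
    freqs = [c / n for _, c, n in strata]
    mean_a = sum(ages) / len(ages)
    mean_f = sum(freqs) / len(freqs)
    # ordinary least-squares slope: cov(age, freq) / var(age)
    slope = (sum((a - mean_a) * (f - mean_f) for a, f in zip(ages, freqs))
             / sum((a - mean_a) ** 2 for a in ages))
    return freqs, slope

# Invented example: frequency falls from 18% to 11% across age strata
freqs, slope = allele_frequency_trend([(30, 180, 1000),
                                       (50, 150, 1000),
                                       (70, 110, 1000)])
print(freqs, slope)  # slope is negative
```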
Dr Charles Cantor Chief Scientific Officer Sequenom
Methods for reconstructing gene networks from microarray data
Microarrays are providing us with genome-scale gene expression data; however, interpreting these data is not a trivial problem. Here we present methods for using microarray data to reconstruct gene regulation networks and to study their properties. In particular, we study dependencies between the gene expression profiles in a dataset from genome-wide yeast mutation studies, regarding the profiles as random variables. We build gene expression dependency networks from these data and study their properties. We look for 'important' genes, i.e., genes with high out-degree in the dependency graph, and genes with complex regulation, i.e., genes with high in-degree in the graph, as well as the general properties of the dependency network. Several subnetworks show a clear relationship to particular biological processes.
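The degree analysis described above can be sketched as follows, assuming the directed dependency edges have already been inferred from the expression data (which is the hard part); the gene names in the example are illustrative only.

```python
from collections import defaultdict

def degree_summary(edges):
    """Given directed dependency edges (regulator -> regulated gene),
    return the 'important' gene (highest out-degree, a broad
    regulator) and the most complexly regulated gene (highest
    in-degree). Sketch of the graph analysis only."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    hub = max(out_deg, key=out_deg.get)       # high out-degree
    regulated = max(in_deg, key=in_deg.get)   # high in-degree
    return hub, regulated

# Toy dependency graph (gene names invented for illustration)
edges = [("TUP1", "A"), ("TUP1", "B"), ("TUP1", "C"),
         ("X", "C"), ("Y", "C")]
print(degree_summary(edges))  # ('TUP1', 'C')
```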
Dr Alvis Brazma European Bioinformatics Institute EMBL Outstation
Creating biological networks
Dr Charlie Hodgman GlaxoSmithKline
Gaining biological insight from genomics and proteomics using PathwayPrism™
More efficient target selection and validation in drug discovery are intimately tied to understanding the emergent behavior of biological systems, a key aspect of systems biology. These system behaviors, which are controlled by protein activities and their regulation, arise from quantitative changes in protein interactions, and cannot be explained by purely qualitative network connection maps. As such, quantitative modeling is necessary to extract useful insight from genomics and proteomics data for rational drug discovery. This talk will explore how to extract higher value information from the growing volume of biological data using Physiome Sciences' PathwayPrism™ platform. We will explore the myths and realities of integrating disparate data sets, to map, model, and simulate novel pathways in drug discovery.
Jonathan Levin Physiome
Computational genomics and comparative studies
Druggable but not human
Mor Amitai Compugen
Handling SNPs and haplotypes and relating that to disease prediction
Bioinformatics in relation to experimental data
Grouping genetic events into clinical samples
Tissue profiling of expressed genes, in situ hybridisation
Choosing what candidates to take forward
Rick Woychik Lynx Genetics
Effect of knock out for expression profiling
Binding assays: tools for function evaluation
Dr Mats Wikstrom Biovitrum
Down on the Farm – Industrialisation of In Silico Drug Discovery
Compute Farms and now compute ranches are no longer found solely in the domain of Bioinformatics. Farms are now being used in the Life Sciences to perform complex calculations on a range of scientific problems in Chemoinformatics space. This presentation outlines De Novo Pharmaceuticals’ approach to using High Performance Computing and the impact it has had on In Silico drug discovery within the business.
Use of GRID technology
Supporting the computing needs of each community, internal and external
GRID portals for application software
Real examples using proprietary tools – Skelgen, Quasi2, EasyDock
Storing the output for use and reuse.
Richard Scott De Novo Pharmaceuticals
Identification of protein targets and biomarkers in human therapeutics
AGY is focused on discovering and developing novel treatments for central nervous system diseases. The company has taken sequential molecular snapshots in models of human disease, at multiple time points throughout the progression of the pathological process. Using this temporal-resolution approach, AGY has been able to delineate specific pathological pathways and identify protein targets and biomarkers within the critical time window relevant for human therapeutic approaches. Targets functionally characterized using gene knockdowns and overexpression have progressed into assay development, high-throughput screening and lead development. AGY's technologies and concepts provide an efficient and comprehensive path from gene discovery to drug development.
Karoly Nicolich AGY Therapeutics Inc.
Looking for Leads in the Genomics Age: From Target Class Design to Deck Selection
Computational chemistry is being leveraged in myriad ways in the search for new leads against genomics-derived targets. For proteins of known fold, the structural data conserved across the class, be it ligand- or protein-derived, provide key constraints that permit computer-aided lead discovery and optimization. When the target protein fold is unknown, such technology may still be applied in the search for novel leads. At the most direct level, the invaluable structural information made increasingly accessible through high-throughput crystallography can be leveraged. Where structure is unknown, work is ongoing in the area of screening deck design to maximize the quality of HTS results. Experience with HTS has shown that highly active molecules are generally only forthcoming for targets with which the in-house deck has a long history. Novel targets thus present a major challenge for screening, particularly given the non-lead-like properties exhibited by many deck compounds. The design of decks with properties compatible with the discovery of leads that are both selective and highly efficient binders thus takes on high importance, particularly in the context of targets with novel folds.
Andrew Good Bristol Myers Squibb
Data Mining and Business Strategy
Keynote: Janet Thornton EBI
Context-Driven Decision Making in Life Science R&D
Enabling rapid and accurate decision making by providing researchers with all available information about specific molecular systems and their role in disease is a key differentiator in R&D innovation and productivity. Effective decision making in life science discovery and development requires a strong supporting informatics infrastructure. Much useful information is out of bounds to the researcher, either because it is unstructured text and therefore difficult to search, or because it is semantically diverse and difficult to correlate in the context of specific disease states.
This challenge has been addressed by bringing a series of cutting-edge informatics technologies together, coupled with a strong understanding of the scientific domain and R&D process. From a technology perspective, this approach combines sophisticated data mapping, data mining, data visualisation, workflow mapping, automation and textual retrieval and analysis tools with an optimised and integrated ontology based data schema containing data from a range of public and private sources from the life science R&D domain.
This presentation will focus on showing how these technologies and methodologies have been brought together, deployed and used in various drug discovery-related projects over the last year. The audience will learn about the technology components that have been integrated, as well as a unique approach to implementing the technologies to ensure that they can be widely deployed and effectively adopted.
Dr Steve Gardner Viaken Systems Ltd
T-Coffee, a novel method for combining biological information
The design and assembly of highly accurate multiple alignments has become a prerequisite for most sequence analysis methods. A non-exhaustive list of applications for multiple alignments includes structure prediction, phylogenetic analysis, detection of domains in protein families and the identification of SNPs. Unfortunately, the construction of an accurate multiple alignment remains a daunting task, plagued by our inability to simultaneously use all the biological information associated with sequences. T-Coffee (Notredame, Higgins et al. 2000), one of the most accurate multiple alignment packages available today (Katoh, Misawa et al. 2002), addresses precisely this question. Using an original new algorithm, T-Coffee makes it possible to build a multiple alignment by combining various heterogeneous sources of information, such as local alignments, 3-D structures or specialist information, and to check the consistency between the incoming data and the resulting model. We introduce here the basic principle of the method, as well as our new T-Coffee HP server dedicated to the community (igs-server.cnrs-mrs.fr/TCoffee).
Katoh, K., K. Misawa, et al. (2002). "MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform." Nucleic Acids Res 30(14): 3059-66.
Notredame, C., D. G. Higgins, et al. (2000). "T-Coffee: A novel algorithm for multiple sequence alignment." J. Mol. Biol. 302: 205-217.
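A greatly simplified sketch of the consistency principle behind T-Coffee: the weight for aligning two residues is increased by every third residue that aligns well with both of them. The real algorithm builds its library from global and local pairwise alignments; here the library is just a toy dictionary of residue pairs.

```python
def extend_library(library):
    """T-Coffee-style library extension (simplified). A residue is
    identified as (sequence_id, position); library maps a pair of
    residues to an alignment weight. The extended weight for pair
    (a, b) adds, for every third residue c aligned with both,
    the weaker of the two supporting weights min(w(a,c), w(b,c))."""
    # index each residue's alignment partners
    partners = {}
    for (a, b), w in library.items():
        partners.setdefault(a, {})[b] = w
        partners.setdefault(b, {})[a] = w
    extended = {}
    for (a, b), w in library.items():
        total = w
        for c, wac in partners.get(a, {}).items():
            if c in (a, b):
                continue
            wbc = partners.get(b, {}).get(c)
            if wbc is not None:
                total += min(wac, wbc)  # triplet support through c
        extended[(a, b)] = total
    return extended

# Toy library: residue 1 of seq A, residue 1 of seq B, residue 2 of seq C
lib = {(("A", 1), ("B", 1)): 10,
       (("A", 1), ("C", 2)): 8,
       (("B", 1), ("C", 2)): 7}
print(extend_library(lib)[(("A", 1), ("B", 1))])  # 17 = 10 + min(8, 7)
```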
Professor Cédric Notredame CNRS and Swiss Institute of Bioinformatics
Relational data mining for functional genomics
The exponential increase in sequence data has made the problem of identifying gene function more acute. The pharmaceutical industry is faced with a plethora of possible drug targets but insufficient information to prioritise them. Knowing more about gene functions is essential for selecting better targets. PharmaDM uses relational data mining methods to assign function to predicted genes in the absence of homologous sequences with known function. The predictions are based on more or less complex rules that use predicted molecular mass, amino acid composition, sequence motifs, etcetera. Several of the predictions have in the meantime been verified experimentally.
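A toy illustration of the kind of rule described, combining predicted mass, composition and a sequence motif into an if-then function assignment. The rule and thresholds here are invented for illustration; PharmaDM's actual rules are learned automatically by relational data mining.

```python
import re

def predict_function(seq, predicted_mass_kda):
    """Invented if-then rule of the general form described:
    combine predicted molecular mass, amino-acid composition and a
    sequence motif into a function call. Not a real classifier."""
    cys_frac = seq.count("C") / len(seq)
    # Walker A / P-loop motif G-x(4)-G-K-[ST], common in NTPases
    has_walker_a = re.search(r"G....GK[ST]", seq) is not None
    if has_walker_a and predicted_mass_kda > 20:
        return "putative ATP/GTP-binding protein"
    if cys_frac > 0.08 and predicted_mass_kda < 15:
        return "putative metal-binding / redox protein"
    return "unknown"

print(predict_function("MGAAAAGKTLL", 30.0))  # putative ATP/GTP-binding protein
```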
Dr Walter Luyten Chief Scientific Officer Pharma DM
The Use of Ontologies in Information Retrieval for Drug Discovery
The use of ontologies is a key step forward for structuring biology in a way that helps drug discovery scientists to understand the relationships that exist between terms in a specialised area of interest.
BioWisdom has focussed on establishing a system, DiscoveryInsight, that utilises ontologies to allow the scientist to search and analyse effectively the content of literature databases such as Medline, patent literature, or any other public databases. The user is presented with an ontology for search term selection; this includes functionality allowing the user to automatically select a concept and all its children. The search is performed, across a number of databases, on the concepts and their associated synonyms. DiscoveryInsight allows the user to assess the co-occurrence of concepts at any level in the ontology and between ontologies. DiscoveryInsight currently includes ontologies of anatomy, disease, drugs and drug targets, focussed on drug discovery scientists' needs.
The underlying technology uses ontologies to provide a mechanism for ensuring good precision and recall in information retrieval. The system clarifies ambiguous synonyms (i.e. terms with more than one meaning) by co-occurring the term with its parent concept, ensuring high precision by returning only records relevant to the context of interest.
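The disambiguation step can be sketched as simple query expansion: an ambiguous synonym is expanded with its siblings and searched in co-occurrence with its parent concept. The ontology dictionaries and query syntax below are hypothetical, not BioWisdom's actual implementation.

```python
def disambiguated_query(term, synonyms_of, parent_of):
    """Build a boolean search string for an ontology concept:
    OR together the concept and its synonyms, then AND with the
    parent concept to pin down the intended context. Toy sketch;
    the dictionaries stand in for a real ontology store."""
    terms = [term] + synonyms_of.get(term, [])
    clause = " OR ".join(f'"{t}"' for t in terms)
    parent = parent_of.get(term)
    if parent is None:
        return f"({clause})"
    return f'({clause}) AND "{parent}"'

# 'CAT' is ambiguous (gene symbol vs. animal); the parent concept
# restricts hits to the enzymatic sense. Toy ontology data.
q = disambiguated_query("CAT",
                        {"CAT": ["catalase"]},
                        {"CAT": "antioxidant enzyme"})
print(q)  # ("CAT" OR "catalase") AND "antioxidant enzyme"
```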
The number of matches for each concept (and associated synonyms) is presented to the user in a tree view of the ontology. This allows the user to quickly browse the search results and identify areas of interest.
The key feature of DiscoveryInsight is the ability to view, browse, and select concepts from the ontologies, for the purpose of searching. In this context, drug discovery scientists have the opportunity to understand the domain, prior to searching. Future plans aim to build the ‘richness’ of the BioWisdom ontologies within DiscoveryInsight through the inclusion of searchable properties of concepts, providing the ability to cross ontologies through associations between concepts, and infer new knowledge for the purposes of drug discovery.
Further details of DiscoveryInsight and the BioWisdom ontologies will be presented.
Dr Julie Barnes Chief Scientific Officer Biowisdom
Drug discovery aspects of genomics and bioinformatics: the business/alliances side of things
Henri Theunissen Organon
Text mining and knowledge extraction
Thérèse Vachon Head of Knowledge Engineering Informatics and Knowledge Management Novartis
Exploring image and text data and mapping them to databases of genes
Sequence databases and annotation (genomics and proteomics)
Case studies of first phases
Help to focus R&D resources by getting more information out of databases
Dr Albert Wang Bristol Myers Squibb
Post-Genomics Drug Discovery
Impact of Genetics and Genomics in Drug Discovery: Opportunities and Challenges
Prof. Klaus Lindpaintner F. Hoffmann-La Roche
In silico drug discovery
Ligand or structure
Dr Mohammad Afshar Ribotargets
Virtual screening of fragments
The characteristics of a novel virtual screening platform will be described. All data associated with a virtual screen are stored in an Oracle database and queried and manipulated via web-based applications, allowing open access to all scientists in the company. The virtual screening uses the molecular docking program GOLD to assess the potential binding characteristics of molecules against proteins of known 3D structure. GOLD has been modified to improve the speed and accuracy of virtual screening, and details of the extensive validation of the modified version will be described. In real applications, the molecular weight of the screened compounds has been restricted, because such “fragments” provide good starting points for structure-based drug design. Results from several applications of fragment-based virtual screening will be discussed.
Dr Chris Murray Director of Computational Chemistry and Informatics Astex Technology