Bioinformatics and Computational Biology Roadmap
Grand Challenges
Cassatt, NIGMS
From the amino acid sequence of a protein, reliably determine its
high-resolution structure.
Time frame: 5 to 10 years
Givens: Within the time frame of this grand challenge, we can expect
that (1) the NIGMS-funded Protein Structure Initiative will have produced a library
of representative structures of families of soluble proteins sufficiently
complete that from the sequence of an unknown protein a low resolution
structure (chain tracing) can be obtained through homology modeling and
(2) computers will have advanced to the point that petaflop machines will
be available for the task.
Task: The amino acid sequence of a protein contains sufficient information
to determine its three-dimensional structure. To date scientists
have been unable to decipher this code and it is unlikely, without a major
unanticipated discovery, that they will be able to do so in the foreseeable
future. However, a low resolution structure limits
the conformational space through which a computer has to search. Together
with the expected availability of the new generation of high performance
computers, this makes the problem tractable within the time
frame of this "grand challenge".
Impact: The impact of the success of this "grand challenge"
will be enormous. Even now, biologists routinely ask questions that
can only be answered at atomic resolution. The current solution is the arduous
task of expressing and crystallizing the protein for structure determination, generally
a one-year task with no assurance of success. On a more practical level,
the success of this grand challenge will revolutionize the discovery phase
of new pharmaceuticals.
Mike Waters, NIEHS
1. Understand the molecular basis of environmentally induced toxicity
and human disease.
2. Discover and functionally characterize genetic variation in environmental
susceptibility genes.
3. Integrate environmental and genetic factors into our understanding
of the etiology of human disease.
4. Develop tools and assays that translate current research activities
into methods suitable for driving public health decisions.
Dennis Glanzman, NIMH
Understanding the neural (physiological, biological) basis of cognition.
The illnesses for which NIMH takes primary responsibility involve, to
a significant degree, errors of cognition -- and the key to finding treatments,
cures and preventions is knowing the source of the disorder. It is assumed
at present that this may at least be approached through the use of dynamical
imaging technologies such as fMRI and PET. The sheer size and complexity
of imaging datasets present a bioinformatics challenge, and an even more
elusive goal relates to making sense of this information. Without an adequate
theoretical basis, and computational tools tailored to the tasks at hand,
we may be hampered by a reliance on the perceptions and analytical capabilities
developed over the past century.
The future for understanding complex issues lies in the application of
newly evolving analytic tools for framing the interpretation of activity
across levels of organization. For example, the 'noise' at one level,
such as the stochastics of ion channel openings and closings, may underlie
the 'variability' at another, such as gene induction, membrane potentials,
and spiking activity in neurons. These dynamics of behavior across scales
of time and dimension, and their interaction and predictive power, represent
the gold standard of tying together information from molecular, genetic,
cellular, systems and whole organism biomedical research.
Huerta, NIMH
Background: Enormous sums have been invested in studying the molecular
pathways that underlie cellular physiology and pathophysiology. Such studies
are increasingly considered in the context of (and often driven by) sequence
information. For sequence data, bioinformatics tools and resources can
be used to relate the findings from one study to those of other, often
very different (species, cell type, etc.), studies. These approaches have
broadened the impact of individual studies, demonstrated the convergence
and divergence of these processes, and provided a sense of how pieces
of the puzzle might fit together. Yet, these approaches for sequence data
represent only an indirect, first step of a new way to understand the
relationship of particular molecular pathways to each other, and to specific
cellular functions and dysfunctions.
Need: While biomedicine is awash in information about molecular
processes, detailed quantitative information, and important qualitative
information, is sparse. And, despite the use of sequence-related informatics
to fuel this area of research to date, existing informatics tools and
resources are not adequate to accommodate or make full use of the more
complex and dynamic molecular data. Just as the use of sequence-based
lines of inquiry has brought together disparate findings, moving to this
next level of sophistication would be a key step toward a molecules-to-systems
integrative understanding. This initiative would have important and immediate
implications for therapeutic interventions for disease, including drug
discovery and drug development, as well as for illuminating basic biological
mechanisms.
Goals: The goals of this initiative would be to conduct research
to obtain a quantitative understanding of specific molecular pathways
(across species, tissue types, and states of health and disease), to develop
informatics tools and resources to share and make sense of these data,
and to develop computational biology approaches (models, simulations,
etc.) to integrate and use these data to push forward discovery in ways
not possible solely through actual experiments. Biomedical research could
be constrained by focusing on particular molecular pathways (in particular
tissues, species, states of health, etc.) that are of interest to multiple
ICs (see "Implementation" below), such as receptor-effector
signal transduction pathways common to a variety of different cell types,
or could simply be conducted in accord with existing research interests
of ICs. Quantitative biomedical data of interest would include reaction
rates, concentration parameters, etc., of molecular species involved in
specific cascades. Also of interest would be qualitative data currently
lacking, such as the subcellular, compartmental localization of particular
aspects of particular molecular pathways. Informatics efforts would include
community-based development of ontologies, semantic relationships, data
models, data schemas, etc., as well as the development of informatics
tools for data analysis, visualization, etc. Informatics resources would
also be developed, including databases, query approaches, data sharing
protocols, etc. The computational biology efforts would develop models
and simulations that would bring data in from the informatics resources
to integrate those data and to serve as dynamic engines of discovery,
exploring parameter space and performing virtual experiments to generate
new hypotheses to be tested by actual experiments.
Implementation: This initiative lends itself to (but would not
require) phased implementation, from a demonstration project to a much
broader and larger effort. The demonstration phase might focus on a small
number of specific pathways chosen and prioritized on the basis of their
ubiquity and known or suspected importance in physiological and pathophysiological
processes. Since molecular pathways are fundamentally similar, the knowledge,
informatics, and computational approaches developed for these particular
"demonstration" pathways could be used in the study of other
pathways. Thus, even if this began as a highly focused effort, it could
scale up quickly to encompass broader areas. Expansion might be determined
on the basis of the interests of different ICs or clusters of ICs, which
might, for example, differentially emphasize their support for work on
different molecular pathways. Nevertheless, it is essential that this
initiative be conducted in a fully coordinated fashion across ICs.
Links to other roadmap activities: This grand challenge has clear
scientific links to the following roadmap groups: Building Blocks, Molecular
Libraries, and Molecular Imaging. It would likely also relate scientifically
to the Structural Biology and Nanotechnology groups, and would relate
to virtually all of the roadmap groups that are focusing on the way science
is conducted (e.g., Pathways, Multidisciplinary Teams, Public-Private
Partnerships).
Good, NHGRI
1) Establish a comprehensive knowledgebase of human genes and their function
that will assist biologists and clinicians to understand genetic contributions
to human biology and disease.
2) Develop computational methods to discover which of the millions of
genetic variants are associated with specific phenotypes, in order to
find genes contributing to common disease and drug response.
3) Develop new structured vocabularies that enable the interchange of
information between basic biologists and clinicians. This new vocabulary
will help translate the information from basic research to develop new
insights into the cause and treatments of human disease.
4) Develop a computer model of a eukaryotic cell that will predict the
behavior of the cell.
Corollary: Model the genetic networks and protein pathways in human cells
and predict how they contribute to cellular and organismal phenotypes.
5) Train scientists who have the necessary background in a quantitative
approach to science to efficiently use computational approaches to advance
biomedical research.
6) Not necessarily a Grand Challenge: Enable the prepublication release
of small and large datasets to facilitate computational approaches to
biology (See Wellcome Trust/NHGRI 2003 workshop on sharing data).
Farber, NCRR
One way to divide computational biology is to distinguish problems involving
modeling and simulation from those that involve data. Examples of modeling
and simulation involve molecular dynamics, models of collections of proteins
that act together to accomplish a biological function, and models of how
tissues or organs function. It is not clear to me that this area should
be the subject of a grand challenge since improvements in this area are
likely to depend on improvements in algorithms and in descriptions of
the specific problem of interest. Improvements in software and in access
to computational power will have a major impact on this field. NIH should
be concerned about improving the quality and availability of software
developed to solve these problems.
Problems involving data seem to be a better area for a grand challenge.
There are two major problems in this area. The first involves attempts
to use large databases to find patterns that can, in turn, make predictions
about the behavior of a system. Tools for effective data mining are few
and far between. Many biologists have problems using or creating the necessary
tools, and many computational scientists have problems understanding the
quality of the data in the database. It is not hard to imagine a grand
challenge in the development of tools to analyze data.
An additional problem is to combine data measured by different laboratories,
or using different instruments. One could certainly imagine that NIH could
collect all of this data in a centralized database and make it available
to the public. However, there are a number of drawbacks to this approach.
The idea of federating small to medium sized databases to allow an investigator
to easily access and combine data from different sources seems to be a
much better approach. There are certainly major problems in this area
that could be the subject of a grand challenge. Among them are the language
needed to describe data (defining ontologies or standards) as well as
the development of tools to use these ontologies to make the data readily
available. In addition to technical challenges, there are serious problems
in making sure that investigators get appropriate credit for creating
data as well as for interpreting/combining it. Care also must be taken
with data from human subjects.
Guo, NIAAA
Generation of an interactive database (or networking databases) that incorporates
all (or most) genetic, genomic, functional genomic, and functional information
for the purpose of mining the relationships between all cellular components
(DNA, RNA, protein, lipid, etc.), functional modules, "super-modules",
and networks under specified conditions.
Mangan, NIDCR
We are in our infancy in this field; almost every level is going
to be a grand challenge as the complexity of the system increases. Start
with the ability to quantitatively study and describe biology in the cell,
move up to tissue, organs and eventually to the organism.
In the microbiology field, the study of microbes as singular organisms,
as well as organisms associated with human tissues, host defenses (innate
and adaptive immune factors), and other bacterial species (biofilms).
Public health benefit: new molecular targets for therapy, early diagnosis,
and prevention.
In medicine, the ability to accurately predict how a human will respond
to external and internal stimuli based on the person's genetic makeup.
(Using sophisticated mathematical modeling of genetic principles:
reverse biology.) A complete computer representation of the human.
Public health benefit: (almost unlimited number of possibilities) early
diagnosis of disease, disease prevention, individual pharmacology, and so
forth.
Voluminous biological data will be open to all researchers worldwide
via the Internet. All life science degrees will have to offer training
in how to use and extract information from these databases. Various interfaces
and shadow software systems will need to be developed to enable inexperienced
researchers and scientists to access the data in meaningful ways.
Some individual challenges:
Storage of huge numbers of data sets
Common terminology between databases
Improved high throughput technologies
Graphic representation of molecular interactions (e.g., protein-protein)
Development of biological principles from existing data
Creation and testing of evolutionary theories based on new data
Liu, NINDS
- Analyze and interpret the enormously complex datasets obtained from
biomedical experiments
- Integrate our knowledge obtained from different levels of biomedical
research at genetic, molecular, cellular, system levels to behavioral
and clinical research
- Model biological systems based on experimental data to generate testable
hypotheses that will predict the mechanisms underlying normal and abnormal
functions
Rosenberg, NCI
BACKGROUND: In our division (NCI/DCEG) we are beginning to put
together large-scale datasets on people, the environment, and the genome
to identify causes of cancer. Our studies are population-based and include
case-control and cohort designs (including trials). A large study of either
type is equal in complexity to a large randomized clinical trial. The
informatics needs of a traditional epidemiological study are substantial.
With the advent of genomic information from SNPs, microarrays, and the
proteome, the complexity of our studies is rising to a new level. I believe
our problem is becoming widely encountered at NIH.
Broadly speaking, the biomedical informatics issues we face coalesce
around two themes:
1. How do we get there without breaking the bank? The "there"
is the point that we have a dataset in hand in a format suitable for analysis.
2. What do we do with the information once we get it? How do we analyze
it to obtain meaningful conclusions?
For both issues, the "we" has a dual meaning - each individual
investigator and also the scientific community.
I believe the first issue has a specific cause that can be remedied.
In my opinion, one underlying problem is that biomedical data has few
standards - we live in a world of "data chaos." A simple example
should make this point clear. Right now, at least six different ways are
commonly used to code the variable sex - as character 'M' or 'F', character
'Male' or 'Female', numeric 0 and 1 for Male and Female, respectively,
or the converse, and numeric 1 and 2.
It doesn't matter scientifically how the data are coded - but every time
the data change hands people have to get in the loop to transmit the
codes and reformat the data. This is very costly, inefficient, and
error-prone. It also is increasingly infeasible as we begin to measure vast
numbers of genomic variables.
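As an illustration of the recoding burden described above, the following is a minimal Python sketch, not part of the original proposal. The scheme names (char_MF, num_01, and so on) and the target coding of 1 for male and 2 for female are assumptions made for this example. It covers the codings explicitly listed above and shows that a raw value cannot be interpreted without knowing which scheme produced it.

```python
# Each hypothetical scheme name maps a source code to the label it stands for.
SEX_SCHEMES = {
    "char_MF": {"M": "male", "F": "female"},
    "char_word": {"Male": "male", "Female": "female"},
    "num_01": {0: "male", 1: "female"},
    "num_10": {1: "male", 0: "female"},  # the converse of num_01
    "num_12": {1: "male", 2: "female"},
}

# Target coding chosen only for illustration: 1 = male, 2 = female.
TARGET_CODES = {"male": 1, "female": 2}


def recode_sex(values, scheme):
    """Translate values from a named source scheme into the target coding."""
    mapping = SEX_SCHEMES[scheme]
    return [TARGET_CODES[mapping[value]] for value in values]
```

Calling recode_sex([0, 1], "num_01") and recode_sex([0, 1], "num_10") on the same raw values gives opposite results, which is why the coding has to travel with the data rather than be passed along by hand.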
Therefore, I propose the following solution:
Grand Challenge 1: NIH should create a working group to define
standard schema for biomedical data, which I propose we call the Data
Markup Language or DML.
I outline a proposed overview of this concept below.
With a standard way to represent biomedical data in hand, a number of
specific activities would build on the DML standard to create infrastructure
that will help scientists make sense of the impending "data avalanche."
Grand Challenge 2: NIH should promote the creation and dissemination
of analytical methods compatible with DML. Specifically, NIH should:
- Develop pilot projects to create a "data warehouse" using
DML and browser-based tools to view, subset, manipulate, and download
biomedical data to a diversity of formats for analysis.
- Work with commercial database software companies to build such warehouses.
- Work with commercial software companies (including the developers
of SAS, STATA, S-PLUS, and MATLAB) to make these widely used commercial
packages DML compatible. DML compatibility would ensure that each package
could import and export data using the DML format.
- Work with public-domain software development groups, including the
group developing the R language, to make these packages DML compliant.
- Develop server-side analysis software that uses the DML standard.
The informatics landscape would be dramatically different in such a world.
Basic biomedical information could flow efficiently without costly human
intervention and error. Raw data could be disseminated in a standardized
format. Each investigator would have access to basic analytical tools
via their browsers, and could obtain data for any other software package
in analysis-ready format without the need for recoding.
Summary
In a biomedical world with standards for data, we would be better positioned
to get correct answers faster. A common format for data exchange would
have many advantages. The Data Markup Language (Grand Challenge 1) would
provide a standardized way to disseminate "raw" data to meet
any current and future requirements to make biomedical data available,
and it would create efficiencies for collaborative studies, including
consortia, multi-center studies, and complex studies involving extensive
laboratory and clinical analysis. DML-compatible analysis tools (Grand
Challenge 2) would give each investigator access to a basic set of analytical
tools via server-side analysis; foster openness and transparency in the
analysis of data; support efforts to validate analysis results; and help
independent teams replicate key findings.
DML Architecture - Concept
DML is a set of schema, not a particular software product. It is not
designed to replace existing database or analysis software, but rather
to help biomedical investigators make better use of existing software
and algorithms. Each schema identifies a class of variables by defining
properties of the data along with the metadata that gives meaning
to the values. For example, a standard definition of the variable sex
could be a categorical with value 1 corresponding to the label 'male'
and 2 corresponding to 'female'.
A central DML schema is a dataset. A dataset is an aggregation
of variables measured on a set of cases, plus meta-data about the dataset
including labels that distinguish the cases.
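The variable and dataset concepts above might be sketched as follows in Python. This is only one possible reading of the concept: the class names, field names, case labels, and study title are all hypothetical, and DML itself is described here as a set of schema rather than as any particular code.

```python
from dataclasses import dataclass, field


@dataclass
class CategoricalVariable:
    """A categorical variable: coded values plus the labels that give them meaning."""
    name: str
    value_labels: dict  # metadata, e.g. {1: "male", 2: "female"}
    values: list        # one coded value per case

    def __post_init__(self):
        # Reject codes that the metadata does not define.
        unknown = set(self.values) - set(self.value_labels)
        if unknown:
            raise ValueError(f"{self.name}: codes without labels: {sorted(unknown, key=repr)}")

    def labels(self):
        """Return the human-readable label for each case."""
        return [self.value_labels[v] for v in self.values]


@dataclass
class Dataset:
    """Variables measured on a set of cases, plus metadata about the dataset."""
    case_labels: list
    variables: list
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Every variable must supply exactly one value per case.
        for var in self.variables:
            if len(var.values) != len(self.case_labels):
                raise ValueError(f"{var.name}: expected {len(self.case_labels)} values")


# Hypothetical example data.
sex = CategoricalVariable("sex", {1: "male", 2: "female"}, [1, 2, 2])
study = Dataset(["case-001", "case-002", "case-003"], [sex],
                {"title": "hypothetical example study"})
```

Keeping the value labels inside the variable means a recipient can interpret the codes without a separate codebook.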
DML would contain a set of 'generic' schema for statistical variables,
including but not limited to continuous, categorical, and string variables.
A minimum of seven types of statistical variables would be needed to support
the basic types of biostatistical data analysis (not defined here).
DML would include a set of variables that could be called 'core variables
for human subjects research' or HSC variables. HSC variables would include
schema for: sex, age, race/ethnicity, date of birth, time, date, SNPs,
each component of the complete blood count or CBC, specific CLIE assays,
etc.
Each schema would define the relevant data and meta-data. A schema for
a SNP, for example, could include gene name, variant, Unigene ID, assay,
primers, reaction conditions, etc., plus a convention that 0, 1, or 2
meant homozygous for minor allele (defined by the meta-data), heterozygous,
and homozygous for the variant allele, respectively. To transmit a SNP
variable, one would transmit not only the raw data (a series of 0's, 1's,
and 2's) but also the meta-data.
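A minimal Python sketch of transmitting a SNP variable together with its meta-data is given below. It assumes JSON as the transport format, which the proposal does not specify, and uses only a few of the meta-data fields named above. The class name, field names, and placeholder values such as "GENE1" are hypothetical, and the genotype labels are deliberately neutral because which allele each homozygous code refers to is defined by the meta-data.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SNPVariable:
    """Raw genotype codes plus the metadata needed to interpret them."""
    gene_name: str
    variant: str
    assay: str
    value_labels: dict  # meaning of each genotype code
    genotypes: list     # raw data: a series of 0s, 1s, and 2s

    def __post_init__(self):
        # Only the three genotype codes of the convention are allowed.
        bad = [g for g in self.genotypes if g not in (0, 1, 2)]
        if bad:
            raise ValueError(f"invalid genotype codes: {bad}")

    def to_json(self):
        """Serialize the raw data and the metadata together as one record."""
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical placeholder values, for illustration only.
snp = SNPVariable(
    gene_name="GENE1",
    variant="example-variant",
    assay="example-assay",
    value_labels={0: "homozygous, allele A", 1: "heterozygous", 2: "homozygous, allele B"},
    genotypes=[0, 1, 2, 1],
)
record = json.loads(snp.to_json())
```

Note that JSON turns the integer keys of value_labels into strings, so a receiver would look up record["value_labels"]["1"] rather than the integer 1.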
Whenever possible, DML would adopt existing standards, e.g., standards
for image data and emerging standards for microarrays.
The combination of generic statistical variables, HSC variables, and
existing standards would provide containers for every type of data - everything
could be stored, processed, and transmitted.
Lyster, NIBIB
(1) An overview of grand challenge efforts in Computational Biology is
presented in the article "Trends in Computational Biology: A Summary
Based on a RECOMB Plenary Lecture, 1999" by John Wooley, J. Comp.
Biol., 6, 3/4, 1999. Wooley is a good general outside resource:
http://medicine.ucsd.edu/pharmaco/jwooley.html. In addition to this,
the following pages describe grand challenges in computational structural
biology http://cbcg.lbl.gov/ssi-csb/Program.html
(2) (a) Analysis, data mining, epidemiology, informatics, and knowledge
bases that relate genotype to phenotype:
In order to facilitate analysis and understanding of the relationship between
genomic and phenotypic data, NIH needs to foster research and tools for
data analysis, data mining, epidemiology, informatics, and knowledge bases.
Most of the grand challenge here is analytical tools and methods as well
as information technology guided by scientific judgment; there is little
high end computing. Suggested outside expert for consultation: Russ Altman,
Pharmacogenomic network (PharmGKB)
http://smi-web.stanford.edu/people/altman/
(2) (b) Population to Ecosystem modeling:
This involves modeling the interaction between populations and the ecosystem.
Hence it involves elements of biostatistics, epidemiology, and informatics, as
well as earth system modeling approaches such as climate modeling. This
is somewhat speculative, and may be outside the near-term goals
of NIH research. It is not clear whether there are many legitimate outside experts.
Try Anthony Busalacchi http://www.essic.umd.edu/
(3) Multiscale physiologic modeling:
This involves modeling systems whose spatial and temporal scales vary
from subcellular to organismal. This covers all organ systems. As a case
study, whole-heart electromechanics needs a minimum of about 50
million grid points, with 100 variables per grid point and a 1 ms time
step. This requires 5 teraflop/s (5 x 10^12 floating point operations
per second). In addition to this high end computing requirement, much
research is needed in the development of scientific algorithms to handle
subgridscale and other unresolved processes (e.g., mechanics, ion channel
responses, G protein-coupled receptors, tyrosine kinase pathways, and
protein expression changes). Algorithm and software development is at
least as important as hardware development for speedups. Suggested outside
experts for consultation: Andrew McCulloch http://cardiome.ucsd.edu/
Rai Winslow http://www.cmbl.jhu.edu/
and Peter Hunter http://www.esc.auckland.ac.nz/People/Staff/Hunter/
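The 5 teraflop/s figure in item (3) can be reproduced with simple arithmetic under two assumptions that the text does not state: that the simulation runs in real time (one simulated second per wall-clock second) and that each variable costs one floating point operation per time step. The short Python sketch below makes that reading explicit; the function and parameter names are hypothetical, and a realistic per-variable cost would likely be higher.

```python
GRID_POINTS = 50_000_000        # from the text: about 50 million grid points
VARIABLES_PER_POINT = 100       # from the text: 100 variables per grid point
TIME_STEP_MS = 1                # from the text: 1 ms time step
FLOPS_PER_VARIABLE_UPDATE = 1   # assumption: one operation per variable per step


def flops_for_real_time(grid_points=GRID_POINTS,
                        variables_per_point=VARIABLES_PER_POINT,
                        time_step_ms=TIME_STEP_MS,
                        flops_per_update=FLOPS_PER_VARIABLE_UPDATE):
    """Operations per second needed to simulate one second per wall-clock second."""
    steps_per_second = 1000 // time_step_ms  # 1 ms steps give 1000 steps per second
    return grid_points * variables_per_point * steps_per_second * flops_per_update


required = flops_for_real_time()  # 5 * 10**12, i.e. 5 teraflop/s
```

Under these assumptions the requirement scales linearly with the per-variable cost, so ten operations per variable update would raise the estimate to 50 teraflop/s.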
(4) Computer Assisted Surgery:
Computer Assisted Surgery may include computational elements from Multiscale
Physiological Modeling (item 3) with the emphasis on real-time computational
updates. This is expected to scale to petaflop/s (10^15 floating point
operations per second). In addition there are considerable infrastructural
and networking issues to be dealt with, such as the time to feed segmented
image information back to the models. Suggested outside expert for consultation:
Ron Kikinis http://splweb.bwh.harvard.edu:8000/pages/ppl/kikinis/
Twery, NHLBI
Simulation of Biological Systems: Genome to Function
Current biological approaches based on characterizations of molecular,
cellular, and organ processes in one or two dimensions must be greatly
enhanced to collect and interpret data with complexity approximating that
of physiological interactions. Teams of biologists, informaticists, and
computational biologists will be needed to advance competitive strategies
for close collaboration; collect data systematically on a large scale; build models
to visualize and interpret cellular behavior as a function of component
pathways; elucidate pathophysiological abnormalities; and provide plans
for full integration of project data and models with other teams in the
Grand Challenges program. The heart, lung, blood, and sleep fields offer
rich opportunities for pioneering the development of new standards for
physiological modeling. Detailed models will be useful in predicting responses
to experimental and environmental manipulations and challenges, and in testing
potential therapies related to cell function, blood flow, lung ventilation
and particle deposition, intensive patient management, and sleep disorders.
The Translation of Clinical Research to Clinical Practice
The NHLBI has vigorous programs on the development and assessment of treatments,
preventions, and diagnostics. Fundamental research in genetics, genomics,
proteomics, tissue engineering, instrumentation, cell and developmental
biology, epidemiology, and clinical trials also contributes an abundance
of data, resources, and tools that could be brought to clinical application.
Biomedical informatic approaches are needed to enhance knowledge building
through improved clinical trial data management tools allowing data-sharing
and analysis across studies, and mechanisms to facilitate quality control.
Bioinformatic and computational approaches facilitating the integration
of diverse information from clinical and basic research will help to identify
potential targets for therapy and enhance the overall ability to translate
fundamental research findings to clinical application.
Virtual Reality Systems for Research and Medicine
There is a need for virtual reality interfaces that emulate biological
systems for remote robotic delivery of surgical care; virtual experimental
research; and training of investigators and medical caregivers. Semi-autonomous
simulations of biological systems are essential to enhance the quality
of the interface, improve reliability, and decrease the amount of data
transferred. An in-depth, systematic understanding of underlying biological
systems and tissue mechanics will be needed to model these processes for
application in virtual reality interfaces. New instrumentation and informatic
capabilities will also be needed to monitor the tissue response to manipulations
and dynamically adjust the emulation as needed.
Suggested Consultants
Lindberg, NLM
David Botstein, PhD Princeton U
Ted Shortliffe, MD, PhD, Columbia P&S
Cassatt, NIGMS
Bioinformatics:
Phil Bourne-UCSD
Shankar Subramaniam-UCSD
Computational Biology:
Michael Levitt-Stanford; one of the broadest thinking computational biologists
I know. Focuses on proteins.
Michael Klein-U Penn; Interfaces molecular dynamics with high end computers
Mel Simon-Cal Tech; Cellular Modeling
Stan Leibler-Rockefeller; One of the best in the business of cellular
modeling
Waters, NIEHS
Trey Ideker, Whitehead Institute, MIT. After his Ph.D., Trey remained
in Seattle as a research scientist at the Institute for Systems Biology,
working on methods to integrate the large amount of diverse biological
data generated by genomic sequencing, mRNA microarrays, and proteomics.
He continues this work at the Whitehead Institute for Biomedical Research
as Pfizer Fellow of Computational Biology. Ideker also serves on the advisory
board of Genstruct, has served as Bioinformatics Lecturer for ISTR, Inc.,
and holds several patents in the fields of microarray analysis and systems
biology.
John Quackenbush, TIGR. In January 1997, John Quackenbush joined the
faculty of The Institute for Genomic Research (TIGR) in Rockville, MD.
He is currently an Investigator at TIGR, where he leads a number of research
projects in DNA microarray analysis and bioinformatics. He is currently
working on an analysis of gene expression in rodent models of human disease,
in human colon cancer metastasis, and in Arabidopsis thaliana.
Mark Miller, San Diego Supercomputer Center. Mark Miller is Program Coordinator
of Integrative BioSciences at the San Diego Supercomputer Center (SDSC)
-- a research unit of the University of California, San Diego, and the
leading-edge site of the National Partnership for Advanced Computational
Infrastructure. SDSC researchers conduct studies in computational science
and develop high-performance computing and networking technologies.
Edward Marcotte, UT Austin. Ed Marcotte's research group combines computational
and bioinformatics approaches with experimental approaches to study protein
function and protein-protein interactions. Using these techniques and
information from over 30 fully sequenced genomes, his group was able to
calculate the first genome-wide predictions of protein function, finding
very preliminary function for over half the 2,500 uncharacterized genes
of yeast. Now, with over 80 genomes in hand, they're extending these techniques,
as well as asking fundamental questions about the evolution of protein
interactions and the evolution of genomes.
HT Banks, NC State University and The Statistical and Applied Mathematical
Sciences Institute (SAMSI). Dr. Banks has 30 years of experience in advancing
the use of computational mathematics and statistics as tools in the analysis
of biological data. Dr. Banks' expertise is in the modeling of response
at the level of the tissue/organ. Dr. Banks is also the Associate Director
of SAMSI, a national institute whose vision is to forge a new synthesis
of the statistical sciences and the applied mathematical sciences with
disciplinary science to confront the very hardest and most important data-
and model-driven scientific challenges.
Glanzman, NIMH
Dr. Arthur Toga, Ph.D., University of California at Los Angeles. Dr. Toga
is a neuroscientist with extensive experience in computer science and
engineering. His professional interests are centered around the development
of extensive neuroanatomical tools using MRI and fMRI for the creation
of a "probabilistic map" of the human brain. He recently reported
that he has successfully localized fMRI signals to within the width of
a single cortical barrel (about 300 microns). In addition, he heads the
first team to record Optical Intrinsic Signals and fMRI in the same (human)
subjects, showing a not-surprising high degree of correlation between
these two imaging techniques. http://neurology.medsch.ucla.edu/faculty/TogaA.htm
Dr. Lawrence Abbott, Ph.D., Brandeis University. Dr. Abbott is a theoretical
physicist who has worked in neuroscience for the past twenty years. His
research involves the mathematical modeling and analysis of neurons and
neural networks. Analytic techniques and computer simulations are used
to study how different conductances contribute to the electrical characteristics
of a neuron, how neurons interact to produce functioning neural circuits,
and how large populations of neurons represent, store, and process information.
He is especially interested in the mechanisms that control the development
and maintenance of neural circuits, dynamic properties of large neural
networks, and methods used by populations of spiking neurons to represent
and process information. http://www.bio.brandeis.edu/faculty01/abbott.html
Dr. Henry Abarbanel, Ph.D., University of California at San Diego. Dr.
Abarbanel is a physicist who had 30 years of experience in laser communications
before he began studying the dynamics of small assemblies of neurons which
have the job of translating sensory inputs into nearly rhythmic output
to muscles. He has investigated the oscillations of these neurons individually
as well as in subcircuits of the whole system, where he has shown that
the oscillations of neurons require some additional slow dynamical process
in addition to the usual ion channel dynamics of the Hodgkin-Huxley models.
His work is highly theoretical, and computational, and his breadth of
knowledge in many fields would make him a valuable member of an advisory
team.
http://inls.ucsd.edu/~hdia/
Huerta, NIMH
Dr Peter D. Karp
Director, Bioinformatics Research Group
Artificial Intelligence Center
SRI International
Room EK207
333 Ravenswood Avenue
Menlo Park, CA 94025-3493
Tel: 650-859-4358
Email: pkarp@ai.sri.com
Eric Neumann, Ph.D.
Vice President of Informatics
Beyond Genomics
40 Bear Hill Road
Waltham, MA 02451
Tel: 781-890-1199
Email: ENeumann@BeyondGenomics.com
Frank Olken
Lawrence Berkeley National Laboratory
Computational Science Research Division
Bldg. 50B3238
1 Cyclotron Road
Berkeley, CA 94720-8147
Tel: 510-486-5891
Email: olken@lbl.gov
James Schwaber, Ph.D.
Director, Daniel Baugh Institute of Functional Genomics/Computational
Biology
Pathology
1020 Locust Street, Room 520 JAH
Thomas Jefferson University Medical College
Philadelphia PA 19107
Tel: 215-503-7823
Email: james.schwaber@mail.tju.edu
Good, NHGRI
Sean Eddy, computational biology at Washington University
David Haussler, computational biology and the human genome browser at
U. California Santa Cruz
Judy Blake, ontologies and the mouse genome databases at JAX labs
Rolf Apweiler, databases and data standards, EBI
Farber, NCRR
Ralph Roskies, Pittsburgh Supercomputer Center
Michael Klein, University of Pennsylvania
Larry Smarr, UCSD
Mark Ellisman, UCSD
Peter Arzberger, UCSD
Peter A. Freeman, Assistant Director, CISE, NSF
Guo, NIAAA
Lee Hartwell (U. of Washington) or Andrew Murray (Harvard) - "Modular
Biology"
Aravinda Chakravarti (Hopkins) - bioinformatics in genotyping
Richard Young (MIT) - Functional Genomics, biology
Eric Lander (MIT) - Sequencing, genotyping, and database
Neil Risch (Stanford) - Statistics in gene mapping and linkage study
David Botstein (Stanford): Functional genomics, gene mapping, biology
Mangan, NIDCR
Leslie Loew (Univ of Connecticut): The Virtual Cell Project; Systems
Biology Markup Language- biochemical network models
Richard Young (MIT): Genome transcriptional regulatory networks; Validation
of metabolic data using Saccharomyces.
David Gifford (MIT): predictive modeling in biology using voluminous
data from high throughput technologies.
Liu, NINDS
Perry L. Miller MD/PhD
Director, Center for Medical Informatics, Professor of Anesthesiology
& Molecular, Cellular, and Developmental Biology, Yale University
School of Medicine. Dr. Miller served on the Board of Directors of American
Medical Informatics; Scientific Advisory Committee and Central Committee
on Education of the American College of Medical Informatics (ACMI); and
as Session Chair on the Bioinformatics Workshop Planning Committee of
National Academies of Science. (http://ycmi.med.yale.edu/people/miller.html)
Phone: 203-764-6715 Email: perry.miller@yale.edu
Mark H. Ellisman PhD
Director, Biomedical Informatics Research Network (BIRN); Council Member,
NCRR; Director, NCRR's National Center for Microscopy and Imaging Research
at San Diego; Professor of Neurosciences and Bioengineering, Director,
Center for Research in Biological Structure UCSD.
(http://medicine.ucsd.edu/neurosci/the-faculty/ellisman.html)
Phone: 853-534-2251 Email: mark@ncmir.ucsd.edu
Terrence J. Sejnowski PhD
Professor, Head, Computational Neurobiology Laboratory, Salk Institute.
Dr Sejnowski is a pioneer in the field of computational neuroscience.
(http://www.salk.edu/faculty/faculty/details.php?id=48)
Phone: 858-453-4100 Email: tsejnowski@ucsd.edu
John Miller PhD
Professor of Neuroscience and Director, Center for Computational Biology,
Montana State University
Dr. Miller served on the President's Information Technology Advisory Committee
(PITAC)
(http://www.nervana.montana.edu/people/jpmbio.html)
Phone: 406-994-7332 Email: jpm@cns.montana.edu
Shankar Subramaniam PhD
Professor, Department of Bioengineering, Department of Chemistry, Biochemistry
and Biology, San Diego Supercomputing Center, UCSD.
Dr. Subramaniam served on the original committee that developed the BISTI
Report in June 1999
(http://www-bioeng.ucsd.edu/research/research_groups/compbio/shankar.html)
Phone: 858-822-0986 Email: shankar@sdsc.edu
David B. Searls PhD
Director, Bioinformatics Division, Genetics Research, GlaxoSmithKline
Pharmaceuticals.
Dr. Searls is the first one to use sophisticated linguistic methods in
the analysis of DNA sequences. He served on the BISTI Workshop in January
2003
(http://www.gsk.com/press_archive/sb/1996/press_19960610.htm)
Phone: 610-270-4551 Email: david_b_searle@gsk.com
Rosenberg, NCI
It might be very helpful to hear from the CEO of Quintiles, a large, premier
contract research organization that conducts clinical trials for industry.
It might also be worthwhile to hear from Peter Goodfellow, who recently
was Senior vice president for discovery and research, GlaxoSmithKline.
Dr. Goodfellow was invited to speak at the 2001 Jackson Lab Short Course,
where I heard him talk about how well (or not) the big pharma companies
are able to make drug discoveries.
Twery, NHLBI
Howard J. Jacob, Ph.D.
Medical College of Wisconsin (MCW)
phone (414) 456-4887; fax (414) 456-6516
Email: JACOB@MCW.EDU
Biographical page - http://hmgc.mcw.edu/laboratories/jacob/jacoblabpage.html
Dr. Jacob is a leading figure in the field of functional genomics. He
chairs a consortium of ten NHLBI supported centers in the Program for
Genomic Applications, and serves the MCW as Director of the Human and
Molecular Genetics Center, and the Warren P. Knowles endowed chair in
genetics. Dr. Jacob has led development of genomic strategies to elucidate
disease pathophysiology in rat and human. Specifically Dr. Jacob is elucidating
the role of genetic factors in hypertension, cardiovascular disease, cancer,
and diabetes. His approach is based on the premise that the onset and
later progression of disease may not represent a linear cascade of events
linked simply by a chronological exacerbation of the pathological features,
and that the genetic factors that determine the severity of the final
pathological outcome may be different than those that triggered the disease
process.
Christopher R. Johnson, Ph.D.
University of Utah
phone (801) 581-7705; fax (801) 585-6513
Email: crj@sci.utah.edu
Biographical page - http://www.sci.utah.edu/personnel/crj.html
Dr. Johnson is the founding Director of the Scientific Computing and
Imaging Institute at Utah and is widely recognized for his application
of scientific computing approaches to biomedical problems including imaging
problems, adaptive methods for partial differential equations, automatic
mesh generation, numerical analysis, large scale computational problems
in medicine, and scientific visualization. He currently directs several
center programs supported by NHLBI and NCRR to enhance biomedical informatic
approaches.
Leonard Zon, Ph.D.
Harvard University
phone 617-355-7707; fax 617-738-5922
zon@rascal.med.harvard.edu
Biographical page - http://cbr.med.harvard.edu/jpitm/mentors/zon.html;
http://zon.tchlab.org/
Dr. Zon is President of the International Society for Stem Cell Research,
and a Howard Hughes Investigator. His genomic and developmental
approaches in zebrafish to study hematopoiesis and neoplasia are powerful
tools leading to the development of novel interventions and chemotherapeutic
agents. Dr. Zon is also an active participant in the zebrafish genome
mapping project.
Shankar Subramaniam, Ph.D.
University of California at San Diego
phone (858) 822-0986
E-mail shankar@sdsc.edu
Dr. Subramaniam is a leading member of the San Diego SuperComputer Center
faculty and the center program, Alliance for Cellular Signalling, that
is pioneering informatic approaches for the collaborative study of cellular
pathways in cardiomyocytes and the immune system. His research includes
the analysis of protein structure and the development of microarray strategies
for detecting genes and pathways.
Perry Miller, M.D., Ph.D.
Yale University School of Medicine
phone 203-764-6715; fax 203-764-6717
Biographical page - http://fondue.med.yale.edu/people/miller.html
Dr. Miller has broad experience ranging from computer-based decision
support for public health to computational biology for genomic and proteomic
research. He is Director of the Center for Medical Informatics at Yale
University School of Medicine, and is widely recognized as a leader concerned
with biomedical informatics research and training. His research includes
working on informatic approaches for the study of infectious agents, neurobiology,
and cardiovascular disease.
Andrew D. McCulloch, Ph.D.
University of California, San Diego
phone (858) 534-2547; fax (858) 534-5722
Biographical page - http://cardiome.ucsd.edu/Research.html
Dr. McCulloch is Director of the BioNOME Resource, a web-based repository
of bio-computational models and observational data, at the San Diego Supercomputer
Center.
His approach is to integrate experimental and computational models to
investigate the cellular structure of cardiac muscle and the electrical
and mechanical function of the heart. He is involved in a broad range
of informatics-dependent activities including high-throughput cardiac phenotyping;
tissue engineering of the cell microenvironment using microlithography
to study cell-matrix interactions; computational modeling applied to fluorescence
optical mapping of mechanoelectric action potential propagation; and in-silico
modeling of signal transduction pathways related to calcium cycling.
IC Inventories
NIGMS
Databases:
PharmGKB The database associated with the NIGMS pharmacogenomics effort.
HIV RT & Protease Sequence Database. ($280,000)
HIV Protease Database. ($50,000)
Protein Data Bank: Administered by NSF. Co funded by NSF, NIGMS, NLM,
and DOE. NIGMS provides about 40% (1.7 million dollars) of the support.
Uniprot: Administered by NHGRI. NIGMS provides $1,000,000 per year.
Databases Associated with other Large Efforts (ca. $750,000 each) :
Glue grants: Alliance for Cellular Signaling
Consortium for Functional Glycomics
Cell Migration Consortium
Inflammation and the Host Response
Protein Structure Initiative
Each of the nine projects has associated with it informatics and data management
efforts.
Computational Biology/Bioinformatics Research Grants:
(Not part of the center for Bioinformatics and Computational Biology)
Theoretical Studies of Biological Macromolecules: 70 Grants ca. 15 million
dollars.
Population Biology: 70 grants ca. 16 million dollars.
Within the Center for Bioinformatics and Computational Biology:
Research Grants:
Bioinformatics: 15; ca. $3,000,000
Computational Biology: 49; ca. $11,000,000
Centers of Excellence: 2; ca. $5,000,000
Precenters
NIGMS Announcement: 2; ca. $400,000
BISTI Precenters: 8; $3,000,000
NIEHS
The mission of the National Institute of Environmental Health Sciences
(NIEHS) is to reduce the burden of environmentally associated disease
and dysfunction by defining how environmental exposures affect our health,
how individuals differ in their susceptibility to these exposures, and
how these susceptibilities change over time. The Institute fulfills its
mission through three main channels: 1) research conducted by NIEHS personnel
on site (intramural research); 2) support of research by groups external
to NIEHS (extramural research) and through training grants; and 3) the
leadership role in the National Toxicology Program (NTP), an interagency
program within DHHS designed to broaden toxicological characterization
of environmental chemicals and to develop and validate tests for toxic
environmental agents. All three of these research channels make substantial
use of and contributions to bioinformatics and computational biology.
The studies conducted by the Division of Intramural Research (DIR) are
often long term and high-risk basic research efforts and involve unique
components, such as epidemiological studies of environmentally associated
diseases, and intervention and prevention studies to reduce the effects
of exposures to hazardous environments. The Laboratory of Computational
Biology and Risk Analysis is a good example of DIR research combining
development of laboratory methods for humans and animals with computational/statistical/
mathematical methods to further our understanding of the mechanisms underlying
environmental disease and apply these methods in a risk assessment framework.
The close collaboration between laboratory and computational scientists
in this research group has had direct impacts on the use of mechanistic
data by regulatory agencies in understanding and quantifying health risks
from a number of important environmental and occupational hazards such
as dioxin, butadiene, mercury and phthalates. The Laboratory of Structural
Biology is another good example wherein the relationships among the atomic
level structures of macromolecules and their biochemical properties, their
abilities to interact with substrates and other molecules including those
of environmental concern, and their functions in vivo are investigated.
This requires an integrated approach wherein x-ray crystallography, nuclear
magnetic resonance, mass spectrometry and computational chemistry are
combined with biochemical and genetic approaches. DIR scientists are also
actively involved in translational research. New advances in cell and
molecular biology are being extended not only into molecular medicine
(from bench to bedside) but also into disease prevention (from bench to
longer, healthier lives). In sum, DIR scientists are involved in research
that contributes to our basic understanding of biological and chemical
processes, to our understanding of the role of environmental agents in
human disease and dysfunction, and to the underlying mechanisms of environmentally
associated diseases. The National Toxicology Program is recognized as
the most thorough and scientifically comprehensive toxicology testing
program in the world. In addition to its mandate to provide critical data
for public health decisions, the NTP continuously strives to translate
emerging research methods into mainstream use through their testing program.
In computational biology, the NTP is developing pharmacokinetic and biochemical
models on a routine basis as part of their testing program. In its extramural
research program, NIEHS supports bioinformatics and/or computational biology
core facilities at MIT, Harvard, UNC, Oregon Health and Science U
(OHSU), Duke, Harvard, Fred Hutchinson Cancer Research Center/UW, Texas
A&M, UT (San Antonio), U Cincinnati, and the Mt. Desert Island Biological
Laboratory. These bioinformatics and computational cores provide direct
support to the extramural Toxicogenomics Research Consortium; the Environmental
Genome Project Comparative Mouse Genomics Centers Consortium, which develops
mouse models to determine the functional significance of human DNA polymorphisms;
the Human DNA Polymorphism Discovery and Characterization re-sequencing effort,
involving functional analysis of polymorphic variants in environmentally
responsive genes; the UW-FHCRC Variation Discovery Resource (SeattleSNPs);
and the Comparative Toxicogenomics Database (primarily for marine species)
at Mount Desert Island Biological Laboratory. In September 2000, the NIEHS
created the National Center for Toxicogenomics (NCT). This Center will
seek to promote new understanding of the mechanisms of biological responses
to environmental stressors, including toxic injury, and to identify biomarkers
of exposure and disease that can be used to improve and protect human
health. New computational and bioinformatics tools together with global
gene, protein and metabolite expression analysis methods will play a significant
role in improving our understanding of toxicant-related disease. When
combined with information on gene/protein groups, functional pathways
and networks, and human genetic polymorphisms, these data will confer
new knowledge of gene-environment interactions and human health risks.
This information will be captured and continually updated in the Chemical
Effects in Biological Systems (CEBS) knowledge base. CEBS will be linked
extensively to other databases and to Web genomics and proteomics resources,
providing users the suite of information and bioinformatics and computational
tools needed to fully interpret global molecular expression datasets.
NIMH
The Theoretical and Computational Neuroscience Research Program supports
research investigating the development and application of realistic models
for the analysis and understanding of brain function. It focuses on research
projects combining mathematical and computational tools with neurophysiological,
neuroanatomical, or neurochemical techniques in order to decipher the
mechanisms which underlie specific neuronal and behavioral systems. It
also supports research projects focusing on understanding the computations
made by nerve cells and groups of nerve cells in orchestrating behaviors.
The Neurotechnology Program supports research and development of new
technologies and approaches for studying the brain and behavior, including
projects in basic and applied informatics. Tools supported include software
for analysis of behavior, images, molecular data, etc. Resources supported
include databases for gene expression in the brain (and other spatial
information), protein structure, genetic sequence data, etc. Basic research
supported includes the novel application of mathematical and statistical
approaches to advance informatics related to brain and behavioral research.
It is the NIMH home for several initiatives that include multiple Institutes
and Centers of NIH, including those of BECON and BISTIC.
The Neuroimaging Informatics Technology Initiative is jointly sponsored
with NINDS, and provides coordinated service, training, and research to
develop and enhance the utility of informatics tools related to functional
magnetic resonance imaging used in brain and behavioral research.
The Office on Neuroinformatics plans, directs, coordinates, and supports
activities of the Human Brain Project. The Office gives grants that will
lead to new digital and electronic tools for all domains of brain and
behavioral research. The approaches and technologies studied under this
grant funding initiative are being utilized to generate information that
is generalizable, scalable, extensible, and interoperable.
NHGRI
The Genome Informatics program supports research in computational biology
that will enable the development of tools for sequence analysis, gene
mapping, complex trait mapping and genetic variation. These tools include
mathematical and statistical methods for the identification of functional
elements in complex genomes; the identification of patterns in large datasets
(for example, microarray data); and the mapping of complex traits and
genetic variations (for example, single nucleotide polymorphisms, or SNPs).
The program also encourages development and maintenance of databases
of genomic and genetic data. Of particular importance is the continued
maintenance of genome databases that link the genome sequence of model
organisms with the biology of the organisms. This emphasis also includes
new tools for annotating complex genomes so as to expand their utility.
The program also supports the production of robust, exportable software
that can be widely shared among different databases in order to facilitate
database interoperability. These bioinformatics resources will allow the
scientific community efficient access to genomic data, which will enable
new types of analyses. The analyses, in turn, will allow for the computer
modeling and subsequent experimental validation of the complex pathways
and networks that ultimately determine the phenotype of a cell or the
causes of many human diseases.
NCRR
NCRR supports the development of computational infrastructure in a number
of different ways. We fund the following P41 centers in modeling/simulation:
Leslie Loew (systems biology), Ralph Roskies (supercomputer - database
analysis - genetics and models), Peter Arzberger (supercomputer - database
analysis - genetics and models), Thomas Ferrin (molecular graphics), Charlie
Brooks (molecular simulations), and Klaus Schulten (molecular simulations).
A major area of emphasis at NCRR is the Biomedical Informatics Research Network (BIRN).
This program is focused on solving problems involved with the federation
of databases. A number of pilot projects are underway in this initiative.
NCRR has sponsored a program announcement in the area of collaborative
science that is just making its way through ENS and involves computational
biology. The purpose of this program announcement is to invite proposals
to develop tools and techniques to harness the unprecedented volume of
data generated by collaborations between researchers. Proposals dealing
with data from either research laboratories or from the clinical laboratories
are welcome. Using these new tools and techniques, it is expected that
two or more laboratories will be able to productively collaborate in ways
that are not currently possible.
NCRR has also recently initiated a program announcement in software maintenance.
The goal of this PA is to support the continued development, maintenance,
testing and evaluation of existing software. The proposed work should
apply best practices and proven methods for software design, construction
and implementation to extend the applicability of existing bioinformatics/computational
biology software to a broader biomedical research community.
Finally, NCRR supports the development of software for instrumentation
through PAR-03-075.
NIAAA
I. Bioinformatics and computational biology for analysis
of sequences and sequence-related resources
- Interest: Tools that can better explore the information embedded
in DNA and protein sequences. For example, regulatory elements in promoters
and introns, alternatively spliced RNAs, identification of functional
domains or signaling sequences, and prediction of protein structures.
Generation of sequence-related databases and tools for mining these
resources.
- Ongoing activity: (1) A bioinformatics core is developing
software that can analyze the 5'-UTRs of the differentially expressed
genes. (2) Trans-NIH involvement: rat genome database, MGC, HapMap.
II. Bioinformatics and computational biology on studies of genetic
linkage and epidemiology
- Interest: Bioinformatics, statistical, and computational tools
that can assist the analysis of genetic linkage or epidemiological studies
in complex diseases, for example, QTL analysis.
- Ongoing activities: A NIAAA-funded program "Collaborative
Studies on the Genetics of Alcoholism" (COGA) has been using and
developing tools and databases for genetic linkage analysis.
III. Bioinformatics and computational biology for studies in genomics
and functional genomics
- Interest: Bioinformatics, statistical, and computational tools
or methods for image analysis, data analysis, data comparison, data
storage, data sharing, and data publishing.
- Ongoing activity: (1) Two microarray centers, two bioinformatics
cores, and several microarray projects are currently funded. (2) 3 relevant
RFAs (microarray, proteomics, and ENU targeted random mutagenesis) have
been issued. (3) 2 relevant PAs will be issued (proteomics, novel technologies).
(4) 4 potential initiatives are being discussed - functional genomics
alliance, genetical genomics, metabolomics, and computational biology.
IV. Bioinformatics and computational biology for deciphering
biochemical or signal pathways, and their interactions (networks).
- Ongoing activity: Bioinformatics cores, neuroinformatics resource
facilities, and some initiatives under discussion.
V. Bioinformatics and computational biology for studies in structural
biology
- Interest: database for protein structure, tools to compare
and predict structures, and tools to identify structure-fitting compounds.
- Ongoing activity: Ethanol binding site of glycine, GABA, nicotinic
acetylcholine, and serotonin receptors.
VI. Bioinformatics and computational biology used in imaging
- Interest: computational tools for better analyzing and visualizing
imaging data.
- Ongoing activity: Many NIAAA-funded projects involve neuroimaging
approaches.
VII. Computational or mathematical modeling to study "functional
modules" in the cell.
VIII. Computational or mathematical modeling for biological
system, whole organism, or behavior.
- Ongoing activity: 1 neuro-computational research project.
NIDCR
- Microbial genomic sequences - full and draft (8x) coverage
- Microarrays
- Proteomics - function and structure
- Metabolome
- Use of bioinformatics tools (e.g., comparative genomics) to identify
genes with unknown functions
- Databases - Oragen (Oral Pathogens Relational Database at Los Alamos)
- SNPs
- Registries - cell, tissue, saliva
- Animal model systems
- Biological imaging
- Biological structure - proteins, carbohydrates, tissues; microbial
biofilms
- Computer assisted training
virtual head
NINDS
A. Trans Government Agency Activities
1. RFA: Joint NSF-NIH Initiative to Support Collaborative Research
in Computational Neuroscience (NSF-02-18/NS-02-501)
NINDS has led the Trans-Agency Working Group of seven NIH Institutes and
four Directorates at NSF that developed this initiative. The review was
conducted at NSF and the two agencies jointly funded 31 grants out of
157 applications. Currently, the Working Group is developing a PA (FY2004)
as follow up to the RFA.
2. The Human Brain Project (HBP)
NINDS is participating in HBP, a trans-agency (NIH, NSF and DOE) initiative
led by NIMH that supports research and development of advanced neuroinformatics
technologies and infrastructure through cooperative efforts among neuroscientists
and information scientists. Currently, four PAs are active under HBP.
(http://www.nimh.nih.gov/neuroinformatics/index.cfm)
B. Trans NIH Activities
1. Biomedical Information Sciences and Technology Initiative (BISTI)
NINDS has been participating in BISTI. This initiative is aimed at making
optimal use of computer science and technology to address problems in
biology and medicine. Currently, three PAs are active under BISTI. (http://www.bisti.nih.gov)
2. Neuroimaging Informatics Technology Initiative (NIfTI)
NINDS has joined the NIMH to sponsor NIfTI, to provide coordinated and
targeted service, training, and research to facilitate the development
and enhance the utility of informatics tools related to neuroimaging.
The current focus is on fMRI-related informatics tools. Under the NIfTI
program, four workshops were held and one RFA (RFA-MH-02-008 Characterizing,
Validating, and Comparing Neuroimaging Informatics Tools) was funded.
(http://nifti.nimh.nih.gov)
3. Nature Neuroscience Special Supplement Computational Approaches
to Brain Function
In 2000, the NINDS, NIMH, NIDA and NIAAA Working Group supported the publication
of this special supplement. It contains a preview of the NIH and eight
review articles. It was widely distributed in the neuroscience community
and well received. (http://www.nature.com/neuro/supplements)
C. NINDS Activities
1. NANDS Council Subcommittee on Computational Neuroscience, Neuroinformatics
and Infrastructure
This subcommittee was established in February 2000. It meets prior to
and reports to the NANDS council. The role of the committee is to assess
the needs in the computational neuroscience and neuroinformatics field
and monitor the related activities supported by NINDS.
2. Workshop: Computational and Theoretical Neuroscience - From Synapse
to Circuitry (April 28, 2000)
Participants of this workshop were theoreticians and computational scientists
who have been working very closely with experimental neuroscientists.
The workshop identified research areas of emphasis for the NINDS; it also
gave the following general recommendations: establish cross-cultural collaboration;
promote interdisciplinary training; create career opportunities for computational
scientists in biological research fields; educate computational scientists
to become successful NIH applicants.
(http://www.ninds.nih.gov/news_and_events/computationalwkshp_technical.htm)
3. Grant Funding
The NINDS is currently supporting 125 grants related to computational
biology and 58 related to bioinformatics. The grants can be broken down
by level (molecular, cellular, system, behavioral) and by disease, such
as epilepsy and Parkinson's disease. Some of them are related to
neuroimaging or neural prostheses.
Peng, NIBIB
The mission of the NIBIB is to focus on new and novel technology, including
algorithm and model development. The areas of bioinformatics and computational
biology, therefore, focus specifically on new and novel mechanistic developments
in the algorithm or model, rather than the application of existing algorithms
or models to a particular organ or disease area.
Bioinformatics Areas
Includes imaging informatics such as the study, invention, and implementation
of structures and algorithms to improve communication, understanding and
management of information related to biomedical images. Includes knowledge-based
systems, deformable atlases, integration of imaging information from diverse
imaging methods, automation of image-guided treatment, outcome studies
and meta-analyses.
Includes development of new technologies to collect, store, retrieve,
and integrate quantitative data ranging from the genome to the organism
and to elucidate functional dynamics in living cells and tissues with
sensitivity down to the level of single molecules.
Includes large-scale data-driven knowledge base and database methods
that support data mining, statistical analysis, systems biology and modeling
efforts. Integration of different data types, especially time-variant
data; development of database and software infrastructure standards; design
of experiments to build mesoscopic databases, including those describing
the physico-chemical properties of gene products and databases on physiological
function; development of tools for information management and dissemination
to cope with the large amount of data generated by combinatorial approaches
are all included.
Includes development and application of classification systems and standard
terminology to improve health care and reduce costs and creation of better
techniques to facilitate the accurate and efficient collection of information
from physicians, other health professionals, and patients. Methods may
include user-friendly remote sensing devices for home and community use.
Includes improvement of computer science methods to protect confidentiality
of patient data; development of methods for structuring, managing, and
analyzing large, distributed, networked, adaptive databases; development
of methods for acquiring patient data and knowledge; and development of
methods for sharing knowledge for multiple purposes and updating disseminated
information as it is superseded by more recent data.
Basic research in computer science (such as software engineering methods,
high-end computing (software and hardware), high-performance networking,
grid computing, knowledge representations, ontology development, basic
algorithms and solvers, and methods and standards for data and image manipulation)
that is expected to have long-term biomedical impact should be referred
to NIBIB.
Includes software and hardware for image display, visualization, and
computer-aided interpretation. Studies of image perception and psychophysics
related to imaging devices; development of display systems and methodologies
that facilitate the review of large volumes of image data; enhanced methods
for early detection of diseases and disorders; image computation, including
hardware and software for image reconstruction and processing. Computer-aided
image analysis methodologies for increasing the specificity of current
clinical imaging methods; incorporation of biostatistical components in
image analysis programs; and development of human-computer interfaces
for simulation of medical procedures or virtual manipulation of physical
and quantitative models. Image processing, including the segmentation,
filtering, reformatting, augmentation, and registration of images for
improved detection, diagnosis, and treatment of disease and injury.
Surgical tools and techniques include the development of new medical
technologies, including image-guided therapies, computer-assisted surgeries
and large-scale simulation modeling to improve surgical outcomes.
Computational Biology Areas
Includes studies that focus on the development of algorithms, mathematical
models, simulations and analysis of complex biological, physiological,
and biomechanical systems and that use genomics and proteomics (examples include
studies in systems biology and multi-scale modeling approaches).
Systems biology approaches include the development of data-driven,
genome-based, organism-scale models for the analysis, interpretation,
and prediction of the genotype-phenotype relationship; development of
modeling, simulation, and statistical theory to describe single cell behavior
to parallel empirical observation; and development and testing of comprehensive
informatic-based multivariate physiological models that span scales from
the cell to the organism, e.g., to uncover the rules of nonlinear cellular
and systemic regulation. Includes development of physiologically based
mathematical models to predict therapy delivery or in vivo remodeling;
development of knowledge-based modeling which incorporates analysis for
validation and testing; and construction of methods for visualizing and
interpreting large and possibly heterogeneous data sets and the results
of multivariate, time-dependent simulations of biological and biomolecular
systems.
NHLBI
The NHLBI is involved in a broad range of biomedical informatic and computational
biology activities addressing the need to improve our understanding of
genomic, proteomic, and physiological functions in health and disease.
Generally, specific informatic activities are integrated with ongoing
scientific programs. The ongoing Programs for Genomic Applications (PGAs)
are a new initiative to advance functional genomic research related to
heart, lung, blood, and sleep health and disorders. The PGA centers have
developed web-based approaches for collaboration and the open dissemination
of databased research information including mutant and transgenic rodent
strains and phenotypes, DNA sequence comparisons, cDNA libraries, protein
sequences, and large-scale microarray expression profile data of human
disease models. Coordination on cross-cutting bioinformatic issues such
as database nomenclature has enhanced interoperability and facilitated
comparisons across species. A unique feature of the PGA program is that
the bioinformatic and computational tools are freely distributed, and
that the external use of these bioinformatic and computational resources
is facilitated through a regular schedule of training workshops, courses,
and visiting scientist programs. A new NHLBI Proteomics Initiative will
converge results from different platform technologies that analyze intracellular
and secreted proteins related to heart, lung, blood, and sleep disease
processes. A key component is the development of relational software and
statistical analysis regimens that will allow the comparison and correlation
of different datasets generated by common but diverse technologies such
as fluorescence activated cell sorters (FACS), robotic microarray printers
and scanners, and microfabrication design for capillary electrophoresis
equipment. The goal is to facilitate interdisciplinary research across
diseases and model organisms.
The Institute specializes in computational and database platforms necessary
to facilitate gene finding, data mining, combinatorial partitioning and
neural network models for heart, lung and blood disease. Genelink is a
new program that facilitates data sharing, information exchange and meta-analysis
in ongoing NHLBI family studies. An important activity is examining the
efficacy and safety of new data gathering methodologies, monitoring nationwide
health trends, and developing innovative diagnostic and informatic technologies
in the epidemiologic setting to promote diagnosis and information gathering.
The NHLBI has a limited role in the curation and limited-access dissemination
of epidemiological datasets for research.
A diverse array of informatic and computational activities are also included
as core subprojects within a relatively large portfolio of interactive
center and program project grants in heart, lung, and blood diseases.
The NHLBI participates in the program announcements of the NIH Biomedical
Information Science and Technology Initiative Consortium and funds a portfolio
of phased innovative awards (R21/R33) and pre-center bioinformatic planning
grants. These activities produce mutant libraries and gene expression
microarrays; phenotype and gene databases for animal models; computer
simulation of molecular and tissue mechanics in disease processes; and
enhanced treatment management systems to facilitate physician decision-making.
The NHLBI Division of Intramural Research (DIR) supports a spectrum of
tools ranging from web-accessible clinical and laboratory databases with
data warehousing to dedicated clusters of computers and custom software
supporting molecular modeling and simulation, real-time MRI imaging of
the heart, or pattern recognition analysis of large genomic and proteomic
data sets. The DIR has both centralized information technologists for
tools to support research and tenured and career scientists whose
research interests require computational methods to address scientific
questions.