Bioinformatics
Definition
Bioinformatics is the branch of computer science that deals with the collection, organisation, analysis or interrogation of biological information, particularly genetic information. It also involves the use of 3-D modelling of biological molecules (biomolecules) and biological systems. In the Dairy CRC's genotyping laboratories, bioinformatics has to be an almost seamless part of the whole research effort.
Dr Kevin Nicholas, Project Leader 
The bioinformatics computer image displayed on the right is the result of four years of intensive experiments on the lactation genes of the tammar wallaby. The x axis shows the time, in days, of the lactation cycle. The distinctive peaks on the y axis show the particular genes involved at that time in the lactation cycle. (Right) Another part of Dr Nicholas' research: the microarray of the tammar wallaby.
Data types
It is barely 50 years since Watson and Crick made their famous discovery of the structure of the genetic molecule DNA, but in that time there has been an amazing advance in our understanding of genes and in our technical capacity to study them.
The recently completed human genome project provided us with the sequence of the 3 billion bases in the DNA of each human cell nucleus, and with a good idea about the organisation within the DNA of 30,000 genes.
It is now fairly easy for laboratories to do experiments in which they measure the extent to which each of those genes is expressed (turned on or off) within a certain tissue, such as liver or brain, under certain conditions. To make matters even more complex, we can also obtain information about how each gene varies within a population and whether particular variations are associated with traits of the organism, such as body weight or susceptibility to disease.
This is a very exciting time in biology but the need to be able to manage so much data - sequence data, expression data, and population data - has contributed to the emergence of a new area: bioinformatics.
Technologies
Bioinformatics needs to organise the data in a way that will make it possible for researchers to answer the sorts of questions in which they are interested.
Organising and structuring the data is often very difficult. Tables and databases that link with each other are the tools used to assist with this.
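As a rough illustration of linked tables, the sketch below uses Python's built-in sqlite3 module to join a gene table to an expression table. The table names, column names, gene identifiers and expression values are all hypothetical and chosen only for the example; they are not taken from any Dairy CRC database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two linked tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene (
            gene_id    TEXT PRIMARY KEY,
            chromosome TEXT
        );
        CREATE TABLE expression (
            gene_id TEXT REFERENCES gene(gene_id),
            tissue  TEXT,
            level   REAL
        );
        """
    )
    # Made-up example rows, not real measurements.
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?)",
        [("GENE_A", "chr6"), ("GENE_B", "chr14")],
    )
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [("GENE_A", "mammary", 8.2), ("GENE_A", "liver", 1.1), ("GENE_B", "mammary", 0.4)],
    )
    return conn


def genes_expressed_in(conn, tissue, min_level):
    """Return (gene_id, chromosome, level) for genes at or above min_level in a tissue."""
    rows = conn.execute(
        """
        SELECT g.gene_id, g.chromosome, e.level
        FROM gene AS g
        JOIN expression AS e ON e.gene_id = g.gene_id
        WHERE e.tissue = ? AND e.level >= ?
        ORDER BY g.gene_id
        """,
        (tissue, min_level),
    )
    return rows.fetchall()
```

The link between the two tables is the shared gene_id column, which lets one query combine where a gene sits with how strongly it is expressed.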
Usually specialised computer programs are necessary to analyse or query the data once it is organised. For example, it is often necessary to run a program that makes comparisons between the DNA sequences of different genes to find genes that might share the same function. Data in the public arena, such as the huge databases of genetic information from the human genome project and the mouse genome, can be used in such comparisons.
Cow genes have not been studied in as much detail as human or mouse genes, and within the Dairy CRC we are often limited to working only with partial information from a gene. Sometimes the best way to find out more about such a gene is to compare it with the 30,000 human genes, for which more information is known. If we find some that seem to be similar, then we can infer that our cow gene may have the same function as the human gene.
The most popular search program is BLAST (Basic Local Alignment Search Tool).
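To give a feel for what "local alignment" means, here is a minimal Python sketch that scores the best-matching stretch shared by two DNA sequences using the classic Smith-Waterman dynamic programming method. This is not how BLAST itself works internally (BLAST uses faster heuristics to search large databases); the scoring values, function names and sequences are assumptions made for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman score of the best local alignment between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A score never drops below zero, so an alignment can start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def best_match(query, database):
    """Return the name of the database sequence with the highest local score."""
    return max(database, key=lambda name: local_alignment_score(query, database[name]))


# Hypothetical sequences: a partial "cow" fragment and two "human" entries.
human = {"human_1": "GGACGTGG", "human_2": "CCCCCCCC"}
print(best_match("TTACGTTT", human))  # prints "human_1"
```

Because only the best-scoring stretch counts, a partial gene fragment can still find its counterpart inside a much longer sequence, which suits the partial cow gene information described above.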
Another important approach to data analysis is through statistics. In Dairy CRC studies, researchers may gather masses of information on the 30,000 genes in the bovine genome, usually by microarray experiments (refer to the GenEd Webglossary) that determine whether certain genes are switched on or off in certain areas, for example the mammary glands. Experiments are then conducted to see the effects of certain treatments, such as drugs, food or changes in environmental conditions, on gene expression.
Once all this data is collected from one animal, the scientists have to screen across populations to look at the tiny, and sometimes big, variations between individuals that can markedly affect the characteristic of interest.
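One simple statistic used when comparing expression between a treatment and a control is the log2 fold change. The Python sketch below is only an illustration under stated assumptions: the gene names, measurements and threshold are invented, and a real microarray analysis would also need normalisation, replicate-aware significance tests and correction for testing many genes at once.

```python
import math
from statistics import mean


def log2_fold_change(control, treated):
    """log2 of the ratio of mean treated expression to mean control expression."""
    return math.log2(mean(treated) / mean(control))


def flag_changed_genes(data, threshold=1.0):
    """Return genes whose absolute log2 fold change meets the threshold.

    data maps a gene name to a (control_values, treated_values) pair.
    A threshold of 1.0 corresponds to a two-fold change up or down.
    """
    flagged = {}
    for gene, (control, treated) in data.items():
        lfc = log2_fold_change(control, treated)
        if abs(lfc) >= threshold:
            flagged[gene] = lfc
    return flagged


# Hypothetical measurements for two genes.
example = {
    "GENE_A": ([2.0, 2.0], [8.0, 8.0]),  # four-fold increase
    "GENE_B": ([5.0, 5.0], [5.0, 5.0]),  # no change
}
print(flag_changed_genes(example))  # prints {'GENE_A': 2.0}
```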
So an interesting and important aspect of bioinformatics is that it combines several fields of expertise:
computational biology
statistics
molecular biology
computer science

Advantages
Large amounts of data can be analysed much more quickly than if working by hand. The raw data is collected from the laboratory genotyping experiments, then using bioinformatics the data is interrogated to extract useful information. For example, Dairy CRC scientists may find a set of sequences from their bovine tissue which they can compare with the human genome data to see if it correlates with a gene that has been identified in humans.
Bioinformatics is relatively inexpensive compared with 'wet' laboratory work on biological specimens and the very expensive technologies necessary to identify genes. If we can use the computer to predict which genes are important in a particular trait, then we can reduce the time spent in the laboratory trying to find the genes experimentally.
Disadvantages
Bioinformatics is difficult for someone educated as a molecular biologist. The ordinary desktop PC cannot manage or analyse all the gene information. Bioinformatics is very specialised, so institutions will usually have a centralised bioinformatics resource. The bioinformatician is required to present the information so that it is accessible to researchers, usually by producing web pages that interface with the results or databases.
Bioinformatics can also be difficult for someone educated in information technology but with less background in bioscience. It can take a lot of study to properly understand concepts such as recombination, transcription and translation, which are central to bioinformatics. In specific areas, current workers are strongly advocating reviewing all the information available, as new and better analyses may reveal important information.
Visualising thousands of data points to produce web-friendly information is often not an easy task. Graphs, heat maps and pictures help to ensure that the resulting data is meaningful to the reader.
As with many scientific endeavours, there is the risk that no useful information will come out of a study, so commercially there is the risk that results may not be delivered.
Reviews often show that reports contain incomplete information. For example, in a current Dairy CRC project, extensive reviewing of all relevant publications showed that important data was missing. The bioinformatician needed each study to be on pedigreed populations with production records and for which genomic DNA samples were available.
Despite this, the field of bioinformatics is expanding rapidly and is essential to biotechnology industries.
Bioinformaticians are highly sought after, and they enjoy the challenges in their fascinating endeavours. For more information, visit Making Connections - Careers.
Applications in Dairy CRC
The Dairy CRC has just published some work on QTLs (quantitative trait loci) in the cow. Each QTL is a region within the cattle DNA which is associated with some trait of interest, such as the amount of protein in milk. Comparing our own research data with articles in the public domain, through extensive review, has given our scientists a more precise idea of where QTLs are in the cow.
We have published a draft on-line QTL map for dairy production traits:
protein yield
protein per cent
fat yield
fat per cent
milk yield
Using bioinformatics we can try to answer questions such as:
How many genes are there in the QTL region?
What do they do?
Are any of them expressed in the mammary gland which produces milk?
Which genes vary within the bovine population?
What DNA sequences can we use in tests to identify such variants?
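The first of these questions, how many genes lie in a QTL region, comes down to checking which gene positions overlap a stretch of a chromosome. The Python sketch below shows one minimal way to do that; the Gene record, gene names and coordinates are hypothetical and are not drawn from the published QTL map.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """A gene's position on a chromosome (hypothetical record layout)."""
    name: str
    chromosome: str
    start: int
    end: int


def genes_in_region(genes, chromosome, region_start, region_end):
    """Return the genes that overlap the given region on the given chromosome."""
    return [
        g
        for g in genes
        if g.chromosome == chromosome
        # Two intervals overlap when each one starts before the other ends.
        and g.start <= region_end
        and g.end >= region_start
    ]


# Made-up coordinates for illustration only.
annotated = [
    Gene("GENE_A", "chr6", 100, 200),
    Gene("GENE_B", "chr6", 450, 600),
    Gene("GENE_C", "chr14", 150, 300),
]
hits = genes_in_region(annotated, "chr6", 150, 500)
print(len(hits), [g.name for g in hits])  # prints 2 ['GENE_A', 'GENE_B']
```

The later questions could then be approached by looking up each returned gene in function, expression and variation data.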
If the particular genes associated with these traits can be incorporated into breeding programs, pre-selection of young bulls could occur before expensive and time-consuming testing of offspring.
Selected Web Links
NCBI Science Primer
http://www.ncbi.nlm.nih.gov Go to About NCBI, Science Primer, Bioinformatics
Open Directory Project, UK
http://dmoz.org Go to Science, Search Bioinformatics
Biocomputing for Schools, Germany
http://www.uni-mainz.de/~cfrosch/bc4s/welcome.html
CSIRO Bioinformatics
http://www.bioinformatics.csiro.au