
Biodiversity Informatics is the management and manipulation of biodiversity data. To succeed in Biodiversity Informatics, you need a good background in both Systematics and Informatics (the computerized management and manipulation of data), because you have to bridge the gap between Systematists and Computer Scientists. Historically, a Systematist with few computer skills would often enlist the help of a Computer Scientist or technician with no Systematics training for a project that required large amounts of data processing. As one might predict, the lack of overlap in skills often meant that the project failed and, almost always, that its true potential was never realized. That potential was generally never even imagined, because neither of the two had the skills to see both the importance of all the data and all of the possibilities for exploiting them. Now, Biodiversity Informatics specialists are in high demand because of their ability to see both sides of the picture.

Science runs on data. The advent of electronic storage and manipulation of those data has fostered a revolution in how much we can learn from the enormous amount of biodiversity data that it is now possible to generate, store and repeatedly analyze to answer questions that were never dreamt of when the data were originally generated. For example, centuries ago, many systematists carefully recorded minute variations in physical form and geographic distribution among species in order to discover and describe the earth's diversity. Later systematists, using theoretical frameworks that hadn't existed before (e.g. evolution, plate tectonics, paleoecology), employed and added to those same data to discover the evolutionary history and relationships among species, as well as a great deal about the geologic relationships and climatic history of the areas in which those species occurred in the past.
Now, we can use large amounts of those same data collected over the centuries, and add in remotely sensed data (data derived from images and sensors in space), and computerized modeling and simulation techniques. We can figure out where a species will grow on earth, why it will grow there and what it might take to grow it elsewhere, as well as where it will likely grow on a globally warmed future earth.
By combining our knowledge of plant phylogeny with the large amount of DNA sequencing data that have been generated over the last few years, we can readily predict the function of many genes across many species of plants. That lets us identify potential plants to grow for medicines, food, biofuels, lumber, etc. It also helps us find other characteristics such as disease resistance, tolerance for bad soils or drought, and prolific production. We can also apply that knowledge back to the distribution data described above to find out where those species will grow and how they might affect other species (perhaps endangered species or weeds). Using the spectacular power of the internet, we can deliver that information to billions of people across the globe as it is developed. Much of this work is still in its infancy. The amount of biological data available is growing at an unprecedented rate, and the density of data that computers can process doubles every 18 months. It is all just waiting for new and creative Biodiversity Informaticists.

Stinger Guala, USDA, NRCS National Plant Data Center
