About the Lab
Much biological data is semantic in nature, which makes it a difficult substrate for computation. My research centers on organizing biological data in ways that make it more amenable to computation. I am particularly interested in developing ontologies that describe biological knowledge and provide a means for detailed analysis of the associated data.
Projects
- Sequence Ontology. The Sequence Ontology (SO) aims to unify the terminology used to describe biological sequences. It has been developed in conjunction with the model organism database groups to simplify data exchange and promote the development of computable genomic annotations.
The SO is curated and maintained by this lab. We are also developing software to facilitate using the ontology, and are having fun exploring genomic annotations.
- Gene Ontology. The Gene Ontology (GO) has given the biological community a unified vocabulary, which lets researchers communicate with each other effectively and analyze large quantities of data. The GO is an ontology that describes the classes of molecular function, biological process and cellular location, and the relationships that hold between them. Many of the model organism databases use it to label what gene products do, which processes they are involved in and where they are located. These functional annotations can then be used to search across genomes based on semantics rather than sequence similarity.
We are part of the Gene Ontology Consortium.
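As a rough illustration of what searching on semantics means, the sketch below builds a toy is_a hierarchy and returns every gene annotated to a query term or to any of its more specific descendants. The term hierarchy is deliberately simplified, the gene names are hypothetical, and the function names `ancestors` and `genes_annotated_to` are assumptions made for this example; none of it is the actual GO data or any GO software.

```python
# Toy is_a hierarchy: child term -> parent terms.
# Simplified for illustration; the real GO has more intermediate terms.
IS_A = {
    "protein kinase activity": ["kinase activity"],
    "kinase activity": ["catalytic activity"],
    "catalytic activity": ["molecular function"],
    "DNA binding": ["binding"],
    "binding": ["molecular function"],
    "molecular function": [],
}

# Hypothetical gene -> annotated terms (not real annotation data).
ANNOTATIONS = {
    "geneA": {"protein kinase activity"},
    "geneB": {"DNA binding"},
    "geneC": {"kinase activity"},
}


def ancestors(term, is_a=IS_A):
    """Return all terms reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def genes_annotated_to(query, annotations=ANNOTATIONS):
    """Return genes annotated to `query` or to any term beneath it."""
    hits = []
    for gene, terms in annotations.items():
        if any(t == query or query in ancestors(t) for t in terms):
            hits.append(gene)
    return sorted(hits)


print(genes_annotated_to("kinase activity"))  # ['geneA', 'geneC']
```

Here geneA is found even though it is annotated only to the more specific term, because the query follows the relationships in the ontology rather than matching labels or sequences.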
Funding
This lab is funded by the NIH (1R01HG004341-01) to develop software to facilitate the adoption of the Sequence Ontology, and by a subcontract to the Gene Ontology Consortium (P41 HG002273).