The Annotation Projects
The Annotation Projects are a group of corpora annotated, both manually and automatically, with biological and linguistic markup. The automatic markup has been curated by human annotators. The texts in each corpus were selected by grouping biological concepts or the verbs that denote those concepts.
The projects available include:
- Expression: cell-specific expression of genes in GeneRIF texts
- Please cite the following paper for use of this corpus:
- L. Hunter, Z. Lu, J. Firby, W.A. Baumgartner, Jr., H.L. Johnson, P.V. Ogren, K.B. Cohen. (2008) OpenDMAP: An open-source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression. (link)
- Transport: protein transport in GeneRIF texts
- Please cite the following thesis for use of this corpus:
- Z. Lu. (2007) Text Mining on GeneRIFs. PhD thesis, Computational Bioscience Program, University of Colorado School of Medicine, CO, USA. (link)
- Regulation of Transport: regulation of protein transport in GeneRIF texts
- Please cite the following thesis for use of this corpus:
- Z. Lu. (2007) Text Mining on GeneRIFs. PhD thesis, Computational Bioscience Program, University of Colorado School of Medicine, CO, USA. (link)
- Activation: seven types of protein activation annotated to the Gene Ontology in GeneRIF texts
- Please cite the following paper for use of this corpus: [forthcoming]
- Protein Tag project: GeneRIFs tagged for proteins
Check back regularly for new projects.
We are interested in your feedback about these corpora.
Please direct all bug reports and comments about the contents of the corpora to the BioNLP-Corpora Bug Tracker. Be sure to choose the appropriate Annotation Project dataset (e.g. "Annotation Project: Transport") from the dropdown options in the "Category" field.
Maintained by Helen L. Johnson.
This file last modified Thursday, 18-Mar-2010 14:49:24 UTC