BioNLP-Corpora
This page lists a collection of corpora related to extraction and annotation of protein residues, both plain amino acid mentions and mutation sites, in text.
The main SourceForge download page is http://sourceforge.net/projects/bionlp-corpora/files/ProteinResidue
MutationFinder Corpora:
   MutationFinder-1.1-Corpus.tar.gz
Nagel K (2009) Automatic functional annotation of predicted active sites: combining PDB and literature mining. Cambridge, UK: University of Cambridge.
The package ending in "_A1" is in the A1 format of the BRAT Annotation tool (http://brat.nlplab.org/). Thanks to S.V. Ramanam of NPJoint (http://npjoint.com/Cocoa_pre.html) for producing this version.
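For readers who want to load the "_A1" package programmatically, the following is a minimal sketch of reading text-bound annotations in the BRAT standoff layout (an ID, then a type with character offsets, then the covered text, separated by tabs). The function name `parse_a1`, the `TextBound` record, and the entity type names `Residue` and `Mutation` in the sample are assumptions made for illustration; the actual type labels used in the corpus files may differ.

```python
from typing import List, NamedTuple, Tuple


class TextBound(NamedTuple):
    """One text-bound annotation (hypothetical record type for this sketch)."""
    ann_id: str
    ann_type: str
    spans: List[Tuple[int, int]]
    text: str


def parse_a1(content: str) -> List[TextBound]:
    """Parse text-bound ("T") lines from BRAT standoff annotation content.

    Expected line layout: ID <TAB> TYPE START END[;START END...] <TAB> TEXT
    Lines that are not text-bound annotations, or that are malformed, are skipped.
    """
    annotations = []
    for line in content.splitlines():
        if not line.startswith("T"):
            continue
        parts = line.split("\t", 2)
        if len(parts) < 3:
            continue
        ann_id, type_and_offsets, text = parts
        ann_type, offsets = type_and_offsets.split(" ", 1)
        # Discontinuous annotations list several "start end" pairs separated by ";".
        spans = []
        for fragment in offsets.split(";"):
            start, end = fragment.split()
            spans.append((int(start), int(end)))
        annotations.append(TextBound(ann_id, ann_type, spans, text))
    return annotations


# Illustrative sample only; the type names here are assumed, not taken from the corpus.
sample = "T1\tResidue 0 6\tSer195\nT2\tMutation 21 26\tA123T\n"
for ann in parse_a1(sample):
    print(ann.ann_id, ann.ann_type, ann.spans, ann.text)
```

To use it on a downloaded file, read the file as UTF-8 text and pass the string to `parse_a1`; the offsets index into the corresponding plain-text document.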
Ravikumar, K.E., Liu, H., Cohn, J.D., Wall, M.E., Verspoor, K.M. (2011) "Pattern Learning Through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature". The Tenth International Conference on Machine Learning and Applications (ICMLA 2011), Honolulu, Hawaii, USA, December 2011.