
The AntibodyRegistry is proud to announce that, as of May 2011, we are working with the antibody database provided by the Journal of Comparative Neurology. This database was created through the careful work of Dr. Clifford Saper, who implemented a visionary policy requiring rigorous categorization of all antibodies used in all manuscripts submitted to the journal. This collaboration allows the AntibodyRegistry to add important links between the antibodies in heavy use and all antibodies available for a particular antigen, especially those useful for neuroscience. The simple search of the registry has also been updated to reflect the relative importance of the antibodies used in Journal of Comparative Neurology papers.
The goal of the AntibodyRegistry is to provide a stable, traceable, permanent identifier for every antibody product created by large commercial vendors and individual labs. The identifier can be included in any publication to identify the antibody used and to trace that antibody back to its creator.
In general, recognizing entities in text is difficult for both machines (text mining and natural language algorithms) and humans, because both often need much more information to identify an entity than is present in the text. Most journals require information such as the city and state of the reagent vendor, but do not require more useful information such as the catalog number. In the case of antibodies, most companies offer multiple antibodies against a single gene product, either at the time of publication or some time later, so the published paper often under-specifies the reagent used.
The Neuroscience Information Framework (NIF), as part of its mission to make scientific data and resources discoverable, was tasked with a pilot project to determine the feasibility of automatically identifying entities in text, and we were granted the ability to automatically index one full volume of the Journal of Neuroscience. Dr. Martone, an expert anatomist and project lead, generated a list of antibodies from the same volume, which served as a control dataset for comparing human and machine efforts. In 8 articles, 106 (95 unique) antibodies were identified, and 52 of those references did not contain enough information to unambiguously determine the catalog number. Only a few antibodies were identified with either a clone ID or a catalog number, although supplier name and URL were available in all but 26 cases. No antibodies had lot numbers associated. Automated systems were not deployed, because they cannot be expected to succeed when a human expert, using the papers (often going back to previous work of the authors) and company catalogs, can identify less than 50% of the antibodies.
The solution to the problem requires a change in publishing practices, not software. To aid in this change, we have negotiated agreements to download a set of information for more than 800,000 commercially available antibodies, including a unique identifier for each antibody that, we propose, should be included in any new publication that reports the use of antibodies. The information includes the vendor, catalog number, clone ID, antibody target, target subregion (where available), target modification, target species, raised-in species, target Entrez ID, clonality, name and comments. The data model was aligned with the model created by the Eagle-I project, which is tasked with cataloging non-commercial antibodies.
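As a rough illustration of why the catalog number matters, the fields listed above can be sketched as a simple record type. This is only a sketch under stated assumptions: the class name, field names, helper functions, identifier format, and all sample values below are hypothetical placeholders, not the registry's actual schema or data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AntibodyRecord:
    """One registry entry. Field names are illustrative, not the real schema."""
    registry_id: str                 # unique identifier (format here is hypothetical)
    vendor: str
    catalog_number: str
    target: str
    target_species: str
    raised_in_species: str
    clonality: str
    name: str
    clone_id: Optional[str] = None
    target_subregion: Optional[str] = None   # recorded only where available
    target_modification: Optional[str] = None
    target_entrez_id: Optional[str] = None
    comments: str = ""


def match_vendor_and_target(records: List[AntibodyRecord],
                            vendor: str, target: str) -> List[AntibodyRecord]:
    """Return every record for a vendor/target pair (case-insensitive)."""
    v, t = vendor.casefold(), target.casefold()
    return [r for r in records
            if r.vendor.casefold() == v and r.target.casefold() == t]


def match_catalog_number(records: List[AntibodyRecord],
                         vendor: str, catalog_number: str) -> List[AntibodyRecord]:
    """Return every record for a vendor/catalog-number pair."""
    v = vendor.casefold()
    return [r for r in records
            if r.vendor.casefold() == v and r.catalog_number == catalog_number]


# Placeholder data: one vendor selling two antibodies against the same target.
RECORDS = [
    AntibodyRecord("EXAMPLE-0001", "Example Vendor A", "EX-001", "ExampleProtein",
                   "mouse", "rabbit", "polyclonal", "Anti-ExampleProtein"),
    AntibodyRecord("EXAMPLE-0002", "Example Vendor A", "EX-002", "ExampleProtein",
                   "mouse", "mouse", "monoclonal", "Anti-ExampleProtein",
                   clone_id="X1"),
    AntibodyRecord("EXAMPLE-0003", "Example Vendor B", "B-100", "ExampleProtein",
                   "rat", "goat", "polyclonal", "Anti-ExampleProtein"),
]

if __name__ == "__main__":
    # Vendor and target alone leave the reagent ambiguous.
    print(len(match_vendor_and_target(RECORDS, "Example Vendor A", "ExampleProtein")))
    # Adding the catalog number narrows the citation to a single record.
    print(match_catalog_number(RECORDS, "Example Vendor A", "EX-002")[0].registry_id)
```

In this toy data, a citation that gives only the vendor and the target matches two records, while a citation that also gives the catalog number (or the unique identifier itself) resolves to exactly one.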
In addition, we have created an easy-to-use, web-accessible graphical interface, written in Java, which serves as a tool for any scientist to search for and register antibodies. With this extensive database, covering most commercial antibodies, and an easy-to-use registration tool, scientists should be able to disambiguate the antibodies that any text mining software or researcher finds in papers, freeing bright minds to focus on important scientific problems rather than tracking down reagents.
Use of the data is intended for non-commercial purposes. Commercial use of AntibodyRegistry data is not permitted under current license agreements.