RSC Publishing



Changing the face of publishing


31 March 2009

InChIs, Open Standards and data collections - Richard Kidd describes how RSC Publishing is leading the way in technological innovation.

Richard Kidd, Informatics Manager
RSC Publishing has long been a champion of technological innovation, and its recent activities in promoting the creation and adoption of new standards promise to transform the way that science is published, not just within its own publications but for all creators and users of scientific data.

Identifying and linking compounds


RSC Publishing has been an enthusiastic early adopter of the new standard for chemical compound identifiers, the InChI, developed by IUPAC and NIST. The InChI identifier contains full structural information about a compound, but can get long and unwieldy for normal use. A fixed-length InChIKey, which is more search-engine friendly, can be derived from the InChI. While the InChI can be converted back to the original chemical structure, the InChIKey cannot: it needs to be linked back to the original InChI to recover the real compound information.

To underpin our commitment to the standard, we sponsored the development of an InChI resolver service via ChemZoo's ChemSpider, which already contains over 21 million compounds. The resolver service will allow users to look up full InChI identifiers from the shorter, fixed-length InChIKey. This is a basic 'plumbing' service, free to the community, which will make it easier to use InChIs and to look up full compound information. It will also accept compound deposition, so anyone with a compound collection can deposit it with the service, making it available to all and preserving access to it for the future.

RSC Publishing and ChemZoo launched this service at the ACS Spring Meeting in Salt Lake City.
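
The lookup a resolver performs, from a fixed-length InChIKey back to the full InChI, can be illustrated with a small in-memory sketch. This is not the ChemSpider service or its API: the class name `ToyInChIResolver` and its `deposit` and `resolve` methods are hypothetical, the key check assumes the 27-character standard InChIKey layout, and keys are supplied by the depositor rather than computed, since generating them needs the InChI software.

```python
import re

# Assumed layout of a standard InChIKey: 14 letters, 10 letters, 1 letter,
# separated by hyphens (27 characters in total).
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


class ToyInChIResolver:
    """Hypothetical in-memory stand-in for an InChIKey-to-InChI resolver."""

    def __init__(self):
        self._inchi_by_key = {}

    @staticmethod
    def _check_key(inchikey):
        if not INCHIKEY_PATTERN.match(inchikey):
            raise ValueError(f"not a well-formed InChIKey: {inchikey!r}")

    def deposit(self, inchikey, inchi):
        """Store a key/identifier pair supplied by a depositor."""
        self._check_key(inchikey)
        if not inchi.startswith("InChI="):
            raise ValueError(f"not an InChI string: {inchi!r}")
        self._inchi_by_key[inchikey] = inchi

    def resolve(self, inchikey):
        """Return the full InChI for a key, or None if it was never deposited."""
        self._check_key(inchikey)
        return self._inchi_by_key.get(inchikey)


resolver = ToyInChIResolver()
# Ethanol and water, with their standard InChI strings and InChIKeys.
resolver.deposit("LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
resolver.deposit("XLYOFNOQVPJJNP-UHFFFAOYSA-N", "InChI=1S/H2O/h1H2")

print(resolver.resolve("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))
```

Because the key is a hash-like digest rather than an encoding of the structure, the stored mapping is the only route back to the full identifier, which is why a shared deposit-and-lookup service is useful.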


Related Links

The InChI Resolver, powered by ChemSpider



Open Standards for subject classifications


A common problem when trying to find information is knowing the right terminology to use. Agreed and open standards covering the chemical sciences have been lacking for some time, and their absence hinders efforts to find and compare chemical text and data. RSC Publishing has used selections from the Open Biomedical Ontologies (the Gene, Sequence and Cell Ontologies, and also ChEBI) and, as an active user, has contributed to these to help increase their accuracy and relevance.

In addition we have started to build our own subject classifications covering selected areas of chemistry, to allow us to classify our own content better and offer new means to search. We will make these open and act as their curator. Again, we hope that by making these available for anyone to use we can make it possible to link together related science across not just our own publications but other publishers' content and other sources available online. The advantages that ontology terms offer over simple keywords include the reduction of ambiguity caused by synonyms, and the ability to use the relationships described between ontology terms to widen or narrow down collections in very specific ways.
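
The two advantages named above, synonym handling and widening or narrowing by relationship, can be shown with a toy example. Everything in this sketch is hypothetical: the `EX:` term IDs, the article names and the function names are made up for illustration and are not taken from RXNO, CMO or any RSC system.

```python
# Toy ontology: each term has a preferred name, synonyms, and "is_a" parents.
TERMS = {
    "EX:1": {"name": "coupling reaction", "synonyms": [], "is_a": []},
    "EX:2": {"name": "Suzuki coupling",
             "synonyms": ["Suzuki-Miyaura reaction"], "is_a": ["EX:1"]},
    "EX:3": {"name": "Heck reaction",
             "synonyms": ["Mizoroki-Heck reaction"], "is_a": ["EX:1"]},
}

# Hypothetical articles, each tagged with ontology term IDs, not free text.
ARTICLES = {
    "article-a": {"EX:2"},
    "article-b": {"EX:3"},
    "article-c": {"EX:1"},
}


def resolve_label(label):
    """Map a name or synonym (case-insensitive) to a term ID, or None."""
    wanted = label.lower()
    for term_id, term in TERMS.items():
        labels = [term["name"], *term["synonyms"]]
        if wanted in (text.lower() for text in labels):
            return term_id
    return None


def with_descendants(term_id):
    """Return term_id plus every narrower term reachable via is_a links."""
    result = {term_id}
    frontier = [term_id]
    while frontier:
        current = frontier.pop()
        for child_id, child in TERMS.items():
            if current in child["is_a"] and child_id not in result:
                result.add(child_id)
                frontier.append(child_id)
    return result


def find_articles(label, include_narrower=True):
    """List articles tagged with the term, optionally widened to subterms."""
    term_id = resolve_label(label)
    if term_id is None:
        return []
    wanted = with_descendants(term_id) if include_narrower else {term_id}
    return sorted(name for name, tags in ARTICLES.items() if tags & wanted)


# A synonym reaches the same term as the preferred name.
print(find_articles("Suzuki-Miyaura reaction"))
# A broad term can be widened to its subterms, or kept exact.
print(find_articles("coupling reaction"))
print(find_articles("coupling reaction", include_narrower=False))
```

A plain keyword search would treat "Suzuki coupling" and "Suzuki-Miyaura reaction" as unrelated strings, and would have no way to know that both are kinds of coupling reaction.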

The first two ontologies that we're making available are:
- RXNO: a reaction ontology
- CMO: a chemical methods ontology

These are freely available to download from RSC Ontologies, the homepage for all the Royal Society of Chemistry's ontology-related material.

Data collections


Already we make associated supplementary data files available alongside our articles, but the success of the crystallographic CIF format has shown how powerful a standard format for data can be.

A standard format becomes a means not just to preserve research data but to share it and allow it to be visualised and reused. RSC Publishing is a supporter of open data and will be working to encourage authors to store and supply their research data files within their publication. We will be looking at possible standards covering different areas of the chemical sciences and providing demonstrations to show what can be done with the data if it is available to share in an open, standard form.


RSC Prospect - we show what's possible


We are using our award-winning project RSC Prospect to show some of the benefits of applying new standards to our journal articles. By using the standards mentioned above, and using our skills to develop them further and apply them specifically to our areas of chemical science publishing, we have added a layer of semantic enrichment to articles. This enables them to be found more easily, to be better understood, to have the chemical compound data available in a machine-readable form, and to link together content by subject term or by chemical compound.

When this first went live, we were limited to offering an enhanced HTML view of a paper, with inline links highlighting unique compounds via InChI and terms from the Gene, Sequence and Cell ontologies. 



Since then we have extended this to create machine-readable RSS feeds containing real chemical information (and structures for humans!), compound image popups on mouseover, and the application of the ChEBI ontology for chemical classes and groups. Last year we introduced chemical structure and substructure searching on our enhanced articles, becoming the first primary publisher to do so. We're now using our subject and compound information pages to direct readers to content: for example, if you're interested in a particular compound or subject area, we can tell you which of our articles include it. Most recently we have started identifying reactions and chemical methods within our papers.


The future...?


We'll be looking at promoting the use of InChI compound identifiers further, and working with other publishers to link together our compounds. We'll develop our classifications to cover all the areas of the chemical sciences that we publish. We'll be applying this markup to all our content to give our readers an unparalleled view of our publications. And we'll be working with our authors to preserve real scientific data through the publication process and make it openly available for reuse. An important part of this is providing compelling demonstrations of what all this can achieve, and we'll be doing this through our RSC Prospect developments. We're proud to do this as a learned society publisher, here to promote the chemical sciences worldwide.


RSC Prospect Home

For FAQs, examples, contact information and latest news about RSC Prospect