Royal Society of Chemistry

Paper


Guidelines on the capture of RSC journal articles

Richard Light (a) and Richard Kidd (b)
(a) Burgess Hill, West Sussex, UK
(b) RSC, Cambridge, UK. Last update 2 August 2002


Contents
1 Introduction
1.1 Scope of this document
1.2 Feedback and updates
1.3 Format of this document
2 Scope of the data capture work
2.1 XML encoding
2.2 File naming conventions
2.3 File Delivery
2.4 Form of PUBLIC identifiers
2.5 Form of SYSTEM identifiers
3 Documents relating to the RSC DTD
3.1 SGML Declaration
3.2 Catalog file
3.3 Table support
3.4 Entity declarations
3.4.1 Character mappings file
4 General conventions
4.1 Guidelines
4.1.1 Style guidelines
4.1.2 Semantics of the table model
4.2 Article structure
4.3 Front matter
4.4 Body matter
4.5 Appendices
4.6 Back matter
4.7 Graphics
4.8 Assigning unique id's
4.9 Links and cross-references
4.9.1 Recognising cross-references
4.10 Numbering
5 Low-level elements
5.1 Emphasis and font style elements
5.1 Footnotes
5.2 Text
5.2 Tables
5.3 Chemistry
5.4 Equations
5.5 Citations
5.5.1 Numbering citations
5.5.2 Standard journal citations
5.5.3 Non-standard citations
5.5.4 Book citations
5.5.5 RSC journal abbreviations
5.6 Lists
5.7 General
1 Appendix A. Alphabetical list of element types
1.1 Element definitions
1.1.1 a
1.1.2 above
1.1.3 abstract
1.1.4 ack
1.1.5 address
1.1.6 addrelt
1.1.7 admin-event
1.1.8 advert
1.1.9 aff
1.1.10 affref
1.1.11 agent
1.1.12 appendix
1.1.13 appmat
1.1.14 art-admin
1.1.15 art-back
1.1.16 art-body
1.1.17 art-front
1.1.18 art-links
1.1.19 art-toc-entry
1.1.20 article
1.1.21 articleref
1.1.22 arttitle
1.1.23 arttoc
1.1.24 authgrp
1.1.25 author
1.1.26 below
1.1.27 bi
1.1.28 biblist
1.1.29 biblscope
1.1.30 biography
1.1.31 bo
1.1.32 board
1.1.33 book-review
1.1.34 box
1.1.35 boxref
1.1.36 byline
1.1.37 chart
1.1.38 chartref
1.1.39 citation
1.1.40 citauth
1.1.41 citext
1.1.42 citgroup
1.1.43 citpub
1.1.44 citref
1.1.45 city
1.1.46 coden
1.1.47 colspec
1.1.48 commentary
1.1.49 compname
1.1.50 compound
1.1.51 compoundgrp
1.1.52 compoundref
1.1.53 conference
1.1.54 confgrp
1.1.55 confname
1.1.56 contact
1.1.57 country
1.1.58 cpyrt
1.1.59 date
1.1.60 daterange
1.1.61 day
1.1.62 dd
1.1.63 dedicate
1.1.64 def
1.1.65 deflist
1.1.66 denom
1.1.67 doi
1.1.68 editnote
1.1.69 editor
1.1.70 email
1.1.71 entry
1.1.72 eqnref
1.1.73 eqntext
1.1.74 equation
1.1.75 fax
1.1.76 figref
1.1.77 figure
1.1.78 fname
1.1.79 fnoteref
1.1.80 footer
1.1.81 footnote
1.1.82 fpage
1.1.83 fraction
1.1.84 fulltext
1.1.85 group
1.1.86 head
1.1.87 icgraphic
1.1.88 ictext
1.1.89 index
1.1.90 index-entry
1.1.91 inf
1.1.92 info
1.1.93 issn
1.1.94 issue
1.1.95 issue-back
1.1.96 issue-front
1.1.97 issue-toc
1.1.98 issueid
1.1.99 issueno
1.1.100 issueref
1.1.101 it
1.1.102 item
1.1.103 jnltrans
1.1.104 journal
1.1.105 journalcit
1.1.106 journalref
1.1.107 keyword
1.1.108 link
1.1.109 list
1.1.110 location
1.1.111 logo
1.1.112 lpage
1.1.113 member
1.1.114 month
1.1.115 ms-id
1.1.116 nameelt
1.1.117 news-article
1.1.118 news-item
1.1.119 news-section
1.1.120 no
1.1.121 no-of-pages
1.1.122 note
1.1.123 numer
1.1.124 office
1.1.125 org
1.1.126 orgname
1.1.127 overbar
1.1.128 p
1.1.129 pages
1.1.130 persname
1.1.131 person
1.1.132 phone
1.1.133 pii
1.1.134 plate
1.1.135 plateref
1.1.136 postcode
1.1.137 pubfront
1.1.138 published
1.1.139 publisher
1.1.140 pubname
1.1.141 pubplace
1.1.142 qualifier
1.1.143 received
1.1.144 role
1.1.145 roman
1.1.146 row
1.1.147 sansserif
1.1.148 scheme
1.1.149 schemref
1.1.150 scp
1.1.151 section
1.1.152 sercode
1.1.153 sertitle
1.1.154 sici
1.1.155 stack
1.1.156 state
1.1.157 subject
1.1.158 subsect1
1.1.159 subsect2
1.1.160 subsect3
1.1.161 subsect4
1.1.162 subsect5
1.1.163 subsect6
1.1.164 subtitle
1.1.165 sup
1.1.166 suppinf
1.1.167 surname
1.1.168 table
1.1.169 table-entry
1.1.170 tableref
1.1.171 tbody
1.1.172 term
1.1.173 textref
1.1.174 tfoot
1.1.175 tgroup
1.1.176 thead
1.1.177 title
1.1.178 titlegrp
1.1.179 toc-entry
1.1.180 toc-head
1.1.181 trans
1.1.182 ugraphic
1.1.183 ul
1.1.184 underbar
1.1.185 unknown
1.1.186 url
1.1.187 value
1.1.188 volume
1.1.189 volumeno
1.1.190 volumeref
1.1.191 warning
1.1.192 who
1.1.193 year
1 Appendix B. Notations
1 Appendix C. Changes to the RSC DTD
1.1 Summary of changes in version 3.4
1.2 Summary of changes in version 3.5
1.3 Summary of changes in version 3.6

Introduction

Scope of this document

This document is a guide to Version 3.6 of the RSC Article DTD.

Feedback and updates

Please let us know of any problems you encounter in using these instructions while trying to encode articles using the DTD provided. This will help us to improve both the application and its associated documentation.

We plan to issue updates to the DTD and documentation at regular, planned, intervals. You will be notified of these updates in advance, so that you can allocate resources to deal with any changes to data capture instructions or rendering software that might be required.

Format of this document

This document fulfils two functions. As well as containing instructions on the conventions to follow, it acts as an example of the results that are expected, being written to conform to the RSC Primary Articles DTD Version 3.6.

The XML version of this document can be browsed using Internet Explorer 5.0 or above at http://www.rsc.org/dtds/desc36.xml. This HTML version was created from the XML using Saxon.

Scope of the data capture work

The objective is to capture all the text within each article which can be encoded in XML (see next section). The DOCTYPE and document element will always be <article>. Within this, the <art-admin> (which holds the article's unique manuscript number), <published> (for articles which have already appeared in print), <art-front>, <art-body> and <art-back> element types will be routinely used, with an occasional <appmat>.

XML encoding

As far as possible, all the information in the articles presented should be encoded in XML and included in the resulting document. Obvious exceptions are figures, which should be referenced as external entities in the standard manner (see Graphics below).

Both tables and equations are liable to be more difficult. If possible, these should be encoded in XML, but we accept that there are liable to be cases where this is not possible due to the complexity of the data or inadequacies in the DTD as currently drafted. In these cases the relevant object should be treated as a graphic. A particular example is where a table contains graphics spanned across rows or columns - this would be impossible to render accurately from the XML. See Tables and Equations below for specific guidelines.

Articles should conform to XML as well as SGML conventions. This means that:

A variety of tools can (and should) be used to check that articles consist of valid SGML/XML. The nsgmls program will check for SGML conformance. There is a wide variety of free or inexpensive XML-aware software. For example, if you open an XML document in Internet Explorer 5, its built-in XML parser will check the document for validity and report any errors.
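As a quick first pass before running a validating parser, a short script can at least confirm that a file is well-formed XML. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name is_well_formed is our own illustration, and the check does not validate against the RSC DTD, so nsgmls or a validating XML parser is still needed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML.

    Hypothetical helper: this checks well-formedness only. It does not
    check validity against the RSC DTD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # A matched pair of tags parses; an unclosed <art-body> does not.
    print(is_well_formed("<article><art-body/></article>"))
    print(is_well_formed("<article><art-body></article>"))
```

A check of this kind catches unclosed or mismatched tags early, but it says nothing about whether the element types are used in the order the DTD requires.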

File naming conventions

All manuscripts will have a unique identifier, assigned by RSC, e.g. a901234h. As well as being used to name the file containing the encoded article, this identifier will be encoded as the <ms-id> element within the article.

The RSC will name graphics files as follows:

Graphic types: (from RSC)

The following filename styles should be supplied to the RSC:

Lower-case should be used.

File Delivery

We require, for each paper:

Each document and its associated files should be delivered as a zip file, named as above (e.g. a901234h.zip).

Form of PUBLIC identifiers

PUBLIC identifiers should be used throughout.

In addition, each PUBLIC identifier should be followed by a SYSTEM identifier giving a URL that locates the resource in question. This belt and braces strategy will allow articles to be treated as valid XML (XML requires a SYSTEM identifier), while offering us the flexibility of using SGML-aware software to interpret the PUBLIC identifiers in different ways, as necessary.

Thus the DOCTYPE declaration at the head of each article should always take the form:

<!DOCTYPE article PUBLIC "-//RSC//DTD RSC Primary Article DTD 3.6//EN" "http://www.rsc.org/dtds/rscart36.dtd">

PUBLIC identifiers should be constructed using the general format:

"RSC// [MS number] [object src]"

where the object src is the element type with number:

e.g.

"RSC// a706828h eqn3"

The names assigned within each article for the external entities it references should reflect the last component of the entity's PUBLIC identifier, e.g.

<!ENTITY eqn3 PUBLIC "RSC// a706828h eqn3" ...

Form of SYSTEM identifiers

The SYSTEM identifiers (i.e. filenames) assigned to each external entity should consist of the article's manuscript number followed by the entity's name, with a suitable suffix, e.g.:

<!ENTITY ugt3 PUBLIC "RSC// a706828h ugt3" "a706828h-t3.tif" NDATA tiff>
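Suppliers who generate entity declarations by script may find it useful to build them from their components. The sketch below assumes only the general PUBLIC identifier format given above ("RSC// [MS number] [object src]"). The function name entity_declaration is hypothetical, and the filename is passed in explicitly, not derived, since the exact filename style depends on the graphic type.

```python
def entity_declaration(ms_id: str, name: str, filename: str, notation: str) -> str:
    """Build an external entity declaration in the form shown above.

    The PUBLIC identifier follows "RSC// [MS number] [object src]".
    The SYSTEM identifier (filename) is supplied by the caller; this
    illustrative helper makes no attempt to work it out.
    """
    public_id = f"RSC// {ms_id} {name}"
    return f'<!ENTITY {name} PUBLIC "{public_id}" "{filename}" NDATA {notation}>'


if __name__ == "__main__":
    # Reproduces the declaration for entity ugt3 in article a706828h.
    print(entity_declaration("a706828h", "ugt3", "a706828h-t3.tif", "tiff"))
```

Because the entity name and the last component of the PUBLIC identifier come from the same argument, they cannot drift apart.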

Documents relating to the RSC DTD

The DTD itself is in the file rscart36.dtd. A number of other files are required before documents will parse against the DTD. They should all be stored in the same directory as the DTD itself, apart from the entities files (*.ent) which should be stored in a subdirectory named entities. We use Internet Explorer 5 as our (XML) parser. We suggest suppliers use the same parser.

SGML Declaration

An SGML Declaration suitable for use with this DTD is in the file rscxml33.dcl. This Declaration allows an XML-encoded article to be processed by SGML software. It specifies features such as case-sensitivity for element and attribute names, quoting of attribute values, XML-style processing instructions and empty element syntax, and Unicode support.

Catalog file

The catalog file rscart3s.cat is in the standard OASIS catalog file format. It resolves all the PUBLIC identifiers declared in the DTD, as well as the PUBLIC identifier of the DTD itself. This catalog file invokes the SGML version of the DTD, rather than the XML version. It uses the file rscsgm36.dtd to set up the DTD's parameter entities for SGML. If required, an updated rscart3s.cat can be used to override the DTD's online SYSTEM identifier and point instead to a local copy.

Table support

The file calstab1.dtd contains the OASIS-supported DTD fragment which supports the interoperable CALS table model subset. Additions and changes to this model are declared in the body of the DTD itself, not here.

Entity declarations

Two files containing character entities are provided. One of these contains mappings of characters to numeric values that conform to Unicode 2.0 (rsc_x.ent). This is for use with the default XML interpretation of the DTD. It should be noted that we plan to use Unicode Combining Characters to partially solve the problem of 'one character over another'. This means that rendering software will need to support Combining Characters, ideally in a generalized manner.

The other file maps exactly the same characters to SDATA entities, and is for use with the SGML interpretation of the DTD (rsc_s.ent).

If an article contains any characters which are not in the RSC set, the RSC should be alerted to the need to add them to the standard set.

Character mappings file. RSC maintains information about special characters in a character mappings file (charmaps.xml). The entity declarations described above are generated from this file by XSLT style sheets. Characters in this file are categorized into one of the following classes:

These categories help to ensure that each character is mapped to the most appropriate result when different types of output encoding are generated:

General conventions

Guidelines

Style guidelines. The style guidelines for each journal describe general conventions for article structure. Use these as a guide to the structure and content of articles.

In particular, while encoding articles these guidelines should be used to infer when a change of type style (e.g. to bold) implies a specific element type, as discussed below under Cross-references.

Semantics of the table model. The table model used is developed from the interoperable CALS table model subset supported by OASIS (refs. 1a and 1b). The OASIS web site contains a description of the generic CALS table model (ref. 1a), and a description of the semantics of this interoperable subset (ref. 1b).

Version 3.6 of the DTD simplifies the level of CALS table support that is required by removing the <spanspec> element type (which is not part of the interoperable subset). This has been found to be unnecessary, since both horizontal and vertical spans within tables can be represented without it. (<colspec> provides all the information that is required for horizontal spanning, while the MOREROWS attribute supports vertical spans.) It adds support for rotated tables by including the ORIENT attribute, which can be set to "land" to indicate a landscape, i.e. rotated, table.

Article structure

Each article consists of front matter, body matter and back matter.

The article itself can have a type attribute, which specifies what type of article it is. This table summarises the codes to be used for each type of article, and the types of article that are currently liable to appear in each journal published by the RSC. (See below for a key to the journal codes in this table.)
Table -arttypes Article type codes and usage


Article type Code PO EM GC DT JM P1 P2 JC CC FT AN AC JA MC FD NP CS IC/OC/PC NJ RC QU CE GT
Papers ART X X X X X X X X X X X X X X
Comms COM X X X X X X X X X X
Perspectives PER X X
Letters LET X X X X X
Feature Articles FEA X X X X
Editorial EDI X X X X X X X X X X X X X
Synopsis SYN X
Full text ART X
Research Articles RES X
Discussions DIS X
Review Articles REV X X X X X X X X X X X
Book Reviews BKR X X X X
News NWS X X X X X
News articles NAR X
Highlights HIG X X X X
Interviews INT X
Technical note TEC X
Events/Conference Diary CNF X
Conference reports CRP X X
Synthetic abstract SAB X
Cover Feature COV X
Focus FOC X X
Viewpoints VPT X X
Invited Lecture LEC X
Keynote Article KEY X
Hot off the Press Articles HOT X
Atomic Spectrometry Update ASU X
Analytical Methods Committee AMS X
Inter-laboratory Note ILN X
Critical Review CRV X
Tutorial Review TRV X
Glow Discharge Paper GDP X
Glow Discharge Comm GDC X
Glow Discharge Review GDR X
Glow Discharge News Article GDN X
Glow Discharge Technical Note GDT X


Front matter

The front matter consists of <art-admin>, which holds the article's unique manuscript number, <published>, which contains details of the journal, volume, issue in which the article has been printed and the relevant pagination details, and <art-front>, which is the front matter proper.

For accepted date, use:

<date role="accepted"><year>1999</year><month>April</month><day>23</day></date>

For the date on which a revised version of an article was issued, use the same date format, with role="revised".

Authors - we would like the corresponding author to be identified. There's no need to mark the others as 'princ'.

For affiliations, use <org><orgname></orgname></org><address></address>, with the separate <address> element holding the address. Although the <org> group does contain its own <address> element, this shouldn't be used for encoding the articles. We don't require any org ids.

The <published> element should be set with the attribute type="print", along with the journal code — the other pubfront subelements should be left blank.

<published type="print">
  <journalref><link>GC</link></journalref>
  <volumeref><link>001</link></volumeref>
  <issueref><link>unknown</link></issueref>
  <pubfront>
    <fpage></fpage><no-of-pages></no-of-pages>
    <date><year></year></date>
  </pubfront>
</published>

Body matter

The body of each article consists of an <art-body>, containing one or more <section>s.

These are the top-level structural units within each article: lower levels are represented by <subsect1>, <subsect2>, etc. (N.B. the numbering of section-level element names represents their depth of nesting, not repetition.)

Care should be taken to ensure that the structure of the article, implied by the style of headings, is correctly reflected in the <section> and <subsectN> elements assigned. See <title> for details of heading typestyles.

Appendices

Any appendices to an article are placed within an <appmat> element, between the <art-body> and <art-back> elements. This contains one or more <appendix> elements, each optionally numbered and containing one or more <section>s.

Back matter

The back matter contains an optional <ack> element. This is followed by mandatory <biblist> and <compoundgrp> elements.

This last element is provided as a place to collect together <compound> elements, each of which defines the ID of a chemical compound mentioned in the article, and thus to provide a target for <compoundref> cross-references (which are normally set in bold face: see Cross-references). (The ultimate intention, not to be implemented at this stage, is to provide links back from these <compound> elements to the points in the article where the compound is defined or illustrated.)

Graphics

Graphical objects should be declared as external entities, with a suitable Notation. The RSC application provides a comprehensive set of possible notations, which ought to include all the image formats encountered. Let us know if any new image formats are encountered.

External entity declarations should include PUBLIC identifiers as well as SYSTEM identifiers, e.g.

<!ENTITY ugr1 PUBLIC "RSC// a904043i ugr1" "a904043i-u1.tif" NDATA tiff>

Graphics take the following attributes:

ID: a unique ID for this graphic (see notes below on assigning IDs) (required)
src: the entity which contains the graphic (see notes above on external entities)
height:
width:
pos: "float" for floating graphics: otherwise "fixed". "float" should be used for graphics marked as "A" blocks, while "fixed" should be used for "B" blocks and for graphics appearing in the body of the text. Graphics appearing within tables or equations should be assumed to be fixed.

Chemical formulae, equations, symbols for which no character entity is provided in the DTD and tables which are too complex to encode as XML should all be encoded as a <ugraphic> element. As well as the standard attributes for graphics, this has a displayed attribute which can take the value "displayed" (which indicates that the graphic should be set off from the surrounding text) or "inline" (which means that the graphic should form part of the current line).

Assigning unique id's

In order to make id's unique within each article, a prefix should be added to the identifier assigned by the author:
Table 1 id prefixes for different classes of target

Class of target                 Prefix
author affiliation              aff
chart                           cht
chemical compound               chem
citation                        cit
equation                        eqn
figure                          fig
footnote                        fn
plate                           pl
scheme                          sch
table                           tab
table footnote                  tab + fn (see note 1)
untitled graphic                ugr
typesetter-generated graphic    ug
  (e.g. equations and tables which cannot be encoded in SGML/XML)

Note 1: Table footnotes should be given an id which is a combination of the table's id and a unique id for the footnote within that table, e.g. tab2fna. Table footnotes should be given letters (a, b, c, etc).


Thus, for example, a citation referred to in the paper as 8a should be given the id cit8a, while chemical compound 8a should acquire the id chem8a.

We are using the number or letter in some of the id's to generate some of the numbering within the HTML article. For affiliations, equation numbering and table footnote lettering, the aff, eqn and table fn id's should all be given the literal values that would appear in the text, e.g. affa, affb; eqn1, eqn2; tab1fna, tab3fnc. For the remaining id's a unique number or letter will be sufficient.
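The prefixing rule lends itself to a small lookup. The following sketch copies the prefixes from Table 1. The names ID_PREFIXES, make_id and table_footnote_id are our own illustrations and are not part of the RSC application.

```python
# Prefixes copied from Table 1 (id prefixes for different classes of target).
ID_PREFIXES = {
    "author affiliation": "aff",
    "chart": "cht",
    "chemical compound": "chem",
    "citation": "cit",
    "equation": "eqn",
    "figure": "fig",
    "footnote": "fn",
    "plate": "pl",
    "scheme": "sch",
    "table": "tab",
    "untitled graphic": "ugr",
    "typesetter-generated graphic": "ug",
}


def make_id(target_class: str, label: str) -> str:
    """Prefix the author's identifier with the prefix for its class."""
    return ID_PREFIXES[target_class] + label


def table_footnote_id(table_id: str, letter: str) -> str:
    """Combine a table's id with the footnote letter, e.g. tab2fna."""
    return f"{table_id}fn{letter}"


if __name__ == "__main__":
    print(make_id("citation", "8a"))
    print(make_id("chemical compound", "8a"))
    print(table_footnote_id(make_id("table", "2"), "a"))
```

Keeping the prefixes in one table means that a citation and a compound which share the author's label 8a still receive distinct ids.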

Links and cross-references

Internal cross-references within an article should use the standard SGML/XML ID-IDREF mechanism. To enforce this, we have specified as #REQUIRED the id attribute for all the elements that cross-references might point to. It is not practicable to do the same for pointer elements, since their target is not always present. To allow for this, the idrefs attribute is not mandatory. Instead, a presence attribute is provided. When a linking element has no target, this attribute should always be specified, with the value presence="missing".

This table summarises the element types which indicate cross-references, and the target element type for each.


Table 3 Mapping of cross-reference element types to target element types

Cross-reference element type    Target element type
<compoundref>                   <compound>
<textref>                       any textual element with an ID attribute
<figref>                        <figure>
<schemref>                      <scheme>
<plateref>                      <plate>
<chartref>                      <chart>
<eqnref>                        <equation>
<boxref>                        <box>
<tableref>                      <table-entry>
<citref>                        <citgroup>
<fnoteref>                      <footnote>
<affref>                        <aff>


One specific point to note is that <citref> does not point to a <citation> or <journalcit> element: instead it points to <citgroup>. This design allows any number of citations to occur within a single numbered or sub-numbered part of a References list.
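A validating parser will report unresolved IDREFs, but it will not notice a pointer element that has neither a target nor presence="missing". The sketch below is one way this could be checked, assuming the attribute names id, idrefs and presence described above and the cross-reference element types listed in Table 3. The function name check_cross_references is hypothetical, and the test fragment in the script is illustrative only, not a valid article.

```python
import xml.etree.ElementTree as ET
from typing import List

# Cross-reference element types listed in Table 3.
XREF_TYPES = {
    "compoundref", "textref", "figref", "schemref", "plateref", "chartref",
    "eqnref", "boxref", "tableref", "citref", "fnoteref", "affref",
}


def check_cross_references(xml_text: str) -> List[str]:
    """Return a list of problems found with cross-references in xml_text."""
    root = ET.fromstring(xml_text)
    # Every id declared anywhere in the fragment is a possible target.
    known_ids = {el.get("id") for el in root.iter() if el.get("id")}
    problems = []
    for el in root.iter():
        if el.tag not in XREF_TYPES:
            continue
        idrefs = el.get("idrefs")
        if idrefs is None:
            # No target: the guidelines ask for presence="missing".
            if el.get("presence") != "missing":
                problems.append(
                    f'<{el.tag}> has no idrefs and no presence="missing"'
                )
            continue
        # idrefs is a space-separated list of ids.
        for ref in idrefs.split():
            if ref not in known_ids:
                problems.append(f"<{el.tag}> points to unknown id {ref}")
    return problems


if __name__ == "__main__":
    sample = (
        '<art-body><p>See <citref idrefs="cit1 cit2">1,2</citref> and '
        "<figref>Fig. 9</figref>.</p>"
        '<citgroup id="cit1"/></art-body>'
    )
    for problem in check_cross_references(sample):
        print(problem)
```

The sketch does not check that each pointer reaches the right target element type (for example, that a <citref> points to a <citgroup>); that would need a further lookup keyed on Table 3.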

In the unlikely event that an external link to another article (also encoded in SGML/XML) needs to be made, the general-purpose <link> element type is provided. This implements the Text Encoding Initiative (TEI) Extended Pointer mechanism, which allows all or part of a document to become the target of a link. It is anticipated that only the ID-based part of the TEI Extended Pointer syntax would be required in practice. Do not use the <link> element without checking with RSC first. The linking strategy described here is likely to be reviewed once the W3C's XLink proposal reaches Recommendation status.

Recognising cross-references. This table summarises typographical conventions which are often used to represent various types of cross-reference. Where a change of font style indicates such a cross-reference, it should always be marked up as such. In such cases, the cross-reference should not also be marked up as a change of font style.
Table 4

type style     data type                           cross-reference type
superscript    arabic no. [+ letter suffix]        citref
superscript    letter                              affref
superscript    symbol                              fnoteref
bold           numbers, letters, roman numerals    compoundref


Numbering

For the present, numbers should be included in the <no> element if they are required.

There is no need (and no opportunity!) to number figures, schemes, boxes or plates. Suitable prefixes and numbers (e.g. "Fig 1.") will be supplied by style sheets. Other concepts (e.g. citations, equations, appendices, and chemical compounds) have an optional <no> element. This does not need to be used where the numbering scheme follows a simple sequence of arabic numbers, since the entries will be auto-numbered in this case. If any instance of a given element type has a non-standard number within an article, then the <no> element should be specified for all instances of that element type.

However, all of these concepts are allowed to have an ID, and some require one — these IDs still need to be specified even if the title or heading itself can be auto-numbered. We can't (yet) auto-number tables in appendices, which require numbers in the form A1, A2, etc.
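The rule for when <no> is needed can be expressed as a simple test. The sketch below assumes that auto-numbering produces the sequence 1, 2, 3, ... starting from 1, which is our reading of "a simple sequence of arabic numbers". The function name needs_no_element is hypothetical.

```python
from typing import Sequence


def needs_no_element(numbers: Sequence[str]) -> bool:
    """Return True if <no> should be given for every instance.

    numbers holds the printed number of each instance of one element
    type (e.g. every equation), in document order. If any of them
    departs from 1, 2, 3, ... then all instances need <no>.
    """
    expected = [str(i) for i in range(1, len(numbers) + 1)]
    return list(numbers) != expected


if __name__ == "__main__":
    print(needs_no_element(["1", "2", "3"]))   # plain sequence
    print(needs_no_element(["1", "2a", "3"]))  # one non-standard number
```

The test is made on the whole list, not on each number in turn, because a single non-standard number means that <no> must be supplied for all instances of that element type.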

Low-level elements

Emphasis and font style elements

Changes in font style should be marked up with the appropriate emphasis tags unless they indicate a specific concept, as discussed above under Cross-references.

Individual elements can be used to mark bold text, italic text, bold italic, underlined text, SMALL CAPS, superscript and subscript. They can also be used in combination to represent, for example, superscript bold text.

Footnotes. Footnotes should be placed just after the first <fnoteref>.

All footnote characters should be auto-generated. In text, they follow the order:

In table footnotes, they just appear as a, b, c, d, etc, where these letters are taken from the end of the id attribute's value.

Text. Spacing:

Equation spacing: +, minus, divide and times are spaced on either side when in an equation. There is spacing around the mathematical character when it is between two digits, e.g. 4 + 4; when it is just the character and one digit there is no space, e.g. +4. This also applies to proportional to, plus-minus, similar to, approx. equal to, >, < and their >= variants.

Multiple citrefs shouldn't be spaced: <citref idrefs="cit1 cit4 cit5 cit12">1, 4, 5, 12</citref> should be <citref idrefs="cit1 cit4 cit5 cit12">1,4,5,12</citref>
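Where citrefs have been captured with spaces, they could be tidied automatically. The following is a minimal sketch using a regular expression. The function name tighten_citrefs is our own, and the approach assumes that <citref> elements are not nested and contain no commas other than those separating citation numbers.

```python
import re

# Matches the start tag, content and end tag of a <citref> element.
CITREF = re.compile(r"(<citref\b[^>]*>)(.*?)(</citref>)", re.DOTALL)


def tighten_citrefs(xml_text: str) -> str:
    """Remove white space around commas inside <citref> content."""

    def fix(match):
        start, content, end = match.groups()
        # Only the element content is changed; the idrefs attribute in
        # the start tag keeps its space-separated form.
        return start + re.sub(r"\s*,\s*", ",", content) + end

    return CITREF.sub(fix, xml_text)


if __name__ == "__main__":
    spaced = '<citref idrefs="cit1 cit4 cit5 cit12">1, 4, 5, 12</citref>'
    print(tighten_citrefs(spaced))
```

Text outside <citref> elements is left untouched, so commas in running prose keep their normal spacing.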

Figure, scheme, etc references should be placed at the end of the paragraph in which they are first referenced.

<p> in titles is to be used for the Green Chemistry font change. The second <p> of GC titles will contain the details for the smaller title content. Simple titles don't need to use <p> at all.

For elements where the content model is empty (ugraphic, colspec, icgraphic) the elements need a closing solidus for XML: <colspec colname="1" colwidth="2.82*" align="left"/>

Compoundrefs: these can take any form, and the ids don't have to match the displayed text exactly, e.g. <compoundref idrefs="chem61a">6·1a</compoundref>

Tables

Tables will normally appear inline, marked up according to CALS-compatible SGML. The standard CALS attributes should be used to render the table in a form that is as close as possible to the printed result. This includes, but is not limited to, the relative widths of columns, spanning of rows and columns, and the use of lines to separate headings. The specific conventions listed below are intended to be compatible with the approach supported by Adept's table editor:

However, tables will sometimes be too complex to represent in this way, and so will be prepared as a graphic. To deal with this variation, a 'cover element' <table-entry> is provided, which contains either an inline <table> entry or a <ugraphic>. It is <table-entry> which requires a unique ID for <tableref> elements to point to, and which contains a <title> element.

One side-effect of this approach is that un-numbered tables can simply be encoded as <table>. From version 3.3 onwards, <table> can appear within text and between paragraphs.

Chemistry

Chemical compounds and simple formulae can often be represented as inline markup. <sup> and <inf> can be used to shift text, and <overbar> and <underbar> to place rules above or below chemical symbols. The character entity sets provided as part of the DTD (especially the ISO Chemistry set and the custom RSC set) support most chemical symbols that will be encountered. The <stack> element type can be used to encode the situation where one character appears directly above another.

Where chemical formulae are too complex to render as inline SGML, an inline or displayed <ugraphic> should be used instead.

Equations

Equations may appear inline, marked up in SGML using the tools available such as <fraction>: 1/3.

However, equations will frequently be too complex to represent in this way, and so will be prepared as a graphic. To deal with this variation, a 'cover element' <equation> is provided, which contains either an inline <eqntext> entry or a <ugraphic>. <equation> requires a unique ID for <eqnref> elements to point to.

Multi-line text equations can be accommodated by adding another <p>. Within <eqntext>, you should either have no <p> subelements (one-line or inline equations), or nothing but <p> subelements (multi-line equations).
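The all-or-nothing rule for <p> within <eqntext> can be checked mechanically. The sketch below is one possible reading of the rule: if any <p> subelement is present, every subelement must be a <p> and no loose text may appear around them. The function name eqntext_is_consistent is hypothetical.

```python
import xml.etree.ElementTree as ET


def eqntext_is_consistent(eqntext: ET.Element) -> bool:
    """Check that <eqntext> has no <p> children or nothing but <p> children."""
    children = list(eqntext)
    if not any(child.tag == "p" for child in children):
        # One-line or inline equation: other markup such as <sup> is fine.
        return True
    if any(child.tag != "p" for child in children):
        return False
    # Multi-line equation: no loose text before, between or after the <p>s.
    loose = [eqntext.text] + [child.tail for child in children]
    return all(not (text or "").strip() for text in loose)


if __name__ == "__main__":
    one_line = ET.fromstring("<eqntext>x<sup>2</sup> = y</eqntext>")
    mixed = ET.fromstring("<eqntext>a = b<p>c = d</p></eqntext>")
    print(eqntext_is_consistent(one_line))
    print(eqntext_is_consistent(mixed))
```

White space between <p> elements is ignored, so a multi-line equation may still be laid out over several lines in the source file.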

Citations

Where citations follow the standard pattern for journal articles, the <journalcit> element type should be used. In all other cases (including 'difficult' journal article citations, books, theses, computer software, etc.), the more flexible <citation> element type should be used. <citext> should be used to mark up text within the References section which is not a citation of any kind.

Numbering citations. As noted above in Links and cross-references, the citation number is a property of the enclosing <citgroup> element, not the citation itself. This makes it easy to deal with the case where more than one citation is given under the same reference number. It also allows running text to be mixed with, or indeed take the place of, proper citations.

Note that the expected pattern for numbering citations is to use numbers for top-level entries, and letters for sub-entries. If the citations follow this pattern, the <no> element should not be provided for any <citgroup> element. Instead, nested <citgroup> elements should be used to represent the lower-level citations. (See the source SGML of these instructions for an example of this technique.)

Standard journal citations. Standard journal citations follow this model:

Unless stated otherwise, each element should appear exactly once, and elements should appear in the order given. In such cases, <journalcit> can and should be used. The citation should be entered as a series of analysed subelements. No punctuation should be recorded between each component of the citation, and no style markup (e.g. italic for titles; bold for volume numbers) should be included. Punctuation and styling will be applied by the rendering process. Thus the citation:

G.H. Jonker and J.H. Van Santen, Physica, 1950, 16, 337

should be encoded:

<journalcit><citauth><fname>G. H.</fname><surname>Jonker</surname></citauth> <citauth> <fname>J. H.</fname><surname>Van Santen</surname></citauth> <title>Physica</title><year>1950</year><volumeno> 16</volumeno> <pages><fpage>337</fpage></pages></journalcit>
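Because a standard journal citation carries no punctuation or style markup, it can be generated directly from its analysed components. The following sketch builds the element with Python's standard xml.etree.ElementTree module. The function name journalcit and its parameters are our own illustration, and only the subelements used in the example above are covered.

```python
import xml.etree.ElementTree as ET


def journalcit(authors, title, year, volume, first_page):
    """Build a <journalcit> element from analysed citation components.

    authors is a list of (forenames, surname) pairs. Subelements are
    added in the order used in the example above, with no punctuation
    or style markup between them.
    """
    cit = ET.Element("journalcit")
    for forenames, surname in authors:
        auth = ET.SubElement(cit, "citauth")
        ET.SubElement(auth, "fname").text = forenames
        ET.SubElement(auth, "surname").text = surname
    ET.SubElement(cit, "title").text = title
    ET.SubElement(cit, "year").text = year
    ET.SubElement(cit, "volumeno").text = volume
    pages = ET.SubElement(cit, "pages")
    ET.SubElement(pages, "fpage").text = first_page
    return cit


if __name__ == "__main__":
    cit = journalcit(
        [("G. H.", "Jonker"), ("J. H.", "Van Santen")],
        "Physica", "1950", "16", "337",
    )
    print(ET.tostring(cit, encoding="unicode"))
```

Building the element from separate fields makes it difficult to leave stray punctuation between components, since each field is written into its own subelement.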

Non-standard citations. The <citation> element type should always be used for non-standard citations, i.e. those which do not fit the standard model. The type of citation should be specified in the type attribute. Allowed values are:

This isn't being done at present.

Within citations, the following concepts should always be marked up when they are present:

<citation> elements will be marked up as found, including all punctuation and style changes.

This is an example of a reference to a patent:

S. Iwaya, H. Masumura, Y. Midori, Y. Oikawa and H. Abe, US Patent, 4,404,029, 1983.

This should be encoded:

<citation type="patent"><citauth><fname>S.</fname><surname> Iwaya</surname></citauth>, <citauth><fname>H.</fname><surname>Masumura</surname> </citauth>, <citauth><fname>Y.</fname><surname>Midori</surname></citauth>, <citauth><fname>Y.</fname><surname>Oikawa</surname></citauth> and <citauth><fname>H.</fname><surname>Abe</surname></citauth>, <it>US Patent</it>, 4,404,029, <year>1983</year>.</citation>

Book citations. One particular type of non-standard citation which will frequently occur is a reference to a book, either in whole or in part. Again, <citation> should be used to mark these up. The <editor>, <citpub> and <pubplace> element types will often be required within such citations. A fairly typical, simple, example is:

S. Brooks and B. Johansson, in Handbook of Magnetic Materials, ed. K. H. J. Buschow, 1993, 7th edn.

This should be encoded:

<citation type="book"><citauth><fname>S.</fname><surname> Brooks</surname></citauth> and <citauth><fname>B.</fname><surname>Johansson</surname> </citauth>, in <title>Handbook of Magnetic Materials</title>, ed. <editor> K. H. J. Buschow</editor>, <year>1993</year>, 7th edn.</citation>

Note the following:

A good mixed citation example:

<citgroup id="cit5"> <citation>During the preparation of this manuscript, diester <compoundref idrefs="chem1">1</compoundref> was isolated as a minor side product in the base promoted rearrangement of the analogous (<it>R</it>,<it>R</it>,<it>R</it>,<it>R</it>)-2,3-butane diacetal (BDA) protected dimethyl tartrate, see: <citauth> <fname>M. T.</fname> <surname>Barros</surname> </citauth>, <citauth> <fname>A. J.</fname> <surname>Burke</surname> </citauth> and <citauth> <fname>C. D.</fname> <surname>Maycock</surname> </citauth>, <title>Tetrahedron Lett.</title>, <year>1999</year>, <volumeno>40</volumeno>, <biblscope>1583</biblscope>.</citation></citgroup>

and a <citext>:

<citgroup id="cit8"> <citext>The strong bias towards axial silylation was seen to fall if the mono sodium alkoxide did <it>not</it> precipitate prior to addition of the silicon halide.</citext></citgroup>

Two other points:

a) where a citref appears within another citation. We have extended the content model of citelt so that it can contain "m.simple-text", i.e. any element types which can occur within paragraphs. This change should make citelt a much better 'catch-all' for miscellaneous stuff within citations.

b) where a citation includes a compoundref and ugraphic of the compound. The compoundref is allowed, but the ugraphic isn't. We have created a new class 'para-graphic' for these two element types. They can now appear anywhere 'text-elts' can appear, as well as between paragraphs.

RSC journal abbreviations. The journals published by the RSC have the following abbreviations, which can be used within the SGML/XML framework, e.g. in <journalref> elements:
Table 6

Code      Journal
AC        Analytical Communications
AN        Analyst
CC        Chemical Communications
CE        Cryst. Eng. Communications
CP        PCCP
CS        Chem. Soc. Reviews
DT        Dalton Transactions
EM        J. Environmental Monitoring
FD        Faraday Discussions
FT        Faraday Transactions
GC        Green Chemistry
GT        Geo. Trans.
IC/OC/PC  Ann Rep (Inorganic, Organic, Physical)
JA        JAAS
JC        JCR
JM        J. Materials Chemistry
MC        Mendeleev
NJ        New Journal of Chemistry
NP        Natural Product Reports
P1        Perkin Transactions 1
P2        Perkin Transactions 2
PO        Pesticide Outlook
RC        RCR
QU        Phys. Chem. Comm.


Lists

Lists can be entered as a <list>, containing an optional <head> and any number of <item> elements. The type attribute can be used to indicate the type of list. It should take one of the following values:

Note that, since <list> can occur within <item>, it is possible to declare lists nested to any depth.
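As an illustration of nesting, a two-level list might be encoded along the following lines. This is a sketch only: the item text is invented, the type attribute is left out because its value must be chosen from the permitted list, and whether text and a nested <list> may sit side by side within one <item> should be checked against the DTD.

```xml
<!-- type attribute omitted: supply one of the values permitted by the DTD -->
<list>
  <head>Reagents</head>
  <item>Solvents
    <list>
      <item>Toluene</item>
      <item>Diethyl ether</item>
    </list>
  </item>
  <item>Catalysts</item>
</list>
```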

General

If an internal reference cannot in practice be encoded, just enter the text and leave out the reference. It would be helpful to let us know, in case an amendment to the DTD is advisable, but these cases are usually one-offs. One recent case had a number of equations in a single ugraphic, itself called scheme 1. In this case it was not possible to add eqnrefs to the scheme.

Appendix A. Alphabetical list of element types

Element definitions

This section contains a definition of every element type in the RSC DTD, including element types which are not required for the data capture work. These additional element types are included for editorial use within RSC, or to support future processing of the encoded articles. They are indicated thus:

RSC internal use only

a. 'anchor': a wrapper round a resource (an image, scheme, table, etc.). An anchor specifies a non-printable external entity which can augment the resource. Where appropriate, it should be represented as a clickable link to navigate to the external entity. Can contain zero or more:

above. The top half of a stack. Contains 'characters only'.

abstract. An abstract of the article. Contains 'text or paragraphs'.

ack. Acknowledgements for the article. Contains 'text or paragraphs'.

address. A complete postal address. Can be represented by a link, or by a sequence of address subelements:

each separated by spacing but no punctuation.

addrelt. An element within a postal address. Used only when no more specific element type (e.g. city) is appropriate. Can contain 'simple text'.

admin-event. A single event relating to the administration of an article, e.g. its receipt, acceptance, or rejection. Provided in versions 3.4 onwards of the DTD as a place-holder for RSC management information. Has a mixed content model, which allows the following subelements within text:

advert. An advertisement, i.e. any self-contained block of text which is to be 'dropped in' to a journal issue (including information on grants available, etc.). Contains a link, or one or more sections.

aff. An author's affiliation. Contains one or more pairs of:

followed by any of the following which apply:

affref. A reference to an author's affiliation. In practice this element is not used, since authors' affiliations are indicated by the aff attribute on author.

agent. A person playing a role within an admin-event. Contains one person element.

appendix. An appendix to an article. Contains an optional no and one or more sections.

appmat. A container for appendix matter. See above for general guidance.

Contains one or more appendix elements.

art-admin. A container for administrative information relating to an article. Contains, in the order specified:

art-back. A container for an article's back matter. Contains, in the order specified:

art-body. A container for an article's body matter. See above for general guidance.

Contains one or more sections, or one or more news-sections.

art-front. A container for an article's front matter. See above for general guidance on analysing front matter.

Contains a link, or the following elements in the order specified:

art-links. A container for links from an article to other resources. Contains any number of suppinf and/or fulltext elements.

art-toc-entry. Container for resources to use when creating the article's entry in the table of contents for a journal issue. Contains, in the following order:

article. An article. Contains a link element, or the following elements in the order specified:

articleref. RSC internal use only

A pointer to an article (within an issue), used when generating index entries. Contains a link.

arttitle. An article title within a citation or journalcit. Contains 'simple text or paragraphs'.

arttoc. An article's table of contents. Entering an empty <arttoc> element is an instruction to generate an article table of contents from the section and subsection headings (levels a to d, i.e. <section> to <subsect3>) found in the article. In the HTML output, hyperlinks from the ToC to each section are generated. These are based on the section's id if specified, otherwise on a unique system-generated code (which is liable to change each time the document is edited).

Can, if desired, contain toc-head (optional) and toc-entry (optional and repeatable).
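The two forms described above might look as follows. This is a sketch under the assumption that the article is delivered as XML (hence the empty-element syntax); the heading and entry text are invented for illustration.

```xml
<!-- Empty element: the table of contents is generated from the section headings -->
<arttoc/>

<!-- Explicit form: heading and entries supplied by hand (illustrative text only) -->
<arttoc>
  <toc-head>Contents</toc-head>
  <toc-entry>Introduction</toc-entry>
  <toc-entry>Experimental</toc-entry>
</arttoc>
```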

authgrp. A container for details of authors and their affiliations. Contains one or more author elements, followed by one or more affs.

author. One author of an article. Repeat for each distinct author. Contains a person, followed by an optional footnote.

below. The bottom half of a stack. Contains 'characters only'.

bi. Indicates that the contained text should be rendered as bold italic. This is preferable to using separate <bo> and <it> elements. Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type.

biblist. A container for the bibliography at the end of an article. Contains a mixture of text and citgroups.

biblscope. The scope of a citation within the work cited. Can include references to sections, chapters, page ranges, etc. Contains 'simple text'.

biography. A person's biography. Contains a link, or one or more sections.

bo. Indicates that the contained text should be rendered as bold. Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type (specifically compoundref, which is the most common reason for bold-face within article text).

board. RSC internal use only

A journal or issue's [Editorial] Board. Contains a link, or an optional title followed by zero or more groups and/or members.

book-review. A book review, consisting of the citation of the book being reviewed, reviewer's details, and the review itself. Contains a citation, followed by an optional authgrp for the reviewer's details (i.e. the 'author' of the review), followed by one or more paragraphs (p) and/or 'inter-paragraph elements'.

box. A floating text box. Contains a single section.

boxref. A reference to a floating text box. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

byline. RSC internal use only

A journal's byline. Contains 'simple text'.

chart. A chart. Contains an optional title. See above for general guidance on encoding graphics.

chartref. A cross-reference to a chart. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

citation. Container for an individual citation that doesn't fit the model for a standard journal citation (journalcit). Should only be used if <journalcit> cannot. See above for general guidance on encoding citations.

Contains mixed content, which can include the following element types as required:

citauth. An author within a citation or journalcit element. Contains a link, or an optional fname followed by a mandatory surname.

citext. Citation text. Used only when it is not possible to encode material found within a citations list using journalcit or citation. (This should only apply when the text isn't actually a citation at all.) Contains 'simple text'.

citgroup. A group of citations with a single reference number. (Most <citgroup>s will only contain a single journalcit or citation element.) See above for general guidance on encoding citations.

Contains an optional no element for a non-standard citation number, followed by one or more of the following, in any order:

citpub. The publisher of a citation. Contains 'simple text'.

To be added by data capture agency.

citref. A reference to a citation. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

city. The name of a city. Must consist of character data only.

coden. RSC internal use only

A CODEN identifier for a journal. Contains character data only.

colspec. A specification of the characteristics of a column in a table. Empty element: has no data content.

commentary. A description of the value of a citgroup. Contains 'simple text'.

compname. The name of a chemical compound. Contains a link or 'simple text'.

compound. Specifies the id of a chemical compound. Optionally contains one or more compoundref elements, each linking to a definition of that compound.

compoundgrp. A container for zero or more compound elements. A <compoundgrp> is required at the end of each article so that compoundref elements have a target to point to. (At present no use is made of these links when rendering articles.)

compoundref. A reference to a chemical compound. Contains 'emphasised text' specifying the compound's code. See above for general guidance on creating cross-references.
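The relationship between compoundref, compound and compoundgrp can be sketched as below. The idrefs value follows the citation example given earlier in these guidelines; the id attribute on <compound> is an assumption made for illustration and should be checked against the DTD.

```xml
<!-- In running text: tag the compound number as a reference rather than using <bo> -->
<p>Diester <compoundref idrefs="chem1">1</compoundref> was isolated as a minor side product.</p>

<!-- At the end of the article: the target for the reference above.
     The id attribute on <compound> is assumed here. -->
<compoundgrp>
  <compound id="chem1"/>
</compoundgrp>
```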

conference. Information about a conference or similar meeting. Contains an optional sequence number (no), followed by zero or more of the following, in any order:

confgrp. A container for zero or more conference elements.

confname. A conference's name or title. Contains 'simple text'.

contact. A contact, e.g. for a conference. Contains zero or more of the following, in any order:

country. A country name. Must consist of character data only.

cpyrt. RSC internal use only

A copyright statement. Contains 'simple text'.

date. A general year-month-day date. Contains a year, followed by an optional month and an optional day.
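For example, a full date and a year-only date might be entered as follows (a minimal sketch; the values are illustrative, with the month written in full as required for <month>):

```xml
<!-- 2 August 2002 -->
<date><year>2002</year><month>August</month><day>2</day></date>

<!-- Month and day are optional -->
<date><year>2002</year></date>
```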

daterange. A range of two dates.

day. A numerical day: 1/2/3/.../31. Should not contain anything apart from the day number itself.

dd. A definition description, part of a deflist. Contains 'text or paragraphs'.

dedicate. A dedication. Contains 'text or paragraphs'.

def. The definition of a term, part of a deflist. Contains the term itself, followed by its definition in a dd.

deflist. A definition list, containing an optional head, and one or more definitions (def).
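Putting deflist, def, term and dd together, a short definition list might be sketched as follows. The heading and entries are illustrative only, and any attributes are omitted.

```xml
<deflist>
  <head>Abbreviations</head>
  <def><term>BDA</term><dd>2,3-butane diacetal</dd></def>
  <def><term>DTD</term><dd>Document Type Definition</dd></def>
</deflist>
```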

denom. The denominator of a fraction. Contains 'simple text'.

doi. A Digital Object Identifier. Contains character data only.

editnote. An editorial note. Use this element type for any comments generated by the editing process - these do not form part of the article. Contains the following, in this order:

editor. The editor of an article or book. Contains 'simple text'.

email. An e-mail address. Contains character data only. Only enter the actual address: the prefix E-mail: will be generated by style sheets.

entry. An entry (cell) in a table. See above for general guidance on encoding tables.

Contains mixed content which can include text elements, graphics, and equations.

eqnref. A reference to an equation.

Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

eqntext. An equation expressed in textual form. See above for general guidance on encoding equations.

Contains 'simple text or paragraphs'. Use ps to lay out multi-line equations.

equation. An equation. See above for general guidance on encoding equations.

Contains an optional no, followed by a textual equation (eqntext) or a graphic displaying the equation (ugraphic).
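A textual equation might be sketched as below. The equations are invented for illustration, attributes such as ids are omitted, and the form of the number in <no> should follow the general guidance on numbering.

```xml
<!-- Single-line equation with a number -->
<equation>
  <no>1</no>
  <eqntext><it>E</it> = <it>mc</it><sup>2</sup></eqntext>
</equation>

<!-- Multi-line equation: one <p> per line -->
<equation>
  <eqntext>
    <p><it>a</it> = <it>b</it> + <it>c</it></p>
    <p><it>d</it> = <it>a</it> - <it>c</it></p>
  </eqntext>
</equation>
```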

fax. A fax number. Can only contain character data.

figref. A cross-reference to a figure. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

figure. A figure. Contains an optional title. See above for general guidance on encoding graphics.

fname. A person's first name. Contains 'simple text'.

fnoteref. A reference to a footnote (at the end of the article, or in the footer of a table). Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on footnotes, and on creating cross-references.

footer. A sequence of paragraphs at the end of a news item, typically set in italic. Contains one or more ps.

footnote. A footnote in the article, or in a table footer. See above for general guidance on encoding footnotes.

Footnotes in the article are placed at the point where the footnote reference is to appear in the rendered result. This means that fnoteref is only required for such footnotes if the same footnote is referenced more than once. In contrast, table footnotes are placed within the tfoot, and are referenced by a separate fnoteref.

Contains text or paragraphs.
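An article footnote placed at its point of reference might look like the following sketch. The sentence is invented, and any id attribute needed for a repeated reference via fnoteref is omitted here.

```xml
<p>The product was obtained in 85% yield.<footnote>Isolated yield after chromatography.</footnote> No side products were detected.</p>
```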

fpage. The number of the first page within an issue on which the printed version of an article appears. Can only contain character data.

fraction. A fraction. Contains a numerator (numer), followed by a denominator (denom).
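As a brief sketch, a simple numeric fraction and one with italic variables could be entered as follows (the use of <it> inside numer and denom assumes that 'simple text' admits emphasis elements):

```xml
<fraction><numer>1</numer><denom>2</denom></fraction>
<fraction><numer><it>a</it> + <it>b</it></numer><denom><it>c</it></denom></fraction>
```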

fulltext. A link to the full text of an article (e.g. in PDF). Probably not required - do not use without checking with RSC.

Contains a link element.

group. RSC internal use only

A group of people with similar roles within an Editorial Board. Contains an optional title, followed by zero or more members.

head. A heading (e.g. for a list, index, or definition list). Contains paragraphs or text.

icgraphic. A graphic to be included in an illustrated contents list entry. Empty element: has no contents. See above for general guidance on encoding graphics.

ictext. Text describing the article, to be included in an illustrated contents list entry. Contains paragraphs or text.

index. RSC internal use only

An [author] index. Contains an optional head, followed by zero or more index-entrys.

index-entry. RSC internal use only

An entry in an [author] index. Contains a value, followed by one or more articlerefs.

inf. Inferior (subscript) text. Indicates that the contained text should be rendered as subscript. Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type.

info. Information, e.g. about a journal. Contains a link, or one or more sections.

issn. RSC internal use only

The International Standard Serial Number for a journal. Contains character data only.

issue. RSC internal use only

One issue of a journal. Contains a link, or the following elements in this order:

issue-back. RSC internal use only

The back matter for an issue. Contains any number of any of the following, in any order:

issue-front. RSC internal use only

The front matter for an issue. Contains any number of any of the following, in any order:

issue-toc. RSC internal use only

The table of contents for an issue. Contains an optional toc-head, followed by zero or more toc-entry elements.

issueid. RSC internal use only

An identifier (other than the issue number) for an issue of a journal. Can only contain character data.

issueno. The issue number within a volume. Can only contain character data. When used within the issue element, this should be a 3-digit number with leading zeroes. Still true?

To be added by data capture agency. Still true?

issueref. A reference to [a document describing] one issue of a journal. See above for general guidance on creating cross-references.

Contains a link, or these elements in the following order:

it. Indicates that the contained text should be rendered as italic. Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type.

item. An item within a list. See above for general guidance on encoding lists.

Contains paragraphs or 'simple text'.

jnltrans. A translation of a simple journal citation (journalcit). Also used for Chem. Abstracts references, with the abstract number in <fpage>.

Contains the following, in the order specified:

journal. RSC internal use only

A description of an RSC journal. Contains a link, or these elements in the order specified:

journalcit. A citation which follows the standard model for simple citations of journal articles. Use citation for more complex cases, and for citations to anything other than journal articles. Use citext only for text within the References section which is not a citation at all. See above for general guidance on encoding citations.

Contains these elements in the order specified:

journalref. A reference to a document describing a journal. See above for general guidance on creating cross-references, and for a list of RSC journal codes.

It contains a link element, which should have the appropriate journal code as its value. These codes are listed below.

Contains a link, or these elements in the order specified:

keyword. A keyword describing an article's content. Contains 'simple text'.

link. A link to [part of] another document. Contains simple text.

Although the attributes within <link> provide a powerful means of expressing links, they are not yet being used. Instead, the data content within <link> is used to specify the target document. This content will be a unique identifier for the document, e.g. a journal code or an article's manuscript number.
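Since the target is currently given as the data content of <link>, a reference to an RSC journal might be sketched as follows, using the code CC (Chemical Communications) from the list of RSC journal abbreviations:

```xml
<!-- The content of <link> is the journal code; no link attributes are used -->
<journalref><link>CC</link></journalref>
```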

list. A list. See above for general guidance on encoding lists.

Contains an optional head, followed by one or more items.

location. A location (i.e. an address). Contains one or more of the following, in any order:

logo. RSC internal use only

A logo. Contains a ugraphic specifying the image to be used.

lpage. The number of a printed article's last page. Contains character data only.

member. RSC internal use only

A member of a <group>. Contains an optional role, followed by zero or more persons.

month. A month. Contains character data only. Months should be specified in full, e.g. "January". Since the style sheet can convert numeric months to their full form, should we be allowing, or even asking for, numeric months?

ms-id. The RSC's unique identifier for an article. Contains character data only.

Conventions for formatting article identifiers are given above. To be added by data capture agency.

nameelt. A component of an organisation's name. Contains 'simple text'.

news-article. A full article (with title and author details, and back matter such as a list of citations) found within a news section. Contains these elements, in the order specified:

news-item. A relatively simple news item. For more complex material, use news-article instead. Contains these elements, in the order specified:

news-section. A container for one or more news articles or (more usually) news items, plus other formats such as advertisements and conference listings. Can contain nested <news-section>s to support e.g. a two-level structure of news sections.

Contains an optional title, followed by zero or more of the following, in any order:

no. A number or other identifier (for a table, figure, etc.). Contains character data only. See above for general guidance on numbering strategy.

no-of-pages. The number of pages in the printed version of an article. Contains character data only.

note. A note. Contains text or paragraphs.

numer. The numerator of a fraction. Contains 'simple text'.

office. The RSC office responsible for managing an article. Contains character data only.

org. An organisation's name and address. Contains a link, or one or more orgnames followed by zero or more addresses.

orgname. An organization's name. Contains one or more nameelts.
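An organisation with a name in two parts and a postal address might be sketched as below. The organisation and address are invented, and the choice of address subelements (addrelt, city, postcode, country) is an assumption to be checked against the content model of <address>. Note that the subelements are separated by spacing only, with no punctuation.

```xml
<org>
  <orgname>
    <nameelt>Department of Chemistry</nameelt>
    <nameelt>University of Example</nameelt>
  </orgname>
  <address>
    <addrelt>1 Example Road</addrelt>
    <city>Exampleton</city>
    <postcode>EX1 2MP</postcode>
    <country>UK</country>
  </address>
</org>
```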

overbar. An overbar. Indicates that a bar should be placed above all the text within this element. Contains 'simple text'.

p. A paragraph. Contains mixed content (i.e. text and subelements intermixed), including any of these elements, at any point and in any order:

pages. The range of pages covered by a citation. Contains a fpage, optionally followed by a lpage.
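For instance (page numbers illustrative):

```xml
<!-- A page range -->
<pages><fpage>1583</fpage><lpage>1586</lpage></pages>

<!-- First page only: lpage is optional -->
<pages><fpage>1583</fpage></pages>
```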

persname. A person's name. Contains the following, in the order specified:

person. Details about a person. Contains a link, or the following elements in the order specified:

phone. A telephone number. Contains character data only.

pii. A Publisher Item Identifier. Contains character data only.

plate. A plate. Contains an optional title. See above for general guidance on encoding graphics.

plateref. A reference to a plate. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

postcode. A postcode. Contains character data only.

pubfront. Should this be 'RSC internal use only'?

Publication front matter. Contains the following elements in the order specified:

published. A link to a document/resource in which an article has been published. Contains a citext, or the following elements in the order specified:

Use the analysed citation subelements to describe print publication, or <citext> to record online publication. Is that right? Web publication uses <pubfront>. RK: not sure that this is right - citext??

publisher. RSC internal use only

The publisher of a journal. Contains "organisation" subelements, i.e. a link, or one or more orgnames followed by zero or more addresses. <aff> now has <address>, and <org> within it also has <address> - overkill?

pubname. A publisher name. Contains 'simple text'. This is no longer linked to anything, so should be removed from the DTD.

pubplace. The place of publication of a book, etc. Contains 'simple text'.

qualifier. A qualification to a person's name, such as a title, an honorific, or a phrase such as 'the late'. Contains 'simple text'.

received. A container for details of the date when, and place where, an article was received. Contains an optional city, followed by a date.
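A receipt record with both place and date might be sketched as follows (the values are illustrative only):

```xml
<received><city>Cambridge</city><date><year>2002</year><month>August</month><day>2</day></date></received>
```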

role. RSC internal use only

A role played by one or more people. Contains 'simple text'.

roman. Indicates that the contained text should be rendered as a roman typeface. Contains 'simple text'.

Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type.

row. A row in a table or table heading. See above for general guidance on encoding tables.

Contains one or more entry elements.

sansserif. Indicates that the contained text should be rendered in a sans serif typeface. Contains 'simple text'.

Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type.

scheme. A scheme. Contains an optional title. See above for general guidance on encoding graphics.

schemref. A reference to a scheme. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

scp. Indicates that the contained text should be rendered in small caps. Contains 'simple text'.

Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type.

section. A top-level section. Contains these elements in the order specified:

sercode. RSC internal use only

A serial (journal) code, conforming to the list of codes given above. Contains character data only.

To be added by data capture agency.

sertitle. A serial (journal) title. Contains 'simple text' or paragraphs.

The DTD now only has <sertitle> within <jnltrans>. Elsewhere it has become <title>. The style sheet needs updating to take account of this (the code described here will never be called upon), and <sertitle> should probably be removed from the DTD and replaced by <title> within <jnltrans>.

sici. A Serial Item Contribution Identifier. Contains character data only.

stack. One or more characters appearing directly above other characters (like a fraction without the horizontal line). Contains above followed by below.
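A minimal sketch, with illustrative characters:

```xml
<!-- "n" printed directly above "k", with no dividing line -->
<stack><above>n</above><below>k</below></stack>
```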

state. A geopolitical unit such as a state, county, etc. Contains character data only.

subject. A broad subject heading, ideally taken from a controlled list. Contains 'simple text'.

subsect1. A level-1 subsection. Contains these elements in the order specified:

subsect2. A level-2 subsection. Contains these elements in the order specified:

subsect3. A level-3 subsection. Contains these elements in the order specified:

subsect4. A level-4 subsection. Contains these elements in the order specified:

subsect5. A level-5 subsection. Contains these elements in the order specified:

subsect6. A level-6 subsection. Contains these elements in the order specified:

subtitle. A [table] subtitle. Contains 'simple text' or paragraphs.

sup. Indicates that the contained text should be rendered in superscript. Contains 'simple text'.

Only use this element when it is not possible to deduce why the text is rendered in this way. If possible, always use a more meaningful element type. <sup> is often mistakenly used instead of <citref>.
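The common mistake and its correction can be sketched as follows. The sentence is invented, and the idrefs attribute on <citref> is assumed by analogy with <compoundref>; check the attribute name against the DTD.

```xml
<!-- Avoid: the superscript number is really a reference to citation 5 -->
<p>The method has been reported previously.<sup>5</sup></p>

<!-- Prefer: encode the reference itself -->
<p>The method has been reported previously.<citref idrefs="cit5">5</citref></p>
```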

suppinf. Contains a link to supplementary information for an article.

surname. A surname. Contains 'simple text'.

table. A table, encoded using CALS-compliant XML markup. See above for general guidance on encoding tables.

(Tables which cannot be thus encoded should be prepared as images, and encoded as ugraphics.)

Contains an optional title, followed by an optional subtitle, followed by one or more tgroups. Note that <title> and <subtitle> within table-entry should be used in preference to these elements, since this allows titles for XML-encoded and 'image' tables to be treated consistently. Although, as the DTD notes, we can't clear %titles;, we could set parameter entity %tbl.tbl-titles.mdl to "" and so remove this possibility.

table-entry. A 'cover group' for a table, whether declared inline as table or given as a ugraphic. See above for general guidance on encoding tables.

Contains an optional title, followed by an optional subtitle, followed by either table or ugraphic.
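A small XML-encoded table inside its cover group might be sketched as below. The data are invented, the title is placed on <table-entry> rather than on <table> in line with the preference stated under table, and the cols attribute on <tgroup> follows the usual CALS convention rather than anything stated in these guidelines.

```xml
<table-entry>
  <title>Yields of silylation products</title>
  <table>
    <!-- cols follows the CALS convention; check against the DTD -->
    <tgroup cols="2">
      <thead>
        <row><entry>Entry</entry><entry>Yield (%)</entry></row>
      </thead>
      <tbody>
        <row><entry>1</entry><entry>85</entry></row>
        <row><entry>2</entry><entry>72</entry></row>
      </tbody>
    </tgroup>
  </table>
</table-entry>
```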

tableref. A reference to a table. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

tbody. A table's body matter (i.e. the main table, ignoring any header or footer). See above for general guidance on encoding tables.

Contains one or more rows.

term. A term being defined in a deflist. Contains 'simple text'.

textref. A cross-reference to text elsewhere in the article. Contains 'emphasised text' giving a human-readable description of the cross-reference. See above for general guidance on creating cross-references.

tfoot. The footer area of a table. See above for general guidance on encoding tables.

Contains zero or more colspecs, followed by one or more rows. Shouldn't <tfoot> have some CALS-style attributes?

tgroup. A table group. See above for general guidance on encoding tables.

Contains these elements, in the order specified:

thead. The header area of a table. See above for general guidance on encoding tables.

Contains zero or more colspecs, followed by one or more rows. Shouldn't <thead> have some CALS-style attributes?

title. A title (of a figure, table, journal, etc.). Contains 'simple text' or paragraphs.

titlegrp. A container for an article's main titles. Contains one or more titles.

toc-entry. An entry in a table of contents. Contains 'simple text' or paragraphs.

toc-head. Heading for a table of contents. Contains 'simple text'.

trans. A translation (of a citation).

Contains mixed content, which can include the following element types as required:

ugraphic. An untitled graphic. Use this element to encode any graphical content which doesn't have a title. See above for general guidance on encoding graphics.

ul. Indicates that the contained text should be underlined. Contains 'simple text'.

underbar. An underbar. Indicates that a bar should be placed below all the text within this element. Contains 'simple text'. In what way is this different from <ul>?

unknown. A feature in the text which cannot be encoded by any other element type in the DTD. Use the type attribute to indicate the nature of the feature. Do we need to generate some warning when this element is used?

Contains 'simple text'.

url. A URL. Contains character data only.

value. RSC internal use only

The value of an index entry. Contains character data only.

volume. RSC internal use only

One volume of a journal. Contains a link, or the following elements in this order:

volumeno. A journal volume number. Contains character data only.

When used within the <volume> element, this should be a 3-digit number with leading zeroes. Still true?

volumeref. A reference to one volume of a journal. See above for general guidance on creating cross-references.

Contains a link, or the following elements in this order:

warning. A warning. Contains 'simple text'.

who. The identity of the person making an editorial note (editnote). Contains 'simple text'. Wouldn't it make more sense to have <person> in place of this element type? Replace by <person> in the next version of the DTD.

year. A 4-digit year. Contains character data only. The value "PENDING" is allowed for date within pubfront.

Appendix B. Notations


Table: Notations recognized within the RSC application
Name    PUBLIC identifier where known
bmp "+//ISBN 0-7923-9432-1::Graphic Notation//NOTATION Microsoft Windows bitmap//EN"
cgm "-//USA-DCD//NOTATION Computer Graphics Metafile//EN"
cgm-binary "ISO 8632/3//NOTATION Binary encoding//EN"
cgm-char "ISO 8632/2//NOTATION Character encoding//EN"
cgm-clear "ISO 8632/4//NOTATION Clear text encoding//EN"
eps "+//ISBN 0-7923-9432-1::Graphic Notation//NOTATION Adobe Systems Encapulated PostScript//EN"
fax "-//USA-DOD//NOTATION CCITT Group 4 Facsimile Type 1 Untiled Raster//EN"
gif "+//ISBN 0-7923-9432-1::Graphic Notation//NOTATION Compuserve Graphic Interchange Format//EN"
iges "-//USA-DOD//NOTATION (ASME/ANSI Y14.26M-1987) Initial Graphics Exchange Specification//EN"
jpeg "ISO/IEC 10918:1993//NOTATION Digital Compression and Coding of Continuous-tone Still Images (JPEG)//EN"
mpeg1aud "ISO/IEC 11172-3:1993//NOTATION Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 3: Audio//EN"
mpeg1vid "ISO/IEC 11172-2:1993//NOTATION Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 2: Video//EN"
mpeg2aud "ISO/IEC 13818-3:1995//NOTATION Coding of moving pictures and associated audio: Part 3. Audio//EN"
mpeg2vid "ISO/IEC 13818-2:1995//NOTATION Information technology - Coding of moving pictures and associated audio: Part 2. Video//EN"
pcx "+//ISBN 0-7923-9432-1::Graphic Notation//NOTATION ZSoft PCX bitmap//EN"
pict "+//ISBN 0-7923-9432-1::Graphic Notation//NOTATION Apple Computer Quickdraw Picture//EN"
sgml "+//ISO 8879:1986//NOTATION Information processing - Text and office systems - Standard Generalized Markup Language (SGML)//EN"
tex "+//ISBN 0-201-13448-9::Knuth//NOTATION The TeXbook//EN"
tiff "+//ISBN 0-7923-9432-1::Graphic Notation//NOTATION Aldus/Microsoft Tagged Interchange File Format//EN"
wmf "+//ISBN 0-7923-9432-1::Graphic Notation//NOTATION Microsoft Windows Metafile//EN"
chemdraw
eqn
pdf
ps


Appendix C. Changes to the RSC DTD

This Appendix lists the changes made to the RSC DTD from version 3.4 onwards.

Summary of changes in version 3.4

Version 3.4 of the RSC Article DTD is a maintenance release, which aims to solve problems encountered while encoding articles, and to provide the RSC with the opportunity to add improved management information to articles.

The following changes are relevant to the encoding of actual articles:

The following changes are only relevant to RSC's internal management procedures:

Summary of changes in version 3.5

The following changes in version 3.5 will affect the encoding of articles:

The following changes are only relevant to RSC's internal management procedures:

Summary of changes in version 3.6

This version contains the following changes:

References

1 (a) http://www.oasis-open.org/html/a502.htm; (b) http://www.oasis-open.org/html/a503.htm.

This journal is © The Royal Society of Chemistry