This is a document mock-up intended as a drafting tool for work on the XML Encoding Rules (XER). No formal standards body has yet initiated work on XER. Key reference documents include ASN.1 and the Basic Encoding Rules (BER). See XER Decisions and Proposals for background on the consensus reached and pending issues. To join the XER discussion list, see <http://asf.gils.net/xer/list.html>.

DATA NETWORKS AND OPEN SYSTEM
OSI NETWORKING AND SYSTEM ASPECTS
ABSTRACT SYNTAX NOTATION ONE (ASN.1)

Information Technology ASN.1 Encoding Rules
Specification of XML Encoding Rules (XER)
Summary

This Recommendation | International Standard describes a set of encoding rules that can be applied to data values of all ASN.1 types to achieve a more human-readable representation than that achieved by the Basic Encoding Rules and its derivatives (described in Recommendation X.690). It is implicit in the specification of these encoding rules that they are also used for decoding.

Introduction

The publications ITU-T Rec. X.680 | ISO/IEC 8824-1, ITU-T Rec. X.681 | ISO/IEC 8824-2, ITU-T Rec. X.682 | ISO/IEC 8824-3, ITU-T Rec. X.683 | ISO/IEC 8824-4 together describe Abstract Syntax Notation One (ASN.1), a notation for the definition of messages to be exchanged between peer applications.

This Recommendation | International Standard defines encoding rules that may be applied to data values of types defined using the notation specified in ITU-T Rec. X.680 | ISO/IEC 8824-1. Application of these encoding rules produces a transfer syntax for such values. It is implicit in the specification of these encoding rules that they are also to be used for decoding.

There is more than one set of encoding rules that can be applied to values of ASN.1 types. This Recommendation | International Standard defines a set of XML (eXtensible Markup Language) Encoding Rules (XER). XER uses the eXtensible Markup Language (XML) recommendation of the World Wide Web Consortium (W3C) to achieve a more human-readable representation than that achieved by the Basic Encoding Rules (BER) and its derivatives described in ITU-T Rec. X.690 | ISO/IEC 8825-1.

XER allows for information described in ASN.1 to be carried in XML. An XER definition specifies the equivalence and necessary conversion between appropriate ASN.1 encoded data structures and XML encoded data structures.

Annex A gives an example of the application of the XML Encoding Rules. It does not form an integral part of this Recommendation | International Standard.
Recommendation | International Standard

Information Technology ASN.1 Encoding Rules: Specification of XML Encoding Rules (XER)

This Recommendation | International Standard specifies a set of XML Encoding Rules that may be used to derive a transfer syntax for data values of types defined in ITU-T Rec. X.680 | ISO/IEC 8824-1. These XML Encoding Rules are also to be applied for decoding such a transfer syntax in order to identify the data values being transferred.

The encoding rules specified in this Recommendation | International Standard:
a) are used at the time of communication;
b) are intended for use in circumstances where compatibility with XML or human-readability of the representation of data values is the major concern in the choice of encoding rules.

The following Recommendations and International Standards contain provisions which, through reference in this text, constitute provisions of this Recommendation | International Standard. At the time of publication, the editions indicated were valid. All Recommendations and Standards are subject to revision, and parties to agreements based on this Recommendation | International Standard are encouraged to investigate the possibility of applying the most recent editions of the Recommendations and Standards listed below. Members of IEC and ISO maintain registers of currently valid International Standards. The Telecommunication Standardization Bureau of the ITU maintains a list of currently valid ITU-T Recommendations.

2.1 Identical Recommendations | International Standards

ITU-T Recommendation X.200 (1994) | ISO/IEC 7498-1:1994, Information technology Open Systems Interconnection Basic Reference Model: The basic model.
ITU-T Recommendation X.226 (1994) | ISO/IEC 8823-1:1994, Information technology Open Systems Interconnection Connection-oriented presentation protocol: Protocol specification.
ITU-T Recommendation X.680 (1997) | ISO/IEC 8824-1:1998, Information technology Abstract Syntax Notation One (ASN.1): Specification of basic notation.
ITU-T Recommendation X.681 (1997) | ISO/IEC 8824-2:1998, Information technology Abstract Syntax Notation One (ASN.1): Information object specification.
ITU-T Recommendation X.682 (1997) | ISO/IEC 8824-3:1998, Information technology Abstract Syntax Notation One (ASN.1): Constraint specification.
ITU-T Recommendation X.683 (1997) | ISO/IEC 8824-4:1998, Information technology Abstract Syntax Notation One (ASN.1): Parameterization of ASN.1 specifications.

2.2 Paired Recommendations | International Standards equivalent in technical content

CCITT Recommendation X.208 (1988), Specification of Abstract Syntax Notation One (ASN.1).
ISO/IEC 8824:1990, Information Technology Open Systems Interconnection Specification of Abstract Syntax Notation One (ASN.1).
ISO International Register of Coded Character Sets to be used with Escape Sequence.
ISO/IEC 2022:1994, Information processing ISO 7-bit and 8-bit coded character sets Code extension techniques.
ISO 6093:1985, Information processing Representation of numerical values in character strings for information interchange.
ISO 6429:1992, Information technology Control functions for coded character sets.
ISO/IEC 10646-1:1993, Information technology Universal Multiple-Octet Coded Character Set (UCS): Architecture and Basic Multilingual Plane.
ISO/IEC 10646-1:1993/Amd.2:1996, Information technology Universal Multiple-Octet Coded Character Set (UCS) Amendment 2, UCS Transformation Format 8 (UTF-8).

For the purposes of this Recommendation | International Standard, the following definitions apply.

This Recommendation | International Standard makes use of the following terms defined in ITU-T Rec. X.690 | ISO/IEC 8825-1:
a) dynamic conformance;
b) static conformance;
c) data value;
d) encoding (of a data value);
e) sender.
3.2 Specification of Basic Notation

For the purposes of this Recommendation | International Standard, all the definitions in ITU-T Rec. X.680 | ISO/IEC 8824-1 apply.

3.3 Information Object Specification

For the purposes of this Recommendation | International Standard, all the definitions in ITU-T Rec. X.681 | ISO/IEC 8824-2 apply.

The term "octet" is frequently used in this Recommendation | International Standard to stand for "eight bits". The use of this term in place of "eight bits" does not carry any implications of alignment. Where alignment is intended it is explicitly stated in this Recommendation | International Standard.

ASN.1 Abstract Syntax Notation One
W3C World Wide Web Consortium
XER XML Encoding Rules of ASN.1
XML eXtensible Markup Language (recommendation of the W3C)

This Recommendation | International Standard references the notation defined by ITU-T Rec. X.680 | ISO/IEC 8824-1.

7.1 General rules for encoding

7.1.1 XER generated XML should be readily understandable

XML in common practice has the characteristic that the XML tag names are mnemonic for human readers as well as being tokens for machine processes. Although one could generate well-formed XML by simply converting numeric ASN.1 tags to alphanumeric equivalents that follow XML syntax, these would not be mnemonic for human readers. Therefore, it is necessary to have rules for automatically generating XML tag names from the ASN.1 identifiers in ASN.1 specifications, rather than from just the numeric ASN.1 tags. Coded values are expressed in their human language equivalents to the extent such equivalency information is accessible to XER processors. The XML generated with XER is then understandable by humans using nothing more than a display parser built into one of the Internet browsers.

XER introduces several XML tag names in addition to ASN.1 identifiers in ASN.1 specifications (see, for example, rules 7.1.2.2, 7.1.2.3, 7.1.10, 7.7, 7.10, and 7.19).
Having these XML tag names begin with an upper-case letter ensures that they do not duplicate any XML tag names generated from an ASN.1 identifier, as ASN.1 requires that identifiers begin with a lower-case letter.

7.1.2 Encoding XML tags for ASN.1 components of a CHOICE, SEQUENCE or SET type

An ASN.1 component is of the form
General rule for ASN.1 Type Reference encoding: output XML start tag using
generated XML tag name

It is recommended that ASN.1 specifications be written according to ASN.1:1994, which mandates the unique naming of components. If the optional name on the left-hand side of the Type Reference is present, it is used to generate the XML tag name. For ASN.1 specifications written to earlier versions of ASN.1, a new name is formed by concatenating the string "Name" with the smallest integer greater than or equal to 1 that will form a unique name within the Type Reference. These numbers are allocated in order of appearance. For example, the following CHOICE occurs in Z39.50
version 2:

7.1.5 XER requires well-formed XML

All XER generated XML must be well-formed and otherwise compliant with the appropriate version of XML.

7.1.6 XER does not require XML validation

XER operates on record values but the underlying structure of the record, fully represented only in the particular ASN.1 specification, must also be communicated. A useful feature of XML is that some aspects of the record structure can be inferred merely from the encoding of record values in well-formed XML. XER should be designed to exploit this feature of XML. In many typical applications, the inferred structure is adequate for using a record value encoded with XER generated XML and it is expected that the XML generated with XER will sometimes be processed without validation.

7.1.7 Optional DTD and schema representations are subordinate to the ASN.1 specification

Beyond the inferred structure, there are several other ways in XML to represent analogues of ASN.1 specifications for record structures. Among these are an XML DTD, an XML-Data schema, and a DCD, plus more abstract mechanisms such as XMI. Such mechanisms can be useful for validating XER generated XML, but the canonical form of record structures in the context of XER is always the base ASN.1 specification.

7.1.8 XER does not distinguish between implicit and explicit ASN.1 tagging

XER defines encoding and decoding as though the ASN.1 tagging environment were IMPLICIT TAGS.

7.1.10 XER uses XML namespace for top-level scope definition

XER will use the XML namespace attribute (see http://www.w3.org/TR/REC-xml-names/) to identify the particular XML with respect to the source of its component XML elements and attributes. Here is an example top-level XML element generated by XER using the Z39.50 ASN.1 specification:

7.1.11 XER uses XML namespace for identifiers not globally unique

In an ASN.1 specification, any particular identifier need not be unique. An ASN.1 identifier is only required to be unique within a context, i.e., between the curly brackets which are after a SEQUENCE, SET or CHOICE type. In an XML encoding, however, the tag name for an XML element must be globally unique.

XER uses the XML namespace attribute where necessary to construct unique tag names for XML elements based on ASN.1 identifiers. The XML namespace attribute is relative to the URI of the top-level element, formed by appending to the URI a slash character and the name of the immediate container. For example, suppose the top-level element has the attribute xmlns="http://host/something" and contains a structure "foo" that would contain the non-unique element "bar". The constructed unique name foo:bar is then represented as the element tag

7.2 Encoding of a boolean data value

BOOLEAN
7.3 Encoding of an integer data value

An ASN.1 INTEGER data value will be encoded in XML as a sequence of decimal (base 10) digits rendered as text. A leading minus sign will indicate negative numbers. Other punctuation, embedded white space, or any other characters are illegal.

Example: the ASN.1 component
with a data value of 1000000 is encoded as:
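As a minimal Python sketch of rule 7.3 (an illustration under stated assumptions, not part of the draft), the functions below render an INTEGER as decimal text and reject content the rule calls illegal. The element name `count` and the function names are hypothetical, and treating leading or trailing white space as illegal is an assumption of this sketch.

```python
import re

# Rule 7.3: an optional leading minus sign followed by decimal digits only.
# [0-9] is used instead of \d so that non-ASCII digits are rejected.
_INTEGER_TEXT = re.compile(r"-?[0-9]+")


def encode_integer(tag_name: str, value: int) -> str:
    """Render an INTEGER value as an XML element with decimal text content."""
    return f"<{tag_name}>{value:d}</{tag_name}>"


def decode_integer_text(text: str) -> int:
    """Parse element content, rejecting punctuation, white space and other characters."""
    if _INTEGER_TEXT.fullmatch(text) is None:
        raise ValueError(f"illegal INTEGER content: {text!r}")
    return int(text)


# "count" is a hypothetical tag name chosen only for this illustration.
print(encode_integer("count", 1000000))  # <count>1000000</count>
```

A decoder built this way fails loudly on input such as `1,000` or `1 000` instead of silently repairing it.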
7.4 Encoding of an enumerated data value

INTEGER { identifier ( num ) ... }

7.5 Encoding of a real data value

An ASN.1 REAL data value will be encoded in XML as a sequence of decimal (base 10) digits rendered as text. A leading minus sign will indicate negative numbers and a period will indicate the decimal point, if applicable. An exponent (scientific notation) will be encoded by following the mantissa with an "e" or "E", either a "+" or a "-" and the exponent, in base 10. Positive infinity will be encoded as "PLUS-INFINITY" and negative infinity will be encoded as "MINUS-INFINITY". Other punctuation, embedded white space, or any other characters are illegal.

Examples:

7.6 Encoding of a bitstring data value

Two types of BitStrings are defined in ASN.1: those with named bits and those with unnamed bits. BitStrings with named bits are encoded as a list of names of bits, delimited by white space, and the order of the named bits is not significant, i.e.,

BIT STRING { identifier ( num ) ... }
Example: the ASN.1 Type Reference
with all bits set to "true" could be encoded as
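The named-bit rule in 7.6 can be sketched in Python as follows. This is an illustration only: the tag name `permissions`, the bit names `read`, `write` and `execute`, and both function names are hypothetical and do not come from any ASN.1 specification referenced here.

```python
from typing import Dict, Set


def encode_named_bits(tag_name: str, named_bits: Dict[str, int], set_bits: Set[int]) -> str:
    """List the names of the bits that are set, separated by single spaces."""
    # Sorting by bit number only makes the output stable; order is not significant.
    ordered = sorted(named_bits.items(), key=lambda item: item[1])
    names = [name for name, number in ordered if number in set_bits]
    return f"<{tag_name}>{' '.join(names)}</{tag_name}>"


def decode_named_bits(text: str, named_bits: Dict[str, int]) -> Set[int]:
    """Map white-space-delimited bit names back to bit numbers, in any order."""
    return {named_bits[name] for name in text.split()}


BITS = {"read": 0, "write": 1, "execute": 2}  # hypothetical named bits
print(encode_named_bits("permissions", BITS, {0, 1, 2}))
# <permissions>read write execute</permissions>
```

Because the decoder returns a set, `execute read` and `read execute` decode to the same value, which matches the statement that the order of named bits is not significant.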
BitStrings with unnamed bits are encoded as a sequence of binary digits. The first binary digit corresponds to bit zero. Characters other than binary digits are assumed to have been inserted for human readability and are ignored. XER places no limit on the length of a BitString.

7.7 Encoding of an octetstring data value

OctetStrings can be encoded as either a sequence of XML characters or as case-insensitive hexadecimal strings. If the hexadecimal encoding is chosen, then the enclosing XML element must include the element "Hex" and each octet must be encoded within the "Hex" element as paired hexadecimal digits. Embedded white space is ignored. Any other characters are illegal.

OCTET STRING
Example: the ASN.1 Type Reference
with a data value of "Ralph" could be encoded as:
or
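The unnamed-bit rule of 7.6 and the hexadecimal form of 7.7 can be sketched in Python as below. This is a hedged illustration, not the draft's own example: the tag name `name` and all function names are hypothetical, and emitting upper-case hexadecimal digits is an arbitrary choice since the rule is case-insensitive.

```python
from typing import List


def decode_unnamed_bits(text: str) -> List[int]:
    """Keep only binary digits; index 0 of the result is bit zero."""
    return [int(ch) for ch in text if ch in "01"]


def encode_octets_hex(tag_name: str, data: bytes) -> str:
    """Wrap paired hexadecimal digits in the "Hex" element required by 7.7."""
    return f"<{tag_name}><Hex>{data.hex().upper()}</Hex></{tag_name}>"


def decode_hex_text(text: str) -> bytes:
    """Decode the content of a "Hex" element; embedded white space is ignored."""
    digits = "".join(text.split())
    # bytes.fromhex raises ValueError for unpaired digits or non-hex characters.
    return bytes.fromhex(digits)


print(encode_octets_hex("name", b"Ralph"))  # <name><Hex>52616C7068</Hex></name>
print(decode_unnamed_bits("1010 0001"))     # [1, 0, 1, 0, 0, 0, 0, 1]
```

Note the asymmetry between the two rules: stray characters in an unnamed BitString are dropped, while in a hexadecimal OctetString only white space is tolerated.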
7.8 Encoding of a null data value

Null is encoded as an XML empty element.

Example: the ASN.1 Type Reference
would be encoded as:
7.9 Encoding of a sequence data value

An ASN.1 SEQUENCE contains a list of components, each of which may need to be encoded. The Type References must be encoded in the order specified in the ASN.1. The contents of a Sequence are encoded according to the rules for the Type References contained in the Sequence, i.e.,

SEQUENCE { field [ OPTIONAL ] ... }
Example: the ASN.1 Production
with data values of "16384" and "500000" respectively could be encoded as:
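A minimal Python sketch of the ordering rule in 7.9 follows, using the standard library `xml.etree.ElementTree` module. The names `limits`, `maxRecords` and `maxBytes` are hypothetical stand-ins, not the identifiers of the production referred to above, and component values are assumed to be already rendered as text.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def encode_sequence(tag_name: str,
                    components: List[Tuple[str, bool]],
                    values: Dict[str, str]) -> str:
    """Emit child elements in ASN.1 declaration order.

    components: (identifier, is_optional) pairs in the order given in the ASN.1.
    values: identifier -> text content already encoded by the component's rule.
    """
    parent = ET.Element(tag_name)
    for identifier, is_optional in components:
        if identifier not in values:
            if is_optional:
                continue  # an absent OPTIONAL component produces no element
            raise ValueError(f"missing mandatory component: {identifier}")
        child = ET.SubElement(parent, identifier)
        child.text = values[identifier]
    return ET.tostring(parent, encoding="unicode")


# Hypothetical component names; the dict order deliberately differs from the ASN.1 order.
spec = [("maxRecords", False), ("maxBytes", False)]
print(encode_sequence("limits", spec, {"maxBytes": "500000", "maxRecords": "16384"}))
# <limits><maxRecords>16384</maxRecords><maxBytes>500000</maxBytes></limits>
```

Driving the loop from the component list instead of from the value dictionary is what guarantees that the output follows the order specified in the ASN.1.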
7.10 Encoding of a sequence-of data value

A SEQUENCE OF clause is followed by a single Type. Any number of occurrences of that Type can be encoded. The Type is encoded according to the rules for the Production that defined it, i.e.,

SEQUENCE OF type

7.11 Encoding of a set data value

An ASN.1 SET contains a list of Type References, each of which may need to be encoded. The contents of a Set are encoded according to the rules for the Type References contained in the Set. The Type References may be encoded in any order.

Example: the ASN.1 Production
would be encoded as:

or:

Example: the ASN.1 Production

would be encoded as:
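Because the components of a SET may appear in any order (7.11), a decoder has to match children by tag name, not by position. The Python sketch below illustrates this under stated assumptions: the element names are hypothetical, and rejecting unknown or repeated components is a choice made for this sketch, not a requirement stated in the draft.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set


def decode_set(xml_text: str, component_names: Set[str]) -> Dict[str, str]:
    """Collect the children of a SET element by tag name, ignoring their order."""
    root = ET.fromstring(xml_text)
    values: Dict[str, str] = {}
    for child in root:
        if child.tag not in component_names:
            raise ValueError(f"unexpected component: {child.tag}")
        if child.tag in values:
            raise ValueError(f"component appears twice: {child.tag}")
        values[child.tag] = child.text or ""
    return values


names = {"width", "height"}  # hypothetical component identifiers
first = decode_set("<box><width>3</width><height>4</height></box>", names)
second = decode_set("<box><height>4</height><width>3</width></box>", names)
print(first == second)  # True
```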
7.12 Encoding of a set-of data value

7.13 Encoding of a choice data value

An ASN.1 CHOICE contains a list of Type References. A single Type Reference from the list is encoded according to the rules for that Type Reference, i.e.,

CHOICE { field ... }

A CHOICE always generates exactly one XML element.

Example: the ASN.1 Production

would be encoded as:
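The statement in 7.13 that a CHOICE yields exactly one XML element can be sketched in Python as follows. The alternative names `title` and `isbn` and the function names are hypothetical, and the sketch assumes that the single element is the one named after the selected alternative, which is one reading of the rule.

```python
import xml.etree.ElementTree as ET
from typing import Set, Tuple


def encode_choice(alternatives: Set[str], chosen: str, text: str) -> str:
    """Emit one element, named after the selected alternative."""
    if chosen not in alternatives:
        raise ValueError(f"not an alternative of this CHOICE: {chosen}")
    element = ET.Element(chosen)
    element.text = text
    return ET.tostring(element, encoding="unicode")


def decode_choice(xml_text: str, alternatives: Set[str]) -> Tuple[str, str]:
    """Return (selected alternative, text content) for a single encoded element."""
    element = ET.fromstring(xml_text)
    if element.tag not in alternatives:
        raise ValueError(f"not an alternative of this CHOICE: {element.tag}")
    return element.tag, element.text or ""


ALTERNATIVES = {"title", "isbn"}  # hypothetical alternatives
print(encode_choice(ALTERNATIVES, "isbn", "0131103628"))  # <isbn>0131103628</isbn>
```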
7.14 Encoding of an ASN.1 tagged data value

7.16 Encoding of an instance-of data value

7.17 Encoding of a data value of the embedded-pdv type

7.18 Encoding of an object identifier data value

ObjectIdentifiers will be encoded as sequences of dot-delimited decimal digits, i.e.,

OBJECT IDENTIFIER
Example: the ASN.1 Type Reference
with a data value of { iso(1) member-body(2) ansi(840) z39-50(10003) z39-50-recordSyntax(5) usmarc(10) } would be encoded as:
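As a small Python sketch of rule 7.18 (an illustration, with the tag name `syntax` and both function names being hypothetical), the arcs of the value above can be extracted and joined with dots. The helper only understands the `name(number)` form used in that value.

```python
import re
from typing import List, Sequence


def arcs_from_value_notation(value: str) -> List[int]:
    """Extract arc numbers from an OID value written only in name(number) form."""
    return [int(number) for number in re.findall(r"\(([0-9]+)\)", value)]


def encode_oid(tag_name: str, arcs: Sequence[int]) -> str:
    """Render the arcs as dot-delimited decimal numbers inside one element."""
    if any(arc < 0 for arc in arcs):
        raise ValueError("object identifier arcs must be non-negative")
    return f"<{tag_name}>{'.'.join(str(arc) for arc in arcs)}</{tag_name}>"


value = ("{ iso(1) member-body(2) ansi(840) z39-50(10003) "
         "z39-50-recordSyntax(5) usmarc(10) }")
arcs = arcs_from_value_notation(value)
print(arcs)                        # [1, 2, 840, 10003, 5, 10]
print(encode_oid("syntax", arcs))  # <syntax>1.2.840.10003.5.10</syntax>
```

Matching only digits inside parentheses keeps the digits embedded in names such as `z39-50` from being mistaken for arcs.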
7.19 Encoding of a data value of the external type

Externals are defined in the ASN.1 Production:

EXTERNAL ::= [UNIVERSAL 8] IMPLICIT SEQUENCE
{

where the single-ASN1-type is chosen for data described by ASN.1 and octet-aligned is chosen for all other data. XER encodes an external with an XML tag of <External>.

Example: A USMARC record (not defined with ASN.1) would be encoded in an EXTERNAL as:
7.20 Encoding for data values of the restricted character string types

7.21 Encoding for data values of the unrestricted character string type

7.22 Encoding for data values of the any character string type

An ASN.1 ANY does not specify any contents. At this point in the record, any well-formed XML generated from an ASN.1 specification may be generated according to the rules for that ASN.1 specification.

7.23 Encoding for data values of the general string type

GeneralStrings will be encoded as a sequence of XML characters.

7.24 Encoding for data values of the visible string type

VisibleStrings will be encoded as a sequence of XML characters.

8.1 Dynamic conformance is specified by clause 9 onwards.

8.2 Static conformance is specified by those standards which specify the application of these XML Encoding Rules.

8.3 The rules in this Recommendation | International Standard are specified in terms of an encoding procedure. Implementations are not required to mirror the procedure specified, provided the bit string produced as the complete encoding of an abstract syntax value is identical to one of those specified in this Recommendation | International Standard for the applicable transfer syntax.

8.4 Implementations performing decoding are required to produce the abstract syntax value corresponding to any received bit string which could be produced by a sender conforming to the encoding rules identified in the transfer syntax associated with the material being decoded.

9 Use of XER in transfer syntax definition

9.1 The encoding rules specified in this Recommendation | International Standard can be referenced and applied whenever there is a need to specify an unambiguous, undivided and self-delimiting octet string representation for all of the data values of a single ASN.1 type.

NOTE All such octet strings are unambiguous within the scope of the single ASN.1 type. They would not necessarily be unambiguous if mixed with encodings of a different ASN.1 type.
9.2 The following object identifier and object descriptor values are assigned to identify and describe the XML Encoding Rules specified in this Recommendation | International Standard:

{joint-iso-ccitt asn1 (1) xml-encoding (_)}

... NOTE: OID for XER is TBD

and

"XML Encoding of a single ASN.1 type"

9.3 Where an unambiguous specification defines an abstract syntax as a set of presentation data values, each of which is a data value of some specifically named ASN.1 type, usually (but not necessarily) a choice type, then the object identifier value specified in 9.2 may be used with the abstract syntax name to identify the transfer syntax that results from applying the XML Encoding Rules to the specifically named ASN.1 type used in defining the abstract syntax.

9.4 The name specified in 9.2 shall not be used with an abstract syntax name to identify a transfer syntax unless the conditions of 9.3 for the definition of the abstract syntax are met.

Example of encodings

(This annex does not form an integral part of this Recommendation | International Standard)
Source: http://asf.gils.net/xer/, 21 April 1999