[Accessibility conventions are described at the bottom of the page]
*** This is a free preview excerpt of a commercial publication. ***

5. Associating controlled vocabularies in XML documents
[> 6.][< 4.0][^^^]
5.0 Constraining information items using controlled vocabularies
[> 5.0.1][> 6.][< 5.][^^][^^^]
Three kinds of constraints to be validated for an XML document
[[1] - structural constraints ensure information items are correctly found
 [1] - lexical constraints ensure information items are correctly formed
 [1] - value constraints ensure information items are correctly understood
]
Constraining the document structure and lexical patterns is independent of business/value rules
[[1] - a community of users can publish an agreed-upon schema to validate that information items are correctly found and formed
]
Constraining information item use of controlled vocabularies is very dependent on business/value rules
[[1] - business/value rules implied by the nature of the information item
[[2] - e.g. points of a compass will never change
][1] - business/value rules imposed by a community of users
[[2] - e.g. the document status codes for the condition of a document in a transaction
][1] - business/value rules agreed upon between trading partners
[[2] - e.g. identification of account numbers for particular purposes
]]
Typical use of W3C Schema conflates structural and value constraints inflexibly
[[1] - one gets more flexibility by separating value constraints from structural constraints
 [1] - only structural constraints should be imposed across a community of users
[[2] - standard should constrain how the information is found and how it is formed, not how it is valued
 [2] - very infrequent changes to the structure of information being interchanged
 [2] - changes imply big impacts on applications and processing
][1] - value constraints should be selectively imposed
[[2] - changes in trading partners
 [2] - changes in business practices over time
 [2] - possibly frequent changes to the values allowed by different parties
 [2] - once programs accommodate a given set of values, changing the subsets of values in use doesn't change the applications
][1] - business rules should be selectively added
[[2] - private requirements could never be anticipated by standards committees
]]
5.0.1 Context/value association
[> 5.0.2][> 6.][< 5.0][^][^^][^^^]
Context/value association files
[[1] - [http://www.oasis-open.org/committees/document.php?document_id=29990]
 [1] - an XML vocabulary for associating document contexts with specified values
 [1] - suitable for constraining document entry in a user interface
 [1] - suitable for document validation before application processing
 [1] - techniques for specifying, restricting and extending lists for the purposes of validation
]
Masquerading meta data when restricting a large list to a subset of values
[[1] - validation must match an instance's use of the large list's meta data to the declaration of a subset list that has its own list-level meta data
 [1] - the list-level meta data of the subset list is necessarily different from the list-level meta data of the list from which it is derived
 [1] - the subset list masquerades as the list from which it is derived, so that instance-level meta data need not use the custom list-level meta data of the subset list
]
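[EXAMPLE] A minimal sketch of the masquerading idea, not the OASIS vocabulary itself. The class names, the field names and the list identifiers ("currency-codes", "partner-currency-subset") are all hypothetical. The subset list keeps its own list-level meta data but is checked against instances under the meta data of the list from which it is derived.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ListMetadata:
    """List-level meta data; the field names here are illustrative only."""
    identifier: str
    version: str

@dataclass
class ValueList:
    own: ListMetadata                          # meta data of this list itself
    values: frozenset
    masquerade: Optional[ListMetadata] = None  # meta data of the list it is derived from

    def effective_metadata(self):
        """Meta data that instance-level meta data is compared against."""
        return self.masquerade or self.own

def check(value, instance_meta, value_list):
    """True when the value is in the list and any instance meta data matches."""
    meta_ok = instance_meta is None or instance_meta == value_list.effective_metadata()
    return meta_ok and value in value_list.values

# Hypothetical large list and a trading-partner subset derived from it.
full = ListMetadata("currency-codes", "2001")
subset = ValueList(
    own=ListMetadata("partner-currency-subset", "1.0"),
    values=frozenset({"CAD", "USD"}),
    masquerade=full,
)

print(check("USD", full, subset))        # instance cites the large list
print(check("EUR", full, subset))        # value is outside the subset
print(check("USD", subset.own, subset))  # instance cites the custom subset meta data
```

In this sketch only the first check succeeds: the instance continues to cite the meta data of the large list, while the values accepted are those of the subset.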
ISO/IEC 19757-3 Schematron deployment
[[1] - as supplied, the methodology reports context/value constraint violations in simple text
 [1] - Schematron can alternatively be deployed with different available reporting techniques
]
The principles of context/value association are as follows:
[[1] - XML documents have information items that need to be validated
[[2] - the locations (contexts) of those items can be addressed using XPath addresses
][1] - genericode files have values and list meta data to use for validation
[[2] - the locations of those files can be declared with URL addresses
 [2] - the identity of each list is uniquely specified in order to be referenced multiple times
][1] - an association marries a document context with a set of genericode files
[[2] - each XPath document context is specified with the identities of the genericode declarations
][1] - validation checks values found in document contexts against genericode files linked by the association for the document context
[[2] - any meta data present in the document context is checked against the available genericode meta data
]]
[Figure 5.1: Context/value association
Three groups of triangles are shown, one triangle labeled "Document Instance Being Validated", a set of triangles labeled "External Value List Expressions", and one triangle labeled "Context/Value Associations".
In the triangle labeled "Context/Value Associations" are a number of drawn areas with arrows pointing to other triangles. The areas are labeled "An association ties a document context to one or more lists of valid values".
All areas each have one arrow to the "Document Instance Being Validated" triangle, with the arrow labeled "Document context specifies information items being validated". All areas have one or more arrows pointing to the "External Value List Expressions" triangles, with the arrow labeled "Lists of valid values are referenced by file location".
]
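[EXAMPLE] The principles above can be sketched in a few lines of Python, under stated assumptions. The dictionaries stand in for genericode files and for a context/value association file; their names, the element names in the sample document and the list identities are hypothetical. The standard library supports only a subset of XPath, so the contexts are simple element paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for genericode files: list identity -> allowed values.
# A real deployment would read these from genericode files located by URL.
VALUE_LISTS = {
    "currency": {"CAD", "USD", "EUR"},
    "status": {"draft", "final"},
}

# Hypothetical associations: document context -> identities of value lists.
ASSOCIATIONS = [
    ("./Currency", ["currency"]),
    ("./Line/Currency", ["currency"]),
    ("./Status", ["status"]),
]

def check_values(xml_text):
    """Return a list of messages for values not found in the associated lists."""
    root = ET.fromstring(xml_text)
    problems = []
    for context, list_ids in ASSOCIATIONS:
        # An association may name several lists; their values are combined.
        allowed = set().union(*(VALUE_LISTS[i] for i in list_ids))
        for item in root.findall(context):
            value = (item.text or "").strip()
            if value not in allowed:
                problems.append(f"{context}: unexpected value '{value}'")
    return problems

ORDER = """<Order>
  <Status>final</Status>
  <Currency>USD</Currency>
  <Line><Currency>XYZ</Currency></Line>
</Order>"""

print(check_values(ORDER))
```

Each list is declared once under its identity and can then be referenced from any number of contexts, which is the reason the principles require list identities to be unique.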
Appropriate for constraining data entry application user interfaces
[[1] - used as a front end for the user, preventing the entry of values other than those in the associated lists
[[2] - drop-down lists
 [2] - radio buttons
 [2] - check boxes
][1] - the end result of editing an instance is that the values are all from the associated lists
 [1] - the value-level meta data can be presented to the user
[[2] - assists the user in choosing which value or values to use
][1] - the option to include instance-level meta data should be offered
[[2] - reflects the list-level meta data for the list from where the values are taken
]]
Appropriate for constraining data validation
[[1] - used as a front end to an application that implements the logic for all possible values
 [1] - selective association for business scenarios prevents the application from acting on inappropriate values for a given transaction
[[2] - relationships between specific partners may be different
 [2] - different profiles of using documents may constrain particular values
]]
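[EXAMPLE] A minimal sketch of selective association, assuming hypothetical partner names and status codes. The application logic is written once for every possible value, and a per-partner subset acts as the front end that keeps inappropriate values away from it.

```python
# All values the application logic can handle (hypothetical status codes).
ALL_STATUS_CODES = {"draft", "final", "cancelled", "revised"}

# Hypothetical per-partner subsets chosen by business agreement.
PARTNER_SUBSETS = {
    "partner-a": {"draft", "final"},
    "partner-b": {"final", "cancelled", "revised"},
}

def accept(partner, status):
    """Front-end check: pass a value on only if it is allowed for this partner."""
    allowed = PARTNER_SUBSETS[partner]
    # Subsets never widen what the application already understands.
    assert allowed <= ALL_STATUS_CODES
    return status in allowed

def process(status):
    """Application logic written once for every possible value."""
    actions = {"draft": "hold", "final": "book", "cancelled": "void", "revised": "rebook"}
    return actions[status]

for partner in ("partner-a", "partner-b"):
    if accept(partner, "cancelled"):
        print(partner, process("cancelled"))
    else:
        print(partner, "rejected before reaching the application")
```

Changing a partner's subset changes only the data in PARTNER_SUBSETS; the process function is untouched.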
Only the CVA vocabulary is standardized by OASIS, not how it is used
[[1] - the file format and the semantics represented by the elements and attributes are being standardized by OASIS
 [1] - any implementation is considered out of scope of the committee work
]
5.0.2 Using context/value association for validation
[> 6.][< 5.0.1][^][^^][^^^]
Separates structural/lexical validation from value validation
[[1] - an XML document is checked using a two-step process
 [1] - the first pass confirms that the structure and lexical content are valid
 [1] - the second pass may then report, e.g., that a coded value used for a currency is unexpected
 [1] - the document structure and lexical content can be constrained by standardization
[[2] - e.g. the UBL technical committee publishes normative W3C schemas
][1] - the document controlled-value content is constrained by business requirements between trading partners
[[2] - e.g. the UBL committee publishes default coded value checks
[[3] - defaultCodeList.xsl
][2] - trading partners can use this value validation methodology to create their own value checking second-pass process
]]
[Figure 5.2: Two-step validation
The diagram is split with a horizontal line indicating runtime process above the line and advance preparation process below the line.
Above the line and at the left is an incoming XML instance depicted as a triangle. This is connected by an arrow to the box at the right labeled "Application Code" under the column "Semantic Interpretation". Two arrows lead down from this horizontal arrow, one under the column "Structure Validation" to a box labeled "W3C Schema", and the other under the column "Value Validation" to a box labeled "XSLT".
Below the line and under the column "Structure Validation" an "XSD" labeled triangle titled "Structure Constraints" and identified with a circled "1" has an arrow leading into the "W3C Schema" box. Below the line and under the column "Value Validation" an "XSLT" labeled triangle titled "Value Constraints" and identified with a circled "2" has an arrow leading into the "XSLT" box.
]
Document arrives at application unchanged
[[1] - validation only confirms the use of structure and content, without modifying it
]
Second pass results meaningless without first pass being successful
[[1] - the values must be correctly found and correctly formed before checking the actual values produces an accurate result
]
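[EXAMPLE] A minimal sketch of the two-step process, under stated assumptions. The Python standard library has no W3C Schema validator, so first_pass is a hand-written stand-in for schema validation and second_pass is a stand-in for the generated XSLT. The element names and the allowed currency subset are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

ALLOWED_CURRENCIES = {"CAD", "USD"}   # hypothetical trading-partner subset

def first_pass(root):
    """Stand-in for W3C Schema validation: items are found and well formed."""
    errors = []
    currency = root.find("./Currency")
    if currency is None:
        errors.append("structure: Currency is missing")
    elif not re.fullmatch(r"[A-Z]{3}", currency.text or ""):
        errors.append("lexical: Currency must be three upper-case letters")
    return errors

def second_pass(root):
    """Stand-in for the generated XSLT: values come from the agreed list."""
    value = root.find("./Currency").text
    if value not in ALLOWED_CURRENCIES:
        return [f"value: currency '{value}' is unexpected"]
    return []

def validate(xml_text):
    """Run both passes; the second runs only when the first succeeds."""
    root = ET.fromstring(xml_text)
    errors = first_pass(root)
    if not errors:
        errors = second_pass(root)
    return errors   # xml_text itself is never modified

print(validate("<Invoice><Currency>USD</Currency></Invoice>"))
print(validate("<Invoice><Currency>EUR</Currency></Invoice>"))
print(validate("<Invoice><Currency>euro</Currency></Invoice>"))
print(validate("<Invoice/>"))
```

The second pass assumes the Currency item exists and is well formed, so it is skipped whenever the first pass reports an error.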
Crane-CVA2sch package from Crane Softwrights Ltd. web site
[[1] - historically developed in the OASIS UBL Technical Committee
 [1] - moved into the OASIS Code List Representation Technical Committee
 [1] - moved out of the OASIS Code List Representation Technical Committee
[[2] - the committee decided to focus on file formats and not methodologies
 [2] - intellectual property returned to Crane Softwrights Ltd.
][1] - Crane is donating CVA2sch to an Apache Schematron project
]
A methodology for code list and value validation based on ISO/IEC 19757-3 Schematron
[[1] - an information item is asserted to have one of an allowed set of predetermined values
[[2] - a failed assertion is a value validation error
][1] - assertions are derived from context/value associations
]
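[EXAMPLE] A minimal sketch of deriving Schematron assertions from associations. This is an illustration only and is not the output of the Crane-CVA2sch package. The function name, the context and the values are hypothetical; the namespace and the schema, pattern, rule and assert elements are those of ISO Schematron.

```python
import xml.etree.ElementTree as ET

SCH = "http://purl.oclc.org/dsdl/schematron"   # ISO Schematron namespace
ET.register_namespace("sch", SCH)

def assertions_for(associations):
    """Build a Schematron schema with one rule per document context."""
    schema = ET.Element(f"{{{SCH}}}schema")
    pattern = ET.SubElement(schema, f"{{{SCH}}}pattern")
    for context, values in associations:
        rule = ET.SubElement(pattern, f"{{{SCH}}}rule", {"context": context})
        # The item must equal one of the allowed values.
        # Values containing an apostrophe are not handled in this sketch.
        test = " or ".join(f". = '{v}'" for v in sorted(values))
        check = ET.SubElement(rule, f"{{{SCH}}}assert", {"test": test})
        check.text = f"Value of {context} is not in the associated list"
    return ET.tostring(schema, encoding="unicode")

print(assertions_for([("Currency", {"CAD", "USD"})]))
```

A failed assert in the resulting schema corresponds to a value validation error for the information item at that context.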
Schematron is usually implemented using Extensible Stylesheet Language Transformations (XSLT)
[[1] - the supplied Schematron stylesheet that generates stylesheets is a copy of the publicly available reference XSLT implementation
[[2] - [http://www.schematron.com]
 [2] - the methodology supplies a wrapper stylesheet for the reference skeleton
][1] - other non-XSLT implementations of Schematron exist
[[2] - e.g. Amara/Scimitar implements ISO Schematron in Python
[[3] - [http://uche.ogbuji.net:8080/uche.ogbuji.net/tech/4Suite/amara/]
 [3] - same architecture as reference XSLT implementation in that Scimitar is a Python program that writes a Python program that performs the validation
]]]
The XSLT generated to implement the Schematron assertions is used as the second pass of validation to test XML instances for having correct controlled-vocabulary values
[[1] - the testing relies on the first-pass structural validation having already confirmed the structure and lexical values used in the instance
 [1] - without the first pass confirming the accurate presence of information items, the second pass is meaningless
]
The methodology supports the incorporation of any number of sets of Schematron assertions
[[1] - ISO Schematron supports the inclusion of multiple schema fragments into a single schema expression
 [1] - business rules related or unrelated to code lists may be expressed as Schematron assertions
[[2] - the trading partner schema can then include business rules in addition to coded value rules
]]
Overview of the process to prepare the second pass value validation XSLT stylesheet:
[Figure 5.3: Second-pass value-validation artefact creation
The diagram shows triangles and boxes in three different areas.
The area labeled "Definition" shows the "XML" labeled triangle titled "Code List Context Associations" and identified with a circled "3", a set of "GC" labeled triangles titled "External Code List Expressions" and identified with a circled "4", and a set of "SCH" labeled triangles titled "Business Rules" and identified with a circled "6". Each of these has an arrow directed to a box labeled "Schematron implementation of OASIS context/value association files for validation" in the area labeled "Preparation".
One arrow leaves this box to the "XSLT" labeled triangle titled "Assertion Validation Stylesheet" and identified with a circled "2".
One arrow leaves this box to the "XSLT Process" labeled box in the area labeled "Processing". The other input to this box is a set of "XML" labeled triangles titled "Document Instances Being Validated". The one output from this box is a set of "Report" labeled parallelograms titled "Validation Reports".
]
[[1] - the circled labels in the diagram are indicated by the parenthesized numbers
 [1] - the inputs:
[[2] - (3) the specification of contexts uses the context/value association XML vocabulary defined by the OASIS Code List Representation TC
 [2] - (4) the specification of coded values uses the genericode vocabulary defined by the OASIS Code List Representation TC
 [2] - (5) supplemental business rules are specified using ISO/IEC 19757-3 Schematron
][1] - the output:
[[2] - (2) an XSLT stylesheet (or some other implementation of Schematron assertion checking)
]]
Recall [Figure 1.4]
[[1] - the XSLT created here (2) plugs in to the two-step validation process
]
Recall [Figure 5.1]
[[1] - all three documents on that diagram are shown here: the instances being validated, the context/value association files and the external value list expressions
]

This is an accessible version of Crane's commercial training material. The content has been specifically designed to assist screen reader software in viewing the entire textual content. Figures are replaced with text narratives.

Navigation hints are in square brackets:
[Tx.x] and [Fx.x] are textual representations of the applicability icons;
[digit] indicates list depth for nested lists;
[link [URL]] indicates the URL of a hyperlink if different than link;
[EXAMPLE] indicates an example listing of code;
[FIGURE] indicates the presence of a figure replaced by its description;
[>] jumps forward;
[<] jumps backward;
[^] jumps to start of the section;
[^^] jumps to the start of the chapter;
[^^^] jumps to the table of contents.
Suggestions for improvement are welcome: [info@CraneSoftwrights.com]
Book sales: [http://www.CraneSoftwrights.com/links/trn-acc.htm]
Information: [http://www.CraneSoftwrights.com/links/info-acc.htm]
This content is protected by copyright and, as there are no means to protect this accessible version from plagiarism, please do not make any commercial edition available to others.

+//ISBN 1-894049::CSL::Presentation::UBL//DOCUMENT Practical Code List Implementation 2009-02-09 22:30UTC//EN
Practical Code List Implementation
First Edition - 2009-02-09
ISBN 978-1-894049-22-1
Copyright © Crane Softwrights Ltd.