Support validation of conformance #1223

@Asbjoedt

Is your feature request related to a problem? Please describe.
In the following, I refer to code written in the context of spreadsheets, so not all of it may be reusable for text documents or presentations.

Open XML SDK currently supports the following two means of identifying OOXML conformance (either Transitional or Strict).

First, there is the Conformance property, an attribute saved to workbook.xml whose value comes from the ConformanceClass enum. You have to go to WorkbookPart.Workbook.Conformance to find it.

Second, you have StrictRelationshipFound, which is a bool value you can fetch from root level. Here's the code in Open XML SDK:

/// <summary>
/// Gets a value indicating whether this package contains Transitional relationships converted from Strict.
/// </summary>
public bool StrictRelationshipFound { get; private set; }
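
To make the two indicators concrete, here is a minimal sketch that reads both from a spreadsheet. The class and method names (`ConformanceProbe`, `Inspect`) are mine and purely illustrative; only `Workbook.Conformance`, `ConformanceClass` and `StrictRelationshipFound` come from the SDK. Neither value says anything about whether the content is valid for that conformance class.

```csharp
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class ConformanceProbe
{
    // Returns the two indicators the SDK exposes today.
    // Neither one validates the content of the file.
    public static (bool DeclaresStrict, bool StrictRelationshipFound) Inspect(string filepath)
    {
        using var spreadsheet = SpreadsheetDocument.Open(filepath, false);

        // Value of the conformance attribute on the workbook root element, if present.
        var conformance = spreadsheet.WorkbookPart?.Workbook?.Conformance;
        bool declaresStrict = conformance != null
            && conformance.HasValue
            && conformance.Value.Equals(ConformanceClass.Enumstrict);

        // Flag set by the SDK while opening the package.
        return (declaresStrict, spreadsheet.StrictRelationshipFound);
    }
}
```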

The question is whether these two ways of identifying conformance actually check for a VALID STRICT or VALID TRANSITIONAL OOXML file. The Conformance property only fetches the attribute value, meaning it does not check whether any INVALID TRANSITIONAL features appear in the file. I don't know how StrictRelationshipFound works. Judging from the summary, it only finds existing Transitional files that have been converted to Strict conformance, and it does not identify Strict files created as Strict. However, when I test this on files, whether the file was originally Transitional has no relevance: it finds any Strict file. But how is this determined? Does it simply look at the conformance class value, or does it also check for the existence of X, Y, Z properties? For instance, many namespaces differ between the two conformance classes, and these should be checked.
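
As an illustration of the namespace point, the root namespace of the workbook part can be inspected directly, without the SDK. This is only a sketch under assumptions: the names `WorkbookNamespaceProbe` and `DetectFromNamespace` are hypothetical, and it assumes the workbook part is stored at xl/workbook.xml (as Excel writes it) instead of resolving it through the package relationships. It checks a single namespace, so it is an indicator and not a validation.

```csharp
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

public static class WorkbookNamespaceProbe
{
    // Main SpreadsheetML namespaces of the two conformance classes.
    public const string StrictMain = "http://purl.oclc.org/ooxml/spreadsheetml/main";
    public const string TransitionalMain = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Assumption: the workbook part lives at xl/workbook.xml.
    // A robust tool would follow the package relationships instead.
    public static string DetectFromNamespace(Stream package)
    {
        using var zip = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true);
        var entry = zip.GetEntry("xl/workbook.xml")
            ?? throw new InvalidDataException("xl/workbook.xml not found");

        using var xml = entry.Open();
        var ns = XDocument.Load(xml).Root?.Name.NamespaceName;

        if (ns == StrictMain) return "strict";
        if (ns == TransitionalMain) return "transitional";
        return "unknown";
    }
}
```

Comparing this result with the declared conformance attribute would reveal files whose declaration and namespaces disagree, which is one of the checks a conformance-aware validator would need.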

I think it is OK if both of the outlined means of identification simply check the conformance class value, but I don't think this is sufficient for validating a VALID STRICT or VALID TRANSITIONAL file. All differences specified in the OOXML ISO standard should be separated into two different ways of validating a file. At the moment the validator does not distinguish between a Transitional and a Strict file, so how can I know whether a Strict file contains Transitional features that it is not allowed to contain?

I refer to this issue for a discussion of Strict validation and a brief outline of the conformance differences described in the OOXML standard (I have not found them all; for instance, it does not mention anything about namespaces): mikeebowen/ooxml-validator-vscode#14

Two other places to find some (non-exhaustive) info:

Describe the solution you'd like
The Open XML SDK validator method is this:

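// Validate() takes an open package (or a part or element), not a file path.
using var spreadsheet = SpreadsheetDocument.Open(filepath, false);
var validator = new OpenXmlValidator();
var validation_errors = validator.Validate(spreadsheet).ToList();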

What we need is a parameter on Validate() that specifies which conformance class the document should be validated against, Transitional or Strict. The ConformanceClass enum can be used for this purpose (link to enum).

Then it should report errors for Strict that may not be errors for Transitional, and vice versa. If no errors are reported, the file is by logic valid, or you can return a bool or any other value indicating successful validation.
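
The shape of such an API could look roughly like the sketch below. Everything named here (`ConformanceValidationResult`, `ValidateConformance`) is hypothetical and does not exist in the SDK. The sketch only compares the declared conformance with the expected one and then runs the existing validator, so it does not perform the Strict-specific checks this request is about. It also assumes that a missing conformance attribute means Transitional.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using DocumentFormat.OpenXml.Validation;

// Hypothetical result type; nothing like it exists in the Open XML SDK.
public sealed class ConformanceValidationResult
{
    public ConformanceValidationResult(bool declaredMatches, IReadOnlyList<ValidationErrorInfo> errors)
    {
        DeclaredConformanceMatches = declaredMatches;
        Errors = errors;
    }

    public bool DeclaredConformanceMatches { get; }
    public IReadOnlyList<ValidationErrorInfo> Errors { get; }
    public bool IsValid => DeclaredConformanceMatches && Errors.Count == 0;
}

public static class ConformanceValidatorSketch
{
    // Hypothetical stand-in for the requested Validate(document, conformance) overload.
    public static ConformanceValidationResult ValidateConformance(
        this OpenXmlValidator validator,
        SpreadsheetDocument spreadsheet,
        ConformanceClass expected)
    {
        var declared = spreadsheet.WorkbookPart?.Workbook?.Conformance;

        // Assumption: a missing conformance attribute is treated as Transitional.
        bool declaredStrict = declared != null
            && declared.HasValue
            && declared.Value.Equals(ConformanceClass.Enumstrict);
        bool expectStrict = expected.Equals(ConformanceClass.Enumstrict);

        // Existing SDK validation; it is not conformance-aware today.
        var errors = validator.Validate(spreadsheet).ToList();

        return new ConformanceValidationResult(declaredStrict == expectStrict, errors);
    }
}
```

A caller would then write something like `new OpenXmlValidator().ValidateConformance(spreadsheet, ConformanceClass.Enumstrict).IsValid`, which matches the idea of returning a single value for successful validation.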

Describe alternatives you've considered
Unfortunately, I know of no other OOXML validators out there. I think it is crucial that Microsoft fully supports the OOXML ISO standard, which means supporting Strict conformance. The life cycle of data begins with creation and ends with deletion or archiving. Right now this life cycle is not fully supported, because data quality for archiving means Strict conformance of OOXML.

Additional context
Here's my prototype repo for digital archiving of spreadsheets using Open XML SDK. You can find info on Strict conformance in the wiki: https://github.com/Asbjoedt/CLISC
