Pattern Matching #118

@AtesComp

Modify the "Value Pattern Matching Algorithm" to match on regular expressions.

NOTE: I've split this off from #73 as I believe this is a useful framing issue on its own.

From the framing specification:

4.3 Value Pattern Matching Algorithm

The Value Pattern Matching Algorithm is used as part of the Framing and Frame Matching algorithms. A value object matches a value pattern using the match none and wildcard patterns on @value, @type, and @language, in addition to allowing a specific value to match a set of values defined using the array form for each value object property.

Matching appears to be an all-or-nothing affair: match anything ({}), match nothing ([]), or match a specific value in full. There is no middle ground for generic string matching. It would be useful for matching to support regex patterns. Example:

"ex:relationship-.+"

This is helpful for a wide range of use cases. For example, when the input has no @type but the @id contains information from which the type can be inferred, a partial string match on the @id can select the default type. It would also be very useful for matching on properties (2.1.1). The 4.3 Value Pattern Matching Algorithm then becomes much more robust.
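To make the "middle ground" concrete, here is a minimal Python sketch of how one value-object property could be matched. It is a simplification, not the algorithm from the specification: the function name `component_matches` is hypothetical, the regex case is written as a `{"@pattern": ...}` entry (the keyword proposed further down in this issue), and whole-string matching via `re.fullmatch` is an assumption.

```python
import re

def component_matches(actual, expected):
    """Match one value-object property (e.g. @value) against a frame entry.

    actual:   the string found in the value object, or None if absent.
    expected: {} (wildcard), [] (match none), a string, a list of strings,
              or {"@pattern": ...} (hypothetical regex entry).
    """
    if expected == {}:
        # Wildcard: any present value matches.
        return actual is not None
    if expected == []:
        # Match none: only an absent value matches.
        return actual is None
    if isinstance(expected, dict) and "@pattern" in expected:
        # Proposed extension: one regex or a list of regexes.
        patterns = expected["@pattern"]
        if isinstance(patterns, str):
            patterns = [patterns]
        # Assumption: the whole string must match, not just a substring.
        return actual is not None and any(
            re.fullmatch(p, actual) for p in patterns
        )
    # Specific value, or a set of values given in array form.
    allowed = expected if isinstance(expected, list) else [expected]
    return actual in allowed

print(component_matches("ex:relationship-parent", {"@pattern": "ex:relationship-.+"}))
```

With whole-string matching, "ex:relationship-" on its own would not match, because `.+` requires at least one further character.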

RELATED:
The JSON Schema specification uses the pattern keyword for regular expressions:
https://json-schema.org/understanding-json-schema/reference/regular_expressions.html

OWL2 also reserves xsd:pattern for regex and uses it in restrictions.

PROPOSITION: Extend JSON-LD Framing with the @pattern keyword.

[FRAME]
{
  "@context": {"@vocab": "http://example.org/"},
  "@id": {"@pattern": ".*\/[Ll]ibrary\/.*"},
  "@type": {"@default": "Library"},
  "contains": {
    "@id": {"@pattern": ".*\/[Bb]ook\/.*"},
    "@type": {"@default": "Book"},
    "contains": {
      "@id": {"@pattern": ".*\/[Cc]hapter\/.*"},
      "@type": {"@default": "Chapter"}
    }
  }
}
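As a rough illustration of what the top level of this frame is meant to do, the Python sketch below assigns the default type to a node whose @id matches the pattern. The helper names `id_matches` and `infer_type` are hypothetical, whole-string matching (`re.fullmatch`) is an assumption, and this is not how any existing JSON-LD processor behaves, since @pattern is only a proposal.

```python
import json
import re

# Top level of the frame above, parsed as JSON; "\/" unescapes to "/".
frame = json.loads(r'''
{
  "@id": {"@pattern": ".*\/[Ll]ibrary\/.*"},
  "@type": {"@default": "Library"}
}
''')

def id_matches(frame, node):
    """True if the node's @id matches the frame's @pattern in full (assumed)."""
    pattern = frame["@id"]["@pattern"]
    return re.fullmatch(pattern, node.get("@id", "")) is not None

def infer_type(frame, node):
    """Keep an existing @type; otherwise use @default when the @id matches."""
    if "@type" in node:
        return node["@type"]
    if id_matches(frame, node):
        return frame["@type"]["@default"]
    return None

node = {"@id": "http://example.org/library/1"}
print(infer_type(frame, node))
```

A node such as `{"@id": "http://example.org/book/1"}` does not match the library pattern, so this sketch returns `None` for it.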

@pattern should accept an array of patterns.

[FRAME]
{
  "@context": {"@vocab": "http://example.org/"},
  "@id": {"@pattern": [
    ".*\/[Ll]ibrary\/.*",
    ".*\/[Aa]thenaeum\/.*",
    ".*\/[Bb]ook_?[Cc]ollection\/.*"]
  },
  "@type": {"@default": "Library"},
  "contains": {
    "@id": {"@pattern": ".*\/[Bb]ook\/.*"},
    "@type": {"@default": "Book"},
    "contains": {
      "@id": {"@pattern": ".*\/[Cc]hapter\/.*"},
      "@type": {"@default": "Chapter"}
    }
  }
}

Alternatively, we could just use regex alternation (|), but it might be nice to support the array form as well.

[FRAME]
...
  "@id": {"@pattern": ".*\/([Ll]ibrary|[Aa]thenaeum|[Bb]ook_?[Cc]ollection)\/.*"}
...
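The array form and the alternation form should accept the same @id values. A small Python check of that claim, assuming an array means "any pattern may match" and assuming whole-string matching (`re.fullmatch`); the function name `matches_pattern` and the sample IRIs are made up for illustration:

```python
import re

# Patterns as they read after JSON unescaping ("\/" becomes "/").
ARRAY_FORM = [
    r".*/[Ll]ibrary/.*",
    r".*/[Aa]thenaeum/.*",
    r".*/[Bb]ook_?[Cc]ollection/.*",
]
ALTERNATION_FORM = r".*/([Ll]ibrary|[Aa]thenaeum|[Bb]ook_?[Cc]ollection)/.*"

def matches_pattern(value, pattern):
    """Match a string against one pattern or a list of patterns (any may match)."""
    patterns = pattern if isinstance(pattern, list) else [pattern]
    return any(re.fullmatch(p, value) is not None for p in patterns)

# Hypothetical sample identifiers.
samples = [
    "http://example.org/Library/1",
    "http://example.org/athenaeum/x",
    "http://example.org/book_collection/7",
    "http://example.org/museum/1",
]
for s in samples:
    # Both forms agree on every sample.
    assert matches_pattern(s, ARRAY_FORM) == matches_pattern(s, ALTERNATION_FORM)
```

The array form mainly buys readability: each alternative stays a short, separately testable pattern.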

How about matching property names, not just values? 2.1 Framing would then become much more robust, and we could do interesting things such as shaping based on property patterns. The following frame assigns types based on property patterns and relations:

[FRAME]
{
  "@context": {"@vocab": "http://example.org/"},
  "@type": {"@default": "Library"},
  "location": {"@pattern": "[Aa]thens(, (Greece|Tennessee, USA))?"},
  "contains": [
    {
      "@type": {"@default": "Book"},
      {"@pattern": ".*([Cc]reator|[Aa]uthor).*"}: {},
      "contains": {
        "@id": {"@pattern": ".*\/[Cc]hapter\/.*"},
        "@type": {"@default": "Chapter"}
      }
    },
    {
      "@type": {"@default": "Periodical"},
      {"@pattern": ".*([Cc]reator|[Pp]ublisher).*"}: {},
      "contains": {
        "@id": {"@pattern": ".*\/[Aa]rticle\/.*"},
        "@type": {"@default": "Article"}
      }
    }
  ]
}
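The intent of the property-name patterns can be sketched in Python as follows. This only illustrates the idea: the helpers `matching_properties` and `infer_contained_type` are hypothetical, property names are assumed to be compared as plain strings with whole-string matching, and trying candidates in frame order is an assumption this issue does not settle.

```python
import re

def matching_properties(node, name_pattern):
    """Return the node's non-keyword properties whose names match the pattern."""
    return {
        name: value
        for name, value in node.items()
        if not name.startswith("@") and re.fullmatch(name_pattern, name)
    }

def infer_contained_type(node, candidates):
    """Return the first candidate type with at least one matching property name."""
    for type_name, name_pattern in candidates:
        if matching_properties(node, name_pattern):
            return type_name
    return None

# Property-name patterns taken from the frame above, in frame order.
candidates = [
    ("Book", r".*([Cc]reator|[Aa]uthor).*"),
    ("Periodical", r".*([Cc]reator|[Pp]ublisher).*"),
]

book = {"@id": "http://example.org/item/1", "http://example.org/author": "A. Writer"}
periodical = {"@id": "http://example.org/item/2", "http://example.org/publisher": "Some Press"}
print(infer_contained_type(book, candidates))
print(infer_contained_type(periodical, candidates))
```

Both patterns accept a property containing "creator", so a node that has only such a property is ambiguous; in this sketch the first candidate wins, and a real design would need to define that case.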

Status: Future Work