JSON-LD Framing 1.0

An Application Programming Interface for the JSON-LD Syntax

Unofficial Draft 30 August 2012

Latest editor's draft:
http://json-ld.org/spec/latest/json-ld-framing/
Editors:
Manu Sporny, Digital Bazaar
Gregg Kellogg, Kellogg Associates
Dave Longley, Digital Bazaar
Markus Lanthaler, Graz University of Technology
Authors:
Dave Longley, Digital Bazaar
Manu Sporny, Digital Bazaar
Gregg Kellogg, Kellogg Associates
Markus Lanthaler, Graz University of Technology

Abstract

JSON-LD Framing allows developers to query by example and force a specific tree layout to a JSON-LD document.

Status of This Document

This document is merely a public working draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organisation.

This document is an experimental work in progress.

1. Introduction

A JSON-LD document is a representation of a directed graph. A single directed graph can have many different serializations, each expressing exactly the same information. Developers typically work with trees, represented as JSON objects. While mapping a graph to a tree can be done, the layout of the end result must be specified in advance. A Frame can be used by a developer on a JSON-LD document to specify a deterministic layout for a graph.
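
The claim that one graph can have several serializations can be illustrated with a small Python sketch. It walks two differently shaped trees and checks that both describe the same set of edges. The helper name `edges` and the example IRIs are assumptions made for illustration, and the walk handles only plain node objects, not the full JSON-LD syntax.

```python
def edges(node, found=None):
    """Collect (subject, property, object) edges from a nested node tree.

    Simplified: every node is a dict with an "@id" key, and property values
    are strings (literals), nested node dicts, or lists of those.
    """
    if found is None:
        found = set()
    subject = node["@id"]
    for prop, value in node.items():
        if prop == "@id":
            continue
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                found.add((subject, prop, item["@id"]))
                edges(item, found)  # follow the embedded node
            else:
                found.add((subject, prop, item))
    return found


# Layout 1: the book is embedded inside the library.
tree_layout = {
    "@id": "http://example.org/library",
    "http://example.org/vocab#contains": {
        "@id": "http://example.org/book",
        "http://example.org/vocab#title": "The Republic",
    },
}

# Layout 2: both nodes sit at the top level and are linked by reference.
flat_layout = [
    {
        "@id": "http://example.org/library",
        "http://example.org/vocab#contains": {"@id": "http://example.org/book"},
    },
    {
        "@id": "http://example.org/book",
        "http://example.org/vocab#title": "The Republic",
    },
]

tree_edges = edges(tree_layout)
flat_edges = set()
for top_level_node in flat_layout:
    edges(top_level_node, flat_edges)

print(tree_edges == flat_edges)
```

Both layouts yield the same two edges; a Frame is what lets a developer state which of these layouts is wanted.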

1.1 How to Read this Document

This document is a detailed specification for a serialization of Linked Data in JSON. The document is primarily intended for the following audiences:

To understand the basics in this specification you must first be familiar with JSON, which is detailed in [RFC4627]. You must also understand the JSON-LD Syntax [JSON-LD], which is the base syntax used by all of the algorithms in this document, and the JSON-LD API [JSON-LD-API]. To understand the API and how it is intended to operate in a programming environment, it is useful to have working knowledge of the JavaScript programming language [ECMA-262] and WebIDL [WEBIDL]. To understand how JSON-LD maps to RDF, it is helpful to be familiar with the basic RDF concepts [RDF-CONCEPTS].

1.2 General Terminology

Issue

The intent of the Working Group and the Editors of this specification is to eventually align terminology used in this document with the terminology used in the RDF Concepts document to the extent to which it makes sense to do so. In general, if there is an analogue to terminology used in this document in the RDF Concepts document, the preference is to use the terminology in the RDF Concepts document.

The following is an explanation of the general terminology used in this document:

JSON object
An object structure is represented as a pair of curly brackets surrounding zero or more name-value pairs. A name is a string. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The names within an object should be unique.
array
An array is represented as square brackets surrounding zero or more values that are separated by commas.
string
A string is a sequence of zero or more Unicode (UTF-8) characters, wrapped in double quotes, using backslash escapes (if necessary). A character is represented as a single character string.
number
A number is similar to that used in most programming languages, except that the octal and hexadecimal formats are not used and that leading zeros are not allowed.
true and false
Values that are used to express one of two possible boolean states.
null
The null value is used within JSON-LD to ignore or reset values.
keyword
A JSON key that is specific to JSON-LD, specified in the JSON-LD Syntax specification [JSON-LD] in the section titled Syntax Tokens and Keywords.
context
A set of rules for interpreting a JSON-LD document as specified in The Context of the [JSON-LD] specification.
IRI
An Internationalized Resource Identifier as described in [RFC3987].
Linked Data
A set of documents, each containing a representation of a linked data graph.
linked data graph or dataset
An unordered labeled directed graph, where nodes are IRIs or Blank Nodes, or other values. A linked data graph is a generalized representation of an RDF graph as defined in [RDF-CONCEPTS].
named graph
A linked data graph that is identified by an IRI.
graph name
The IRI identifying a named graph.
default graph
When executing an algorithm, the graph where data should be placed if a named graph is not specified.
node
A piece of information that is represented in a linked data graph.
node definition
A JSON object used to represent a node and one or more properties of that node. A JSON object is a node definition if it does not contain the keys @value, @list or @set and it has one or more keys other than @id.
node reference
A JSON object used to reference a node having only the @id key.
blank node
A node in the linked data graph that does not contain a de-referenceable identifier because it is either ephemeral in nature or does not contain information that needs to be linked to from outside of the linked data graph. A blank node is assigned an identifier starting with the prefix _:.
property
The IRI label of an edge in a linked data graph.
subject
A node in a linked data graph with at least one outgoing edge, related to an object node through a property.
object
A node in a linked data graph with at least one incoming edge.
quad
A piece of information that contains four items: a subject, a property, an object, and a graph name.
literal
An object expressed as a value such as a string, number or in expanded form.
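
The distinctions between node definition, node reference and blank node translate directly into small predicates. The following Python sketch applies the definitions above literally; the function names are illustrative assumptions and are not part of any JSON-LD API.

```python
def is_node_definition(value):
    """True for a JSON object without @value, @list or @set that has at
    least one key other than @id."""
    if not isinstance(value, dict):
        return False
    if any(key in value for key in ("@value", "@list", "@set")):
        return False
    return any(key != "@id" for key in value)


def is_node_reference(value):
    """True for a JSON object whose only key is @id."""
    return isinstance(value, dict) and set(value) == {"@id"}


def is_blank_node_id(identifier):
    """True for identifiers that start with the blank node prefix "_:"."""
    return isinstance(identifier, str) and identifier.startswith("_:")


book = {"@id": "http://example.org/book", "http://example.org/vocab#title": "The Republic"}
print(is_node_definition(book), is_node_reference({"@id": "_:b0"}), is_blank_node_id("_:b0"))
```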

1.3 Contributing

There are a number of ways that one may participate in the development of this specification:

2. The Application Programming Interface

This API provides a clean mechanism that enables developers to convert JSON-LD data into a variety of output formats that are easier to work with in various programming languages. If a JSON-LD API is provided in a programming environment, the entirety of the following API must be implemented.

2.1 JsonLdProcessor

The JSON-LD processor interface is the high-level programming structure that developers use to access the JSON-LD transformation methods. The definition below is an experimental extension of the interface defined in the [JSON-LD-API].

Note

The JSON-LD API signatures are the same across all programming languages. Because asynchronous programming is uncommon in certain languages, developers may implement a processor with a synchronous interface instead. In that case, the callback parameter must not be included and the result must be returned as the return value instead.

[NoInterfaceObject]
interface JsonLdProcessor {
    void frame (object or object[] or IRI input, object or IRI frame, object or IRI? context, JsonLdCallback callback, optional JsonLdOptions? options);
};

2.1.1 Methods

frame
Frames the given input using the frame according to the steps in the Framing Algorithm. The input is used to build the framed output, which is returned if there are no errors. If there are no matches for the frame, null must be returned. Exceptions must be thrown if there are errors.
Parameter | Type | Nullable | Optional | Description
input | object or object[] or IRI | ✘ | ✘ | The JSON-LD object or array of JSON-LD objects to perform the framing upon or an IRI referencing the JSON-LD document to frame.
frame | object or IRI | ✘ | ✘ | The frame to use when re-arranging the data of input; either in the form of a JSON object or as an IRI.
context | object or IRI | ✔ | ✘ | An optional external context to use in addition to the context embedded in input when expanding the input.
callback | JsonLdCallback | ✘ | ✘ | A callback that is called when processing is complete on the given input.
options | JsonLdOptions | ✔ | ✔ | A set of options that may affect the framing algorithm, such as the input document's base IRI.
Return type: void
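
The difference between the callback-based signature and the synchronous variant permitted by the note above can be sketched in Python. Everything here is an assumption made for illustration: `framer` is a hypothetical callable standing in for the Framing Algorithm, `no_match_framer` is a stub, and the callback is assumed to receive only the result, since the actual JsonLdCallback signature is defined in [JSON-LD-API] and not in this document.

```python
def frame_sync(framer, input_doc, frame_doc, context=None, options=None):
    """Synchronous variant: no callback, the result is the return value."""
    return framer(input_doc, frame_doc, context, options or {})


def frame_with_callback(framer, input_doc, frame_doc, context, callback, options=None):
    """Callback variant following the parameter order of the IDL.

    Errors raised by `framer` propagate as exceptions; the callback is only
    invoked on success (an assumption, see the lead-in).
    """
    callback(framer(input_doc, frame_doc, context, options or {}))


def no_match_framer(input_doc, frame_doc, context, options):
    """Hypothetical stub: behaves like a frame that matches nothing."""
    return None


received = []
frame_with_callback(no_match_framer, {}, {}, None, received.append)
print(frame_sync(no_match_framer, {}, {}), received)
```

The stub returns None to mirror the rule that null is returned when nothing matches the frame.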

2.2 Callbacks

2.2.1 JsonLdCallback

The JsonLdCallback is used to return a processed JSON-LD representation as the result of processing an API method.

See JsonLdCallback definition in [JSON-LD-API].

2.3 Data Structures

This section describes datatype definitions used within the JSON-LD API.

2.3.1 JsonLdOptions

The JsonLdOptions type is used to pass a set of options to an interface method.

See JsonLdOptions definition in [JSON-LD-API].

3. Algorithms

All algorithms described in this section are intended to operate on language-native data structures. That is, the serialization to a text-based JSON document isn't required as input or output to any of these algorithms and language-native data structures must be used where applicable.

3.1 Syntax Tokens and Keywords

This specification adds a number of keywords to the ones defined in the [JSON-LD] specification:

@default
Used in Framing to set the default value for an output property when the framed node definition does not include such a property.
@explicit
Used in Framing to override the value of the explicit inclusion flag within a specific frame.
@omitDefault
Used in Framing to override the value of the omit default flag within a specific frame.
@embed
Used in Framing to override the value of the object embed flag within a specific frame.
@null
Used in Framing when a value of null should be returned, which would otherwise be removed when Compacting.

All JSON-LD tokens and keywords are case-sensitive.
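
As a rough illustration of where these keywords appear, the following Python sketch writes a frame as a dictionary and collects the framing keywords it uses. The frame contents, the `ex` prefix and the helper `framing_keywords_used` are hypothetical; the lookup is deliberately case-sensitive, so a key such as `@Embed` is not recognised.

```python
FRAMING_KEYWORDS = {"@default", "@explicit", "@omitDefault", "@embed", "@null"}


def framing_keywords_used(frame):
    """Return the set of framing keywords that occur as keys in a frame."""
    used = set()
    if isinstance(frame, dict):
        for key, value in frame.items():
            if key in FRAMING_KEYWORDS:  # exact, case-sensitive comparison
                used.add(key)
            used |= framing_keywords_used(value)
    elif isinstance(frame, list):
        for item in frame:
            used |= framing_keywords_used(item)
    return used


# Hypothetical frame: select libraries, list only declared properties,
# reference books instead of embedding them, and default missing titles.
example_frame = {
    "@context": {"ex": "http://example.org/vocab#"},
    "@type": "ex:Library",
    "@explicit": True,
    "ex:contains": {
        "@type": "ex:Book",
        "@embed": False,
        "ex:title": {"@default": "Untitled"},
    },
}

print(sorted(framing_keywords_used(example_frame)))
```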

3.2 Algorithm Terms

active subject
the currently active subject that the processor should use when processing.
active property
the currently active property that the processor should use when processing. The active property is represented in the original lexical form, which is used for finding coercion mappings in the active context.
active object
the currently active object that the processor should use when processing.
active context
a context that is used to resolve terms while the processing algorithm is running. The active context is the context contained within the processor state.
compact IRI
a compact IRI has the form of prefix and suffix and is used as a way of expressing an IRI without needing to define separate term definitions for each IRI contained within a common vocabulary identified by prefix.
local context
a context that is specified within a JSON object, specified via the @context keyword.
processor state
the processor state, which includes the active context, active subject, and active property. The processor state is managed as a stack with elements from the previous processor state copied into a new processor state when entering a new JSON object.
JSON-LD input
The JSON-LD data structure that is provided as input to the algorithm.
JSON-LD output
The JSON-LD data structure that is produced as output by the algorithm.
term
A term is a short word defined in a context that may be expanded to an IRI.
prefix
A prefix is a term that expands to a vocabulary base IRI. It is typically used along with a suffix to form a compact IRI to create an IRI within a vocabulary.
language-tagged string
A language-tagged string is a literal without a datatype, including a language. See language-tagged string in [RDF-CONCEPTS].
typed literal
A typed literal is a literal with an associated IRI which indicates the literal's datatype. See literal in [RDF-CONCEPTS].
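
The relationship between term, prefix and compact IRI can be shown with a minimal Python sketch. It assumes a context that maps terms and prefixes directly to IRI strings and is not the full expansion behaviour of the JSON-LD API; the function name `expand_compact_iri` is illustrative.

```python
def expand_compact_iri(value, context):
    """Expand a term or a prefix:suffix compact IRI against a simple context.

    Simplified: `context` maps names to IRI strings only. Values that cannot
    be expanded are returned unchanged.
    """
    if value in context:  # a plain term
        return context[value]
    prefix, separator, suffix = value.partition(":")
    # "http://..." also contains a colon; skip values whose suffix starts
    # with "//" so that absolute IRIs are left alone.
    if separator and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix
    return value


context = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "name": "http://xmlns.com/foaf/0.1/name",
}
print(expand_compact_iri("foaf:homepage", context))
print(expand_compact_iri("name", context))
```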

3.3 Framing

Framing is the process of taking a JSON-LD document, which expresses a graph of information, and applying a specific graph layout (called a Frame).

Framing makes use of the Node Map Generation algorithm to place each object defined in the JSON-LD document into a flat list of objects, allowing them to be operated upon by the framing algorithm.
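
The flat list of objects can be pictured with the following Python sketch. It is not the Node Map Generation algorithm itself, which is defined in [JSON-LD-API]; it is a simplified stand-in that assumes expanded input in which every node already carries an @id, and it ignores @list contents and blank node labelling. The name `flatten_subjects` is illustrative.

```python
def flatten_subjects(element, subjects=None):
    """Build a map of @id -> flat node definition from expanded JSON-LD."""
    if subjects is None:
        subjects = {}
    if isinstance(element, list):
        for item in element:
            flatten_subjects(item, subjects)
        return subjects
    if not isinstance(element, dict) or "@id" not in element:
        return subjects  # plain values and unlabeled nodes are skipped here
    node = subjects.setdefault(element["@id"], {"@id": element["@id"]})
    for prop, values in element.items():
        if prop == "@id":
            continue
        if prop.startswith("@"):
            node[prop] = values  # keywords such as @type are copied as-is
            continue
        for value in values:
            if isinstance(value, dict) and "@id" in value:
                flatten_subjects(value, subjects)  # register the nested node
                value = {"@id": value["@id"]}      # keep only a reference
            if value not in node.setdefault(prop, []):
                node[prop].append(value)
    return subjects


expanded = [{
    "@id": "http://example.org/library",
    "@type": ["http://example.org/vocab#Library"],
    "http://example.org/vocab#contains": [{
        "@id": "http://example.org/book",
        "http://purl.org/dc/terms/title": [{"@value": "The Republic"}],
    }],
}]
print(sorted(flatten_subjects(expanded)))
```

After flattening, the library entry holds only a node reference to the book, and the book's properties live in its own entry.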

3.3.1 Framing Algorithm Terms

input frame
the initial frame provided to the framing algorithm.
framing context
a context containing a map of embeds, the object embed flag, the explicit inclusion flag and the omit default flag.
map of embeds
a map that tracks if a subject is to be embedded in the output of the Framing Algorithm; it maps a subject @id to a parent JSON object and property or parent array.
object embed flag
a flag specifying that objects should be directly embedded in the output, instead of being referred to by their IRI.
explicit inclusion flag
a flag specifying that for properties to be included in the output, they must be explicitly declared in the framing context.
omit default flag
a flag specifying that properties that are missing from the JSON-LD input, but present in the input frame should be omitted from the output.
map of flattened subjects
a map of subjects that is the result of the Node Map Generation algorithm.

3.3.2 Framing Algorithm

This algorithm is a work in progress. Presently, it only works for documents without named graphs.

Currently, framing only allows node definitions to be selected based on @type matching or on duck typing for included properties. Value properties can be matched explicitly by defining the property and excluding things that are undefined, but it is not possible to be more specific about the types of values selected. Allowing this is currently being discussed.

The framing algorithm takes a JSON-LD input (expanded input) and an input frame (expanded frame) that have been expanded according to the Expansion Algorithm, and a number of options and produces JSON-LD output.

Create framing context using null for the map of embeds, the object embed flag set to true, the explicit inclusion flag set to false, and the omit default flag set to false along with map of flattened subjects set to the @merged property of the result of performing the Node Map Generation algorithm on expanded input. Also create results as an empty array.

Invoke the recursive algorithm using framing context (state), the map of flattened subjects (subjects), expanded frame (frame), results as parent, and null as active property.
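
The initial state described in the two paragraphs above might be represented as follows. This is a sketch only: the class name `FramingContext`, its field names and the helper `start_framing` are assumptions, and the argument is expected to be the result of the Node Map Generation algorithm, which is not implemented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FramingContext:
    subjects: dict                 # map of flattened subjects
    embeds: Optional[dict] = None  # map of embeds, initially null
    embed: bool = True             # object embed flag
    explicit: bool = False         # explicit inclusion flag
    omit_default: bool = False     # omit default flag


def start_framing(node_map_result):
    """Create the framing context and the empty results array."""
    state = FramingContext(subjects=node_map_result["@merged"])
    results = []
    return state, results


state, results = start_framing({"@merged": {"http://example.org/book": {"@id": "http://example.org/book"}}})
print(state.embed, state.explicit, state.omit_default, state.embeds, results)
```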

The following series of steps is the recursive portion of the framing algorithm:

  1. Validate frame.
  2. Create a set of matched subjects by filtering subjects checking the map of flattened subjects against frame:
    1. If frame has a @type property containing one or more IRIs, match any node definition with a @type property including any of those IRIs.
    2. Otherwise, if frame has a @type property containing only an empty JSON object, match any node definition with a @type property, regardless of the actual values.
    3. Otherwise, match if the node definition contains all of the non-keyword properties in frame.
  3. Get values for embedOn and explicitOn by looking in frame for the keys @embed and @explicit using the current values for object embed flag and explicit inclusion flag from state if not found.
  4. For each id and subject from the set of matched subjects, ordered by id:
    1. If the active property is null, set the map of embeds in state to an empty map.
    2. Initialize output with @id and id.
    3. Initialize embed with parent as parent and active property as property.
    4. If embedOn is true, and id is in map of embeds from state:
      1. Set existing to the value of id in map of embeds and set embedOn to false.
      2. If existing has a parent which is an array containing a JSON object with @id equal to id, element has already been embedded and can be overwritten, so set embedOn to true.
      3. Otherwise, existing has a parent which is a node definition. Set embedOn to true if any of the items in parent property is a node definition or node reference for id because the embed can be overwritten.
      4. If embedOn is true, existing is already embedded but can be overwritten, so Remove Embedded Definition for id.
    5. If embedOn is false, add output to parent by either appending to parent if it is an array, or appending to active property in parent otherwise.
    6. Otherwise:
      1. Add embed to map of embeds for id.
      2. Process each property and value in the matched subject, ordered by property:
        1. If property is a keyword, add property and a copy of value to output and continue with the next property from subject.
        2. If property is not in frame:
          1. If explicitOn is false, Embed values from subject in output using subject as element and property as active property.
          2. Continue to next property.
        3. Process each item from value as follows:
          1. If item is a JSON object with the key @list, then create a JSON object named list with the key @list and the value of an empty array. Append list to property in output. Process each listitem in the @list array as follows:
            1. If listitem is a node reference process listitem recursively using this algorithm passing a new map of subjects that contains the @id of listitem as the key and the node definition from the original map of flattened subjects as the value. Pass the first value from frame for property as frame, list as parent, and @list as active property.
            2. Otherwise, append a copy of listitem to @list in list.
          2. If item is a node reference process item recursively using this algorithm passing a new map as subjects that contains the @id of item as the key and the node definition from the original map of flattened subjects as the value. Pass the first value from frame for property as frame, output as parent, and property as active property.
            Issue
            Passing a node reference doesn't work if this map is used recursively. Presently pass node definition from original map of flattened subjects.
          3. Otherwise, append a copy of item to active property in output.
      3. Process each property and value in frame, where property is not a keyword, ordered by property:
        1. Set property frame to the first item in value or a newly created JSON object if value is empty.
        2. Skip to the next property in frame if property is in output, if property frame contains @omitDefault which is true, or if it does not contain @omitDefault but the value of omit default flag is true.
        3. Set the value of property in output to a new JSON object with a property @preserve and a value that is a copy of the value of @default in frame if it exists, or the string @null otherwise.
      4. Add output to parent. If parent is an array, append output, otherwise append output to active property in parent.

At the completion of the recursive algorithm, results will contain the top-level node definitions.
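
Step 2 of the recursive algorithm, the selection of matched subjects, can be sketched in Python as shown below. The sketch assumes expanded data in which @type is an array of IRI strings and in which a wildcard is written as an array holding a single empty object; the function names are illustrative.

```python
def is_keyword(key):
    """Simplified keyword test: any key that starts with "@"."""
    return key.startswith("@")


def matches_frame(node, frame):
    """Decide whether a node definition is selected by a frame (step 2)."""
    frame_types = frame.get("@type")
    if frame_types:
        if frame_types == [{}]:
            # Step 2.2: wildcard, any node that has a @type at all.
            return "@type" in node
        # Step 2.1: at least one of the listed types must be present.
        node_types = node.get("@type", [])
        return any(type_iri in node_types for type_iri in frame_types)
    # Step 2.3: duck typing on all non-keyword properties of the frame.
    return all(key in node for key in frame if not is_keyword(key))


def filter_subjects(subjects, frame):
    """Return the matched subjects, ordered by id as step 4 requires."""
    return {
        node_id: subjects[node_id]
        for node_id in sorted(subjects)
        if matches_frame(subjects[node_id], frame)
    }


subjects = {
    "ex:lib": {"@id": "ex:lib", "@type": ["ex:Library"], "ex:contains": [{"@id": "ex:book"}]},
    "ex:book": {"@id": "ex:book", "@type": ["ex:Book"], "ex:title": [{"@value": "The Republic"}]},
    "ex:note": {"@id": "ex:note", "ex:title": [{"@value": "Untyped"}]},
}
print(list(filter_subjects(subjects, {"@type": ["ex:Book"]})))
print(list(filter_subjects(subjects, {"@type": [{}]})))
print(list(filter_subjects(subjects, {"ex:title": [{}]})))
```

Note that a frame without @type and without non-keyword properties matches every node definition under this reading of step 2.3.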

The final two steps of the framing algorithm require results to be compacted according to the Compaction Algorithm by using the context provided in the input frame. If the frame has no context, compaction is performed with an empty context (not a null context). The compaction result must use the @graph keyword at the top-level, even if the context is empty or if there is only one element to put in the @graph array. Subsequently, replace all key-value pairs where the key is @preserve with the value from the key-value pair. If the value from the key-value pair is @null, replace the value with null. If, after replacement, an array contains only the value null, remove the value, leaving an empty array. The resulting value is the final JSON-LD output.
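
The @preserve clean-up described above can be sketched as a recursive walk over the compacted result. The function name `remove_preserve` and the sample data are assumptions; compaction itself is not shown, and the shape of the sample is only meant to exercise the three replacement rules.

```python
def remove_preserve(value):
    """Replace @preserve wrappers and turn the string "@null" into None."""
    if isinstance(value, list):
        cleaned = [remove_preserve(item) for item in value]
        # An array that contains only null becomes an empty array.
        return [] if cleaned == [None] else cleaned
    if isinstance(value, dict):
        if "@preserve" in value:
            preserved = value["@preserve"]
            return None if preserved == "@null" else preserved
        return {key: remove_preserve(item) for key, item in value.items()}
    return value


compacted = {
    "@graph": [{
        "@id": "http://example.org/book",
        "title": {"@preserve": "@null"},
        "author": {"@preserve": "Unknown"},
        "tags": [{"@preserve": "@null"}],
    }]
}
print(remove_preserve(compacted))
```

In the sample, title becomes null, author becomes the preserved default, and tags becomes an empty array.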

3.3.3 Remove Embedded Definition

This algorithm replaces an already embedded node definition with a node reference. It then recursively removes any entries in the map of embeds that had the removed node definition in their parent chain.

Issue

About as clear as mud

The current behaviour, which avoids embedding the same data multiple times in the result, makes it difficult to work with the output. A proposal to change this to "aggressive re-embedding" is currently being discussed.

The algorithm is invoked with a framing context and subject id id.

  1. Find embed from map of embeds for id.
  2. Let parent and property be from embed.
  3. If parent is an array, replace the node definition that matches id with a node reference. If parent is a JSON object, replace the node definition for property that matches id with a node reference.
  4. Remove dependents for id in map of embeds by scanning the map for entries with parent that have an @id of id, removing that definition from the map, and then removing the dependents for the parent id recursively by repeating this step. This step will terminate when there are no more embed entries containing the removed node definition's @id in their parent chain.
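
The steps above can be sketched in Python as follows. The sketch assumes that the map of embeds is a dictionary from subject id to an entry with `parent` and `property` keys, and that property values in a parent JSON object are arrays, as in expanded form. The function names are illustrative.

```python
def remove_embed(embeds, node_id):
    """Replace the embedded definition of node_id with a node reference."""
    embed = embeds[node_id]                            # step 1
    parent, prop = embed["parent"], embed["property"]  # step 2
    reference = {"@id": node_id}
    # Step 3: the definition sits either directly in a parent array or in
    # the values of `prop` on a parent JSON object.
    siblings = parent if isinstance(parent, list) else parent[prop]
    for index, item in enumerate(siblings):
        if isinstance(item, dict) and item.get("@id") == node_id:
            siblings[index] = reference
            break
    remove_dependents(embeds, node_id)                 # step 4


def remove_dependents(embeds, node_id):
    """Drop embed entries whose parent chain passes through node_id."""
    for dependent_id in list(embeds):
        entry = embeds.get(dependent_id)
        if entry is None:
            continue  # already removed by a deeper recursive call
        parent = entry["parent"]
        if isinstance(parent, dict) and parent.get("@id") == node_id:
            del embeds[dependent_id]
            remove_dependents(embeds, dependent_id)


carol = {"@id": "ex:carol", "ex:name": [{"@value": "Carol"}]}
bob = {"@id": "ex:bob", "ex:knows": [carol]}
alice = {"@id": "ex:alice", "ex:knows": [bob]}
embeds = {
    "ex:bob": {"parent": alice, "property": "ex:knows"},
    "ex:carol": {"parent": bob, "property": "ex:knows"},
}
remove_embed(embeds, "ex:bob")
print(alice, sorted(embeds))
```

After the call, alice only references bob, and the entry for carol is gone because it was embedded inside the removed definition. The entry for bob itself is left for the caller to overwrite.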

3.3.4 Embed Values

This algorithm recursively embeds property values in node definition output, given a framing context, input node definition element, active property, and output.

  1. For each item in active property of element:
    1. If item is a JSON object with the key @list, then create a new JSON object with a key @list and a value of an empty array and add it to output, appending if output is an array, and appending to active property otherwise. Recursively call this algorithm passing item as element, @list as active property, and the new array as output. Continue to the next item.
    2. If item is a node reference:
      1. If map of embeds does not contain an entry for the @id of item:
        1. Initialize embed with output as parent and active property as property and add to map of embeds.
        2. Initialize a new node definition o to act as the embedded node definition.
        3. For each property and value in the expanded definition for item in subjects:
          1. Add property and a copy of value to o if property is a keyword.
          2. Otherwise, recursively call this algorithm passing value as element, property as active property and o as output.
      2. Set item to o.
    3. If output is an array, append a copy of item, otherwise append a copy of item to active property in output.
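
A rough Python rendering of this algorithm is given below. It rests on several interpretive assumptions that go beyond the text: the framing context is a dictionary with `subjects` and `embeds` keys, the recursive call receives the referenced node definition as element (so that the active property can be looked up in it), a node that already has an entry in the map of embeds is emitted as a plain node reference, and a freshly built embedded node is appended as-is rather than copied so that parent entries in the map of embeds keep pointing at the objects in the output.

```python
import copy


def is_node_reference(value):
    return isinstance(value, dict) and list(value.keys()) == ["@id"]


def add_output(output, prop, value):
    """Append to an array output, or to the `prop` array of an object."""
    if isinstance(output, list):
        output.append(value)
    else:
        output.setdefault(prop, []).append(value)


def embed_values(state, element, prop, output):
    for item in element[prop]:
        if isinstance(item, dict) and "@list" in item:
            new_list = {"@list": []}
            add_output(output, prop, new_list)
            embed_values(state, item, "@list", new_list["@list"])
            continue
        if is_node_reference(item) and item["@id"] not in state["embeds"]:
            node_id = item["@id"]
            state["embeds"][node_id] = {"parent": output, "property": prop}
            embedded = {}
            definition = state["subjects"][node_id]
            for key, value in definition.items():
                if key.startswith("@"):  # simplified keyword test
                    embedded[key] = copy.deepcopy(value)
                else:
                    embed_values(state, definition, key, embedded)
            add_output(output, prop, embedded)
        else:
            add_output(output, prop, copy.deepcopy(item))


subjects = {
    "ex:a": {"@id": "ex:a", "ex:knows": [{"@id": "ex:b"}]},
    "ex:b": {"@id": "ex:b", "ex:name": [{"@value": "Bob"}], "ex:knows": [{"@id": "ex:a"}]},
}
# ex:a is registered as already embedded, so the cycle back to it stops
# at a node reference.
state = {"subjects": subjects, "embeds": {"ex:a": {"parent": [], "property": None}}}
output = {"@id": "ex:a"}
embed_values(state, subjects["ex:a"], "ex:knows", output)
print(output)
```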

A. IANA Considerations

This section is non-normative.

This section is included merely for standards community review and will be submitted to the Internet Engineering Steering Group if this specification becomes a W3C Recommendation.

application/ld-frame+json

Type name:
application
Subtype name:
ld-frame+json
Required parameters:
None
Optional parameters:
None
Encoding considerations:
The same as the application/json MIME media type.
Security considerations:
Since a JSON-LD frame is intended to specify a deterministic layout for a JSON-LD graph, the serialization should not be passed through a code execution mechanism such as JavaScript's eval() function. It is recommended that a conforming parser does not attempt to directly evaluate the JSON-LD frame and instead purely parse the input into a language-native data structure.
Interoperability considerations:
Not Applicable
Published specification:
The JSON-LD specification.
Applications that use this media type:
Any programming environment that requires the exchange of directed graphs. Implementations of JSON-LD have been created for JavaScript, Python, Ruby, PHP and C++.
Additional information:
Magic number(s):
Not Applicable
File extension(s):
.jsonldf
Macintosh file type code(s):
TEXT
Person & email address to contact for further information:
Manu Sporny <msporny@digitalbazaar.com>
Intended usage:
Common
Restrictions on usage:
None
Author(s):
Manu Sporny, Gregg Kellogg, Markus Lanthaler, Dave Longley
Change controller:
W3C

Fragment identifiers have no meaning with application/ld-frame+json resources.
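
The security consideration above recommends parsing a frame into a language-native data structure rather than evaluating it. In Python this might look like the following sketch, which uses the standard `json` module. The function name `parse_frame` and the check for a top-level JSON object are assumptions made for illustration.

```python
import json


def parse_frame(text):
    """Parse a serialized frame without executing any of its content."""
    frame = json.loads(text)  # pure parsing, unlike eval()
    if not isinstance(frame, dict):
        raise ValueError("expected the frame to be a JSON object")
    return frame


frame = parse_frame('{"@type": "http://example.org/vocab#Library"}')
print(frame["@type"])
```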

B. Acknowledgements

Many thanks go out to the JSON-LD Community Group participants who worked through many of the technical issues on the mailing list and the weekly telecons - of special mention are Niklas Lindström, François Daoust, and Zdenko 'Denny' Vrandečić. The editors would like to thank Mark Birbeck, who provided a great deal of the initial push behind the JSON-LD work via his work on RDFj. The work of Dave Lehn and Mike Johnson in reviewing and performing several implementations of the specification is appreciated. Ian Davis is thanked for his work on RDF/JSON. Thanks also to Nathan Rixham, Bradley P. Allen, Kingsley Idehen, Glenn McDonald, Alexandre Passant, Danny Ayers, Ted Thibodeau Jr., Olivier Grisel, Josh Mandel, Eric Prud'hommeaux, David Wood, Guus Schreiber, Pat Hayes, Sandro Hawke, and Richard Cyganiak for their input on the specification.

C. References

C.1 Normative references

[JSON-LD]
The JSON-LD Syntax Manu Sporny, Gregg Kellogg, Markus Lanthaler, Editors. World Wide Web Consortium (work in progress). 22 May 2012. Editor's Draft. This edition of the JSON-LD Syntax specification is http://json-ld.org/spec/ED/json-ld-syntax/20120522/. The latest edition of the JSON-LD Syntax is available at http://json-ld.org/spec/latest/json-ld-syntax/
[JSON-LD-API]
The JSON-LD API 1.0 Manu Sporny, Gregg Kellogg, Dave Longley, Markus Lanthaler, Editors. World Wide Web Consortium (work in progress). 24 May 2012. Editor's Draft. This edition of the JSON-LD API specification is http://json-ld.org/spec/ED/json-ld-api/20120524/. The latest edition of the JSON-LD API is available at http://json-ld.org/spec/latest/json-ld-api/
[RDF-CONCEPTS]
RDF 1.1 Concepts and Abstract Syntax Richard Cyganiak, David Wood, Editors. World Wide Web Consortium (work in progress). 30 May 2012. Editor's Draft. This edition of RDF 1.1 Concepts and Abstract Syntax is http://www.w3.org/TR/2011/WD-rdf11-concepts-20110830/. The latest edition of RDF 1.1 Concepts and Abstract Syntax is available at http://www.w3.org/TR/rdf11-concepts/
[RFC3987]
M. Dürst; M. Suignard. Internationalized Resource Identifiers (IRIs). January 2005. Internet RFC 3987. URL: http://www.ietf.org/rfc/rfc3987.txt
[RFC4627]
D. Crockford. The application/json Media Type for JavaScript Object Notation (JSON) July 2006. Internet RFC 4627. URL: http://www.ietf.org/rfc/rfc4627.txt
[WEBIDL]
Web IDL Cameron McCormack, Editor. World Wide Web Consortium. 19 April 2012. Candidate Recommendation. This edition of Web IDL is http://www.w3.org/TR/2012/CR-WebIDL-20120419/. The latest edition of Web IDL is available at http://dev.w3.org/2006/webapi/WebIDL/

C.2 Informative references

[ECMA-262]
ECMAScript Language Specification. June 2011. URL: http://www.ecma-international.org/publications/standards/Ecma-262.htm