JSON-LD - Linked Data Expression in JSON

1. How to Read this Document
2. Introduction
3. Design Goals
4. Design Rationale
- 4.1 Map Terms to IRIs
5. Mashing Up Vocabularies
6. An Example of a Default Context
7. The JSON-LD Processing Algorithm
8. Markup Examples
9. Markup of RDF Concepts
10. Advanced Concepts
- 10.1 Disjoint Graphs
- 10.2 Acknowledgements
A. References
- A.1 Normative references
- A.2 Informative references

1. How to Read this Document

This document is a detailed specification for a serialization of Linked Data in JSON. The document is primarily intended for the following audiences:

Developers that want to encode Microformats, RDFa, or Microdata in a way that is cross-language compatible via JSON.
Developers that want to understand the encoding possibilities.

To understand this specification you must first be familiar with JSON, which is detailed in [RFC4627].

2. Introduction

Write the introduction once all of the technical details are hammered out. Explain why JSON-LD is designed as a light-weight mechanism to express RDFa, Microformats and Microdata. It is primarily intended as a way to express Linked Data in Javascript environments as well as a way to pass Linked Data to and from Web services. It is designed to be as simple as possible, building on the large number of JSON parsers, and the widespread understanding of JSON, that already exist. It is designed to be able to express key-value pairs, RDF data, Microformats data, and Microdata (that is, every data model currently in use) using one unified format. It does not require anyone to change their JSON, but makes it easy to add meaning by adding context in a way that is out-of-band. It is designed to not disturb already deployed systems running on JSON, but to provide a smooth migration path from JSON to JSON with added semantics. Finally, the format is intended to be fast to parse, fast to generate, compatible with both stream-based and document-based processing, and to require only a tiny memory footprint in order to operate.

3. Design Goals

A number of design considerations were explored during the creation of this markup language:

Simplicity: Developers don't need to know RDF in order to use the basic functionality provided by JSON-LD.
Compatibility: The JSON-LD markup should be 100% compatible with JSON.
Expressiveness: All major RDF concepts must be expressible via the JSON-LD syntax.
Terseness: The JSON-LD syntax must be very terse and human readable.
Zero Edits: JSON-LD provides a mechanism that allows developers to specify context in a way that is out-of-band. This allows organizations that have already deployed large JSON-based infrastructure to add meaning to their JSON in a way that is not disruptive to their day-to-day operations and is transparent to their current customers.
Streaming: The format supports both document-based and stream-based processing.

4. Design Rationale

The following section outlines the rationale behind the JSON-LD markup language.

4.1 Map Terms to IRIs

Establishing a mechanism to map JSON values to IRIs will help in the mapping of JSON objects to RDF. This does not mean that JSON-LD must be restrictive in declaring a set of terms; rather, experimentation and innovation should be supported as part of the core design of JSON-LD. There are, however, a number of very small design criteria that can ensure that developers will generate good RDF data that will create value for the greater semantic web community and the JSON/REST-based Web Services community.

We will be using the following JSON object as the example for this section:

{
  "a": "Person",
  "name": "Manu Sporny",
  "homepage": "http://manu.sporny.org/"
}

The Default Context

A default context is used in RDFa to allow developers to use keywords as aliases for IRIs. So, for instance, the keyword name above could refer to the IRI http://xmlns.com/foaf/0.1/name. The semantic web, just like the document-based web, uses IRIs for unambiguous identification. The idea is that these terms mean something, which you will eventually want to query. The semantic web specifies this via Vocabulary Documents. The IRI http://xmlns.com/foaf/0.1/ specifies a Vocabulary Document, and name is a term in that vocabulary. Paste the two items together and you have an unambiguous identifier for a term.

Developers, and machines, would be able to use this IRI (plugging it directly into a web browser, for instance) to go to the term and get a definition of what the term means, much like we can use WordNet today to see the definition of words in the English language. Machines need the same sort of dictionary of terms, and IRIs provide a way to ensure that these terms are unambiguous.

Non-prefixed terms should have term mappings declared in the default context so that they may be expanded later.

If a set of terms, like Person, name, and homepage, are pre-defined in the default context, and that context is used to resolve the names in JSON objects, machines could automatically expand the terms to something meaningful and unambiguous, like this:

{
  "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://xmlns.com/foaf/0.1/Person",
  "http://xmlns.com/foaf/0.1/name": "Manu Sporny",
  "http://xmlns.com/foaf/0.1/homepage": "<http://manu.sporny.org/>"
}

In order to differentiate between plain text and IRIs, the < and > characters are used around IRIs.

Doing this would mean that JSON would start to become unambiguously machine-readable, play well with the semantic web, and basic markup wouldn't be that much more complex than basic JSON markup. A win, all around.
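The expansion described above can be sketched in Python. This is only an illustration under stated assumptions, not part of the specification: the names expand_terms and DEFAULT_CONTEXT are hypothetical, the a key is assumed to be shorthand for rdf:type, and only the value of a is looked up as a term. Wrapping IRI values in < and > is left out of the sketch.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical default context holding the three terms used in this section.
DEFAULT_CONTEXT = {
    "Person": "http://xmlns.com/foaf/0.1/Person",
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
}


def expand_terms(obj, context):
    """Return a copy of obj with known terms replaced by full IRIs."""
    expanded = {}
    for key, value in obj.items():
        if key == "a":
            # Assumption: "a" stands for rdf:type and its value is a term.
            expanded[RDF_TYPE] = context.get(value, value)
        else:
            # Keys that are not in the context are kept unchanged.
            expanded[context.get(key, key)] = value
    return expanded


person = {
    "a": "Person",
    "name": "Manu Sporny",
    "homepage": "http://manu.sporny.org/",
}
expanded = expand_terms(person, DEFAULT_CONTEXT)
```

Keeping the context as a plain dictionary means that a Web Service could swap in a different set of terms without touching the JSON it already accepts.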

5. Mashing Up Vocabularies

Developers would also benefit by allowing other vocabularies to be used automatically with their JSON API. There are over 200 Vocabulary Documents that are available for use on the Web today. Some of these vocabularies are:

XSD - for specifying basic types like strings, integers, dates and times.
Dublin Core - for describing creative works.
FOAF - for describing social networks.
Calendar - for specifying events.
SIOC - for describing discussions on blogs and websites.
CCrel - for describing Creative Commons and other types of licenses.
GEO - for describing geographic location.
VCard - for describing organizations and people.
DOAP - for describing projects.

A JSON-LD Web Service could define these as prefixes in their default context beside the terms that are already defined. Using this feature, developers could also express markup like this:

{
  "rdf:type": "foaf:Person",
  "foaf:name": "Manu Sporny",
  "foaf:homepage": "<http://manu.sporny.org/>",
  "sioc:avatar": "<http://twitter.com/account/profile_image/manusporny>"
}

Developers can also specify their own Vocabulary documents by modifying the default context in-line using the # character, like so:

{
  "#": { "myvocab": "http://example.org/myvocab#" },
  "a": "foaf:Person",
  "foaf:name": "Manu Sporny",
  "foaf:homepage": "<http://manu.sporny.org/>",
  "sioc:avatar": "<http://twitter.com/account/profile_image/manusporny>",
  "myvocab:credits": 500
}

Think of the # character as a "hashtable", which maps one string to another string. In the example above, the myvocab prefix is replaced with "http://example.org/myvocab#" wherever it is detected, so "myvocab:credits" would expand to "http://example.org/myvocab#credits".

This mechanism is a short-hand for RDF, and if defined, will give developers an unambiguous way to map any JSON value to RDF.
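A minimal Python sketch of this prefix mechanism follows. The function names merge_context and expand_curie are hypothetical, and the sketch assumes that a prefix is whatever precedes the first colon and that values with an unknown prefix are returned unchanged.

```python
def merge_context(active, local):
    """Return a new context in which local mappings override active ones."""
    merged = dict(active)
    merged.update(local)
    return merged


def expand_curie(value, context):
    """Expand prefix:reference using context; otherwise return value as-is."""
    prefix, sep, reference = value.partition(":")
    if sep and prefix in context:
        return context[prefix] + reference
    return value


# The default context knows foaf; the document adds myvocab through "#".
active = {"foaf": "http://xmlns.com/foaf/0.1/"}
doc = {"#": {"myvocab": "http://example.org/myvocab#"}, "myvocab:credits": 500}
context = merge_context(active, doc["#"])
credits_iri = expand_curie("myvocab:credits", context)
```

Because merge_context copies the active context before updating it, a local context only affects the object that declares it.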

6. An Example of a Default Context

JSON-LD strives to ensure that developers don't have to change the JSON that is going into and being returned from their Web applications. A JSON-LD aware Web Service may define a default context. For example, the following default context could apply to all incoming Web Service calls previously accepting only JSON data:

{
  "#": 
  {
    "__vocab__": "http://example.org/default-vocab#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "cc": "http://creativecommons.org/ns#",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "cal": "http://www.w3.org/2002/12/cal/ical#",
    "doap": "http://usefulinc.com/ns/doap#",
    "Person": "http://xmlns.com/foaf/0.1/Person",
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage"
  }
}

The __vocab__ prefix is a special prefix that states that any term that doesn't resolve to a term or a prefix should be appended to the __vocab__ IRI. This is done to ensure that terms can be transformed to an IRI at all times.
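The resolution order implied here (term, then prefix, then __vocab__) can be sketched as follows. The function name resolve is hypothetical, the context shown is a subset of the example above, and the sketch assumes that a __vocab__ entry is always present. Keys that are already absolute IRIs are not treated specially.

```python
CONTEXT = {
    "__vocab__": "http://example.org/default-vocab#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "name": "http://xmlns.com/foaf/0.1/name",
}


def resolve(key, context):
    """Resolve key to an IRI: as a term, as a CURIE, or through __vocab__."""
    if key in context:
        # The key is a term declared directly in the context.
        return context[key]
    prefix, sep, reference = key.partition(":")
    if sep and prefix in context:
        # The key is a CURIE whose prefix is declared in the context.
        return context[prefix] + reference
    # Fallback: append the key to the __vocab__ IRI.
    return context["__vocab__"] + key
```

With this context, resolve("credits", CONTEXT) falls through to the last branch because credits is neither a declared term nor a CURIE.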

7. The JSON-LD Processing Algorithm

The processing algorithm described in this section is provided in order to demonstrate how one might implement a JSON-LD processor. Conformant implementations are only required to produce the same type and number of triples during the output process and are not required to implement the algorithm exactly as described.

The Processing Algorithm is a work in progress: there are still major bugs in the algorithm and it is difficult to follow. It is provided only to very early implementers to give them an idea of how to implement a processor.

Processing Algorithm Terms

default context - a context that is specified to the JSON-LD processing algorithm before processing begins.
default graph - the destination graph for all triples generated by JSON-LD markup.
active subject - the currently active subject that the processor should use when generating triples.
inherited subject - a subject that was detected at a higher level of processing to be used to generate a triple once a current subject is discovered or generated.
active property - the currently active property that the processor should use when generating triples.
inherited property - a property that was detected at a higher level of processing to be used to generate a triple once a current subject is discovered or generated.
active object - the currently active object that the processor should use when generating triples.
active context - a context that is used to resolve CURIEs while the processing algorithm is running. The active context is the top-most item on the active context stack.
local context - a context that is specified at the JSON associative-array level, specified via the # key.
list of incomplete triples - A list of triples that have yet to have their subject set.
list of unprocessed items - A list of objects that cannot be processed until a local context is detected or the end of the current associative-array is detected.
processor state - the processor state, which includes the active context stack, active subject, active property, list of incomplete triples, and the list of unprocessed items.

The algorithm below is designed for streaming (SAX-based) implementations. Implementers will find that non-streaming (document-based) implementations will be much easier to implement as full access to the JSON object model eliminates some of the steps that are necessary for streaming implementations. A conforming JSON-LD processor must implement a processing algorithm that results in the same default graph that the following algorithm generates:

1. If a default context is supplied to the processing algorithm, push it onto the active context stack.
2. If an associative array is detected, create a new processor state. Copy the current context stack to the newly created processor state. Push the active context onto the newly created processor state's active context stack. For each key-value pair in the associative array, using the newly created processor state do the following:
   2.1. If a # key is found, the processor merges each key-value pair in the local context into the active context, overwriting any duplicate values in the active context. Process each object in the list of unprocessed items, starting at Step 2.2.
   2.2. If the local context has not been detected, the current key-value pair is placed into the list of unprocessed items and processing proceeds to the next key-value pair. Otherwise, if the local context is known, perform the following steps:
3. If a regular array is detected, process each value in the array by doing the following:
   3.1. If the value is an associative array, process it per Step 2, making sure to set the inherited subject to the active subject and the inherited property to the active property in the newly created processor state.
   3.2. If the value is not an array, set the active object by performing Object Processing on the value. Generate a triple representing the active subject, the active property and the active object and place it into the default graph.
   3.3. If the value is a regular array, should we support RDF List/Sequence Processing?
        If the value is a regular array, should we support RDF List/Sequence generation of triples? For example, would implementing this be worth the more complex processing rules: "ex:orderedItems" : [["one", "two", "three"]]
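For comparison with the streaming steps above, a non-streaming (document-based) processor might be sketched as follows. This is not the algorithm described in this section and makes several assumptions: the names process and to_triples are hypothetical, blank node labels of the form _:b1 are invented for objects that lack an @ key, and expansion of terms, CURIEs and literal values is omitted entirely. Nested regular arrays (the open question above) are not handled.

```python
import itertools

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def process(node, context, graph, counter):
    """Process one associative array (Step 2) and return its subject."""
    # A new processor state: copy the context so siblings are unaffected.
    context = dict(context)
    # Step 2.1: merge the local context, overwriting duplicate entries.
    # With the whole object in memory, no list of unprocessed items is
    # needed. The merged context is only passed down here, because this
    # sketch does not expand keys or values.
    context.update(node.get("#", {}))
    # Assumption: an object without "@" gets a generated blank node label.
    subject = node.get("@") or "_:b%d" % next(counter)
    for key, value in node.items():
        if key in ("#", "@"):
            continue
        prop = RDF_TYPE if key == "a" else key
        # Step 3: a regular array contributes one triple per value.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # Step 3.1: recursion plays the role of the inherited
                # subject and inherited property.
                item = process(item, context, graph, counter)
            graph.append((subject, prop, item))
    return subject


def to_triples(doc, default_context=None):
    """Return the default graph for doc as a list of (s, p, o) tuples."""
    graph = []
    counter = itertools.count(1)
    nodes = doc if isinstance(doc, list) else [doc]
    for node in nodes:
        # Step 1: the default context seeds the active context.
        process(node, default_context or {}, graph, counter)
    return graph
```

Having the full object model available removes the need to defer key-value pairs until the # key has been seen, which is the main simplification over a streaming implementation.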

8. Markup Examples

The JSON-LD markup examples below demonstrate how JSON-LD can be used to express semantic data marked up in other languages such as RDFa, Microformats, and Microdata. These sections are merely provided as proof that JSON-LD is very flexible in what it can express across different Linked Data approaches.

8.1 RDFa

The following example describes three people with their respective names and homepages.

<div prefix="foaf: http://xmlns.com/foaf/0.1/">
   <ul>
      <li typeof="foaf:Person">
        <a rel="foaf:homepage" href="http://example.com/bob/" property="foaf:name" >Bob</a>
      </li>
      <li typeof="foaf:Person">
        <a rel="foaf:homepage" href="http://example.com/eve/" property="foaf:name" >Eve</a>
      </li>
      <li typeof="foaf:Person">
        <a rel="foaf:homepage" href="http://example.com/manu/" property="foaf:name" >Manu</a>
      </li>
   </ul>
</div>

An example JSON-LD implementation is described below; however, there are other ways to mark up this information such that the context is not repeated.

[
 {
   "#": { "foaf": "http://xmlns.com/foaf/0.1/" },
   "@": "_:bnode1",
   "a": "foaf:Person",
   "foaf:homepage": "<http://example.com/bob/>",
   "foaf:name": "Bob"
 },
 {
   "#": { "foaf": "http://xmlns.com/foaf/0.1/" },
   "@": "_:bnode2",
   "a": "foaf:Person",
   "foaf:homepage": "<http://example.com/eve/>",
   "foaf:name": "Eve"
 },
 {
   "#": { "foaf": "http://xmlns.com/foaf/0.1/" },
   "@": "_:bnode3",
   "a": "foaf:Person",
   "foaf:homepage": "<http://example.com/manu/>",
   "foaf:name": "Manu"
 }
]

8.2 Microformats

The following example uses a simple Microformats hCard example to express how the Microformat is represented in JSON-LD.

<div class="vcard">
 <a class="url fn" href="http://tantek.com/">Tantek Çelik</a>
</div>

The representation of the hCard expresses the Microformat terms in the context and uses them directly for the url and fn properties. Also note that the Microformat to JSON-LD processor has generated the proper URL type for http://tantek.com.

{
  "#": 
  {
    "vcard": "http://microformats.org/profile/hcard#vcard",
    "url": "http://microformats.org/profile/hcard#url",
    "fn": "http://microformats.org/profile/hcard#fn"
  },
  "@": "_:bnode1",
  "a": "vcard",
  "url": "<http://tantek.com/>",
  "fn": "Tantek Çelik"
}

8.3 Microdata

The Microdata example below expresses book information as a Microdata Work item.

<dl itemscope
    itemtype="http://purl.org/vocab/frbr/core#Work"
    itemid="http://purl.oreilly.com/works/45U8QJGZSQKDH8N">
 <dt>Title</dt>
 <dd><cite itemprop="http://purl.org/dc/terms/title">Just a Geek</cite></dd>
 <dt>By</dt>
 <dd><span itemprop="http://purl.org/dc/terms/creator">Wil Wheaton</span></dd>
 <dt>Format</dt>
 <dd itemprop="http://purl.org/vocab/frbr/core#realization"
     itemscope
     itemtype="http://purl.org/vocab/frbr/core#Expression"
     itemid="http://purl.oreilly.com/products/9780596007683.BOOK">
  <link itemprop="http://purl.org/dc/terms/type" href="http://purl.oreilly.com/product-types/BOOK">
  Print
 </dd>
 <dd itemprop="http://purl.org/vocab/frbr/core#realization"
     itemscope
     itemtype="http://purl.org/vocab/frbr/core#Expression"
     itemid="http://purl.oreilly.com/products/9780596802189.EBOOK">
  <link itemprop="http://purl.org/dc/terms/type" href="http://purl.oreilly.com/product-types/EBOOK">
  Ebook
 </dd>
</dl>

Note that the JSON-LD representation of the Microdata information stays true to the desires of the Microdata community to avoid contexts and instead refer to items by their full IRI.

[
  {
    "@": "<http://purl.oreilly.com/works/45U8QJGZSQKDH8N>",
    "a": "<http://purl.org/vocab/frbr/core#Work>",
    "http://purl.org/dc/terms/title": "Just a Geek",
    "http://purl.org/dc/terms/creator": "Wil Wheaton",
    "http://purl.org/vocab/frbr/core#realization": 
      ["<http://purl.oreilly.com/products/9780596007683.BOOK>", "<http://purl.oreilly.com/products/9780596802189.EBOOK>"]
  },
  {
    "@": "<http://purl.oreilly.com/products/9780596007683.BOOK>",
    "a": "<http://purl.org/vocab/frbr/core#Expression>",
    "http://purl.org/dc/terms/type": "<http://purl.oreilly.com/product-types/BOOK>"
  },
  {
    "@": "<http://purl.oreilly.com/products/9780596802189.EBOOK>",
    "a": "<http://purl.org/vocab/frbr/core#Expression>",
    "http://purl.org/dc/terms/type": "<http://purl.oreilly.com/product-types/EBOOK>"
  }
]

9. Markup of RDF Concepts

JSON-LD is designed to ensure that most Linked Data concepts can be marked up in a way that is simple to understand and author by Web developers. In many cases, Javascript objects can become Linked Data with the simple addition of a context. Since RDF is also an important sub-community of the Linked Data movement, it is important that all RDF concepts are well-represented in this specification. This section details how each RDF concept can be expressed in JSON-LD.

9.1 IRIs

Expressing IRIs is fundamental to Linked Data, as that is how most subjects and many objects are identified. IRIs can be expressed by wrapping a text string with the < and > characters.

{
...
  "foaf:homepage": "<http://manu.sporny.org>",
...
}

The example above would set the object to an IRI with the value of http://manu.sporny.org.

Wrapping IRIs with the < and > characters is only necessary when IRIs are specified as objects. At no other point do you need to wrap an IRI. You do not need to wrap IRIs when declaring a property, declaring a CURIE, or describing key-value pairs in a context.
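A processor needs to tell wrapped IRIs apart from plain text in object position. One way to do that is sketched below; the function name parse_iri is hypothetical, and the sketch does not take escaped characters into account.

```python
def parse_iri(value):
    """Return the IRI inside <...>, or None if value is not a wrapped IRI."""
    if (
        isinstance(value, str)
        and len(value) > 2
        and value.startswith("<")
        and value.endswith(">")
    ):
        # Strip the surrounding < and > characters.
        return value[1:-1]
    return None
```

Returning None for everything else lets the caller fall back to literal processing.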

9.2 Identifying the Subject

A subject is declared using the @ key. The subject is the first piece of information needed by the JSON-LD processor in order to create the (subject, predicate, object) tuple, also known as a triple.

{
...
  "@": "<http://example.org/people#joebob>",
...
}

The example above would set the subject to <http://example.org/people#joebob>.

9.3 Specifying the Type

The type of a particular subject can be specified using the a key. Specifying the type in this way will generate a triple of the form (subject, type, type-url).

{
...
  "@": "<http://example.org/people#joebob>",
  "a": "<http://xmlns.com/foaf/0.1/Person>",
...
}

The example above would generate the following triple (in N-Triples notation):

<http://example.org/people#joebob> 
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
      <http://xmlns.com/foaf/0.1/Person> .

9.4 Plain Literals

Regular text strings are called "plain literals" in RDF and are easily expressed using regular JSON strings.

{
...
  "foaf:name": "Mark Birbeck",
...
}

9.5 Language Specification in Plain Literals

JSON-LD attempts to make sure that it is easy to express triples in other languages while simultaneously ensuring that hefty data structures aren't required to accomplish simple language markup. When the @ symbol is used in a literal, the JSON-LD processor tags the literal text with the language tag that follows the @ symbol.

{
...
  "foaf:name": "花澄@ja",
...
}

The example above would generate a plain literal for 花澄 and associate the ja language tag with the triple that is generated. Languages must be expressed in [BCP47] format.
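Splitting the language tag from the literal text might look like the following sketch. The function name split_language is hypothetical. The sketch assumes that the tag follows the last @ in the string and that an @ preceded by a backslash is escaped; it does not check that the tag is well-formed according to [BCP47].

```python
def split_language(text):
    """Split 'text@lang' at the last '@'; return (text, None) if untagged."""
    literal, sep, lang = text.rpartition("@")
    if sep and lang and not literal.endswith("\\"):
        return literal, lang
    # No '@' found, nothing after it, or the '@' was escaped.
    return text, None
```

For the example above, split_language("花澄@ja") yields the literal text 花澄 and the language tag ja.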

9.6 Typed Literals

Literals may also be typed in JSON-LD by using the ^^ sequence at the end of the text string.

{
...
  "dc:modified": "2010-05-29T14:17:39+02:00^^xsd:dateTime",
...
}

The example above would generate an object with the value of 2010-05-29T14:17:39+02:00 and the datatype of http://www.w3.org/2001/XMLSchema#dateTime.
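The same idea can be sketched for typed literals. The function name split_datatype is hypothetical. The sketch assumes that the datatype follows the last ^^ sequence and that its prefix, if present in the supplied context, is expanded in the same way as any other CURIE.

```python
def split_datatype(text, context):
    """Split 'value^^type' and expand the type; (text, None) if untyped."""
    literal, sep, datatype = text.rpartition("^^")
    if not sep or literal.endswith("\\"):
        # No "^^" sequence, or the sequence was escaped.
        return text, None
    prefix, colon, reference = datatype.partition(":")
    if colon and prefix in context:
        datatype = context[prefix] + reference
    return literal, datatype


context = {"xsd": "http://www.w3.org/2001/XMLSchema#"}
value, datatype = split_datatype(
    "2010-05-29T14:17:39+02:00^^xsd:dateTime", context
)
```

Using rpartition rather than partition keeps any earlier ^ characters in the literal text intact.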

9.7 Multiple Objects for a Single Property

A JSON-LD author can express multiple triples in a compact way by using arrays. If a subject has multiple values for the same property, the author may express the values of that property as an array.

{
...
  "@": "<http://example.org/people#joebob>",
  "foaf:nick": ["stu", "groknar", "radface"],
...
}

The markup shown above would generate the following triples:

<http://example.org/people#joebob> 
   <http://xmlns.com/foaf/0.1/nick>
      "stu" .
<http://example.org/people#joebob> 
   <http://xmlns.com/foaf/0.1/nick>
      "groknar" .
<http://example.org/people#joebob> 
   <http://xmlns.com/foaf/0.1/nick>
      "radface" .

9.8 Multiple Typed Literals for a Single Property

Multiple typed literals are expressed very much in the same way as multiple properties:

{
...
  "@": "<http://example.org/articles/8>",
  "dc:modified": ["2010-05-29T14:17:39+02:00^^xsd:dateTime", "2010-05-30T09:21:28-04:00^^xsd:dateTime"],
...
}

The markup shown above would generate the following triples:

<http://example.org/articles/8> 
   <http://purl.org/dc/terms/modified>
      "2010-05-29T14:17:39+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<http://example.org/articles/8> 
   <http://purl.org/dc/terms/modified>
      "2010-05-30T09:21:28-04:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .

9.9 Blank Nodes

At times, it becomes necessary to be able to express information without being able to specify the subject. Typically, this is where blank nodes come into play. In JSON-LD, blank node identifiers are automatically created if a subject is not specified using the @ key. However, authors may name blank nodes by using the special _ CURIE prefix.

{
...
  "@": "_:foo",
...
}

The example above would set the subject to _:foo, which can then be used later on in the JSON-LD markup to refer back to the named blank node.
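Automatic creation of blank node identifiers could be sketched as follows. The function name make_subject_namer and the _:b label scheme are assumptions made for illustration; the specification does not prescribe how generated labels look. The sketch does not guard against a generated label colliding with one chosen by the author.

```python
import itertools


def make_subject_namer(prefix="_:b"):
    """Return a function that yields the subject for a JSON-LD object."""
    counter = itertools.count(1)

    def subject_of(node):
        # Use the author's subject if present, else generate a fresh label.
        return node.get("@") or "%s%d" % (prefix, next(counter))

    return subject_of


subject_of = make_subject_namer()
named = subject_of({"@": "_:foo"})
first = subject_of({"foaf:name": "Bob"})
second = subject_of({"foaf:name": "Eve"})
```

Creating one namer per document keeps generated labels unique within that document.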

9.10 Escape Character

Special characters in property values must be escaped in order to not be interpreted as CURIEs, IRIs, language tags, or typed literals.

The special characters in JSON-LD are: <, >, @, #, : and ^.

{
...
  "example:code": "\\<foobar\\^\\^2\\>",
...
}
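Removing the escapes once a value has been classified might be done as in the sketch below. The function name unescape is hypothetical, and the sketch assumes that the escape character is a single backslash placed directly before one of the special characters listed above.

```python
import json
import re

# The special characters listed above: < > @ # : ^
_ESCAPED = re.compile(r"\\([<>@#:^])")


def unescape(text):
    """Remove the backslash in front of each escaped special character."""
    return _ESCAPED.sub(r"\1", text)


# After JSON parsing, "\\<" in the document is a backslash followed by "<".
doc = json.loads(r'{"example:code": "\\<foobar\\^\\^2\\>"}')
code = unescape(doc["example:code"])
```

A processor would need to run this step only after it has decided that the value is not an IRI, a language-tagged literal, or a typed literal, since the escapes are what prevent those interpretations.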

9.11 Automatic Typing

Since JSON is capable of expressing typed information such as decimals, integers and boolean values, JSON-LD utilizes that information to create Typed Literals.

{
...
  // This value is automatically converted to having a type of xsd:decimal
  "measure:cups": 5.3,
  // This value is automatically converted to having a type of xsd:integer
  "chem:protons": 12,
  // This value is automatically converted to having a type of xsd:boolean
  "sensor:active": true,
...
}
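The mapping from JSON value types to datatypes can be sketched in Python using the standard json module. The function name xsd_type is hypothetical; the datatype choices follow the comments in the example above. Note that the comments in that example are illustrative and would have to be removed before the text could be parsed as JSON.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"


def xsd_type(value):
    """Return the datatype IRI for a parsed JSON value, or None."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return XSD + "boolean"
    if isinstance(value, int):
        return XSD + "integer"
    if isinstance(value, float):
        return XSD + "decimal"
    # Strings, arrays and objects are handled by other rules.
    return None


doc = json.loads(
    '{"measure:cups": 5.3, "chem:protons": 12, "sensor:active": true}'
)
types = {key: xsd_type(value) for key, value in doc.items()}
```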

10. Advanced Concepts

There are a few advanced concepts where it is not clear whether or not the JSON-LD specification is going to support the complexity necessary to support each concept. The entire section on Advanced Concepts should be taken with a grain of salt; it is merely a list of possibilities where all of the benefits and drawbacks have not been explored.

10.1 Disjoint Graphs

When serializing an RDF graph that contains two or more sections of the graph which are entirely disjoint, one must use an array to express the graph as two graphs. This may not be acceptable to some authors, who would rather express the information as one graph. Since, by definition, disjoint graphs require there to be two top-level objects, JSON-LD utilizes a mechanism that allows disjoint graphs to be expressed using a single graph.

Assume the following RDF graph:

<http://example.org/people#john> 
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
      <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/people#jane> 
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
      <http://xmlns.com/foaf/0.1/Person> .

Since the two subjects are entirely disjoint from one another, it is impossible to express the RDF graph above using a single JSON-LD associative array.

In JSON-LD, one can use the subject to express disjoint graphs as a single graph:

{
  "#": { "foaf": "http://xmlns.com/foaf/0.1/" },
  "@": 
  [
    {
      "@": "<http://example.org/people#john>",
      "a": "foaf:Person"
    },
    {
      "@": "<http://example.org/people#jane>",
      "a": "foaf:Person"
    }
  ]
}
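Since this section is exploratory, the following Python sketch shows only one possible reading of the markup above: when the value of the @ key is an array, each element is treated as its own top-level object that shares the enclosing context. The function name split_disjoint is hypothetical, and the behaviour is an assumption rather than something this document defines.

```python
def split_disjoint(doc):
    """Return the top-level objects of doc, each carrying the shared context."""
    subject = doc.get("@")
    if not isinstance(subject, list):
        # An ordinary object: nothing to split.
        return [doc]
    shared = doc.get("#", {})
    nodes = []
    for node in subject:
        # The element's own context, if any, overrides the shared one.
        local = dict(shared)
        local.update(node.get("#", {}))
        copy = dict(node)
        if local:
            copy["#"] = local
        nodes.append(copy)
    return nodes


doc = {
    "#": {"foaf": "http://xmlns.com/foaf/0.1/"},
    "@": [
        {"@": "<http://example.org/people#john>", "a": "foaf:Person"},
        {"@": "<http://example.org/people#jane>", "a": "foaf:Person"},
    ],
}
nodes = split_disjoint(doc)
```

Each resulting object can then be processed on its own, which corresponds to the array form used in the earlier examples.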

10.2 Acknowledgements

The editor would like to thank Mark Birbeck, who provided a great deal of the rationale and reasoning behind the JSON-LD work via his work on RDFj, Dave Longley who reviewed and provided feedback on the overall specification and contexts, and Ian Davis, who created RDF/JSON.