Return a nested data structure from a SPARQL query

You could use a CONSTRUCT query with JSON-LD Framing.

Example query (on DBpedia endpoint)

CONSTRUCT
{
  ?person  rdf:type       foaf:Person ;
           dbo:birthName  ?name1s ;
           dbo:birthDate  ?date1s ;
           dbo:spouse     ?spouse .
  ?spouse  rdf:type       foaf:Person ; 
           dbo:birthName  ?name2s ;
           dbo:birthDate  ?date2s .
}
WHERE
{
  ?person  dbo:birthName  ?name1 ;
           dbo:birthDate  ?date1 ;
           dbo:spouse     ?spouse .
  ?spouse  dbo:birthName  ?name2 ;
           dbo:birthDate  ?date2 .
  BIND (str(?name1) AS ?name1s)
  BIND (str(?date1) AS ?date1s)
  BIND (str(?name2) AS ?name2s)
  BIND (str(?date2) AS ?date2s)
  VALUES (?person) { ( dbr:Brad_Pitt ) }
}
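
To get the result back as JSON-LD, the query can be sent over the standard SPARQL protocol with an `Accept` header. The sketch below only builds the request and does not send it. The endpoint URL, the helper name `build_construct_request`, and the assumption that the endpoint honours `application/ld+json` are mine, not something the answer states.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the public DBpedia endpoint; swap in your own if needed.
ENDPOINT = "https://dbpedia.org/sparql"


def build_construct_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a GET request that asks for JSON-LD."""
    url = endpoint + "?" + urlencode({"query": query})
    # Assumption: the endpoint supports content negotiation for JSON-LD.
    return Request(url, headers={"Accept": "application/ld+json"})


# Shortened stand-in for the CONSTRUCT query above.
request = build_construct_request(
    "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 1"
)
# To actually run it: urllib.request.urlopen(request).read()
```
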

Output (in JSON-LD format with context)

{ "@context": {
    "spouse": { "@id": "http://dbpedia.org/ontology/spouse"},
    "birthDate": { "@id": "http://dbpedia.org/ontology/birthDate" },
    "birthName": { "@id": "http://dbpedia.org/ontology/birthName" } },
  "@graph": [
    { "@id": "http://dbpedia.org/resource/Angelina_Jolie",
      "birthName": "Angelina Jolie Voight",
      "birthDate": "1975-06-04" },
    { "@id": "http://dbpedia.org/resource/Brad_Pitt",
      "@type": "http://xmlns.com/foaf/0.1/Person",
      "birthName": "William Bradley Pitt",
      "spouse": [ "http://dbpedia.org/resource/Angelina_Jolie",
                  "http://dbpedia.org/resource/Jennifer_Aniston" ],
      "birthDate": "1963-12-18" },
    { "@id": "http://dbpedia.org/resource/Jennifer_Aniston",
      "birthName": "Jennifer Joanna Aniston",
      "birthDate": "1969-02-11" }
] }

JSON-LD Frame (very simple)

{
  "@context": {"dbo": "http://dbpedia.org/ontology/",
               "dbr": "http://dbpedia.org/resource/",
               "foaf": "http://xmlns.com/foaf/0.1/"},
  "dbo:spouse": {
   }
}

Framed JSON-LD (playground)

{
  "@context": {
    "dbo": "http://dbpedia.org/ontology/",
    "dbr": "http://dbpedia.org/resource/",
    "foaf": "http://xmlns.com/foaf/0.1/"
  },
  "@graph": [
    {
      "@id": "dbr:Brad_Pitt",
      "@type": "foaf:Person",
      "dbo:birthDate": "1963-12-18",
      "dbo:birthName": "William Bradley Pitt",
      "dbo:spouse": [
        {
          "@id": "dbr:Angelina_Jolie",
          "@type": "foaf:Person",
          "dbo:birthDate": "1975-06-04",
          "dbo:birthName": "Angelina Jolie Voight"
        },
        {
          "@id": "dbr:Jennifer_Aniston",
          "@type": "foaf:Person",
          "dbo:birthDate": "1969-02-11",
          "dbo:birthName": "Jennifer Joanna Aniston"
        }
      ]
    }
  ]
}
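
To make the effect of the frame concrete, here is a hand-rolled sketch of the embedding step: it replaces `@id` references under one property with the referenced node objects. This is not a JSON-LD framing implementation (a real JSON-LD processor should do that job), and the function name `embed` and the shortened sample data are purely illustrative.

```python
def embed(graph, prop):
    """Return the nodes that have `prop`, with its @id references
    replaced by the referenced node objects from the same graph."""
    by_id = {node["@id"]: node for node in graph}
    roots = []
    for node in graph:
        if prop not in node:
            continue  # only nodes carrying the property become roots
        refs = node[prop]
        if not isinstance(refs, list):
            refs = [refs]
        nested = dict(node)
        # Fall back to a bare reference if the target is not in the graph.
        nested[prop] = [by_id.get(ref, {"@id": ref}) for ref in refs]
        roots.append(nested)
    return roots


# Shortened version of the flat output above (compact IRIs for brevity).
flat = [
    {"@id": "dbr:Angelina_Jolie", "birthName": "Angelina Jolie Voight"},
    {"@id": "dbr:Brad_Pitt", "birthName": "William Bradley Pitt",
     "spouse": ["dbr:Angelina_Jolie", "dbr:Jennifer_Aniston"]},
    {"@id": "dbr:Jennifer_Aniston", "birthName": "Jennifer Joanna Aniston"},
]
nested = embed(flat, "spouse")
```
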

Some discussion

JSON-LD Framing, standardized as a W3C Recommendation alongside JSON-LD 1.1 and widely implemented, describes a deterministic way of serializing an RDF graph into a particular JSON-LD document layout.

Obviously, with blank node property lists, one can achieve something structurally similar to the output you want:

dbr:Brad_Pitt
        dbo:birthName   "William Bradley Pitt" ;
        dbo:birthDate   "1963-12-18" ;
        dbo:spouse  [   dbo:birthName   "Angelina Jolie Voight" ;
                        dbo:birthDate   "1975-06-04" ] ,
                    [   dbo:birthName   "Jennifer Joanna Aniston" ;
                        dbo:birthDate   "1969-02-11" ] .

However, this is Turtle, not JSON, and nobody can guarantee that blank node property lists will be used in serialization.


You're conflating the query result itself (essentially an abstract table structure) with the syntax in which that result is written (in your case, a customized nested JSON structure).

Don't try to do tricks with group concatenation in this case. Just do this query:

SELECT ?given ?family ?friend_given ?friend_family
WHERE {
  ?person foaf:givenName ?given ;
          foaf:familyName ?family .
  ?person :knows ?friend .
  ?friend foaf:givenName ?friend_given ;
          foaf:familyName ?friend_family .
}
ORDER BY ?family ?given

Which produces a result like this:

given  family  friend_given friend_family
-------------------------------------------- 
Alice  Lidell  Bob          Doe
Alice  Lidell  Hwa          Choi

Then let a custom streaming result writer write the result in the nested syntax you require. Because the query sorts by name (a plain GROUP BY would not allow the ungrouped friend variables in the SELECT clause), the writer can safely assume that consecutive rows with the same given and family names "belong together".
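
A minimal sketch of such a writer, assuming rows for the same person arrive consecutively and are available as dictionaries keyed by variable name. The function name `nest_rows` and the output key `friends` are illustrative choices, since the exact target format is not shown here.

```python
import json
from itertools import groupby


def nest_rows(rows):
    """Yield one nested record per person.

    Relies on rows for the same (given, family) pair being adjacent,
    so each group can be written out as soon as it ends.
    """
    def person_key(row):
        return (row["given"], row["family"])

    for (given, family), group in groupby(rows, key=person_key):
        yield {
            "given": given,
            "family": family,
            "friends": [
                {"given": r["friend_given"], "family": r["friend_family"]}
                for r in group
            ],
        }


rows = [
    {"given": "Alice", "family": "Lidell",
     "friend_given": "Bob", "friend_family": "Doe"},
    {"given": "Alice", "family": "Lidell",
     "friend_given": "Hwa", "friend_family": "Choi"},
]
for record in nest_rows(rows):
    print(json.dumps(record))
```

Because `groupby` only merges adjacent rows, the writer never has to hold more than one person's rows in memory, which is what makes the streaming approach work.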

Alternatively, use a CONSTRUCT query instead of a SELECT, and post-process the retrieved RDF graph (which accurately represents the tree structure you're after).
