TechnicaLee Speaking: January 2007 Archives

« December 2006 | Main | February 2007 »

January 19, 2007

Announcing: Boca 1.8 - new database support

While I've been writing dense treatises on Semantic Web development, Matt's been hard at work on the latest release of Boca. Matt's announcement of Boca 1.8 carries all the details as well as a look at what Boca 2.0 will bring. Amidst the usual slew of bug fixes, usability improvements, and performance fixes, the major addition to Boca is support for three new databases beyond DB2. Boca now also runs on MySQL, PostgreSQL, and HSQLDB. Cool stuff.

In other Semantic Layered Research Platform news, we're working towards pushing out stable releases(with documentation and installation packaging) of two more of our components: Queso (Atom-driven Web interface to Boca) and DDR (binary data repository with metadata-extractor infrastructure to store metadata within Boca). We're hoping to get these out by the middle of February, so stay tuned.

Posted by Lee Feigenbaum at 5:57 PM | Permalink

January 18, 2007

Using RDF on the Web: A Vision

(This is the second part of two posts about using RDF on the Web. The first post was a survey of approaches for creating RDF-data-driven Web applications.) All existing implementations referred to in this post are discussed in more detail and linked to in part one.

Here's what I would like to see, along with some thoughts on what is or is not implemented. It's by no means a complete solution and there are plenty of unanswered questions. I'd also never claim that it's the right solution for all or most applications. But I think it has a certain elegance and power that would make developing certain types of Web applications straightforward, quick, and enjoyable. Whenever I refer to "the application" or "the app", I'm talking about browser-based Web application implemented in JavaScript.

To begin with, I imagine servers around the Web storing domain-specific RDF data. This could be actual, materialized RDF data or virtual RDF views of underlying data in other formats. This first piece of the vision is, of course, widely implemented (e.g. Jena, Sesame, Boca, Oracle, Virtuoso, etc.)
The application fetches RDF from such a server. This may be done in a variety of ways:
- An HTTP GET request for a particular RDF/XML or Turtle document
- An HTTP GET request for a particular named graph within a quad store (a la Boca or Sesame)
- A SPARQL CONSTRUCT query extracting and transforming the pieces of the domain-specific data that are most relevant to the application
- A SPARQL DESCRIBE query requesting RDF about a particular resource (URI)
In my mind, the CONSTRUCT approach is the most appealing method here: it allows the application to massage data which it may be receiving from multiple data sources into a single domain-specific RDF model that can be as close as possible to the application's own view of the world. In other words, reading the RDF via a query effectively allows the application to define its own API.

Once again, the software for this step already exists via traditional Web servers and SPARQL protocol endpoints.
Second, the application must parse the RDF into a client-side model. Precisely how this is done depends on the form taken by the RDF received from the server:
- The server returns RDF/XML. In this case, the client can use Jim Ley's parser to end up with a list of triples representing the RDF graph. The software to do this is already implemented.
- The server returns Turtle. In this case, the client can use Masahide Kanzaki's parser to end up with a list of triples representing the RDF graph. The software to do this is already implemented.
- The server returns RDF/JSON. In this case, the client can use Douglas Crockford's JSON parsing library (effectively a regular expression security check followed by a call to eval(...) While the software is implemented here, the RDF/JSON standard which I've cavalierly tossed about so far does not yet exist. Here, I'm imagining a specification which defines RDF/JSON based on the common JavaScript data structure used by the above two parsers. ( A bit of work probably still needs to be done if this were to become a full RDF/JSON specification, as I do not believe the current format used by the two parsers can distinguish blank node subjects from subjects with URIs.)
In any case, we now have on the client a simple RDF graph of data specific to the domain of our application. Yet as I've said before, we'd like to make application development easier by moving away from triples at this point into data structures which more closely represent the concepts being manipulated by the application.
The next step, then, is to map the RDF model into a application-friendly JavaScript object model. If I understand ActiveRDF correctly (and in all fairness I've only had the chance to play with it a very limited amount), it will examine either the ontological statements or instance data within an RDF model and will generate a Ruby class hierarchy accordingly. The introduction to ActiveRDF explains the dirty-but-well-appreciated trick that is used: "Just use the part of the URI behind the last ”/” or ”#” and Active RDF will figure out what property you mean on its own." Of course, sometimes there will be ambiguities, clashes, or properties written to which did not already exist (with full URIs) in the instance data received; in these cases, manual intervention will be necessary. But I'd suggest that in many, many cases, applying this sort of best-effort heuristics to a domain-specific RDF model (especially one which the application has selected especially via a CONSTRUCT query) will result in extremely natural object hierarchies.

None of this piece is implemented at all. I'd imagine that it would not be too difficult, following the model set forth by the ActiveRDF folks.

Late-breaking news: Niklas Lindström, developer of the Python RDF ORM system Oort followed up on my last post and said (among other interesting things):
I use an approach of "removing dimensions": namespaces, I18N (optionally), RDF-specific distinctions (collections vs. multiple properties) and other forms of graph traversing.

Sounds like there would be some more simplification processes that could be adapted from Oort in addition to those adapted from ActiveRDF.
The main logic of the Web application (and the work of the application developer) goes here. The developer receives a domain model and can render it and attach logic to it in any way he or she sees fit. Often this will be via a traditional model-view-controller approach: this approach is facilitated by toolkits such as dojo or even via a system such as nike templates (nee microtemplates). Thus, the software to enable this meat-and-potatoes part of application development already exists.

In the course of the user interacting with the application, certain data values change, new data values are added, and/or some data items are deleted. The application controller handles these mutations via the domain-specific object structures, without regards to any RDF model.
When it comes time to commit the changes (this could happen as changes occur or once the user saves/commits his or her work), standard JavaScript (i.e. a reusable library, rather than application-specific code) recognizes what has changed and maps (inverts) the objects back to the RDF model (as before, represented as arrays of triples). This inversion is probably performed by the same library that automatically generated the object structure from the RDF model in the first place. As with that piece of this puzzle, this library does not yet exist.

Reversing the RDF ORM mapping is clearly challenging, especially when new data is added which has not been previously seen by the library. In some cases--perhaps even in most?--the application will need to provide hints to the library to help the inversion. I imagine that the system probably needs to keep an untouched deep copy of the original domain objects to allow it to find new, removed, and dirty data at this point. (An alternative would be requiring adds, deletes, and mutations to be performed via methods, but this constrains the natural use of the domain objects.)
Next, we determine the RDF difference between our original model and our updated model. The canonical work on RDF deltas is a design note by Tim Berners-Lee and Dan Connolly. Basically, though, an RDF diff amounts simply to a collection of triples to remove and a collection of triples to add to a graph. No (JavaScript) code yet exists to calculate RDF graph diffs, though the algorithms are widely implemented in other environments including cwm, rdf-utils, and SemVersion. We also work often with RDF diffs in Boca (when the Boca client replicates changes to a Boca server). I'd hope that this implementation experience would translate easily to a JavaScript implementation.
Finally, we serialize the RDF diffs and send them back to the data source. This requires two components that are not yet well-defined:
- A serialization format for the RDF diffs. Tim and Dan's note uses the ability to quote graphs within N3 combined with a handful of predicates (diff:replacement, diff:deletion, and diff:insertion). I can also imagine a simple extension of (whatever ends up being) the RDF/JSON format to specify the triples to remove and add:
```
  {
    'add' : [ RDF/JSON triple structures go here ],
    'remove' : [ RDF/JSON triple structures go here ]
  }
```
- An endpoint or protocol which accepts this RDF diff serialization. Once we've expressed the changes to our source data, of course, we need somewhere to send them. Preferably, there would be a standard protocol (à la the SPARQL Protocol) for sending these changes to a server. To my knowledge, endpoints that accept RDF diffs to update RDF data are not currently implemented. (Late-breaking addition: on my first post, Chris and Richard both pointed me to Mark Baker's work on RDF forms. While I'm not very familiar with any existing uses of this work, it looks like it might be an interesting way to describe the capabilities of an RDF update endpoint.)
As an alternative for this step, the entire client-side RDF model could be serialized (to RDF/XML or to N-Triples or to RDF/JSON) and HTTP PUT back to an origin server. This strategy seems to make the most sense in a document-oriented system; to my knowledge this is also not currently implemented.

That's my vision, as raw and underdeveloped as it may be. There are a large number of extensions, challenges and related work that I have not yet mentioned, but which will need to be addressed when creating or working with this type of Web application. Some discussion of these is also in order.

Handling Multiple Sources of Data

To use the above Web-application-development environment to create Web 2.0-style mash-ups, most of the steps would need to be performed once per data source being integrated. This adds to the system a provenance requirement, whereby the libraries could offer the application a unified view of the domain-specific data while still maintaining links between individual data elements and their source graphs/servers/endpoints to facilitate update. When the RDF diffs are computed, they would need to be sent back to the proper origins. Also, the sample JavaScript structures that I've mentioned as a base for RDF/JSON and the RDF/JSON diff serialization would likely need to be augmented with a URI identifying the source graph of each triple. (That is, we'd end up working with a quad system, though we'd probably be able to ignore that in the object hierarchy that the application deals with.) In many cases, though, an application that reads from many data sources will write only to a single source; it does not seem particularly onerous for the application to specify a default "write-back" endpoint.

Inverting SPARQL `CONSTRUCT` Queries

An appealing part of the above system (to me, at least) is the use of CONSTRUCT queries to map origin data to a common RDF model before merging it on the client and then mapping it into a domain-specific JavaScript object structure. Such transformations, however, would make it quite difficult--if not impossible--to automatically send the proper updates back to the origin servers. We'd need a way of inverting the CONSTRUCT query which generated the triples the application has (indirectly) worked with, and while I have not given it much thought, I imagine that that is quite difficult, if not impossible.

SPARQL `UPDATE`.

The DAWG has postponed any work on updating graphs for the initial version of SPARQL, but Max Völkel and Richard Cyganiak have started a bit of discussion on what update in SPARQL might look like (though Richard has apparently soured on the idea a bit since then). At first blush, using SPARQL to update data seems like a natural counterpart to using SPARQL to retrieve the data. However, in the vision I describe above, the application would likely need to craft a corresponding SPARQL UPDATE query for each SPARQL CONSTRUCT query that is used to retrieve the data in the first place. This would be a larger burden on the application developer, so should probably be avoided.

Related Work

I wanted to acknowledge that in several ways this whole pattern is closely related to but (in some mindset, at least) the inverse of a paradigm that Danny Ayers has floated in the past. Danny has suggested using SPARQL CONSTRUCT queries to transition from domain-specific models to domain-independent models (for example, a reporting model). Data from various sources (and disparate domains) can be merged at the domain-independent level and then (perhaps via XSLT) used to generate Web pages summarizing and analyzing the data in question. In my thoughts above, we're also using the CONSTRUCT queries to generate an agreed-upon model, but in this case we're seeking an extremely domain-specific model to make it easier for the Web-application developer to deal with RDF data (and related data from multiple sources).

Danny also wrote some related material to www-archive. It's not the same vision, but parts of it sound familiar.

Other Caveats

Updating data has security implications, of course. I haven't even begun to think about them.

Blank nodes complicate almost everything; this may be sacrilege in some circles, but in most cases I'm willing to pretend that blank nodes don't exist for my data-integration needs. Incorporating blank nodes makes the RDF/JSON structures (slightly) more complicated; it raises the question of smushing together nodes when joining various models; and it significantly complicates the process of specifying which triples to remove when serializing the RDF diffs. I'd guess that it's all doable using functional and inverse-functional properties and/or with told bnodes, but it probably requires more help from the application developer.

I have some worries about concurrency issues for update. Again, I haven't thought about that much and I know that the Queso guys have already tackled some of those problems (as have many, many other people I'm sure), so I'm willing to assert that these issues could be overcome.

In many rich-client applications, data is retrieved incrementally in response to user-initiated actions. I don't think that this presents a problem for the above scheme, but we'd need to ensure that newly arriving data could be seamlessly incorporated not only into the RDF models but also into the object hierarchies that the application works with.

Bill de hÓra raised some questions about the feasibility of roundtripping RDF data with HTML forms a while back. There's some interesting conversation in the comments there which ties into what I've written here. That said, I don't think the problems he illustrates apply here--there's power above and beyond HTML forms in putting an extra JavaScript-based layer of code between the data entry interface (whether it be an HTML form or a more specialized Web UI) and the data update endpoint(s).

OK, that's more than enough for now. These are still ideas clearly in progress, and none of the ideas are particularly new. That said, the environment as I envision doesn't exist, and I suppose I'm claiming that if it did exist it would demonstrate some utility of Semantic Web technologies via ease of development of data- and integration-driven Web applications. As always, I'd enjoy feedback on these thoughts and also any pointers to work I might not know about.

Posted by Lee Feigenbaum at 7:32 PM | Permalink | Comments (8)

January 16, 2007

Using RDF on the Web: A Survey

(This is part one of two posts exploring building read-write Web applications using RDF. Part two will follow, shortly. Update: Part two is now available, also.)

The Web permeates our world today. Far more than static Web sites, the Web has come to be dominated by Web applications--useful software that runs inside a Web browser and on a server. And the latest trend in Web applications, Web 2.0, encourages--among other things--highly interactive Web sites with rich user interfaces featuring content from various sources around the Web integrated within the browser.

Many of us who have drank deeply from the Semantic Web Kool-Aid are excited about the potential of RDF, SPARQL, and OWL to provide flexible data modeling, easier data integration, and networked data access and query. It's no coincidence that people often refer to the Semantic Web as a web of data. And so it seems to me that RDF and friends should be well-equipped to make the task of generating new and more powerful Web mash-ups simple, elegant, and enjoyable. Yet while there are a great number of projects using Semantic Web technologies to create Web applications, there doesn't seem to have emerged any end-to-end solution for creating browser-based read-write applications using RDF which focus on data integration and ease of development.

Following a discussion on this topic at work the other day, I decided to do a brief survey of what approaches do already exist for creating RDF-based Web applications. I want to give a brief overview of several options, assess how they fit together, and then outline a vision for some missing pieces that I feel might greatly empower Web developers working with Semantic Web technologies.

First, a bit on what I'm looking for. I want to be able to quickly develop data-driven Web applications that read from and write back to RDF data sources. I'd like to exploit standard protocol and interfaces as much as possible, and limit the amount of domain-specific code that needs to be written. I'd like the infrastructure to make it as easy as possible for the application developer to retrieve data, integrate the data, and work with it in a convenient and familiar format. That is, in the end, I'm probably looking for a system that allows the developer to work with a model of simple, domain-specific JavaScript object hierarchies.

In any case, here's the survey. I've tried to include most of the systems I know of which involve RDF data on the Web, even those which are not necessarily appropriate for creating generalized RDF-based Web apps. I'll follow-up with a vision of what could be in my next post.

Semantic Mediawiki

This is an example of a terrific project which is not what I'm looking for here. Semantic Mediawiki provides wiki markup that captures the knowledge contained within a wiki as RDF which can then be exported or queried. While an installation of Semantic Mediawiki will allow me to read and write RDF data via the Web, I am constrained within the wiki framework; further, the interface to reading and writing the RDF is markup-based rather than programmatic.

The Semantic Bank API

The SIMILE project provides an HTTP POST API for publishing and persisting RDF data found on local Web pages to a server-side bank (i.e. storage). They also provide a JavaScript library (BSD license) which wraps this API. While this API supports writing a particular type of RDF data to a store, it does not deal with reading arbitrary RDF from across the Web. The API also seems to require uploaded data to be serialized as RDF/XML before being sent to a Semantic Bank. This does not seem to be what I'm looking for to create RDF-based Web applications.

The Tabulator RDF parser and API

MIT student David Sheets created a JavaScript RDF/XML parser (W3C license). It is fully compliant with the RDF/XML specification, and as such is a great idea for any Web application which needs to gather and parse arbitrary RDF models expressed in RDF/XML. The Tabulator RDF parser populates an RDFStore object. By default, it populates an RDFIndexedFormula store, which inherits from the simpler RDFForumla store. These are rather sophisticated stores which perform (some) bnode and inverse-functional-property smushing and maintain multiple triple indexes keyed on subjects, predicates, and objects.

Clearly, this is an excellent API for developers wishing to work with the full RDF model; naturally, it is the appropriate choice for an application like the Tabulator which at its core is an application that eats, breathes, and dreams RDF data. As such, however, the model is very generic and there is no (obvious, simple) way to translate it into a domain-specific, non-RDF model to drive domain-specific Web applications. Also, the parser and store libaries are read-only: there is no capability to serialize models back to RDF/XML (or any other format) and no capability to store changes back to the source of the data.

(Thanks to Dave Brondsema for an excellent example of using the Tabulator RDF parser which clarified where the existing implementations of the RDFStore interface can be found.)

Jim Ley's JavaScript RDF parser

Jim Ley created perhaps the first JavaScript library for parsing and working with RDF data from JavaScript within a Web browser. Jim's parser (BSD license) handles most RDF/XML serializations and returns a simple JavaScript object which wraps an array of triples and provides methods to find triples by matching subjects, predicates, and objects (any or all of which can be wildcards). Each triple is a simple JavaScript object with the following structure:

{
  subject: ...,
  predicate: ...,
  object: ...,
  type: ...,
  lang: ...,
  datatype: ...
}

The type attribute can be either literal or resource, and blank nodes are represented as resources of the form genid:NNNN. This structure is a simple and straightforward representation of the RDF model. It could be relatively easily mapped into an object graph, and from there into a domain-specific object structure. The simplicity of the triple structure makes it a reasonable choice for a potential RDF/JSON serialization. More on this later.

Jim's parser also provides a simple method to serialize the JavaScript RDF model to N-Triples, though that's the closest it comes to providing support for updating source data with a changed RDF graph.

Masahide Kanzaki's Javascript Turtle parser

In early 2006, Masahide Kanzaki wrote a JavaScript library for parsing RDF models expressed in Turtle. This parser is licenses under the terms of the GPL 2.0 and can parse into two different formats. One of these formats is a simple list of triples, (intentionally) identical to the object structure generated by Jim Ley's RDF/XML parser. The other format is a JSON representation of the Turtle document itself. This format is appealing because a nested Turtle snippet such as:

@prefix : <http://example.org/> .

:lee :address [ :city "Cambridge" ; :state "MA" ] .

translates to this JavaScript object:

{
  "@prefix": "<http://example.org/>",
  "address": {
    "city": "Cambridge",
    "state": "MA"
  }
}

While this format loses the URI of the root resource (http://example.org/lee), it provides a nicely nested object structure which could be manipulated easily with JavaScript such as:

  var lee = turtle.parse_to_json(jsonStr);
  var myState = lee.address.state; // this is easy and domain-specific - yay!

Of course, things get more complicated with non-empty namespace prefixes (the properties become names like ex:name which can't be accessed using the obj.prop syntax and instead need to use the obj["ex:name"] syntax). This method of parsing also does not handle Turtle files with more than a single root resource well. And an application that used this method and wanted to get at full URIs (rather than the namespace prefix artifacts of the Turtle syntax) would have to parse and resolve the namespaces prefixes itself. Still, this begins to give ideas on how we'd most like to work with our RDF data in the end within our Web app.

Masahide Kanzaki also provides a companion library which serializes an array of triples back to Turtle. As with Jim Ley's parser, this may be a first step in writing changes to the RDF back to the data's original store; such an approach requires an endpoint which accepts PUT or POSTed RDF data (in either N-Triples or Turtle syntax).

SPARQL + SPARQL/JSON + sparql.js

The DAWG published a Working Group Note specifying how the results of a SPARQL SELECT or ASK query can be serialized within JSON. Elias and I have also written a JavaScript library (MIT license) to issue SPARQL queries against a remote server and receive the results as JSON. By default, the JavaScript objects produced from the library match exactly the SPARQL results in JSON specification:

{
  "head": { "vars": [ "book" , "title" ]
  } ,
  "results": { "distinct": false , "ordered": false ,
    "bindings": [
      {
        "book": { "type": "uri" , "value": "http://example.org/book/book6" } ,
        "title": { "type": "literal" , "value": "Harry Potter and the Half-Blood Prince" }
      } ,
      ...

The library also provides a number of convenience methods which issue SPARQL queries and return the results in less verbose structures: selectValues returns an array of literal values for queries selecting a single variable; selectSingleValue returns a single literal value for queries selecting a single variable which expect to receive a single row; or selectValueArrays which returns a hash relating each of the query's variables to an array of values for that variable. I've used these convenience methods in the SPARQL calendar and SPARQL antibodies demos and found it quite easy for SPARQL queries returning small amounts of data.

Note, however, that this method does not actually work with RDF on the client side .Because it is designed for SELECT (or ASK) queries, the Web application developer ends up working with lists of values in the application (more generally, a table or result set structure). Richard Cyganiak has suggested serializing entire RDF graphs using this method by using the query SELECT ?s ?p ?o WHERE { ?s ?p ?o } and treating the three-column result set as an RDF/JSON serialization. This is a clever idea, but results in a somewhat unwieldy JavaScript object representing a list of triples: if a list of triples is my goal, I'd rather use the Jim Ley simple object format. But in general, I'd rather have my RDF in a form where I can easily traverse the graph's relationships without worrying about subjects, predicates, and objects.

Additionally, the SPARQL SELECT query approach is a read-only approach. There is no current way to modify values returned from a SPARQL query and send the modified values (along with the query) back to an endpoint to change the underlying RDF graph(s).

JSONC, JSONI, and JSONP

Benjamin Nowack implemented the SPARQL JSON results format in ARC (W3C license), and then went a bit further. He proposes three additions/modifications to the standard SPARQL JSON results which result in saved bandwidth, more directly usable structures, and the ability to instruct a SPARQL endpoint to return JavaScript above and beyond the results object itself.

JSONC: Benjamin suggests an additional jsonc parameter to a SPARQL endpoint; the value of this parameter instructs the server to flatten certain variables in the result set. The result structure contains only the string value of the flattened variables, rather than a full structure containing type, language, and datatype information.
JSONI: JSONI is another parameter to the SPARQL endpoint which instructs the server to return certain selected variables nested within others. Effectively, this allows certain variables within the result set to be indexed based on the values of other variables. This results in more naturally nested structures which can be more closely aligned with domain-specific models and hence more directly useful by JavaScript application developers.
JSONP: JSONP is one solution to the problem of cross-domain XMLHttpRequest security restrictions. The jsonp parameter to a SPARQL server would specify a function name which the resulting JSON object will be wrapped in in the returned value. This allows the SPARQL endpoint to be used via a <script src="..."></script> invocation which avoids the cross-domain limitation.

The first two methods here are similar to what the sparql.js feature provides on the client side for transforming the SPARQL JSON results format. By implementing them on the server, JSONC and JSONI can save significant bandwidth when returning large result sets. However, in most cases bandwidth concerns can be alleviated by sending gzip'ed content, and performing the transforms on the client allow for a much wider range of possible transformations (and no burden on SPARQL endpoints to support various transformations for interoperability). As far as I know, ARC is currently the only SPARQL endpoint that implements JSONC and JSONI.

JSONP is a reasonable solution in some cases to solving the cross-domain XMLHttpRequest problem. I believe that other SPARQL endpoints (Joseki, for instance) implement a similar option via an HTTP parameter named callback. Unfortunately, this method often breaks down with moderate-length SPARQL queries: these queries can generate HTTP query strings which are longer than either the browser (which parses the script element) or the server is willing to handle.

Queso

Queso is the Web application framework component of the IBM Semantic Layered Research Platform. It uses the Atom Publishing Protocol to allow a browser-based Web application to read and write RDF data from a server. RDF data is generated about all Atom entries and collections that are PUT or POSTed to the server using the Atom OWL ontology. In addition, the content of Atom entries can contain RDF as either RDF/XML or as XHTML marked up with RDFa; the Queso server extracts the RDF from this content and makes it available to SPARQL querying and to other (non-Web) applications.

By using the Atom Publishing Protocol, an application working against a Queso server can both read and write RDF data from that Queso server. While Queso does contain JavaScript libraries to parse the Atom XML format into usable JavaScript objects, libraries do not yet exist to extract RDF data from the content of the Atom entries. Nor do libraries exist yet that can take RDF represented in JavaScript (perhaps in the JIm Ley fashion) and serialize it to RDF/XML inthe content of an Atom entry. Current work with Queso has focused on rendering RDFa snippets via standard HTML DOM manipulation, but have not yet worked with the actual RDF data itself. In this way, Queso is an interesting application paradigm for working with RDF data on the Web, but it does not yet provide a way to work easily with domain-specific data within a browser-based development environment.

(Before Ben, Elias, and Wing come after me with flaming torches, I should add that Queso is still very much evolving: we hope that the lessons we learn from this survey and discussion about a vision of RDF-based Web apps (in my next post) will help guide us as Queso continues to mature.)

RPC / RESTful API / the traditional approach

I debated whether to put this on here and decided it was incomplete without it. This is the paradigm that is probably most widely used and is extremely familiar. A server component interacts with one or more RDF stores and returns domain-specific structures (usually serialized as XML or JSON) to the JavaScript client in response to domain-specific API calls. This is the approach taken by an ActiveRDF application, for instance. There are plenty of examples of this style of Web application paradigm: one which we've been discussing recently is the Boca Admin client, a Web app. that Rouben is working on to help administer Boca servers.

This is a straightforward, well-understood approach to creating well-defined, scalable, and service-oriented Web applications. Yet it falls short in my evaluation in this survey because it requires a server and client to agree on a domain-specific model. This means that my client-sde code cannot integrate data from multiple endpoints across the Web unless those endpoints also agree on the domain model (or unless I write client code to parse and interpret the models returned by every endpoint I'm interested in). Of course, this method also requires the maintenance of both server-side and client-side application code, two sets of code with often radically different development needs.

This is still often a preferred approach to creating Web applications. But it's not really what I'm thinking of when I contemplate the power of driving Web apps with RDF data, and so I'm not going to discuss it further here.

That's what I've got in my survey right now. I welcome any suggestions for things that I'm missing. In my next post, I'm going to outline a vision of what I see a developer-friendly RDF-based Web application environment looking like. I'll also discuss what pieces are already implemented (mainly using systems discussed in this survey) and which are not yet implemented. There'll also be many open questions raised, I'm sure. (Update: Part two is now available, also.)

(I didn't examine which of these approaches provide support for simple inferencing of the owl:sameAs and rdfs:subPropertyOf flavor, though that would be useful to know.)

Posted by Lee Feigenbaum at 1:42 AM | Permalink | Comments (9)

January 9, 2007

Who loves RDF/XML?

I wrote the following as a comment on Seth's latest post about RDF/XML syntax, but the blog engine asked me to add two unspecified numbers, and I had a great deal of difficulty doing that correctly. So instead, it will live here, and I'd love to learn answers to this question from Seth or anyone else who might have any answers. Quoting myself:

Hi Seth,

This is a completely serious question: Who are these people who are insisting on RDF/XML as the/a core of the semantic web? Where can I meet them? Or have I met them and not realized it? Or are they mostly straw-men, as part of me suspects?

Inquiring minds -- and SWEO members -- want to know.

thanks,
Lee

Posted by Lee Feigenbaum at 1:10 AM | Permalink | Comments (5)

TechnicaLee Speaking

Software designs, implementations, solutions, and musings by Lee Feigenbaum