IKS: Toward smarter content management systems

July 27, 2011

This article was contributed by Koen Vervloesem

Interactive Knowledge Stack (IKS) is an open source project focused on building an open and flexible technology platform for semantically enhanced Content Management Systems (CMS). Recently, the project held a workshop in Paris, "myCMS and the Web of Data", where some IKS tools were presented and where users of the IKS framework demonstrated how they used the semantic enhancements of the project in their CMS. According to the organizers, the event attracted 90 participants.

IKS is a collaboration between academia, industry, and open source developers, co-funded with €6.58 million by the European Union. The goal is to enrich content management systems with semantic content in order to let the users benefit from more intelligent extraction and linking of their information. In other words, as researcher Wernher Behrendt described it in his introduction of the workshop: "The vision of IKS is to move the CMS forwards in the domain of interactive knowledge." Anyone can participate in this vision, for instance by adding their input to the user stories page on the project's wiki.

All of the code for the various IKS projects is provided under a permissive open source license, either BSD, Apache, or MIT. This is expressly done to pave the way for commercial use of IKS. Two of the software components of the IKS stack that are already in good shape are Apache Stanbol (a Java-based software stack to provide semantic services) and VIE (Vienna IKS Editables, a solution to make RDFa encoded semantics browser-editable).

Semantic applications

In his keynote speech "From Semantic Platforms to Semantic Applications", Stéphane Croisier emphasized some problems with current semantic technology solutions. A lot of development is happening around Linked Data, natural language processing, entity extraction, ontologies, and reasoners, and all of it makes a lot of promises, but these solutions are moving slowly. Croisier has investigated some of them in a so-called "one-week reality check", and he didn't like what he saw:

Many of the semantic web solutions are not ready for multi-language environments, which is especially in Europe a big problem, or they have poor scalability. Others have a steep learning curve, and this industry is also plagued a lot by fanaticism and religious wars, like we had in open source five years ago. All these factors prevent mainstream adoption of the semantic web.

But the problems are not limited to the technical level. According to Croisier, the next key challenge is improved user experience:

Current user interfaces for the semantic web are ugly and not user-friendly. One of the reasons is that the budgets go mostly to platform development, not to development of the user interface, which is probably because many semantic web projects are born in universities and have an academic approach, focusing on the technology. But it doesn't have to be this way, and if we want a breakthrough of the semantic web, we better start working on good user interfaces.

At the same time, Croisier expressed his hope that developers will move their efforts from semantic platforms to semantic applications, or in other words "migrate from the geek to the practitioner". Only geeks are excited by the platform stuff like RDF (Resource Description Framework), ontologies and REST (REpresentational State Transfer) interfaces, but the industry needs some smart content applications. However, there seems to be a barrier to overcome, as Croisier admitted: "We're all still trying to find the killer app for the semantic web."

Apache Stanbol

After Croisier's talk, a couple of early adopters showed their demos of applications built on Apache Stanbol, the open source modular software stack for semantic content management initiated by IKS. Stanbol components are meant to be accessed over RESTful interfaces to provide semantic services, and the code is written in Java and based on the OSGi modularization framework.

Stanbol has four main features to offer to applications using its services: persistence (it stores or caches semantic information and makes it searchable), lifting/enhancement (it adds semantic information to unstructured pieces of content), knowledge models and reasoning (to enhance the semantic information), and interaction (management and generation of intelligent user interfaces). If you want to take a peek at the possibilities, there's an online demo: just paste some text into the form and run the engine to look at the entities Stanbol finds. There are also some installation instructions in the documentation to run a Stanbol server yourself. Because Stanbol has a RESTful API, it's also easy to test it with a command line tool like curl.
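As a rough illustration of how simple such a REST call can be, here is a minimal Python sketch that posts plain text to a Stanbol server and returns the RDF it sends back. The base URL, the /engines path, and the Accept value are assumptions about a locally running server, not details given in this article; check the documentation of your Stanbol version for the actual endpoint.

```python
import urllib.request


def build_enhance_request(text,
                          base_url="http://localhost:8080",  # assumed local server
                          path="/engines",                   # assumed endpoint path
                          accept="application/rdf+xml"):     # assumed response format
    """Build (but do not send) a POST request carrying plain text to enhance."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": accept,
        },
        method="POST",
    )


def enhance(text, **kwargs):
    """Send the text to the server and return the raw response body."""
    request = build_enhance_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a running server at the assumed address.
    print(enhance("Paris is the capital of France."))
```

Separating the request construction from the network call makes it possible to inspect what would be sent without having a server at hand.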

At the IKS workshop, some integrators showed how they integrated Stanbol into an open source CMS. For instance, the London-based company Zaizi showed an integration with the enterprise content management system Alfresco. The code for this integration is licensed under the LGPL and there's a website with some information and installation instructions. The semantic engine extracts entities from Microsoft Office, ODF, PDF, HTML, and plain text documents uploaded to Alfresco and shows the entities next to the content details. The entities can also be selected in Alfresco's interface to list all other documents classified with that entity.

Jürgen Jakobitsch from the Austrian company punkt. netServices presented its Drupal plugin to integrate Stanbol. The current version is targeted at Drupal 6, but an update for Drupal 7 is coming soon. The module enables tag recommendations as well as semi-automated semantic annotation. The open data website of the Austrian government is running this Drupal/IKS integration.

Andrea Volpini and David Riccitelli from the Italian company InsideOut10 presented WordLift, an open source plugin to enrich textual content on a WordPress blog using HTML microdata, which is easy to parse by search engines. When writing a blog post, the content is sent to Stanbol, and the entities it finds will be added in Google Rich Snippets or Schema.org format. The user can then select which of the found entities are relevant. It's all still quite experimental, but the target of the developers is clear: spoon-feeding HTML microdata to the search engines using semantic web technologies. According to Volpini, the source code of the plugin will be published in a few weeks.

In addition, Olivier Grisel from the French open source ECM (enterprise content management) company Nuxeo presented their Semantic Entities module for the Nuxeo CMS and Juan A. Prieto presented the integration of Stanbol with the semantic CMS XIMDEX.

Vienna IKS Editables

The other key component of the IKS software stack is VIE (Vienna IKS Editables), presented at the workshop by the main developer, Henri Bergius. The idea is to "build a CMS, no forms allowed", as people don't like forms ("forms are only for communication with the government," according to Bergius). To make this possible, the CMS and some JavaScript code must agree on the content model, and this is what VIE offers: it understands RDFa, a semantically annotated version of HTML.

If you annotate your website (or CMS) with RDFa, suddenly JavaScript code can understand the meaning of your content. VIE is an MIT-licensed browser API for RDFa, bridging RDFa to JavaScript. It depends on Backbone.js and jQuery, and it reads all RDFa annotated entities as JavaScript objects on a page where the library is loaded. These objects can then be edited by the user in the browser, and changes are synchronized with the server and the Document Object Model (DOM) in the browser.
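To make the idea of reading RDFa annotated entities as objects more concrete, the following Python sketch collects annotated content into one dictionary per entity. It is not VIE's code or API (VIE is a JavaScript library); it handles only the RDFa attributes "about" and "property", ignores the rest of the RDFa processing rules, and the class and function names are made up for this illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class RdfaEntityParser(HTMLParser):
    """Very simplified RDFa reader: one dict of properties per 'about' subject."""

    def __init__(self):
        super().__init__()
        self.entities = {}   # subject URI -> {property name: text content}
        self._stack = []     # one (subject, property) frame per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attributes = dict(attrs)
        # A new 'about' starts a subject; otherwise inherit the enclosing one.
        inherited = self._stack[-1][0] if self._stack else None
        subject = attributes.get("about") or inherited
        prop = attributes.get("property")
        if subject is not None:
            self.entities.setdefault(subject, {})
            if prop:
                self.entities[subject][prop] = ""
        self._stack.append((subject, prop))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to every property element that is currently open.
        for subject, prop in self._stack:
            if subject is not None and prop:
                self.entities[subject][prop] += data


def extract_entities(html):
    parser = RdfaEntityParser()
    parser.feed(html)
    parser.close()
    return parser.entities
```

Feeding it a fragment such as a div with about="http://example.org/post/1" that contains an h1 with property="dcterms:title" yields a dictionary keyed by that URI, with the title text stored under "dcterms:title".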

The big promise of VIE is that it is independent of the CMS: the same lines of JavaScript work on Drupal, WordPress, TYPO3, and any other CMS that has provided an implementation of the Backbone.sync method. Apart from implementing this method, you only have to mark up your content with RDFa, include vie.js in your pages, and write some JavaScript code. These last three tasks are all independent of the underlying CMS.
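The division of labor described here can be sketched as follows, again in Python rather than JavaScript and purely as an illustration. Backbone's default sync maps the method names create, read, update, and delete onto HTTP verbs; everything else below (the in-memory store standing in for a CMS, the "@subject" key, and the function names) is hypothetical.

```python
from typing import Callable, Dict, Optional

# Backbone's default sync maps these four method names onto HTTP verbs.
METHOD_TO_HTTP = {"create": "POST", "read": "GET",
                  "update": "PUT", "delete": "DELETE"}

Entity = Dict[str, str]
Sync = Callable[[str, Entity], Optional[Entity]]


def make_memory_sync(store: Dict[str, Entity]) -> Sync:
    """Hypothetical stand-in for one CMS's sync implementation.

    A real CMS would talk to its own storage layer or HTTP API here.
    """
    def sync(method: str, entity: Entity) -> Optional[Entity]:
        if method not in METHOD_TO_HTTP:
            raise ValueError("unknown sync method: " + method)
        key = entity["@subject"]  # hypothetical identifier field
        if method in ("create", "update"):
            store[key] = dict(entity)
            return store[key]
        if method == "read":
            return store.get(key)
        store.pop(key, None)      # delete
        return None
    return sync


def save_edit(sync: Sync, entity: Entity, prop: str, value: str) -> Optional[Entity]:
    """CMS-independent editing code: it only knows about sync()."""
    updated = {**entity, prop: value}
    return sync("update", updated)
```

Only make_memory_sync would have to be rewritten for each CMS; save_edit stays the same, which is the point of putting the CMS-specific part behind a single sync function.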

On top of VIE, there's also VIE^2 (Vienna IKS Editable Entities), which talks to services like Stanbol and OpenCalais to find related information for your content. To show what's possible with VIE and VIE^2, the IKS developers created palsu.me, an online collaborative meeting tool.

Surviving after EU funding

The IKS project was started in 2009 as a 4-year EU project, but how will it survive after the project (and the funding) is done? Bertrand Delacretaz from Adobe had some advice. Apart from being a developer at Adobe, he is also a member of the board of directors of the Apache Software Foundation. Stanbol has been an Apache Incubator project since 2010, and the developers would like it to graduate to a full Apache project, preferably before the end of 2012 because that's when the IKS project (and hence the funding) stops.

There are, however, some criteria to meet before an Apache Incubator project is allowed full project status. Delacretaz gave two examples: all communication about the project has to happen on the -dev mailing list, and there have to be at least three legally independent committers (with different income sources). The latter is currently a problem for Stanbol, because too many committers get funded by the IKS project. So Delacretaz would like to see more (external) committers for Stanbol to secure its future.

In search of the killer app

At the end of the conference, the organizers announced the IKS Semantic CMS UI/X Competition. Project manager John Pereira said that the first 1.5 years of IKS were focused on infrastructure, but now the focus has shifted to the users. In the contest, the IKS project will give two awards of €40,000 to CMS developers who build "killer user experiences and user interfaces" on top of IKS technology. Anyone with an idea for a killer semantic application can enter the contest.

Of course there are some conditions. The proposed solution should reuse as many IKS components as possible, and it should ideally be easy to implement. It also should focus on providing a compelling semantic experience. Ideas can be found in the list of semantic UI/X user stories. The awards will let the winners finance the development of their proposed solution, and in exchange the deliverables have to be released under a permissive open source license. Proposals should be submitted before November 2011; there's no online form yet, so at the moment you should email John Pereira. The five best proposals will be shortlisted and their authors invited to pitch them at the J. Boye Conference in November 2011, where the two winners will be selected.

There are some striking parallels between the promises of the semantic web and the "year of the Linux desktop" meme. Since at least 2000, IT magazines and web sites have been declaring every year as the year of the Linux desktop, in the sincere hope that that year would see a breakthrough in Linux adoption by businesses and home users on desktop computers. In the same way, the press has been writing about small success stories of semantic web technology, with expectations that it would soon come to a breakthrough. However, although most of the technology under the hood is ready, it looks like we still have to wait a while for this "year of the semantic web". What the IKS workshop made clear is that there's a lot of work to do on the level of the user interface. VIE looks like an interesting component for semantic web user interfaces, but as many of the speakers made clear, the whole industry is still desperately searching for that killer app.
