
Alfresco and Apache Stanbol (semantics)

Question asked by ttownsend on Mar 10, 2013
Latest reply on Mar 11, 2013 by ttownsend
Hello all,

I am looking to see if anyone has experience with successfully integrating Alfresco/Share and Apache Stanbol for semantic information extraction and auto-tagging of content with semantic data (tags).

Searching the whole of the Alfresco forums for "semantics" brought up only two threads:

My environment is fairly straightforward:
<ol>I have a repository of ~75GB of proprietary and sensitive information</ol>
<ol>I share this repository with my clients/associates to support a number of strategic and operational business processes</ol>
<ol>The repository is almost exclusively text (pdf, doc/docx) and is unstructured data</ol>
<ol>Effectively, 0% of these documents have been tagged in any way</ol>

So, I wish to be able to:
<ol>Configure an Apache Stanbol server in-house</ol>
<ol>Run my entire repository, or individual folders within it, through Stanbol as a batch</ol>
<ol>Be entirely self-contained with no access to the internet</ol>
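
To make the goal concrete, here is a rough sketch of the kind of batch step I have in mind. It is not a working integration. It assumes a Stanbol launcher running locally on its default port (8080) with the enhancer at `/enhancer`, and that the text has already been extracted from the pdf/doc files into `.txt` files. The function names and the folder layout are made up for illustration, and nothing here touches Alfresco itself.

```python
import urllib.request
from pathlib import Path

# Assumption: a local Stanbol launcher on its default port, no internet access needed.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhance_request(text, url=STANBOL_ENHANCER_URL):
    """Build (but do not send) a POST request carrying plain text to the enhancer."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for the enhancement results as RDF/XML.
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


def enhance_folder(folder, url=STANBOL_ENHANCER_URL):
    """Yield (path, raw enhancer response bytes) for each .txt file under folder."""
    # Hypothetical layout: pre-extracted text files, one per source document.
    for path in sorted(Path(folder).rglob("*.txt")):
        request = build_enhance_request(path.read_text(encoding="utf-8"), url)
        with urllib.request.urlopen(request) as response:
            yield path, response.read()
```

The missing piece, and the part I am really asking about, is turning those enhancement results into tags on the corresponding Alfresco nodes.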

Neither of the threads linked above describes a clear experience of actually integrating Apache Stanbol with Alfresco CE.
In one of these threads, someone stated that Zaizi was working towards an open-source Stanbol/Alfresco solution, but I've not seen any evidence of this.

I understand that, for example, Semantics4Alfresco looks at providing some semantic tagging capability by extending OpenCalais for this purpose, but (again) my restrictions prevent the use of URL-based APIs or any other method that would take data/information out of my secure server space (Internet baaaaad….).

So, here are a few questions:
<ol>Has anyone reading this successfully integrated Apache Stanbol and Alfresco CE?</ol>
<ol>Are you willing to share your development path here or with me privately?</ol>
<ol>Can anyone from Zaizi comment on the status of your Stanbol solution?</ol>

Many thanks and please feel free to PM me if you prefer.