Integrating Alfresco with an Accumulo backend

Question asked by dansh on Mar 22, 2016
Hi everyone, I am working on a new project that uses Accumulo (a big-data key/value store) as its backend. We are building on an existing system that uses a specific table design in Accumulo to store and access data. The values in this key/value store will mostly be documents (PDF, XML, .doc, etc.).

Basically, we want to use Alfresco as a nice UI for reading these documents out of the datastore (with the possibility of full CRUD somewhere down the road). We are considering two approaches:

1. Modify the Alfresco backend to read from our Accumulo database instead of the out-of-the-box repositories.
Pros: no worries about keeping things in sync, no duplication of data.
Cons: could be a prohibitively large amount of work. Creates a dependency on the database design. Could run into ACID problems if we allow full CRUD.

2. Use an external service to pull the data from Accumulo and push it into Alfresco using CMIS.
Pros: probably much less effort.
Cons: we have to keep things in sync. How often do we run the job? Do we just trash all the old content and do a full refresh, or try to find the diff and update only what changed? Data will be duplicated (in the Accumulo tables and in Alfresco's file system), and the amount of data could be massive.
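
To make the "find the diff" option concrete, here is a rough Python sketch of the comparison step only. It assumes the sync job can get a map of row ID to document bytes from Accumulo and a map of row ID to a previously stored content digest from Alfresco (for example, kept as a custom property on each document). The names digest, SyncPlan and plan_sync are made up for illustration, and no real Accumulo or CMIS calls are shown.

```python
import hashlib
from dataclasses import dataclass, field


def digest(content: bytes) -> str:
    """Return a stable fingerprint of a document's bytes."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class SyncPlan:
    """Row IDs grouped by the action the sync job would take in Alfresco."""
    create: list = field(default_factory=list)
    update: list = field(default_factory=list)
    delete: list = field(default_factory=list)


def plan_sync(source: dict, target_digests: dict) -> SyncPlan:
    """Compare the source documents with the digests already in the target.

    source:         row ID -> document bytes (assumed to come from Accumulo)
    target_digests: row ID -> digest stored with the copy in Alfresco (assumed)
    """
    plan = SyncPlan()
    for row_id in sorted(source):
        known = target_digests.get(row_id)
        if known is None:
            plan.create.append(row_id)      # not in the target yet
        elif known != digest(source[row_id]):
            plan.update.append(row_id)      # content changed since last sync
    # Anything the target still has but the source no longer does
    plan.delete = sorted(r for r in target_digests if r not in source)
    return plan


# Hypothetical data standing in for the two systems
accumulo_docs = {"doc1": b"v1", "doc2": b"v2-changed", "doc3": b"new"}
alfresco_digests = {
    "doc1": digest(b"v1"),
    "doc2": digest(b"v2"),
    "doc4": digest(b"gone"),
}

plan = plan_sync(accumulo_docs, alfresco_digests)
print(plan.create, plan.update, plan.delete)
```

Hashing every document means reading all the content on each run, which may be too slow at our volumes. Comparing the timestamps that Accumulo keeps on each key against a "last synced" time might be a cheaper way to detect changes, but we have not tried it.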

Has anyone had experience with something like this? We are very open to alternative approaches, or even to alternatives to Alfresco.