
Architectural discussion: how to integrate documents from multiple different systems into Alfresco

Question asked by huima on Jan 31, 2014
Latest reply on Jan 31, 2014 by mitpatoliya
Hi all,

This is a thread for sharing real-life experiences: which architectural designs and options have you seen or implemented in organisations for integrating documents and document flows from multiple different systems into Alfresco?

What we are doing is essentially using Alfresco as a content hub in an organisation and publishing documents from different system silos into one ECM-platform.

The best scenario in a perfect world would be:

Applications <—> ESB <—> Alfresco

All the integration logic and transformations would live on one platform, and the people managing it would stay in control of the flow of information. But due to technical limitations in the integration platform we have, we currently do integrations in several different ways.

1) Bulk imports (so not 'integration' per se) for large, one-off batch transfers, using the Alfresco bulk import tools

2) Webscript-based APIs that applications (or in this case an ESB) can call to push files with metadata into Alfresco via HTTP POST

3) A custom filesystem-based integration script, which reads an XML file for metadata, accesses the referenced binary files through a shared NFS drive, and then imports those files and their metadata into the Alfresco repository
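To make option 3 a bit more concrete, here is a minimal Python sketch of the manifest-reading half of such an import script. It is only an illustration, not our actual script: the XML layout (`<documents>`, `<document>`, `<file>`, `<property name="...">`) and the function name `read_manifest` are assumptions made up for this example, and the step that actually writes into the repository is left out.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def read_manifest(manifest_path, share_root):
    """Yield (binary_path, metadata) for each document listed in the manifest.

    The XML layout is hypothetical and exists only for this example.
    """
    share_root = Path(share_root)
    tree = ET.parse(manifest_path)
    for doc in tree.getroot().iter("document"):
        rel_path = doc.findtext("file")
        if not rel_path:
            raise ValueError("document entry without a <file> element")
        binary_path = share_root / rel_path
        if not binary_path.is_file():
            # Fail early rather than importing metadata without content.
            raise FileNotFoundError(binary_path)
        # Collect <property name="...">value</property> pairs as plain strings.
        metadata = {
            prop.get("name"): (prop.text or "").strip()
            for prop in doc.iter("property")
        }
        yield binary_path, metadata
```

Checking that every referenced binary exists on the shared drive before anything is imported keeps a half-written batch from ending up in the repository as metadata without content.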

Number 1 and number 3 both bypass the ESB, as the platform is currently constrained and can't process files in large batches. Naturally this means that integration logic and responsibilities easily end up divided across multiple places, which is less than ideal.

Our goal with the different architectural choices and solutions has been to create a setup where the team managing the Alfresco ECM system retains control over how and where imports happen, as they also know the data model and solution dependencies best.

1) In bulk imports, different metadata mappings prepare and filter the metadata to be used in in-place imports. The Alfresco team knows best how to prepare the metadata correctly.

2) Webscript-based APIs decouple the systems that publish documents and metadata from the actual data model and repository structure. The Alfresco team can adjust the API implementations if the data model or repository structure changes.

3) Custom import scripts contain the integration logic and actually do work that should sit closer to the integration platform, or actually be fully the responsibility of the integration platform
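The decoupling idea in point 2 can be sketched roughly like this. The sketch is in Python purely for illustration (a real webscript would be written in JavaScript or Java), and the publisher-facing field names and the function name are hypothetical. `cm:title`, `cm:description` and `cm:author` are properties from the standard Alfresco content model.

```python
# Hypothetical mapping from publisher-facing field names (left) to repository
# property names (right). Only this table needs to change when the data model
# changes; the publishing systems keep sending the same fields.
FIELD_MAP = {
    "title": "cm:title",
    "summary": "cm:description",
    "owner": "cm:author",
}


def to_repository_properties(payload, field_map=FIELD_MAP):
    """Translate publisher metadata into repository properties.

    Unknown fields are rejected so that a mismatch surfaces as an explicit
    error instead of silently dropped metadata.
    """
    unknown = sorted(set(payload) - set(field_map))
    if unknown:
        raise ValueError("unsupported metadata fields: " + ", ".join(unknown))
    return {field_map[name]: value for name, value in payload.items()}
```

The point is that the publishing systems only ever see the left-hand names, so the Alfresco team can rename or restructure properties behind the API without touching the callers.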

In some distant, but hopefully not too distant, future the integration platform will have its technical constraints removed and will also gain CMIS capability, so that the custom import scripts can be migrated fully to functionality in the integration platform.

Now, for the discussion and sharing ideas.

What kinds of architectures have you built or witnessed, what was the thinking behind those choices, and what have you learned from those experiences? Alfresco is such a versatile system that there is no one right way to do things, so it would be interesting to share these different approaches and the ideas behind those decisions.

Looking forward to hearing your thoughts.