Alfresco Hybrid Sync

Blog Post created by davidcognite on Sep 19, 2012

It was back in February that I first started working on Hybrid Sync, helping to get a demo together for an internal company event. Momentum has increased since then, along with the team size, and the last few months have seen a lot of effort from all across the engineering team to fix some 'interesting' problems relating to document sync. The team have worked hard and I'm really pleased with the result.

The major use-case we were asked to address was Alfresco users in the enterprise wanting to collaborate with individuals outside their firewall using Alfresco in the cloud - but I can see countless other situations where being able to sync content off-site and into the cloud would be very useful. I think particularly that I'll use the new quick share functionality to share content and use sync to ensure that external individuals always have access to the latest version.

I'll let you find out more about the marketing yourself (see press release and comments from Paul Hampton and Kathleen Reidy), and instead I'll look at how it works technically.

How does it work technically?

The first step in a sync is authenticating with Alfresco in the Cloud. This is done through the new Cloud Sync panel on a user's profile page or through an inline prompt when first asking to set up a sync. In both cases, the details are checked when you enter them (so you can't save a bad password) and are then saved in a new, secure credentials store for future sync actions. In the future, we expect to replace this with an OAuth implementation.

You can sync to any account you have access to in the cloud. An on-premise user can only store the credentials for one cloud account, but a cloud account may have multiple on-premise users connecting to it. Any modifications made in the cloud will be pulled down as the on-premise user and any changes made on-premise will be pushed as the cloud user, so the UI at either end will only ever show the local user.
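To make the credential rules concrete, here is a minimal sketch of a store that validates details before saving them and keeps exactly one cloud account per on-premise user. The class name, method names and the `verify` callback are all hypothetical and are not Alfresco's API. The real store is secure, whereas this one keeps values in memory purely for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CloudCredentialStore:
    """Hypothetical in-memory store: one cloud account per on-premise user."""

    def __init__(self, verify: Callable[[str, str], bool]) -> None:
        # verify(cloud_account, password) stands in for a login check against the cloud.
        self._verify = verify
        self._by_user: Dict[str, Tuple[str, str]] = {}

    def save(self, local_user: str, cloud_account: str, password: str) -> bool:
        """Store credentials only if they pass the check; bad passwords are never saved."""
        if not self._verify(cloud_account, password):
            return False
        # Saving again replaces the previous entry, so a user never holds two accounts.
        self._by_user[local_user] = (cloud_account, password)
        return True

    def cloud_account_for(self, local_user: str) -> Optional[str]:
        entry = self._by_user.get(local_user)
        return entry[0] if entry else None

    def local_users_for(self, cloud_account: str) -> List[str]:
        """Several on-premise users may connect to the same cloud account."""
        return sorted(
            user for user, (account, _) in self._by_user.items() if account == cloud_account
        )


# Illustrative data only.
known_accounts = {"alice@example.com": "s3cret"}
store = CloudCredentialStore(lambda account, password: known_accounts.get(account) == password)

rejected = store.save("alice", "alice@example.com", "wrong")
store.save("alice", "alice@example.com", "s3cret")
store.save("bob", "alice@example.com", "s3cret")
```

Keying the store by on-premise user gives the one-account-per-user rule for free, while the reverse lookup shows that nothing stops several on-premise users sharing one cloud account.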

Sync Sets:
The next step is to select files for syncing. This is done through the on-premise Document Library (all sync actions are initiated on-premise). You can either multi-select files or choose to sync them individually through the appropriate Doc Lib actions. Under the covers, Alfresco has the concept of sync sets: these are groups of files that sync to a particular location. If you've selected multiple files then they will be grouped together as a single sync set, otherwise a sync set containing a single node will be created. A sync set is owned by the user that created it and will use that user's cloud credentials to sync.
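The grouping rule can be sketched as a small data structure. This is an illustration of the concept only: `SyncSet`, `create_sync_set` and the field names are made up for this example and do not mirror the real repository classes.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class SyncSet:
    """Hypothetical model of a sync set: nodes that sync to one cloud location."""
    owner: str            # on-premise user whose cloud credentials are used
    target_folder: str    # cloud location the set syncs to
    members: List[str] = field(default_factory=list)
    set_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def create_sync_set(owner: str, selected_nodes: List[str], target_folder: str) -> SyncSet:
    """Group everything selected in one action into a single sync set."""
    if not selected_nodes:
        raise ValueError("at least one node must be selected")
    return SyncSet(owner=owner, target_folder=target_folder, members=list(selected_nodes))


# A multi-select produces one set with several members...
multi = create_sync_set("alice", ["report.docx", "budget.xlsx"], "cloud:/Shared/Project")
# ...while syncing a single file produces a set containing just that node.
single = create_sync_set("alice", ["notes.txt"], "cloud:/Shared/Project")
```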

Pushing and pulling of changes is done on a sync set by sync set basis, but currently all sync management needs to be done on an individual file basis and the sync sets themselves are managed by the system. Admin/User level sync set management features are planned for a future release. If you have synced a folder then any sync management actions (e.g. unsync) need to be done on that folder, and not its implicitly synced children.

Change Audit:
Once a node is added to a sync set, it will have a sync:syncSetMemberNode marker aspect applied to it and all changes to that node or to sync set membership will trigger an entry in the audit log. The first entry will be the addition of that node to the sync set and this will trigger a node creation on the remote system. Periodically (every 10 seconds) this audit log is checked for changes and any sync changes found are sent to the remote server. On Alfresco in the cloud, changes to synced content are also stored in an audit log and are pulled by the local system every 60 seconds. All changes are aggregated together, meaning that only the final state of the node is pushed and versions are not created for interim states. However, a version is explicitly created every time a change is pushed or pulled, even if it is only a property change and your local repo is set to not version on property changes.
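The aggregation step is easier to see in code. The sketch below assumes a very simple audit entry shape (a node id plus the properties that changed) and a hypothetical `push` callback. The real audit records and transport are more involved, and only the two polling intervals come from the description above.

```python
from typing import Callable, Dict, List, Tuple

PUSH_INTERVAL_SECONDS = 10   # on-premise audit log check
PULL_INTERVAL_SECONDS = 60   # pull of changes recorded in the cloud

# (node id, changed properties): an illustrative shape, not the real audit record.
AuditEntry = Tuple[str, Dict[str, str]]


def aggregate(entries: List[AuditEntry]) -> Dict[str, Dict[str, str]]:
    """Collapse a chronological batch of entries into one final state per node."""
    pending: Dict[str, Dict[str, str]] = {}
    for node_id, changes in entries:
        # Later values overwrite earlier ones, so interim states are dropped.
        pending.setdefault(node_id, {}).update(changes)
    return pending


def push_pending(audit_log: List[AuditEntry],
                 push: Callable[[str, Dict[str, str]], None]) -> int:
    """Send one aggregated change per node, then clear the processed entries."""
    pending = aggregate(audit_log)
    for node_id, state in pending.items():
        push(node_id, state)
    audit_log.clear()
    return len(pending)


log: List[AuditEntry] = [
    ("doc-1", {"cm:title": "Draft"}),
    ("doc-2", {"cm:title": "Plan"}),
    ("doc-1", {"cm:title": "Final", "cm:description": "Q3 report"}),
]
sent: List[Tuple[str, Dict[str, str]]] = []
pushed = push_pending(log, lambda node_id, state: sent.append((node_id, state)))
```

Three audit entries become two pushes here, and `doc-1` goes out with its final title only. A scheduler would call `push_pending` once per push interval.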

Conflicts and Version handling:
Occasionally you will find that a node has changed in two places at the same time. There are lots of complicated ways that could be used to mark and resolve conflicts, but we've taken the approach that we want to simplify this process. Being aware that the small poll period makes conflicts unlikely and that the major use case is to enable collaboration in the cloud, we have implemented a 'Cloud wins' approach to conflict resolution. If a file changes at both ends simultaneously then the on-premise node will be versioned and will then be overwritten with the node from the cloud. To see the changes you can view the version history and revert or update as necessary.
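A minimal sketch of the 'Cloud wins' rule follows, assuming a toy `Node` with plain string content and a list for its version history. These names are hypothetical and the real versioning service is not shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy on-premise node; version_history holds earlier and pulled states."""
    content: str
    version_history: List[str] = field(default_factory=list)


def apply_cloud_change(local: Node, cloud_content: str, local_changed: bool) -> Node:
    """Pull a cloud change onto the on-premise node using 'Cloud wins'."""
    if local_changed:
        # Conflict: keep the on-premise edit as a version before it is overwritten.
        local.version_history.append(local.content)
    local.content = cloud_content
    # A version is created for every pulled change.
    local.version_history.append(cloud_content)
    return local


node = Node(content="on-premise edit")
apply_cloud_change(node, "cloud edit", local_changed=True)
```

After the call the node holds the cloud content, and the overwritten on-premise edit is still sitting in the version history, ready to be reverted to.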

What gets synced:
The node content is synced, along with any properties from common content models (these can be found in sync-service-context.xml under the propertiesToTrack config section). If the file is being synced as part of a folder sync, then the directory structure will also be synced (i.e. the structure in the cloud will remain the same as the structure on-premise), but if you've just synced the file individually, then just that node is synced and you can move it around without affecting the sync.
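In effect the tracked property list acts as a whitelist. The sketch below shows the idea. The set of property names is an illustrative stand-in chosen for this example, not the actual contents of propertiesToTrack, and `acme:projectCode` is an invented custom property.

```python
from typing import Dict, Set

# Illustrative stand-in only; the real list lives in sync-service-context.xml.
TRACKED_PROPERTIES: Set[str] = {"cm:name", "cm:title", "cm:description"}


def properties_to_sync(node_properties: Dict[str, str],
                       tracked: Set[str] = TRACKED_PROPERTIES) -> Dict[str, str]:
    """Keep only the properties that appear in the tracked list."""
    return {name: value for name, value in node_properties.items() if name in tracked}


node_properties = {
    "cm:name": "report.docx",
    "cm:title": "Quarterly report",
    "acme:projectCode": "P-1234",   # hypothetical custom model property
}
synced = properties_to_sync(node_properties)
```

Here the custom `acme:projectCode` value stays on-premise because it is not in the tracked set.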

When you unsync, the sync:syncSetMemberNode aspect is removed and a record is added to the audit log (deleting a node on-premise also triggers this). When the audit log gets queried for changes, this node removal is pushed to the cloud just like any other change and will optionally delete the version in the cloud.

When a node cannot be synced, an error aspect gets applied. The error may be transient (e.g. comms failure with the cloud or authentication failure) or may require user interaction to solve (e.g. a name conflict in the sync folder, or a permissions change that means the target folder is no longer writable). In either case, a marker aspect gets applied (sync:transientError or sync:failed respectively), the user is notified through Doc Lib indicators and banners, and the aspect also affects how the audit works. When a transient error is hit, the audit log continues to keep a record of changes and the system will automatically recover once the problem is fixed. When a hard error is hit, the audit log clears its current entries, stops recording changes and will only start again when the user manually requests a sync. A sync request triggers a full push of the node if the node has an error on it. If the push of that node fails again, the error aspect will be reapplied.
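The two error paths behave like a small state machine, sketched below. Only the aspect names come from the description above. The class, its methods and the `push_full_node` callback are hypothetical, and treating a sync request on a healthy node as a no-op is an assumption made for this example.

```python
from enum import Enum
from typing import Callable, List, Optional


class SyncError(Enum):
    TRANSIENT = "sync:transientError"
    FAILED = "sync:failed"


class SyncedNodeAudit:
    """Hypothetical per-node view of the audit behaviour under errors."""

    def __init__(self) -> None:
        self.error: Optional[SyncError] = None
        self.entries: List[str] = []
        self.recording = True

    def record_change(self, change: str) -> None:
        if self.recording:
            self.entries.append(change)

    def mark_error(self, error: SyncError) -> None:
        self.error = error
        if error is SyncError.FAILED:
            # Hard error: drop current entries and stop recording.
            self.entries.clear()
            self.recording = False

    def transient_recovered(self) -> None:
        # Transient errors clear on their own; recorded changes are kept.
        if self.error is SyncError.TRANSIENT:
            self.error = None

    def request_sync(self, push_full_node: Callable[[], bool]) -> bool:
        """Manual sync request: a node with an error gets a full push."""
        if self.error is None:
            return True  # assumption: nothing to recover from
        previous = self.error
        if push_full_node():
            self.error = None
            self.recording = True
            return True
        self.mark_error(previous)  # the push failed again, so reapply the error
        return False


audit = SyncedNodeAudit()
audit.record_change("title edited")
audit.mark_error(SyncError.TRANSIENT)
audit.record_change("content edited")   # still recorded during a transient error
kept_during_transient = list(audit.entries)

audit.mark_error(SyncError.FAILED)
audit.record_change("ignored edit")     # not recorded after a hard error
retry_failed = audit.request_sync(lambda: False)
retry_ok = audit.request_sync(lambda: True)
```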

For more information on setting up sync, see the documentation and FAQ.

That was a brief overview of some of the technical concepts that Alfresco developers and implementers may want to know about. Let me know in the comments if you'd like a more in-depth look at any part or if you've got any technical questions about this new feature.