
Document compression

Question asked by fschnell on Sep 18, 2008
Latest reply on Mar 3, 2011 by mrogers

We are currently implementing Alfresco as a replacement for our document management system and also for an existing filesystem. The latter holds quite a large number of files (~1 million). As a storage-space optimisation, I was wondering why Alfresco does not come with one or both of the following features:

1) Compress a file's content after it has been indexed. That would save quite a bit of space. On a read request, the document would be decompressed and kept in a cache for some time before being evicted.
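To illustrate what I mean, here is a minimal sketch (hypothetical class names, not Alfresco's actual API): content is gzip-compressed when stored, and reads decompress it and keep the result in a small LRU cache so repeated reads avoid repeated decompression.

```java
import java.io.*;
import java.util.*;
import java.util.zip.*;

// Hypothetical sketch, not Alfresco's storage layer: compress content
// on store, decompress on read, cache decompressed bytes (LRU).
public class CompressedStore {
    private final Map<String, byte[]> compressed = new HashMap<>();
    private final int maxCacheEntries = 100;
    // access-ordered LinkedHashMap acts as a simple LRU cache
    private final Map<String, byte[]> cache =
        new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> e) {
                return size() > maxCacheEntries;
            }
        };

    public void store(String id, byte[] content) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(content);                // keep only the compressed form
        }
        compressed.put(id, bos.toByteArray());
    }

    public byte[] read(String id) throws IOException {
        byte[] hit = cache.get(id);
        if (hit != null) return hit;          // served from the cache
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(
                new ByteArrayInputStream(compressed.get(id)))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) > 0) out.write(buf, 0, n);
        }
        byte[] content = out.toByteArray();
        cache.put(id, content);               // cache for subsequent reads
        return content;
    }
}
```

In a real system the cache would of course be bounded by memory size and a time-to-live rather than a fixed entry count, but the principle is the same.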

2) On our filesystem we happen to have many duplicate files. These cannot be deleted, as they arrived as contractual deliveries. I was thinking Alfresco could store a hash value per file in its metadata; that way a file would need to be stored only once, even though it may appear many times in different spaces. Only if a file were edited would it be stored as a second copy, since its hash value would change.
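The idea is essentially content-addressable storage. A rough sketch of what I have in mind (again hypothetical names, not claiming this is how Alfresco works internally): each file's bytes are hashed with SHA-256, identical content is stored once, and every logical path just references the hash.

```java
import java.security.MessageDigest;
import java.util.*;

// Hypothetical sketch of hash-based deduplication: one physical copy
// per distinct content hash, many logical paths referencing it.
public class DedupStore {
    private final Map<String, byte[]> blobs = new HashMap<>(); // hash -> content
    private final Map<String, String> index = new HashMap<>(); // path -> hash

    public void put(String path, byte[] content) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(content)) sb.append(String.format("%02x", b));
        String hash = sb.toString();
        blobs.putIfAbsent(hash, content); // store the bytes only once
        index.put(path, hash);            // duplicates become mere references
    }

    public byte[] get(String path) {
        return blobs.get(index.get(path));
    }

    public int storedCopies() {
        return blobs.size();
    }
}
```

Editing a file changes its hash, so the edited version naturally becomes a second stored copy while the original remains shared by the other references.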

Does this make sense? Is it conceivable that something like this could be implemented? How do others cope with the problem of data duplication?

Thanks for your valuable feedback