How to define the encoding of text documents uploaded through CIFS?

Question asked by olange on May 20, 2009
Is it possible to change the character encoding on the fly, for a given folder, for text documents being uploaded through a CIFS-attached network drive? If so, how would I do that? If not, is there a workaround?

We're evaluating Alfresco 3 and came across the following issue: when searching for words with accented characters, text documents whose contents actually match do not show up in the results.

We figured out that they were uploaded to Alfresco through a CIFS-attached network drive and were saved with our default Windows XP encoding (cp1252 / Windows Swiss French), which does not match the default encoding of the CIFS connector (UTF-8). I assume this mismatch is the cause of the issue: Alfresco tries to read the cp1252-encoded documents as if they were UTF-8 encoded and, of course, does not correctly index words with accented characters.
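To illustrate the suspected mismatch outside of Alfresco, here is a minimal Python sketch of what happens when cp1252 bytes are read as UTF-8. It only demonstrates the encoding problem itself; it makes no claim about how Alfresco or its CIFS connector decodes content internally. The sample word and the helper name `decode_as_utf8` are made up for the example.

```python
# Illustration only: how cp1252 bytes behave when a reader assumes UTF-8.
# The sample word and the helper name are hypothetical.
text = "café"

cp1252_bytes = text.encode("cp1252")  # 'é' is the single byte 0xE9
utf8_bytes = text.encode("utf-8")     # 'é' is the two bytes 0xC3 0xA9


def decode_as_utf8(data: bytes) -> str:
    """Decode as UTF-8, replacing invalid sequences with U+FFFD."""
    return data.decode("utf-8", errors="replace")


misread = decode_as_utf8(cp1252_bytes)

# The accented character is lost, so a search for the original word
# can no longer match the decoded text.
print(repr(misread))
print(text in misread)
```

A strict UTF-8 decoder would raise an error on the same bytes instead of substituting a replacement character; either way the accented word does not survive.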

Our CIFS connector is configured to use UTF-8 by default. Most of our text documents are edited and saved on workstations running Windows XP in Swiss French, and hence use cp1252 as their encoding. Although I imagine we could reconfigure CIFS to use cp1252 as its default, we would still need to support other encodings (developers will create UTF-8 XML text documents, for instance).
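One possible client-side workaround, sketched here under assumptions, would be to normalize files to UTF-8 before copying them to the network drive. This is not an Alfresco feature, and the function name `to_utf8` is hypothetical. The heuristic is simple: bytes that already decode as valid UTF-8 are left alone, and anything else is assumed to be cp1252.

```python
def to_utf8(data: bytes, fallback: str = "cp1252") -> bytes:
    """Return data re-encoded as UTF-8 (hypothetical helper).

    Heuristic: if the bytes are already valid UTF-8, keep them unchanged;
    otherwise assume they are in the fallback encoding and transcode.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        return data.decode(fallback).encode("utf-8")


# A cp1252 document is transcoded; a UTF-8 document passes through as-is.
legacy = "été".encode("cp1252")
modern = "été".encode("utf-8")
print(to_utf8(legacy) == modern)
print(to_utf8(modern) == modern)
```

The heuristic can misclassify a cp1252 file whose bytes happen to form valid UTF-8, and it does nothing for documents that are already stored in the repository, so it would only cover part of the requirement.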

Note: this question is similar to an unanswered question, "modify the encoding of a new upload file", posted to the Alfresco Discussion list by zen on May 7th, 2009.