
XML metadata extractor doesn't work with DOCTYPE

Question asked by kilo on Jan 29, 2010
Latest reply on Jul 9, 2010 by syspro
Hello,

I noticed that my XML metadata extractor doesn't work when the XML content has a <!DOCTYPE > declaration. I'm using the built-in XML extractor, just like the sample. I get this log message:


No working metadata extractor could be found:
   Document: ContentAccessor[ contentUrl=store://2010/1/29/17/54/eed0a6a1-49b3-4863-b3aa-4edfa0fef55d.bin, mimetype=text/xml, size=4371, encoding=UTF-8, locale=en_US]
17:54:10,870 INFO  [STDOUT] 17:54:10,870 User:admin DEBUG [metadata.xml.XPathMetadataExtracter]
XML metadata extractor redirected:
   Reader:    ContentAccessor[ contentUrl=store://2010/1/29/17/54/eed0a6a1-49b3-4863-b3aa-4edfa0fef55d.bin, mimetype=text/xml, size=4371, encoding=UTF-8, locale=en_US]
   Extracter: null

This seems strange, since everything works as expected without <!DOCTYPE > in the XML document. Is this because the built-in extractor is trying to validate the XML document?

Is there any option to disable <!DOCTYPE > interpretation in the built-in extractor?
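
For reference, here is a minimal standalone sketch of what "disabling DOCTYPE interpretation" could look like at the parser level. It uses plain JAXP and is not Alfresco code. The class name DoctypeDemo, the method rootNameIgnoringDtd, and the example.invalid DTD URL are made up for illustration. Whether the built-in extractor exposes a hook for something like this is exactly what I don't know.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DoctypeDemo {
    // Hypothetical helper: parses the XML without fetching the external DTD
    // named in <!DOCTYPE ...>, and returns the root element name.
    public static String rootNameIgnoringDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // do not validate against the DTD

        DocumentBuilder builder = factory.newDocumentBuilder();
        // Even a non-validating parser normally tries to load the external DTD.
        // Answer every external entity request with empty content instead.
        builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));

        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // The DTD URL is a placeholder that cannot be resolved.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE note SYSTEM \"http://example.invalid/note.dtd\">\n"
                + "<note><to>kilo</to></note>";
        System.out.println(rootNameIgnoringDtd(xml));
    }
}
```

Without the entity resolver, a parser like this would typically try to download the DTD and fail if it is unreachable, which may or may not be what happens inside the extractor.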

Thank you. I would appreciate any suggestions.
