How to check whether the bulk import has really imported all the content?

Question asked by jmjimenezt on Apr 3, 2013
I'm dealing with a problem related to the bulk import of content, and I wonder whether you can see a different way to address it than mine.

Recently I imported a huge amount of content (around 2 million files and folders). The bulk import finished successfully, with just a few unreadable files (they were broken symbolic links).

Now that the import process has finished, I want to verify that it really imported everything. To do that, I took the number of items scanned by the bulk import process (let's say it scanned 300K folders and 1.7M files) and counted the same items in the filesystem. I counted them by different means, such as the Linux commands "find . -type f | wc -l" (and likewise with -type d and -type l) or "tree", just to make sure. I got slightly different numbers: the filesystem has, let's say, 2000 folders and 8000 files more.
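
A minimal sketch of the filesystem-side count, assuming Python is available on the server. The function name count_entries is hypothetical. It classifies symbolic links separately (including broken ones), so the totals can be compared category by category with what the import reports. The starting directory is counted as a folder, as "find . -type d" does. Directories that cannot be listed are silently skipped by os.walk here, so this is best run as a user who can read the whole tree.

```python
import os

def count_entries(root):
    """Count files, directories and symlinks under root without following links."""
    # The root itself is counted as one directory, like "find . -type d" does.
    counts = {"files": 0, "dirs": 1, "symlinks": 0, "broken_symlinks": 0, "other": 0}
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                counts["symlinks"] += 1
                # exists() follows the link, so it is False for a dangling link
                if not os.path.exists(path):
                    counts["broken_symlinks"] += 1
            elif os.path.isdir(path):
                counts["dirs"] += 1
            elif os.path.isfile(path):
                counts["files"] += 1
            else:
                # sockets, FIFOs, device nodes, ...
                counts["other"] += 1
    return counts

if __name__ == "__main__":
    print(count_entries("."))
```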

So I have a bunch of questions. Should these numbers be equal?
How can I figure out which files are missing, given that they do not appear in the bulk import status report?
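
One way to narrow this down, sketched under a big assumption: that a list of the imported paths (relative to the import root) can be obtained from the repository somehow. How to produce that list from Alfresco is not covered here. The function names list_relative_paths and missing_from_repository are hypothetical.

```python
import os

def list_relative_paths(root):
    """Return the set of all file and folder paths under root, relative to root."""
    paths = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            paths.add(os.path.relpath(full, root))
    return paths

def missing_from_repository(fs_paths, repo_paths):
    """Paths present on the filesystem but absent from the repository list."""
    return sorted(set(fs_paths) - set(repo_paths))
```

With fs_paths = list_relative_paths("/path/to/source") and repo_paths loaded from the (assumed) repository export, missing_from_repository(fs_paths, repo_paths) gives the candidates to inspect by hand.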

Could it be that these files cannot be read by the "tomcat" user that runs Alfresco?
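
A rough way to test the permissions hypothesis, assuming the script is run as the same user that runs Alfresco (for example via sudo -u tomcat). The function name find_unreadable is hypothetical. It lists entries that the current user cannot read. Broken symbolic links also show up, because the check follows links.

```python
import os

def find_unreadable(root):
    """Return sorted paths under root that the current user cannot read."""
    unreadable = set()

    def on_error(err):
        # os.walk calls this when a directory cannot be listed
        if err.filename:
            unreadable.add(err.filename)

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # listing and entering a directory needs both read and execute
            if not os.access(path, os.R_OK | os.X_OK):
                unreadable.add(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                unreadable.add(path)
    return sorted(unreadable)

if __name__ == "__main__":
    for path in find_unreadable("."):
        print(path)
```

Note that running this as root would report almost nothing, since root bypasses the permission checks.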

I guess the main question here is: when an import process is done in Alfresco, is there any way to check whether the number of files and folders in the filesystem and in Alfresco are the same?

