Running Benchmark Applications: Alfresco Data Load

Document created by derek Employee on Apr 22, 2015
Version 1

What it Does


  • Create Alfresco sites, create site members, load site document libraries with folders and files.  See the javadoc for the SiteFolderLoader to calculate how many folders and files will be created.
  • Create RM site, RM users (Alpha)
  • Record, in detail, any errors that occur during individual API calls
  • Record data creation times


Prerequisites


Use the Benchmark Testing with Alfresco page for version compatibility.


  • Java 1.7.0_51 or later
  • MongoDB 2.6.3 or later installed and running on port 27017 on some server: <mongo-host>
  • A compatible version of the Benchmark Server running on Tomcat7 on port 9080: <bmserver-host>
  • Alfresco with /alfresco available: <alfresco-host>.  The public API endpoints, including the CMIS browser binding URL, must be available.
  • A user data mirror reflecting the list of users that can be used by the test.  See the Alfresco Sign Up test on how to create users in Alfresco.
  • Validate the server API availability using the <alfresco-url>/alfresco/api/-default-/public/cmis/versions/1.1/browser URL with the administrator login.
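The availability check in the last prerequisite can also be scripted. The following is a minimal sketch, not part of the benchmark tooling: the function names are hypothetical, and it assumes the server accepts HTTP Basic authentication for the administrator login.

```python
import base64
import urllib.request

# Path taken from the prerequisite above; everything else here is illustrative.
CMIS_BROWSER_PATH = "/alfresco/api/-default-/public/cmis/versions/1.1/browser"


def cmis_browser_url(alfresco_url: str) -> str:
    """Join the server base URL and the CMIS 1.1 browser binding path."""
    return alfresco_url.rstrip("/") + CMIS_BROWSER_PATH


def check_cmis_available(alfresco_url: str, username: str, password: str,
                         timeout: float = 10.0) -> bool:
    """Return True if the browser binding URL answers with HTTP 200.

    Assumption: the server accepts HTTP Basic authentication.
    """
    request = urllib.request.Request(cmis_browser_url(alfresco_url))
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses: unreachable host,
        # refused connection, 401, 404 and similar all count as "not available".
        return False
```

For example, `check_cmis_available("http://<alfresco-host>:8080", "admin", "<password>")` should return True before a test run is attempted.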


Deploying


Local Deployment


  • Check out the required tag or branch of the dataload test source code.
~ > svn checkout https://svn.alfresco.com/repos/alfresco-open-mirror/benchmark/tests/dataload//tags/V2.2 dataload
~ > cd dataload

  • Build and start a local Tomcat7 instance
~/dataload > mvn tomcat7:run -Dmongo.config.host=<mongo-host>


...
[INFO] Running war on http://localhost:9086/alfresco-benchmark-tests-dataload-2.2
[INFO] Creating Tomcat server configuration at c:\work\projects\benchmark\tests\dataload\tags\V2.2\target\tomcat
...
06:43:48,691 [localhost-startStop-1] [ INFO] [                 MongoClientFactory: 124] - New MongoDB client created using URL: mongodb://localhost/?co...
...
06:43:49,063 [localhost-startStop-1] [DEBUG] [                LifecycleController: 174] - Started components: appLifeCycleController
Feb 03, 2015 6:43:49 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ['http-bio-9086']

Remote Deployment


  • Set up a Tomcat7 load driver instance with the manager application listening on port 9080 and configure your Maven settings with the manager application credentials.
  • Deploy the application directly into the load driver:
mvn tomcat7:redeploy -DskipTests  -Dbm.tomcat.ip=<bmdriver-host> -Dbm.tomcat.port=9080 -Dbm.tomcat.server=bm-remote

  • Connect to the driver server and check that the application was successfully deployed.
 http://bmdriver-host:9080/manager

  • You can start/stop/undeploy the driver applications as required


Create and Start a Test Run


The example given here assumes a setup in which several test runs will be made against a single server for investigative purposes.


  • Connect to the Benchmark Server
http://bmserver-host:9080/alfresco-benchmark-server

  • Create a new test, DATALOAD_V22 using alfresco-benchmark-tests-dataload-2.2 Schema:0 or whichever version you deployed
  • The property edit page is displayed, but can also be accessed using the gears icon.
  • Click on the Driver Details box, which will show details of all compatible driver(s) connected to the same configuration database.
  • The following properties should be set for all new tests:

| Section | Property | Description |
|---|---|---|
| MongoDB Connection | mongo.test.host | The hostname of a MongoDB server where the test results and general working data will be stored.  This should be an IP address that is visible to all load drivers and may be the same MongoDB instance as the configuration database.  This value must always be set for new tests, as there is no working default.  A value of 'localhost' is dangerous and should not be used except for local testing. |
| Test Controls | Test Duration | Maximum time for the test to run.  This acts as an emergency shut-off if the test is unable to complete the desired operations, e.g. if the server-in-test fails to respond to new session requests and each session only times out after 5s. |
| Test Controls | Test Duration Unit | The unit in which the Test Duration is expressed: SECONDS, MINUTES, HOURS or DAYS. |
| Alfresco Server Details | Alfresco port | Use 8080 for a default Alfresco server install. |
| Alfresco Server Details | Alfresco host | The server host name.  It is important to use the same host name as used by the Alfresco Sign Up test so that the user data mirror name matches.  As mentioned before, this value can be set once at the test level when performing repeated runs against the same server.  The hostname must be visible to the load driver instances where the tests are deployed. |
| Load Details | Sites Count | The number of sites to create on the server.  This is the target number; the test only creates sites up to this number even when rerun. |
| Load Details | Users per Site | The number of site members per site, including the site manager.  This is a target number and nothing is done if the site members already exist. |
| Files and Folders | Maximum Active Loaders | The maximum number of file-folder loading sessions to run concurrently.  Use this to limit the resources consumed on the Alfresco side. |
| Files and Folders | Folder Depth | The maximum folder depth to achieve within each site.  A document library containing only files has a Folder Depth of zero. |
| Files and Folders | Subfolder Count | The number of folders that a loader will create when increasing the document library depth.  A folder's subfolders are created using a random site user in a single CMIS session. |
| Files and Folders | Files per Folder | The number of files to add to every folder, including the document library itself.  This is a remote upload of a real document using CMIS.  A folder's files are created using a random site user in a single CMIS session. |
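To get a rough feel for how the Load Details and Files and Folders properties combine, the sketch below estimates the totals for a run. It is an illustration only: it assumes that every folder at each level receives exactly Subfolder Count subfolders down to Folder Depth, and that every folder (plus the document library itself) receives Files per Folder files. The SiteFolderLoader javadoc remains the authoritative source for the real calculation, and the function name is hypothetical.

```python
def estimate_load(sites: int, folder_depth: int, subfolder_count: int,
                  files_per_folder: int) -> dict:
    """Estimate total folders and files, assuming a fully populated tree.

    Assumption (not taken from the SiteFolderLoader javadoc): level 0 is the
    document library, and level k holds subfolder_count ** k folders.
    """
    # Folders below the document library: n + n^2 + ... + n^depth.
    folders_per_site = sum(subfolder_count ** level
                           for level in range(1, folder_depth + 1))
    # Every folder gets files, and so does the document library itself (+1).
    files_per_site = files_per_folder * (1 + folders_per_site)
    return {
        "folders": sites * folders_per_site,
        "files": sites * files_per_site,
    }
```

Under these assumptions, 10 sites with a Folder Depth of 2, a Subfolder Count of 3 and 5 Files per Folder give 12 folders and 65 files per site, or 120 folders and 650 files in total.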

In order to conserve bandwidth and reduce server-side storage requirements, it is possible to perform file spoofing in Alfresco V5.1 (nightly builds as of April 2015).  If the number of files per folder is high, then each call to spoof the folder's files can take some time (30s for 1000 files per loader, for example).  It may be necessary to increase the HTTP connection timeout or reduce the number of files per folder to compensate.
The following properties control file spoofing:

| Section | Property | Description |
|---|---|---|
| File Spoofing | Spoof File Creation | When false, the usual single file upload via CMIS will be used, giving a realistic CMIS upload simulation.  To have the server generate plain text documents instead, enable spoofing by setting this to true. |
| File Spoofing | Force Binary Storage | By default the Alfresco server will generate a consistent text document each time the content is requested by an API, SOLR or Share.  If this setting is true, then the generated text documents will be written through to the content store and stored in the usual manner.  Apart from stressing the disk IO, storage or backup mechanisms, there is no compelling reason to change this setting. |
| File Spoofing | Files per Transaction | For each folder that needs to be loaded, a single call is made to Alfresco.  The files and related metadata are generated on the server and committed in batches. |
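When choosing Files per Folder with spoofing enabled, it can help to compare the expected duration of one spoofing call against the HTTP connection timeout. The sketch below extrapolates linearly from the single figure quoted above (30s for 1000 files per loader). Linear scaling, the safety factor and the function names are all assumptions made for illustration; real timings will depend on the server.

```python
# Reference point quoted in the text: roughly 30 seconds for 1000 files.
REFERENCE_FILES = 1000.0
REFERENCE_SECONDS = 30.0


def estimated_spoof_seconds(files_per_folder: int) -> float:
    """Linearly extrapolate the duration of one spoofing call (assumption)."""
    return files_per_folder * REFERENCE_SECONDS / REFERENCE_FILES


def timeout_is_sufficient(files_per_folder: int, http_timeout_seconds: float,
                          safety_factor: float = 2.0) -> bool:
    """True if the timeout covers the estimate with some headroom.

    safety_factor is a hypothetical margin, not a benchmark setting.
    """
    return http_timeout_seconds >= safety_factor * estimated_spoof_seconds(files_per_folder)
```

If `timeout_is_sufficient` returns False for the planned settings, either raise the HTTP connection timeout or lower Files per Folder.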




  • Click back up to the DATALOAD_V22 test.  You should be presented with 'No test runs found ...'.
  • Create a test run named TRIAL_01.
  • The list of test runs is displayed.  Click on the cog (properties editor) of the test run.
  • Notice that the properties you set in the test are inherited by the test run.  Make any run-specific tweaks or leave everything to inherit the values set earlier.
  • Click back up to the TRIAL_01 test run.
  • Click the run button.

The test should start after a few seconds and progress to completion.




Handling failures


If the test fails to progress from SCHEDULED to STARTED, the likely causes are:


  1. mongo.test.host has not been set.
  2. an invalid property setting was provided (TODO: implement type- and range-based UI validation)

Startup log messages can be retrieved directly from the server using a GET request against the API (TODO: Add to server UI):

 http://bmserver-host:9080/alfresco-benchmark-server/api/v1/status/logs?count=5&skip=0&level=INFO&test=DATALOAD_V22
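The same request can be built and issued from a script. This is a small sketch using only the query parameters visible in the URL above (count, skip, level, test). The function names are hypothetical, and because the response format is not described here, the body is returned as raw text.

```python
import urllib.request
from urllib.parse import urlencode


def status_logs_url(server_url: str, test: str, count: int = 5,
                    skip: int = 0, level: str = "INFO") -> str:
    """Build the status-log URL for a test on the Benchmark Server."""
    query = urlencode({"count": count, "skip": skip, "level": level, "test": test})
    return f"{server_url.rstrip('/')}/api/v1/status/logs?{query}"


def fetch_status_logs(server_url: str, test: str, timeout: float = 10.0) -> str:
    """Issue the GET request and return the response body as text."""
    with urllib.request.urlopen(status_logs_url(server_url, test), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, `fetch_status_logs("http://bmserver-host:9080/alfresco-benchmark-server", "DATALOAD_V22")` requests the five most recent INFO-level messages for the test.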

If the test starts but experiences a high number of failures, the failures can be accessed directly in the MongoDB results.  The video, Following up on Failures, shows how this can be done.  It is especially useful when the server under load starts to produce errors or fails to respond correctly.  From the MongoDB console, it would look something like this:

 mongo <mongo-data-host>
use bm20-data
db.DATALOAD_V22.TRIAL_01.find({success:false}).pretty();

If the test is terminated early or fails early, it may be necessary to reset some of the loaders' lock data:

 mongo <mongo-data-host>
use bm20-data
db.getCollection('mirrors.alfresco-host.filefolders').find({path:{$regex:'locked'}})
db.getCollection('mirrors.alfresco-host.filefolders').remove({path:{$regex:'locked'}})
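The lock reset can also be scripted, with a dry run first to see how many entries would be removed. The sketch below assumes that `db` is a pymongo `Database` object for the bm20-data database (pymongo is a third-party package), and the helper names are hypothetical. The collection name and filter mirror the console commands above.

```python
# Same filter as the console commands: any path containing 'locked'.
LOCKED_FILTER = {"path": {"$regex": "locked"}}


def mirror_collection_name(alfresco_host: str) -> str:
    """Name of the file-folder mirror collection for the given Alfresco host."""
    return f"mirrors.{alfresco_host}.filefolders"


def reset_loader_locks(db, alfresco_host: str, dry_run: bool = True) -> int:
    """Count locked entries and, unless dry_run is set, delete them.

    `db` is assumed to be a pymongo Database, e.g.
    MongoClient("<mongo-data-host>")["bm20-data"].
    Returns the number of entries that matched the filter.
    """
    collection = db[mirror_collection_name(alfresco_host)]
    matched = collection.count_documents(LOCKED_FILTER)
    if not dry_run and matched:
        collection.delete_many(LOCKED_FILTER)
    return matched
```

Calling `reset_loader_locks(db, "alfresco-host")` only reports the number of locked entries; pass `dry_run=False` to remove them.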


