Schema Upgrade Scripts

Document created by resplin Employee on Jun 6, 2015
Version 1Show Document
  • View in full screen mode

Obsolete Pages{{Obsolete}}

The official documentation is at: http://docs.alfresco.com




Introduction


As of 1.4.0, upgrade scripts will be executed automatically by the server when starting up against an existing database, provided that an applicable script can be located.

It is possible to execute the appropriate scripts manually prior to starting the server.

This page describes the location of existing scripts and how to add scripts for different databases.
The reader should first familiarize him/herself with the configuration mechanisms available in Alfresco (New Installations).


Components


Property: <configRoot>/alfresco/repository.properties#db.schema.update
Controls whether the system bootstrap should create or upgrade the database schema automatically.

Bean: <configRoot>/alfresco/bootstrap-context.xml#schemaBootstrap
This bean is the first bean to execute during system bootstrap.  It detects whether the server is starting against an empty schema or an existing schema and takes the appropriate action.  The applyUpdateScriptPatches property contains a list of script patches that need to be applied or need to have been applied before the server bootstrap commences.

File: <configRoot>/alfresco/patch/patch-services-context.xml
Any number of SchemaUpgradeScriptPatch instances may be found here.  The scriptUrl property contains the location of a the script files that apply to the particular Hibernate dialect being used.  The way the script file is searched for is described later.

Folder: <alfresco.war>/WEB-INF/classes/alfresco/dbscripts/create
This folder contains schema post-creation scripts stored by version applicability and Hibernate Dialect or Dialect-derived fully qualified class names.

Folder: <alfresco.war>/WEB-INF/classes/alfresco/dbscripts/upgrade
This folder contains schema upgrade scripts stored by version applicability and Hibernate Dialect or Dialect-derived fully qualified class names.




Manually Upgrading


Sometimes, the permissions granted to the 'alfresco' database user are not sufficient to allow


  • database metadata retrieval, or
  • schema updates

In this case, we recommend setting up a test environment and performing the upgrade on there.  The names of the scripts are dumped out in the logs, and in V2.1, a collation of all executed statements is dumped to help with this process.  The script or scripts should be tested again, but can eventually be applied directly to the production data.  Turn off any schema introspection or manipulation using:


'

db.schema.update=false




Script Execution


The following properties control the generation and execution of SQL statements to modify the Alfresco schema and data (defaults shown):



db.schema.update=true
db.schema.update.lockRetryCount=24
db.schema.update.lockRetryWaitSeconds=5

  • db.schema.update=true:

When this is off, Alfresco will not generate or execute any SQL statements.  This can be used to completely avoid any metadata queries between upgrades.


  • db.schema.update.lockRetryCount and db.schema.update.lockRetryWaitSeconds:

If schema updates are on and some modifying SQL has been generated, then the system attempts to get a lock against the database by creating a lock table alf_bootstrap_lock.  If the table already exists, the system will wait and repeat the process of metadata examination and script generation.  If the lock table cannot be created after all the retry attempts have been exhausted, the system will post a message and exit:

A previous schema upgrade failed or was not completed.  Revert to the original database before attempting the upgrade again.

With the DB lock table in place, the SQL statements will be executed.  Failure of schema modification scripts is terminal.  There is no recovery for the current data.  The DB has to be restored, the source of the problem identified and the process repeated.  Once the SQL statements have all be run, the lock table will be removed and a message shown:

All executed statements written to file ...

Once all the SQL statements have been executed and the lock table has been removed, the system enters the data bootstrap or patch application phase.  These changes can be rolled back so no locks are required.




Script Loading


A schema script patch may be

   <bean id='patch.schemaUpdateScript-V1.4-1' class='org.alfresco.repo.admin.patch.impl.SchemaUpgradeScriptPatch' parent='basePatch'>
       <property name='id'><value>patch.schemaUpdateScript-V1.4-1</value></property>
       <property name='description'><value>patch.schemaUpgradeScript.description</value></property>
       <property name='fixesFromSchema'><value>0</value></property>
       <property name='fixesToSchema'><value>19</value></property>
       <property name='targetSchema'><value>20</value></property>
       <property name='scriptUrl'>
           <value>classpath:alfresco/dbscripts/upgrade/1.4/${db.script.dialect}/AlfrescoSchemaUpdate-1.4-1.sql</value>
       </property>
   </bean>

The administrator will normally override the Hibernate dialect to be used.  The schema bootstrap utility will substitute the fully qualified name of the dialect class into the ${db.script.dialect} placeholder.  In practice, some dialects are similar to each other, or the SQL required by the script will be common to several database vendors.  For this reason, the schema bootstrap attempts to find a relevant script by walking up the dialect class hierarchy, terminating after trying the org.hibernate.dialect.Dialect class.

An example post-creation script called sample.sql has been included to demonstrate the search algorithm.  The schemaBootstrap has the following:

     <property name='postCreateScriptUrls'>
        <list>
           <value>classpath:alfresco/dbscripts/create/1.4/${db.script.dialect}/sample.sql</value>
        </list>
     </property>

With the following structure on the classpath:

  WEB-INF/classes/alfresco/dbscripts/create/V1.4/
     org.hibernate.dialect.Dialect
        sample.sql
     org.hibernate.dialect.MySQLInnoDBDialect
        sample.sql

The search order for dialect org.hibernate.dialect.OracleDialect is:

  classpath:alfresco/dbscripts/create/1.4/org.hibernate.dialect.OracleDialect/sample.sql         <NOT FOUND>
  classpath:alfresco/dbscripts/create/1.4/org.hibernate.dialect.Oracle9Dialect/sample.sql        <NOT FOUND>
  classpath:alfresco/dbscripts/create/1.4/org.hibernate.dialect.Dialect/sample.sql               <FOUND>

The search order for dialect org.hibernate.dialect.MySQLInnoDBDialect is:

  classpath:alfresco/dbscripts/create/1.4/org.hibernate.dialect.MySQLInnoDBDialect/sample.sql    <FOUND>

The same logic applies when searching for upgrade scripts.




Writing and Testing Scripts


Script Encoding and Syntax


The following rules are true for all auto-executed SQL scripts:


  • All statements within the script are executed in a single transaction using the same database connection.  Some databases do not support transactional modification of tables.
  • Scripts must be UTF8 encoded.
  • Statements must be terminated by ';' or ';(optional)'.  The latter indicates that the statement may fail without causing the script to stop.
  • Lines starting with '--', '//' or '/*' are ignored.
  • Leading and trailing spaces are stripped from all statements.

Script Variables


It is possible to generate script variables and do variable replacement.  In the following example, the --ASSIGN: directive indicates that the following statement is a query and the value from the resultset column next_val must be assigned to avm_nodes_max.  This can then be used later for direct variable replacement.



--ASSIGN:avm_nodes_max_id=next_val
SELECT CASE WHEN MAX(id) IS NOT NULL THEN MAX(id)+1 ELSE 1 END AS next_val FROM avm_nodes;

CREATE TABLE t_avm_nodes (
   id BIGINT NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH ${avm_nodes_max_id}),
   class_type VARCHAR(20) NOT NULL,
   vers BIGINT NOT NULL,

Some scripts would be exactly the same with the exception of differences between boolean columns using TRUE (PostgreSQL) or 1 (other DBs).  To circumvent this, the variable ${TRUE} can be used.  If the current dialect is PostgreSQL, then TRUE is used otherwise 1 (binary true) is used.



DELETE FROM alf_applied_patch WHERE id = 'patch.db-V3.2-Modify-AVM-MimeType';
INSERT INTO alf_applied_patch
  (id, description, fixes_from_schema, fixes_to_schema, applied_to_schema, target_schema, applied_on_date, applied_to_server, was_executed, succeeded, report)
  VALUES
  (
    'patch.db-V3.2-Modify-AVM-MimeType', 'Manually executed script upgrade V3.2 to modify AVM mimetype',
     0, 2016, -1, 2017, null, 'UNKOWN', ${TRUE}, ${TRUE}, 'Script completed'
   );

Script Locations


Once a script has been translated, it can be dropped into the Alfresco extensions location on the server or classpath in order to be picked up.  More details of the extension locations can be found at Repository_Configuration#Location_of_Extension_Configuration_Files.


Notes on Writing Scripts


  • Pay particular attention to the statements that insert entries into the alf_applied_patch table.  The patch mechanisms and checks use these entries to ensure that scripts aren't repeatedly executed.
  • Please contribute any missing scripts back to Alfresco with the following info:
    • Databases tested on (e.g. Oracle 10g and MySQL 5.0)
    • Schemas tested against (e.g. Tested against a 1.3.1 database that was originally upgraded from 1.2.1)
    • Version of Alfresco tested with (e.g. V1.4 Preview War)

MySQL Optimizations


Various DB scripts have optimizations that can be made, but are environment-specific.  Look in the scripts being executed for notes describing how the optimizations can be applied.  For instance, in the scripts to upgrade from 2.1 to 2.2SP2 (and later) several optimizations have been noted:



--
-- The INSERT ... SELECT statement can be broken up into
--    SELECT ... INTO OUTFILE ...
--    LOAD DATA INFILE ...
-- This can be an order of magnitude faster
-- So the following:
INSERT INTO t_alf_node
   (
      id, version, store_id, uuid, transaction_id, node_deleted, type_qname_id, acl_id,
      audit_creator, audit_created, audit_modifier, audit_modified
   )
   SELECT
      n.id, 1, s.id, n.uuid, nstat.transaction_id, false, q.qname_id, n.acl_id,
      null, null, null, null
   FROM
      alf_node n
      JOIN t_qnames q ON (q.qname = n.type_qname)
      JOIN alf_node_status nstat ON (nstat.node_id = n.id)
      JOIN t_alf_store s ON (s.protocol = nstat.protocol AND s.identifier = nstat.identifier)
;

-- Can be replaced with:
SELECT straight_join
       n.id, 1, s.id, n.uuid, nstat.transaction_id, false, q.qname_id, n.acl_id,
       null, null, null, null
    FROM
       alf_node n
       JOIN t_qnames q ON (q.qname = n.type_qname)
       JOIN alf_node_status nstat ON (nstat.node_id = n.id)
       JOIN t_alf_store s ON (s.protocol = nstat.protocol AND s.identifier = nstat.identifier)
    INTO OUTFILE '/tmp/t_alf_node.sql'
;
-- Query OK, 830222 rows affected (1 min 14.15 sec)
LOAD DATA INFILE '/tmp/t_alf_node.sql' INTO TABLE t_alf_node;
-- Query OK, 830222 rows affected, 65535 warnings (29 min 20.65 sec)




Porting Scripts


This section covers the steps required to port scripts and how to identify inconsistencies between different upgrade paths as well as different databases.  Alfresco engineers will do the first pass of upgrade scripts on MySQL; when porting scripts, use MySQL scripts as the basis.


Script Locations


The first thing to understand is how Alfresco looks up scripts using the dialect hierarchy.  Some scripts are compatible with most databases and might not need porting at all.  Some databases may force all scripts to be ported (e.g. PostgreSQL).  Read the section on Script Loading.


Analysis Tools


Alfresco Log


All SQL scripts executed (coded or auto-generated) are logged during startup:



...
...SchemaBootstrap] Schema managed by database dialect org.hibernate.dialect.MySQLInnoDBDialect.
...SchemaBootstrap] Executing database script <temp>\Alfresco\AlfrescoSchema-MySQLInnoDBDialect-Update-4589.sql /
(Generated).
...  INFO  [domain.schema.SchemaBootstrap] Executing database script <temp>\Alfresco\AlfrescoSchema-MySQLInnoDBDialect-Update-4590.sql /
(Copied from classpath:alfresco/dbscripts/create/2.2/org.hibernate.dialect.MySQLInnoDBDialect/AlfrescoPostCreate-2.2-MappedFKIndexes.sql).
...

The source of all statements is clearly shown, including the auto-generated script that forms the basis for new installations.  Upgrades between the same two revisions of a repository should execute the same set of scripts, although there may be differences between the particular databases.  Look at the path of the Copied from section to see whether a dialect-specific or generic script is being used.

For most cases, when the path uses the generic Dialect script, that same script will apply across all databases.  The common exception is for PostgreSQL, where upgrade scripts need special handling for the alf_applied_patch inserts at the end of the script.


Schema Dump Utility (3.1.1 and later)


In addition to seeing which scripts were executed, it is important to know what the current state of the database is.  In order to facilitate comparisons between database schemas, a schema dump utility has been written that outputs the database schema to a XML format.  The XML data is sorted and normalized - as far as possible - to allow meaningful comparisons between the files.  Using the standard diff command will allow a DBA to check for any serious differences between schemas.

The utility is run automatically whenever any SQL statements are executed on startup:



...SchemaBootstrap] Normalized schema (pre-bootstrap)  dumped to file <temp>\Alfresco\AlfrescoSchema-MySQLInnoDBDialect-21914-Startup.xml.
...SchemaBootstrap] Normalized schema (post-bootstrap) dumped to file <temp>\Alfresco\AlfrescoSchema-MySQLInnoDBDialect-21919.xml.

Finding Scripts to Port


The component that controls and executes scripts is part of the Alfresco bootstrap sequence.

<configRoot>/alfresco/bootstrap-context.xml:



    <bean id='schemaBootstrap' class='org.alfresco.repo.domain.schema.SchemaBootstrap' >
       ...
       <property name='updateSchema'>
          <value>${db.schema.update}</value>
       </property>
       <property name='stopAfterSchemaBootstrap'>
          <value>${db.schema.stopAfterSchemaBootstrap}</value>
       </property>
       ...
       <property name='preCreateScriptUrls'>
          <list>
          </list>
       </property>
       <property name='postCreateScriptUrls'>
          <list>
             <value>classpath:alfresco/dbscripts/create/2.2/${db.script.dialect}/AlfrescoPostCreate-2.2-MappedFKIndexes.sql</value>
             <value>classpath:alfresco/dbscripts/create/2.2/${db.script.dialect}/AlfrescoPostCreate-2.2-Extra.sql</value>
             <value>classpath:alfresco/dbscripts/create/2.2/${db.script.dialect}/post-create-indexes-04.sql</value>
             <value>classpath:alfresco/dbscripts/create/3.0/${db.script.dialect}/create-activities-extras.sql</value>
          </list>
       </property>
       <property name='validateUpdateScriptPatches'>
          <list>
          </list>
       </property>
       <property name='preUpdateScriptPatches'>
          <list>
             <ref bean='patch.schemaUpdateScript-V1.4-1' />
             <ref bean='patch.schemaUpdateScript-V1.4-2' />
             <ref bean='patch.db-V2.1-NotNullColumns' />
             <ref bean='patch.db-V2.2-ACL-From-2.1-A' />
             <ref bean='patch.db-V2.2-ACL' />
             <ref bean='patch.db-V2.2-0-CreateMissingTables' />
             <ref bean='patch.db-V2.2-Upgrade-From-2.1' />
             <ref bean='patch.db-V2.2-Upgrade-From-2.2SP1' />
             <ref bean='patch.db-V2.2-Person-2' />
          </list>
       </property>
       <property name='postUpdateScriptPatches'>
          <list>
             <ref bean='patch.db-V1.4-TxnCommitTimeIndex' />
             <ref bean='patch.db-V2.0-ContentUrls' />
             <ref bean='patch.db-V2.1-JBPMData' />
             <ref bean='patch.db-V2.1-VersionColumns2' />
             <ref bean='patch.db-V2.1-JBPMProcessKey' />
             <ref bean='patch.db-V2.1-RemoveWcmSubmittedAspect' />
             <ref bean='patch.db-V2.1-AuditPathIndex' />
             <ref bean='patch.db-V3.0-0-CreateActivitiesExtras' />
          </list>
       </property>
    </bean>

DB Creation Scripts


DB creation scripts are auto-generated, but also supplemented by additional scripts.  These scripts are listed in the preCreateScriptUrls and postCreateScriptUrls properties.  Each script must have an equivalent for the dialect that you are porting for; but remember to check if the scripts further up the dialect hierarchy don't already work with your database; as far as possible, standard SQL has been used and scripts broken up to allow maximum re-use.  Take a look around the <configRoot>/alfresco/dbscripts/create folder and see how the various scripts match up to the different dialects.


DB Upgrade Scripts


DB upgrade scripts are not listed in the same way as creation scripts.  Upgrade scripts have the additional requirement that they need to be detected as patches and verified to have been executed.  The preUpdateScriptPatches property lists patches that need to be applied before auto-generating upgrade scripts while the postUpdateScriptPatches need to be applied after the auto-generated upgrade scripts.  To see which scripts get executed by each patch, look at the patch definition.

<configRoot>/alfresco/patch/patch-services-context.xml:



    <bean id='patch.db-V2.2-Upgrade-From-2.2SP1' class='org.alfresco.repo.admin.patch.impl.SchemaUpgradeScriptPatch' parent='basePatch'>
        <property name='id'><value>patch.db-V2.2-Upgrade-From-2.2SP1</value></property>
        <property name='description'><value>patch.schemaUpgradeScript.description</value></property>
        <property name='fixesFromSchema'><value>0</value></property>
        <property name='fixesToSchema'><value>134</value></property>
        <property name='targetSchema'><value>135</value></property>
        <property name='scriptUrl'>
            <value>classpath:alfresco/dbscripts/upgrade/2.2/${db.script.dialect}/upgrade-from-2.2SP1.sql</value>
        </property>
        <property name='dependsOn' >
            <list>
                <ref bean='patch.db-V2.2-2-MoveQNames' />
            </list>
        </property>
        <property name='alternatives' >
            <list>
                <ref bean='patch.db-V2.2-Upgrade-From-2.1' />
            </list>
        </property>
    </bean>

While the upgrade script list is different from the creation script list, the way that the scripts are located is the same.  By convention, upgrade scripts are located in <configRoot>/alfresco/dbscripts/upgrade.  You can look around to see how the MySQL upgrade scripts are implemented and how ANSI SQL scripts are pushed up into the common dialect scripts.




Testing Your Scripts


  • Perform a clean bootstrap using MySQL.  Note the scripts executed and copy the schema dump file.
    • As a baseline, run unit tests against MySQL, at least 'ant test-repository' or all unit tests (eg. 'ant -f continuous.xml').
  • Perform a clean bootstrap using your database and dialect.
    • Ensure that the scripts execute without failure.
    • Ensure that each script in the MySQL list is matched by an equivalent (or same) script for your DB.
    • Copy the schema dump file and perform a diff with the MySQL schema dump.  There should not be any material differences.  Unfortunately, what counts as 'material' varies from database to database; an index name may be different but this only matters if the name is auto-generated, for example.
    • Run unit tests against your database and dialect, at least 'ant test-repository' or preferably all unit tests (eg. 'ant -f continuous.xml').
  • Perform an upgrade using your database to the same version as above.
    • Ensure successful execution of all scripts.
    • Compare the schema dump with the schema dump generated by the creation scripts.  There should be no material difference i.e. any variation should be irrelevant or should be the result of auto-naming.

It is rare that a missing script results in an incorrect generic script execution.  Generally, generic scripts that are found will be executed successfully and will do the correct thing - with the notable exception of PostgreSQL scripts.  This means that failure to port a script is usually met with a failure to find a particular script and the upgrade or bootstrap will fail.

Upgrading

Attachments

    Outcomes