Classification And Categories

Document created by resplin on Jun 6, 2015. Last modified by mitpatoliya on Oct 28, 2016. Version 2.

This page is obsolete.

The official documentation is at: http://docs.alfresco.com

 

 


 

Introduction

 

The Alfresco repository can handle multiple classifications (classification hierarchies), although the UI Category Management screens display only one of them: the hierarchy associated with the cm:generalclassifiable aspect.

A node can be classified in many ways. Each classification is applied by adding an aspect to the node, and each classification aspect must be defined in a data dictionary model. The aspect should define one property of type d:category. If this property is multi-valued, many categories from the classification can be applied to the node; if it is single-valued, only one category may be applied.

To assign a node to a particular category in a classification:

 

  • make sure the node has the appropriate aspect applied for the classification;
  • obtain the node ref values for the category or categories to be applied to the node; and
  • set these values for the appropriate property on the aspect.   

If a category is deleted, references to it are filtered out of the properties reported when a node is read. The stale reference is actually removed from the node the next time the node is updated. This avoids reindexing every node in a category when that category is deleted.

By default, Alfresco uses the cm:generalclassifiable aspect defined in contentModel.xml for all classification. When a new installation is started, the categories in the file categories.xml are bootstrapped and loaded into the category hierarchy.

Classifications, categories and categorised nodes can be found via query (see Search#Category_Queries) or using the CategoryService API.

 

Extending cm:generalclassifiable

 

To extend cm:generalclassifiable, create new root categories and sub-categories in the classification via the admin UI, web services or the CategoryService. The Alfresco UI's use of cm:generalclassifiable effectively treats root categories as classifications and, for expediency, stores all categorisation in one multi-valued property.

Using the category service:

// To create a root category (spacesStore is the StoreRef of the store that holds the categories):
NodeRef newRootCat = categoryService.createRootCategory(
      spacesStore,
      ContentModel.ASPECT_GEN_CLASSIFIABLE,
      "newRootCat");
// To create a category beneath the new root category:
NodeRef newCategory = categoryService.createCategory(newRootCat, "newCategory");

 

Adding your own classification

 

This involves creating a new aspect in a dictionary model: see the Data Dictionary Guide.

In the example below, a new classification aspect is defined that supports a node being related to many categories in the classification.

 


   ...
   </types>

   <aspects>
   ...
      <aspect name='my:classificationAspect'>
         <title>My Classification</title>
         <parent>cm:classifiable</parent>
         <properties>
            <property name='my:categories'>
               <title>Categories</title>
               <type>d:category</type>
               <mandatory>false</mandatory>
               <multiple>true</multiple>
               <index enabled='true'>
                  <atomic>true</atomic>
                  <stored>true</stored>
                  <tokenised>true</tokenised>
               </index>
            </property>
         </properties>
      </aspect>
   </aspects>

 

There is no support for creating categories in this classification via the UI, as it only supports cm:generalclassifiable. Nor does advanced search support selecting categories from this classification. It is possible to extend the cm:category type, but again, management of such types is not supported by the UI, web services or the utility methods on the CategoryService.

By convention, an aspect used for categorisation defines one property. It could define more, but any additional properties must use the same classification: for example, a 'primary' and a 'secondary' category selected from the same classification. This, together with the choice of single-valued or multi-valued category properties, covers most use cases.

The name of the child association that links the new root category should match the name of the aspect that contains the properties linking to the categories; otherwise Lucene member searches won't work as expected.

 

Adding Categories to a Classification

 

To add categories to the cm:generalclassifiable classification, there first needs to be a node of type cm:category with a child association QName of cm:generalclassifiable and child association type cm:categories beneath a node of type cm:category_root. This node is the top of the classification.

Nodes of type cm:category can be created beneath this node using the child association type cm:subcategories. These nodes define the root categories for the classification. Further nodes of type cm:category, again linked by child associations of type cm:subcategories, can be added beneath them to define the category hierarchy. Secondary links can be used to include categories from one classification in another; these category nodes appear in both classifications. The category property and its defining aspect determine which classification applies.

See the unit tests for an example of how to create this structure programmatically. For an example of how to bootstrap the categories, see categories.xml.
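
The layout described in this section can be modelled without a repository. The following is a minimal sketch, not Alfresco code: the class CategoryNode and its methods are hypothetical names introduced only for illustration, node and association types are plain strings, and subtypes of cm:category are ignored for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a repository node; not an Alfresco class.
final class CategoryNode {
    final String type;       // node type, e.g. "cm:category_root" or "cm:category"
    final String assocType;  // type of the child association linking this node to its parent
    final String assocName;  // QName of that child association
    final List<CategoryNode> children = new ArrayList<>();

    CategoryNode(String type, String assocType, String assocName) {
        this.type = type;
        this.assocType = assocType;
        this.assocName = assocName;
    }

    CategoryNode addChild(String type, String assocType, String assocName) {
        CategoryNode child = new CategoryNode(type, assocType, assocName);
        children.add(child);
        return child;
    }

    // Checks the layout described above, starting at a cm:category_root node:
    // its children are cm:category nodes linked by cm:categories, and everything
    // below them is a cm:category node linked by cm:subcategories.
    static boolean isValidCategoryRoot(CategoryNode root) {
        if (!"cm:category_root".equals(root.type)) {
            return false;
        }
        for (CategoryNode top : root.children) {
            if (!"cm:category".equals(top.type) || !"cm:categories".equals(top.assocType)) {
                return false;
            }
            if (!subcategoriesValid(top)) {
                return false;
            }
        }
        return true;
    }

    private static boolean subcategoriesValid(CategoryNode parent) {
        for (CategoryNode child : parent.children) {
            if (!"cm:category".equals(child.type) || !"cm:subcategories".equals(child.assocType)) {
                return false;
            }
            if (!subcategoriesValid(child)) {
                return false;
            }
        }
        return true;
    }
}
```

In this model the top of the cm:generalclassifiable classification is the child added with association type cm:categories and association QName cm:generalclassifiable; root categories and their sub-categories are added beneath it with cm:subcategories.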

 

Classifying a node

 

On the node to classify, determine whether it has the appropriate aspect applied, e.g. cm:generalclassifiable. If not, apply the aspect. Then add to, or replace, any existing categories by setting the appropriate property to a category NodeRef or a Collection of category NodeRefs.

Using the node service:

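import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;

import org.alfresco.model.ContentModel;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.namespace.QName;

NodeRef targetNode;
NodeRef categoryNode;
...
targetNode = ...
...
categoryNode = ...
...

// Replace any existing categories on the node
ArrayList<NodeRef> categories = new ArrayList<NodeRef>(1);
categories.add(categoryNode);
if (!nodeService.hasAspect(targetNode, ContentModel.ASPECT_GEN_CLASSIFIABLE))
{
    // The aspect is missing: add it together with the category property
    HashMap<QName, Serializable> props = new HashMap<QName, Serializable>();
    props.put(ContentModel.PROP_CATEGORIES, categories);
    nodeService.addAspect(targetNode, ContentModel.ASPECT_GEN_CLASSIFIABLE, props);
}
else
{
    // The aspect is already applied: overwrite the category property
    nodeService.setProperty(targetNode, ContentModel.PROP_CATEGORIES, categories);
}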

 

 

 

 

 

Finding Categories

 

Categories can be found by navigating the classification via the CategoryService, via the NodeService or by query.

The category service is currently better suited to UI-style category navigation and selection than to finding a specific category directly. The node service allows more directed access, as it supports lookup by the cm:name property, which categories use.

Queries are the most versatile approach, but have a few gotchas.

To find category nodes by name using a Lucene query:

+TYPE:"cm:category" +@cm\:name:"United Kingdom"

 

This query could find more than one category, and it does not restrict the search to any particular classification. The scope can be narrowed by adding a PATH constraint:

+TYPE:"cm:category" +@cm\:name:"United Kingdom" +PATH:"/cm:generalclassifiable/cm:myRegion//*"
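
When such queries are assembled in code, a small helper keeps the escaping in one place. The following is a minimal sketch that only builds the query text, using double quotes around phrases; CategoryQuerySketch and its methods are hypothetical names, not part of any Alfresco API, and the helper assumes the category name contains no double quote or backslash.

```java
// Hypothetical helper that only builds Lucene query strings; it does not run them.
final class CategoryQuerySketch {
    private CategoryQuerySketch() {
    }

    // Query for category nodes with the given cm:name, in any classification.
    static String byName(String name) {
        if (name.indexOf('"') >= 0 || name.indexOf('\\') >= 0) {
            // Assumption of this sketch: such names are rejected rather than escaped.
            throw new IllegalArgumentException("Unsupported character in name: " + name);
        }
        return "+TYPE:\"cm:category\" +@cm\\:name:\"" + name + "\"";
    }

    // Same query, restricted to everything beneath the given path prefix.
    // The prefix must already be ISO 9075 encoded, e.g. "/cm:generalclassifiable/cm:myRegion".
    static String byNameUnder(String name, String pathPrefix) {
        return byName(name) + " +PATH:\"" + pathPrefix + "//*\"";
    }
}
```

For example, CategoryQuerySketch.byNameUnder("United Kingdom", "/cm:generalclassifiable/cm:myRegion") yields the PATH-constrained query shown above.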

 

Note that cm:name is tokenised and can be used for wildcard matches. If you find the category directly by PATH, there is no wildcard support, the match is case-sensitive, and each PATH element must be escaped according to ISO 9075:

+PATH:"/cm:generalclassifiable/cm:myRegion/cm:Europe/cm:United_x0020_Kingdom"
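
The escaping replaces each character that is not allowed in a path element with _xHHHH_, where HHHH is the character's four-digit hexadecimal code, so a space becomes _x0020_. The following is a simplified sketch of that idea and not a full ISO 9075 implementation: Iso9075Sketch is a hypothetical class, it does not escape literal "_x" sequences or handle characters outside the Basic Multilingual Plane, and it assumes every category sits in the cm namespace. In a real Alfresco project, prefer the encoder shipped with the platform (org.alfresco.util.ISO9075).

```java
// Simplified, hypothetical encoder for PATH elements; see the caveats in the lead-in.
final class Iso9075Sketch {
    private Iso9075Sketch() {
    }

    // Encodes one local name, e.g. a category's cm:name, for use in a PATH element.
    static String encode(String localName) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < localName.length(); i++) {
            char c = localName.charAt(i);
            boolean nameStart = Character.isLetter(c) || c == '_';
            boolean nameChar = nameStart || Character.isDigit(c) || c == '-' || c == '.';
            // The first character has stricter rules than the rest.
            boolean allowed = (i == 0) ? nameStart : nameChar;
            if (allowed) {
                out.append(c);
            } else {
                out.append(String.format("_x%04x_", (int) c));
            }
        }
        return out.toString();
    }

    // Builds a path such as /cm:generalclassifiable/cm:myRegion/cm:Europe/...
    // from a classification QName and a chain of category names.
    static String categoryPath(String classification, String... categoryNames) {
        StringBuilder path = new StringBuilder("/").append(classification);
        for (String name : categoryNames) {
            path.append("/cm:").append(encode(name));
        }
        return path.toString();
    }
}
```

With this sketch, Iso9075Sketch.encode("United Kingdom") returns United_x0020_Kingdom, and categoryPath("cm:generalclassifiable", "myRegion", "Europe", "United Kingdom") returns the path used in the query above.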

 

The main surprise is getting more results than you expect from these queries. Because cm:name is indexed for full-text search rather than as an identifier, you may have to check that the cm:name of each result is exactly the value you want.

 

Implementation overview

 

Nodes of type cm:category_root have special behaviour when indexed because they have the sys:aspect_root aspect: any node beneath one also appears in the index as if it were relative to the root node. For example:

/sys:System/cm:categoryRoot/cm:generalclassifiable

In this case, if the second node (cm:categoryRoot) is of type cm:category_root, then there are two paths to the cm:generalclassifiable node:

/sys:System/cm:categoryRoot/cm:generalclassifiable
/cm:generalclassifiable

This is how the index supports path based queries against classifications and the categories they contain.

Categories can be defined anywhere in the repository beneath a node of type cm:category_root. The nodes directly below a node of this type should be cm:category nodes whose child association QName is the name of a classification; the nodes beneath each of these then belong to that classification for queries. The queries used to inspect classifications and categorisation merge together all the parts of a classification that are accessible to the user. In theory this could be used to provide user-specific extensions of categories, although it has not been used in practice.

There could be more categories at:
/sys:System/sys:additionalCategories/cm:categoryRoot/cm:generalclassifiable

There are again two paths to this node:
/sys:System/sys:additionalCategories/cm:categoryRoot/cm:generalclassifiable
/cm:generalclassifiable

 

 

So a query for PATH:"/cm:generalclassifiable//*" will find nodes classified under /sys:System/cm:categoryRoot/cm:generalclassifiable or /sys:System/sys:additionalCategories/cm:categoryRoot/cm:generalclassifiable.
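
The path aliasing can be illustrated with a toy model. The following sketch is not Alfresco's indexer: CategoryPathSketch and indexedPaths are hypothetical names, and the caller states explicitly which path segments are nodes of type cm:category_root. For each such segment, the remainder of the path is also reported as if it started at the root.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model of the path aliasing described above; not Alfresco's indexing code.
final class CategoryPathSketch {
    private CategoryPathSketch() {
    }

    // segments: the full path split into elements, from the store root down.
    // categoryRootIndexes: positions in segments of nodes of type cm:category_root.
    // Returns the full path first, then one root-relative alias per category root.
    static List<String> indexedPaths(List<String> segments, Set<Integer> categoryRootIndexes) {
        List<String> paths = new ArrayList<>();
        paths.add("/" + String.join("/", segments));
        for (int i = 0; i < segments.size() - 1; i++) {
            if (categoryRootIndexes.contains(i)) {
                // Everything below the category root is re-rooted at "/".
                paths.add("/" + String.join("/", segments.subList(i + 1, segments.size())));
            }
        }
        return paths;
    }
}
```

Applied to either of the two locations above, with cm:categoryRoot marked as the category root, the model reports the full path plus the alias /cm:generalclassifiable, which is why a single PATH query matches both.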

By extension, I could have my own categories in /app:company_home/app:user_homes/cm:andy/cm:categoryRoot/cm:generalclassifiable, as this will also appear as /cm:generalclassifiable in category searches.

By convention there is only a weak link between a classification aspect and its category hierarchy: the aspect name is reused as the child association QName of the node at the top of the hierarchy, below a node of type cm:category_root.

Invalid categories are removed from node service properties at read time, but may still be in the database. Whenever the properties of a node are updated, properties of type d:category are tidied up. Properties of type d:category expect a NodeRef value or a Collection of NodeRefs. If a NodeRef is invalid, does not exist, or is not of a type derived from cm:category, it will be, or will appear to have been, removed.
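
The read-time filtering can be pictured with a small model. The following is a sketch under stated assumptions rather than the repository's implementation: CategoryFilterSketch is a hypothetical class, node references are plain strings, and the map liveNodes stands in for the repository by recording, for each existing node, whether its type derives from cm:category.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of read-time filtering of a d:category property.
final class CategoryFilterSketch {
    private CategoryFilterSketch() {
    }

    // storedRefs: the references held in the d:category property in the database.
    // liveNodes: existing nodes, mapped to true if their type derives from cm:category.
    // Returns only the references that should be reported on read; the stored list is untouched.
    static List<String> visibleCategories(List<String> storedRefs, Map<String, Boolean> liveNodes) {
        List<String> visible = new ArrayList<>();
        for (String ref : storedRefs) {
            // A missing entry means the node no longer exists; false means it is not a category.
            if (Boolean.TRUE.equals(liveNodes.get(ref))) {
                visible.add(ref);
            }
        }
        return visible;
    }
}
```

In this model the stored list keeps its stale entries until the node is next written, which mirrors the "appear to have been removed" behaviour described above.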
