The official documentation is at: http://docs.alfresco.com
Our intention is to define a SQL-based query language, which for the moment we are calling CQL. SQL has the advantages that most developers of enterprise portals know and understand it and that there are a number of tools that support it.
In content management, there are a number of SQL-based languages. JSR-170 has defined a hybrid which uses XML-like namespace conventions and is easily translated into its XPath language. Documentum has the DQL language (which we had a lot of input into a long, long time ago), which pre-dated but anticipated some of the features of SQL3/SQL-99, such as object-oriented and full-text support. Microsoft has a query syntax in its CAML (Collaborative Application Markup Language), which uses SQL-like constructs but essentially creates a SQL-like query tree mixed with presentation semantics encoded as XML.
What we would like to do is triangulate on these three systems to enable us to create applications for all three. To create a SQL-based language that is translatable into the above systems, we have three choices:
- Define a query-tree construct - This can then be easily translated into each language. It can support the functions required and can be expressed as XML for transmission. This is essentially the route that Microsoft took. Unfortunately, a query tree is difficult to construct programmatically and there are no tools to support that construction.
- Define a new SQL-like language - This is the route that Documentum and JSR-170 have taken. This allows a simple, declarative construction and can be encoded as a simple string. However, this requires a special parser and does not take advantage of the various SQL tools available.
- Define a SQL schema and use pure SQL - The idea is to define a domain model that encompasses the functionality of each and essentially uses views and table functions to describe categories, hierarchical paths (as bridge tables) and inheritance.
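As a rough illustration of the third option, the sketch below uses Python's built-in sqlite3 module to model hierarchical paths as a bridge table and type inheritance as a view. Every table, view, column, and function name here (node, node_path, type_parent, document, documents_under) is an assumption made for the example and not part of any proposed schema, and the view handles only one level of inheritance.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: all names are assumptions for illustration.
    CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT, type TEXT);

    -- Bridge table: one row per ancestor/descendant pair.
    -- depth 0 pairs a node with itself.
    CREATE TABLE node_path (ancestor INTEGER, descendant INTEGER, depth INTEGER);

    -- Direct subtype relationships (single level only in this sketch).
    CREATE TABLE type_parent (type TEXT, supertype TEXT);

    -- Inheritance as a view: a 'document' is any node whose type is
    -- 'document' or a direct subtype of it.
    CREATE VIEW document AS
        SELECT n.id, n.name, n.type FROM node n
        WHERE n.type = 'document'
           OR n.type IN (SELECT type FROM type_parent
                         WHERE supertype = 'document');
""")

conn.executemany("INSERT INTO node VALUES (?, ?, ?)", [
    (1, "root", "folder"),
    (2, "reports", "folder"),
    (3, "q1.pdf", "document"),
    (4, "memo.txt", "memo"),       # 'memo' is a subtype of 'document'
    (5, "notes.txt", "document"),
])
conn.execute("INSERT INTO type_parent VALUES ('memo', 'document')")
conn.executemany("INSERT INTO node_path VALUES (?, ?, ?)", [
    (1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 0), (5, 5, 0),
    (1, 2, 1), (1, 5, 1), (1, 3, 2), (1, 4, 2),
    (2, 3, 1), (2, 4, 1),
])


def documents_under(connection, folder_id):
    """Return names of all documents (including subtypes) below a folder."""
    rows = connection.execute(
        """
        SELECT d.name
        FROM document d
        JOIN node_path p ON p.descendant = d.id
        WHERE p.ancestor = ? AND p.depth > 0
        ORDER BY d.name
        """,
        (folder_id,),
    )
    return [name for (name,) in rows]


print(documents_under(conn, 2))
```

The query itself is plain SQL, so any SQL tool could run it unchanged; the content-management concepts live entirely in the schema.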
The core of the query language, as with any SQL-based language, is the SELECT statement. Once a SELECT statement is available, many other things are possible. We are looking at the query language to provide the objects or nodes upon which the web services will operate. This means that we probably don't need to tackle the UPDATE and DELETE statements. The DDL statements of SQL would not make any sense here.
The components or capabilities of a SELECT statement for content management, as opposed to pure SQL, would be the following:
- Target List - Instead of being simple columns as in pure SQL, we would need to return potentially complex content objects and operators on content. We would also need to provide the notion of an object/node ID, which may have any number of subtleties associated with it. Many systems provide multi-valued properties which may be returned in the target list. Functions and aggregations may play a role as well. Pseudo values such as 'score' in JCR and Documentum may be returned.
- FROM Clause - The domains that can be identified in a FROM clause could be types (Documentum, JCR) or paths (Microsoft). JCR uses a pseudo property called 'jcr:path'. Documentum provides a path through a 'folder()' function in the predicate. We don't believe the path should be a function of the FROM clause. Also, what about the notion of a JCR Workspace, which is the domain of the search? What about cross-repository search? This should probably be a qualifier on the domain, as in typical SQL implementations.
- WHERE Clause
- BLOB/CLOB Support
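To make the components above more concrete, here is a minimal sketch of how a SELECT might be held as a small structure and rendered to a string. CQL syntax has not been defined, so the Select class, its field names, and the rendered text are all hypothetical. The sketch follows the suggestions above: the FROM domain is a type, the repository is a qualifier on that domain, and the path restriction sits in the predicate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Select:
    """Hypothetical holder for the parts of a content SELECT statement."""
    targets: List[str]                 # target list: IDs, properties, pseudo values
    domain: str                        # FROM clause: a type name
    repository: Optional[str] = None   # optional qualifier on the domain
    where: Optional[str] = None        # predicate, including any path restriction

    def render(self) -> str:
        # Qualify the domain with the repository, as a schema qualifies a table.
        source = f"{self.repository}.{self.domain}" if self.repository else self.domain
        text = f"SELECT {', '.join(self.targets)} FROM {source}"
        if self.where:
            text += f" WHERE {self.where}"
        return text


query = Select(
    targets=["id", "name", "score"],
    domain="document",
    repository="main",
    where="path LIKE '/reports/%'",
)
print(query.render())
```

Keeping the parts separate until rendering would leave room to emit a different dialect for each target system from the same structure.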