
v1 REST API - Part 9 - Queries & Search

Blog Post created by gavincornwell Employee on Apr 11, 2017

In the last post we covered the Sites APIs; this time we're going to take a look at ways to find things in the repository. There are two main APIs to do this, /queries and /search.

 

As always there is a Postman collection to accompany this post; click the "Run in Postman" button below to import it into your client.

 

 

The first request uses one of the APIs we learnt about last time to create a public site named "queriesSearchSite". The second request retrieves the document library container id and stores it in a global variable. The third request can be used to upload documents to the site; to get the most out of this post, upload a text file containing some lorem ipsum text, some image files and some Office documents.

 

The /queries endpoints are designed to be very simple to use and usable in "live search" scenarios, i.e. they can be executed upon each key press so clients can show results as the user types. The actual query used behind the scenes is hard-coded; if complex or custom queries are required the /search API should be used, which we'll look at shortly.

 

Let's first take a look at the endpoint to find nodes. The http://localhost:8080/alfresco/api/-default-/public/alfresco/versions/1/queries/nodes endpoint returns nodes (files and folders) that match a simple term provided via a query parameter. The type of nodes returned can be restricted via the nodeType query parameter, for example passing my:type as the value will only return nodes of that type and any of its subtypes. The query will look in the name, title and description properties, in the content and in tags for a match. Take a look at the API Explorer for the other options available for this endpoint.
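
As a rough illustration, the request URL for this endpoint could be assembled as in the Python sketch below. The base URL assumes a local default install, the search term is assumed to be passed in a query parameter named term (check the API Explorer for the definitive list), and build_nodes_query_url is a hypothetical helper name, not part of any Alfresco client library.

```python
from urllib.parse import urlencode

# Assumed base URL of a local default install (taken from the examples in this post).
BASE_URL = "http://localhost:8080/alfresco/api/-default-/public/alfresco/versions/1"


def build_nodes_query_url(term, node_type=None):
    """Build the GET URL for /queries/nodes (hypothetical helper)."""
    # "term" is the assumed name of the query parameter carrying the search term.
    params = {"term": term}
    if node_type is not None:
        # Restrict the results to this type and any of its subtypes.
        params["nodeType"] = node_type
    # urlencode takes care of escaping, e.g. the colon in "my:type".
    return BASE_URL + "/queries/nodes?" + urlencode(params)


print(build_nodes_query_url("lorem"))
print(build_nodes_query_url("lorem", node_type="my:type"))
```

In a "live search" client a function like this would be called on each key press with the text typed so far.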

 

The 4th request in the Postman collection shows an example of looking for the term "lorem". The number of results you get will depend on the content in your repository; some of the sample site content contains the word "lorem" so you should get a few results! The response format (shown below) is also consistent with the /nodes API so if you've been following the series it should look familiar.

{
  "list": {
    "pagination": {
      "count": 7,
      "hasMoreItems": false,
      "totalItems": 7,
      "skipCount": 0,
      "maxItems": 100
    },
    "entries": [
      {
        "entry": {
          "createdAt": "2017-04-10T09:12:32.761+0000",
          "isFolder": false,
          "isFile": true,
          "createdByUser": {
            "id": "test",
            "displayName": "Test User"
          },
          "modifiedAt": "2017-04-10T09:12:32.761+0000",
          "modifiedByUser": {
            "id": "test",
            "displayName": "Test User"
          },
          "name": "test-lorem-ipsum.txt",
          "id": "3379e95a-fa24-418e-a1df-7d7ef9192516",
          "nodeType": "cm:content",
          "content": {
            "mimeType": "text/plain",
            "mimeTypeName": "Plain Text",
            "sizeInBytes": 3186,
            "encoding": "ISO-8859-1"
          },
          "parentId": "d32682f0-cfd9-43da-ab74-ba78fc59a01a"
        }
      },
      ...
    ]
  }
}

 

To find sites the http://localhost:8080/alfresco/api/-default-/public/alfresco/versions/1/queries/sites endpoint can be used. The 5th request in the Postman collection shows how to look for sites that have the term "queries" in the site id, title or description. Again, take a look at the API Explorer for other options, including how to order the results.

{
  "list": {
    "pagination": {
      "count": 1,
      "hasMoreItems": false,
      "totalItems": 1,
      "skipCount": 0,
      "maxItems": 100
    },
    "entries": [
      {
        "entry": {
          "role": "SiteManager",
          "visibility": "PUBLIC",
          "guid": "763588b4-9c6f-4b34-af41-c92a6102711f",
          "description": "Site created for queries and search blog post",
          "id": "queriesSearchSite",
          "preset": "site-dashboard",
          "title": "Queries and Search Site"
        }
      }
    ]
  }
}
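
All of the /queries responses share the same list / entries / entry envelope, so a client can unpack them in the same way. The Python sketch below shows one possible way to do that for the sites response above; summarize_sites is a hypothetical helper and the sample data is a trimmed copy of that response.

```python
def summarize_sites(response):
    """Return (id, title) pairs from a /queries/sites response."""
    return [
        (item["entry"]["id"], item["entry"]["title"])
        for item in response["list"]["entries"]
    ]


# Trimmed copy of the sample response shown above.
sample_response = {
    "list": {
        "pagination": {
            "count": 1,
            "hasMoreItems": False,
            "totalItems": 1,
            "skipCount": 0,
            "maxItems": 100,
        },
        "entries": [
            {
                "entry": {
                    "visibility": "PUBLIC",
                    "id": "queriesSearchSite",
                    "title": "Queries and Search Site",
                }
            }
        ],
    }
}

print(summarize_sites(sample_response))
```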

 

Finally, to find people (users) the http://localhost:8080/alfresco/api/-default-/public/alfresco/versions/1/queries/people endpoint can be used. The 6th request in the Postman collection shows how to look for people that have "jackson" in their username (id), first name or last name. As my repository has the sample site loaded the sample user "Mike Jackson" is returned:

{
  "list": {
    "pagination": {
      "count": 1,
      "hasMoreItems": false,
      "totalItems": 1,
      "skipCount": 0,
      "maxItems": 100
    },
    "entries": [
      {
        "entry": {
          "lastName": "Jackson",
          "userStatus": "Working on a new web design for the corporate site",
          "jobTitle": "Web Site Manager",
          "statusUpdatedAt": "2011-02-15T20:13:09.649+0000",
          "mobile": "012211331100",
          "emailNotificationsEnabled": true,
          "description": "Mike is a demo user for the sample Alfresco Team site.",
          "telephone": "012211331100",
          "enabled": false,
          "firstName": "Mike",
          "skypeId": "mjackson",
          "avatarId": "3fbde500-298b-4e80-ae50-e65a5cbc2c4d",
          "location": "Threepwood, UK",
          "company": {
            "organization": "Green Energy",
            "address1": "100 Cavendish Street",
            "address2": "Threepwood",
            "address3": "UK",
            "postcode": "ALF1 SAM1"
          },
          "id": "mjackson",
          "email": "mjackson@example.com"
        }
      }
    ]
  }
}

 

As with the other queries endpoints the options are intentionally limited, see the API Explorer for details. We will also be looking at the people API in a lot more depth in the next instalment of this series so stay tuned!

 

As mentioned earlier, if the pre-canned queries do not provide what you need you have the option to use the rich and powerful /search API, at the cost of a little more complexity.

 

Due to the number of options and functionality available via the search API, it is a little different from most of the other APIs we've looked at so far in the series. Firstly, the API is defined under the "search" namespace so its base URL is slightly different. Secondly, the /search endpoint does not accept any query parameters and is therefore completely controlled via the POST body as we'll see in the examples that follow.

 

We'll start by executing a simple search. The 7th request in the Postman collection shows how POSTing the following body to http://localhost:8080/alfresco/api/-default-/public/search/versions/1/search searches for the term "lorem":

{
  "query": {
    "query": "lorem"
  }
}
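
Outside of Postman, the same request could be prepared with nothing more than the Python standard library, as in the sketch below. It assumes a local install that accepts basic authentication with the admin/admin credentials; both the credentials and the build_search_request helper name are assumptions made for illustration.

```python
import base64
import json
import urllib.request

SEARCH_URL = "http://localhost:8080/alfresco/api/-default-/public/search/versions/1/search"


def build_search_request(body, username="admin", password="admin"):
    """Prepare (but do not send) a POST request for the /search endpoint."""
    # Basic auth with admin/admin is an assumption about a local test install.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


request = build_search_request({"query": {"query": "lorem"}})
print(request.get_method(), request.full_url)

# To run the search against a live repository:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Everything that controls the search lives in the body, so the other examples in this post only need a different dictionary passed to the helper.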

 

The results should look familiar; for the most part they are the same as the results from /queries and from /nodes/{id}/children:

{
  "list": {
    "pagination": {
      "count": 7,
      "hasMoreItems": false,
      "totalItems": 7,
      "skipCount": 0,
      "maxItems": 100
    },
    "entries": [
      {
        "entry": {
          "isFile": true,
          "createdByUser": {
            "id": "mjackson",
            "displayName": "Mike Jackson"
          },
          "modifiedAt": "2011-03-03T10:31:31.651+0000",
          "nodeType": "cm:content",
          "content": {
            "mimeType": "application/vnd.ms-powerpoint",
            "mimeTypeName": "Microsoft PowerPoint",
            "sizeInBytes": 2898432,
            "encoding": "UTF-8"
          },
          "parentId": "38745585-816a-403f-8005-0a55c0aec813",
          "createdAt": "2011-03-03T10:31:30.596+0000",
          "isFolder": false,
          "search": {
            "score": 1.6137421
          },
          "modifiedByUser": {
            "id": "mjackson",
            "displayName": "Mike Jackson"
          },
          "name": "Project Overview.ppt",
          "location": "nodes",
          "id": "99cb2789-f67e-41ff-bea9-505c138a6b23"
        }
      },
      ...
    ]
  }
}

 

There are a couple of differences though: the search API returns two additional properties, search and location. The search property (line 29 of the response above) adds extra context for the individual result, in this case the score.

 

Explaining the full details is beyond the scope of this post, but it is possible to search across "live" nodes, deleted nodes and versioned nodes; the location property (line 37) shows which area the result came from. By default only "live" nodes are included.
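
To make the two extra properties a little more concrete, the Python sketch below pulls the name, score and location out of each result. The score_and_location helper is hypothetical and the sample data is a trimmed copy of the response above.

```python
def score_and_location(response):
    """Return (name, score, location) for each entry in a /search response."""
    rows = []
    for item in response["list"]["entries"]:
        entry = item["entry"]
        rows.append((entry["name"], entry["search"]["score"], entry["location"]))
    return rows


# Trimmed copy of the first entry from the response shown above.
sample_response = {
    "list": {
        "entries": [
            {
                "entry": {
                    "name": "Project Overview.ppt",
                    "search": {"score": 1.6137421},
                    "location": "nodes",
                    "id": "99cb2789-f67e-41ff-bea9-505c138a6b23",
                }
            }
        ]
    }
}

for name, score, location in score_and_location(sample_response):
    print(name, score, location)
```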

 

The example above used the default search language afts (Alfresco Full Text Search), however, cmis and lucene are also supported. The example body below shows how to execute a simple CMIS query (8th request in the Postman collection) to find all content with a name starting with "test.":

{
  "query": {
    "query": "select * from cmis:document WHERE cmis:name LIKE 'test.%'",
    "language": "cmis"
  }
}

 

For completeness, the example body below shows how to execute a simple Lucene query (9th request in the Postman collection) to find all the content modified in the last week:

{
  "query": {
    "query": "+@cm\\:modified:[NOW/DAY-7DAYS TO NOW/DAY+1DAY] +TYPE:\"cm:content\"",
    "language": "lucene"
  }
}
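
Note the escaping in the body above: the backslash and the double quotes in the Lucene query have to be escaped inside the JSON string. If you build the body in code, a JSON library does this for you. The Python sketch below starts from the raw query and lets json.dumps produce the escaped form.

```python
import json

# The raw Lucene query, written as a raw string so the backslash is kept as-is.
lucene_query = r'+@cm\:modified:[NOW/DAY-7DAYS TO NOW/DAY+1DAY] +TYPE:"cm:content"'

body = {"query": {"query": lucene_query, "language": "lucene"}}

# json.dumps escapes the backslash and the double quotes for us.
encoded = json.dumps(body, indent=2)
print(encoded)
```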

 

As with all the v1 REST APIs paging can also be controlled; it's just done via the body rather than a query parameter. The results can also be sorted. The example body below shows how to execute a search (10th request in the Postman collection) to find all content ordered by the name property (in descending order, as ascending is set to false), only show 25 results rather than the default of 100 and skip the first 10 results:

{
  "query": {
    "query": "+TYPE:\"cm:content\"",
    "language": "afts"
  },
  "paging": {
    "maxItems": "25",
    "skipCount": "10"
  },
  "sort": [{"type":"FIELD", "field":"cm:name", "ascending":"false"}]
}
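
To walk through every page of a large result set, a client can use the pagination object in each response to work out the paging object for the next request. The Python sketch below is one way to do that; next_paging is a hypothetical helper, the sample pagination values are made up for illustration, and it assumes the API accepts numeric values for maxItems and skipCount.

```python
def next_paging(pagination):
    """Return the "paging" object for the next page, or None on the last page."""
    if not pagination["hasMoreItems"]:
        return None
    return {
        "maxItems": pagination["maxItems"],
        # Skip everything already seen: the previous offset plus this page's size.
        "skipCount": pagination["skipCount"] + pagination["count"],
    }


# Illustrative pagination object, as it might come back for the request above.
first_page = {"count": 25, "hasMoreItems": True, "skipCount": 10, "maxItems": 25}
last_page = {"count": 7, "hasMoreItems": False, "skipCount": 35, "maxItems": 25}

print(next_paging(first_page))
print(next_paging(last_page))
```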

 

Now we've covered the basics let's look at a couple of the more interesting features of the search API, faceting and term highlighting.

 

There are two types of facets: queries and fields. A query facet returns the count of results for the given query; you can provide multiple facet queries in one request. A field facet returns a number of "buckets" for a field, providing the count of results that fit into each bucket.

 

It's much easier to understand with an example. The body below shows a search request (11th request in the Postman collection) that will look for content nodes that have a name or title starting with "test". We also specify that we want to know how many of the results are small files, how many are plain text files, how many are images and how many are Office files. Additionally, we are also asking for the creator facet field to be included, which will indicate how many of the results were created by each user:

{
  "query": {
    "query": "(name:\"test*\" OR title:\"test*\") AND TYPE:\"cm:content\""
  },
  "facetQueries": [
    {"query": "content.size:[0 TO 10240]", "label": "Small Files"},
    {"query": "content.mimetype:'text/plain'", "label": "Plain Text"},
    {"query": "content.mimetype:'image/jpeg' OR content.mimetype:'image/png' OR content.mimetype:'image/gif'", "label": "Images"},
    {"query": "content.mimetype:'application/msword' OR content.mimetype:'application/vnd.ms-excel'", "label": "Office"}
  ],
  "facetFields": {"facets": [{"field": "creator"}]}
}

 

The response to this request is shown below:

{
  "list": {
    "pagination": {
      "count": 8,
      "hasMoreItems": false,
      "totalItems": 8,
      "skipCount": 0,
      "maxItems": 100
    },
    "context": {
      "facetQueries": [
        {
          "label": "Office",
          "count": 2
        },
        {
          "label": "Small Files",
          "count": 4
        },
        {
          "label": "Plain Text",
          "count": 1
        },
        {
          "label": "Images",
          "count": 3
        }
      ],
      "facetsFields": [
        {
          "label": "creator",
          "buckets": [
            {
              "label": "test",
              "count": 6,
              "display": "Test User"
            },
            {
              "label": "System",
              "count": 2,
              "display": "System"
            }
          ]
        }
      ]
    },
    "entries": [
      {
        "entry": {
          "isFile": true,
          "createdByUser": {
            "id": "test",
            "displayName": "Test User"
          },
          "modifiedAt": "2017-04-10T09:21:44.499+0000",
          "nodeType": "cm:content",
          "content": {
            "mimeType": "image/gif",
            "mimeTypeName": "GIF Image",
            "sizeInBytes": 3039,
            "encoding": "UTF-8"
          },
          "parentId": "d32682f0-cfd9-43da-ab74-ba78fc59a01a",
          "createdAt": "2017-04-10T09:20:41.665+0000",
          "isFolder": false,
          "search": {
            "score": 2.0050006
          },
          "modifiedByUser": {
            "id": "test",
            "displayName": "Test User"
          },
          "name": "test.gif",
          "location": "nodes",
          "id": "4ba71ad8-8812-4c1a-9d0b-30643dc39c51"
        }
      },
      ...
    ]
  }
}

 

As well as the expected list of nodes, the response also contains a facetQueries and a facetsFields object containing the counts we requested. The facetQueries object has an entry for each query supplied in the request, whereas the facetsFields object contains an entry for each requested field, which in turn contains the count for each bucket.
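
A client will typically want these counts in a simpler shape, for example to render filter links. The Python sketch below flattens the context section into plain dictionaries; facet_counts is a hypothetical helper and the sample data is a trimmed copy of the response above.

```python
def facet_counts(response):
    """Flatten the facet section of a /search response into plain dicts."""
    context = response["list"].get("context", {})
    # One count per facet query, keyed by the label given in the request.
    query_counts = {f["label"]: f["count"] for f in context.get("facetQueries", [])}
    # One dict of bucket counts per facet field.
    field_counts = {
        field["label"]: {b["label"]: b["count"] for b in field["buckets"]}
        for field in context.get("facetsFields", [])
    }
    return query_counts, field_counts


# Trimmed copy of the facet context from the response shown above.
sample_response = {
    "list": {
        "context": {
            "facetQueries": [
                {"label": "Office", "count": 2},
                {"label": "Small Files", "count": 4},
                {"label": "Plain Text", "count": 1},
                {"label": "Images", "count": 3},
            ],
            "facetsFields": [
                {
                    "label": "creator",
                    "buckets": [
                        {"label": "test", "count": 6, "display": "Test User"},
                        {"label": "System", "count": 2, "display": "System"},
                    ],
                }
            ],
        },
        "entries": [],
    }
}

query_counts, field_counts = facet_counts(sample_response)
print(query_counts)
print(field_counts)
```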

 

The last example we're going to look at in this post is term highlighting. The example body below shows a search request (12th request in the Postman collection) that will look for content nodes that have a name or title starting with "test". If the match occurs in either the cm:name or cm:title property, the location of the match will be returned in the results. By default, the matched term is highlighted by being surrounded by an em tag; to surround the match with something else, the prefix and postfix properties can be used as shown in the example below:

{
  "query": {
    "query": "(name:\"test*\" OR title:\"test*\") AND TYPE:\"cm:content\""
  },
  "highlight": {
    "fields": [
      {
        "field": "cm:name",
        "prefix": "(",
        "postfix": ")"
      },
      {
        "field": "{http://www.alfresco.org/model/content/1.0}title"
      }
    ]
  }
}

 

As the highlighting is specific to each individual result the search object we saw earlier is used to return the result as shown below:

{
  "list": {
    "pagination": {
      "count": 8,
      "hasMoreItems": false,
      "totalItems": 8,
      "skipCount": 0,
      "maxItems": 100
    },
    "entries": [
      {
        "entry": {
          "isFile": true,
          "createdByUser": {
            "id": "System",
            "displayName": "System"
          },
          "modifiedAt": "2017-02-20T10:57:28.407+0000",
          "nodeType": "cm:content",
          "content": {
            "mimeType": "application/x-javascript",
            "mimeTypeName": "JavaScript",
            "sizeInBytes": 2271,
            "encoding": "UTF-8"
          },
          "parentId": "a4e9e481-89b5-43da-9389-21314dbb6046",
          "createdAt": "2017-02-20T10:57:28.407+0000",
          "isFolder": false,
          "search": {
            "score": 1.1892484,
            "highlight": [
              {
                "field": "cm:name",
                "snippets": [
                  "example (test) script.js.sample"
                ]
              },
              {
                "field": "{http://www.alfresco.org/model/content/1.0}title",
                "snippets": [
                  "Example <em>Test</em> Script"
                ]
              }
            ]
          },
          "modifiedByUser": {
            "id": "System",
            "displayName": "System"
          },
          "name": "example test script.js.sample",
          "location": "nodes",
          "id": "7e02b810-4bce-4ed6-aff0-3f2f88a5ff82"
        }
      },
      ...
    ]
  }
}

 

As we specified in the request, the match in the name property is surrounded by brackets (line 35) and the em tag surrounds the match in the title property (line 41).
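
Reading the snippets back out of a result is straightforward. The Python sketch below maps each highlighted field to its snippets; highlight_snippets is a hypothetical helper and the sample entry is a trimmed copy of the one in the response above.

```python
def highlight_snippets(entry):
    """Map each highlighted field of a search result entry to its snippets."""
    # Entries without a match in the requested fields may have no highlight list.
    highlights = entry.get("search", {}).get("highlight", [])
    return {h["field"]: h["snippets"] for h in highlights}


# Trimmed copy of the entry from the response shown above.
sample_entry = {
    "name": "example test script.js.sample",
    "search": {
        "score": 1.1892484,
        "highlight": [
            {
                "field": "cm:name",
                "snippets": ["example (test) script.js.sample"],
            },
            {
                "field": "{http://www.alfresco.org/model/content/1.0}title",
                "snippets": ["Example <em>Test</em> Script"],
            },
        ],
    },
}

for field, snippets in highlight_snippets(sample_entry).items():
    print(field, snippets)
```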

 

We've only just scratched the surface of the capabilities of the search API in this post so I would highly recommend you take a look at the API Explorer and select "Search API" from the drop-down menu to get more details of what's possible.

 

If you're using Community via the installer as instructed in the previous post you will have been using SOLR 4. You may have heard that we also released support for SOLR 6 with 5.2. To learn more please read the SOLR 6 blog posts on this site or visit our documentation site.

 

Next time we're going to take a look at the people API.
