Skip navigation
All Places > Alfresco Premier Services > Authors nmcminn-alfresco

It has now been a few days since DevCon 2018 wrapped up, and after digesting everything that happened I feel like it is a good time to write down a few thoughts.  In short, DevCon was AWESOME!  It's no secret that DevCon was always a favorite of our community, our partners and of course, the extended team at Alfresco.  This year's event felt like we never stopped doing it.  The attendees were engaged, the energy was high, and the sessions were informative and fun.  For that, we owe Kristen Gastaldo, Francesco Corti, Richard Esplin and their collaborators at the Order of the Bee a huge amount of gratitude.  Even though it was a reboot of a crowd favorite event, a lot has changed.

 

The Alfresco community has always been a driving force at DevCon, as it should be.  We are an open source company, after all, and when you are an open source company your community plays a pivotal role in your success.  This year that was on full display.  Not only were a huge number of our community superstars there in person, but even the talk selection was a joint effort between the community and the company.  The talks themselves were also a great mix of Alfrescians and our extended community, partners and customers.  If I recall correctly it was almost a neat split right down the middle.  Collaboration with the community was a running theme throughout the conference, and was the focus of one of the talks that wrapped up the first day.  Suffice to say, there are a TON of opportunities to get involved.  Whether you blog, write tutorials, create and contribute your own community extensions or work on an existing one, or submit pull requests to one of our numerous open source projects, it isn't hard to find a way to contribute.

 

As for the talks themselves, the content was of the quality we expect from our team and community.  Several people from the Alfresco Strategic Services team were among those selected to present, but unfortunately I didn't get to see all of them due to conflicts with my own speaking slots and other conference duties.  Jose Portillo spoke on Solr Sharding, Luis Cabaceira shared some great work he has been doing with Piergiorgio Lucidi on ManifoldCF, Richard McKnight covered some best practices for building Alfresco extensions, and Mohammed Gazal took his first turn on the stage to present some work on PDF templating.  Luckily the talks were all recorded, and will be posted in the coming weeks.  If you missed something the first time around, you'll have the whole event at your fingertips. 

 

One thing in particular that I liked about this year's DevCon was the mix between what is best practice today, what is coming from Alfresco, and a third category that you might call "the art of the possible".  Of course we had some great sessions on securing content with our governance tools, exporting content, building ACS extensions, tuning Solr and building resilient systems, but we also had a lot of forward looking sessions focused on AWS, ADF and other cutting edge stuff coming out of Alfresco.  I'd like to see this continue in the future.  It's important for us to share the best way to do things today as well as preparing ourselves and our community for the future.

 

With so many submissions not everybody was able to be included in the official conference agenda, but we are working to find ways to get that knowledge out into the world.  Perhaps as blog posts, or maybe some community webcasts?  Not sure yet but I'd love to hear how you would like us to continue the momentum that DevCon started.  I'm coming off this conference with a huge amount of energy, and can't wait to see what the rest of 2018 has in store! 

Alfresco's Customer Success team is always looking for ways to better serve our customers.  Sometimes this takes the form of improved processes in support, or implementing 360 degree views of customers for our CSMs, or improving the way we capture and manage best practices for our knowledgebase.  Sometimes this means taking entire teams and realigning them to better match what our customers are telling us that they need. 

 

Within Alfresco's Customer Success group, we have two teams that provide support and services that go beyond what comes with our primary support:  Alfresco Premier Services and Alfresco Consulting.  Not surprisingly, these two teams often work hand in hand.  Premier Services may work with a customer on an issue until it becomes clear that the business solution will require an extension, and then help the customer to engage Alfresco consulting to get it done.  Conversely, our consulting team provides solutions that affect the recommendations and troubleshooting steps that our support teams will take when working with a customer on subsequent questions.  When we look at the way these two teams work and what they do for our customers, it makes a lot of sense to pull them under the same roof.

 

Alfresco's Strategic Services organization was created for just that purpose, marrying the best of our elevated operational support offerings with our skilled and seasoned consulting team.  We have already started reimagining parts of our delivery and establishing new processes and procedures for internal knowledge sharing and capture.  In the coming weeks we will be focused on improving our ability to seamlessly move work across the team to make sure the right people have eyes on the right things at the right time.  It's an exciting time for services at Alfresco!

Troubleshooting application performance can be tricky, especially when the application in question has dependencies on other systems.  Alfresco's Content Services (ACS) platform is exactly that type of application.  ACS relies on a relational database, an index, a content store, and potentially several other supporting applications to provide a broad set of services related to content management.  What do you do when you have a call that is slow to respond, or a page that seems to take forever to load?  After several years of helping customers diagnose and fix these kinds of issues, my advice is to start at the bottom and work your way up (mostly).

 

When faced with a performance issue in Alfresco Content Services, the first step is to identify exactly what calls are responding slowly.  If you are using CMIS or the ACS REST API, this is simple enough, you'll know which call is running slowly and exactly how you are calling it.  It's your code making the call, after all.  If you are using an ADF application, Alfresco Share or a custom UI it can become a bit more involved.  Identifying the exact call is straightforward and you can approach this the same way you would approach it for any web application.  I usually use Chrome's built in dev tools for this purpose.  Take as an example the screenshot below, which shows some of the requests captured when loading the Share Document Library on a test site:

 

 

In this panel we can see the individual XHR requests that Share uses to populate the document library view.  This is the first place to look if we have a page loading slowly.  Is it the document list that is taking too long to load?  Is it the tag list?  Is it a custom component?  Once we know exactly what call is responding slowly, we can begin to get to the root of our performance issue.

 

When you start troubleshooting ACS performance, it pays to start at the bottom and work your way up.  This usually means starting at the JVM.  Take a look at your JVM stats with your profiler of choice.  Do you see excessive CPU utilization?  How often is garbage collection running?  Is the system constantly running at or close to your maximum memory allocation?  Is there enough system memory available to the operating system to support the amount that has been allocated to the JVM without swapping?  It is difficult to provide "one size fits all" guidance for JVM tuning as the requirements will vary based on the the type of workload Alfresco is handling. Luis Cabaceira has provided some excellent guidance on this subject in his blog.  I highly recommend his series of articles on performance, tuning and scale.  When troubleshooting ACS performance, start by ensuring you see healthy JVM behavior across all of your application tiers.  Avoid the temptation to just throw more memory at the problem, as this can sometimes make things worse.

 

Unpacking Search Performance

 

Assuming that the JVM behavior looks normal, the next step is to look at the other components on which ACS depends.  There are three main subsystems that ACS uses to read / write information:  The database, the index, and the content store.  Before we can start troubleshooting, we need to know which one(s) are being used in the use case that is experiencing a performance problem.  In order to do this, you will need to know a bit about how search is configured on your Alfresco installation.  Depending on the version you have installed, Alfresco Content Services (5.2+) / Alfresco One (4.x / 5.0.x / 5.1.x) supports multiple search subsystem options and configurations.  It could be Solr 1.4, Solr 4, Solr 6 or your queries could be going directly against the database.  If you are on on Alfresco 4.x, your system could also be configured to use the legacy Lucene index subsystem, but that is out of scope for this guide.  The easiest way to find out which index subsystem is in use is to look at the admin console.  Here's a screenshot from my test 5.2 installation that shows the options:

 

 

Now that we know for sure which search subsystem is configured, we need to know a little bit more about search configuration.  Alfresco Content Services supports something known as Transactional Metadata Queries.  This feature was added to supplement Solr for certain use cases.  The way Solr and ACS are integrated is "eventually consistent".  That is to say that content added to the repository is not indexed in-transaction.  Instead, Solr queries the repository for change sets, and then indexes those changes.  This makes the whole system more scalable and performant when compared with the older Lucene implementation, especially where large documents are concerned.  The drawback to this is that content is not immediately queryable when it is added.  Transactional Metadata Queries work around this by using the metadata in the database to perform certain types of queries, allowing for immediate results.  When troubleshooting performance, it is important to know exactly what type of query is executed, and whether it runs against the database or the index.  Transactional metadata queries can be independently turned on or off to various degrees for both Alfresco Full Text Search and CMIS.  To find out how your system is configured, we can again rely on the ACS admin console:

 

The full scope of Transactional Metadata Queries is too broad for this guide, but everything you need to know is in the Alfresco documentation on the topic.  Armed with knowledge of our search subsystem and Transactional Metadata Query configurations, we can get down to the business of troubleshooting our queries.  Given a particular CMIS or AFTS query, how do we know if it is being executed against the DB or the index?  If this is a component running a query you wrote, then you can look at the Transactional Metadata Query documentation to see if Alfresco would try to run it against the database.  If you are troubleshooting a query baked into the product, or you want to see for sure how your own query is being executed, turn on debug logging for class DbOrIndexSwitchingQueryLanguage.  This will tell you for sure exactly where the query in question is being run.

 

The Database

 

If you suspect that the cause may be a slow DB query, there are several ways to investigate.  Every DB platform that Alfresco supports for production use has tools to identify slow queries.  That's a good place to start, but sometimes it isn't possible to do because you, as a developer or ACS admin, don't have the right access to the DB to use those tools.  If that's the case you can contact your DBA or you can look at it from the application server side.  To get the app server view of your query performance you again have a few options.  You could use a JDBC proxy driver like Log4JDBC or JDBCSpy that can output query timing to the logs.  It seems that Log4JDBC has seen more recent development so that might be the better choice if you go the proxy driver route.  Another option is to attach a profiler. JProfiler and YourKit both support probing JDBC performance.  YourKit is what we use most often at Alfresco, and here's a small example of what it can show us about our database connections:

 

 

With this view it is straightforward to see what queries are taking the most time.  We can also profile DB connection open / close and several other database related bits that may be of interest.  The ACS schema is battle tested and performant at this point in the product lifecycle, but it is fairly common to see slow queries as a result of a database configuration problem, an overloaded shared database server, poor network connection to the database, out of date index statistics or a number of other causes.  If you see a slow query show up during your analysis, you should first check the database server configuration and tuning.  If you suspect a poorly optimized query (which is rare) contact Alfresco support.

 

One other common source of database related performance woes is the database connection pool.  Alfresco recommends setting the maximum database connection pool size on each cluster node to the number of concurrent application server worker threads + 75 to cover overhead from scheduled jobs, etc.  If you have Tomcat configured to allow for 200 worker threads (200 concurrent HTTP connections) then you'll need to set the database pool maximum size to 275.  Note that this may also require you to increase the limit on the database side as well.  If you have a lot of requests waiting on a connection from the pool that is not going to do good things for performance.

 

The Index

 

The other place where a query can run is against the ACS index.  As stated earlier, the index may be one of several types and versions, depending on exactly what version of Alfresco Content Services / Alfresco One you are using and how it is configured.  The good news is that you can get total query execution time the same way no matter which version of Solr your Alfresco installation is using.  To see the exact query that is being run, how long it takes to execute and how many results are being returned, just turn on debug logging for class SolrQueryHttpClient.  This will output debug information to the log that will tell you exactly what queries are being executed and how long each execution takes.  Note that this is the query time as returned in the result set, and should just be the Solr execution time without including the round trip time to / from the server.  This is an important distinction, especially where large result sets are concerned.  If the connection between ACS and the search service is slow then a query may complete very quickly but the results could take a while to arrive back at the application server.  In this case the index performance may be just fine, but the network performance is the bottleneck.

 

If the queries are running slowly, there are several things to check.  Good Solr performance depends heavily on good underlying disk I/O performance.  Alfresco has some specific recommendations for minimum disk performance.  A fast connection between the index and repository tiers is essential, so make sure that any load balancers or other network hardware that sit between the tiers are providing good performance.  Another thing to check is the Solr cache configuration.  Alfresco's search subsystem provides a number of caches that improve search performance at the cost of additional memory.  Make sure your system is sized appropriately using the guidance Alfresco provides on index memory requirements and cache sizes.  Alfresco's index services and Solr can show you detailed cache statistics that you can use to better understand critical performance factors like hit rates, evictions, warm up times, etc as shown in this screenshot from Alfresco 5.2 with Solr 4:

 

 

In the case of a large repository, it might also help to take a deeper look at how sharding is configured including the number of shards and hosts and whether or not the configuration is appropriate.  For example, if you are sharding by ACL and most of your documents have the same permissions, then it's possible the shards are a bit unbalanced and the majority of requests are hitting a single shard.  For this case, sharding by DBID (which ensures an even distribution) might be more appropriate and yield better performance.

 

It is also possible that a slow running query against the index might need some tuning itself.  The queries that Alfresco uses are well optimized, but if you are developing an extension and want to time your own queries I recommend looking at the Alfresco Javascript Console.  This is one of the best community developed extensions out there, and it can show you execution time for a chunk of Alfresco server-side Javascript.  If all that Javascript does is execute a query, you can get a good idea of your query performance and tweak / tune it accordingly.

 

The Content Store

 

Of all of the subsystems used for storing data in Alfresco, the content store is the one that has (typically) the least impact on overall system performance.  The content store is only used when reading / writing content streams.  This may be when a new document is uploaded, when a document is previewed, or when Solr requests a text version of a document for indexing.  Poor content store performance can show itself as long upload times under load, long preview times, or long delays when Solr is indexing content for the first time.  Troubleshooting this means looking at disk utilization or (if the content store resides on a remote filesystem) network utilization.

 

Into The Code

 

A full discussion of profiling running code would turn this from an article into a book, but any good systems person should know how to hook up a profiler or APM tool and look for long running calls.  Many Alfresco customers use things like Appdynamics or New Relic to do just that.  Splunk is also a common choice, as is the open source ELK stack.  All of these suites can provide a lot more than just what a profiler can do and can save your team a ton of time and money.  Alfresco's support team also finds JStack thread dumps useful.  If we see a lot of blocked threads that can help narrow down the source of a problem.  Regardless of the tools you choose, setting up good monitoring can help you find emerging performance problems before they become user problems.

 

Conclusion

 

This guide is nowhere near comprehensive, but it does cover off some of the most common causes for Alfresco performance issues we have seen in the support world.  In the future we'll do a deeper dive into the index, repository and index tier caching, content transformer selection and execution, permission checking, governance services and other advanced topics in performance and scalability.

About three years ago, Alfresco created a new branch of our support and services organization, the Premier Services team.  Since then, the Premier Services team has grown into a first class global support organization, handling many of Alfresco's largest and most complex strategic accounts.  Our group consists of some of Alfresco's most seasoned and senior support staff and has a presence in APAC, EMEA and the US serving customers worldwide.  Today we start a new chapter in our journey.

 

One of the benefits of working with large accounts is the breadth and depth of problems we get to help them solve.  Premier Services accounts tend to be those with extensive integrations, demanding uptime, reliability and performance requirements, complex business environments and product extensions.  We are launching our blog to share best practices, interesting insights, code / configuration examples and problem solving tips that arise from our work.  This blog will also serve as a platform for sharing new service offerings and updates to existing offerings, and to give our customers and community members some insights into the direction that our service offerings will take as they evolve.  For our inaugural blog post, we'd like to talk about three recent changes to the Alfresco Premier Services offerings which we think will help reduce confusion about what we deliver and give our customers some extra value.  

 

First, you may have noticed that our web site has been updated as a part of Alfresco's Digital Business Platform launch.  As a part of this update we have started the process of merging two of our premier services offerings.  Previously we offered an On-site Services Engineer (OSE) and a Remote Services Engineer (RSE).  Going forward we are combining these into a single Premier Services Engineer (PSE) offering.  We can still deliver it on-site or remote, and the pricing has not changed.  Where there were differences in the services, we've taken the more generous option and made it the default.  For example, OSE and RSE service used to come with a different number of Alfresco University Passports.  Going forward all PSE customers will get five passports included with their service.  Future passports issued to Premier Services accounts will also include a certification voucher.

 

The other changes in our service are additions intended to help with a pair of common customer requests.  It is common for customers to start to staff up at the beginning of an Alfresco project, and we are often asked to help evaluate potential hires.  In support of this request we have created a set of hiring profiles that identify the core skills that we find will enable a new hire to come up to speed quickly on the Alfresco platform.  These hiring profiles and assessments are available now to all Premier Services accounts.  A second major change is the addition of Alfresco Developer Support to Premier Services accounts on the Alfresco Digital Business Platform.  What this means is that if you are a Premier Services customer and are using Alfresco Content Services 5.2+ AND Alfresco Process Services powered by Activiti 1.6+, you will get Developer Support for two of your support contacts included with your service at no additional cost.  This is a huge addition to the Premier Services portfolio, and we're excited to be able to offer it in conjunction with our peers on the dev support team.

 

Stay tuned for more from the Premier Services team, our next blog posts will take an in-depth look at some challenging technical issues.