AnsweredAssumed Answered

Activiti SQL query performance

Question asked by shailendra1 on Jan 8, 2014
Latest reply on Jan 10, 2014 by shailendra1
During one of our performance tuning exercise, we observed that while querying for all variables using

Activity was using following query defined in the xml

${limitBefore}
select distinct RES.*,
VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_,
VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_,
VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_,
VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
${limitBetween}

${orderBy}
${limitAfter}

The actual query fired was

select distinct RES.*,
VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_,
VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_,
VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_,
VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR ON RES.PROC_INST_ID_ = VAR.EXECUTION_ID_ and VAR.TASK_ID_ is null;

Having a look at the query it seems that the distinct part is unnecessary as using distinct in left outer join with select including values also from table OTHER THAN Base table does not make sense in my opinion. The rows of resultset of this query without distinct would be distinct anyway.
Because of distinct, the database has to work extra. For e.g., using explain plan on this query for MySQL, it shows that this uses a temporary table to copy first all the results to temp area and then process distinct which is un necessary extra work. If the act_ru_variable table size is large, then this involves huge processing.

Removing the distinct returns the same number of rows only this time the processing done by database is not required.
Of course if the query contains only columns from base tables then using distinct makes sense.

Have I overlooked something here or not aware of the reason why distinct is being used ?

Outcomes