Xense Profiler

Version 1.3

Xense Profiler is a powerful performance analysis tool that enables you to analyse Documentum 5 DMCL (Documentum Client Library) trace files and locate performance data quickly. It is a Windows command line tool that accepts a level 10 DMCL trace as input and outputs a summary report in HTML, text or XML formats.

Xense Profiler supports the following functionality:

Accepts DMCL trace files of any size
Reports created in HTML (default), XML or text format
API Call summary report
Top 10 long-running API calls report
Top 20 long-running Queries report
Query Summary report
Top Queries and Query Summary contains the true duration of the query not just the duration of the intial query statement (see Top Queries below)
Durations are correctly attributed to API calls even in WDK multi-thread trace files (see Multi-thread Trace Files below)
Filter trace file to show only API calls that generate Remote Procedure Calls (see RPC Only Filter Option below)
Filter trace file by DMCL thread number (see Multi-thread Trace Files below)
Alternative 'Detail' mode allows raw DMCL trace to be filtered using RPC Only and thread filters
'Detail' mode also has an output-to-CSV option

Please read the Licensing information in the accompanying file Licensing.txt before installing and using this software.

What's New in version 1.3

Version 1.3 contains the following changes and bug fixes:

Top Queries report that lists the top 20 longest-running queries
Performance improvement for analysing Documentum 5.3 dmcl traces
Top Queries and Query Summary reports show rows processed for each query
API Call Summary now sorts APIs by total duration
Tested on Windows Vista
A number of cosmetic improvements to the report

What's New in version 1.2

Version 1.2 is maintenance release that fixes a bug that could cause the profiler to not report the full API in some cases.

Requirements and Installation

Xense Profiler is designed to run as a Windows command line application. It has been tested on Windows Vista, Windows XP, Windows 2003 and Windows 2000. It should run on other Windows 32-bit or 64-bit platforms - if you experience problems running Xense Profiler on other Windows platforms please let us know by mailing to profiler_support@xense.co.uk.

To install, unzip the files (e.g. to c:\Program Files\Xense\Profiler).

How to run Xense Profiler

To run Xense Profiler open a CMD prompt, CD to the directory where you installed the application. The command line format is as follows:

dmclprof tracefile outputfile [options]

The default operation is to process 'tracefile' and create an HTML report as given by 'outputfile'.

The options that can be specified are as follows:

Option	Description
-s[ummary] format	format can be HTML, XML or TEXT
-r[pconly]	filter out API calls that do not generate RPCs
-l[istthreads]	filter trace by DMCL thread
-d[etail] format	format can be RAW or CSV

Note. -summary and -detail options are mutually exclusive.

Some Examples

The following command processes the trace file view_draft_cabinet.log and creates an HTML summary report in view_draft_cabinet.htm:

dmclprof view_draft_cabinet.log view_draft_cabinet.htm

The report view_draft_cabinet.htm can be viewed by opening it in a web browser.

To create an XML report:

dmclprof view_draft_cabinet.log view_draft_cabinet.xml -s XML

To create a text report:

dmclprof view_draft_cabinet.log view_draft_cabinet.txt -s TEXT

To filter the HTML report for API calls that generate RPCs:

dmclprof view_draft_cabinet.log view_draft_cabinet.htm -r -s HTML

- or -

dmclprof view_draft_cabinet.log view_draft_cabinet.htm -r

To filter a trace file that contains trace data for 3 different threads (thread numbers 30, 32 and 67) and create a text report:

dmclprof view_draft_cabinet.log view_draft_cabinet.txt -l -s TEXT

When the 'listthreads' option is specified Xense Profiler displays a list of the threads contained in the trace file and prompts you to select one. Just the thread that is selected is then processed.

To transform a DMCL trace file into CSV format:

dmclprof view_draft_cabinet.log view_draft_cabinet.csv -d CSV

To filter a DMCL trace file for just the APIs that generate RPCs:

dmclprof view_draft_cabinet.log view_draft_cabinet_rpconly.log  -d RAW -r

Report Sections

The reports generated by Xense Profiler contain 4 sections:

Top 10 API Calls
API Call Summary
Top Queries
Query Summary

Top 10 API Calls

The Top 10 API Calls report lists the 10 API calls that took the most time to complete. For each API the time to complete the API and the full text of the API are reported.

Typically the longest running calls will be one of the query calls (query,query_cmd, execquery, readquery or cachequery) however in some circumstances authentication problems, network problems or database blocking problems can also cause other types of calls to appear in this report.

An example:

Duration secs	API Text
81.555	query_cmd,s1,T,F,,,,,select distinct a.r_object_id, a.object_name,upper(a.object_name) as objname, a.r_object_type, a.a_content_type,a.r_version_label, a.r_lock_owner, a.r_link_cnt, a.r_is_virtual_doc,'y' as selectable, '' as navigatable, 'y' as notnavigatable from dm_document (all) a where a.a_controlling_app = 'dm_dcm' and a.r_lock_owner = ' ' and a.r_object_id != '0900000180050012' and exists (select 1 from dcm_extended_sysobject b where b.parent_id = a.r_object_id and b.document_class in ('Class 1a','Class 1b', 'Class 1c','Class 1d','Class 1e','Class 2') ) and exists (select 1 from dcm_state_extension c where c.parent_id = a.r_policy_id and c.state_no = a.r_current_state and c.document_class = (select d.document_class from dcm_extended_sysobject d where d.parent_id = a.r_object_id) and c.state_type in ('dcm_review') ) and i_latest_flag = '1' order by 3 asc
0.622	next,s2,q0
0.376	execquery,s2,1,SELECT r_object_id,score,text,object_name, r_object_type, r_lock_owner,owner_name, r_link_cnt,r_is_virtual_doc,r_content_size, a_content_type,i_is_reference,r_assembled_from_id, r_has_frzn_assembly, a_compound_architecture,i_is_replica,r_policy_id,subject,r_modify_date FROM dm_sysobject WHERE (object_name LIKE '%XENSE-FORM-99998%' ESCAPE '\') AND (a_is_hidden = FALSE) ENABLE(FTDQL)
0.350	fetch,s2,57000001800099a0,,T,dm_type_cache_config
0.325	query_cmd,s2,F,F,,,,,select r_object_id, group_name from dm_group where any users_names = 'John Doe'
0.185	execquery,s2,0,select category_name from dm_dbo.category where is_active = 1 order by category_name
0.141	fetch,s2,5700000180009901,,T,dm_type_cache_config
0.141	fetch,s2,5700000180009909,,T,dm_type_cache_config
0.130	get,s2,tsop_form:46000001800049b2:Draft.a_status,is_hidden
0.126	execquery,s2,0,select org_name from dm_dbo.org where is_active = 1 order by org_name

Clearly the query listed at the top of this report dominates the elapsed time covered by the trace. Performance tuning effort should concentrate on tuning this query.

API Calls Summary

The API Call Summary reports each type of API call, the number of times the call was made and the total duration for the API call:

API Call	Duration secs	Frequency
close	0.001	150
connect	0.051	5
count	0.000	19
create	0.008	7
disconnect	0.031	5
fetch	3.433	298
get	0.254	32001
id	0.132	72
next	1.372	5215
query_cmd	4.585	180
values	0.006	1789

This type of report is particularly useful where there is no single large query dominating the elapsed time covered by the trace. This trace was taken from a system where clicking on a particular folder took 10 seconds to display the contents of the folder. The 10 second elapsed time is 'explained' by the large number of query_cmd, fetch and next statements executed. Tuning effort should concentrate on coding or configuration that reduces the use of these calls or at least reduces the number of times the calls result in RPCs.

Top Queries and Query Summary

The Top Queries report lists the 20 queries that took the most time to complete. For each query the following information is detailed:

time to completely process the query
number of rows processed
DQL query text

An example:

Top Queries

Duration secs	Rows	Query Text
39.225	12	SELECT r_object_id, title, rcod_id, expiry_date, r_version_label as published_label,object_name, r_object_type,r_lock_owner,owner_name,r_link_cnt,r_is_virtual_doc, r_content_size,a_content_type,i_is_reference from bus_doc WHERE bus_doc.main_id='55378989' AND bus_doc.rating_id IS NOT NULLINT AND bus_doc.rating_id NOT IN(5,52,1) AND bus_doc.main_date IS NOT NULLDATE AND (calc_date+ stock_life/2) <= DATE(today) + 90 AND bus_doc.stock_life IS NOT NULLINT AND bus_doc.r_current_state NOT IN (2,3) AND ANY bus_doc.r_version_label='PUBLISHED' AND (bus_doc.r_lock_owner='' OR bus_doc.r_lock_owner =' ')
38.116	12	SELECT r_object_id, title, rcod_id, calc_date as expiry_date, r_version_label as published_label,object_name,r_object_type, r_lock_owner,owner_name,r_link_cnt,r_is_virtual_doc, r_content_size,a_content_type,i_is_reference,s_title,hist_initial_date, main_date,stock_life,s_title,hist_initial_date,main_date,stock_life from bus_doc WHERE bus_doc.main_id='55378989' AND bus_doc.rating_id IS NOT NULLINT AND bus_doc.rating_id NOT IN(5,52,1) AND bus_doc.main_date IS NOT NULLDATE AND (calc_date+ stock_life/2) <= DATE(today) + 90 AND bus_doc.stock_life IS NOT NULLINT AND bus_doc.r_current_state NOT IN (2,3) AND ANY bus_doc.r_version_label='PUBLISHED' AND (bus_doc.r_lock_owner='' OR bus_doc.r_lock_owner =' ')
23.001	65	SELECT r_object_id, title, rcod_id, calc_date as expiry_date, r_version_label as published_label,object_name,r_object_type, r_lock_owner,owner_name,r_link_cnt,r_is_virtual_doc, r_content_size,a_content_type,i_is_reference,s_title,hist_initial_date,main_date,stock_life, s_title,hist_initial_date,main_date,stock_life from bus_doc WHERE bus_doc.main_id='55378989' AND bus_doc.rating_id IS NOT NULLINT AND bus_doc.rating_id NOT IN(5,52,1) AND bus_doc.main_date IS NOT NULLDATE AND (calc_date+ stock_life/2) <= DATE(today) + 90 AND bus_doc.stock_life IS NOT NULLINT AND bus_doc.r_current_state NOT IN (2,3) AND ANY bus_doc.r_version_label='PUBLISHED' AND (bus_doc.r_lock_owner='' OR bus_doc.r_lock_owner =' ')
15.225	68	SELECT r_object_id, title, rcod_id, calc_date as expiry_date, r_version_label as published_label,object_name,r_object_type, r_lock_owner,owner_name,r_link_cnt, r_is_virtual_doc,r_content_size,a_content_type,i_is_reference,s_title,hist_initial_date, main_date,stock_life,s_title,hist_initial_date, main_date,stock_life from bus_doc WHERE bus_doc.main_id='55378989' AND bus_doc.rating_id IS NOT NULLINT AND bus_doc.rating_id NOT IN(5,52,1) AND bus_doc.main_date IS NOT NULLDATE AND (calc_date+ stock_life/2) <= DATE(today) + 90 AND bus_doc.stock_life IS NOT NULLINT AND bus_doc.r_current_state NOT IN (2,3) AND ANY bus_doc.r_version_label='PUBLISHED' AND (bus_doc.r_lock_owner='' OR bus_doc.r_lock_owner =' ')
3.733	8	select stock_life, stock_life as stock_life_nm from dm_dbo.stock_life_v
3.150	4	select r_object_id,object_name,r_lock_owner,title, i_chronicle_id,r_assembled_from_id,i_is_reference,a_compound_architecture, r_is_virtual_doc,i_is_replica,r_link_cnt,r_object_type,a_category,r_has_frzn_assembly, a_content_type,r_page_cnt,i_vstamp , contain_id from dm_sysobject IN Document ID('0978a00b80336427') with any r_version_label = 'CURRENT'
1.722	2	select r_object_id from dmc_module where a_module_type='TBO' and object_name='rpco_kdo_doc_info' and folder('/System/Modules',descend)
0.727	2	select DISTINCT r_workflow_id from dmi_package where any r_component_chron_id='0978a00b801f63f2'
0.717	3	select r_object_id, parent_id, component_id, version_label, order_no, use_node_ver_label, follow_assembly, copy_child, a_contain_desc, a_contain_type from dmr_containment where parent_id = '0978a00b80336427' and order_no != 0 order by order_no
0.316	3	select a.r_object_id from rpco_kdo_doc_info a, bus_doc b where b.r_object_id = '0978a00b80336427' and b.rcod_id = a.rcod_id and b.main_id = a.kdo_id
0.312	4	select r_object_id,object_name,r_lock_owner,title,i_chronicle_id, r_assembled_from_id,i_is_reference,a_compound_architecture, r_is_virtual_doc,i_is_replica,r_link_cnt,r_object_type,a_category,r_has_frzn_assembly,a_content_type, r_page_cnt,i_vstamp ,contain_id from dm_sysobject IN Document ID('0978a00b80336429') with any r_version_label = 'CURRENT'
0.275	18	select r_object_id,r_version_label,i_position from dm_sysobject(all) where i_chronicle_id_i=ID('0978a00b8020898a') order by i_position desc
0.261	3	select a.r_object_id from rpco_kdo_doc_info a, bus_doc b where b.r_object_id = '0978a00b8033642b' and b.rcod_id = a.rcod_id and b.main_id = a.kdo_id
0.255	5	SELECT rcod_id from bus_doc(ALL) where r_object_id in ('0978a00b80226368','0978a00b80222b6f','0978a00b8021ef0f')
0.214	2	select r_object_id from dm_type where name = 'bus_doc' and any attr_name = 'rcod_id'
0.207	44	select r_object_id,r_version_label,i_position from dm_sysobject(all) where i_chronicle_id_i=ID('0978a00b80207dee') order by i_position desc
0.203	4	select r_object_id,object_name,r_lock_owner,title,i_chronicle_id, r_assembled_from_id,i_is_reference,a_compound_architecture, r_is_virtual_doc,i_is_replica,r_link_cnt,r_object_type,a_category,r_has_frzn_assembly,a_content_type, r_page_cnt,i_vstamp ,contain_id from dm_sysobject IN Document ID('0978a00b8033642b') with any r_version_label = 'CURRENT'
0.087	2	select r_object_id from dmc_module where a_module_type='TBO' and object_name='rpco_attachment' and folder('/System/Modules',descend)
0.016	2	select r_object_id from dmc_module where a_module_type='TBO' and object_name='rpco_file' and folder('/System/Modules',descend)
0.016	37	select r_object_id,r_version_label,i_position from dm_sysobject(all) where i_chronicle_id_i=ID('0978a00b801f63f2') order by i_position desc

The Top Query report includes the true cost of executing a DQL query. When a query is executed a query API call such as query_cmd, query, readquery, execquery or cachequery is issued to the Content Server and a collection identifier is returned. A 'next' API call is then issued on the collection each time the client application retrieves a row from the query collection.

The true cost of the query is not just the duration of the API call to execute the query, it should include the time to execute the 'next' calls on the collection returned by the query call. In many cases a 'problem' query does not show up in the Top 10 API Calls report because no single API call takes long to execute. However aggregating over all the API calls associated with the query shows the true cost of the query.

The Query Summary reports all the queries in the trace in the order that they were executed. This report is useful for viewing the overall flow of queries in the application being traced; it can help the performance analyst identify queries that are being executed unnecessarily.

RPC Only Filter Option

The Documentum Client Library (aka DMCL) communicates with the Content Server using Remote Procedure Calls (RPCs). However very many API calls don't generate RPC. A typical ratio is 1 in 10 API calls generate an RPC. Calls that don't generate RPCs usually complete very quickly and their presence in the DMCL trace is often unhelpful noise to the performance analyst. The RPC-Only filtering causes the Profiler engine to ignore API calls that don't generate RPCs.

Multi-thread Trace Files

Multi-threaded access to the DMCL is one of the most significant consequences of the move to WDK-based applications. The architecture of WDK applications running in a Java application server means that many DMCL sessions are operating concurrently in the same process. When DMCL tracing is turned on for a DMCL session all the DMCL sessions running in the App server will start logging to a single DMCL trace file. In fact, if you have multiple WDK applications installed into the same application server instance all sessions from all applications will log to the same trace file.

The different sessions are identified in the trace by the prefix to each trace line; e.g. in this example trace line the thread number is 324:

# [ 324 ] Thu Jan 15 22:31:25 2004 689000 ( 0.000 sec) ( 136 rpc) API> retrieve,c,dm_document

The most obvious problem with this is that the size of the DMCL trace can be massively increased. One of the traditional approaches to analysing DMCL traces is to convert the trace into a CSV file and then load the CSV file into Excel. Documentum provide some scripts to do this however the larger API traces now created can exceed the 65000 line limit for a single Excel sheet.

There is a more critical problem with the traditional approach to parsing DMCL traces. Consider the following DMCL trace fragment:

# [ 32 ] Thu Jul 6 13:22:24 2006 854499 ( 0.000 sec) ( 1116825 rpc) API> query_cmd,s4,T,F,,,,,select distinct a.r_object_id, a.object_name, ...
# [ 32 ] Thu Jul 6 13:22:24 2006 854865 ( 0.000 sec) ( 1116825 rpc) Server RPC: EXEC (0000000000000000) select distinct a.r_object_id, a.obj...
# [ 33 ] Thu Jul 6 13:22:35 2006 310248 ( 0.000 sec) ( 1116825 rpc) API> get,s2,1100000180003501,user_privileges
# [ 33 ] Thu Jul 6 13:22:35 2006 310663 ( 0.000 sec) ( 1116825 rpc) Res: '0'
# [ 33 ] Thu Jul 6 13:22:35 2006 313807 ( 0.000 sec) ( 1116825 rpc) API> get,s2,0c0000018000555f,_changed
         ... <58 seconds of thread 33 trace omitted here>
# [ 33 ] Thu Jul  6 13:23:33 2006 355773 ( 0.000 sec) ( 1118181 rpc) API> get,s2,0b00000180005561,object_name
# [ 33 ] Thu Jul  6 13:23:33 2006 355883 ( 0.000 sec) ( 1118181 rpc) Res: 'Draft'
# [ 32 ] Thu Jul 6 13:23:46 2006 409286 ( 81.555 sec) ( 1118182 rpc) Res: 'q0'

2 DMCL threads are being run concurrently, threads 32 and 33. Thread 32 runs a query using a query_cmd call that takes 81.555 seconds to complete. In between the time the query_cmd call is issued and the response is received thread 33 is making calls into the DMCL as well.

The standard parsing script provided by Documentum (parsetrace.awk) shows the following API consuming 81.555 seconds:

32	81.555	1118182	get,s2,0b00000180005561,object_name	q0

The problem is one of incorrect attribution. The duration obtained from the thread number 32 Res: trace line (81.555) is attributed to the immediately preceding API> trace line. In this case however this turns out to be an API call from a completely unrelated thread number 33.

This is a major problem for any DMCL trace involving multiple threads as the APIs you are most likely to be interested in are the long-running ones, the ones most likely to suffer from incorrect attribution.

Xense Profiler addressed all these problems:

Durations are attributed to the correct API call
There is no limit to the number of API calls that can be processed in a trace file
Filtering is available on the DMCL thread number.

Bugs, Feature Requests and Comments

Please let us know what you think of Xense Profiler and how it can be improved by mailing to profiler_development@xense.co.uk.

Known Issues

none