Locating Documentum performance problems
When tuning a Documentum system, your first concern is locating which of several components or tiers is the problem. Once you have located the problem component, you then need to understand why it takes as long as it does to complete its work. This article deals with the first of these issues and concentrates on locating problems in a typical WDK-based application. The techniques involved are not limited to browser applications, however; they also apply to 'thick' clients such as Desktop Client and to applications running directly on the Content Server.
There are a number of different ways to locate the problem component. If your Documentum system is physically tiered (Client, App Server, Content Server and Database are on different machines), you could try monitoring resource usage (e.g. CPU or disk activity, using 'top' on Unix systems or Performance Monitor on Windows systems) before, during and after the long-running operation. If resource usage is particularly high on one of the boxes, that component is likely to be responsible for your slow performance.
Another option is to switch on DMCL tracing, level 10, on the application server. The procedure is outlined on the Documentum Developer site, so I won't repeat it in this article.
This type of trace is always useful if you know your problem lies on the Content Server or database. It is not always appreciated, however, that it also provides information that helps you decide where the performance problem is located in the first place.
To understand why, you need to understand the architecture of WDK-based applications (which is what Webtop, Web Publisher, DA and DAM are). An action performed in the browser, such as a button click or a URL request, becomes a request made to the application server running the WDK application. The application server executes a JSP page or servlet to build the output page, and in the course of doing so makes requests to the Content Server. These requests to the Content Server may in turn require database queries. So what we have is a single action in the browser causing a chain of requests down the application stack:
Diagram 1: Documentum Application Stack
Even if you are using WDK or DFC calls in your Application Server code, ultimately all calls to the Content Server will go through the Documentum Client Library (DMCL). The DMCL is code that runs on the Documentum client machine, in this case the application server, and makes Remote Procedure Calls (RPC) to the Content Server.
Turning on DMCL tracing on the Application Server causes the DMCL to output detail on each RPC to the Content Server. Included in the detail for each DMCL call is the time the call started and the call's duration. There is sufficient information here to allow us to 'profile' the problem action and split the duration of this action between time spent on the application server (and possibly the browser) and time spent on the Content Server and database. If we identify that the majority of the time taken to process the problem action is on the Content/Database server then the DMCL trace now allows us to drill down and investigate. If not then at least we need spend no more time messing around with database settings, indexes and updating statistics.
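The profiling arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a trace parser: the `RpcCall` record and the `profile_action` function are hypothetical names, and the start times and durations are assumed to have been extracted from the trace already. Because the durations are measured on the client side, the share attributed to the Content Server and database will presumably also include network time.

```python
from dataclasses import dataclass


@dataclass
class RpcCall:
    """One Content Server round trip taken from a DMCL trace (hypothetical shape)."""
    start: float     # seconds since the start of the trace
    duration: float  # seconds the call took, as seen by the client


def profile_action(action_start, action_end, calls):
    """Split the elapsed time of one user action between tiers.

    Time covered by RPC calls is attributed to the Content Server and
    database; the remainder is attributed to the application server
    (and possibly the browser).
    """
    elapsed = action_end - action_start
    # Clip every call to the action window and sort by start time.
    intervals = sorted(
        (max(c.start, action_start), min(c.start + c.duration, action_end))
        for c in calls
    )
    server_time = 0.0
    current_start = current_end = None
    for start, end in intervals:
        if end <= start:
            continue  # call lies entirely outside the action window
        if current_end is None or start > current_end:
            # Gap before this call: close the previous merged interval.
            if current_end is not None:
                server_time += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping calls are merged so they are not counted twice.
            current_end = max(current_end, end)
    if current_end is not None:
        server_time += current_end - current_start
    return {
        "elapsed": elapsed,
        "content_server_and_db": server_time,
        "app_server_and_browser": elapsed - server_time,
    }
```

If the `content_server_and_db` figure dominates, the trace is worth drilling into; if `app_server_and_browser` dominates, the problem lies further up the stack.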
It should be pointed out that there are a number of drawbacks to using DMCL trace in a WDK application. First, a typical WDK application call can generate hundreds or even thousands of DMCL calls, most of which are irrelevant for our purposes because they are satisfied internally in the DMCL client code and do not produce an RPC to the Content Server. The only way to deal with this is to parse the output and extract the necessary information. The Documentum developer site contains some unsupported trace utilities that turn the trace output into a format that can be loaded into Excel. My company (Xense) has a tool that provides additional filtering for DMCL trace files: it splits out multiple sessions and ignores internal DMCL calls. It also analyses and reports on the contents of the trace file, allowing you to quickly identify the cause of the performance problem; see Xense Profiler for more details.
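As a rough illustration of this kind of filtering, the sketch below keeps only the calls that produced an RPC and writes them out as CSV, which Excel can open. It is not the Documentum utility or the Xense tool. The record layout (dictionaries with `session`, `command`, `start`, `duration` and `rpc` keys) and the function name `rpc_calls_to_csv` are assumptions made for the example, and producing such records from a raw trace file is left to whatever parser you use.

```python
import csv


def rpc_calls_to_csv(records, out):
    """Write only the records that caused an RPC, ordered by session then start time.

    `records` is an iterable of dicts in the hypothetical layout described
    above; `out` is any writable text file object. Returns the number of
    rows written (excluding the header).
    """
    writer = csv.writer(out)
    writer.writerow(["session", "command", "start", "duration"])
    # Drop calls that were satisfied inside the DMCL client code.
    rpc_only = [r for r in records if r["rpc"]]
    for r in sorted(rpc_only, key=lambda r: (r["session"], r["start"])):
        writer.writerow([r["session"], r["command"], r["start"], r["duration"]])
    return len(rpc_only)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, for example `open("rpc_calls.csv", "w", newline="")`.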
Second, when you turn on DMCL tracing, tracing is enabled for all sessions running on that application server. You really need to do your tracing when the system (or at least the application server) is not being used by other people. That may mean performing this procedure during the evening or perhaps setting up a dedicated application server instance, possibly on the same machine but using a different port, to be used just for tracing.
Finally, there are some pitfalls in some of the conventional DMCL trace parsing utilities when they are used to parse trace data from multi-threaded applications. These are usually applications that run in an application server, for example WDK applications such as Webtop and Web Publisher.
The problem occurs when one DMCL thread is waiting for a response from the Content Server and a second thread starts an API call. The trace data lines for the two threads get interleaved and are not properly handled by the simple trace parsing tools.
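One way to avoid the interleaving problem is to track open calls per thread rather than assuming that each end line belongs to the most recent start line. The sketch below shows the idea. It assumes that every trace line can be attributed to a thread (or session) and has already been reduced to a tuple of `(thread, kind, command, timestamp)`. That tuple layout and the function name `pair_calls_by_thread` are hypothetical and are not taken from the DMCL trace format.

```python
def pair_calls_by_thread(events):
    """Match each 'start' event with the next 'end' event from the same thread.

    `events` is an iterable of (thread, kind, command, timestamp) tuples in
    trace order, where kind is "start" or "end". Returns a list of
    (thread, command, duration) tuples in the order the calls completed.
    """
    open_calls = {}  # thread id -> (command, start timestamp)
    completed = []
    for thread, kind, command, timestamp in events:
        if kind == "start":
            open_calls[thread] = (command, timestamp)
        elif kind == "end" and thread in open_calls:
            started_command, started_at = open_calls.pop(thread)
            completed.append((thread, started_command, timestamp - started_at))
    return completed


# Thread t2 starts and finishes a call while t1 is still waiting.
events = [
    ("t1", "start", "exec", 0.0),
    ("t2", "start", "get", 0.25),
    ("t2", "end", "get", 0.5),
    ("t1", "end", "exec", 2.0),
]
for thread, command, duration in pair_calls_by_thread(events):
    print(thread, command, duration)
```

A parser that pairs lines purely by position would attribute the `t2` end line to the `t1` call; keying the open calls by thread keeps the two durations separate.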
This paper outlined an approach to troubleshooting performance problems in a Documentum system:
1. Locate the problem component.
2. Profile the problem component to assess why operations are taking so long.
I then discussed approaches to locating the problem component, including an unusual use of DMCL tracing.
Robin East has been involved with delivering systems since 1988 and has been analysing, developing and deploying Documentum systems since 1999. He set up Xense, an independent Documentum research and consulting company in 2000. He can be contacted at email@example.com.