Performance Monitoring


Windows provides access to statistics related to system performance, such as CPU utilization, memory usage etc., through Windows performance counters. This chapter describes how these counters can be accessed from a Tcl application to monitor system performance.

1. Introduction

To measure is to know.

— Lord Kelvin

Performance counters allow the operating system and applications to provide interested parties with data regarding their utilization of various resources and other measures related to performance. Some examples are:

System resource utilization for CPU, memory etc.
Fault rates from adapters and other hardware
Number of pages served and response times for a web server
Query response times and disk latencies in a database server
Virtual machine statistics in a hypervisor

This information can be put to a variety of uses including capacity planning, performance analysis and troubleshooting.

Performance counters provide high level, “system state” type of information. A complementary facility, Event Tracing for Windows, which provides for more detailed, time-sequenced traces, is described in a separate chapter.

The twapi_pdh package, part of the TWAPI extension, is required for accessing Windows performance counters from Tcl.

% package require twapi_pdh
→ 4.4.1
% namespace path twapi

The calling process requires special privileges to access performance counters in real time. To read and write performance counters, it must run under an account that has local administrative privileges or belongs to the Performance Log Users group. Processes running under accounts that belong to the Performance Monitor Users group can read, but not write, counters.

2. Performance Objects and Counters

A performance object provides information related to performance within a system or application component. For example, the Memory object provides data related to use of memory on a system. A performance object may have multiple associated counters, identified by name, such as Available Bytes and Page Faults/sec in the case of the Memory object. Moreover, it may have multiple instances. The Memory object, by its global nature, has no instances. On the other hand, the Process object has multiple instances, one per running process in the system. The corresponding counters, such as % Processor Time, contain values for specific processes.

2.1. Enumerating counters and instances

The pdh_enumerate_objects command enumerates all the performance objects registered on a system.

% print_sorted [pdh_enumerate_objects]
→ .NET CLR Data
  .NET CLR Exceptions
  .NET CLR Interop
  .NET CLR Jit
  .NET CLR Loading
...Additional lines omitted...

The counters within a performance object, say Process, can be enumerated with pdh_enumerate_object_counters.

% print_sorted [pdh_enumerate_object_counters Process]
→ % Privileged Time
  % Processor Time
  % User Time
  Creating Process ID
  Elapsed Time
...Additional lines omitted...

In the case of performance objects that have multiple instances, the instance identifiers can be retrieved with pdh_enumerate_object_instances.

% print_sorted [pdh_enumerate_object_instances Process]
→ _Total
  AppleMobileDeviceService
  ApplicationFrameHost
  armsvc
  Calculator
...Additional lines omitted...

The Browse Counters dialog in the Windows performance monitor program perfmon can be used to discover what counters are available on a system. Alternatively, to get a text dump of all the performance counters available on the system, run typeperf -q from the command prompt.
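The enumeration commands described above can also be combined to search for counters by name. The following is a minimal sketch rather than a definitive implementation. The helper names match_names and find_counters are hypothetical and not part of TWAPI, and the part that calls TWAPI runs only on Windows with the twapi_pdh package installed.

```tcl
# Hypothetical helper: return the names matching a glob pattern,
# ignoring case, in sorted order.
proc match_names {names pattern} {
    return [lsort -dictionary \
                [lsearch -all -inline -nocase -glob $names $pattern]]
}

# Hypothetical helper: list Object\Counter pairs whose counter name
# matches the pattern. Objects that cannot be enumerated are skipped.
proc find_counters {pattern} {
    set found {}
    foreach object [twapi::pdh_enumerate_objects] {
        if {[catch {twapi::pdh_enumerate_object_counters $object} counters]} {
            continue
        }
        foreach counter [match_names $counters $pattern] {
            lappend found "$object\\$counter"
        }
    }
    return $found
}

# The search itself needs TWAPI, so it is only attempted on Windows.
if {$::tcl_platform(platform) eq "windows"} {
    package require twapi_pdh
    foreach name [find_counters "*Faults/sec"] {
        puts $name
    }
}
```

Matching is done on the names as returned by the enumeration commands, so on a non-English system the pattern would have to match the localized names.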

3. Performance Counter Paths

A performance counter path uniquely identifies a particular counter or set of counters. The general syntax of a counter path is given by

\\ComputerName\PerfObject(ParentInstance/ObjectInstance#InstanceIndex)\Counter

The / and \ characters are not interchangeable in counter paths. The / character can only be used to separate the parent instance from the object instance.

Not all components in the path are mandatory as noted in the table below.

Table 1. Performance counter path components
`ComputerName`	Name of the computer where the performance object resides (optional)
`PerfObject`	Name of the performance object that provides the counter, such as `Processor` or `SQL Server`
`ParentInstance`	Identifies the specific parent (e.g. process) containing the object (e.g. thread) of interest (optional)
`ObjectInstance`	Identifies the specific object containing the counter (optional)
`InstanceIndex`	Since ObjectInstance may not be unique (e.g. processes with the same name), InstanceIndex identifies a specific instance within that set (optional)
`Counter`	The name of the counter of interest
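
To make the role of each component concrete, the following sketch splits a counter path into the parts listed in the table. It is an illustration only: parse_counter_path is a hypothetical helper, not a TWAPI command, and it makes simplifying assumptions (object names contain no opening parenthesis, and instance names contain no closing parenthesis, / or # characters).

```tcl
# Hypothetical helper: split a counter path into its components.
# Components missing from the path are returned as empty strings.
proc parse_counter_path {path} {
    set re {^(?:\\\\([^\\]+))?\\([^\\(]+)(?:\(([^)]*)\))?\\(.+)$}
    if {![regexp $re $path -> computer object spec counter]} {
        error "Not a valid counter path: $path"
    }
    # The text in parentheses is ParentInstance/ObjectInstance#InstanceIndex
    # where the parent and index parts are optional.
    set parent ""
    set instance $spec
    set index ""
    regexp {^(?:([^/]*)/)?([^#]*)(?:#(\d+))?$} $spec -> parent instance index
    return [dict create computer $computer object $object parent $parent \
                instance $instance index $index counter $counter]
}

puts [parse_counter_path {\Thread(svchost/10#1)\Thread State}]
# → computer {} object Thread parent svchost instance 10 index 1 counter {Thread State}
```

Note that the / inside the parentheses separates parent from instance, whereas a / in the counter name itself, as in System Calls/sec, is simply part of the name.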

The following points about performance counter paths should be noted:

Performance object names, e.g. Process, and counter names, e.g. Handle Count, are localized and differ from system to system depending on localization settings. The pdh_counter_path command localizes names by default, assuming the passed names are in English. Use the -localized true option if the passed names are already localized.
When a performance object has several instances, the _Total and * instance names can be used. The former refers to a single counter that sums values across all instances. The latter refers to multiple counters, one for each instance.
Instance indexes, which are used when instance names are duplicated, are 0-based so index 0 refers to the first instance. Moreover, absence of an instance index is equivalent to an index of 0.

The instance index associated with a specific object, such as a process, can change. For example, if processes identified by svchost, svchost#1 and svchost#2 exist at some point in time, and then the first process exits, svchost#1 and svchost#2 will be renamed to svchost and svchost#1 respectively. This makes it difficult to track counter values over time as instances come and go. Additional bookkeeping (such as process ids for processes) is required to correctly track the instances.
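
The renaming behaviour can be illustrated with a small model of the naming rule. This sketch does not call PDH at all; indexed_instance_names is a hypothetical helper that merely applies the 0-based index rule to a list of instance names in order.

```tcl
# Hypothetical helper modeling the naming rule only.
# The first occurrence of a name has no suffix (index 0 is implied);
# later occurrences get #1, #2 and so on.
proc indexed_instance_names {names} {
    set seen [dict create]
    set result {}
    foreach name $names {
        if {[dict exists $seen $name]} {
            set n [dict get $seen $name]
            lappend result "$name#$n"
        } else {
            set n 0
            lappend result $name
        }
        dict set seen $name [expr {$n + 1}]
    }
    return $result
}

set running {svchost svchost explorer svchost}
puts [indexed_instance_names $running]
# → svchost svchost#1 explorer svchost#2

# After the first svchost exits, the remaining two are renumbered.
puts [indexed_instance_names [lrange $running 1 end]]
# → svchost explorer svchost#1
```

The identifier svchost#1 refers to a different process before and after the exit, which is why a stable key such as the process id has to be tracked alongside the instance name.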

3.1. Constructing counter paths

In Tcl, the TWAPI pdh_counter_path command can be used to construct counter paths.

In the simplest case, a performance object has only a single instance so the instance does not need to be specified. The following constructs a path for the counter System Calls/sec belonging to the System performance object.

% pdh_counter_path System "System Calls/sec"
→ \System\System Calls/sec

When there are multiple instances, use the -instance option to select the specific instance. For example, to get the CPU utilization for a specific processor,

% pdh_counter_path Processor "% Processor Time" -instance 0
→ \Processor(0)\% Processor Time

When there are multiple instances, the instance identifier _Total will retrieve the counter value measured across all instances.

% pdh_counter_path Processor "% Processor Time" -instance _Total
→ \Processor(_Total)\% Processor Time

If instead, you are interested in the counter value for each individual instance, you can specify * as the instance identifier.

% pdh_counter_path Processor "% Processor Time" -instance *
→ \Processor(*)\% Processor Time

Note the difference between * and _Total. When counter values are retrieved, the former will return multiple counter values, one per instance; the latter will return a single counter value, measured across all instances.

When there are multiple instances with the same instance identifier, you need to qualify which specific instance is of interest by passing the instance index. For example, there are usually multiple system processes svchost.exe running all of which will have the same instance identifier svchost. To get the handle count for the first of these, no instance index need be specified in the counter path.

% pdh_counter_path Process "Handle Count" -instance svchost
→ \Process(svchost)\Handle Count

However, to get the handle count for the second instance, an instance index of 1 must be specified (instance indices start at 0).

% pdh_counter_path Process "Handle Count" -instance svchost -instanceindex 1
→ \Process(svchost#1)\Handle Count

Finally, when the counter pertains to an object contained within another object, the parent or containing object must also be specified. For example, threads are identified through sequential numbers starting with 0 within a process. The following call will construct a counter path to retrieve the thread state for thread instance 10 in the second process instance identified as svchost.

% pdh_counter_path Thread "Thread State" -parent svchost -instance 10 \
    -instanceindex 1
→ \Thread(svchost/10#1)\Thread State

For process and thread performance data, the TWAPI get_process_info command offers a more convenient interface which does not suffer from the disambiguation issues associated with counter instance identifiers.

4. Reading counters

Reading of performance counters involves the following steps.

Open a new performance query with pdh_query_open
Add the counters of interest to the query using pdh_add_counter
Periodically retrieve counter values through pdh_query_get
When done, close the query with the combination of pdh_remove_counter (optional) and pdh_query_close
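
The steps above can be wrapped so that the query is always closed, even if reading the counters raises an error. The following is a sketch under stated assumptions: with_query is a hypothetical helper, not part of TWAPI, it relies on the try command available in Tcl 8.6 or later, and the usage part runs only on Windows with twapi_pdh installed.

```tcl
# Hypothetical helper: open a query, add the counters given as a
# dictionary mapping names to counter paths, run the script with the
# query handle stored in the caller's variable, and always close the
# query afterwards.
proc with_query {varname counters script} {
    upvar 1 $varname hquery
    set hquery [twapi::pdh_query_open]
    try {
        dict for {name path} $counters {
            twapi::pdh_add_counter $hquery $path -name $name
        }
        uplevel 1 $script
    } finally {
        twapi::pdh_query_close $hquery
    }
}

if {$::tcl_platform(platform) eq "windows"} {
    package require twapi_pdh
    set paths [dict create \
                   freebytes [twapi::pdh_counter_path Memory "Available Bytes"]]
    with_query hq $paths {
        puts [twapi::pdh_query_get $hq]
    }
}
```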

The examples in this chapter use performance counters that are part of the operating system so as to ensure they are present on every system. Many applications, like SQL Server or IIS, provide their own performance counters. These can be tracked in an identical manner.

4.1. Reading simple counters

To read counter values, a performance query has to be opened with the pdh_query_open command.

% set hquery [pdh_query_open]
→ pdh1

A counter can be added to the query by passing its counter path to the pdh_add_counter command. When adding a counter, we associate it with a name (e.g. freebytes) which serves as a key in the dictionary of values returned by pdh_query_get. This is optional and defaults to the counter path if unspecified.

% pdh_add_counter $hquery [pdh_counter_path Memory "Available Bytes"] -name \
    freebytes
→ 2013954865088 HANDLE

pdh_add_counter returns a handle to the new counter. We ignore it here as it is automatically released when the query is closed. There are specific uses where the handle is needed but these are not covered here. See the TWAPI documentation for details.

The counter values in the query can be retrieved with pdh_query_get.

% pdh_query_get $hquery
→ freebytes 8061165568

pdh_query_get returns a dictionary containing one or more counter values.

Additional counters can be added at any time.

% pdh_add_counter $hquery [pdh_counter_path System "Processes" -instance _Total] \
    -name processcount
→ 2013954865520 HANDLE
% pdh_query_get $hquery
→ freebytes 8061165568 processcount 295

A subset of the counters can be retrieved by specifying them as additional arguments to the pdh_query_get call.

% pdh_query_get $hquery freebytes
→ freebytes 8061169664

Counters can also be removed at any time from the query.

% pdh_remove_counter $hquery freebytes
% pdh_query_get $hquery
→ processcount 295

When finally done, the query is closed with pdh_query_close to release all resources including the counter handles.

% pdh_query_close $hquery

4.2. Reading complex counters

The example in the previous section only dealt with simple counters which had the following characteristics:

they were simple integers
they represented instantaneous snapshots of some system state
they were scalar quantities

This section describes more complex counters.

Counter types

Counters may be of three numeric types: long, large and double which refer to 32-bit integers, 64-bit integers and floating point values respectively. These types are specified using the -format option to pdh_add_counter and indicate the format in which the counter is retrieved. Note that this is independent of the internal format of how an application maintains the counter so you can retrieve a floating point counter as an integer and vice versa.

Rate based counters

Some counters are not based on snapshots of the system state, but rather are measured over an interval. An example is CPU utilization. Since at least two time separated values are required for a rate based measurement, such counters have additional considerations:

The counters must be first initialized by calling pdh_query_refresh before the first call to read the counters via pdh_query_get.
For sensible measurements, there should be a sufficient interval between consecutive calls that retrieve counter values from the system. These calls include pdh_query_get and pdh_query_refresh. (Microsoft recommends at least one second between calls.)

Applications must be prepared to handle transient errors when reading rate-based counters. For example, if the interval between consecutive counter reads is too short, the system cannot calculate the rate due to clock granularity limitations and will return an error. Similar effects can occur because of variable processor frequencies resulting from power saving operation.
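
One simple way to tolerate such transient errors is to keep the previously read values when a read fails. The sketch below shows the idea with a hypothetical helper, read_or_keep, which is not part of TWAPI. Whether to reuse the last value, skip the sample, or report the failure is an application decision; this is only one option.

```tcl
# Hypothetical helper: evaluate a script that reads counters in the
# caller's context. If it raises an error, return the previous value
# instead of failing.
proc read_or_keep {script previous} {
    if {[catch {uplevel 1 $script} result]} {
        return $previous
    }
    return $result
}

# Typical use, assuming hquery holds an open query handle:
#   set values [read_or_keep {twapi::pdh_query_get $hquery} $values]

# Demonstration with simulated reads in place of a real query.
set values {cpu 0.0}
set values [read_or_keep {error "simulated transient failure"} $values]
puts $values
# → cpu 0.0
set values [read_or_keep {dict create cpu 12.5} $values]
puts $values
# → cpu 12.5
```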

Array counters

A counter path may correspond to multiple counter values when the counter path includes the * “all instances” instance identifier. Such a counter must be added with the -array option to pdh_add_counter; without it, retrieval of the counter returns only one value. With the option, the counter value is returned as a nested dictionary whose keys are the instance names and whose values are the corresponding counter values.

All these aspects are illustrated in the following Tcl shell interactive session. Of course, a real program would probably read the counters at regular intervals via the Tcl event loop.

% set hquery [pdh_query_open]
→ pdh2
% pdh_add_counter $hquery [pdh_counter_path Processor "% Processor Time" -instance \
    *] -name cpu -array 1 -format double 
→ 2013954868976 HANDLE
% pdh_query_refresh $hquery 
% after 1000
% pdh_query_get $hquery 
→ cpu {0 5.909797592416199 1 1.2824105887645376 2 12.079646930618415 3 2.824872...
% after 1000
% pdh_query_get $hquery 
→ cpu {0 1.803317544054961 1 1.803317544054961 2 8.038027541257819 3 0.24464004...
% pdh_query_close $hquery

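The steps in the session are as follows:

The pdh_add_counter call requests CPU utilization as an array, one value per processor, formatted as a double.
The pdh_query_refresh call initializes the start value for the counter.
The first pdh_query_get call reads the rate calculated for the first interval.
The second pdh_query_get call reads the rate calculated for the second interval.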

Note the counter value is returned as a nested dictionary keyed by the counter name cpu and the processor number.
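
Since the result is an ordinary nested dictionary, it can be processed with the standard dict commands. The following sketch uses made-up sample values in the same shape as the output above. The summarize_instances helper is hypothetical. It also discards a _Total entry if one is present, which is an assumption, since the truncated output above does not show whether one is returned.

```tcl
# Sample data in the shape returned by pdh_query_get (values are made up).
set values {cpu {0 10.0 1 20.0 2 60.0 3 30.0}}

# Hypothetical helper: return the average value and the busiest
# instance from a non-empty dictionary mapping instance names to
# numeric values.
proc summarize_instances {perinstance} {
    # Ignore an aggregate entry if present (assumption, see text).
    set perinstance [dict remove $perinstance _Total]
    set total 0.0
    set busiest ""
    set max -1.0
    dict for {instance value} $perinstance {
        set total [expr {$total + $value}]
        if {$value > $max} {
            set max $value
            set busiest $instance
        }
    }
    set average [expr {$total / [dict size $perinstance]}]
    return [dict create average $average busiest $busiest max $max]
}

puts [summarize_instances [dict get $values cpu]]
# → average 30.0 busiest 2 max 60.0
```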

4.3. Reading system counters

The twapi_pdh package includes a convenience command, pdh_system_performance_query, that is a wrapper around pdh_query_open with several commonly used system counters predefined.

Without any arguments, the command returns a list of predefined counter names.

% pdh_system_performance_query
→ commit_limit committed_bytes committed_percent disk_bytes_rate disk_idle_perc...

Passed arguments result in a query that includes counters corresponding to those predefined names. This also internally initializes the counters so no pdh_query_refresh call is necessary.

% set hquery [pdh_system_performance_query committed_bytes processor_utilization]
→ pdh3
% after 1000
% pdh_query_get $hquery
→ committed_bytes 15162351616 processor_utilization 2.277331547463457
% pdh_query_close $hquery

pdh_system_performance_query can be used exactly as pdh_query_open so for example, additional counters can be added. An advantage of using pdh_system_performance_query is that in addition to saving the writer the trouble of finding the appropriate counter paths to add, it also hides some platform-specific differences.

5. Single Pager - Simple System Monitor

Our demo program, the Simple System Monitor, displays CPU, disk and memory usage data. It illustrates

use of the twapi_pdh package to retrieve performance information
Tk’s wm overrideredirect command to remove window borders and other decorations
binding commands to respond to mouse movements and button clicks
introspection using info level

As always we start with our preamble to load any required packages.

package require Tk
package require twapi

Define data structures

Next we define the array to hold our counters. As in all our one pagers, we define everything at a global level for ease of exposition. Note the array keys have been cunningly chosen to match the keys in the dictionary returned from pdh_query_get.

array set counters {
    processor_utilization 0.0
    disk_bytes_rate       0
    -physicalmemoryload   0
}

Updating the counters

We need a command to update these counter values at regular intervals. We will have it update the counters and then schedule itself to run again after a second.

proc refresh {pdhq} {
    array set ::counters [twapi::pdh_query_get $pdhq]
    array set ::counters [twapi::get_memory_info -physicalmemoryload] 
    reschedule 1000
}

For memory, we use a different TWAPI command, get_memory_info, instead of performance counters.

We could have used the Tcl after command directly but for illustrative purposes we define the reschedule procedure, which can be called from any procedure to rerun that procedure with the same arguments it was originally called with. The key to this procedure is the info level command, which returns the command and its associated arguments being invoked at any level of the call stack. So this will result in the caller being run again with the same arguments after the requested delay.

proc reschedule {millisecs} {
    after $millisecs [info level -1]
}
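
The behaviour of info level -1 can be seen in isolation with a short standalone example. This is not part of the monitor program, and the procedure names who_called and greet are made up for the illustration.

```tcl
# Returns the command, with its arguments, that invoked our caller.
proc who_called {} {
    return [info level -1]
}

proc greet {greeting name} {
    return [who_called]
}

puts [greet Hello World]
# → greet Hello World
```

Because the returned value is a complete command with its arguments, it can be passed directly to after as the script to run later, which is what reschedule does.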

Creating the user interface widgets

The user interface will be very simple. We define a label and a progress bar to display each counter value. Like many Tk widgets, a progress bar can be associated with a variable and updates its displayed state when the variable is modified. There is no need to explicitly update the progress bar when the corresponding counter changes value. The length of the progress bar will reflect the ratio between the variable’s value and the maximum value associated with the progress bar (100 by default).

ttk::label .l-cpu -text "CPU"
ttk::progressbar .p-cpu -variable counters(processor_utilization)
ttk::label .l-disk -text "Disk I/O"
ttk::progressbar .p-disk -variable counters(disk_bytes_rate) -maximum 1000000 
ttk::label .l-mem -text "Memory"
ttk::progressbar .p-mem -variable counters(-physicalmemoryload)

By default the maximum value of the bar is 100. For disk I/O, we set it to 1000000 since the disk rate is not a percentage.

We then lay the widgets out in table format using the Tk grid command.

grid .l-cpu  .p-cpu  -padx 2 -pady 2 -sticky w 
grid .l-disk .p-disk -padx 2 -pady 2 -sticky w
grid .l-mem  .p-mem  -padx 2 -pady 2 -sticky w

The -padx/-pady options create space between widgets and -sticky w forces the widgets to stick to the left (west) side of the table cell.

Saving screen space

Since this is a simple desktop widget, we do not want unnecessary screen space taken up with the title bar, borders and other decorations. The Tk wm overrideredirect command allows us to do this.

wm overrideredirect . 1

Once the title bar is removed, there is no way to close the window or to move it around the screen.

So we provide a pop-up menu and bind it to the right mouse button. It will contain a single entry Exit which does the obvious.

menu .popup -tearoff 0
.popup add command -label Exit -command {destroy .}
bind . <ButtonPress-3> { tk_popup .popup %X %Y }

We also provide a means to move the window around by clicking anywhere in it and dragging. To provide a visual cue to the user, we will have the cursor change shape when the left mouse button is pressed or released.

bind . <ButtonPress-1> {
    . configure -cursor size 
    set ::mouse_xoff [expr {%X - [winfo rootx .]}]
    set ::mouse_yoff [expr {%Y - [winfo rooty .]}]
}
bind . <ButtonRelease-1> {
    . configure -cursor "" 
}

bind . <B1-Motion> [list move_window %X %Y]

The following points about these bindings should be noted:

Most widgets have an option to set the cursor shape. Because the cursor is set in the left button press binding, it only changes shape while the button is pressed.
The offsets of the mouse within the window are saved so that the new position of the window can be calculated when the mouse is dragged.
The cursor is reset to the default when the button is released.
move_window is called when the mouse is moved with the first button pressed. %X and %Y provide the mouse coordinates.

The command to actually move the window is also simple. It calculates the new window position from the saved offsets and moves the window there. This command is called in response to the mouse being moved with Button 1 pressed.

proc move_window {mouse_x mouse_y} {
    set new_x [expr {$mouse_x - $::mouse_xoff}]
    set new_y [expr {$mouse_y - $::mouse_yoff}]
    wm geometry . "+$new_x+$new_y" 
}

The Tk wm geometry command is used to move the window to the new position.
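
The offset arithmetic can be checked with a worked example that does not need Tk. The coordinates are made up and window_position is a hypothetical helper that performs the same subtraction as move_window but returns the geometry string instead of applying it.

```tcl
# Hypothetical helper: compute the geometry string for the window's
# new top-left corner from the mouse position and the saved offsets.
proc window_position {mouse_x mouse_y xoff yoff} {
    set new_x [expr {$mouse_x - $xoff}]
    set new_y [expr {$mouse_y - $yoff}]
    return "+$new_x+$new_y"
}

# Suppose the window's top-left corner is at (200,150) and the button
# is pressed at screen position (230,160). The saved offsets are then
# 30 and 10. Dragging the mouse to (500,400) places the corner at:
puts [window_position 500 400 30 10]
# → +470+390
```

Subtracting the offsets keeps the same point of the window under the cursor throughout the drag, rather than snapping the window's corner to the mouse position.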

All that’s left to do is to initialize the performance query and kick off the refresh loop.

set pdhq [twapi::pdh_system_performance_query processor_utilization \
    disk_bytes_rate]
after 1000 [list refresh $pdhq]

Here is the complete program listing:

Program Listing - Simple System Monitor

# perfmon.tcl
package require Tk
package require twapi

array set counters {
    processor_utilization 0.0
    disk_bytes_rate       0
    -physicalmemoryload   0
}

proc refresh {pdhq} {
    array set ::counters [twapi::pdh_query_get $pdhq]
    array set ::counters [twapi::get_memory_info -physicalmemoryload]
    reschedule 1000
}

proc reschedule {millisecs} {
    after $millisecs [info level -1]
}

ttk::label .l-cpu -text "CPU"
ttk::progressbar .p-cpu -variable counters(processor_utilization)
ttk::label .l-disk -text "Disk I/O"
ttk::progressbar .p-disk -variable counters(disk_bytes_rate) -maximum 1000000
ttk::label .l-mem -text "Memory"
ttk::progressbar .p-mem -variable counters(-physicalmemoryload)

grid .l-cpu  .p-cpu  -padx 2 -pady 2 -sticky w
grid .l-disk .p-disk -padx 2 -pady 2 -sticky w
grid .l-mem  .p-mem  -padx 2 -pady 2 -sticky w

wm overrideredirect . 1

menu .popup -tearoff 0
.popup add command -label Exit -command {destroy .}
bind . <ButtonPress-3> { tk_popup .popup %X %Y }

bind . <ButtonPress-1> {
    . configure -cursor size
    set ::mouse_xoff [expr {%X - [winfo rootx .]}]
    set ::mouse_yoff [expr {%Y - [winfo rooty .]}]
}
bind . <ButtonRelease-1> {
    . configure -cursor ""
}

bind . <B1-Motion> [list move_window %X %Y]

proc move_window {mouse_x mouse_y} {
    set new_x [expr {$mouse_x - $::mouse_xoff}]
    set new_y [expr {$mouse_y - $::mouse_yoff}]
    wm geometry . "+$new_x+$new_y"
}

set pdhq [twapi::pdh_system_performance_query processor_utilization \
    disk_bytes_rate]
after 1000 [list refresh $pdhq]

6. References

SDKPDH: Performance Counters Reference, Windows SDK documentation, http://msdn.microsoft.com/en-us/library/windows/desktop/aa373083(v=vs.85).aspx.