Multi-threaded RenderMan 
December, 2005


1 Introduction to Multi-threaded RenderMan

RenderMan now supports multi-threaded rendering: within a single prman invocation, multiple threads process the image. If multiple processing units are available, this can shorten the time to completion for a given set of input data. The primary advantage of multi-threaded rendering over multi-process rendering (the method employed by netrender and the -p:n modes) is memory footprint. When multi-processing, each process consumes nearly the same amount of memory per frame as a single process would, so total memory grows with the number of processes. With multi-threading, all threads together use nearly the same amount of memory as a single process.
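The memory argument above can be put into rough numbers. The following is a minimal sketch, not a measurement: the function names and the 4 GB figure are hypothetical, and "nearly the same" is treated as "exactly the same" for simplicity.

```python
def multiprocess_memory_gb(single_process_gb: float, n_processes: int) -> float:
    """Rough total when each process holds its own copy of the frame data."""
    return single_process_gb * n_processes


def multithread_memory_gb(single_process_gb: float, n_threads: int) -> float:
    """Rough total when all threads share one copy of the frame data.

    Assumption: per-thread overhead (for example, caches) is ignored here.
    """
    return single_process_gb


# Hypothetical frame that needs 4 GB when rendered by a single process.
print(multiprocess_memory_gb(4.0, 4))  # four processes
print(multithread_memory_gb(4.0, 4))   # four threads in one process
```

Under these assumptions, four processes need about four times the memory of a single process, while four threads need about the same as one.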

By default, prman determines the number of processing units available and operates in multi-threaded mode. A single license serves two processing units, and a default invocation consumes only one license, so it uses at most two processing units (this prevents a single render from consuming many licenses on machines with many processing units). The user can override the default behavior with the -t:n option, where n is the number of processors the user wants the renderer to utilize.
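As an illustration of the -t:n option, the sketch below assembles a prman command line from Python. The helper name build_prman_command and the file name scene.rib are hypothetical; only the -t:n form comes from the text above.

```python
from typing import List, Optional


def build_prman_command(rib_file: str, threads: Optional[int] = None) -> List[str]:
    """Return an argument list for prman.

    With threads=None the -t option is omitted, so prman falls back to
    its default behavior.
    """
    if threads is not None and threads < 1:
        raise ValueError("threads must be at least 1")
    command = ["prman"]
    if threads is not None:
        command.append(f"-t:{threads}")  # -t:n as described above
    command.append(rib_file)
    return command


# "scene.rib" is a placeholder file name.
print(" ".join(build_prman_command("scene.rib", threads=4)))
```

The resulting list can be handed to subprocess.run on a machine where prman is installed.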

The user can also change the default number of processing units with a setting in the rendermn.ini file:

/prman/nprocessors  1

This is useful if one wants to override the default behavior, which queries the system for the number of processors available. If the nprocessors setting is found, prman uses that setting instead of querying the system. Note that this setting also affects the default number of processors utilized when using the -p:n option.
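The lookup order described above (use the setting if present, otherwise query the system) can be sketched as follows. This is an illustration, not prman's own code: the function names are hypothetical, and the parser assumes a simplified rendermn.ini layout of one whitespace-separated key and value per line.

```python
import os
from typing import Optional


def parse_nprocessors(ini_text: str) -> Optional[int]:
    """Return the /prman/nprocessors value, or None if it is not set."""
    for line in ini_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "/prman/nprocessors":
            return int(fields[1])
    return None


def default_processor_count(ini_text: str) -> int:
    """Prefer the configured value; otherwise ask the operating system."""
    configured = parse_nprocessors(ini_text)
    if configured is not None:
        return configured
    return os.cpu_count() or 1  # os.cpu_count() may return None


print(default_processor_count("/prman/nprocessors  1\n"))
```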

2 Performance Expectations

As a rule of thumb, prman in multi-threaded mode should result in faster render times than prman in single-threaded mode; however, this can be highly scene dependent. Scenes that incur very little shading cost (for example, shadow maps) will not exhibit noticeable speedups when rendering multi-threaded. Fortunately, the scenes that take the longest to complete, those with expensive shading or visible-point shading, are the scenes that benefit most from multi-threading.

When using prman in multi-threaded mode, one has to reconcile real-time (also known as elapsed-time) statistics with user-time statistics. When multi-threading, user-time is the total amount of processor time utilized by all processors on the system. This will normally be more than the real-time because multiple processing units are active simultaneously.
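One way to reconcile the two statistics is to divide user-time by real-time, which gives a rough count of how many processors were busy on average. The sketch below uses made-up timings and hypothetical function names; it ignores system time and any time threads spend waiting, so it is only an approximation.

```python
def average_busy_processors(user_time_s: float, real_time_s: float) -> float:
    """Approximate number of processors kept busy over the render."""
    if real_time_s <= 0:
        raise ValueError("real_time_s must be positive")
    return user_time_s / real_time_s


def parallel_efficiency(user_time_s: float, real_time_s: float, n_threads: int) -> float:
    """Fraction of the requested threads that were effectively utilized."""
    return average_busy_processors(user_time_s, real_time_s) / n_threads


# Made-up example: 180 s of user-time over 100 s of real-time with -t:2.
print(average_busy_processors(180.0, 100.0))
print(parallel_efficiency(180.0, 100.0, 2))
```

A ratio close to the thread count suggests the scene scales well; a ratio close to 1 suggests most of the render was effectively single-threaded.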

In multi-threaded mode, prman should use much less memory per scene than in multi-processing mode. However, some of the rendering algorithms use more memory in multi-threaded mode than in single-threaded mode in order to maintain performance. One such system is texturing (both 2D and 3D): the texturing system creates a texture cache per processing unit, so it consumes slightly more memory than an invocation of prman that utilizes only a single processor. Likewise, ray tracing creates a geometry cache per processor and consumes slightly more memory in multi-threaded mode than in single-threaded mode.

If a scene employs shaders that use old-style RSL plugins, those plugins should be ported to the new format. Old-style RSL plugins cause the multi-threaded renderer to take a lock, so that only one old-style RSL plugin can execute at a time. This can significantly reduce the effectiveness of the multi-threaded renderer.
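The cost of that lock can be illustrated with a small simulation. This is a generic sketch of lock-based serialization in Python, not prman's implementation: the names are hypothetical, and time.sleep stands in for the work done inside a plugin call.

```python
import threading
import time
from typing import Tuple


def run_shading_threads(n_threads: int, duration_s: float = 0.01) -> Tuple[float, int]:
    """Run n_threads simulated plugin calls behind a single lock.

    Returns (elapsed wall-clock seconds, peak number of concurrent calls).
    """
    plugin_lock = threading.Lock()   # stand-in for the old-style plugin lock
    stats_lock = threading.Lock()    # protects the bookkeeping below
    state = {"active": 0, "peak": 0}

    def call_old_style_plugin() -> None:
        with plugin_lock:            # only one caller gets past this point
            with stats_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(duration_s)   # simulated plugin work
            with stats_lock:
                state["active"] -= 1

    threads = [threading.Thread(target=call_old_style_plugin) for _ in range(n_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start, state["peak"]


elapsed, peak = run_shading_threads(4)
print(f"peak concurrent plugin calls: {peak}, elapsed: {elapsed:.3f} s")
```

Because every call holds the lock for its full duration, the elapsed time grows with the number of threads instead of staying flat, which is the effect the paragraph above warns about.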

3 Performance Tuning

A few options can be used to control the performance of multi-threaded prman. The first option, of course, is -t:n, where the user can specify n processors to be used. If the system has more than two processors (the default is to utilize only two), a user can specify more processors (which will, in turn, use more licenses), and performance should increase with the number of processors utilized. NOTE: specifying more processors than are available on the system will most likely result in a slower render time.
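The two constraints above, license use and the number of processors actually present, can be checked before launching a render. The sketch below is an assumption-laden illustration: it assumes licenses are counted as one per two processors, rounded up (inferred from "a single license will serve two processing units"), and the function names are hypothetical.

```python
import math
import os
from typing import Optional


def estimated_licenses(n_processors: int) -> int:
    """Assumed license count: one license per two processors, rounded up."""
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    return math.ceil(n_processors / 2)


def clamp_thread_count(requested: int, available: Optional[int] = None) -> int:
    """Never request more threads than the machine has processors."""
    if available is None:
        available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, available))


n = clamp_thread_count(16, available=8)
print(n, estimated_licenses(n))
```

The clamped value is the one to pass as n in -t:n; the actual license accounting should be confirmed against the license server.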

Another option that can be used to improve the efficiency of multi-threaded prman is the bucket size. This can be controlled with the option:

Option "limits" "bucketsize" [32 32]

As the bucket size increases, the multi-threaded renderer will become more efficient. The downside of increasing the bucket size is that memory utilization will increase.

The ray tracer is tuned to be very effective when multi-threading. It will allocate by default a 60 Mbyte geometry cache per thread. This is controlled with the option:

Option "limits" "int geocachememory" [61440]

If a smaller cache per thread is required it can be specified with this option, but this will significantly impact the speed of ray tracing.
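Because the geometry cache is allocated per thread, its total size scales with the thread count. The sketch below works through the arithmetic, assuming the geocachememory value is in kilobytes (61440 / 1024 = 60, which matches the 60 Mbyte default stated above); the function name is hypothetical.

```python
def geocache_total_mb(n_threads: int, geocachememory_kb: int = 61440) -> float:
    """Total geometry cache across all threads, in megabytes.

    Assumption: geocachememory is expressed in kilobytes.
    """
    return n_threads * geocachememory_kb / 1024


for threads in (1, 2, 8):
    print(threads, geocache_total_mb(threads))
```

Under this assumption, the default setting costs about 60 Mbytes with one thread and about 480 Mbytes with eight, which is worth weighing before lowering the per-thread cache.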

Pixar Animation Studios
(510) 752-3000 (voice)  (510) 752-3151 (fax)
Copyright © 1996- Pixar. All rights reserved.
RenderMan® is a registered trademark of Pixar.