Tuning Scenes for Combined Shading
April, 2002
The renderer now has the ability to shade several geometric primitives at the same time, provided that you follow a few simple rules when writing shaders. If you follow these rules, small objects such as leaves, grass blades, hairs, pebbles, etc. will shade up to 45 times faster (in the limiting case where the objects are far away, and are diced into a single micropolygon each). This can speed up the overall rendering times for complex scenes by up to a factor of 3. Even in moderate cases, where individual objects turn into 20 or 30 micropolygons each, your shaders can execute up to twice as fast.
So, how can you get this extra performance? The basic requirement is that all of the relevant objects must have the same shaders (including displacement, surface, atmosphere, and light shaders). Beyond that, the performance improvements should happen automatically, unless you have broken one of the rules listed below.
The rest of this document describes the rules you must follow. By far the most important of these is to get your parameter declarations right: if there is a quantity that has a different value for each object to be shaded, then it must be declared "varying" in the shader. There are also a number of other more subtle factors (such as attribute values) that can prevent you from getting this extra performance.
When deciding whether a shader parameter should be "uniform" or "varying", consider whether the value will be constant over the whole group of objects that should be shaded at once. Any parameter that has different values for different objects must be declared "varying" in the shader. For example, objects such as grass and leaves often have a "unique ID" parameter that is different for each leaf. In CheapGrass/newgrass.sl (in "tsveg") we have:
    surface newgrass(
        ...
        uniform float Season = 0;
        uniform float GrassID = 0;
        uniform float ColorNoise = 0;
    )
In this case "Season" is presumably constant over large swaths of grass, so it can be left alone. However, "GrassID" and "ColorNoise" typically have a different value for every grass blade. Their declarations need to be changed to "varying" in order for multi-gprim shading to have any effect:
    surface newgrass(
        ...
        uniform float Season = 0;
        varying float GrassID = 0;
        varying float ColorNoise = 0;
    )
There are a few points to note about this:
If you accidentally declare a parameter whose value differs between objects as "uniform" (such as "GrassID" in the example above), the renderer will be forced to process each object separately. This can slow down shading quite substantially.
Fortunately, the renderer itself can help you to locate the parameter declarations that are causing problems. If you look at the statistics file generated by the renderer, you will see something like this:
Grid merging statistics: (average size increase: 2.5x)
    39019 grids were combined with an existing grid
    26434 new grids were created, for the following reasons:
          628  grid would be too large
            6  different "object" coordinate systems
        22598  different gprim values for a uniform parameter (see below)
          649  different (but identical) shader instance lines
         1576  different "shader" coordinate systems
           82  different shader instance parameter values
          624  different shaders (including lights)
          271  first grid in bucket
    Detailed breakdown of rejections due to uniform parameters:
    (consider changing these parameters to "varying" in shader)
        22573  "GPrimTag_0" in "TreeElm/leaf"
            2  "GPrimTag_0" in "Neighborhood/NbdHouse/rafter"
            1  "GPrimTag_0" in "Neighborhood/NbdHouse/Gutter"
           12  "GPrimTag_0" in "Neighborhood/NbdHouse/Siding"
            5  "GPrimTag_0" in "Neighborhood/NbdHouse/porticoTrim"
            5  "GPrimTag_0" in "Neighborhood/NbdHouse/BaseConcrete"
The first count summarizes how many grids were merged together with another grid before shading, while the second count says how many grids could not be merged with an existing grid and were therefore shaded as new grids. The remaining lines give a breakdown of the reasons why these grids could not be merged.
In particular, notice that by far the largest culprit in this example is the entry with 22598 grids (about 85% of the total), which are due to "different gprim values for a uniform parameter". This means that there were many grids that could have been combined, except that they had different values for a uniform shader parameter. If you look further down in the statistics, there is a detailed listing of the actual shaders and parameters that caused problems. In this case, we see that almost all of the rejections are due to the parameter "GPrimTag_0" in the shader "TreeElm/leaf".
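The figures quoted here can be checked directly against the listing. The short Python sketch below hard-codes the counts from this example (the dictionary and variable names are illustrative only, not something the renderer produces) and confirms that the per-reason counts account for every new grid, and that the uniform-parameter entry makes up roughly 85% of them.

```python
# Counts copied from the example statistics; the names are illustrative only.
new_grid_reasons = {
    "grid would be too large": 628,
    'different "object" coordinate systems': 6,
    "different gprim values for a uniform parameter": 22598,
    "different (but identical) shader instance lines": 649,
    'different "shader" coordinate systems': 1576,
    "different shader instance parameter values": 82,
    "different shaders (including lights)": 624,
    "first grid in bucket": 271,
}
new_grids = 26434

# The per-reason counts should account for every new grid.
total_from_reasons = sum(new_grid_reasons.values())

# Share of new grids caused by mismatched uniform parameter values.
uniform_rejects = new_grid_reasons["different gprim values for a uniform parameter"]
uniform_share = uniform_rejects / new_grids

print(total_from_reasons == new_grids)
print(f"{uniform_share:.1%}")
```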
Examining the source code for this shader, we find:
    surface leaf(
        ...
        uniform float BushID = 0;
        uniform float GPrimTag_0 = 0;
        uniform float NumVariants = 1;
    )
By changing the parameter "GPrimTag_0" to be "varying", most of the 22598 grids will be merged together (as we will see below).
Note that if several parameters to a shader are causing problems, the statistics will report only the first one. Once that parameter is fixed, it will then report the next one that causes problems, and so on.
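Because only one offending parameter is reported at a time, it can be convenient to pull the detailed breakdown out of the statistics with a small script after each render. The following is a minimal sketch, assuming the breakdown appears one entry per line in the form shown in the example (a count, then the parameter and shader names in double quotes). The function name and the regular expression are hypothetical helpers, not part of the renderer.

```python
import re

# Assumed layout: one entry per line, a count followed by the parameter
# and shader names in double quotes (as in the example statistics).
ENTRY = re.compile(r'^\s*(\d+)\s+"([^"]+)"\s+in\s+"([^"]+)"\s*$')

def parse_uniform_rejections(lines):
    """Return (count, parameter, shader) tuples, largest count first."""
    entries = []
    for line in lines:
        match = ENTRY.match(line)
        if match:
            count, parameter, shader = match.groups()
            entries.append((int(count), parameter, shader))
    return sorted(entries, reverse=True)

sample = [
    '22573  "GPrimTag_0" in "TreeElm/leaf"',
    '2  "GPrimTag_0" in "Neighborhood/NbdHouse/rafter"',
    '12  "GPrimTag_0" in "Neighborhood/NbdHouse/Siding"',
]
worst = parse_uniform_rejections(sample)[0]
print(worst)
```

Lines that do not match the assumed pattern (such as the "grid would be too large" summary entries) are simply skipped.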
Sometimes changing a parameter from "uniform" to "varying" may not be simple. For example, in the "TreeElm/leaf" shader above we have:
    surface leaf(
        ...
        uniform float BushID = 0;
        uniform float GPrimTag_0 = 0;
        uniform float NumVariants = 1;
    )
    {
        ...
        /* Paint variants. */
        variant = format("%d", mod(BushID + GPrimTag_0, NumVariants));
        spotvariant = format("%d", abs(mod(11+BushID-GPrimTag_0, NumVariants)));
        ...
    }
Here the "format" shadeop is being used to construct part of a texture file name. Since the arguments to "format" must be uniform (and in general, since the shading language only supports uniform strings), we cannot simply change "GPrimTag_0" to be "varying".
Returning to the underlying problem, recall that the reason that the renderer cannot shade all the leaves at once is that "GPrimTag_0" takes on a large number of different values (in this case, a unique value for every leaf). On the other hand, suppose that "GPrimTag_0" had only two different possible values (0 and 1). In that case the renderer would be able to group together leaves with the same tag value (e.g. all those with value 0), and shade them in large groups.
Applying this principle to the example above, note that the number of paint variants is generally quite small (in this example there were 8 variants). Thus for the purposes of the "format" statement, there might as well be only 8 values for "GPrimTag_0". If we could limit the number of different values in this way, the renderer would be able to sort the leaves into 8 groups and shade each group separately (note that this happens automatically).
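The effect of limiting the number of distinct values can be illustrated with a toy model in Python. This is only a sketch of the grouping idea, under the assumption that objects can share a group exactly when their uniform tag matches. The real renderer also takes grid size, buckets, and the other factors in the statistics into account, and the leaf count used here is hypothetical.

```python
from collections import defaultdict

def group_by_uniform_value(tags):
    """Toy model: objects can share a group only if their uniform tag matches."""
    groups = defaultdict(list)
    for index, tag in enumerate(tags):
        groups[tag].append(index)
    return dict(groups)

leaf_ids = list(range(1000))   # hypothetical: one unique ID per leaf
num_variants = 8               # number of paint variants in the example

# A unique tag per leaf forces one group per leaf.
per_leaf = group_by_uniform_value(leaf_ids)

# Only 8 distinct tag values: the leaves fall into 8 large groups.
per_variant = group_by_uniform_value([i % num_variants for i in leaf_ids])

print(len(per_leaf), len(per_variant))
```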
The main problem with this idea is that the shader uses "GPrimTag_0" for other purposes as well (such as adjusting the leaf color), and in those situations we probably still want each leaf to have a distinct ID.
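So, the easiest solution is to split "GPrimTag_0" into two different tags: "GPrimTag_0", which remains the original per-leaf ID and becomes "varying", and a new "GPrimTag_1", which selects the paint variant (only 8 values) and stays "uniform".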
The resulting shader looks like this:
    surface leaf(
        ...
        uniform float BushID = 0;
        varying float GPrimTag_0 = 0;    /* original leaf ID */
        uniform float GPrimTag_1 = 0;    /* paint variant (8 values) */
        uniform float NumVariants = 1;
    )
    {
        ...
        /* Texture mapping space. */
        if (float cellnoise(BushID + GPrimTag_0) < .5)
            x = t;
        else
            x = 1-t;
        ...
        /* Paint variants. */
        variant = format("%d", mod(BushID + GPrimTag_1, NumVariants));
        spotvariant = format("%d", abs(mod(11+BushID-GPrimTag_1, NumVariants)));
        ...
        /* Adjust the color of the leaf surfaces and tint. */
        Cleaf *= color(
            mix(1-HueRange/2, 1+HueRange/2, float cellnoise(BushID, GPrimTag_0)),
            mix(1-SatRange/2, 1+SatRange/2, float cellnoise(3+BushID, GPrimTag_0)),
            mix(1-LumRange/2, 1+LumRange/2, float cellnoise(13+BushID, GPrimTag_0))
        );
        ...
    }
Notice that "GPrimTag_1" is used only for the paint variant lookup, while "GPrimTag_0" is used everywhere else. Of course, we must also modify the model or DSO that generates the RIB, in order to generate values for the "GPrimTag_1" parameter (which should equal "GPrimTag_0" mod 8). With these changes, we get the following rendering statistics:
Grid merging statistics: (average size increase: 17.4x)
    69106 grids were combined with an existing grid
     4212 new grids were created, for the following reasons:
         1024  grid would be too large
            6  different "object" coordinate systems
           47  different gprim values for a uniform parameter (see below)
          739  different (but identical) shader instance lines
         1425  different "shader" coordinate systems
           90  different shader instance parameter values
          615  different shaders (including lights)
          266  first grid in bucket
    Detailed breakdown of rejections due to uniform parameters:
    (consider changing these parameters to "varying" in shader)
           13  "GPrimTag_0" in "Neighborhood/NbdHouse/rafter"
           21  "GPrimTag_0" in "Neighborhood/NbdHouse/Siding"
            1  "GPrimTag_0" in "Neighborhood/NbdHouse/Gutter"
            5  "GPrimTag_0" in "Neighborhood/NbdHouse/porticoTrim"
            2  "GPrimTag_0" in "Neighborhood/NbdHouse/Stucco"
            5  "GPrimTag_0" in "Neighborhood/NbdHouse/BaseConcrete"
The number of grids shaded has gone down from 26434 to 4212, a factor of six improvement!
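The arithmetic behind this comparison is easy to reproduce. The Python sketch below uses the counts from the two statistics listings. The formula for the average size increase (total grids divided by the number of new grids) is an assumption on our part, although it does reproduce the 2.5x and 17.4x figures that the renderer printed.

```python
# Counts copied from the two statistics listings above.
before = {"combined": 39019, "new": 26434}
after = {"combined": 69106, "new": 4212}

def average_size_increase(stats):
    """Total grids divided by grids actually shaded (an assumed definition)."""
    return (stats["combined"] + stats["new"]) / stats["new"]

# How many times fewer grids are shaded after the fix.
reduction = before["new"] / after["new"]

print(f"{reduction:.1f}x fewer grids shaded")
print(f"{average_size_increase(before):.1f}x -> {average_size_increase(after):.1f}x")
```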
Note that this is not the only way to handle paint variants. For example, we could get by with just a single "varying" tag (the leaf ID) by handling more than one paint variant within the shader. This would involve looping over the 8 possible paint variants, and looking up the texture colors for the appropriate subset of points on each pass. Unlike the previous technique, this would require substantial modifications to the shader.
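For the two-tag approach, the change on the generator side is small. As a rough sketch (in Python rather than an actual DSO, with hypothetical function names and assuming non-negative integer leaf IDs), the second tag is just the leaf ID reduced modulo the number of variants. The check at the end confirms that the reduced tag selects the same paint variant as the full ID, which holds here because the modulus matches the shader's "NumVariants".

```python
NUM_VARIANTS = 8  # assumed to match the shader's NumVariants in this example

def leaf_tags(leaf_id, num_variants=NUM_VARIANTS):
    """Return (GPrimTag_0, GPrimTag_1) for one leaf."""
    return leaf_id, leaf_id % num_variants

def variant_index(bush_id, tag, num_variants=NUM_VARIANTS):
    """Mirror of the shader expression mod(BushID + tag, NumVariants)."""
    return (bush_id + tag) % num_variants

# The reduced tag selects the same paint variant as the full leaf ID.
same_variant = all(
    variant_index(bush, leaf_tags(leaf)[1]) == variant_index(bush, leaf)
    for bush in range(5)
    for leaf in range(100)
)
print(same_variant)
```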
Another important requirement for multi-gprim shading is to compile shaders using "smooth derivatives" (which happens by default). Otherwise, "du" and "dv" will be assumed to be uniform variables by the shader compiler, and this will prevent the renderer from combining grids whose micropolygons have different sizes. (In the absence of smooth derivatives, grids can be combined only if they have the same geometric "du" and "dv" values.)
Thus the "-ns" option of the shader compiler (which forces non-smooth derivatives) should be avoided. Similarly, shaders should not use the global variables "__gdu" or "__gdv". Very old shaders (that were compiled before smooth derivatives existed) should be recompiled. The renderer will print a warning at runtime if such a shader is used:
S99002 Shader "OldCrap" uses geometric "du" or "dv". (PERFORMANCE WARNING)
Any grids that could not be merged together for this reason will be listed as "different geometric du/dv values" in the statistics output.
Sometimes the renderer will not be able to shade gprims at the same time because they have different object space coordinate systems. The easiest way to avoid this problem is to ensure that all the gprims that you want to shade together (leaves, blades of grass, shingles, etc.) have the same coordinate system at the time they are declared. This implies that the model or DSO, rather than the renderer, is responsible for transforming the object coordinates.
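As an illustration of what transforming the coordinates in the model or DSO might involve, the following Python sketch bakes each placement into the control points so that every blade can be emitted in one shared coordinate system. The helper names and the blade geometry are hypothetical, and the sketch assumes the row-vector convention (p' = p * M, with the translation in the last matrix row).

```python
def transform_points(matrix, points):
    """Apply a 4x4 matrix to 3D points using the row-vector convention p' = p * M."""
    transformed = []
    for x, y, z in points:
        row = (x, y, z, 1.0)
        out = [sum(row[k] * matrix[k][col] for k in range(4)) for col in range(4)]
        transformed.append((out[0] / out[3], out[1] / out[3], out[2] / out[3]))
    return transformed

def translation(tx, ty, tz):
    """Translation matrix with the offset in the last row (row-vector convention)."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [tx, ty, tz, 1.0],
    ]

# Hypothetical blade geometry, modelled once around the origin.
blade = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 1.0, 0.0), (0.1, 1.0, 0.0)]

# Bake each placement into the points instead of emitting a per-blade transform.
placed = [transform_points(translation(x, 0.0, 0.0), blade) for x in (0.0, 2.0, 4.0)]
```

Each entry of `placed` could then be written out as a gprim under a single common transform.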
Even when gprims have different object spaces, it is often possible for the renderer to shade them together. For this to be true, however, the shader must be written such that every transformation involving object space has a "varying" result. This includes transformations that use string parameters or variables as the coordinate system names (since the renderer assumes that such variables may contain the string "object"). For example, the following transformations are fine:
    varying point objP = transform("object", P);
    varying normal Ns = ntransform("object", myMatrix, N);
    varying point ckPs = transform(ckcoords[ckIndex], P);
    uniform point Peye = transform("shader", E);
On the other hand, the following constructions cause problems:
    /* Avoid these if possible */
    uniform point Pobj = transform("object", E);
    uniform point Orig = point "object" (0,0,0);
    uniform point Q = transform(ckcoords[ckIndex], point (0,0,0));
    uniform vector r1 = vtransform(from, to, vector "current" (1,0,0));
    uniform matrix M = matrix "object" 1;
    float size = abs(determinant(1/(matrix refspace 1)*(matrix curspace 1)));
Essentially, the rule here is that the string "object" should be treated as though it were a varying parameter of the transformation operator. Thus even if all of the other arguments are uniform, a "varying" value must be allocated for the result:
    /* These are all okay */
    varying point Pobj = transform("object", E);
    varying point Orig = point "object" (0,0,0);
    varying point Q = transform(ckcoords[ckIndex], point (0,0,0));
    varying vector r1 = vtransform(from, to, vector "current" (1,0,0));
    varying matrix M = matrix "object" 1;
The renderer prints a warning when a shader violates these rules:
S99001 Shader "Ants/painted" requires uniform object space. (PERFORMANCE WARNING)
Similarly, the renderer cannot shade gprims that have different "shader" coordinate systems at the same time. This is true whether or not the shader actually refers to "shader" space.
If you need to have a different "shader" space for each gprim, consider using "object" space instead (and following the rules mentioned above).
In general, gprims that are bound to different shader instance lines in the RIB stream cannot be shaded at the same time. For example, the following gprims will be shaded separately:
    Displacement "lumpy" "Km" [0.2]
    Patch "bilinear" "P" [1 1 0 1 1 0 1 1 0 1 1 0]
    Displacement "lumpy" "Km" [0.5]
    Patch "bilinear" "P" [1 1 2 1 1 2 1 1 2 1 1 2]
In this situation, consider binding any parameters that have a different value for each gprim to the gprims themselves instead:
    Displacement "lumpy"
    Patch "bilinear" "P" [1 1 0 1 1 0 1 1 0 1 1 0] "Km" [0.2]
    Patch "bilinear" "P" [1 1 2 1 1 2 1 1 2 1 1 2] "Km" [0.5]
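If the RIB is produced by a script, this restructuring amounts to emitting the shader line once and attaching the per-gprim value to each patch. The following Python sketch shows one way that might look. The function names and patch coordinates are hypothetical, and it assumes that "Km" has already been declared to the renderer, as in the example above.

```python
def patch_line(points, km):
    """Format one bilinear patch with its own "Km" value attached."""
    coords = " ".join(f"{value:g}" for point in points for value in point)
    return f'Patch "bilinear" "P" [{coords}] "Km" [{km:g}]'

def emit_patches(patches):
    """Emit one shared Displacement line followed by all of the patches."""
    lines = ['Displacement "lumpy"']
    lines.extend(patch_line(points, km) for points, km in patches)
    return "\n".join(lines)

def square(z):
    """Hypothetical unit-square patch at depth z."""
    return [(0, 0, z), (1, 0, z), (0, 1, z), (1, 1, z)]

rib = emit_patches([(square(0), 0.2), (square(2), 0.5)])
print(rib)
```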
If a shader uses the "attribute" function, then the renderer can only combine gprims whose attribute values are the same. For example, suppose that the shader looks at the object name:
    string objname;
    attribute("identifier:name", objname);
In this case, the renderer will only be able to combine gprims that have the same object name. (On the other hand, if the shader does not examine the object name, the renderer can and will combine gprims with different names.)
Note that if a string variable is used for the attribute name, the renderer can combine gprims only if all of their attribute values are the same (including the object name, displacement bound, shading rate, sidedness, etc.). Therefore this practice should be avoided. For example:
    string objname;
    attribute(attribname, objname);    /* avoid this */
If a shader makes use of the "time" or "dtime" global variables, the renderer will combine gprims only if they have the same values for these variables. This is particularly important for multi-segment motion blur, where the same gprim is shaded at several different times. If the shader for this gprim does not refer to "time" or "dtime", then the renderer will be able to shade all of the motion segments at once.
Here is a brief explanation of some of the other reasons that the renderer may not be able to combine two gprims:
Finally, be careful when writing any shader code that attempts to extract a "uniform" property from varying data associated with the grid. This has always been a dangerous thing to do, but it is now even more so.
For example, consider the following shader code:
    /*
     * Assuming that Cs is in RGB space, convert to HSV space so
     * that we can modulate saturation and value without changing hue.
     */
    red = comp(Cs, 0);
    grn = comp(Cs, 1);
    blu = comp(Cs, 2);
    /* set val to largest rgb component, x to smallest */
    if (red >= grn && red >= blu) {
        /* red largest */
        val = red;
        if (grn > blu) {
            x = blu;
            spoke = "Rb";
        } else {
            x = grn;
            spoke = "Rg";
        }
    } else if (grn >= red && grn >= blu) {
        /* green largest */
        val = grn;
        if (red > blu) {
            x = blu;
            spoke = "Gb";
        } else {
            x = red;
            spoke = "Gr";
        }
    } else {
        /* blue largest */
        val = blu;
        if (grn > red) {
            x = red;
            spoke = "Br";
        } else {
            x = grn;
            spoke = "Bg";
        }
    }
This code contains a subtle bug: if the surface color "Cs" is not constant, it may contain colors that belong in different parts of the color wheel. However, the variable "spoke" is uniform (since it is a string value). This has the effect of an "ifever": all of the "Cs" values will be mistakenly lumped into a single spoke.
Note that if this shader is applied to constant-colored objects, it works fine. However, with multi-gprim shading it is possible that several objects with different colors will be shaded at once. It is important to keep this in mind when writing shaders. In the example above, the problem can be fixed by storing the spoke value in a varying float (with a predefined constant for each spoke).
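To make the per-point idea concrete, here is a small Python model of the classification (not shader code). It follows the same branch order as the shader fragment, but returns a numeric code for each color sample. The code values are hypothetical constants chosen for illustration. In the shader itself, the equivalent would be a varying float that receives one of these constants at each point.

```python
# Hypothetical numeric codes, one per spoke name used in the shader.
SPOKE_CODES = {"Rb": 0, "Rg": 1, "Gb": 2, "Gr": 3, "Br": 4, "Bg": 5}

def spoke_code(red, grn, blu):
    """Classify one color sample, following the branch order of the shader."""
    if red >= grn and red >= blu:
        name = "Rb" if grn > blu else "Rg"
    elif grn >= red and grn >= blu:
        name = "Gb" if red > blu else "Gr"
    else:
        name = "Br" if grn > red else "Bg"
    return SPOKE_CODES[name]

# Two points on the same (merged) grid can land on different spokes.
grid_colors = [(0.9, 0.4, 0.1), (0.1, 0.3, 0.8)]
codes = [spoke_code(*rgb) for rgb in grid_colors]
print(codes)
```

Because each point carries its own code, a grid that mixes reddish and bluish samples no longer collapses onto a single spoke.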
Here is another example:
    uniform float side;
    if ((dPdu ^ dPdv) . Ng > 0)
        side = 0;
    else
        side = 1;
In this case, the shader is attempting to determine which "side" of the surface we are shading. The problem is that with multi-gprim shading, the renderer might well be shading both sides of the surface at once! The solution is to convert the variable "side" to be varying.
The point of these examples is that in general, it is dangerous to try to extract any uniform property from a set of varying data. This has always been true, but with multi-gprim shading it is even more important. Fortunately, the shading language makes this rather hard to do: it is necessary to use "ifever" or some equivalent construct.
Finally, we emphasize that "ifever" itself is not dangerous. Most often "ifever" is used to avoid expensive calculations when their results are not needed (e.g. if the result will be multiplied by zero). These uses of "ifever" are completely safe and are not affected by multi-gprim shading.
Pixar Animation Studios
