Beyond SAH — Building Optimal BVHs

Published in

Traverse Research

11 min readDec 25, 2023

At Traverse Research we are always looking to improve ray-tracing performance. A large factor for optimizing this is building the right and best acceleration structure, usually a BVH nowadays. In the past year, we looked into which way we can built the best-performing BVH in the form of a master’s thesis. In this blogpost we will summarize some of our findings, but first, let’s briefly recap the most common method. The entirety of the research can be found here, which includes the collected raw-data for what we discuss here.

In order to follow along, we expect you to be familiar with the concept of BVHs and along with that know what a top-level acceleration structure (TLAS) and bottom-level acceleration structure (BLAS) is. Having some understanding of how BVHs and similar acceleration structures are constructed helps, but is not strictly required.

BVHs and SAH — A recap

A common method for constructing a high-quality BVH is using the Surface Area Heuristic. One of the earliest source for using this is from MacDonald & Booth in “Heuristics for Ray Tracing Using Space Subdivision”. The idea is based on two main principles:

We want to minimize the probability of intersection for BHV nodes that are expensive to test (i.e. contain many primitives)
The probability of intersection of a node is proportional to its surface area, under certain conditions.

From this, a cost function can be derived that helps find a BVH with minimum cost. As such, we have a Surface Area Heuristic (SAH) to minimize, which looks something like this:

The cost of an interior node in a BVH. Ct is the cost of traversing an interior node, i.e. of two ray-aabb intersections, Ci the cost of performing a ray-primitive intersection test, SAl/r the respective surface areas of the left and right child nodes, SAp the surface area of the parent node and Tl/r the respective number of primitves in the left and right child nodes.

Most sane people do this in a greedy manner¹ where already established costs will not be re-evaluated as we go down the hierarchy. This would require evaluating all potential combinations of BVH nodes, which for any useful mesh is infeasible. The minimum cost BVH is in practice rather a local than global minimum in practice.

If you want to read more on how to bring this into practice, a detailed approach to implementation can be followed from the excellent blog series by Jacco Bikker.

[1] Our linked research also looks into the not-so-sane side: computing the global minimum cost and the implications compared to greedy computations.

Limitations of SAH

Using SAH to optimize the construction of BVHs is widely used as it can boost trace performance significantly for the cost it adds during in construction. However, the relation between the surface area and probability of intersection exploited in SAH assumes the following:

Ray origins and directions are uniformly distributed.
Ray origins lie outside the scene’s bounding box.
Rays do not terminate inside the scene.

The first assumption is a crude simplification of how we perform ray tracing. Our camera or primary rays follow a perspective distribution with a single origin, diffuse materials generally produce cosine-distributed rays and effects like ambient-occlusion also limits the distributions and directions. Only shadow rays will arguably not follow such a standard pattern and can be considered to be uniformly random to some extent. This generally better approximations than SAH may exist based on the distribution one particular type of ray follows.

The second assumption is generally true in practice for a BLAS, but with the exception of primary rays, ray origins will generally lie inside of a TLAS rather than outside of it, so again, using SAH won’t necessarily give us the optimal BVH here.

Then finally, rays not terminating in the scene is rarely the case in practice, as it is would imply all our rays are multi-hit rather than first-or-any hit rays. This affects the intersection probability for obstructed geometry, as rays starting outside the scene are much less likely to hit them or cannot hit those at all.

Alternatives

Clearly, there is a potential for building even better BVHs than with SAH for the cost function. Let’s briefly discuss some alternatives that were researched over the years:

Scene-interior Ray Origin Metric (Fabianowski et al., 2009)

Remember how SAH assumes that the ray origins lie outside the scene’s bounding box? Fabianowski et al. changed heuristic with the assumption that ray origins always lie inside the scene bounding box. This heuristic’s definition is however more complex and the resulting integral needs to be approximated for our discrete scenes. The two approximations they suggest both split the scene around the BVH node and approximate the solid angle (i.e. the visible surface area) from a set of vantage points in the divided space, as illustrated below.

Calculate the (approximate) probability of intersection between a BVH node and a ray that originates from within the scene. V is the volume, N the BVH node’s AABB, S the scene AABB, and R1–26 are the 26 boxes/regions induced by the extended faces of N, and a the visible surface area of N from the center of Ri, calculated from the solid angle.

Visualizing some of the vantage points from which we calculate the visible surface area, i.e. the solid angle, of aabb onto the point x1–26.

This is a relatively simple solution that can theoretically eliminate one of the incorrect assumptions made by SAH for the majority of rays cast. The used form of measurement in the results is not ideal as it is in FPS, but the estimation is that ray-tracing performance can be increased by up to 7%.

Preferred Ray Distributions (Havran & Bittner, 1999)

Instead of assuming that all our rays are uniformly distributed in origin and direction, the different types of rays can be broken down and a BVH can be fitted to its (estimated) distribution. Havran & Bittner did exactly this for various ray distributions, such as for primary rays or with spherical distribution².

[2] This spherical distribution can be seen as the basis for the above mentioned scene-interior ray origin metric; the latter will generally be superior for any practical use-cases.

Let’s take some observations in mind when we think of what casting rays looks like if we have a perspective-projection camera that shoots rays into the scene:

Objects closer to the camera have a higher probability of being intersected than those far away, as the camera rays diverge.
Any geometry outside the frustum around the camera rays will never be intersected and therefore has an intersection probability of zero.

These observations are used to adjust SAH by assuming the probability of intersection is proportional to the surface area of the geometry or BVH node once projected on the camera plane. To get this, simply project the vertices of the AABB of the BVH node onto the camera plane using the perspective projection matrix. Clip off any of the surface area that lies outside of the frustrum (or camera plane when projected), and you have the probability of intersection for that BVH node.

The perspective projection of the BVH node onto the camera plane. Image from Havran & Bittner, 2009.

Using this for as cost function reduces the authors’ intersection tests and traversal steps by 29% and 16% respectively, though their scenes are rather dated and do not really match modern day working sets.

Ray Distribution Heuristic (Havran & Bittner, 2009)

Monte-Carlo is the bread and butter for the average graphics programmer that does anything ray-or-path-tracing related, so not why use it for building the BVH? After all, we are looking to minimize the probability of intersection. Assuming we know which rays will be cast and given the geometry, the probability of intersection can be calculated for a candidate BVH node by using the rays that are to be cast to determine how often it will be intersected. This forms the Ray Distribution Heuristic (RDH) from Bittner & Havran. The probability of intersection is then the ratio of rays passing through the respective child node relative to the parent node, similar to the surface area ratio for SAH.

The rays used for construction don’t need to be the exact rays we use later on during ray-tracing, as that would also lead to enormous construction time for the BVH. Subsampling patterns (e.g. subsampling the primary rays by emulating a lower resolution) or subsampling the rays over time can both still yield speed-ups, assuming the rays that will be cast do not drastically change.

Applying this heuristic to kd-tree construction yields estimated speed-ups of up to 15% for tracing primary rays, but adding diffuse, shadow and other rays showed no gains. It’s not unlikely this comes from an “overfitting” problem, as the distribution becomes more uniform as more distributions are combined and SAH uses amore accurate estimate of the combined ray distribution.

Occlusion Surface Area Heuristic (Vinkler et al., 2012)

Aside from use-cases like shadow rays and translucency, rays are generally not traced further than their first intersection, which diminishes, as illustrated, the probability of intersection for geometry behind the first hit. Vinkler et al. therefore modified the SAH-based BVH build to include a factor of occlusion: Occlusion Surface Area Heuristic (OSAH). Leaving numerous additionally formed conditions out, the core cost function basically boils down to estimating the intersection probability through a weighted blend between the ratio of occluded triangles for the node and its SAH cost.

The probability of intersection for a node in with OSAH. Based on some weight (w), we combine SAH (the right part) with the ratio of number of visibile triangles within a node (Nv)

Their reported performance increases require covering SBVHs and such, but the tracing performance increases by around 28% for their tested scenes.

A consequence of OSAH, similar to RDH, is that it requires tracing rays during the construction of the BVH, as rays are traced to estimate how occluded geometry is. This means we want to have a BVH to construct one; our personal approach here is to just build a BVH once using SAH which is then used for occlusion queries to build OSAH-based BVHs.

The Best Heuristic

So we know some alternatives to SAH for BVH construction, but what’s actually the best? One problem is that while SAH is almost always included in performance comparisons, there is none between the listed improvements. Each research tests different scenes, viewpoints and metrics, so it’s hard to compare them directly from the published works.

This led us to our research, which compares the above discussed heuristics in a single framework in 8 different scenes. Since performance characteristics can differ between different ray types, this is measured separately per ray type for primary rays, diffuse bounces, shadow rays and ambient occlusion. Note that while it would be interesting to distinguish between TLAS and BLAS performance, this is not something we yet researched.

Let’s move on to some results from testing this with a CPU path-tracer for a simple stanford dragon scene in which we move the camera around:

The Stanford Dragon rendered in Breda, 871,306 tris. This rendering is not the output of CPU-based path-tracer.

Ray-tracing performance per type of ray. The Interior-Scene Ray Origin Metric uses both the paper’s described approximations, a high quality and fast one. Preferred Ray Sets Specifically implements the correction for perspective projection of the camera. Combined is the average per ray for all rays cast in a frame; most frames contain less diffuse bounces than primary rays, for example.

Evidently, the other heuristics are not much better. Only OSAH seems to steadily outperform SAH for all ray distributions here here, despite the occlusion being relatively low for this scene. RDH was meant to be the ideal case: we measure exactly what the probability of intersection is for a given set of rays and then we trace those rays, again, with the RDH-built BVH. The problem is that we have a scene with only little interesting going on. Only part of the scene is covered and it seems that RDH simply has some bad frames as the performance does seem pretty similar for the majority of it.

We can make some sense of these performance observations: the interior-scene ray origin, for example, is one we do not expect to excel in scenes where rays originate from outside the scene bounding box, which is the case here for primary rays. Still, evenfor the other ray types it’s slower than SAH.

The results change if we move to an indoor scene, such as the Cathedral Sibenik:

The Sibenik Cathedral rendered in Breda, 75,283 tris. This rendering is not the output of the CPU-based path-tracer.

‘sRay-tracing performance per type of ray. The Interior-Scene Ray Origin Metric uses both the paper’s described approximations, a high quality and fast one. Preferred Ray Sets Specifically implements the correction for perspective projection of the camera. Combined is the average per ray for all rays cast in a frame; most frames contain less diffuse bounces than primary rays, for example.

This scene has significantly more occlusion and is rendered in our fly-through (primarily) from the inside and as a result we get slightly better performance with the Interior-Scene Ray Origin Metric. Especially RDH works best here and SAH becomes a bit more middle of the pack.

The perspective adjustment with Peferred Ray Sets does not even seem viable for primary rays in contrast, which is interesting given the seeming soundness of its definition. This is discussed in more detail in the linked thesis, but the problem lies the impact of breaking the assumption of SAH that rays start outside the scene has a far greater impact than for SAH itself.

Note that the other tested scenes are discussed in detail in the linked thesis.

Conclusion

The take-away here is that the best heuristic is quite scene-dependent. We found that higher occlusion factors generally lead to better performance with OSAH, that SAH is generally a good default but rarely yields the best ray-tracing performance, and that there can be correlation factors that are not yet well defined that correlate the performance to a particular heuristic.

With OSAH often being a superior alternative in the tested scenes to SAH, why not use this by default? One not-yet-discussed aspect is the performance of constructing the BVH. RDH, OSAH and the Preferred Ray Sets method all depend on the camera viewpoint or even the exact set of rays that will be cast. This means that changing the viewpoint or set of rays requires a rebuild of the BVH. For a TLAS this can be acceptable, but this is undesirable for BLASes.

Even then a single build has the performance problem in that the operations are far more expensive than simply calculating the surface area, too expensive to build every frame. There is some potential of reducing the input set for RDH and OSAH, but while keeping in mind a minimum in optimization efforts, building with SAH is still much faster.

Is there no future in these alternative heuristics at all? We actually do think there is one. The Interior-Scene Ray Origin Metric, for example, can give a speed-up up to nine percent over SAH in some scenes and is rarely slower in ray-tracing, especially for rays other than primary rays, which we do not cast in Breda. Evaluating the heuristic is more expensive, but we think much can be optimized and the fast approximation of the heuristic is of comparable quality already. We highly recommend reading the paper by Fabianowski et al. as it is a relatively simple heuristic to understand and integrate in your own BVH builder.

Another possible way to benefit from these heuristics might even be to use separate BVHs for different ray types, as primary rays are usually faster for SAH than the Interior-Scene Ray Origin Metric. Other combinations may very well be worth it to deliver the, hypothetically, fastest tracer. Ideally we would be able to configure what heuristic or configure for which use-case a BVH is built when specifying this to the respective APIs, so hopefully some day this will be available.

There are many more directions to go with this. Which ray distributions do we use for construction with RDH and does it make a difference? Can we re-use the BVH from previous frames for cost evaluations that are view-or-ray dependent? Can we combine the heuristics in different ways such that they, as combination, counter some of the assumptions made in SAH? What about heuristics specific to a certain type of ray distribution, such as shadow rays? Several of these are also explored in the thesis, so if this caught your interest, we definitely suggest you give it a read! Ultimately we hope to dive into the others as well and explore new methods for boosting ray-tracing performance!