Binned kd-tree Construction with SAH on the GPU

Danilewski, Piotr

Thesis

MPS-Authors

Danilewski, Piotr
International Max Planck Research School, MPI for Informatics, Max Planck Society;

Citation

Danilewski, P. (2009). Binned kd-tree Construction with SAH on the GPU. Master Thesis, Universität des Saarlandes, Saarbrücken.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0027-B6D9-D

Abstract

Our main goal is to create realistic-looking animation in real time. To that end, we are interested in fast ray tracing. Ray tracing recursively traces photon paths starting from the camera (backward) or from the light sources (forward). To find the first intersection between a given ray and the objects in the scene, we use acceleration structures such as kd-trees. Kd-trees are considered to perform best in the majority of cases; however, because of their long construction times they are often avoided for dynamic scenes. In this work we try to overcome this obstacle by building the kd-tree in parallel on the many cores of a GPU. Our algorithm builds the kd-tree in a top-down, breadth-first fashion, with many threads processing each node of the tree. For each node we test 31 uniformly distributed candidate split planes along each axis and use the Surface Area Heuristic (SAH) cost function to estimate the best one. To reach maximum performance, the kd-tree construction is divided into 4 stages. Each stage handles tree nodes with a different primitive count and differs in how counting is resolved and how work is distributed on the GPU. Our current program constructs kd-trees faster than other GPU implementations while maintaining quality competitive with serial CPU programs. Tests have shown that execution time scales well with the power of the GPU, and it will most likely continue to do so with future hardware releases.
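
The binned split selection described in the abstract can be illustrated with a small serial sketch. It is not the thesis implementation, which runs in parallel on the GPU and is not available here. It only shows how 31 uniformly spaced candidate planes on one axis can be scored with a standard SAH cost. The function names, the cost constants `c_trav` and `c_isect`, the representation of primitives as axis-aligned bounding boxes, and the start/end bin counting scheme are all assumptions made for illustration.

```python
from itertools import accumulate

NUM_PLANES = 31            # candidate planes per axis, as stated in the abstract
NUM_BINS = NUM_PLANES + 1  # 31 planes separate 32 uniform bins


def surface_area(lo, hi):
    """Surface area of the axis-aligned box spanned by corners lo and hi."""
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def bin_index(x, lo, hi):
    """Index of the uniform bin containing coordinate x, clamped to the node."""
    t = (x - lo) / (hi - lo)
    return min(NUM_BINS - 1, max(0, int(t * NUM_BINS)))


def best_split(node_lo, node_hi, prims, axis, c_trav=1.0, c_isect=1.0):
    """Return (cost, position) of the cheapest candidate plane on one axis.

    prims is a list of (lo, hi) bounding boxes. The node is assumed to have
    a non-zero extent along every axis. Counts are approximate for
    primitives whose bounds coincide exactly with a plane.
    """
    lo, hi = node_lo[axis], node_hi[axis]
    starts = [0] * NUM_BINS  # primitives whose lower bound falls in each bin
    ends = [0] * NUM_BINS    # primitives whose upper bound falls in each bin
    for p_lo, p_hi in prims:
        starts[bin_index(p_lo[axis], lo, hi)] += 1
        ends[bin_index(p_hi[axis], lo, hi)] += 1

    starts_prefix = list(accumulate(starts))
    ends_prefix = list(accumulate(ends))
    total = len(prims)
    node_area = surface_area(node_lo, node_hi)

    best = (float("inf"), None)
    for k in range(NUM_PLANES):
        # Plane k lies on the border between bin k and bin k + 1.
        pos = lo + (hi - lo) * (k + 1) / NUM_BINS
        n_left = starts_prefix[k]          # primitives starting left of the plane
        n_right = total - ends_prefix[k]   # primitives ending right of the plane
        left_hi = list(node_hi)
        left_hi[axis] = pos
        right_lo = list(node_lo)
        right_lo[axis] = pos
        cost = c_trav + c_isect * (
            surface_area(node_lo, left_hi) * n_left
            + surface_area(right_lo, node_hi) * n_right
        ) / node_area
        if cost < best[0]:
            best = (cost, pos)
    return best
```

A full builder would call `best_split` for each of the three axes, keep the cheapest plane, and compare its cost against the cost of leaving the node as a leaf (`c_isect * len(prims)` under the same assumed constants). A primitive that straddles a plane is counted on both sides, which is why the left and right counts come from separate prefix sums.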