Developer Guide: Path Tracing

Introduction

We codesigned Glowstick, our path tracing software, with Lightning, our path tracing hardware. This document describes how Glowstick and Lightning work, how we maximized performance, and how you can as well.

Lightning

Lightning offloads AS generation, traversal, and intersection testing from software applications.

Lightning Driver Interface

The Lightning Driver Interface is the low-level API that exposes acceleration structure creation and intersection testing on the Lightning Hardware Engine.

The driver interface is deliberately small. It consists of a handful of critical functions:

  • driver::initialize and driver::shutdown

  • driver::organize_model: This function receives scene data organized as meshes, IDs, and additional information describing the scene primitives, and prompts the Lightning Engine to build the acceleration structures.

  • driver::traverse: This function receives a batch of rays and asks the Lightning Engine to test them against the acceleration structures. The rays are modified in place: upon return, intersection information (hit, non-hit, world position of the hit, etc.) is available in the ray payloads.

Note: The driver itself is not available to developers; however, the backend (a plug-and-play DLL that works seamlessly with both Bolt Rays and the Glowstick API) is available in precompiled form.
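The call sequence described above can be sketched as follows. The real driver signatures are not public, so the stubbed declarations here (the Ray struct, the argument lists, and run_once) are illustrative assumptions only:

```cpp
#include <cstdint>
#include <vector>

// Hedged sketch of the driver call sequence; the real signatures are not
// public, so these stub declarations are illustrative only.
namespace driver {
struct Ray { float origin[3], direction[3]; bool hit = false; };

bool initialize() { return true; }                        // stub
void shutdown() {}                                        // stub
int  organize_model(const std::vector<float>& vertices,
                    const std::vector<uint32_t>& indices) {
    return vertices.empty() || indices.empty() ? -1 : 0;  // stub: build AS
}
void traverse(std::vector<Ray>& rays) {
    for (Ray& r : rays) r.hit = false;                    // stub: results in place
}
} // namespace driver

// Typical usage: initialize once, build the acceleration structures,
// then traverse batches of rays repeatedly.
int run_once() {
    if (!driver::initialize()) return 1;
    std::vector<float> verts = {0, 0, 0, 1, 0, 0, 0, 1, 0};
    std::vector<uint32_t> idx = {0, 1, 2};
    if (driver::organize_model(verts, idx) != 0) return 2;
    std::vector<driver::Ray> batch(16);
    driver::traverse(batch); // intersection info is now in the ray payloads
    driver::shutdown();
    return 0;
}
```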

Bolt Rays

Maximizing Performance: A Shift to Batch-Oriented Ray Tracing

The Lightning Ray Tracing hardware is engineered for high throughput, delivering its best performance when processing large, concurrent batches of rays. To unlock its full potential, we encourage a shift away from traditional, single-ray recursive techniques toward a modern, multithreaded, batch-oriented architecture.

By designing your application to continuously feed the hardware with large sets of rays while asynchronously processing the results of previous batches, you can achieve significant performance gains and build a more scalable rendering engine.

From Recursion to a Parallel, Queued Architecture

The core idea is to "unroll" or "flatten" the recursive process into an iterative one. Instead of function calls triggering deeper function calls, you will generate ray requests, add them to a queue, and process them in large, decoupled batches. This allows multiple threads to contribute to the work simultaneously without blocking.

We'll use a classic producer-consumer pattern with multiple queues to manage the flow of data between different stages of the rendering pipeline.

Key Components of the Architecture

  1. Ray Queues: Thread-safe queues that hold the data for rays awaiting processing. You will typically have at least two:

    • RaysQuery_Queue: Holds rays that need to be intersected with scene geometry.

    • Results_Queue: Holds intersection results that are ready for shading calculations.

  2. Ray Generation Threads (Producers):

    • These threads are responsible for creating the initial set of rays.

    • Typically, they generate primary camera rays for each pixel and push them into the RaysQuery_Queue.

  3. Hardware Dispatch Thread (Consumer/Producer):

    • This thread's main loop continuously pulls large batches of rays from the RaysQuery_Queue. The larger the batch, the better the hardware utilization.

    • It dispatches the batch to the driver/hardware for intersection testing.

    • Once the hardware completes the batch, the thread pushes the hit results (hit point, normal, material ID, etc.) into the Results_Queue.

  4. Shading Threads (Consumers/Producers):

    • These threads consume hit results from the Results_Queue.

    • They execute the shading logic (e.g. PBR) for a given intersection.

    • This is the crucial step for replacing recursion: If the shader needs to cast new rays (for reflections, refractions, or shadows), it does not make a recursive call. Instead, it generates the new ray data and pushes it back into the RaysQuery_Queue for the hardware to process in a future batch.
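The queueing machinery above can be sketched with a minimal thread-safe queue. RayQueue and pop_batch are illustrative names, not part of the Bolt Rays API:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <vector>

// Minimal thread-safe queue sketch (illustrative, not the Bolt Rays API).
template <typename T>
class RayQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_items.push(std::move(item));
        }
        m_cv.notify_one();
    }

    // Pop up to max_batch items at once; dispatching larger batches
    // improves hardware utilization.
    std::vector<T> pop_batch(std::size_t max_batch) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return !m_items.empty(); });
        std::vector<T> batch;
        while (!m_items.empty() && batch.size() < max_batch) {
            batch.push_back(std::move(m_items.front()));
            m_items.pop();
        }
        return batch;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<T> m_items;
};
```

The Hardware Dispatch Thread would call pop_batch on the RaysQuery_Queue in a loop, while Ray Generation and Shading Threads call push concurrently.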

Handling Deferred Processing and State

A key challenge in a decoupled, asynchronous system is: "If a reflection ray is processed in a future batch, how do I know which pixel to add its color to?"

To identify the ray, Lightning’s ray payload includes a 64-bit integer Pixel ID. The Pixel ID can store a pointer to additional data, or serve as an index into multi-dimensional data structures that hold state for different parts of your simulation.
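For example, a simple row-major encoding of a pixel coordinate into the 64-bit Pixel ID might look like this (encode_pixel_id and decode_pixel_id are hypothetical helpers, not part of Lightning; the encoding scheme is up to the application):

```cpp
#include <cstdint>

// Illustrative helpers for packing a pixel coordinate into the 64-bit
// m_iPixel_id field (hypothetical; the scheme is application-defined).
inline uint64_t encode_pixel_id(uint32_t x, uint32_t y, uint32_t width) {
    return static_cast<uint64_t>(y) * width + x;
}

inline void decode_pixel_id(uint64_t id, uint32_t width,
                            uint32_t& x, uint32_t& y) {
    x = static_cast<uint32_t>(id % width);
    y = static_cast<uint32_t>(id / width);
}
```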

Lightning’s built-in ray payload:

struct RayData
{
    Vec3f    m_fDirection;
    Vec3f    m_fOrigin;
    uint64_t m_iPixel_id;
    uint16_t m_iSample_id;
    Vec3f    m_fWorld_pos;
    float    m_fU;
    float    m_fV;
    bool     m_zHit, m_zSkip;
    Vec3f    m_fIntersection_normal;
    uint32_t m_iMesh_id;
    bool     m_zGet_ray_data_result;
    float    m_fDepth;
    uint32_t m_iTri_ID;
};

As with any other Ray Tracing system, the following fields are inputs:

  • m_iPixel_id

  • m_iSample_id

  • m_fOrigin

  • m_fDirection

The remaining fields are outputs written by the hardware, and it is good practice to fill them with zeros prior to a query.

Example:

  1. Ray Generation: A Ray Generation Thread creates a primary ray for pixel (x, y). It populates its payload: m_iPixel_id = y * width + x.

  2. Queue for Tracing: The thread pushes this RayData structure into the RaysQuery_Queue.

  3. Hardware Dispatch & Processing: The Hardware Dispatch Thread, at some point, dequeues a large batch of rays which includes our primary ray. It sends the entire batch to the hardware for intersection processing.

  4. Queue for Shading: Once the hardware returns the intersection results for the batch, the Dispatch Thread pairs each result with the original payload of the ray that produced it. The Ray Payload, which now contains the query results, is then pushed into the Results_Queue.

  5. Shading Begins: A Shading Thread dequeues our primary ray's result from the Results_Queue. It inspects the payload and performs whatever shading work is required at the hit point. Suppose a reflection is desired at this point.

  6. Secondary Ray Generation (The Loop): Instead of making a recursive call, the Shading Thread generates a new reflection ray. It copies the context into the new ray's payload:

    • m_iPixel_id remains y * width + x.

    • m_fOrigin ← result ray’s m_fWorld_pos

    • m_fDirection ← new, reflected direction

    • This new ray is then pushed back into the RaysQuery_Queue, and the cycle begins again.
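The steps above can be condensed into a flattened bounce loop. This sketch uses hypothetical stand-ins (TraceItem, shade, flattened_trace) rather than the Bolt Rays API; the point is that recursion is replaced by re-enqueueing:

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Illustrative flattened bounce loop; names are hypothetical.
struct TraceItem { uint64_t pixel_id; int depth; };

// Stand-in shader: spawns one reflection ray per hit, up to depth 2.
std::vector<TraceItem> shade(const TraceItem& hit) {
    std::vector<TraceItem> secondaries;
    if (hit.depth < 2)
        secondaries.push_back({hit.pixel_id, hit.depth + 1});
    return secondaries;
}

int flattened_trace(std::deque<TraceItem> work) {
    int processed = 0;
    while (!work.empty()) {
        TraceItem item = work.front(); // in practice: pop a whole batch
        work.pop_front();
        ++processed;
        for (const TraceItem& s : shade(item)) // no recursive call:
            work.push_back(s);                 // re-enqueue instead
    }
    return processed;
}
```

Each secondary ray carries the originating pixel_id, so its contribution can be accumulated to the correct pixel whenever its batch completes.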

Bolt Rays manual

Overview

This library provides a multithreaded framework designed to interface with the Lightning Engine, optimized for batch ray processing. It also offers a GPU backend. Bolt Rays abstracts away the complexities of thread management, data queuing, and dispatch.

The core design philosophy is to provide a "bring-your-own-logic" system. You define the behavior of ray generation and result processing by implementing a simple interface, and the library orchestrates the execution across a scalable pool of threads to maximize hardware utilization.

This manual will guide you through the architecture, core concepts, and API of the bolt::rays library.

Core Concepts

The library is built around a few key abstractions that work together to create a powerful data processing pipeline.

The Rays Class

This is the main entry point and control center for the entire system. It uses a builder pattern for configuration, allowing you to chain set_ calls to configure every aspect of the pipeline before starting it. You use this class to:

  • Define your custom thread logic.

  • Add scene geometry.

  • Select the execution mode and backend.

  • Run, stop, and manually step through the simulation.

ThreadConfig: The Blueprint for Your Logic

The ThreadConfig class is an abstract interface that represents a unit of work to be performed on a thread. The library requires you to define your application's logic by inheriting from specializations of this class. You must implement two key functions:

  • std::unique_ptr<ThreadConfig> clone() const: The library needs to create a unique instance of your logic for each thread it spawns. This function must return a deep copy of your configuration object.

  • bool execute(): This is the heart of your thread. The library calls this function in a loop. Your implementation contains the work the thread should do in one "chunk". Return true to keep the thread running or false to signal that it has completed its work and should terminate.
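A minimal sketch of the clone()/execute() contract, using a hypothetical CountdownConfig (the real interface lives in bolt_thread_config.hpp; this standalone version only illustrates the pattern):

```cpp
#include <memory>

// Simplified stand-in for the library's abstract interface.
struct ThreadConfig {
    virtual ~ThreadConfig() = default;
    virtual std::unique_ptr<ThreadConfig> clone() const = 0;
    virtual bool execute() = 0;
};

// Hypothetical config: runs for a fixed number of chunks, then stops.
struct CountdownConfig : ThreadConfig {
    int m_remaining = 3;

    std::unique_ptr<ThreadConfig> clone() const override {
        // Deep copy: each spawned thread gets its own independent state.
        return std::make_unique<CountdownConfig>(*this);
    }

    bool execute() override {
        // One "chunk" of work; return false once done so the thread exits.
        return --m_remaining > 0;
    }
};
```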

GenerateThreadConfig: The Ray Producers

This specialization of ThreadConfig is responsible for creating primary rays. Your implementation of this class will define the logic for generating the initial batches of rays that feed the processing pipeline (e.g., primary camera rays).

WorkerThreadConfig: The Ray Consumers

This specialization of ThreadConfig is responsible for processing the results of ray queries. The execute() method you implement will typically involve:

  1. Popping a completed ray's result from a queue.

  2. Performing shading or other calculations based on the result.

  3. If necessary, generating new secondary rays (e.g., for reflections, shadows) and adding them back into the system to be traced.

The Pipeline Data Flow

The library orchestrates the flow of data between your custom components:

  1. Generation: Your GenerateThreadConfig implementation produces a batch of RayData objects.

  2. Dispatch: The library takes these rays, sends them to the configured hardware or software backend for processing, and waits for the results.

  3. Consumption & Looping: The results are made available to your pool of WorkerThreadConfig instances. A worker thread pops a result, processes it, and can choose to inject new secondary rays back into the system for tracing. These new rays re-enter the pipeline at step 2, enabling recursive effects like multi-bounce reflections in a flattened, iterative manner.

Getting Started: A Step-by-Step Guide

Using the library involves three main steps: defining your logic, configuring the pipeline, and running it.

Step 1: Define Your Worker Logic

First, create a class that inherits from bolt::rays::WorkerThreadConfig. This class will define how you handle the results of a ray trace.

#include "bolt_thread_config.hpp"
#include <iostream>

// MyWorkerConfig defines the logic for processing a ray's intersection result.
class MyWorkerConfig : public bolt::rays::WorkerThreadConfig
{
public:
    MyWorkerConfig() {} // Add any necessary state here

    // The library calls clone() to create an instance of this worker for each thread.
    std::unique_ptr<bolt::rays::ThreadConfig> clone() const override
    {
        return std::make_unique<MyWorkerConfig>(*this);
    }

    // execute() is the main work loop for a worker thread.
    bool execute() override
    {
        bolt::common::RayData ray;

        // 1. Try to pop a completed ray result from the input queue.
        while (pop_item(ray))
        {
            // 2. Process the ray result.
            std::cout << "Worker processed ray with payload: " << ray.m_iPixel_id << std::endl;

            // 3. (Optional) Generate a secondary ray.
            if (should_reflect(ray))
            {
                bolt::common::RayData reflection_ray = create_reflection_ray(ray);
                // Push the new ray back into the system to be traced.
                add_secondary(reflection_ray);
            }
        }

        // If pop_item returns false, the input queue is empty for now.
        // Returning true keeps the worker alive to check again later.
        // Returning false would terminate the worker thread.
        return true;
    }

private:
    // Helper methods and member variables for worker state
    bool should_reflect(const bolt::common::RayData& ray) { /* ... */ return false; }
    bolt::common::RayData create_reflection_ray(const bolt::common::RayData& ray) { /* ... */ return ray; }
};

Step 2: Define Your Ray Generation Logic

Next, create a class that inherits from bolt::rays::GenerateThreadConfig. This class produces the initial set of rays.

#include "bolt_thread_config.hpp"

// MyGenerateConfig defines the logic for generating the initial rays.
class MyGenerateConfig : public bolt::rays::GenerateThreadConfig
{
public:
    MyGenerateConfig(int rays_to_generate) : m_total_rays(rays_to_generate) {}

    std::unique_ptr<bolt::rays::ThreadConfig> clone() const override
    {
        return std::make_unique<MyGenerateConfig>(*this);
    }

    // The library calls this execute() to get a new batch of rays.
    bool execute(std::vector<bolt::common::RayData>& items) override
    {
        // Generate a batch of rays and add them to the 'items' vector.
        for (int i = 0; i < m_total_rays; ++i)
        {
            bolt::common::RayData new_ray;
            // ... populate ray origin, direction, and payload ...
            new_ray.m_iPixel_id = i;
            items.push_back(new_ray);
        }
        return true; // We generated rays, continue running.
    }

private:
    int m_total_rays;
};

Step 3: Configure and Run the Rays Pipeline

Finally, in your main function, instantiate and configure the bolt::rays::Rays object.

#include "Bolt_rays.hpp"
#include "my_worker_config.hpp"
#include "my_generate_config.hpp"
#include <iostream>
#include <vector>

void on_iteration_complete(int iteration_num)
{
    std::cout << "---- Iteration " << iteration_num << " Finished ----" << std::endl;
}

int main()
{
    // 1. Instantiate your custom configurations.
    MyGenerateConfig generator(10000); // Generate 10,000 rays in total.

    // NOTE: we can also add different types of workers in the pool
    std::vector<bolt::rays::WorkerThreadConfig*> worker_configs;
    for (int i = 0; i < 8; ++i) // Create 8 worker threads
    {
        worker_configs.push_back(new MyWorkerConfig);
    }

    // 2. Configure the main Rays object using the builder pattern.
    bolt::rays::Rays pipeline;
    pipeline.set_generate_configuration(&generator)
            .set_worker_configurations(worker_configs)
            .set_run_mode(bolt::rays::RUN_ITERATIONS)
            .set_num_iterations(5)
            .set_iteration_callback(on_iteration_complete)
            .set_backend("./libbolt_lighting_backend.so");

    // 3. (Optional) Add geometry to the scene.
    //    (vertices, normals, and indices are assumed to be declared elsewhere.)
    int shape_id;
    pipeline.add_shape(vertices, normals, indices, shape_id);

    // 4. Run the pipeline. This call will block until the work is complete.
    bool success = pipeline.run();

    if (success)
    {
        std::cout << "Pipeline completed successfully." << std::endl;
    }
    else
    {
        std::cout << "Pipeline finished with errors or was stopped." << std::endl;
    }

    return 0;
}

API Reference

class bolt::rays::Rays

This is the main class for managing the ray tracing pipeline.

Configuration Methods

  • Rays& set_generate_configuration(GenerateThreadConfig* config) Sets the logic for the ray generation thread.

  • Rays& set_worker_configurations(std::vector<WorkerThreadConfig*> configs) Sets the worker logic. The library creates one thread for each pointer in the vector, cloning the configuration object for each.

  • Rays& set_run_mode(RUN_MODE mode) Sets the execution mode. See Section “Execution Modes” for details.

  • Rays& set_num_iterations(int iterations) Specifies the number of iterations to run when the mode is RUN_ITERATIONS.

  • Rays& set_backend(const char* backend) Selects the backend for ray tracing.

  • Rays& set_iteration_callback(std::function<void(int)> callback) Sets a callback function to be invoked after each iteration completes. A template overload for binding member functions is also available.

  • Rays& add_shape(...) Adds a triangle mesh to the scene. Returns the shape's id via an output parameter.

Execution Control

  • bool run() Starts the pipeline and blocks until completion. The behavior depends on the RUN_MODE. Returns true on successful completion. This step includes the creation of the acceleration structures for the geometries passed with add_shape.

  • void stop() Requests a graceful shutdown of the pipeline. This is intended to be called from a different thread than the one that called run().

  • void iterate() When in RUN_MANUAL mode, executes a single, complete iteration of the pipeline (generate -> dispatch -> process).

  • void wait_iteration_complete() Blocks the calling thread until the current iterate() call has finished processing.

class bolt::rays::ThreadConfig

The abstract base class for all thread logic.

  • virtual std::unique_ptr<ThreadConfig> clone() const = 0; Must be implemented to return a deep copy of the derived class.

  • virtual bool execute() = 0; The main logic loop for the thread.

class bolt::rays::WorkerThreadConfig

Inherits from ThreadConfig. Provides functions for interacting with the pipeline.

  • std::function<void(const bolt::common::RayData& ray)> add_secondary A function provided by the library to add a new secondary ray into the tracing queue.

  • std::function<bool(bolt::common::RayData& ray)> pop_item A function provided by the library to attempt to dequeue a completed ray result. Returns true and populates the ray parameter on success.

class bolt::rays::GenerateThreadConfig

Inherits from ThreadConfig. Provides an interface for generating ray batches.

  • virtual bool execute(std::vector<bolt::common::RayData>& items) = 0; The primary method you must implement. Fill the items vector with a new batch of rays to be traced. Return false when no more rays can be generated.

Execution Modes (RUN_MODE)

The library supports three distinct execution modes to suit different application needs.

  • RUN_ITERATIONS The pipeline runs for a fixed number of iterations, specified by set_num_iterations(). The run() call will block until all iterations are complete. This is ideal for offline rendering or benchmarks where the amount of work is known beforehand.

  • RUN_INFINITE The pipeline runs continuously until stop() is called from another thread. This is perfect for interactive applications, real-time renderers, or simulations that run indefinitely.

  • RUN_MANUAL The pipeline is controlled by the host application. The run() call initializes the system but does not start the main loop. You must manually drive the simulation by calling iterate() inside your own application loop. This provides the highest level of control, allowing you to synchronize the ray tracing work with other engine subsystems.

// Example of manual iteration loop
pipeline.set_run_mode(bolt::rays::RUN_MANUAL);
pipeline.run(); // Initializes threads, but doesn't start processing.

while (my_app_is_running)
{
    // Do other engine updates...
    physics_update();
    input_update();

    // Run one full iteration of the ray pipeline.
    pipeline.iterate();

    // This is a synchronization point. It waits for all the work
    // dispatched in iterate() to be finished.
    pipeline.wait_iteration_complete();

    // Now you can safely use the results (e.g., display a rendered image).
    display_frame();
}

pipeline.stop();

Glowstick API

While the flexibility of the Bolt Rays system allows you to build a wide range of custom simulations, from visual rendering applications like path tracing to physics or even audio rendering, Glowstick API is offered as a complete, feature-rich path tracing solution built on this very framework.

Engine Overview

Glowstick API is a state-of-the-art, physically based path tracing engine designed for high-end production rendering. Built from the ground up to handle scenes of great complexity and scale, Glowstick API integrates a suite of modern rendering technologies to deliver performance, flexibility, and artistic control.

A Foundation for High-Throughput Parallelism

Glowstick API's core architecture is engineered from the ground up to align with the principles of high-throughput, batch-oriented hardware. Moving beyond traditional recursive path tracing, the engine flattens the entire rendering process into a decoupled, multithreaded pipeline as discussed in the previous sections.

In this model, shading execution does not recursively call the ray tracer. Instead, when a shader requires a secondary ray (for reflections, refractions, or shadows), it generates a ray request and submits it to a global queue. A dedicated dispatch system consumes these requests in large, concurrent batches, ensuring the underlying hardware is continuously fed with work to maximize utilization.

Core Architecture

Glowstick API is founded on a robust, unidirectional path tracing architecture. To handle scenes with thousands of lights, Glowstick API implements Reservoir-based Spatiotemporal Importance Resampling (ReSTIR), ensuring efficient and high-quality sampling of direct illumination from numerous area lights with minimal overhead.

Scene Description & Shading

  • Universal Scene Description (USD) Native Pipeline: Glowstick API leverages USD as its primary scene description format, enabling it to directly ingest and render complex scenes composed in standard asset pipelines. This ensures high-fidelity data interchange and compatibility with established production workflows without requiring proprietary scene translation.

  • Open Shading Language (OSL): The shading system is powered by OSL. Glowstick API compiles and executes OSL shaders at runtime, providing unparalleled flexibility for material and shader development. The engine fully implements the standard OSL closures, allowing for the accurate rendering of sophisticated, layered BSDFs for physically accurate materials.

Output and Compositing Workflow

Glowstick API provides a comprehensive output system designed for a modern compositing pipeline.

  • Arbitrary Output Variables (AOVs): Users can render any number of AOVs simultaneously, including passes such as Normals, Depth (Z), Position, Alpha, and Shadow Catchers.

  • Light Path Expressions (LPEs): Glowstick API supports the LPE syntax defined by the OSL standard. This allows lighting artists and compositors to extract contributions from specific lights, surfaces, and scattering events into discrete AOVs, enabling adjustments in the final composite.

Performance and Optimization

  • Adaptive Sampling: To maximize efficiency, the engine uses a variance-driven adaptive sampling algorithm. It intelligently allocates more rendering samples to complex or noisy regions of the image (e.g., areas with soft shadows, motion blur, or depth of field) while reducing sample counts in cleaner areas, shortening render times.

  • Intel® Open Image Denoise Integration: Glowstick API incorporates the Intel denoiser for fast, AI-accelerated denoising. This can be applied to the final beauty render, providing clean images with significantly fewer samples.
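As a rough illustration of a variance-driven stopping criterion (not Glowstick's actual algorithm), a pixel might stop receiving samples once the variance of its accumulated luminance samples drops below a threshold:

```cpp
#include <cstddef>
#include <vector>

// Illustrative convergence test for adaptive sampling: keep sampling a
// pixel while it is below the sample floor or its variance is high.
// This is a sketch, not the engine's actual criterion.
bool needs_more_samples(const std::vector<float>& samples,
                        std::size_t min_samples, double threshold) {
    if (samples.size() < min_samples) return true; // always take a floor of samples
    double mean = 0.0;
    for (float s : samples) mean += s;
    mean /= samples.size();
    double var = 0.0;
    for (float s : samples) var += (s - mean) * (s - mean);
    var /= samples.size();
    return var > threshold; // noisy regions (soft shadows, DoF) stay active
}
```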

Lighting and Camera

  • Advanced Lighting: The engine supports a wide range of light types, including mesh-based area lights and image-based lighting via HDRIs, ensuring realistic illumination.

  • Physical Camera Model: The camera system simulates real-world lens effects. Artists can control depth of field by adjusting the camera's focal point and aperture.

  • Interactive Refinement: Glowstick API supports region rendering, allowing artists to interactively render a specific crop of the frame and even distribute different regions across multiple machines.

Details

The PtEngine Class

The PtEngine class is the main interface for controlling and executing the path tracing rendering pipeline. It provides functions to initialize camera rays, run the rendering procedure, compute viewports for parallel processing, and apply a denoising filter to the output images: initializeCameraRays(), run(), getViewports(), and denoise(), among other functions omitted here. The class manages and coordinates all aspects of rendering, from camera settings and viewport organization to denoising and output file handling, making it the central hub for path tracing in this application.

In addition to this class, two super-functions, LoadUSDFile and RenderUSDFile, are made available.

 

PtEngine::run(class OslContext* osl_context, const PtEngineOptions& options, std::string& output);

The PtEngine::run function drives the iterative rendering process that produces the final images. It initializes member variables from the provided options and creates a WorkerThreadPool to handle rendering tasks concurrently. If an HDRI image is specified, it is used to set up the environment; otherwise, the background color from the options is applied. Open Image Denoise (OIDN) is initialized for post-processing of noisy renderings.

The ray generation threads are started based on the specified backend path, and the main rendering loop runs until all required iterations complete or a termination condition is met. Rays are processed in parallel threads during each iteration, and the function waits for each iteration to finish before starting the next. Progress is tracked and printed periodically, giving developers insight into the rendering status.

Upon completion, the function waits for all threads to terminate and cleans up. The final rendered image is saved in the EXR high-dynamic-range format, preserving the scene's lighting and color information across a wide tonal range.

void PtEngine::initializeCameraRays(const bolt::common::Camera& p_oCamera);

The PtEngine::initializeCameraRays function kicks off primary ray generation by setting up a dedicated thread that produces rays from the provided camera configuration. This lets ray generation start early, so other components of the rendering engine can progress while the initial rays are being produced.
The function copies the input camera object p_oCamera into its internal camera configuration, creates an instance of the CameraRaysThread class, assigns the camera configuration to that thread, starts it, and triggers ray generation via a signal or event (m_generate).

 

void PtEngine::denoise();

The PtEngine::denoise() function processes a sequence of color files, denoising each one in turn. It creates a denoiser device using Open Image Denoise (OIDN), initializes the necessary input and output buffers, configures the denoiser filter with appropriate parameters, and executes it, measuring and logging both the overall denoising time and the filter execution time. The albedo and normal maps are read from disk and set as denoiser inputs, and the color map is processed in the same manner. The denoised output is saved to disk, and temporary files used during the process are cleaned up. The function verifies that all necessary input files exist before denoising, logging an error message if any file is missing.

 

void PtEngine::getViewports(class OslContext* osl_context, const PtEngineOptions& options, std::string& output, std::vector<ViewportParams>& sortedViewports);

The PtEngine::getViewports() function calculates and organizes smaller rendering blocks, or viewports, for efficient parallel processing. A grid of viewport blocks is created, with each block defined by its top-left corner coordinates and width/height. Adjustments are made to the last block's dimensions if it doesn't fit exactly into the block size. This function is essential for distributing rendering tasks across multiple cores or processors to optimize performance.
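The block layout described here can be sketched as follows. The ViewportParams field layout and the make_viewports helper are assumptions for illustration, not the engine's actual code:

```cpp
#include <vector>

// Illustrative viewport-grid computation matching the description above;
// the field layout and helper name are assumptions, not the engine's API.
struct ViewportParams { int x, y, width, height; };

std::vector<ViewportParams> make_viewports(int image_w, int image_h, int block) {
    std::vector<ViewportParams> blocks;
    for (int y = 0; y < image_h; y += block) {
        for (int x = 0; x < image_w; x += block) {
            // Shrink the last block in each row/column so it fits the image.
            int w = (x + block > image_w) ? image_w - x : block;
            int h = (y + block > image_h) ? image_h - y : block;
            blocks.push_back({x, y, w, h});
        }
    }
    return blocks;
}
```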

 

The PtEngineOptions Structure

struct PtEngineOptions

PtEngineOptions is a structure designed for passing configuration parameters to the PtEngine class during rendering operations. It allows full control over the path tracing setup by encapsulating various settings, such as camera properties, output options, denoising configurations, and other render-related parameters. By providing an instance of this structure when invoking the functions within the PtEngine class, users can tailor the rendering pipeline to their specific requirements and achieve optimal results for their path tracing tasks.
Parameters:

  • int m_maxBounces: The maximum number of bounces allowed for light rays before they are terminated. - Typically ranges from 1 to 20, where higher numbers can produce more realistic lighting but at the cost of increased computation time.

  • int m_numThreads: The number of threads to use for rendering. - Defaults to std::thread::hardware_concurrency(), which typically uses all available hardware threads; values should be less than or equal to std::thread::hardware_concurrency().

  • int m_iterations: The number of times the renderer will iterate over each pixel to achieve a stable image. - Typically ranges from 1 to thousands, with higher numbers resulting in more accurate but slower rendering.

  • std::string m_SceneFileName: The file path of the USD scene to be rendered. - A string representing the file path, e.g., "scene.usda".

  • std::string m_HdriFilename: The filename for a high dynamic range (HDR) environment map used as background lighting. - A string representing the file path, e.g., "env_map.hdr".

  • float m_HdriEV: Exposure value of the HDR image, indicating how bright or dark the scene should be rendered according to the HDR texture, e.g., 0.0f (neutral), -2.0f, 2.0f.

  • bool m_autoHdriEV: Whether to automatically adjust the exposure value of the HDR image based on its luminance.

  • Vec3f m_bgColor: The background color used if no HDR environment map is specified. A 3D vector, e.g., 1,1,1 (white), 0,0,0 (black).

  • float m_HdriRotation: The rotation angle of the HDR environment map around its vertical axis in degrees, e.g., 45.0f, -90.0f.

  • int m_HdriFlip: Whether to flip the HDR environment map horizontally.

  • bool m_debug: Enables debugging mode, which renders every iteration to a file. If disabled, only the final results are stored to files.

  • bool m_outputAOVs: Whether to enable the output of additional images called AOVs (Arbitrary Output Variables), including normal pass, depth pass, albedo pass and so forth.

  • bolt::common::Camera m_camera: The camera settings used for rendering, including position, orientation, and other parameters. - An instance of the bolt::common::Camera class.

  • backend* m_oBackend: Pointer to a backend plug-in (e.g., Embree or U50) used for ray tracing.

  • double m_adaptiveSamplingThreshold: The threshold value for adaptive sampling, where smaller values result in finer sampling around areas with high detail, e.g., 0.01, 0.001.

  • int m_adaptiveSamplingMinSamples: The minimum number of samples to use when adaptive sampling is enabled, e.g., if set to 10, the first 10 frames will be fully rendered without adaptive sampling, which will kick in at frame 11.

  • std::vector<uint32_t> m_devices: A vector of device IDs, specifying which Lighting devices should be used for ray tracing (The U50 backend must be selected for this to work).

  • bool m_alphaColor: If enabled, the scene is rendered with the HDRI (or whichever lighting setup is active), but the background is black, and an additional alpha mask is provided. This combination allows basic compositing.

  • std::vector<std::string> m_lpeNames: Names of light path expressions used in the rendering process. - A vector of strings representing LPE channel names.

  • std::vector<std::string> m_lpePatterns: Patterns defining the light path expressions to be used for rendering. - A vector of strings representing patterns (rules) defining the LPEs.

  • std::string m_backendPath: The path to the backend plug-in.

  • float m_distance: The distance from the camera to the farthest point in the scene. This is used for the normalization of the depth buffer.

  • std::string m_outputPath: The path where rendered images and other outputs will be saved.

  • float m_fEpsilon: A small epsilon value used for numerical stability in the calculation of secondary ray origins.

  • int m_denoise: Whether to enable denoising during the rendering process. Possible values are 0 (disabled), or 1 (enabled).

  • std::atomic_bool m_stopFlag: An atomic flag that can be used to stop the rendering process: setting it to true halts rendering.
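As an illustration of how m_adaptiveSamplingThreshold and m_adaptiveSamplingMinSamples interact, here is a minimal sketch. The function name and the choice of error metric are our own assumptions for illustration, not Glowstick's actual implementation:

```cpp
// Hypothetical sketch: decide whether a pixel still needs samples.
// For the first minSamples frames every pixel is rendered; afterwards,
// pixels whose per-pixel error estimate has dropped below the threshold
// are skipped.
bool needs_more_samples(double pixelError, // e.g., a variance estimate
                        int frameIndex,    // 0-based frame counter
                        double threshold,  // m_adaptiveSamplingThreshold
                        int minSamples)    // m_adaptiveSamplingMinSamples
{
    if (frameIndex < minSamples)
        return true;                // warm-up frames: always sample
    return pixelError > threshold;  // converged pixels are skipped
}
```

With m_adaptiveSamplingMinSamples set to 10, all pixels are sampled for the first 10 frames; from frame 11 onward, only pixels whose error still exceeds the threshold receive further samples.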

 

The super functions

bool LoadUSDFile(class OslContext* p_oOslContext, std::string p_cInput, bolt::common::RenderOptions& p_oOpts, backend* p_oBackend, const float& p_fLightsPowerMultiplier, uint32_t p_iDevID, uint32_t p_iTriPerMesh, uint32_t p_iStats);

The LoadUSDFile function is a versatile tool for loading USD (Universal Scene Description) files and processing their content. It primarily focuses on generating lights and loading geometric meshes into a rendering context. Its key features:

  1. Lighting Support: This includes Sphere Lights, Cylinder Lights, Rectangular Lights (RectLights), Disk Lights, and Dome/HDR Environment Maps.

  2. Mesh Handling: The function processes different types of primitives like pxr::UsdGeomMesh, pxr::UsdGeomPointInstancer, and instances while applying transformations from the USD stage data to these geometries.

  3. Transformation Processing: Extracts transformation matrices using pxr::GfMatrix4d for each geometry, ensuring proper positioning and orientation.

  4. Texture Path Management: Adds parent directories of the loaded USD file to texture and OSL paths if they are not already included in environment variables or specified text files.

  5. Scene Initialization: Sets up camera positions based on model dimensions and initializes a backend device (e.g., a GPU) while organizing the loaded models into an acceleration structure for efficient rendering.

  6. Rendering Context Integration: Loads scene data into bolt::common::RenderOptions, allowing configuration of various rendering parameters like lighting intensity, material properties, etc.

  7. Statistics and Debugging: Detailed logs are provided for debugging purposes, offering information about loaded primitives, lights, and mesh statistics using cinfo and cdebug.

  8. Conditional Execution: Allows the function to operate in a "load-only" mode, where it only loads the USD file without further processing or initialization steps.

This loader is designed with flexibility in mind, supporting various types of lighting and geometry from USD files while ensuring that all relevant resources (like textures) are properly handled.
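Feature 4 (texture path management) can be sketched as follows. This is an illustrative helper of our own, not the actual Glowstick code:

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical sketch: ensure the parent directory of the loaded USD file
// is present in a resource search-path list, without adding duplicates.
void add_usd_parent_to_search_paths(const std::string& usdFile,
                                    std::vector<std::string>& searchPaths)
{
    const std::string parent =
        std::filesystem::path(usdFile).parent_path().string();
    if (parent.empty())
        return; // the file has no parent directory component
    if (std::find(searchPaths.begin(), searchPaths.end(), parent) ==
        searchPaths.end())
        searchPaths.push_back(parent); // only add if not already present
}
```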

 

void RenderUSDFile(class OslContext* p_oOslContext, std::string p_cOutput, bolt::common::RenderOptions& p_oOpts, PtEngineOptions& p_oEngineOpts, backend* p_oBackend);

The RenderUSDFile function renders a Universal Scene Description (USD) file by managing the rendering process across multiple viewports. This function takes various parameters like OSL context, output path, render options, engine-specific options, and a backend device for rendering.

It starts by initializing timing and logging, configuring the output path based on the camera name, and setting up engine options using the backend device. It then generates viewports and iterates over each one: for every viewport it updates the camera settings, logs the viewport's coordinates, creates an engine instance, initializes the camera rays, runs the rendering process, applies denoising if enabled, and checks for early termination. Finally, it calculates the total rendering time and logs performance metrics.

In summary, the RenderUSDFile function is designed to provide a streamlined approach to rendering USD files by offering camera selection, viewport generation, rendering initialization, and denoising while ensuring detailed logging and performance metrics.
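The viewport generation step can be pictured as a simple tiling of the framebuffer. The following sketch uses our own names and tile policy (not RenderUSDFile's actual code) to produce the list of viewports that would then be rendered one by one:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical viewport: an axis-aligned region of the output image.
struct Viewport { int x, y, width, height; };

// Split an imageW x imageH framebuffer into tile-sized viewports,
// clamping the tiles on the right and bottom edges.
std::vector<Viewport> generate_viewports(int imageW, int imageH, int tile)
{
    std::vector<Viewport> out;
    for (int y = 0; y < imageH; y += tile)
        for (int x = 0; x < imageW; x += tile)
            out.push_back({x, y,
                           std::min(tile, imageW - x),
                           std::min(tile, imageH - y)});
    return out;
}
```

Rendering each viewport independently keeps per-pass memory bounded and makes it straightforward to log progress and check the stop flag between tiles.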

USD

Universal Scene Description (USD) is an open-source, highly interoperable framework and file format originally developed by Pixar Animation Studios and now evolved openly as OpenUSD under the stewardship of the Alliance for OpenUSD. It serves as a powerful tool for storing and managing 3D computer graphics data in a flexible and efficient manner.

At its core, USD provides a hierarchical data model that enables the organization of complex scenes into nested components, each with their own attributes, transformations, and dependencies on other elements within the scene. This hierarchical structure makes it easy to create, modify, and manage large and intricate 3D graphics in an intuitive and efficient manner.

One of USD's key features is its support for a wide range of assets, including geometry, materials, lights, and animation data. These assets can be integrated into scenes using the hierarchical structure, making it possible to create complex and visually rich graphics. Furthermore, USD's file-format plugin system allows data from other formats, such as Alembic, to be referenced directly, enabling the integration of existing workflows and assets into USD pipelines.

USD also offers strong support for collaboration, making it easy for teams to work on large and complex projects simultaneously. Its layering and composition system lets multiple team members contribute non-destructive overrides to the same scene; individual layers can be modified, muted, or swapped without altering the underlying source data.

In addition to these core features, USD also provides a powerful set of APIs and tools for extending its functionality beyond the basics. These APIs allow developers to build custom plug-ins, create new asset types, and even integrate USD into existing pipelines and workflows. This flexibility makes USD an invaluable tool for creating efficient and visually stunning graphics that cater to a wide range of applications within the field of computer graphics.

Glowstick API supports loading USD files. Here is a list of the most relevant features:

  • Lighting Support:

    • Glowstick API handles various types of lights such as Sphere Lights (pxr::UsdLuxSphereLight), Cylinder Lights (pxr::UsdLuxCylinderLight), Rectangular Lights (pxr::UsdLuxRectLight), Disk Lights (pxr::UsdLuxDiskLight, essentially flattened cylinder lights), and Dome/HDR Environment Maps (pxr::UsdLuxDomeLight for HDR environment maps).

    • It can load texture files associated with Dome/HDR Environment Maps.

  • Mesh Handling:

    • Glowstick API processes different types of primitives like pxr::UsdGeomMesh, pxr::UsdGeomPointInstancer, and instances, applying transformations from USD stage data to these geometries. Geometric normals and UVs are also supported.

  • Texture Path Management:

    • Glowstick API adds parent directories of the loaded USD file to texture and OSL paths if they are not already included in environment variables or specified text files (texture_paths.txt, osl_paths.txt). This ensures that all required resources, like textures, are found.

  • Scene Initialization:

    • Glowstick API sets up camera positions based on model dimensions for auto-generated scenes and initializes a backend device for ray tracing. The loaded models are organized into an acceleration structure for efficient rendering.

  • Materials:

    • The Glowstick API provides support for loading Universal Scene Description (USD) material interfaces and mapping them to Open Shading Language (OSL) shaders. The Glowstick API handles the mapping between the USD material interface and its corresponding OSL shader, streamlining the process for developers and ensuring consistency across different parts of the scene.

    • While displacement mapping isn't currently supported within this integration, support for normal maps is available. Normal maps can be used to simulate fine surface detail, adding depth and realism to textured surfaces without requiring additional geometry or computationally intensive calculations. This makes it possible to create highly detailed and visually rich graphics while maintaining efficient rendering performance.

    • In addition to the support for normal maps, this integration also allows for the use of various other texture types, such as diffuse, specular, and opacity textures, giving developers extensive control over the appearance of their materials. This flexibility makes it possible to create highly customizable and visually rich graphics that cater to a wide range of applications within the field of computer graphics.

  • Statistics and Debugging:

    • Detailed logs are provided for debugging purposes. The level of log details can be controlled with a global verbosity setting.

  • Conditional Execution:

    • The module can operate in a "load-only" mode, only loading the USD file without further processing or initialization steps. This is a useful feature that enables exploring a USD scene without rendering it.

OSL

Open Shading Language (OSL) is an open-source, extensible shading language designed for programmable, physically based shading in production rendering. Developed by Sony Pictures Imageworks and now an Academy Software Foundation project, OSL serves as a powerful tool for creating flexible, efficient, and scalable shading solutions for complex scenes.

At its core, OSL provides a rich set of built-in functions and constructs that allow developers to build custom shaders, materials, and effects easily. These functions cover various aspects of computer graphics, including lighting models, texture processing, mathematical operations, and more. By utilizing these built-ins, developers can create complex and visually rich graphical elements with ease.

One of OSL's key features is its closure-based design: instead of computing final colors, shaders return radiance closures that the renderer integrates, which keeps shaders renderer-independent and well suited to physically based light transport. Developers can also define their own shader functions and combine shaders into networks, further increasing the language's versatility and power. This flexibility makes it possible to create highly specialized materials tailored to specific requirements.

In addition to its rich library of functions, OSL provides the data types needed for shading work, including vectors, points, normals, colors, matrices, strings, arrays, and user-defined structs. OSL's strong typing system ensures that these types are used correctly and efficiently during shader execution.

OSL is not limited to surface shading alone; it also supports displacement, volume, and light shaders. This versatility makes OSL an invaluable tool for a diverse array of applications within computer graphics. By incorporating OSL into our path tracing engine, we can harness its power to create visually stunning and realistic images.

Glowstick API takes advantage of this versatility by implementing OSL's standard closures and accessory functions, making it possible to harness the power of procedural shading for visually rich, realistic results. Additionally, Glowstick API automatically resolves file paths for textures and shaders referenced in Universal Scene Description (USD) files, using custom path overrides that can be configured via user-defined configuration files. This ensures that all required resources are properly loaded and accessible during rendering.
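The path-resolution behavior described above can be sketched like this. The helper and its signature are our own invention for illustration, not the actual Glowstick implementation; the existence check is injected so the logic can be shown without touching the file system:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: resolve a texture or shader reference against a
// list of override/search directories; the first match wins.
std::string resolve_resource(
    const std::string& name,
    const std::vector<std::string>& searchDirs,
    const std::function<bool(const std::string&)>& exists)
{
    if (exists(name))          // absolute or already-valid relative path
        return name;
    for (const auto& dir : searchDirs)
    {
        const std::string candidate = dir + "/" + name;
        if (exists(candidate)) // first matching search directory wins
            return candidate;
    }
    return {};                 // not found: caller reports a missing asset
}
```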

Usage

Setting Glowstick API up requires a few basic steps.

Step 1: Selecting the ray tracing backend

This step involves loading a dynamic-link library (DLL) that contains the ray tracing backend functionality.

#include "logger.h"

bolt::common::ConsoleLogger consoleLogger;

// const char* causes errors in dllopen, hence the const_cast
void* l_oDLL = openDLLPath(const_cast<char*>(l_sBackendPath.c_str()));
if (!l_oDLL)
{
    consoleLogger.error("[SERVER] Failed to open DLL: %s\n",
                        const_cast<char*>(l_sBackendPath.c_str()));
    return EXIT_FAILURE;
}
else
{
    consoleLogger.info("[SERVER] Opened DLL: %s\n",
                       const_cast<char*>(l_sBackendPath.c_str()));
}

struct backend* l_oBackend = GetBackend();

After loading the DLL, call the function GetBackend() to obtain a pointer to the raytracing backend plug-in within the DLL.

Step 2: Initializing the OSL Context

The next step involves creating an instance of the OslContext class, which is a custom implementation of the OSL (Open Shading Language) context specifically designed for use with the Glowstick API.