Zeus

Introduction


Zeus is a next-generation graphics processor with a unique design that combines scalar (CPU) cores, vector (GPU) cores, and other specialized processors. In designing Zeus, we took a first principles approach, examining performance, efficiency, and scalability limitations of legacy GPU architectures. The result is Zeus: a GPU that outperforms legacy GPUs in key workloads, eliminates scaling issues, and drives down power consumption significantly.

Zeus 1c26-032 PCIe Card.png

Zeus brings a host of firsts to modern consumer-grade GPUs:

  • Native high-speed Ethernet (up to 800 Gbps interfaces)

  • Expandable memory using industry-standard SO-DIMMs and DIMMs

  • Ability to run Linux directly on the GPU

  • Out of Band management & telemetry using Redfish

The video below introduces Zeus:

Performance Improvements

Zeus is optimized for maximum performance in key workloads:

  • Content Creation: Film, TV, Broadcast/Transcoding, Advertising, Architecture Visualization (ArchViz), Game Development, XR

  • Content Consumption: Gaming, Video Transcoding

  • HPC (High Performance Computing): Compute workloads, Physics Simulations

Content Creation

3D artists and designers depend on high-quality renderings of their work. Path tracing is the best rendering method, producing photorealistic results.


For a visual introduction to path tracing, we recommend this video: https://www.youtube.com/watch?v=frLwRLS_ZR0

Historically, path tracing has been too slow to be used outside specific scenarios where the user can wait minutes for a low-resolution render, hours for a single 4K frame to render, or months for a film to render:

All this physical realism came with a high price tag: big budget productions in media and entertainment were spending vast resources on hardware, storage and computation as the bar of photorealism was raised ever higher. Budgets and artist time expanded for larger and larger projects. Rasterized pipelines started to fall by the wayside as the increased realism and quality of ray traced projects became the norm. However, the rendering power of a standard artist's workstation was not increasing at the exponential rate of the complexity of projects.

(from https://bolt.graphics/history-of-rendering-rasterization-ray-tracing-path-tracing/)

This performance limitation has forced developers and artists to create complex workarounds to try to get similar visual output without performing the computations required for the full path traced image:

In spite of these issues, artists continue to use ray tracing to solve complex lighting and realism issues. Without ray tracing, the artist has to spend a lot of time faking lighting effects like shadows and reflections. This has led to the quick adoption of ray tracing in other industries that require a higher level of realism: product advertising, architecture, and of course gaming. As these (and other) industries adopt ray tracing, artists and consumers find it hard to go back to legacy rasterization.

(from https://bolt.graphics/ray-tracing-grows-across-industries-despite-hardware-issues/)

To solve this widespread industry problem, we are building Zeus to bring real-time path tracing to every artist, designer, and engineer.

10x Path Tracing

Zeus improves path tracing performance by up to 10x vs the Nvidia RTX 5090. This is possible because the GPU is designed for path tracing from the ground up, instead of shoehorning path tracing into a legacy rasterizer architecture. A render that takes 30 minutes on an Nvidia GPU takes less than 15 minutes on Zeus 1c26-032, our entry-level GPU, which consumes only 120 watts.

Bolt Early Access Program M&E.png

Performance is measured by timing how long each GPU takes to render scenes of varying complexity. The Nvidia GPUs are benchmarked using Blender Cycles & OptiX, while the AMD GPUs use Cycles & HIP. Bolt Zeus is benchmarked using Glowstick.

The scenes rendered vary from ~100,000 triangles and a few textures to >4 million triangles and hundreds of textures. The tested scenes are always in OpenUSD format, with MaterialX materials.

Glowstick

In order to test and improve the effectiveness of our path tracing hardware, we needed to build a testbench renderer that produces path-traced output. Initially, we used it internally to produce renders to compare against existing renderers such as Arnold, Iray, and Cycles. We then decided to productize this renderer and named it Glowstick.

Glowstick is our in-development path tracer that is co-designed with Zeus hardware. It will be included at no cost with Zeus GPUs. We’ll talk more about DCC integrations and asset/material libraries below.

Bolt Early Access Program M&E (2).png

We chose to use open standards whenever possible. OpenUSD is becoming widely adopted by artists and developers across many studios and content creation tools as the industry moves away from proprietary and sometimes incompatible asset formats.

Bolt Early Access Program M&E (1).png

We are integrating Glowstick into many of your favorite content creation tools. The way Glowstick is integrated into those tools varies depending on what interfaces the tool provides.

Houdini, Maya, 3ds Max, and Nuke provide support for Hydra delegates. Glowstick is already integrated inside usdview as a Hydra delegate, and we are working on porting that to the tools listed above.

SketchUp, Unreal Engine, Blender, Revit, and many other content creation tools use tool-specific plugin interfaces to integrate other renderers. Glowstick will be available as installable plugins for those tools.

Materials & Textures

Physically-based materials are key to path tracing. Materials define how light interacts with the scene: whether the object is a highly polished metal, an emissive light, or a translucent glass. Materials can be scanned or generated procedurally. Scanned materials are usually high resolution (4K, 8K, and higher).

Bolt Zeus Introduction.png

Scenes with hundreds of physically-based materials require lots of memory. Legacy GPUs have slowly increased the amount of soldered memory over time. CPUs have always provided the flexibility to install hundreds of GBs and even TBs of memory.
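To make the memory pressure concrete, here is a rough back-of-the-envelope calculation (illustrative numbers only, not a benchmark):

#include <cstdint>
#include <iostream>

int main() {
    // Rough texture-memory arithmetic for an 8K scanned material (illustrative
    // assumption, not a Bolt measurement): 8192 x 8192 texels, RGBA, 8 bits per channel.
    const std::uint64_t texels = 8192ull * 8192ull;
    const std::uint64_t bytes  = texels * 4;                 // uncompressed RGBA8
    const double mib = bytes / (1024.0 * 1024.0);            // = 256 MiB per map

    // A physically-based material typically uses several maps (albedo, normal, roughness, ...).
    const int maps_per_material = 4;
    const int materials         = 200;
    const double total_gib = mib * maps_per_material * materials / 1024.0;

    std::cout << mib << " MiB per 8K map, ~" << total_gib
              << " GiB for " << materials << " materials\n"; // ~200 GiB, far above 32 GB
    return 0;
}

Texture compression and streaming reduce this in practice, but the working set still grows quickly with material count and resolution.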

Many GPU renderers crash when trying to render a scene that requires more memory than is present on the GPU. It is then up to the artist or developer to work around this, typically by reducing the size of the materials, which also reduces their quality. There are many cases where this is a difficult tradeoff to make.

Zeus removes this legacy-GPU limitation by enabling users to install up to 384 GB on a single PCIe card: 4x as much as the RTX 6000 Pro (96 GB) and 12x as much as the RTX 5090 (32 GB).

Content Consumption

Real-Time Path Tracing

High quality real-time experiences like games, concerts, and metaverses are also dependent on real-time path tracing. Today, enabling path tracing features in a game reduces performance by >50%. In the case of Cyberpunk 2077, this means going from almost 60 fps without path tracing enabled, to under 20 fps (Nvidia 4090, 4K resolution, ultra settings).

The chart below shows the path tracing performance limitations of legacy GPUs. High quality path tracing relies on multiple rays per pixel bouncing around the scene multiple times.

Ray-Triangle Intersection Budget (4K 120 fps, per pixel).png

Performing a simple raycast (1 ray per pixel, 0 bounces) is feasible even on an Intel B580. However, calculating soft shadows requires multiple rays per pixel and at least 1 bounce. To get around this limitation, most real-time applications reduce the resolution of the ray traced layers, and sometimes update those layers at half the frame rate or below.

Legacy GPUs just do not have the ray tracing performance (and certainly not the path tracing performance) to fully render scenes in real-time. Zeus 1c (120 watts) dramatically increases the ray-triangle intersection budget to almost 26 rays per pixel, which is enough for 4-5 rays per pixel and 3-4 bounces. Zeus 2c (250 watts) doubles throughput and enables real-time path tracing for the first time without requiring resolution decreases.
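To make the budget concrete, the required intersection rate can be estimated directly from the resolution, frame rate, and per-pixel ray count (a sketch; the values below are illustrative, not Zeus specifications):

#include <cstdint>
#include <iostream>

int main() {
    // Back-of-the-envelope ray budget at 4K 120 fps.
    const std::uint64_t width = 3840, height = 2160, fps = 120;
    const std::uint64_t pixels_per_second = width * height * fps;        // ~995 million

    // "Rays per pixel" here counts every path segment that must be intersected
    // against the scene: primary rays x (bounces + 1).
    const int rays_per_pixel = 4;
    const int bounces        = 4;
    const std::uint64_t segments_per_pixel = rays_per_pixel * (bounces + 1); // 20

    const double required_grays = pixels_per_second * segments_per_pixel / 1e9;
    std::cout << "Intersections needed: ~" << required_grays << " Gray/s\n"; // ~19.9 Gray/s
    return 0;
}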

Simply measuring ray-triangle intersections does not represent all of the computations required to path trace a scene. Acceleration structure generation, traversal, and shading all take time. The time spent in those stages varies depending on the size and makeup of the scene being rendered, the number and complexity of the materials, and of course moving geometry (animations, deformations, etc.).

Zeus also offloads acceleration structure generation and traversal to further increase performance. For example, generating the acceleration structure for Terminator.usd, a scene with over 2 million triangles, takes around 0.6 milliseconds. Acceleration structure building and traversal on Zeus exerts minimal compute pressure on the scalar, vector, or matrix cores: it is fully offloaded.

For a performance guide on Lightning, the path tracing accelerator included in Zeus, see Developer Guide: Path Tracing.

Game Physics

Coming soon!

HPC

Today’s world is dependent on simulations to discover materials and drugs, design parts, simulate systems, and explore growing areas of scientific research. Simulations are used everywhere:

Bolt Zeus Introduction (1).png

Carmakers historically used expensive wind tunnels to measure the aerodynamic effectiveness of a vehicle under development. Using supercomputers instead eliminates the need for a manufactured sample vehicle, clay model, and wind tunnel. Over multiple design iterations this significantly reduces cost and the time spent in the feedback loop.

A similar story unfolds in other industries, from Oil & Gas to Aerospace: manufacturing a sample takes time and money. The future of advanced manufacturing foregoes physical prototyping for digital design and testing. Many enterprises have written in-house solvers for various purposes, enabling them to work around hardware limitations.

(The Decline of) Accurate HPC

HPC has historically required at least FP32 accuracy and preferred FP64. Computations that depend on previous results accumulate error: when running a simulation over many timesteps, rounding error compounds and the final result can be significantly inaccurate.
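A minimal sketch of this compounding (an arbitrary toy integration, not an HPC workload):

#include <cstdio>

int main() {
    // Integrate a constant rate over many small timesteps in FP32 and FP64.
    // The exact answer is steps * dt = 1000.0; single precision drifts noticeably.
    const long steps = 10'000'000;
    const float  dt32 = 1.0e-4f;
    const double dt64 = 1.0e-4;

    float  sum32 = 0.0f;
    double sum64 = 0.0;
    for (long i = 0; i < steps; ++i) {
        sum32 += dt32;   // each addition rounds to 24-bit precision; error accumulates
        sum64 += dt64;   // 53-bit precision keeps the accumulated error far smaller
    }

    std::printf("FP32: %.6f  FP64: %.6f  exact: 1000.000000\n", sum32, sum64);
    return 0;
}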

As Hyperion Research noted in their April 2025 HPC User Forum update, legacy GPU vendors are reducing the number of FP64-capable cores in favor of lower precision formats like FP16, FP8, and MX6.


Similarly, Timothy Morgan (The Next Platform, emphasis ours) states:

We are snobs about FP64 and we want those running the most intense simulations in the world for weather and climate modeling, or materials simulation, or turbulent airflow simulation, and dozens of other key HPC workloads that absolutely need FP64 floating point to be getting good value for the compute engines they are buying.

With the “Blackwell” B100 announced last year and ramping now, the peak vector FP64 performance was only 30 teraflops, which is a 10.5 percent decline from the peak FP64 performance of the Hopper GPU. And on the tensor cores, FP64 was rated the same 30 teraflops, which is a 55.2 percent decline for peak FP64 performance on the Hopper tensor cores. To be sure, the Blackwell B200 is rated at 40 teraflops on FP64 for both the vector and tensor units, and on the GB200 pairing with the “Grace” CG100 CPU, Nvidia is pushing it up to 45 teraflops peak FP64 for the vectors and the tensors. The important thing is tensor FP64 is not double that of vector FP64 with Blackwells, and in many cases, customers moving to Blackwell will pay more for less FP64 oomph than they did for Hopper GPUs.

Unlike our competitors, which dedicate a significant area of the chip to lower-precision math, we dedicate a significant area of the chip to higher-precision math.

(The Lack of) Consumer HPC

Legacy GPUs with sufficient FP64 processing power have been restricted to workstation or datacenter use. This tremendously restricts the market, as individuals, power users, and researchers need access to large amounts of compute to run useful simulations.

We’ve chosen to go down a different path: all Zeus GPUs, whether consumer or enterprise, have fully unlocked FP64 ALUs:

Zeus Architecture Diagrams_FP64 Performance.png

We enable all markets and users across many pricing tiers to have access to accurate computations!

Enhancing Key Workload Performance

Simply restoring the 2x FP32:FP64 compute ratio was not enough for us. We went further and co-designed Zeus hardware alongside key HPC software. HPC has many “key workloads”, including molecular dynamics, fluid dynamics, and electromagnetic simulation. We started with optimizing electromagnetic simulations.

Zeus performs electromagnetic simulations 300 times faster than Nvidia B200, without sacrificing accuracy or data output capabilities. The software package is called Apollo.

Bolt Zeus Introduction (2).png

Even on the FPGA where we test our hardware and software, we already beat the performance of 64-core CPUs and H100 GPUs!

Expandable Memory

We’re giving the power back to the user by enabling them to install the amount of memory they need.

Zeus SKU | Soldered LPDDR5X | Maximum DDR5 | Total Max Capacity
1c26-032 | 32 GB | 128 GB | 160 GB*
2c26-064 | 64 GB | 256 GB | 320 GB*
2c26-128 | 128 GB | 256 GB | 384 GB*
4c26-256 | 256 GB | 2,048 GB | 2,304 GB**

* Using 64 GB 5600 MT/s SO-DIMMs, like Crucial CT64G56C46S5. Users can choose lower density & higher bandwidth modules.
** Using 256 GB 4800 MT/s RDIMMs, like Samsung M321RBGA0B40-CWK. Users can choose lower density & higher bandwidth modules.
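The totals above are simply the soldered capacity plus the installed module capacity. A quick check for the entry-level SKU (assuming two SO-DIMM slots, implied by the 128 GB DDR5 maximum):

#include <iostream>

int main() {
    // Zeus 1c26-032 capacity check using the footnoted 64 GB SO-DIMMs.
    const int soldered_lpddr5x_gb = 32;
    const int sodimm_slots        = 2;    // assumption: 128 GB max / 64 GB modules
    const int sodimm_density_gb   = 64;

    const int total_gb = soldered_lpddr5x_gb + sodimm_slots * sodimm_density_gb;
    std::cout << "Total: " << total_gb << " GB\n";   // 160 GB, matching the table
    return 0;
}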

We selected LPDDR5X for soldered memory due to its low latency (<100 ns), low cost per bit (>3x lower than GDDR/HBM), and wide availability through multiple suppliers.

We selected DDR5 for expandable memory for the same reasons, including low latency, low cost per bit, and wide availability through many suppliers.

Zeus memory can be configured in various ways:

  • LPDDR5X only, when no DDR5 is installed.

  • Flat, where both LPDDR5X and DDR5 appear as separate address spaces.

  • Tiered, where only DDR5 is visible and LPDDR5X acts as a transparent cache.

Built-In Ethernet

We’ve integrated Ethernet directly into Zeus, eliminating the need for separate network cards (and DPUs). This reduces power consumption, latency, complexity, and cost. Zeus can talk over standard Ethernet to any other Ethernet device. This enables new workflows that previously were too complex and expensive to scale:

Bolt Zeus Announcement External (1).png
Bolt Zeus Announcement External (2).png

Zeus includes support for RDMA over Converged Ethernet v2 (RoCE v2), enabling data transfer between the memories of connected devices without CPU intervention.

Zeus Cluster

Historically, CPU and GPU cores were physically and logically far away from each other, requiring a host-device programming model. The application running on the CPU accesses the GPU device through a PCIe interface. The CPU and GPU have their own memories, and running a workload that performantly uses both the CPU and GPU requires careful programming.

Zeus Architecture Diagrams_Zeus Microarchitecture.png

Zeus eliminates the host-device programming model by enabling applications to access vector, matrix, and other accelerators through CPU instructions. This:

  • Reduces latency when, for example, scalar and vector code are tightly coupled and high latency to the vector unit would otherwise limit performance.

  • Reduces complexity and latency when sharing data between blocks inside a cluster.

Zeus adopts and extends the RISC-V ISA to enable straightforward interoperability with the wider ecosystem.

As Zeus is a cloud-native GPU, each cluster can be isolated from the others by controlling the router.

High Performance CPU Core

Each Zeus cluster includes an RVA23 RISC-V CPU core that offers high single-thread performance:

  • SPECint2006: ~25/GHz

  • Fmax ~3 GHz

  • 8-wide dispatch, 12 stage pipeline

  • 48-bit addressing

This CPU core runs Linux, eliminating the need for a separate PCIe-connected CPU host.

FP64 Vector Cores

Each Zeus cluster includes a wide vector processor composed of 32x FP64 ALUs (conformant to RVV 1.0), configured as:

  • ELEN=64

  • DLEN=512

  • VLEN=512

In addition, each vector processor has a low-latency, high bandwidth interface to 2 MB of L1 cache. This results in each FP32 ALU having access to 64 KB of L1 cache, compared to ~5 KB on GB200.

Cache (KB) per FP32 core (1).png

Vectorizable workloads that are memory-bound on other architectures will perform better on Zeus.

For example, a simple peak floating point benchmark intentionally limits data movement to test the theoretical maximum FLOPs:

static const char glsl_fp64_p4_data[] = R"(
#version 450

// Unroll count, set as a specialization constant when the pipeline is created.
layout (constant_id = 0) const int loop = 1;

layout (binding = 0) writeonly buffer c_blob { double c_blob_data[]; };

void main()
{
    const uint gx = gl_GlobalInvocationID.x;
    const uint lx = gl_LocalInvocationID.x;

    // Seed per-invocation values so the compiler cannot fold the loop away.
    dvec4 c = dvec4(gx);
    dvec4 a = c + dvec4(0, 1, 2, -3);
    dvec4 b = dvec4(lx) + dvec4(2, 3, 5, -7);

    // 16 dependent FP64 multiply-adds per iteration, with all operands in registers:
    // the kernel measures peak FP64 throughput rather than memory bandwidth.
    for (int i = 0; i < loop; i++)
    {
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
        c = a * c + b;
    }

    // A single write keeps the result live without adding bandwidth pressure.
    c_blob_data[gx] = (c[0] + c[1]) + (c[2] + c[3]);
}
)";

Extending a kernel like this to read and write more registers or L1 cache typically reduces performance significantly on legacy GPUs. On Zeus, performance degradation sets in only after considerably more register and L1 cache traffic.
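Because each cluster's vector unit is driven directly by the cluster CPU and conforms to RVV 1.0, vector code can also be written with the standard RISC-V vector intrinsics rather than a separate device-side kernel language. A minimal sketch using only standard RVV intrinsics (nothing Zeus-specific is assumed):

#include <riscv_vector.h>
#include <cstddef>

// y = a*x + y over FP64 arrays, using RVV 1.0 intrinsics. The hardware picks the
// active vector length (vl) each iteration, so the same code runs on any VLEN.
void daxpy(std::size_t n, double a, const double* x, double* y) {
    for (std::size_t i = 0; i < n;) {
        std::size_t vl = __riscv_vsetvl_e64m8(n - i);          // elements this pass
        vfloat64m8_t vx = __riscv_vle64_v_f64m8(x + i, vl);    // load x
        vfloat64m8_t vy = __riscv_vle64_v_f64m8(y + i, vl);    // load y
        vy = __riscv_vfmacc_vf_f64m8(vy, a, vx, vl);           // y += a * x
        __riscv_vse64_v_f64m8(y + i, vy, vl);                  // store y
        i += vl;
    }
}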

In addition, the vector processor has sufficient bandwidth to memory to keep the ALUs fed (calculated using 5.6 Gbps DDR5 DIMMs and 9.5 Gbps LPDDR5X modules):

Memory Bandwidth (MB_s) per FP32 core.png

Installing faster 8.8 Gbps memory modules increases bandwidth by ~14%.
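These bandwidth figures can be sanity-checked from the interface widths and data rates listed in the chiplet table further below (a sketch of the arithmetic; it assumes only the 64 data bits of each 80-bit DDR5 channel carry payload):

#include <iostream>

int main() {
    // Per-chiplet bandwidth from interface widths and data rates.
    auto bandwidth_gbs = [](double bits, double gbps) { return bits * gbps / 8.0; };

    const double lpddr5x = bandwidth_gbs(8 * 32, 9.5);   // 8x 32b @ 9.5 Gbps -> 304 GB/s
    const double ddr5_56 = bandwidth_gbs(2 * 64, 5.6);   // 2x 64b @ 5.6 Gbps -> 89.6 GB/s
    const double ddr5_88 = bandwidth_gbs(2 * 64, 8.8);   // 2x 64b @ 8.8 Gbps -> 140.8 GB/s

    const double base   = lpddr5x + ddr5_56;             // ~393.6 GB/s with 5600 MT/s DIMMs
    const double faster = lpddr5x + ddr5_88;             // ~444.8 GB/s with 8800 MT/s DIMMs
    std::cout << "Uplift: " << 100.0 * (faster - base) / base << " %\n"; // ~13-14%
    return 0;
}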

Accelerators

Each Zeus cluster includes various accelerators made available to applications using custom RISC-V extensions. This fulfills one of the original intents of the RISC-V ISA:

One use-case is developing highly specialized custom accelerators, designed to run kernels from important application domains. These might want to drop all but the base integer ISA and add in only the extensions that are required for the task in hand. The base ISAs have been designed to place minimal requirements on a hardware implementation, and have been encoded to use only a small fraction of a 32-bit instruction encoding space.

Accelerators included in Zeus typically fall into one of three categories:

  • Math functions perform math operations repeated in a software application. Higher CPU/vector overhead.

    • div, sin, cos, tan, arctan, sqrt

  • Kernel-level functions offload specific, high-value functions of a software application for a balance of flexibility and performance. Balanced CPU/vector overhead.

    • raster, fft

  • Application-level functions offload significant portions of a software application for maximum performance. Low CPU/vector overhead.

    • lightning, apollo, neptune

These accelerators are orders of magnitude faster than running scalar, vector, or matrix code. Each accelerator has its own buffers, and also uses the cluster-level shared cache when needed.
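As an illustration of how custom extensions are typically exposed to C/C++ code, a custom R-type instruction can be wrapped with the GNU assembler's .insn directive. The opcode and function fields below are placeholders, not Bolt's actual encoding:

#include <cstdint>

// Hypothetical wrapper for a custom accelerator instruction in the CUSTOM_0 opcode space.
// The funct3/funct7 values (0x0, 0x0) are placeholders for illustration only.
static inline std::uint64_t accel_op(std::uint64_t a, std::uint64_t b) {
    std::uint64_t result;
    asm volatile(".insn r CUSTOM_0, 0x0, 0x0, %0, %1, %2"
                 : "=r"(result)
                 : "r"(a), "r"(b));
    return result;
}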

For documentation on the RISC-V extensions and how to use these accelerators, please contact us.

Video Encoding & Decoding

Coming Soon!

Matrix Cores

Coming Soon!

Router

Coming Soon!

Telemetry

Each Zeus cluster includes a localized telemetry engine which uses a separate interface to expose counters and data:

  • Power status

  • Fan status

  • Temperature sensor readings

  • Device configuration

  • Device status

  • Voltage integrity: detect voltage drop & noise

  • Timing margin: detect degradation

  • Engine usage

Zeus includes support for OpenBMC modules, which enables Redfish integration with OpenStack, Supermicro Cloud Composer, and various other tools from HPE and Dell.

This telemetry data enhances the value of the Zeus GPU by giving operators access to data that helps them monitor performance and predict degradation and failures before an application experiences downtime.
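Because the interface is standard Redfish over HTTPS, this data can be pulled with any HTTP client. A minimal sketch using libcurl against the standard DMTF Thermal resource (the BMC address and credentials are placeholders):

#include <curl/curl.h>
#include <iostream>
#include <string>

// Append response bytes into a std::string.
static size_t collect(char* data, size_t size, size_t nmemb, void* userp) {
    static_cast<std::string*>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    std::string body;

    // Placeholder BMC address and credentials; /redfish/v1/Chassis/{id}/Thermal is the
    // standard Redfish resource for fan and temperature readings.
    curl_easy_setopt(curl, CURLOPT_URL, "https://bmc.example/redfish/v1/Chassis/1/Thermal");
    curl_easy_setopt(curl, CURLOPT_USERPWD, "admin:password");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);

    CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK)
        std::cout << body << "\n";        // JSON with Temperatures[] and Fans[]
    else
        std::cerr << curl_easy_strerror(rc) << "\n";

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}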

In addition to generating a heartbeat and reporting key telemetry, each cluster also reports link status and latency between clusters (inside and across chiplets), enabling the router and application generating the programmable chiplet routing table to find optimal paths between clusters and avoid dead links.

Zeus Chiplet

Each Zeus 1c26 chiplet is composed of 32x Zeus clusters connected to each other in a 2D mesh.

Each chiplet includes UCIe interfaces configured to run in AXI Streaming mode. This enables clusters on another chiplet to logically appear as if they are on the same chiplet.

The table below lists key specifications of the entry-level Zeus 1c26 chiplet:

Zeus 1c26 Chiplet | Spec
CPU Cores | 32 @ 3 GHz
FP64 vector cores | 1,024
Total on-chip caches | 128 MB
UCIe interfaces | 320 lanes @ 32 Gbps; 1,280 GB/s total bandwidth; ~10 ns latency
DDR5 interfaces | 2x 80b @ 8.8 Gbps maximum; 140.8 GB/s total bandwidth; <100 ns latency
LPDDR5X interfaces | 8x 32b @ 9.6 Gbps maximum; 307.2 GB/s total bandwidth; <100 ns latency
Video Encoders | 2x 8K60 streams; AV1, H.264, H.265; in-pipeline compositing and masking
Video Decoders | 2x 8K60 streams; AV1, H.264, H.265; in-pipeline compositing and masking
Display Interfaces | HDMI 2.1a; DisplayPort 2.1b

The Zeus 1c26 chiplet is connected to an I/O chiplet to convert UCIe to PCIe, providing:

I/O Chiplet | Spec
PCIe Interfaces | 2x PCIe Gen5 x16; each interface is bifurcatable up to 4x4; each interface can act as a root or endpoint port; each interface supports CXL 3.0 devices
Ethernet Interfaces | 400 GbE QSFP-DD (8x56G); supports 10 ft passive DACs
Management | RJ-45 Redfish BMC

Zeus Chiplet Scaling

Zeus packages are configured with 1x, 2x, or 4x Zeus chiplets. The diagrams below show the package configurations:

 

 

Zeus Architecture Diagrams_Zeus 1c26-032.png
Zeus Architecture Diagrams_Zeus 2c26-064-128.png
Zeus Architecture Diagrams_Zeus 4c26-256.png

 

Bolt Zeus Introduction.png

Memory Latency

Latency to memory has a major impact on workload performance. The chart below shows a benchmark comparing latency across various test sizes, from 4 KiB to 3 GiB:

Memory Latency (ns, log).png

Due to the inclusion of larger on-chip caches, Zeus experiences lower latency up to 2 MB. When testing larger sizes (forcing access to DRAM), LPDDR5X enables lower latency than GDDR6X and GDDR6.
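Benchmarks like this are typically pointer-chasing loops: each load's address depends on the previous load, so no latency can be hidden by the memory system. A minimal sketch of the technique (not the exact benchmark behind the chart):

#include <chrono>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

int main() {
    // Pointer-chasing latency sketch: every load depends on the previous one, so the
    // measured time per load approximates average memory latency for the working set.
    const std::size_t bytes   = 256u << 20;                 // 256 MiB working set (DRAM-sized)
    const std::size_t entries = bytes / sizeof(std::size_t);

    // Sattolo's algorithm builds a single-cycle random permutation, so the chase
    // touches every entry before repeating and defeats prefetchers.
    std::vector<std::size_t> next(entries);
    std::iota(next.begin(), next.end(), std::size_t{0});
    std::mt19937_64 rng{42};
    for (std::size_t i = entries - 1; i > 0; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(next[i], next[pick(rng)]);
    }

    const std::size_t loads = 20'000'000;
    std::size_t idx = 0;
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < loads; ++i)
        idx = next[idx];                                    // dependent load chain
    auto t1 = std::chrono::steady_clock::now();

    const double ns_per_load =
        std::chrono::duration<double, std::nano>(t1 - t0).count() / loads;
    std::cout << "~" << ns_per_load << " ns per load (idx=" << idx << ")\n";
    return 0;
}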

Zeus PCIe Card

We’re designing two PCIe cards (a single- and dual-slot version) for mass production in 2027, and taking a slightly different approach compared to our competitors:

  1. Unified branding and PCB designs: There are no AIB partners designing or producing different PCBs based on “reference designs” we create. Instead, we design the PCBs ourselves and work with various contract manufacturers (CMs) to produce the cards.

    1. Users no longer need to worry about the quality of different AIB brands

    2. Users no longer see pricing spreads by different AIB brands/tiers

  2. Open PCB designs: We will open access to the PCB schematics and simulations to partners, who can then build all types of cooling mechanisms and accessories for Zeus PCIe cards.

    1. Partners can easily design all sorts of heatsinks, waterblocks, LN2, etc.

    2. Partners can easily design daughterboards that use Zeus' extra PCIe x16 edge connector, including NVMe, other accelerators, other I/O including expanded display interfaces.

Zeus PCIe cards will be available for purchase directly from Bolt and through distributors and retailers.

Zeus Server

The Zeus 4c26-256 package is quite large and relies on full-size DDR5 RDIMMs for expanded memory capacity. As a result, it does not fit in the PCIe or OAM form factor. We have opted to design and partner with leading CMs to produce motherboards with 4x Zeus 4c26-256 GPUs:

Bolt Early Access Program HPC.png

Each Zeus 4c26-256 is connected to the others using 2x 800 GbE ports. Each package is also directly connected to up to 8x PCIe Gen5 x4 NVMe devices. As these PCIe interfaces support CXL 3.0, memory expander devices can be used here.

The Zeus server is not configured with an external CPU host, reducing cost, power consumption, and complexity.

The table below lists key specifications of the 2U server with 4x Zeus 4c26-256 GPUs:

 

Bolt Zeus 2U Server

System Peak Power | ~2,000 W
FP64 / FP32 / FP16 vector tflops | 80 / 160 / 320
INT16 / INT8 matrix pflops | 5 / 10
On-chip cache | 2 GB
Memory | Up to 9,216 GB @ 5.8 TB/s; 1 TB LPDDR5X + 32x DDR5 DIMMs
Path Tracing | 1,228 gigarays
Video Encoding & Decoding | 32x 8K60 streams; AV1, H.264, H.265
I/O | 8x 800 GbE (OSFP or QSFP-DD); 32x PCIe 5.0 x4

Zeus Rack

The Zeus 2U server is configured with 8x 800 GbE ports, enabling massive I/O. There are a few ways to connect servers in a rack to each other. We have found that connecting the servers directly to each other lowers costs, complexity, and power.

The Zeus rack design peaks at around 44 kW: this can be air cooled. We are working on a 1U variant that is liquid cooled, doubling performance, capacity, and power density to almost 90 kW.

Please reach out to us for the server configuration options (memory, NVMe, etc).

In this configuration, half of the 800 GbE ports are used to connect to the next server (in a 2D mesh, inside the rack and across neighboring racks). These can be low-power, inexpensive passive DACs thanks to the short distances required (2 ft DACs between servers in a rack, and 5-10 ft DACs between racks).

Optical cables are needed to connect each server to one or more switches. The remaining 4x 800 GbE ports provide flexibility in configuring a backend and frontend network in addition to the local 2D mesh.

The table below lists key specifications of the rack with 20x Zeus 2U servers installed:

 

Bolt Zeus Rack

Peak Power | ~40,000 W compute; ~4,000 W network
FP64 / FP32 / FP16 vector pflops | 1.6 / 3.2 / 6.4
INT16 / INT8 matrix pflops | 5 / 10
On-chip cache | 40 GB
Memory | Up to 184 TB @ 116 TB/s
Path Tracing | 24 terarays
Video Encoding & Decoding | 640x 8K60 streams; AV1, H.264, H.265
I/O | 640x PCIe 5.0 x4

Software

Zeus adopts the RISC-V standard mainly to integrate with an existing, rapidly growing ecosystem. Emulators like QEMU are widely available and robust, enabling porting and testing work without physical hardware. Various developer boards and SBCs with RVA23-conformant cores are already in development.

Core Software

Core software includes the firmware, bootloader, OS, and drivers:

  • Bootloader: Coreboot with Linuxboot for enhanced security and portability across existing architectures (x86, Arm)

  • Firmware: Bolt PLDM-compliant firmware delivery through Redfish

Due to third-party IP licensing, Zeus firmware is not open source. To request access to the source for security analysis, please contact us. We can make the source available under NDA.

Operating Systems

We are working with the Canonical team to expand support for Ubuntu and Ubuntu packages on RISC-V.

Porting & Optimizing

The RISE project stewards the RISC-V software ecosystem: https://riseproject.dev/

Porting software to RISC-V involves a few key steps:

Zeus Architecture Diagrams_RISCV.png

We highly recommend following RISE's RISC-V optimization guide: https://riscv-optimization-guide.riseproject.dev/

QEMU

Setting up QEMU with RISC-V is very straightforward and only takes a few minutes: https://risc-v-getting-started-guide.readthedocs.io/en/latest/linux-qemu.html

Help us (and the entire RISC-V ecosystem) out by setting up QEMU and trying to compile and run code that is interesting to you!

Compiler Toolchain

LLVM

For more information on LLVM & RISC-V: https://llvm.org/docs/RISCVUsage.html

gcc

For more information on gcc & RISC-V: https://github.com/riscv-collab/riscv-gnu-toolchain

Languages & Runtimes

C & C++

Use LLVM or gcc to compile your code to RISC-V.
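An ordinary C++ program needs no source changes; it only needs to be built for the right target and extensions. A minimal sketch (the target triple and -march string assume an RVA23-class core with vectors; adjust for your toolchain):

// hello_riscv.cpp - nothing RISC-V specific in the source itself.
//
// Cross-compile with clang (sysroot/toolchain paths omitted):
//   clang++ --target=riscv64-linux-gnu -march=rv64gcv -O2 hello_riscv.cpp -o hello_riscv
// or with the GNU toolchain:
//   riscv64-linux-gnu-g++ -march=rv64gcv -O2 hello_riscv.cpp -o hello_riscv
// and run it under QEMU user-mode emulation:
//   qemu-riscv64 ./hello_riscv
#include <iostream>

int main() {
    std::cout << "Hello from RISC-V\n";
    return 0;
}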

Julia

RISC-V support is ongoing: https://github.com/JuliaLang/julia/issues/57569

Python

Python 3 installs on RISC-V, with no special changes required to the package! An updated list of available packages: https://gitlab.com/riseproject/python

Fortran

Fortran has been supported on RISC-V for 4+ years (through flang-new and flang): https://blog.llvm.org/posts/2025-03-11-flang-new/

GPU APIs

CUDA

Coming Soon!

OpenCL

Coming Soon!

Vulkan

Coming Soon!

DirectX

Coming Soon!

OpenGL

Coming Soon!

Telemetry & Debugging

Coming Soon!