Parallax

Universal GPU Acceleration for C++ Parallel Algorithms

Zero code changes. Any GPU. Pure Vulkan.

parallax_demo.cpp
#include <algorithm>
#include <execution>
#include <vector>
// Also include the Parallax header that declares parallax::allocator.

int main() {
    // Allocate GPU-accessible memory through parallax::allocator
    std::vector<float, parallax::allocator<float>> data(1'000'000, 1.0f);

    // The standard std::execution::par policy is dispatched to the GPU
    std::for_each(std::execution::par,
                  data.begin(), data.end(),
                  [](float& x) { x *= 2.0f; });

    // Memory coherence is automatic: data can be read here with no explicit sync.
    // Works on AMD, NVIDIA, Intel - anything with Vulkan
}
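
One way to sanity-check the demo above is to run the same lambda sequentially on the CPU and compare the results. The sketch below uses only the standard library. `apply_doubling` and `matches_reference` are hypothetical helper names, not part of Parallax. Both are templated on the allocator, so they should accept a vector built with `parallax::allocator`, assuming that allocator meets the standard Allocator requirements.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: the same element-wise operation as the demo,
// run sequentially on the CPU to produce a reference result.
template <class Alloc>
void apply_doubling(std::vector<float, Alloc>& v) {
    std::for_each(v.begin(), v.end(), [](float& x) { x *= 2.0f; });
}

// Hypothetical helper: element-by-element comparison within a tolerance.
// The two vectors may use different allocators.
template <class AllocA, class AllocB>
bool matches_reference(const std::vector<float, AllocA>& result,
                       const std::vector<float, AllocB>& reference,
                       float tolerance = 1e-6f) {
    assert(tolerance >= 0.0f);
    if (result.size() != reference.size()) return false;
    for (std::size_t i = 0; i < result.size(); ++i) {
        if (std::fabs(result[i] - reference[i]) > tolerance) return false;
    }
    return true;
}
```

A typical use would be to build a plain `std::vector<float>` with the same initial contents, call `apply_doubling` on it, and pass it as `reference` alongside the vector processed with `std::execution::par`.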

Why Parallax?

🌍

Universal

Works on AMD, NVIDIA, Intel, Qualcomm, ARM Mali - any GPU with Vulkan 1.2+

Zero Overhead

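Dispatches Vulkan compute directly, with no translation layers and no runtime interpretation. Fixed dispatch cost still dominates very small inputs (see the 1K row under Throughput by Dataset Size).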

🧠

Smart Memory

Unified memory with automatic host-device synchronization
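
For readers unfamiliar with how a custom allocator plugs into `std::vector`, here is a minimal sketch of the standard Allocator interface. `host_allocator` is a hypothetical stand-in backed by `malloc`/`free`. It is not Parallax's implementation and performs no GPU allocation or synchronization. A GPU-aware allocator would presumably obtain host-visible device memory inside `allocate` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>
#include <vector>

// Hypothetical stand-in allocator (illustration only, plain host memory).
template <class T>
struct host_allocator {
    using value_type = T;

    // malloc only guarantees alignment up to alignof(std::max_align_t).
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "over-aligned types are not supported by this sketch");

    host_allocator() noexcept = default;
    template <class U>
    host_allocator(const host_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n == 0) return nullptr;
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
            throw std::bad_alloc();  // n * sizeof(T) would overflow
        }
        void* p = std::malloc(n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// Stateless allocators always compare equal.
template <class T, class U>
bool operator==(const host_allocator<T>&, const host_allocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const host_allocator<T>&, const host_allocator<U>&) noexcept {
    return false;
}

// Small self-check: the allocator is a drop-in second template argument.
inline void host_allocator_demo() {
    std::vector<float, host_allocator<float>> data(1000, 1.0f);
    for (float& x : data) x *= 2.0f;
    assert(data.size() == 1000);
    assert(data.front() == 2.0f && data.back() == 2.0f);
}
```

The only change at the call site is the second template argument of `std::vector`. The element access and algorithm code stay the same.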

🔓

Open Source

Apache 2.0 licensed. Community-driven development

Performance Results

Production benchmarks on NVIDIA GTX 980M (December 2025)

744M elements/sec with std::for_each (1M-element dataset)
732M elements/sec with std::transform (1M-element dataset)
100% test pass rate (47/47 conformance tests)
0 source changes (pure ISO C++20)

Throughput by Dataset Size

Dataset Size     std::for_each (M elements/s)   std::transform (M elements/s)   Efficiency
1K elements      0.18                           0.38                            Low (overhead)
10K elements     36.77                          36.63                           Good
100K elements    228                            243                             Excellent
1M elements      744                            732                             Excellent
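
To make the overhead visible, each throughput figure can be converted back into the wall time it implies: time = elements / rate. The sketch below does that arithmetic on the table's numbers. `implied_ms`, `Row`, and `kRows` are names invented for this example, and the derived times are not measurements.

```cpp
#include <cassert>
#include <cstdio>

// Wall time in milliseconds implied by a throughput figure.
// rate_m_per_s is in millions of elements per second.
inline double implied_ms(double elements, double rate_m_per_s) {
    assert(rate_m_per_s > 0.0);
    return elements / (rate_m_per_s * 1e6) * 1e3;
}

struct Row {
    const char* label;
    double elements;
    double for_each_rate;   // M elements/s, copied from the table
    double transform_rate;  // M elements/s, copied from the table
};

constexpr Row kRows[] = {
    {"1K",   1e3,   0.18,   0.38},
    {"10K",  1e4,  36.77,  36.63},
    {"100K", 1e5, 228.0,  243.0},
    {"1M",   1e6, 744.0,  732.0},
};

// Print the implied time per call for each dataset size.
inline void print_implied_times() {
    for (const Row& r : kRows) {
        std::printf("%-5s for_each %.3f ms, transform %.3f ms\n", r.label,
                    implied_ms(r.elements, r.for_each_rate),
                    implied_ms(r.elements, r.transform_rate));
    }
}
```

For `std::for_each` this works out to roughly 5.6 ms at 1K, 0.27 ms at 10K, 0.44 ms at 100K, and 1.34 ms at 1M elements. The 1K call taking longer than the 10K call may point to a one-time startup cost rather than a per-element one, though the table alone cannot confirm that.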

Architecture

C++ Application (std::execution)
        ↓
Parallax Runtime (C ABI)
        ↓
Vulkan Compute Backend
        ↓
GPU Driver (AMD, NVIDIA, Intel, etc.)
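
The layering above puts a C ABI between the C++ front end and the backend. As a rough illustration of that pattern, the sketch below shows a typed C++ wrapper calling an untyped `extern "C"` entry point. Every name here (`demo_runtime_dispatch`, `demo_kernel_fn`, `demo_double_floats`, `double_all`) is invented for the example and is not the Parallax API. The "backend" is a CPU stub that walks the data in chunks. A real Vulkan backend would receive a compiled compute shader rather than a host function pointer.

```cpp
#include <cassert>
#include <cstddef>

extern "C" {

// Hypothetical kernel signature: operate on elements [begin, end).
typedef void (*demo_kernel_fn)(void* data, std::size_t begin, std::size_t end);

// Hypothetical runtime entry point. The CPU stub processes the buffer in
// fixed-size chunks, standing in for a GPU dispatch of workgroups.
// Returns 0 on success and -1 on invalid arguments (C-style error code).
int demo_runtime_dispatch(demo_kernel_fn kernel, void* data,
                          std::size_t count, std::size_t chunk) {
    if (kernel == nullptr || chunk == 0) return -1;
    for (std::size_t begin = 0; begin < count; begin += chunk) {
        std::size_t end = (count - begin < chunk) ? count : begin + chunk;
        kernel(data, begin, end);
    }
    return 0;
}

// Hypothetical kernel: double every float in [begin, end).
void demo_double_floats(void* data, std::size_t begin, std::size_t end) {
    float* values = static_cast<float*>(data);
    for (std::size_t i = begin; i < end; ++i) values[i] *= 2.0f;
}

}  // extern "C"

// C++ front end: a typed wrapper over the untyped C boundary.
inline bool double_all(float* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    return demo_runtime_dispatch(&demo_double_floats, data, count, 256) == 0;
}
```

Keeping the boundary in plain C means the runtime can be built and versioned separately from the C++ standard library that the application uses, which is one common reason for this design.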
