Universal GPU Acceleration for C++ Parallel Algorithms
Zero code changes. Any GPU. Pure Vulkan.
```cpp
// Use parallax::allocator for GPU-accessible memory
std::vector<float, parallax::allocator<float>> data(1'000'000, 1.0f);

// Just use std::execution::par - it runs on the GPU automatically
std::for_each(std::execution::par,
              data.begin(), data.end(),
              [](float& x) { x *= 2.0f; });

// Memory coherence is automatic - no explicit sync needed
// Works on AMD, NVIDIA, Intel - anything with Vulkan
```
- Works on AMD, NVIDIA, Intel, Qualcomm, ARM Mali - any GPU with Vulkan 1.2+
- Direct Vulkan compute: no translation layers, no runtime interpretation
- Unified memory with automatic host-device synchronization
- Apache 2.0 licensed, community-driven development
Benchmarks measured on an NVIDIA GTX 980M (December 2025)
| Dataset Size | std::for_each (M elem/s) | std::transform (M elem/s) | Efficiency |
|---|---|---|---|
| 1K elements | 0.18 | 0.38 | Low (launch overhead dominates) |
| 10K elements | 36.77 | 36.63 | Good |
| 100K elements | 228 | 243 | Excellent |
| 1M elements | 744 | 732 | Excellent |