NeuReality
Semiconductor Manufacturing · Haifa, Israel · 51-200 Employees
AI infrastructure has a hidden problem: the network and orchestration layer. As models scale to trillions of parameters and inference demand explodes, two bottlenecks emerge: how data moves between GPUs and how workloads are managed across them.

The industry added more GPUs, scaled clusters, and optimized models. But utilization still hovers around 50-70%. The compute is there, idle, burning watts. The bottleneck isn't the silicon; it's how data moves and how work gets distributed.

Traditional networking was built for general-purpose workloads, not AI's east-west traffic and microsecond-sensitive synchronization. Traditional orchestration treats GPUs as generic compute, blind to the distinct demands of prefill, decode, and model synchronization. Every GPU cycle wasted waiting is money and energy lost.

We asked: What if the network wasn't just faster, but intelligent? What if orchestration understood AI workloads natively?

NR-NEXUS is an operating system for large-scale inference. Hardware-agnostic, it unifies fragmented open-source frameworks into a single production platform, running across hyperscale clouds, GPU clusters, and emerging XPUs.

NR2 AI-SuperNIC eliminates the data-movement bottlenecks that limit GPU utilization. It executes the networking data path in hardware with no CPUs in the critical path, integrates in-network compute to offload communication operations, and supports open Ethernet-based networking.

Together, they transform distributed GPU and XPU clusters into high-throughput token factories. The result: GPUs at near-100% utilization. Inference scales without adding racks. Energy consumption drops.

This isn't incremental optimization. It's a rethinking of the data path and control plane so AI infrastructure matches AI ambition. For our customers: maximum performance from existing hardware. Lower cost, lower power, lower latency, higher throughput.

NeuReality is headquartered in Tel Aviv with offices across North America and Europe.
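To make the utilization claim above concrete, here is a minimal sketch of the underlying cost arithmetic: a GPU averaging 50% utilization produces tokens at twice the effective cost of the same GPU at full utilization. The dollar and throughput figures below are hypothetical placeholders for illustration, not NeuReality benchmarks; only the 50-70% utilization range comes from the text above.

```python
def effective_cost_per_million_tokens(gpu_hour_cost: float,
                                      peak_tokens_per_sec: float,
                                      utilization: float) -> float:
    """Cost of generating one million tokens at a given average utilization.

    Idle cycles still bill by the hour, so effective cost scales
    inversely with utilization.
    """
    tokens_per_hour = peak_tokens_per_sec * 3600 * utilization
    return gpu_hour_cost / tokens_per_hour * 1_000_000

# Hypothetical figures: $2.50/GPU-hour, 1,000 tokens/s at peak.
for util in (0.50, 0.70, 1.00):
    cost = effective_cost_per_million_tokens(2.50, 1000, util)
    print(f"utilization {util:.0%}: ${cost:.2f} per million tokens")
```

At these placeholder rates, moving a cluster from 50% to near-full utilization halves the cost per token without adding a single rack, which is the economic argument the paragraph above is making.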