rl-list.com
UPDATED 2026.06.07
// RL ENVIRONMENT VENDORS, 2026 INDEX

RL Environment Vendors
2026 Directory & Rankings

A map of the 38 companies building RL environments for frontier AI labs, profiled from public sources and what vendors share directly.

Every figure is cited. We track funding, team and research depth, customers, security (SOC 2), focus areas, and scale.
38 vendors tracked · 4 segments · 10 disclosed SOC 2 · 15 open source

RL environments are the simulated tasks and worlds used to train and evaluate AI agents with reinforcement learning, from the open-source Gymnasium API (the successor to OpenAI Gym) to bespoke coding worlds behind benchmarks like SWE-bench. This directory maps the companies building them commercially, along with the human-data and RLHF work that sits alongside. For the foundations, see Sutton & Barto's Reinforcement Learning: An Introduction. How we research and rank vendors is on the methodology page.
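As a rough illustration of what "environment" means in the Gymnasium sense, the sketch below mimics that API's reset/step convention in plain Python, without importing the library. The class GuessNumberEnv, its hint encoding, and the run_episode helper are invented for this example and are not taken from any vendor listed here.

```python
import random


class GuessNumberEnv:
    """Toy environment that follows the Gymnasium-style reset/step convention.

    Hypothetical example: the agent must guess a hidden integer. The
    observation is a hint: +1 (guess too low), -1 (guess too high), 0 (no hint
    yet, or correct guess).
    """

    def __init__(self, low=0, high=9, max_steps=5):
        self.low, self.high, self.max_steps = low, high, max_steps
        self._target = None
        self._steps = 0

    def reset(self, seed=None):
        # Returns (observation, info), as Gymnasium environments do.
        rng = random.Random(seed)
        self._target = rng.randint(self.low, self.high)
        self._steps = 0
        return 0, {}

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info).
        self._steps += 1
        if action < self._target:
            observation = 1
        elif action > self._target:
            observation = -1
        else:
            observation = 0
        terminated = action == self._target          # task solved
        reward = 1.0 if terminated else 0.0          # verifiable, binary reward
        truncated = (not terminated) and self._steps >= self.max_steps
        return observation, reward, terminated, truncated, {}


def run_episode(env, seed=None):
    """Run one episode with a simple binary-search policy; return total reward."""
    env.reset(seed=seed)
    lo, hi = env.low, env.high
    total = 0.0
    while True:
        guess = (lo + hi) // 2
        obs, reward, terminated, truncated, _ = env.step(guess)
        total += reward
        if terminated or truncated:
            return total
        if obs > 0:
            lo = guess + 1
        else:
            hi = guess - 1


if __name__ == "__main__":
    print(run_episode(GuessNumberEnv(), seed=0))
```

Commercial environments replace the hidden number with a codebase, a browser session, or a mocked enterprise tool, but the loop of observe, act, and score has the same shape.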

RL environment vendor directory

#1

Mechanize

Commercial
Mechanize is a small, elite San Francisco vendor (founded April 2025 by ex-Epoch AI researchers Matthew Barnett, Tamay Besiroglu, and Ege Erdil) that builds a small number of robust, high-fidelity RL environments and evals for frontier coding agents, selling to leading AI labs. Its stated long-term mission is the full automation of valuable economic work via simulated 'digital office' environments.
Raised: $9.1M · Headcount: 11-50 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Nat Friedman, Daniel Gross, Patrick Collison
Coding · Computer Use · Private Codebases
site ↗
#2

AfterQuery

Commercial
AfterQuery is a San Francisco applied-research lab and data platform (YC W25) that supplies frontier AI labs with expert-generated human data (SFT, RL rubrics), agent/RL environments, and computer-use trajectories, drawn from a large network of verified practitioners. It publishes real-task benchmarks such as Terminal-Bench, VADER, FinanceQA, and IDE-Bench, positioning around capturing how domain experts (engineers, financial analysts, lawyers) reason.
Raised: $30.5M · Headcount: 51-200 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Altos Ventures (lead, Series A), The Raine Group, Y Combinator
Coding · Computer Use · Enterprise Workflows
site ↗
#3

Deeptune

Commercial
Deeptune is a New York-based startup building managed reinforcement-learning environments ('training gyms') for computer-use and code, where AI agents practice and are evaluated on realistic digital knowledge-work tasks (simulating tools like Slack and Salesforce). It sells these pre-built environments primarily to frontier AI labs and raised a $43M Series A led by a16z, announced March 2026.
Raised: $43M · Headcount: 11-50 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Andreessen Horowitz (a16z, lead), 776, Abstract Ventures
Coding · Computer Use · Private Codebases
site ↗
#4

Bespoke Labs

Commercial
Bespoke Labs is an applied AI research lab (Mountain View, CA, founded 2024) focused on data curation and RL-environment curation for training and evaluating agents, known for open datasets and reproducible recipes (OpenThoughts) and open-source tools (Curator, Evalchemy). It pairs a public open-source/open-data presence with commercial custom data and RL-environment delivery.
Raised: $7.25M · Headcount: 11-50 · Founded: 2024 · SOC 2: unknown · Researchers: yes · Open src: yes
Pedigree: Co-founder/CEO Maheswaran (Mahesh) Sathiamoorthy, ex-Google DeepMind
Long-Horizon
site ↗
#5

Huzzle Labs

Commercial
Huzzle Labs is the AI division of London-based talent platform Huzzle (founded ~2020 by Ingmar Klein, Parham Rakhshanfar, and Amit Choudhary). It positions itself as a human-intelligence data foundry that builds RL environments (code, tool-use, computer-use, long-horizon enterprise workflows), expert trajectory data, and contextual evaluations for frontier AI labs and regulated European enterprises, leveraging Huzzle's vetted PhD/expert network. It bundles environments, human data, and evals in one stack.
Raised: $6M · Headcount: 11-50 · Founded: 2020 · SOC 2: Type II · Researchers: yes · Open src: partial
Backers: 10x Founders, Angel Invest, Emerge
Coding · Computer Use · Private Codebases
site ↗
#6

Fleet AI

Commercial
Fleet AI builds high-fidelity reinforcement-learning training environments ('gyms') that replicate enterprise software such as Salesforce and Excel, plus browser/desktop workflows, so frontier AI labs and large enterprises can train and evaluate computer-use agents. It ships a Python SDK, a platform API, and the open-source 'Harbor' agent-evaluation/RL-environment tooling, pairing simulated environments with human supervision.
Raised: $15M · Headcount: 11-50 · Founded: 2024 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Sequoia Capital, Menlo Ventures, SV Angel
Computer Use · Enterprise Workflows
site ↗
#7

Datacurve

Commercial
Datacurve is a YC W24 commercial data vendor that supplies expert-curated frontier coding data, RLHF traces, and repository-wide reinforcement learning environments (with unit-test verifiers) to foundation model labs, sourced via its Shipd bounty platform of vetted software engineers. It also publishes DeepSWE, a long-horizon agentic coding benchmark.
Raised: $17.7M · Headcount: 11-50 · Founded: 2024 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Chemistry (Mark Goldberg, lead Series A), Y Combinator, Balaji Srinivasan (seed)
Coding · Private Codebases
site ↗
#8

Proximal

Commercial
Proximal is a San Francisco-based (with a Bangalore presence) research lab for coding data, building high-fidelity, long-horizon reinforcement learning environments grounded in real codebases to train and evaluate frontier coding agents. It emphasizes scalable, software-driven data engines over human contractors, and research into reward-hacking detection and 'fuzzy verifiers' that score code quality beyond functional correctness.
Raised: not listed · Headcount: 11-50 · Founded: 2026 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Scribble Ventures (lead), Angels from OpenAI, Anthropic, Thinking Machines, Google DeepMind, xAI, Meta Superintelligence, Cursor and Cognition (per founders' own statements; not independently verified)
Coding · Long-Horizon · Private Codebases
site ↗
#9

Gray Swan AI

Commercial
Gray Swan AI is a Pittsburgh-based AI security company spun out of Carnegie Mellon, offering adversarial red-teaming and runtime protection for AI models and agents via three products: Arena (a crowdsourced adversarial red-teaming network of 15,000+ researchers), Shade (automated red-teaming/pressure-testing), and Cygnal (runtime input/output guardrails). It positions itself as a security/evaluation partner to frontier labs and enterprises rather than a general RL-environment vendor.
Raised: $40M · Headcount: 11-50 · Founded: 2023 · SOC 2: Type II · Researchers: yes · Open src: no
Backers: Wing Venture Capital (co-lead), Madrona (co-lead), Obvious Ventures
#10

Veris AI

Commercial
Veris AI sells a high-fidelity simulation platform plus a production runtime that let enterprises train, evaluate, and govern AI agents against mocked enterprise tools before and during production, with support for reinforcement learning / fine-tuning pipelines. It positions itself as the enterprise 'environment layer' that agent builders lack.
Raised: $8.5M · Headcount: 1-10 · Founded: 2025 · SOC 2: claimed · Researchers: yes · Open src: no
Backers: Decibel Ventures (lead), Acrew Capital (lead), The House Fund
Enterprise Workflows
site ↗
#11

Chakra Labs

Commercial
Chakra Labs runs Dojo, an open/collaborative reinforcement-learning environment hub for computer-use agents, offering deterministic, frame-accurate clones of production software plus human computer-use trajectory datasets, with native support for the Harbor, Verifiers and Verl RL frameworks. It positions itself as bringing frontier-lab-grade CUA training infrastructure to the broader research community.
Raised: $10.1M · Headcount: 11-50 · Founded: 2024 · SOC 2: unknown · Researchers: yes · Open src: yes
Pedigree: Alexander Fung (co-founder), ex-Palantir, Snap/Snapchat, Fin; Compute
Computer Use
site ↗
#12

Andon Labs

Commercial
Andon Labs is a Y Combinator-backed (W24) startup, formerly Vectorview, building benchmarks and evaluations for AI agents' long-horizon coherence and safety (Vending-Bench, Butter-Bench, Blueprint-Bench) and operating real-world autonomous AI businesses. It is known for high-profile collaborations placing AI-run vending machines/stores in the offices of frontier labs Anthropic (Project Vend) and xAI (Grokbox).
Raised: $500K · Headcount: 11-50 · Founded: 2023 · SOC 2: unknown · Researchers: yes · Open src: ?
Backers: Y Combinator (W24)
Computer Use · Long-Horizon
site ↗
#13

Sepal AI

Commercial
Sepal AI is a YC-backed (S24) San Francisco data-research company that builds high-quality training data, expert-graded evaluation benchmarks, and reinforcement-learning environments for frontier LLMs, drawing on a network of 20k+ domain experts (PhDs, finance, medical, STEM). It was acquired by Mercor in February 2026.
Raised: $500K · Headcount: ? · Founded: 2024 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Y Combinator, Metaplanet Holdings, SID Venture Partners
Enterprise Workflows · Long-Horizon · Math
site ↗
#14

HUD

Commercial
HUD (YC W25, formerly hud.so) is a platform for building reinforcement-learning environments and evaluations for computer-use and browser agents. It lets teams wrap real software/code as agent-callable tools in isolated containers, define tasks and rewards, and run evals/RL at scale via an open-source SDK plus a cloud-hosted gateway. It maintains public benchmarks (OSWorld-Verified contributions, SheetBench-50) and positions frontier AI labs and agent-first startups as its target customers.
Raised: not listed · Headcount: 11-50 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Y Combinator (W25 batch), Exceptional Capital
Computer Use · Enterprise Workflows
site ↗
#15

Vals AI

Commercial
Vals AI is an independent, third-party benchmarking and evaluation platform that scores LLMs and AI applications (copilots, RAG, agents) on rigorous, domain-specific tasks in regulated fields such as legal, finance, healthcare, tax and coding. It publishes public leaderboards (e.g., the Vals Index, Finance Agent benchmark, Vals Legal AI Report) and sells private evaluation infrastructure to labs and enterprise engineering teams.
Raised: not listed · Headcount: 11-50 · Founded: 2023 · SOC 2: claimed · Researchers: yes · Open src: yes
Pedigree: Co-founder/CEO Rayan Krishnan - ex-Stanford AI master's
#16

Halluminate

Commercial
Halluminate (YC S25, founded 2024, San Francisco) builds managed reinforcement-learning sandbox environments, simulated applications, and human/annotation data plus evaluation benchmarks (WebBench, BrowserBench, Westworld) to train and test computer-use and browser AI agents. Its 2026 site positioning has narrowed toward 'RL environments for financial services' (investment banking, private equity, consulting).
Raised: $160K · Headcount: 1-10 · Founded: 2024 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Y Combinator (S25), Orange Collective, Antigravity Capital
Computer Use · Enterprise Workflows
site ↗
#17

Matrices

Commercial
Matrices builds reinforcement-learning training environments for frontier AI labs to train agents that use computers and browsers like humans, described as a 'gamified replica of the internet' where thousands of agents learn via RL. The company frames its mission as 'towards self-driving computers' and says it helps labs train computer-use agents (Operator-class systems). Note: this is the correct browser-native entity (matrices.ai / LinkedIn 'matricesapp'), distinct from the similarly named 'Matrice.ai' computer-vision company and 'Matrix AI Network' blockchain project.
Raised: $5M · Headcount: 11-50 · Founded: 2023 · SOC 2: unknown · Researchers: ? · Open src: no
Backers: Index Ventures, AI Grant (Nat Friedman & Daniel Gross), Naval Ravikant
Computer Use
site ↗
#18

BenchFlow

Commercial
BenchFlow is an early-stage, YC-backed open-source 'environment lab' building evaluation infrastructure and a community Benchmark Hub for AI agents, with products including SkillsBench, ClawsBench (mock workplace environments) and a sandboxed agent runtime. It positions environments as 'the new data' for training and evaluating agents across domains like enterprise workflows, coding, computer use and browser tasks.
Raised: $1.0M · Headcount: 1-10 · Founded: 2024 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Y Combinator, Pear VC, Construct Capital
Coding · Computer Use · Enterprise Workflows
site ↗
#19

Collinear

Commercial
Collinear AI operates a 'Simulation Lab' (SimLab) that builds sandboxed, stateful RL environments simulating enterprise users, tools (Jira, ServiceNow, Shopify, EMR, airline/hotel systems) and multi-step workflows, producing training-ready trajectories, reward signals and evals for agentic models. It also offers synthetic post-training data and LLM-judge evaluation, positioning itself around 'environment-as-a-service' for enterprise long-horizon agents.
Raised: not listed · Headcount: 11-50 · Founded: 2023 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Engineering Capital, Firestreak Ventures, 112 Capital (11.2 Capital)
Coding · Computer Use · Enterprise Workflows
site ↗
#20

Refresh

Commercial
Refresh (YC X25) builds simulation engines / RL environments with verifiable rewards for coding and computer use, partnering with frontier labs and enterprises to train AI software-engineering and computer-use 'coworker' capabilities across terminal and GUI.
Raised: not listed · Headcount: 1-10 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Y Combinator
Coding · Computer Use · Private Codebases
site ↗
#21

Vmax

Commercial
Vmax is a San Francisco reinforcement-learning startup (founded 2025 by three RL/robotics PhDs from UCL and UPenn) that automates the conversion of proprietary data and evals into RL environments for LLM-based agents, targeting long-horizon and coding tasks. Its public research includes unix-ctf, a procedural generator of capture-the-flag tasks for Unix/shell competence.
Raised: not listed · Headcount: 1-10 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Race Capital, South Park Commons
Coding · Long-Horizon
site ↗
#22

Andromede

Commercial
Andromede is an early-stage RL data lab that programmatically generates RL environments, tasks, and verifiers from real-world data for post-training and evaluation of frontier agents, with an emphasis on long-horizon sequential reasoning tasks. As of mid-2026 it is in private beta, working with a small set of partners. It was co-founded by Guillaume Allegre (Founder & President) and Alexandre Sallinen (an EPFL-affiliated researcher who contributed to the Meditron medical-LLM project), and is backed by Unusual Ventures.
Raised: not listed · Headcount: 1-10 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Unusual Ventures
Long-Horizon
site ↗
#23

Plato

Commercial
Plato (plato.so, Plato Technologies, Inc.) builds simulated worlds for training and evaluating browser and computer-use agents, recreating real websites/software (e.g. Amazon/Airbnb/Gmail-style replicas) as reinforcement-learning environments with structured APIs for interaction, state tracking and scoring. It also offers a 'Computer Use' capability driving a full Linux desktop, positioning at the intersection of browser interaction and enterprise workflow simulation.
Raised: not listed · Headcount: 1-10 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: no
Pedigree: Pranav Putta (Co-founder/CTO), prior MultiOn, Georgia Institute of Te
Computer Use · Enterprise Workflows
site ↗
#24

AIChamp

Commercial
AIChamp builds custom reinforcement-learning environments and 'Virtual Gym' simulations for training and evaluating tool-using AI agents on long-horizon, multi-step enterprise tasks, pairing engineered environments (agents operating in software like Slack, Notion, Linear) with domain experts who design and grade tasks (SFT/RLHF/process supervision). The company emphasizes deep industry authority and expert-sourced data, having pivoted from a remote-talent/hiring marketplace background.
Raised: $0 · Headcount: 1-10 · Founded: ? · SOC 2: unknown · Researchers: ? · Open src: no
Pedigree: Self-claimed 'alumni of OpenAI and xAI team' (vendor site, unverified)
Enterprise Workflows · Long-Horizon
site ↗
#25

Habitat Inc

Commercial
Habitat Inc is an early-stage commercial vendor (2-10 employees, New York HQ) building reinforcement-learning environments for white-collar / work automation, with stated focus on code and desktop-style (computer use) interaction tasks for training agentic AI models. It appears in third-party listings of RL-environment suppliers serving AI labs. No funding, customer, or certification information is publicly available.
Raised: not listed · Headcount: 1-10 · Founded: ? · SOC 2: unknown · Researchers: ? · Open src: no
Pedigree: Maxim Enis (co-founder), Williams College '24; prior Ramp association
Coding · Computer Use · Enterprise Workflows
site ↗
n/r

Scale AI

Incumbent
Scale AI is the data-labeling and AI-data incumbent that has extended into RL environments, offering simulated web apps, macOS/Windows-like desktop VMs, and MCP-tool environments (Slack, HubSpot, Linear) with expert-designed objectives, rubrics, and automated verifiers to train and evaluate agents on long-horizon professional workflows. Following Meta's ~$14.3B June 2025 investment (~49% non-voting stake) and founder Alexandr Wang's departure to Meta, several frontier-lab customers (OpenAI, Google, xAI) reportedly scaled back or paused engagement over conflict-of-interest concerns.
Raised: $1.6B · Headcount: 200+ · Founded: 2016 · SOC 2: Type II · Researchers: yes · Open src: no
Backers: Meta Platforms, Accel, Amazon
Coding · Computer Use · Enterprise Workflows
site ↗
n/r

Modal

Infrastructure
Modal (Modal Labs) is a New York-based, Python-native serverless cloud purpose-built for AI/ML workloads, providing on-demand GPU/CPU compute, fast-booting sandboxed containers, inference, fine-tuning, and code execution. It is execution infrastructure rather than an RL-environment vendor, but is used to run reinforcement-learning training and large fleets of parallel sandboxed environments for AI labs.
Raised: $466M · Headcount: 51-200 · Founded: 2021 · SOC 2: Type II · Researchers: no · Open src: no
Backers: General Catalyst (Series C co-lead), Redpoint Ventures (Series C co-lead; earlier Series A lead), Lux Capital (earlier round lead)
Coding
site ↗
n/r

Mercor

Incumbent
Mercor is a venture-backed expert-marketplace and AI-training-data company that organizes a network of 30,000+ domain experts (doctors, lawyers, bankers, engineers) to produce RLHF data, evaluations, and reinforcement-learning environments for frontier AI labs and enterprises. Originally an AI-recruiting platform, it pivoted to human-data/RL services and expanded its RL-environment capability via the February 2026 acquisition of Sepal AI.
Raised: $492M · Headcount: 51-200 · Founded: 2023 · SOC 2: unknown · Researchers: yes · Open src: no
Backers: Felicis Ventures (led Series C and Series B), Benchmark, General Catalyst
Coding · Enterprise Workflows · Long-Horizon
site ↗
n/r

Surge AI

Incumbent
Surge AI is a bootstrapped, high-revenue human-data and RLHF labeling leader serving frontier AI labs, which has expanded into agentic RL environments via its EnterpriseBench suite (notably the CoreCraft enterprise customer-support simulation) and accompanying published benchmarks. As of June 2026 a reported ~$1B first external raise at a ~$25B valuation was in talks but not confirmed closed.
Raised: $1B (reported raise in talks as of June 2026, not confirmed closed) · Headcount: 51-200 · Founded: 2020 · SOC 2: unknown · Researchers: yes · Open src: no
Pedigree: Founder/CEO Edwin Chen: ex-Google, ex-Facebook, ex-Twitter ML teams; M
Enterprise Workflows · Long-Horizon
site ↗
n/r

Prime Intellect

Open source
Prime Intellect operates an open-source RL stack - the Environments Hub (2,500+ community RL environments), the Verifiers library and prime-rl training framework, plus hosted RL post-training (Lab), evals, inference and on-demand GPU compute. It positions itself as the open alternative to closed big-lab RL tooling and also trains open models (INTELLECT series).
Raised: $70.4M · Headcount: 11-50 · Founded: 2023 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Founders Fund (led $15M round), Menlo Ventures, Distributed Global (co-led seed)
Coding · Enterprise Workflows · Long-Horizon
site ↗
n/r

Daytona

Infrastructure
Daytona provides secure, elastic, programmatic sandboxes ('computers') that AI agents and developers can spin up in under 90ms to run untrusted AI-generated code in isolated, stateful runtimes. It offers a managed-hosted service plus an open-source self-hostable stack, and is positioned as agent-native execution/runtime infrastructure for code execution, computer use, and RL/eval workloads.
Raised: $31M · Headcount: 11-50 · Founded: 2023 · SOC 2: Type I · Researchers: ? · Open src: yes
Backers: FirstMark Capital (Series A lead; Matt Turck joined board), Pace Capital, Upfront Ventures (seed lead, Series A participant)
Coding · Computer Use
site ↗
n/r

E2B

Infrastructure
E2B provides open-source, secure cloud sandboxes (built on Firecracker microVMs) for running AI-generated code and AI agents, offered as a hosted API with BYOC/on-prem/self-hosted options. It positions as execution infrastructure for enterprise AI agents and self-claims broad Fortune 100 adoption.
Raised: $32M · Headcount: 11-50 · Founded: 2023 · SOC 2: Type II · Researchers: no · Open src: yes
Backers: Insight Partners (Series A lead), Decibel (seed lead), Sunflower Capital
Coding · Computer Use
site ↗
n/r

Runloop

Infrastructure
Runloop sells cloud-hosted, isolated micro-VM 'devboxes' plus blueprints, snapshots and benchmark/eval tooling that give AI coding agents a secure execution environment for development, evaluation, and reinforcement/supervised fine-tuning (RFT/SFT) loops. It is execution infrastructure for agent builders and model labs rather than an RL-data/environments vendor itself.
Raised: $7M · Headcount: 11-50 · Founded: 2024 · SOC 2: claimed · Researchers: no · Open src: yes
Backers: The General Partnership (lead), Blank Ventures, Exponent Founders Capital
Coding
site ↗
n/r

General Reasoning

Open source
General Reasoning is an AI research lab (operating research hub in London; legal entity General Reasoning, Inc. registered in the US) building open RL environments and infrastructure for training and evaluating agents over long horizons. Its OpenReward platform and Open Reward Standard (ORS) provide an open specification for connecting language models to community-built RL environments, with 330+ environments accessible through one API.
Raised: $10.9M · Headcount: 11-50 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: yes
Pedigree: Ross Taylor (co-founder/CEO) - ex-Meta AI/FAIR, research lead on Galac
Coding · Long-Horizon
site ↗
n/r

Cua

Infrastructure
Cua (trycua, YC X25) is open-source MIT-licensed infrastructure for computer-use agents, providing cloud and self-hosted sandboxes across macOS, Windows, Linux, and Android plus an SDK, a virtualization layer (Lume), and a benchmarking/RL-eval suite (Cua-Bench). It positions itself as the 'Docker for computer-use agents,' giving any agent a cloud desktop.
Raised: $500K · Headcount: 1-10 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: Y Combinator (X25 batch)
Computer Use
site ↗
n/r

Good Start Labs

Open source
Good Start Labs is a 2025 Every spin-out that builds game-based environments to generate reinforcement-learning data and evaluate frontier models, using both custom games and partnerships with existing games where player behavior helps train and rank AI. It is known for AI Diplomacy / Diplomacy Arena (multi-agent long-horizon strategy) and LOL Arena (humor preference), and publishes openly on GitHub and Hugging Face.
Raised: $3.6M · Headcount: 1-10 · Founded: 2025 · SOC 2: unknown · Researchers: yes · Open src: yes
Backers: General Catalyst, Inovia Capital, Tirta Ventures
Long-Horizon
site ↗
n/r

Morph

Infrastructure
Morph (Morph Labs) provides snapshot-based VM compute for AI agents via its Infinibranch / Liquid Metal technology, which can snapshot, branch, and restore entire computational environments in roughly 100-250ms to enable massively parallel, reversible ('Git for compute') agent rollouts, evaluations, and reasoning-time branching. It markets the platform (Morph Cloud) as infrastructure for running and scaling agent/RL verification environments rather than as an RL-environment dataset vendor itself.
Raised: not listed · Headcount: 1-10 · Founded: 2023 · SOC 2: claimed · Researchers: yes · Open src: yes
Backers: Christian Szegedy (reported seed/angel investor; also Chief Scientist), amount undisclosed
Coding
site ↗
n/r

Turing

Incumbent
Turing is a large AGI-infrastructure and engineering-services company that supplies frontier AI labs with coding data, human expertise, and RL/evaluation data at scale, drawing on a global vetted developer and domain-expert network. Originally a remote-engineering talent marketplace, it now positions itself around 'AGI advancement' through code and reasoning data, including work over private/real codebases.
Raised: not listed · Headcount: 200+ · Founded: 2018 · SOC 2: unknown · Researchers: yes · Open src: no
HQ: Palo Alto, USA
Coding · Enterprise Workflows · Private Codebases
site ↗
| # | Vendor | Segment | Raised | Headcount | SOC 2 | HQ | Conf. |
| #1 | Mechanize | Commercial | $9.1M | 11-50 | unknown | San Francisco, USA | |
| #2 | AfterQuery | Commercial | $30.5M | 51-200 | unknown | San Francisco, USA | |
| #3 | Deeptune | Commercial | $43M | 11-50 | unknown | New York, NY, USA | |
| #4 | Bespoke Labs | Commercial | $7.25M | 11-50 | unknown | Mountain View, California, USA | |
| #5 | Huzzle Labs | Commercial | $6M | 11-50 | Type II | London, United Kingdom | |
| #6 | Fleet AI | Commercial | $15M | 11-50 | unknown | New York, NY, USA | |
| #7 | Datacurve | Commercial | $17.7M | 11-50 | unknown | San Francisco, USA | |
| #8 | Proximal | Commercial | | 11-50 | unknown | San Francisco, CA, USA | |
| #9 | Gray Swan AI | Commercial | $40M | 11-50 | Type II | Pittsburgh, Pennsylvania, USA | |
| #10 | Veris AI | Commercial | $8.5M | 1-10 | claimed | San Francisco, CA, USA | |
| #11 | Chakra Labs | Commercial | $10.1M | 11-50 | unknown | Brooklyn, New York, USA | |
| #12 | Andon Labs | Commercial | $500K | 11-50 | unknown | San Francisco, USA | |
| #13 | Sepal AI | Commercial | $500K | | unknown | San Francisco, USA | |
| #14 | HUD | Commercial | | 11-50 | unknown | San Francisco, USA | |
| #15 | Vals AI | Commercial | | 11-50 | claimed | San Francisco, USA | |
| #16 | Halluminate | Commercial | $160K | 1-10 | unknown | San Francisco, CA, USA | |
| #17 | Matrices | Commercial | $5M | 11-50 | unknown | San Francisco, California, USA | |
| #18 | BenchFlow | Commercial | $1.0M | 1-10 | unknown | New Castle, DE, USA (incorporation); Bay Area / San Francisco operating presence | |
| #19 | Collinear | Commercial | | 11-50 | unknown | Mountain View / Sunnyvale, California, USA | |
| #20 | Refresh | Commercial | | 1-10 | unknown | San Francisco, CA, USA | |
| #21 | Vmax | Commercial | | 1-10 | unknown | San Francisco, USA | |
| #22 | Andromede | Commercial | | 1-10 | unknown | Lausanne, Switzerland | |
| #23 | Plato | Commercial | | 1-10 | unknown | San Francisco, CA, USA | |
| #24 | AIChamp | Commercial | $0 | 1-10 | unknown | San Francisco, USA (CEO-based; reported, not officially confirmed; Tracxn alternatively lists Bali, Indonesia) | |
| #25 | Habitat Inc | Commercial | | 1-10 | unknown | New York, NY, USA | |
| n/r | Scale AI | Incumbent | $1.6B | 200+ | Type II | San Francisco, California, USA | |
| n/r | Modal | Infrastructure | $466M | 51-200 | Type II | New York, NY, USA | |
| n/r | Mercor | Incumbent | $492M | 51-200 | unknown | San Francisco, CA, USA (181 Fremont) | |
| n/r | Surge AI | Incumbent | $1B | 51-200 | unknown | San Francisco, California, USA | |
| n/r | Prime Intellect | Open source | $70.4M | 11-50 | unknown | San Francisco, USA | |
| n/r | Daytona | Infrastructure | $31M | 11-50 | Type I | New York, NY, United States | |
| n/r | E2B | Infrastructure | $32M | 11-50 | Type II | San Francisco, USA | |
| n/r | Runloop | Infrastructure | $7M | 11-50 | claimed | San Francisco, CA, USA | |
| n/r | General Reasoning | Open source | $10.9M | 11-50 | unknown | London, United Kingdom (Shoreditch), operating research hub; legal entity General Reasoning, Inc. registered in San Francisco/US per SEC Form D | |
| n/r | Cua | Infrastructure | $500K | 1-10 | unknown | San Francisco, CA, USA | |
| n/r | Good Start Labs | Open source | $3.6M | 1-10 | unknown | Brooklyn, NY, USA | |
| n/r | Morph | Infrastructure | | 1-10 | claimed | San Francisco, USA | |
| n/r | Turing | Incumbent | | 200+ | unknown | Palo Alto, USA | |
The ranking · 2026

The Top 25 RL Environment Companies in 2026

The reinforcement-learning environment market went from a footnote to a procurement line item in under two years. This ranks the 25 dedicated, pure-play RL-environment vendors by the RL List score, a transparent, documented blend of scale and traction (funding, customers), security and research signals, and how much of each company’s record we could independently verify. It is a formula, not an opinion. Incumbents (Scale AI, Surge AI, Mercor), execution-infrastructure providers, and open-source projects are a different category and are listed separately below rather than ranked. Every figure links to its source.

Ranked by the RL List score · last updated 2026-06-07 · see how it’s computed in the methodology
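To make the idea of a "documented blend" concrete, here is a minimal sketch of a weighted composite score. The actual RL List formula lives on the methodology page and is not reproduced here: the weights, the field names, and the 0-to-1 normalization below are all invented for illustration and should not be read as the directory's real scoring.

```python
from dataclasses import dataclass

# Hypothetical weights, invented for this sketch. They only illustrate how
# several normalized signals can be blended into a single ranking score.
WEIGHTS = {
    "scale": 0.4,          # funding and customer traction
    "security": 0.2,       # e.g. SOC 2 status
    "research": 0.2,       # research depth signals
    "verifiability": 0.2,  # share of the record that could be verified
}


@dataclass
class VendorSignals:
    name: str
    scale: float          # every signal is assumed to be normalized to [0, 1]
    security: float
    research: float
    verifiability: float


def score(vendor: VendorSignals) -> float:
    """Weighted sum of the normalized signals, scaled to 0-100."""
    total = sum(weight * getattr(vendor, key) for key, weight in WEIGHTS.items())
    return round(100 * total, 1)


def rank(vendors: list[VendorSignals]) -> list[VendorSignals]:
    """Sort vendors from highest to lowest score."""
    return sorted(vendors, key=score, reverse=True)


if __name__ == "__main__":
    # Placeholder vendors, not real directory entries.
    sample = [
        VendorSignals("Vendor A", 0.5, 0.5, 0.5, 0.5),
        VendorSignals("Vendor B", 1.0, 0.0, 0.0, 0.0),
    ]
    for v in rank(sample):
        print(v.name, score(v))
```

Because the output depends only on the inputs and the fixed weights, two readers given the same signals get the same ordering, which is the sense in which a formula-based ranking differs from an editorial one.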

1

Mechanize

Commercial

Mechanize is a small, elite San Francisco vendor (founded April 2025 by ex-Epoch AI researchers Matthew Barnett, Tamay Besiroglu, and Ege Erdil) that builds a small number of robust, high-fidelity RL environments and evals for frontier coding agents, selling to leading AI labs. Its stated long-term mission is the full automation of valuable economic work via simulated 'digital office' environments.

Backed by Nat Friedman, Daniel Gross, Patrick Collison, Dwarkesh Patel. Team pedigree: Founders Matthew Barnett, Tamay Besiroglu, and Ege Erdil are co-founders/alumni of Epoch AI (an AI research institute Besiroglu co-founded in 2022). Named customers include Anthropic (self-claimed, incl. frontier-lab ties).

$9.1M raised · Founded 2025 · San Francisco, USA · 11-50 staff · SOC 2: unknown

Best fit: A frontier lab seeking a small set of deep, hard software-engineering RL environments and evals built by elite engineers rather than high-volume crowdsourced data.

2

AfterQuery

Commercial

AfterQuery is a San Francisco applied-research lab and data platform (YC W25) that supplies frontier AI labs with expert-generated human data (SFT, RL rubrics), agent/RL environments, and computer-use trajectories, drawn from a large network of verified practitioners. It publishes real-task benchmarks such as Terminal-Bench, VADER, FinanceQA, and IDE-Bench, positioning around capturing how domain experts (engineers, financial analysts, lawyers) reason.

Backed by Altos Ventures (lead, Series A), The Raine Group, Y Combinator, BoxGroup. Team pedigree: Founders: Spencer Mateega (ex-Meta, ex-Google, Morgan Stanley/Silver Lake; Wharton/Penn), Carlos Georgescu (ex-Citadel Securities, ex-Meta, ex-Google), Danny Tang; Founding team / network cites prior roles at Goldman Sachs, McKinsey, Jane Street, Palantir, NVIDIA, Google. Named customers include Frontier AI labs (unnamed; company claims 'every leading AI lab' is a customer; press separately names OpenAI, Anthropic, Google as labs it serves, but without third-party confirmation) (self-claimed, incl. frontier-lab ties).

$30.5M raised · Founded 2025 · San Francisco, USA · 51-200 staff · SOC 2: unknown

Best fit: A frontier or enterprise AI team needing expert-authored RL environments, post-training data, and realistic real-task benchmarks across code, finance, and professional workflows.

3

Deeptune

Commercial

Deeptune is a New York-based startup building managed reinforcement-learning environments ('training gyms') for computer-use and code, where AI agents practice and are evaluated on realistic digital knowledge-work tasks (simulating tools like Slack and Salesforce). It sells these pre-built environments primarily to frontier AI labs and raised a $43M Series A led by a16z, announced March 2026.

Backed by Andreessen Horowitz (a16z, lead), 776, Abstract Ventures, Inspired Capital. Team pedigree: Team includes engineers/operators from Anthropic, Scale AI, Palantir, Hebbia, Glean, Retool, Modal (per company/press); CEO/co-founder Tim Lupo: ex-Hebbia founding engineer, USC CS & Business. Named customers include Leading/frontier AI labs (unnamed; company claims '100s of gyms' built for them and contributions to recent computer-use advances) (self-claimed, incl. frontier-lab ties).

$43M raised · Founded 2025 · New York, NY, USA · 11-50 staff · SOC 2: unknown

Best fit: Frontier labs and model teams needing ready-made, managed RL environments for training/evaluating computer-use and coding agents.

4

Bespoke Labs

Commercial

Bespoke Labs is an applied AI research lab (Mountain View, CA, founded 2024) focused on data curation and RL-environment curation for training and evaluating agents, known for open datasets and reproducible recipes (OpenThoughts) and open-source tools (Curator, Evalchemy). It pairs a public open-source/open-data presence with commercial custom data and RL-environment delivery.

Team pedigree: Co-founder/CEO Maheswaran (Mahesh) Sathiamoorthy, ex-Google DeepMind; Co-founder/Chief Scientist Georgios (Alex) Dimakis, Professor, UC Berkeley (formerly UT Austin). Named customers include Fortune 500 enterprises (unnamed), Frontier labs / top labs (unnamed), Model builders using OpenThoughts datasets (190+ public HF models; unnamed) (self-claimed, incl. frontier-lab ties).

$7.25M raised · Founded 2024 · Mountain View, California, USA · 11-50 staff · SOC 2: unknown

Best fit: Buyers needing reasoning-focused data curation, open reproducible datasets/recipes, and custom RL-environment/eval data delivery from a research-led team.

5

Huzzle Labs

Commercial

Huzzle Labs is the AI division of London-based talent platform Huzzle (founded ~2020 by Ingmar Klein, Parham Rakhshanfar, and Amit Choudhary). It positions itself as a human-intelligence data foundry that builds RL environments (code, tool-use, computer-use, long-horizon enterprise workflows), expert trajectory data, and contextual evaluations for frontier AI labs and regulated European enterprises, leveraging Huzzle's vetted PhD/expert network. It bundles environments, human data, and evals in one stack.

Backed by 10x Founders, Angel Invest, Emerge, Former CTO of Hugging Face (angel). Team pedigree: RL engineer, ex-Turing; Researchers from IIT Kharagpur and IIT Bombay (incl. PhD). Named customers include Apple, Lazard, Financial Times (self-claimed).

$6M raised · Founded 2020 · London, United Kingdom · 11-50 staff · SOC 2: Type II

Best fit: Frontier labs and regulated enterprises needing custom RL environments plus expert human trajectory data and evals for code, computer-use, and long-horizon professional workflows.

6

Fleet AI

Commercial

Fleet AI builds high-fidelity reinforcement-learning training environments ('gyms') that replicate enterprise software such as Salesforce and Excel, plus browser/desktop workflows, so frontier AI labs and large enterprises can train and evaluate computer-use agents. It ships a Python SDK, a platform API, and the open-source 'Harbor' agent-evaluation/RL-environment tooling, pairing simulated environments with human supervision.

Backed by Sequoia Capital, Menlo Ventures, SV Angel, Bain Capital Ventures (reported prospective lead of an in-talks round; not confirmed closed). Team pedigree: Team self-describes prior experience at Anthropic, xAI, Meta Superintelligence, Essential AI, Contextual AI, Mercor, Docker, Citadel, Jane Street, and Cruise; Founder/CEO Nicolai (Nic) Ouporov: ex-founding engineer at Respell (acquired by Salesforce, Jan 2024); prior research at Stanford and Columbia per personal site.

$15M raised · Founded 2024 · New York, NY, USA · 11-50 staff · SOC 2: unknown

Best fit: A frontier lab or large enterprise that needs bespoke, high-fidelity RL environments simulating real enterprise software (CRM, spreadsheets, browser/desktop) to train and evaluate computer-use agents.

7

Datacurve

Commercial

Datacurve is a YC W24 commercial data vendor that supplies expert-curated frontier coding data, RLHF traces, and repository-wide reinforcement learning environments (with unit-test verifiers) to foundation model labs, sourced via its Shipd bounty platform of vetted software engineers. It also publishes DeepSWE, a long-horizon agentic coding benchmark.

Backed by Chemistry (Mark Goldberg, lead Series A), Y Combinator, Balaji Srinivasan (seed), angel investors who are employees of DeepMind, Vercel, Anthropic and OpenAI (individuals, not the companies). Team pedigree: Serena Ge (co-founder/CEO): worked on LLM reasoning during a co-op at Cohere; University of Waterloo CS; Forbes 30 Under 30; Charley Lee (co-founder): University of Waterloo CS; AI research background.

$17.7M raised · Founded 2024 · San Francisco, USA · 11-50 staff · SOC 2: unknown

Best fit: Frontier/foundation model labs needing expert-sourced coding SFT/RLHF data and code-execution RL environments with verifiable rewards (code execution, tight loops).

8

Proximal

Commercial

Proximal is a San Francisco-based (with a Bangalore presence) research lab for coding data, building high-fidelity, long-horizon reinforcement learning environments grounded in real codebases to train and evaluate frontier coding agents. It emphasizes scalable, software-driven data engines over human contractors, and research into reward-hacking detection and 'fuzzy verifiers' that score code quality beyond functional correctness.

Backed by Scribble Ventures (lead), Angels from OpenAI, Anthropic, Thinking Machines, Google DeepMind, xAI, Meta Superintelligence, Cursor and Cognition (per founders' own statements; not independently verified). Team pedigree: Justus Mattern (co-founder) - led RL research & data at Prime Intellect, core contributor to its RL training framework (Intellect-2); co-founded Revideo (YC S23); early engineer at Dynamo AI (confirmed via justusmattern.com); Calvin Chen (co-founder) - works on Proximal; part of a 'second-time exited' founding team (specifics of any prior company exit, ARR or sale amount NOT corroborated by his own site).

Founded 2026 · San Francisco, CA, USA · 11-50 staff · SOC 2: unknown

Best fit: Frontier labs or AI startups needing long-horizon, real-codebase RL environments and quality-aware (fuzzy) verifiers to post-train coding agents.

9

Gray Swan AI

Commercial

Gray Swan AI is a Pittsburgh-based AI security company spun out of Carnegie Mellon, offering adversarial red-teaming and runtime protection for AI models and agents via three products: Arena (a crowdsourced adversarial red-teaming network of 15,000+ researchers), Shade (automated red-teaming/pressure-testing), and Cygnal (runtime input/output guardrails). It positions itself as a security/evaluation partner to frontier labs and enterprises rather than a general RL-environment vendor.

Backed by Wing Venture Capital (co-lead), Madrona (co-lead), Obvious Ventures, Snowflake Ventures. Team pedigree: Zico Kolter (Co-founder, Chief Scientist) - CMU professor, AI safety/robustness researcher, OpenAI board member; Matt Fredrikson (Co-founder, CEO) - CMU faculty, adversarial ML researcher. Named customers include Anthropic, OpenAI, Meta, Google DeepMind, xAI, Amazon, Snowflake, ByteDance, ElevenLabs, Intercom, Deloitte, UK AI Security Institute (AISI), Anaconda, OpenHands, AIUC (verified, incl. frontier-lab ties).

$40M raised · Founded 2023 · Pittsburgh, Pennsylvania, USA · 11-50 staff · SOC 2: Type II

Best fit: Buyers needing adversarial evaluation, red-teaming arenas, and runtime guardrails for frontier or enterprise LLM/agent deployments.

10

Veris AI

Commercial

Veris AI sells a high-fidelity simulation platform plus a production runtime that let enterprises train, evaluate, and govern AI agents against mocked enterprise tools before and during production, with support for reinforcement learning / fine-tuning pipelines. It positions itself as the enterprise 'environment layer' that agent builders lack.

Backed by Decibel Ventures (lead), Acrew Capital (lead), The House Fund, Ian Livingstone. Team pedigree: CEO Mehdi Jamei: PhD EECS UC Berkeley; previously led agentic AI at System and Workmate; CTO Andi Partovi: PhD (brain-computer interfaces) University of Melbourne; ex-Solutions Architect at Google; ex-founder/CTO KeyLead Health. Named customers include Consumer fintech company (unnamed) - compliant chatbots, HR tech / executive-assistant agent company (unnamed), Manufacturer - supply chain agent (unnamed) (self-claimed).

$8.5M raised · Founded 2025 · San Francisco, CA, USA · 1-10 staff · SOC 2: claimed

Best fit: Enterprise teams building agents for messy multi-step internal workflows who need safe simulated environments to evaluate, train (RL/RFT), and govern those agents before production.

11

Chakra Labs

Commercial

Chakra Labs runs Dojo, an open/collaborative reinforcement-learning environment hub for computer-use agents, offering deterministic, frame-accurate clones of production software plus human computer-use trajectory datasets, with native support for the Harbor, Verifiers and Verl RL frameworks. It positions itself as bringing frontier-lab-grade CUA training infrastructure to the broader research community.

Team pedigree: Alexander Fung (co-founder), ex-Palantir, Snap/Snapchat, Fin; Computer Science, University of Waterloo (per LinkedIn/search snippets); Nirmal Krishnan (co-founder), Computer Science & ML, Johns Hopkins; prior data/early-stage startup experience (per LinkedIn/search snippets).

$10.1M raised · Founded 2024 · Brooklyn, New York, USA · 11-50 staff · SOC 2: unknown

Best fit: Teams training or evaluating computer-use / GUI agents that need ready-made, deterministic clones of production software environments plus human trajectory data.

12

Andon Labs

Commercial

Andon Labs is a Y Combinator-backed (W24) startup, formerly Vectorview, building benchmarks and evaluations for AI agents' long-horizon coherence and safety (Vending-Bench, Butter-Bench, Blueprint-Bench) and operating real-world autonomous AI businesses. It is known for high-profile collaborations placing AI-run vending machines/stores in the offices of frontier labs Anthropic (Project Vend) and xAI (Grokbox).

Backed by Y Combinator (W24). Team pedigree: Lukas Petersson (CEO, co-founder), previously co-founded Vectorview; Axel Backlund (CTO, co-founder). Named customers include Anthropic, xAI (verified, incl. frontier-lab ties).

$500K raised · Founded 2023 · San Francisco, USA · 11-50 staff · SOC 2: unknown

Best fit: Buyers wanting long-horizon agent coherence/safety benchmarks and real-world autonomous-operation stress tests, with a frontier-lab-adjacent, irreverent eval style.

13

Sepal AI

Commercial

Sepal AI is a YC-backed (S24) San Francisco data-research company that builds high-quality training data, expert-graded evaluation benchmarks, and reinforcement-learning environments for frontier LLMs, drawing on a network of 20k+ domain experts (PhDs, finance, medical, STEM). It was acquired by Mercor in February 2026.

Backed by Y Combinator, Metaplanet Holdings, SID Venture Partners, Sterling Road. Team pedigree: Co-founders ex-Turing (built/scaled the LLM-trainer business; Robi Lin scaled trainers 50 to 800+; Kat Hu managed 500+ AI trainers); Co-founder Robi Lin formerly at Bain & Co.; co-founder Kat Hu former McKinsey consultant. Named customers include Top AI research labs (unnamed; frontier-lab ties referenced in Mercor acquisition rationale), Multiple Fortune 500 companies (unnamed), HUD (co-builder/collaboration partner on SheetBench-50, NOT a customer) (verified, incl. frontier-lab ties).

$500K raised · Founded 2024 · San Francisco, USA · SOC 2: unknown

Best fit: Buyers needing expert-validated evaluation environments and RL/training data for complex domains (notably finance/spreadsheet analyst workflows and advanced science). Note that the team is now part of Mercor following the February 2026 acquisition.

14

HUD

Commercial

HUD (YC W25, formerly hud.so) is a platform for building reinforcement-learning environments and evaluations for computer-use and browser agents. It lets teams wrap real software/code as agent-callable tools in isolated containers, define tasks and rewards, and run evals/RL at scale via an open-source SDK plus a cloud-hosted gateway. It maintains public benchmarks (OSWorld-Verified contributions, SheetBench-50) and positions frontier AI labs and agent-first startups as its target customers.

Backed by Y Combinator (W25 batch), Exceptional Capital. Team pedigree: Jay Ram (CEO) - consumer apps, ML/quant research; Lorenss Martinsons (CPO) - Cognitive Science, Yale. Named customers include DoorDash, UiPath, Sharpe, OpenAI, Anthropic (self-claimed, incl. frontier-lab ties).

Founded 2025 · San Francisco, USA · 11-50 staff · SOC 2: unknown

Best fit: Teams that need to benchmark or RL-train computer-use/browser agents against real-software tasks with reproducible, containerized environments.

15

Vals AI

Commercial

Vals AI is an independent, third-party benchmarking and evaluation platform that scores LLMs and AI applications (copilots, RAG, agents) on rigorous, domain-specific tasks in regulated fields such as legal, finance, healthcare, tax and coding. It publishes public leaderboards (e.g., the Vals Index, Finance Agent benchmark, Vals Legal AI Report) and sells private evaluation infrastructure to labs and enterprise engineering teams.

Team pedigree: Co-founder/CEO Rayan Krishnan - ex-Stanford AI master's; Co-founder/CTO Langston Nashold - ex-Stanford AI master's. Named customers include Reed Smith, Fisher Phillips, McDermott Will & Emery, and Ogletree Deakins (law firms; VLAIR benchmarking consortium partners, not paying customers); Harvey, CoCounsel/Thomson Reuters, and Alexi (legal AI vendors evaluated in VLAIR; not stated customers) (self-claimed).

Founded 2023 · San Francisco, USA · 11-50 staff · SOC 2: claimed

Best fit: Buyers who need neutral, domain-specific (legal/finance/healthcare) benchmarking and ongoing evaluation of LLM applications on their own data and tasks.

16

Halluminate

Commercial

Halluminate (YC S25, founded 2024, San Francisco) builds managed reinforcement-learning sandbox environments, simulated applications, and human/annotation data plus evaluation benchmarks (WebBench, BrowserBench, Westworld) to train and test computer-use and browser AI agents. Its 2026 site positioning has narrowed toward 'RL environments for financial services' (investment banking, private equity, consulting).

Backed by Y Combinator (S25), Orange Collective, Antigravity Capital, Batch Ventures. Team pedigree: Jerry Wu (co-founder/CEO): ex-Capital One Labs (led product and research; launched an early AI agent in banking); Cornell CS & Economics; Wyatt Marshall (co-founder): Cornell Milstein Scholar; large-scale data engineering at two early-stage NYC startups. Named customers include Leading computer-use model labs (unnamed), The two largest browser agent companies (unnamed), Frontier labs e.g. OpenAI, Anthropic (per company copy, unnamed/unconfirmed) (self-claimed, incl. frontier-lab ties).

$160K raised · Founded 2024 · San Francisco, CA, USA · 1-10 staff · SOC 2: unknown

Best fit: Foundation-model labs and browser/computer-use agent teams needing deterministic, managed RL sandboxes plus expert eval/annotation data, increasingly for finance workflows.

17

Matrices

Commercial

Matrices builds reinforcement-learning training environments for frontier AI labs to train agents that use computers and browsers like humans, described as a 'gamified replica of the internet' where thousands of agents learn via RL. The company frames its mission as 'towards self-driving computers' and says it helps labs train computer-use agents (Operator-class systems). Note: this is the correct browser-native entity (matrices.ai / LinkedIn 'matricesapp'), distinct from the similarly named 'Matrice.ai' computer-vision company and 'Matrix AI Network' blockchain project.

Backed by Index Ventures, AI Grant (Nat Friedman & Daniel Gross), Naval Ravikant. Team pedigree: Co-founder Leonardo Axel Setyanto (Co-Founder/CTO): UT Austin; prior startup engineering (Loku), no frontier-lab pedigree found; Co-founder John Qian: University of Illinois Urbana-Champaign. Named customers include Unnamed frontier AI labs (described as signing 7-figure contracts; agents like OpenAI 'Operator' referenced as the type they help train) (self-claimed).

$5M raised · Founded 2023 · San Francisco, California, USA · 11-50 staff · SOC 2: unknown

Best fit: A frontier lab needing large-scale, realistic browser/computer-use RL environments to train and evaluate web-navigating agents.

18

BenchFlow

Commercial

BenchFlow is an early-stage, YC-backed open-source 'environment lab' building evaluation infrastructure and a community Benchmark Hub for AI agents, with products including SkillsBench, ClawsBench (mock workplace environments) and a sandboxed agent runtime. It positions environments as 'the new data' for training and evaluating agents across domains like enterprise workflows, coding, computer use and browser tasks.

Backed by Y Combinator, Pear VC, Construct Capital, FAST by GETTYLAB. Team pedigree: Xiangyi Li (founder/CEO), creator of SkillsBench; prior engineering roles per founder interview; Moritz Wallawitsch, early co-founder, reported departure ~Feb 2025.

$1.0M raised · Founded 2024 · New Castle, DE, USA (incorporation); Bay Area / San Francisco operating presence · 1-10 staff · SOC 2: unknown

Best fit: Teams needing open-source, reproducible agent evaluation environments and a runtime to benchmark coding/computer-use/workplace agents at low setup cost.

19

Collinear

Commercial

Collinear AI operates a 'Simulation Lab' (SimLab) that builds sandboxed, stateful RL environments simulating enterprise users, tools (Jira, ServiceNow, Shopify, EMR, airline/hotel systems) and multi-step workflows, producing training-ready trajectories, reward signals and evals for agentic models. It also offers synthetic post-training data and LLM-judge evaluation, positioning itself around 'environment-as-a-service' for enterprise long-horizon agents.

Backed by Engineering Capital, Firestreak Ventures, 112 Capital (11.2 Capital). Team pedigree: Founder/CEO Nazneen Rajani: ex-Robustness Research Lead at Hugging Face, ex-Research Scientist at Salesforce, PhD University of Texas at Austin (MIT TR Innovators Under 35); Team described as researchers/engineers from Hugging Face, Salesforce, Google, Amazon, Stanford (per company About page). Named customers include Amazon, ServiceNow, Kore.ai, Matillion, MasterClass, Zoho, HUMAIN, Commonwealth Bank, LaHaus, ParseAI (self-claimed).

Founded 2023 · Mountain View / Sunnyvale, California, USA · 11-50 staff · SOC 2: unknown

Best fit: Buyers training or evaluating enterprise agents that need realistic, stateful long-horizon simulated workflows (IT support, customer service, finance, HR) with verifiable rewards.

20

Refresh

Commercial

Refresh (YC X25) builds simulation engines / RL environments with verifiable rewards for coding and computer use, partnering with frontier labs and enterprises to train AI software-engineering and computer-use 'coworker' capabilities across terminal and GUI.

Backed by Y Combinator. Team pedigree: Christopher Settles (CEO), ex-Uber AI ML tech lead; CS degree from UIUC; Erik Quintanilla (CTO), ex-Capital One, ex-Amazon (computer vision / data scraping). Named customers include Frontier AI labs (unnamed) (self-claimed, incl. frontier-lab ties).

Founded 2025 · San Francisco, CA, USA · 1-10 staff · SOC 2: unknown

Best fit: Frontier labs needing custom RL training environments and datasets for software-engineering and computer-use agent capabilities.

21

Vmax

Commercial

Vmax is a San Francisco reinforcement-learning startup (founded 2025 by three RL/robotics PhDs from UCL and UPenn) that automates the conversion of proprietary data and evals into RL environments for LLM-based agents, targeting long-horizon and coding tasks. Its public research includes unix-ctf, a procedural generator of capture-the-flag tasks for Unix/shell competence.

Backed by Race Capital, South Park Commons. Team pedigree: Matthew Sargent (co-founder), RL PhD, University College London (2019-2024); Augustine Mavor-Parker (CTO), RL PhD, University College London; previously Redwood Research (AI safety), Cold Spring Harbor Laboratory (NeuroAI), Illumina (AI for genomics). Named customers include Martian / ARES team (withmartian), a partnership jointly releasing ~1k JavaScript coding tasks in the Harbor format (Harbor = Terminal-Bench task format) (self-claimed).

Founded 2025 · San Francisco, USA · 1-10 staff · SOC 2: unknown

Best fit: Teams needing custom, research-grade RL environments to train coding and long-horizon shell/terminal agents from proprietary data.

22

Andromede

Commercial

Andromede is an early-stage RL data lab that programmatically generates RL environments, tasks, and verifiers from real-world data for post-training and evaluation of frontier agents, with an emphasis on long-horizon sequential reasoning tasks. As of mid-2026 it is in private beta, working with a small set of partners. It was co-founded by Guillaume Allegre (Founder & President) and Alexandre Sallinen (an EPFL-affiliated researcher who contributed to the Meditron medical-LLM project), and is backed by Unusual Ventures.

Backed by Unusual Ventures. Team pedigree: Alexandre Sallinen (co-founder) - EPFL; contributor to the Meditron open-source medical LLM project; RL/LLM research background; Guillaume Allegre (co-founder & president) - ex-BCG X; MIT (Machine Learning & Operations Research), engineering/applied mathematics.

Founded 2025 · Lausanne, Switzerland · 1-10 staff · SOC 2: unknown

Best fit: Buyers needing custom RL environments and verifiers derived from real-world data for post-training/evaluating long-horizon agentic models.

23

Plato

Commercial

Plato (plato.so, Plato Technologies, Inc.) builds simulated worlds for training and evaluating browser and computer-use agents, recreating real websites/software (e.g. Amazon/Airbnb/Gmail-style replicas) as reinforcement-learning environments with structured APIs for interaction, state tracking and scoring. It also offers a 'Computer Use' capability driving a full Linux desktop, positioning at the intersection of browser interaction and enterprise workflow simulation.

Team pedigree: Pranav Putta (Co-founder/CTO), prior MultiOn, Georgia Institute of Technology, Tonic.ai; Robert Farlow (Co-founder/CEO).

Founded 2025 · San Francisco, CA, USA · 1-10 staff · SOC 2: unknown

Best fit: AI labs/teams needing high-fidelity replica web/enterprise environments to train and evaluate browser and computer-use agents via RL.

24

AIChamp

Commercial

AIChamp builds custom reinforcement-learning environments and 'Virtual Gym' simulations for training and evaluating tool-using AI agents on long-horizon, multi-step enterprise tasks, pairing engineered environments (agents operating in software like Slack, Notion, Linear) with domain experts who design and grade tasks (SFT/RLHF/process supervision). The company emphasizes deep industry authority and expert-sourced data, having pivoted from a remote-talent/hiring marketplace background.

Team pedigree: Self-claimed 'alumni of OpenAI and xAI team' (vendor site, unverified); CEO/founder Vol Goloshuk previously founded BrightestMinds lead-generation / sales-development agency (reported).

$0 raised · San Francisco, USA (CEO-based; reported, not officially confirmed; Tracxn alternatively lists Bali, Indonesia) · 1-10 staff · SOC 2: unknown

Best fit: Buyers needing expert-graded, long-horizon enterprise-workflow RL environments where agents operate inside real business tools (Slack, Notion, Linear).

25

Habitat Inc

Commercial

Habitat Inc is an early-stage commercial vendor (2-10 employees, New York HQ) building reinforcement-learning environments for white-collar / work automation, with stated focus on code and desktop-style (computer use) interaction tasks for training agentic AI models. It appears in third-party listings of RL-environment suppliers serving AI labs. No funding, customer, or certification information is publicly available.

Team pedigree: Maxim Enis (co-founder), Williams College '24; prior Ramp association per LinkedIn; co-author (with Mark Hopkins, Williams) of arXiv:2404.13813 'From LLM to NMT: Advancing Low-Resource Machine Translation with Claude' (2024, academic, predates company); Andrew Megalaa (co-founder), Williams College '24.

New York, NY, USA · 1-10 staff · SOC 2: unknown

Best fit: Buyers needing RL environments that simulate enterprise/desktop and coding workflows to post-train computer-use and coding agents.

Also tracked: incumbents, infrastructure & open source

13 companies we research with the same rigor but don’t rank, because they’re a different category: data-labeling incumbents moving into environments, execution-infrastructure providers, and open-source projects.

Incumbent
Scale AI
Scale AI is the data-labeling and AI-data incumbent that has extended into RL environments, offering simulated web apps, macOS/Windows-like desktop VMs, and MCP-tool environments (Slack, HubSpot, Linear) with expert-designed objectives, rubrics, and automated verifiers to train and evaluate agents on long-horizon professional workflows.
Infrastructure
Modal
Modal (Modal Labs) is a New York-based, Python-native serverless cloud purpose-built for AI/ML workloads, providing on-demand GPU/CPU compute, fast-booting sandboxed containers, inference, fine-tuning, and code execution.
Incumbent
Mercor
Mercor is a venture-backed expert-marketplace and AI-training-data company that organizes a network of ~30,000+ domain experts (doctors, lawyers, bankers, engineers) to produce RLHF data, evaluations, and reinforcement-learning environments for frontier AI labs and enterprises.
Incumbent
Surge AI
Surge AI is a bootstrapped, high-revenue human-data and RLHF labeling leader serving frontier AI labs, which has expanded into agentic RL environments via its EnterpriseBench suite (notably the CoreCraft enterprise customer-support simulation) and accompanying published benchmarks.
Open source
Prime Intellect
Prime Intellect operates an open-source RL stack - the Environments Hub (2,500+ community RL environments), the Verifiers library and prime-rl training framework, plus hosted RL post-training (Lab), evals, inference and on-demand GPU compute.
Infrastructure
Daytona
Daytona provides secure, elastic, programmatic sandboxes ('computers') that AI agents and developers can spin up in under ~90ms to run untrusted AI-generated code in isolated, stateful runtimes.
Infrastructure
E2B
E2B provides open-source, secure cloud sandboxes (built on Firecracker microVMs) for running AI-generated code and AI agents, offered as a hosted API with BYOC/on-prem/self-hosted options.
Infrastructure
Runloop
Runloop sells cloud-hosted, isolated micro-VM 'devboxes' plus blueprints, snapshots and benchmark/eval tooling that give AI coding agents a secure execution environment for development, evaluation, and reinforcement/supervised fine-tuning (RFT/SFT) loops.
Open source
General Reasoning
General Reasoning is an AI research lab (operating research hub in London; legal entity General Reasoning, Inc.
Infrastructure
Cua
Cua (trycua, YC X25) is open-source MIT-licensed infrastructure for computer-use agents, providing cloud and self-hosted sandboxes across macOS, Windows, Linux, and Android plus an SDK, a virtualization layer (Lume), and a benchmarking/RL-eval suite (Cua-Bench).
Open source
Good Start Labs
Good Start Labs is a 2025 Every spin-out that builds game-based environments to generate reinforcement-learning data and evaluate frontier models, using both custom games and partnerships with existing games where player behavior helps train and rank AI.
Infrastructure
Morph
Morph (Morph Labs) provides snapshot-based VM compute for AI agents via its Infinibranch / Liquid Metal technology, which can snapshot, branch, and restore entire computational environments in roughly 100-250ms to enable massively parallel, reversible ('Git for compute') agent rollouts, evaluations, and reasoning-time branching.
Incumbent
Turing
Turing is a large AGI-infrastructure and engineering-services company that supplies frontier AI labs with coding data, human expertise, and RL/evaluation data at scale, drawing on a global vetted developer and domain-expert network.