What is an RL environment?

An RL environment is a simulated task or world in which an AI agent takes actions, receives a reward signal, and improves through reinforcement learning. For frontier AI, these are usually high-fidelity replicas of real software (a web browser, a coding repository, or an enterprise app like Salesforce or Slack) paired with verifiers that automatically score whether the agent completed the task. Labs use them to train and evaluate agents on realistic, long-horizon work rather than single-turn questions.
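As a rough illustration of the loop described above (act, observe, get scored by a verifier), here is a minimal Python sketch. Everything in it is hypothetical: `TicketEnv`, `verify`, `run_episode`, and the `reset`/`step` shape are assumptions made for illustration, not any vendor's API, and a real environment would replicate far richer software state.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TicketEnv:
    """Hypothetical toy environment: a fake help-desk app with one open ticket."""

    state: dict = field(default_factory=dict)

    def reset(self) -> dict:
        # Start every episode from the same known state.
        self.state = {"ticket_status": "open", "steps": 0}
        return dict(self.state)

    def step(self, action: str) -> tuple[dict, float, bool]:
        # Apply the agent's action to the simulated app.
        self.state["steps"] += 1
        if action == "close_ticket":
            self.state["ticket_status"] = "closed"
        # The episode ends on success or after a 5-step budget.
        done = self.state["ticket_status"] == "closed" or self.state["steps"] >= 5
        reward = verify(self.state) if done else 0.0
        return dict(self.state), reward, done


def verify(state: dict) -> float:
    """Verifier: scores the final app state, not what the agent says it did."""
    return 1.0 if state["ticket_status"] == "closed" else 0.0


def run_episode(env: TicketEnv, policy) -> float:
    """Roll out one episode; `policy` maps an observation to an action string."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total
```

A policy that always returns `"close_ticket"` earns the full reward in one step, while a policy that never closes the ticket runs out its step budget and scores zero. Checking the resulting state rather than the agent's transcript is the design choice that makes the score automatic.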

What are the best RL environment companies in 2026?

There is no single best vendor: the right choice depends on your use case (coding, computer use, or enterprise workflows), budget, security and compliance needs, and deployment model. RL List ranks the dedicated commercial vendors by a transparent RL List score that combines scale and traction, security and research signals, and how much of each company's record could be independently verified. As of the latest update, the highest-ranked pure-play vendors are Mechanize, AfterQuery, Bespoke Labs, Huzzle Labs, Fleet AI, and Datacurve. Data-labeling incumbents such as Scale AI, Surge AI, and Mercor are tracked separately because they are a different category. See the methodology page for exactly how the ranking is computed.

Which companies build RL environments for coding agents?

Vendors that build RL environments and data specifically for coding and software-engineering agents include Mechanize, AfterQuery, Datacurve, Proximal, Huzzle Labs, and Vmax, typically using real code repositories with unit-test or execution-based verifiers. Execution-infrastructure providers such as Runloop, Modal, Daytona, E2B, and Morph supply the sandboxes those coding agents run in, rather than the environments themselves.
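To make "execution-based verifier" concrete, the sketch below runs candidate code together with a set of assertions in a separate Python process and turns the exit code into a reward. The function name `execution_reward`, the `TESTS` string, and the binary 0/1 reward are illustrative assumptions. Production setups would run this inside an isolated sandbox of the kind the infrastructure providers above supply, not directly on the host.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def execution_reward(candidate_code: str, test_code: str, timeout_s: float = 10.0) -> float:
    """Return 1.0 if the tests pass against the candidate code, else 0.0."""
    with tempfile.TemporaryDirectory() as tmp:
        # Write the candidate and its tests into one throwaway script.
        script = Path(tmp) / "check.py"
        script.write_text(candidate_code + "\n\n" + test_code + "\n")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # a hung or too-slow solution counts as a failure
    # A failed assertion or any uncaught exception gives a non-zero exit code.
    return 1.0 if result.returncode == 0 else 0.0


# Hypothetical task: the agent must implement add(a, b).
TESTS = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"
```

Calling `execution_reward("def add(a, b):\n    return a + b", TESTS)` scores a correct solution, and swapping `+` for `-` makes the first assertion fail and the reward drop to zero.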

Which companies build RL environments for computer-use and browser agents?

Vendors focused on computer-use and browser agents, where an agent operates a real or simulated desktop, browser, or GUI, include Huzzle Labs, Fleet AI, Chakra Labs, HUD, Halluminate, Matrices, and Plato. Many recreate real websites and enterprise software (Salesforce, Slack, Excel) as reinforcement-learning environments with state tracking and automated scoring.

What is the difference between RL environments, evals, and human training data?

RL environments are the interactive tasks where an agent acts and is scored by a verifier. Evaluations (evals) and benchmarks measure how well a model or agent performs, often reusing those same environments. Human training data, including SFT and RLHF, is expert-generated demonstrations, trajectories, or preference labels used to train and align models. Many vendors offer more than one, and some bundle environments, human data, and evals in a single stack, which is why RL List tags each vendor by focus area rather than forcing one label.

How do I choose an RL environment vendor?

Start from your use case, then compare vendors on the public proxies RL List tracks: focus areas, scale and traction, research depth, security posture such as SOC 2, and how much of their record is independently verified versus self-claimed. The figures that actually decide a purchase (task and sample counts, unique environments, pass rates, difficulty splits, harness and data format, and pricing) are not on the public web and only surface in direct engagement. The practical next step is to request work samples from a shortlist of three to five vendors and let those decide.
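The first filtering pass described above can be sketched in a few lines of Python. The records below copy the rank, focus tags, and SOC 2 status of six vendors from this directory. The `Vendor` class, the `shortlist` function, and the rule that only "Type I" or "Type II" counts as audited are assumptions made for illustration, not part of the RL List methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vendor:
    rank: int
    name: str
    focus: frozenset
    soc2: str  # "Type II", "Type I", "claimed", or "unknown", as listed


# A few rows copied from the directory; extend with the rest as needed.
VENDORS = [
    Vendor(1, "Mechanize", frozenset({"Code"}), "unknown"),
    Vendor(2, "AfterQuery", frozenset({"Code", "Computer Use", "Enterprise"}), "unknown"),
    Vendor(4, "Huzzle Labs", frozenset({"Code", "Long Horizon"}), "Type II"),
    Vendor(5, "Fleet AI", frozenset({"Enterprise"}), "unknown"),
    Vendor(6, "Datacurve", frozenset({"Code"}), "unknown"),
    Vendor(10, "Chakra Labs", frozenset({"Computer Use"}), "unknown"),
]


def shortlist(vendors, focus, audited_soc2_only=False, limit=5):
    """Filter by focus tag (and optionally audited SOC 2), then keep the top ranks."""
    picks = [
        v
        for v in vendors
        if focus in v.focus
        and (not audited_soc2_only or v.soc2 in {"Type I", "Type II"})
    ]
    return [v.name for v in sorted(picks, key=lambda v: v.rank)[:limit]]
```

For example, `shortlist(VENDORS, "Code")` returns the coding-focused vendors in rank order, and adding `audited_soc2_only=True` narrows that to vendors with a listed Type I or Type II report. The output is only a starting list for requesting work samples, since the deciding figures are not public.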

RL Environment Vendors: 2026 Directory & Rankings

RL environment vendor directory

Mechanize

Commercial

Mechanize is a small San Francisco vendor (founded April 2025 by ex-Epoch AI researchers Matthew Barnett, Tamay Besiroglu, and Ege Erdil) that builds a deliberately limited number of robust, high-fidelity RL environments and evals for frontier coding agents and sells them to leading AI labs. Its stated long-term mission is the full automation of valuable economic work via simulated 'digital office' environments.

Raised

$9.1M ↑ $9.1M

Headcount

11-50

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

Backers: Nat Friedman, Daniel Gross, Patrick Collison

Code

site ↗

AfterQuery

Commercial

AfterQuery is a San Francisco applied-research lab and data platform (YC W25) that supplies frontier AI labs with expert-generated human data (SFT, RL rubrics), agent/RL environments, and computer-use trajectories, drawn from a large network of verified practitioners. It publishes real-task benchmarks such as Terminal-Bench, VADER, FinanceQA, and IDE-Bench, and positions itself around capturing how domain experts (engineers, financial analysts, lawyers) reason.

Raised

$30.5M ↑ $30M

Headcount

51-200

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Altos Ventures (lead, Series A), The Raine Group, Y Combinator

Code, Computer Use, Enterprise

site ↗

Bespoke Labs

Commercial

Bespoke Labs is an applied AI research lab (Mountain View, CA, founded 2024) focused on data curation and RL-environment curation for training and evaluating agents, known for open datasets and reproducible recipes (OpenThoughts) and open-source tools (Curator, Evalchemy). It pairs a public open-source/open-data presence with commercial custom data and RL-environment delivery.

Raised

$40M ↑ $40M

Headcount

11-50

Founded

2024

SOC 2

unknown

Researchers

yes

Open src

yes

Pedigree: Co-founder/CEO Maheswaran (Mahesh) Sathiamoorthy, ex-Google DeepMind

Long Horizon

site ↗

Huzzle Labs

Commercial

Huzzle Labs is the AI division of London-based talent platform Huzzle (founded ~2020 by Ingmar Klein, Parham Rakhshanfar, and Amit Choudhary). It positions itself as a human-intelligence data foundry that builds RL environments (code, tool-use, computer-use, long-horizon enterprise workflows), expert trajectory data, and contextual evaluations for frontier AI labs and regulated European enterprises, leveraging Huzzle's vetted PhD/expert network. It bundles environments, human data, and evals in one stack.

Raised

$6M

Headcount

11-50

Founded

2020

SOC 2

Type II

Researchers

yes

Open src

partial

Backers: 10x Founders, Angel Invest, Emerge

Code, Long Horizon

site ↗

Fleet AI

Commercial

Fleet AI builds high-fidelity reinforcement-learning training environments ('gyms') that replicate enterprise software such as Salesforce and Excel, plus browser/desktop workflows, so frontier AI labs and large enterprises can train and evaluate computer-use agents. It ships a Python SDK, a platform API, and the open-source 'Harbor' agent-evaluation/RL-environment tooling, pairing simulated environments with human supervision.

Raised

$15M

Headcount

11-50

Founded

2024

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Sequoia Capital, Menlo Ventures, SV Angel

Enterprise

site ↗

Datacurve

Commercial

Datacurve is a YC W24 commercial data vendor that supplies expert-curated frontier coding data, RLHF traces, and repository-wide reinforcement learning environments (with unit-test verifiers) to foundation model labs, sourced via its Shipd bounty platform of vetted software engineers. It also publishes DeepSWE, a long-horizon agentic coding benchmark.

Raised

$17.7M

Headcount

11-50

Founded

2024

SOC 2

unknown

Researchers

yes

Open src

Backers: Chemistry (Mark Goldberg, lead Series A), Y Combinator, Balaji Srinivasan (seed)

Code

site ↗

Proximal

Commercial

Proximal is a San Francisco-based (with a Bangalore presence) research lab for coding data, building high-fidelity, long-horizon reinforcement learning environments grounded in real codebases to train and evaluate frontier coding agents. It emphasizes scalable, software-driven data engines over human contractors, and research into reward-hacking detection and 'fuzzy verifiers' that score code quality beyond functional correctness.

Raised

–

Headcount

11-50

Founded

2026

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Scribble Ventures (lead), Angels from OpenAI, Anthropic, Thinking Machines, Google DeepMind, xAI, Meta Superintelligence, Cursor and Cognition (per founders' own statements; not independently verified)

Code, Long Horizon

site ↗

Gray Swan AI

Commercial

Gray Swan AI is a Pittsburgh-based AI security company spun out of Carnegie Mellon, offering adversarial red-teaming and runtime protection for AI models and agents via three products: Arena (a crowdsourced adversarial red-teaming network of 15,000+ researchers), Shade (automated red-teaming/pressure-testing), and Cygnal (runtime input/output guardrails). It positions itself as a security/evaluation partner to frontier labs and enterprises rather than a general RL-environment vendor.

Raised

$40M ↑ $40M

Headcount

11-50

Founded

2023

SOC 2

Type II

Researchers

yes

Open src

Backers: Wing Venture Capital (co-lead), Madrona (co-lead), Obvious Ventures

–

site ↗

Veris AI

Commercial

Veris AI sells a high-fidelity simulation platform plus a production runtime that let enterprises train, evaluate, and govern AI agents against mocked enterprise tools before and during production, with support for reinforcement learning / fine-tuning pipelines. It positions itself as the enterprise 'environment layer' that agent builders lack.

Raised

$8.5M

Headcount

1-10

Founded

2025

SOC 2

claimed

Researchers

yes

Open src

Backers: Decibel Ventures (lead), Acrew Capital (lead), The House Fund

Enterprise

site ↗

#10

Chakra Labs

Commercial

Chakra Labs runs Dojo, an open/collaborative reinforcement-learning environment hub for computer-use agents, offering deterministic, frame-accurate clones of production software plus human computer-use trajectory datasets, with native support for the Harbor, Verifiers and Verl RL frameworks. It positions itself as bringing frontier-lab-grade CUA training infrastructure to the broader research community.

Raised

$10.1M ↑ $10.1M

Headcount

11-50

Founded

2024

SOC 2

unknown

Researchers

yes

Open src

yes

Pedigree: Alexander Fung (co-founder), ex-Palantir, Snap/Snapchat, Fin; Compute

Computer Use

site ↗

#11

Andon Labs

Commercial

Andon Labs is a Y Combinator-backed (W24) startup, formerly Vectorview, building benchmarks and evaluations for AI agents' long-horizon coherence and safety (Vending-Bench, Butter-Bench, Blueprint-Bench) and operating real-world autonomous AI businesses. It is known for high-profile collaborations placing AI-run vending machines/stores in the offices of frontier labs Anthropic (Project Vend) and xAI (Grokbox).

Raised

$500K

Headcount

11-50

Founded

2023

SOC 2

unknown

Researchers

yes

Open src

Backers: Y Combinator (W24)

Computer Use, Long Horizon

site ↗

#12

HUD

Commercial

HUD (YC W25, formerly hud.so) is a platform for building reinforcement-learning environments and evaluations for computer-use and browser agents. It lets teams wrap real software/code as agent-callable tools in isolated containers, define tasks and rewards, and run evals/RL at scale via an open-source SDK plus a cloud-hosted gateway. It maintains public benchmarks (OSWorld-Verified contributions, SheetBench-50) and positions frontier AI labs and agent-first startups as its target customers.

Raised

–

Headcount

11-50

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Y Combinator (W25 batch), Exceptional Capital

Computer Use, Enterprise

site ↗

#13

Vals AI

Commercial

Vals AI is an independent, third-party benchmarking and evaluation platform that scores LLMs and AI applications (copilots, RAG, agents) on rigorous, domain-specific tasks in regulated fields such as legal, finance, healthcare, tax and coding. It publishes public leaderboards (e.g., the Vals Index, Finance Agent benchmark, Vals Legal AI Report) and sells private evaluation infrastructure to labs and enterprise engineering teams.

Raised

–

Headcount

11-50

Founded

2023

SOC 2

claimed

Researchers

yes

Open src

yes

Pedigree: Co-founder/CEO Rayan Krishnan - ex-Stanford AI master's

–

site ↗

#14

Halluminate

Commercial

Halluminate (YC S25, founded 2024, San Francisco) builds managed reinforcement-learning sandbox environments, simulated applications, and human/annotation data plus evaluation benchmarks (WebBench, BrowserBench, Westworld) to train and test computer-use and browser AI agents. Its 2026 site positioning has narrowed toward 'RL environments for financial services' (investment banking, private equity, consulting).

Raised

$160K

Headcount

1-10

Founded

2024

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Y Combinator (S25), Orange Collective, Antigravity Capital

Computer Use, Enterprise

site ↗

#15

Matrices

Commercial

Matrices builds reinforcement-learning training environments for frontier AI labs to train agents that use computers and browsers like humans, described as a 'gamified replica of the internet' where thousands of agents learn via RL. The company frames its mission as 'towards self-driving computers' and says it helps labs train computer-use agents (Operator-class systems). Note: this is the correct browser-native entity (matrices.ai / LinkedIn 'matricesapp'), distinct from the similarly named 'Matrice.ai' computer-vision company and 'Matrix AI Network' blockchain project.

Raised

$5M

Headcount

11-50

Founded

2023

SOC 2

unknown

Researchers

Open src

Backers: Index Ventures, AI Grant (Nat Friedman & Daniel Gross), Naval Ravikant

Computer Use

site ↗

#16

BenchFlow

Commercial

BenchFlow is an early-stage, YC-backed open-source 'environment lab' building evaluation infrastructure and a community Benchmark Hub for AI agents, with products including SkillsBench, ClawsBench (mock workplace environments) and a sandboxed agent runtime. It positions environments as 'the new data' for training and evaluating agents across domains like enterprise workflows, coding, computer use and browser tasks.

Raised

$1.0M

Headcount

1-10

Founded

2024

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Y Combinator, Pear VC, Construct Capital

Code, Computer Use, Enterprise

site ↗

#17

Collinear

Commercial

Collinear AI operates a 'Simulation Lab' (SimLab) that builds sandboxed, stateful RL environments simulating enterprise users, tools (Jira, ServiceNow, Shopify, EMR, airline/hotel systems) and multi-step workflows, producing training-ready trajectories, reward signals and evals for agentic models. It also offers synthetic post-training data and LLM-judge evaluation, positioning itself around 'environment-as-a-service' for enterprise long-horizon agents.

Raised

–

Headcount

11-50

Founded

2023

SOC 2

unknown

Researchers

yes

Open src

Backers: Engineering Capital, Firestreak Ventures, 112 Capital (11.2 Capital)

Code, Computer Use, Enterprise

site ↗

#18

Refresh

Commercial

Refresh (YC X25) builds simulation engines / RL environments with verifiable rewards for coding and computer use, partnering with frontier labs and enterprises to train AI software-engineering and computer-use 'coworker' capabilities across terminal and GUI.

Raised

–

Headcount

1-10

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

Backers: Y Combinator

Code, Computer Use

site ↗

#19

Vmax

Commercial

Vmax is a San Francisco reinforcement-learning startup (founded 2025 by three RL/robotics PhDs from UCL and UPenn) that automates the conversion of proprietary data and evals into RL environments for LLM-based agents, targeting long-horizon and coding tasks. Its public research includes unix-ctf, a procedural generator of capture-the-flag tasks for Unix/shell competence.

Raised

–

Headcount

1-10

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

Backers: Race Capital, South Park Commons

Code, Long Horizon

site ↗

#20

Andromede

Commercial

Andromede is an early-stage RL data lab that programmatically generates RL environments, tasks, and verifiers from real-world data for post-training and evaluation of frontier agents, with an emphasis on long-horizon sequential reasoning tasks. As of mid-2026 it is in private beta, working with a small set of partners. It was co-founded by Guillaume Allegre (Founder & President) and Alexandre Sallinen (an EPFL-affiliated researcher who contributed to the Meditron medical-LLM project), and is backed by Unusual Ventures.

Raised

–

Headcount

1-10

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

Backers: Unusual Ventures

Long Horizon

site ↗

#21

Plato

Commercial

Plato (plato.so, Plato Technologies, Inc.) builds simulated worlds for training and evaluating browser and computer-use agents, recreating real websites/software (e.g. Amazon/Airbnb/Gmail-style replicas) as reinforcement-learning environments with structured APIs for interaction, state tracking and scoring. It also offers a 'Computer Use' capability that drives a full Linux desktop, positioning itself at the intersection of browser interaction and enterprise workflow simulation.

Raised

–

Headcount

1-10

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

Pedigree: Pranav Putta (Co-founder/CTO), prior MultiOn, Georgia Institute of Te

Computer Use, Enterprise

site ↗

#22

AIChamp

Commercial

AIChamp builds custom reinforcement-learning environments and 'Virtual Gym' simulations for training and evaluating tool-using AI agents on long-horizon, multi-step enterprise tasks, pairing engineered environments (agents operating in software like Slack, Notion, Linear) with domain experts who design and grade tasks (SFT/RLHF/process supervision). The company emphasizes deep industry authority and expert-sourced data; it pivoted into this work from a remote-talent/hiring marketplace.

Raised

Headcount

1-10

Founded

SOC 2

unknown

Researchers

Open src

Pedigree: Self-claimed 'alumni of OpenAI and xAI team' (vendor site, unverified)

Enterprise, Long Horizon

site ↗

#23

Habitat Inc

Commercial

Habitat Inc is an early-stage commercial vendor (2-10 employees, New York HQ) building reinforcement-learning environments for white-collar / work automation, with stated focus on code and desktop-style (computer use) interaction tasks for training agentic AI models. It appears in third-party listings of RL-environment suppliers serving AI labs. No funding, customer, or certification information is publicly available.

Raised

–

Headcount

1-10

Founded

SOC 2

unknown

Researchers

Open src

Pedigree: Maxim Enis (co-founder), Williams College '24; prior Ramp association

Code, Computer Use, Enterprise

site ↗

n/r

Scale AI

Incumbent

Scale AI is the data-labeling and AI-data incumbent that has extended into RL environments, offering simulated web apps, macOS/Windows-like desktop VMs, and MCP-tool environments (Slack, HubSpot, Linear) with expert-designed objectives, rubrics, and automated verifiers to train and evaluate agents on long-horizon professional workflows. Following Meta's ~$14.3B June 2025 investment (~49% non-voting stake) and founder Alexandr Wang's departure to Meta, several frontier-lab customers (OpenAI, Google, xAI) reportedly scaled back or paused engagement over conflict-of-interest concerns.

Raised

$1.6B

Headcount

200+

Founded

2016

SOC 2

Type II

Researchers

yes

Open src

Backers: Meta Platforms, Accel, Amazon

Code, Computer Use, Enterprise

site ↗

n/r

Modal

Infrastructure

Modal (Modal Labs) is a New York-based, Python-native serverless cloud purpose-built for AI/ML workloads, providing on-demand GPU/CPU compute, fast-booting sandboxed containers, inference, fine-tuning, and code execution. It is execution infrastructure rather than an RL-environment vendor, but is used to run reinforcement-learning training and large fleets of parallel sandboxed environments for AI labs.

Raised

$466M ↑ $355M

Headcount

51-200

Founded

2021

SOC 2

Type II

Researchers

Open src

Backers: General Catalyst (Series C co-lead), Redpoint Ventures (Series C co-lead; earlier Series A lead), Lux Capital (earlier round lead)

Code

site ↗

n/r

Mercor

Incumbent

Mercor is a venture-backed expert-marketplace and AI-training-data company that organizes a network of ~30,000+ domain experts (doctors, lawyers, bankers, engineers) to produce RLHF data, evaluations, and reinforcement-learning environments for frontier AI labs and enterprises. Originally an AI-recruiting platform, it pivoted to human-data/RL services and expanded its RL-environment capability through acquisitions, including the February 2026 acquisition of Sepal AI and the July 2026 acquisition of Deeptune, whose computer-use/enterprise environment platform and NYC team joined Mercor.

Raised

$492M

Headcount

51-200

Founded

2023

SOC 2

unknown

Researchers

yes

Open src

Backers: Felicis Ventures (led Series C and Series B), Benchmark, General Catalyst

Code, Enterprise, Long Horizon

site ↗

n/r

Surge AI

Incumbent

Surge AI is a bootstrapped, high-revenue human-data and RLHF labeling leader serving frontier AI labs, which has expanded into agentic RL environments via its EnterpriseBench suite (notably the CoreCraft enterprise customer-support simulation) and accompanying published benchmarks. As of June 2026, a reported first external raise of ~$1B at a ~$25B valuation was under discussion but not confirmed as closed.

Raised

$1B

Headcount

51-200

Founded

2020

SOC 2

unknown

Researchers

yes

Open src

Pedigree: Founder/CEO Edwin Chen: ex-Google, ex-Facebook, ex-Twitter ML teams; M

Enterprise, Long Horizon

site ↗

n/r

Prime Intellect

Open source

Prime Intellect operates an open-source RL stack - the Environments Hub (2,500+ community RL environments), the Verifiers library and prime-rl training framework, plus hosted RL post-training (Lab), evals, inference and on-demand GPU compute. It positions itself as the open alternative to closed big-lab RL tooling and also trains open models (INTELLECT series).

Raised

$180M ↑ $130M

Headcount

11-50

Founded

2023

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Founders Fund (led $15M round), Menlo Ventures, Distributed Global (co-led seed)

Code, Enterprise, Long Horizon

site ↗

n/r

Daytona

Infrastructure

Daytona provides secure, elastic, programmatic sandboxes ('computers') that AI agents and developers can spin up in under ~90ms to run untrusted AI-generated code in isolated, stateful runtimes. It offers a managed-hosted service plus an open-source self-hostable stack, and is positioned as agent-native execution/runtime infrastructure for code execution, computer use, and RL/eval workloads.

Raised

$31M ↑ $24M

Headcount

11-50

Founded

2023

SOC 2

Type I

Researchers

Open src

yes

Backers: FirstMark Capital (Series A lead; Matt Turck joined board), Pace Capital, Upfront Ventures (seed lead, Series A participant)

Code, Computer Use

site ↗

acq

Deeptune

Commercial

Deeptune was a New York-based startup building managed reinforcement-learning environments ('training gyms') for computer-use and code, where AI agents practice and are evaluated on realistic digital knowledge-work tasks (simulating tools like Slack and Salesforce). It sold these pre-built environments primarily to frontier AI labs and raised a $43M Series A led by a16z (March 2026). In July 2026 it was acquired by Mercor; the team joined Mercor and Deeptune's environment platform now sits under Mercor. It is therefore no longer ranked as an independent vendor.

Raised

$43M

Headcount

11-50

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

Backers: Andreessen Horowitz (a16z, lead), 776, Abstract Ventures

Code, Computer Use, Enterprise

site ↗

n/r

E2B

Infrastructure

E2B provides open-source, secure cloud sandboxes (built on Firecracker microVMs) for running AI-generated code and AI agents, offered as a hosted API with BYOC/on-prem/self-hosted options. It positions itself as execution infrastructure for enterprise AI agents and self-claims broad Fortune 100 adoption.

Raised

$32M

Headcount

11-50

Founded

2023

SOC 2

Type II

Researchers

Open src

yes

Backers: Insight Partners (Series A lead), Decibel (seed lead), Sunflower Capital

Code, Computer Use

site ↗

n/r

Runloop

Infrastructure

Runloop sells cloud-hosted, isolated micro-VM 'devboxes' plus blueprints, snapshots and benchmark/eval tooling that give AI coding agents a secure execution environment for development, evaluation, and reinforcement/supervised fine-tuning (RFT/SFT) loops. It is execution infrastructure for agent builders and model labs rather than an RL-data/environments vendor itself.

Raised

$7M

Headcount

11-50

Founded

2024

SOC 2

claimed

Researchers

Open src

yes

Backers: The General Partnership (lead), Blank Ventures, Exponent Founders Capital

Code

site ↗

n/r

General Reasoning

Open source

General Reasoning is an AI research lab (operating research hub in London; legal entity General Reasoning, Inc. registered in the US) building open RL environments and infrastructure for training and evaluating agents over long horizons. Its OpenReward platform and Open Reward Standard (ORS) provide an open specification for connecting language models to community-built RL environments, with 330+ environments accessible through one API.

Raised

$10.9M

Headcount

11-50

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

yes

Pedigree: Ross Taylor (co-founder/CEO) - ex-Meta AI/FAIR, research lead on Galac

Code, Long Horizon

site ↗

n/r

Cua

Infrastructure

Cua (trycua, YC X25) is open-source MIT-licensed infrastructure for computer-use agents, providing cloud and self-hosted sandboxes across macOS, Windows, Linux, and Android plus an SDK, a virtualization layer (Lume), and a benchmarking/RL-eval suite (Cua-Bench). It positions itself as the 'Docker for computer-use agents,' giving any agent a cloud desktop.

Raised

$500K

Headcount

1-10

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: Y Combinator (X25 batch)

Computer Use

site ↗

acq

Sepal AI

Commercial

Sepal AI was a YC-backed (S24) San Francisco data-research company that built high-quality training data, expert-graded evaluation benchmarks, and reinforcement-learning environments for frontier LLMs, drawing on a network of 20k+ domain experts (PhDs, finance, medical, STEM). It was acquired by Mercor in February 2026; the team joined Mercor and its RL-environment and human-data work now sits under Mercor. It is therefore no longer ranked as an independent vendor.

Raised

$500K

Headcount

Founded

2024

SOC 2

unknown

Researchers

yes

Open src

Backers: Y Combinator, Metaplanet Holdings, SID Venture Partners

Enterprise, Long Horizon

site ↗

n/r

Good Start Labs

Open source

Good Start Labs is a 2025 Every spin-out that builds game-based environments to generate reinforcement-learning data and evaluate frontier models, using both custom games and partnerships with existing games where player behavior helps train and rank AI. It is known for AI Diplomacy / Diplomacy Arena (multi-agent long-horizon strategy) and LOL Arena (humor preference), and publishes openly on GitHub and Hugging Face.

Raised

$3.6M

Headcount

1-10

Founded

2025

SOC 2

unknown

Researchers

yes

Open src

yes

Backers: General Catalyst, Inovia Capital, Tirta Ventures

Long Horizon

site ↗

n/r

Morph

Infrastructure

Morph (Morph Labs) provides snapshot-based VM compute for AI agents via its Infinibranch / Liquid Metal technology, which can snapshot, branch, and restore entire computational environments in roughly 100-250ms to enable massively parallel, reversible ('Git for compute') agent rollouts, evaluations, and reasoning-time branching. It markets the platform (Morph Cloud) as infrastructure for running and scaling agent/RL verification environments rather than as an RL-environment dataset vendor itself.

Raised

–

Headcount

1-10

Founded

2023

SOC 2

claimed

Researchers

yes

Open src

yes

Backers: Christian Szegedy (reported seed/angel investor; also Chief Scientist), amount undisclosed

Code

site ↗

n/r

Turing

Incumbent

Turing is a large AGI-infrastructure and engineering-services company that supplies frontier AI labs with coding data, human expertise, and RL/evaluation data at scale, drawing on a global vetted developer and domain-expert network. Originally a remote-engineering talent marketplace, it now positions itself around 'AGI advancement' through code and reasoning data, including work over private/real codebases.

Raised

–

Headcount

200+

Founded

2018

SOC 2

unknown

Researchers

yes

Open src

HQ: Palo Alto, USA

Code, Enterprise

site ↗

Vendor counts by filter
Focus: Code 21, Computer Use 16, Enterprise 17, Long Horizon 13
Type: Commercial 25, Open source 3, Incumbent 4, Infrastructure 6
Raised: Under $10M 12, $10M–$50M 10, $50M–$200M 1, $200M+ 4, Undisclosed 11
Headcount: 1-10 12, 11-50 19, 51-200 4, 200+ 2
SOC 2: Has SOC 2 10, Undisclosed 28
HQ: SF Bay Area 28, New York 7, Other US 1, United Kingdom 1, Europe 1

#	Company	Focus	Type	Raised	Headcount	SOC 2	HQ
#1	Mechanize	Code	Commercial	$9.1M ↑ $9.1M	11-50	unknown	San Francisco, USA
#2	AfterQuery	Code, Computer Use, Enterprise	Commercial	$30.5M ↑ $30M	51-200	unknown	San Francisco, USA
#3	Bespoke Labs	Long Horizon	Commercial	$40M ↑ $40M	11-50	unknown	Mountain View, California, USA
#4	Huzzle Labs	Code, Long Horizon	Commercial	$6M	11-50	Type II	London, United Kingdom
#5	Fleet AI	Enterprise	Commercial	$15M	11-50	unknown	New York, NY, USA
#6	Datacurve	Code	Commercial	$17.7M	11-50	unknown	San Francisco, USA
#7	Proximal	Code, Long Horizon	Commercial	–	11-50	unknown	San Francisco, CA, USA
#8	Gray Swan AI	–	Commercial	$40M ↑ $40M	11-50	Type II	Pittsburgh, Pennsylvania, USA
#9	Veris AI	Enterprise	Commercial	$8.5M	1-10	claimed	San Francisco, CA, USA
#10	Chakra Labs	Computer Use	Commercial	$10.1M ↑ $10.1M	11-50	unknown	Brooklyn, New York, USA
#11	Andon Labs	Computer Use, Long Horizon	Commercial	$500K	11-50	unknown	San Francisco, USA
#12	HUD	Computer Use, Enterprise	Commercial	–	11-50	unknown	San Francisco, USA
#13	Vals AI	–	Commercial	–	11-50	claimed	San Francisco, USA
#14	Halluminate	Computer Use, Enterprise	Commercial	$160K	1-10	unknown	San Francisco, CA, USA
#15	Matrices	Computer Use	Commercial	$5M	11-50	unknown	San Francisco, California, USA
#16	BenchFlow	Code, Computer Use, Enterprise	Commercial	$1.0M	1-10	unknown	New Castle, DE, USA (incorporation); Bay Area / San Francisco operating presence
#17	Collinear	Code, Computer Use, Enterprise	Commercial	–	11-50	unknown	Mountain View / Sunnyvale, California, USA
#18	Refresh	Code, Computer Use	Commercial	–	1-10	unknown	San Francisco, CA, USA
#19	Vmax	Code, Long Horizon	Commercial	–	1-10	unknown	San Francisco, USA
#20	Andromede	Long Horizon	Commercial	–	1-10	unknown	Lausanne, Switzerland
#21	Plato	Computer Use, Enterprise	Commercial	–	1-10	unknown	San Francisco, CA, USA
#22	AIChamp	Enterprise, Long Horizon	Commercial	$0	1-10	unknown	San Francisco, USA (CEO-based; reported, not officially confirmed; Tracxn alternatively lists Bali, Indonesia)
#23	Habitat Inc	Code, Computer Use, Enterprise	Commercial	–	1-10	unknown	New York, NY, USA
n/r	Scale AI	Code, Computer Use, Enterprise	Incumbent	$1.6B	200+	Type II	San Francisco, California, USA
n/r	Modal	Code	Infrastructure	$466M ↑ $355M	51-200	Type II	New York, NY, USA
n/r	Mercor	Code, Enterprise, Long Horizon	Incumbent	$492M	51-200	unknown	San Francisco, CA, USA (181 Fremont)
n/r	Surge AI	Enterprise, Long Horizon	Incumbent	$1B	51-200	unknown	San Francisco, California, USA
n/r	Prime Intellect	Code, Enterprise, Long Horizon	Open source	$180M ↑ $130M	11-50	unknown	San Francisco, USA
n/r	Daytona	Code, Computer Use	Infrastructure	$31M ↑ $24M	11-50	Type I	New York, NY, United States
acq	Deeptune	Code, Computer Use, Enterprise	Commercial	$43M	11-50	unknown	New York, NY, USA
n/r	E2B	Code, Computer Use	Infrastructure	$32M	11-50	Type II	San Francisco, USA
n/r	Runloop	Code	Infrastructure	$7M	11-50	claimed	San Francisco, CA, USA
n/r	General Reasoning	Code, Long Horizon	Open source	$10.9M	11-50	unknown	London, United Kingdom (Shoreditch), operating research hub; legal entity General Reasoning, Inc. registered in San Francisco/US per SEC Form D
n/r	Cua	Computer Use	Infrastructure	$500K	1-10	unknown	San Francisco, CA, USA
acq	Sepal AI	Enterprise, Long Horizon	Commercial	$500K	–	unknown	San Francisco, USA
n/r	Good Start Labs	Long Horizon	Open source	$3.6M	1-10	unknown	Brooklyn, NY, USA
n/r	Morph	Code	Infrastructure	–	1-10	claimed	San Francisco, USA
n/r	Turing	Code, Enterprise	Incumbent	–	200+	unknown	Palo Alto, USA

