What the statute and regulation say
IRC Section 41(d)(1)(C) and Treas. Reg. 1.41-4(a)(5) and (a)(6) define the test.
The activity must constitute elements of a process of experimentation. The regulation defines this as evaluating one or more alternatives to achieve a result where the capability, method, or design is uncertain at the outset.
Treas. Reg. 1.41-4(a)(6) imposes the substantially-all rule: at least 80 percent of the research activities for a business component, measured on a cost or other consistently applied reasonable basis, must be elements of a process of experimentation. A component that falls short of the 80 percent threshold is excluded in its entirety.
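As a rough illustration of the arithmetic, the sketch below computes the experimentation share on a cost basis. The `Activity` record, its field names, and the dollar figures are hypothetical and chosen only for the example; how costs are actually allocated to activities is a judgment the regulation leaves to a consistently applied reasonable method.

```python
from dataclasses import dataclass

SUBSTANTIALLY_ALL = 0.80  # 80 percent threshold from Treas. Reg. 1.41-4(a)(6)


@dataclass
class Activity:
    description: str
    cost: float         # hypothetical cost allocated to this activity, in dollars
    experimental: bool  # True if the activity is an element of a process of experimentation


def experimentation_share(activities):
    """Return the fraction of total cost spent on experimental activities."""
    total = sum(a.cost for a in activities)
    if total == 0:
        return 0.0
    qualifying = sum(a.cost for a in activities if a.experimental)
    return qualifying / total


def meets_substantially_all(activities):
    """True when the experimentation share reaches the 80 percent threshold."""
    return experimentation_share(activities) >= SUBSTANTIALLY_ALL


# Hypothetical component: 72,000 of 80,000 dollars is experimental work.
activities = [
    Activity("prototype two caching strategies", 42_000, True),
    Activity("benchmark and iterate on the chosen strategy", 30_000, True),
    Activity("deployment scripting", 8_000, False),
]
share = experimentation_share(activities)
passes = meets_substantially_all(activities)
```

With these illustrative figures the share is 90 percent, so the component clears the threshold.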
The process can be modeling, simulation, systematic trial and error, or another systematic evaluation. What it cannot be is a one-shot 'try the obvious thing, ship it' implementation.
Qualifying experimentation methods
Five methods count. Most qualifying SaaS work uses two or three.
Evaluating alternatives. Two or more candidate approaches are considered before a choice is made, with documented criteria. An ADR with a comparison table is the cleanest evidence.
Hypothesis testing. A predicted behavior is tested against measured outcomes. Performance benchmarks, A/B-tested algorithmic changes (in the technical sense, not marketing A/B tests), and load-test campaigns all fit.
Modeling. A mathematical, statistical, or computational model is built and validated. Capacity planning models, queueing models, and simulation harnesses all fit.
Simulation. A simulated environment exercises the system across input ranges. Chaos-engineering experiments, fault-injection campaigns, and replayable production trace tests all fit.
Systematic iteration. Cycles of measure, refine, measure again. Visible in commit history as related PRs over a period of weeks or months on the same component.
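To make the hypothesis-testing method above concrete, here is a minimal sketch of recording a predicted outcome next to a measured one. The function name, the latency samples, and the 15 percent predicted reduction are all assumptions made up for illustration; a real benchmark would use far more samples and a proper statistical comparison.

```python
from statistics import median


def evaluate_hypothesis(baseline_ms, candidate_ms, predicted_reduction):
    """Compare a predicted latency reduction against the measured one.

    baseline_ms and candidate_ms are lists of latency samples in milliseconds;
    predicted_reduction is the fraction (e.g. 0.15) stated before the test ran.
    """
    base = median(baseline_ms)
    cand = median(candidate_ms)
    measured = (base - cand) / base
    return {
        "baseline_median_ms": base,
        "candidate_median_ms": cand,
        "predicted_reduction": predicted_reduction,
        "measured_reduction": measured,
        "hypothesis_supported": measured >= predicted_reduction,
    }


# Hypothetical samples: the candidate was predicted to cut median latency by 15 percent.
result = evaluate_hypothesis(
    baseline_ms=[120, 118, 125, 122, 119],
    candidate_ms=[96, 99, 94, 101, 97],
    predicted_reduction=0.15,
)
```

The point of the record is that the prediction is written down before the measurement, so the outcome, whether supported or refuted, documents an uncertainty being resolved.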
The substantially-all 80 percent rule
This is the most common reason a binder downgrades a candidate component.
If a component has 100 commits and only 60 of them are experimentation, with the rest being deployment scripts, cosmetic UI tweaks, or admin work, its experimentation share is 60 percent. That is below the 80 percent threshold, so the component fails the substantially-all test and is excluded from QRE in full.
The fix is usually to re-scope the component to the subset of work that does meet the threshold. The remaining 40 commits stay in the repo; they just are not in the binder.
Per Treas. Reg. 1.41-4(a)(6), the substantially-all test is applied separately to each business component, not at the company or department level. Each component clears the bar independently.
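The per-component evaluation can be sketched as follows, using commit counts as the measurement basis. The component names and commit counts are hypothetical, and the `(component, is_experimentation)` pair format is an assumption for this example, not a description of how any particular tool stores its data.

```python
from collections import Counter

THRESHOLD = 0.80  # substantially-all threshold


def component_shares(commits):
    """Return {component: experimentation share} from (component, is_experimentation) pairs."""
    totals, qualifying = Counter(), Counter()
    for component, is_experimentation in commits:
        totals[component] += 1
        if is_experimentation:
            qualifying[component] += 1
    return {c: qualifying[c] / totals[c] for c in totals}


# Hypothetical history: one component at 60 of 100, another at 45 of 50.
commits = [("search-ranker", True)] * 60 + [("search-ranker", False)] * 40
commits += [("ingest-pipeline", True)] * 45 + [("ingest-pipeline", False)] * 5

shares = component_shares(commits)
passing = {c for c, share in shares.items() if share >= THRESHOLD}
failing = {c for c, share in shares.items() if share < THRESHOLD}
```

Here the first component sits at 60 percent and fails, while the second sits at 90 percent and passes; neither result affects the other.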
What the binder counts as experimentation in your commit history
Not all commits count. The binder applies the categorization below before computing the percentage.
Counts: commits that introduce a new approach, replace an existing approach, add or modify a benchmark or test, revert a previous attempt, or refactor in service of a measured improvement.
Counts: commits with a test or benchmark file change in the same patch.
Does not count: deployment configuration, infrastructure provisioning, copy edits, dependency bumps with no behavior change, lint and formatting commits, vendor SDK upgrades without integration work.
Edge case: documentation commits count if the documentation captures uncertainty resolution (e.g., an ADR added after a spike). Pure marketing or external-docs writing does not count.
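A first-pass version of these rules could look like the sketch below. It is a simplified heuristic written for illustration, not R&D Binder's actual categorizer: the `Commit` record, the path prefixes, and the message tags are all assumptions, and anything the rules cannot decide from paths and messages alone is routed to human review rather than guessed.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List


@dataclass
class Commit:
    message: str
    files: List[str] = field(default_factory=list)


# Hypothetical conventions; adjust to the repository's real layout.
EXCLUDED_PREFIXES = ("deploy/", "infra/", "terraform/")
EXCLUDED_MESSAGE_TAGS = ("chore(deps)", "style:", "lint:", "copy:")
ADR_PREFIX = "docs/adr/"
TEST_DIRS = ("tests", "test", "benchmarks", "bench")


def is_test_or_benchmark(path):
    """True if the path sits in a test/benchmark directory or is named like one."""
    parts = PurePosixPath(path).parts
    name = parts[-1] if parts else ""
    return any(p in TEST_DIRS for p in parts[:-1]) or name.startswith(("test_", "bench_"))


def categorize(commit):
    """Return 'counts', 'excluded', or 'needs_review' for a single commit."""
    msg = commit.message.lower()
    if msg.startswith(EXCLUDED_MESSAGE_TAGS):
        return "excluded"  # dependency bumps, lint, formatting, copy edits
    if commit.files and all(f.startswith(EXCLUDED_PREFIXES) for f in commit.files):
        return "excluded"  # deployment and infrastructure only
    if any(f.startswith(ADR_PREFIX) for f in commit.files):
        return "counts"    # ADR capturing uncertainty resolution
    if any(is_test_or_benchmark(f) for f in commit.files):
        return "counts"    # test or benchmark change in the same patch
    if msg.startswith("revert"):
        return "counts"    # reverting a previous attempt
    return "needs_review"  # new or replaced approaches need a human call


examples = [
    Commit("try bloom filter for dedupe", ["src/dedupe.py", "tests/test_dedupe.py"]),
    Commit("chore(deps): bump requests", ["requirements.txt"]),
    Commit("update terraform module", ["terraform/main.tf"]),
    Commit("add ADR for queue choice after spike", ["docs/adr/0007-queue.md"]),
    Commit("rework scoring", ["src/score.py"]),
]
labels = [categorize(c) for c in examples]
```

Defaulting to `needs_review` is deliberate in this sketch: whether a commit introduces or replaces an approach usually cannot be read from file paths, so the conservative choice is to leave it unclassified.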
The other three parts of the four-part test
Every claimed business component has to satisfy all four parts. Each part has its own page:
Get documentation built to survive an exam
R&D Binder categorizes every commit, computes the experimentation percentage per component, and flags components that fall under the 80 percent threshold so they can be re-scoped before filing.