Methodology & Limitations | Robotaxi Safety Tracker

Data Sources

Where the raw numbers come from and what they include

NHTSA Standing General Order 2021-01

The primary source for incident data is NHTSA SGO 2021-01, a federal mandate requiring manufacturers and operators of SAE Level 2 ADAS and Level 3-5 ADS vehicles to report certain crashes. Reports are required when ADS was engaged within 30 seconds of a crash that involved any injury, fatality, vehicle tow-away, airbag deployment, or a vulnerable road user (pedestrian, cyclist, etc.).
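As a rough illustration, the reporting rule as this page describes it can be written as a simple predicate. The `Crash` record and its field names are hypothetical, invented for this sketch; they do not mirror NHTSA's actual report schema, and the SGO text itself remains the authority on what is reportable.

```python
from dataclasses import dataclass


@dataclass
class Crash:
    # Hypothetical record; field names are illustrative, not NHTSA's schema.
    ads_engaged_within_30s: bool
    injury: bool = False
    fatality: bool = False
    tow_away: bool = False
    airbag_deployed: bool = False
    vulnerable_road_user: bool = False


def is_reportable(crash: Crash) -> bool:
    """Apply the thresholds as summarized on this page (a simplification)."""
    meets_severity = any([
        crash.injury,
        crash.fatality,
        crash.tow_away,
        crash.airbag_deployed,
        crash.vulnerable_road_user,
    ])
    return crash.ads_engaged_within_30s and meets_severity
```

Under this sketch, a minor scrape with ADS engaged but no tow-away, injury, airbag deployment, or vulnerable road user returns `False`, which is why such events never appear in the incident count.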

What is included: All crashes meeting the above thresholds that involved Tesla's autonomous driving system in Austin, Texas. Each report contains the date, location (city/state), crash type, and a narrative description -- though Tesla's narratives are frequently redacted.

What is excluded: Near-misses, hard braking events, minor scrapes that do not meet the tow-away or injury threshold, and any incident where ADS was not engaged within the 30-second window. This means the true count of "safety-relevant events" is higher than the reported incident count.

Monthly Aggregation

Important: NHTSA SGO data provides incident dates at monthly resolution only (e.g., "JAN-2026" rather than "2026-01-15"). Since we cannot determine the exact day of each incident within a month, we aggregate all incidents by month and calculate a single monthly MPI estimate. Each data point in the trend chart represents one month's worth of incidents.

This monthly aggregation means that months with multiple incidents show the combined MPI for the entire cluster. The trend analysis and R² fit are computed over these 6 monthly data points, not the raw per-incident intervals.
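The sketch below shows one way the monthly grouping could work. It assumes that a month's MPI is that month's estimated fleet miles divided by the number of incidents reported in it; this page does not spell out the exact rule, so treat that as an assumption. The function name, the input shapes, and the mileage numbers in the example are all made up for illustration, and months with no incidents are simply not handled here.

```python
from collections import Counter


def monthly_mpi(incident_months, fleet_miles_by_month):
    """Return {month label: miles per incident} for months with incidents.

    incident_months: one 'JAN-2026'-style label per reported incident.
    fleet_miles_by_month: estimated fleet miles keyed by the same labels.
    Assumption: monthly MPI = fleet miles in the month / incidents in the month.
    """
    counts = Counter(incident_months)
    return {
        month: fleet_miles_by_month[month] / n_incidents
        for month, n_incidents in counts.items()
    }


# Illustrative inputs only; these are not real tracker figures.
example = monthly_mpi(
    ["SEP-2025", "SEP-2025", "OCT-2025"],
    {"SEP-2025": 90_000.0, "OCT-2025": 150_000.0},
)
```

In this made-up example, the two September incidents share a single data point (45,000 miles per incident) rather than producing two separate intervals.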

Fleet Size Data

Daily fleet counts come from robotaxitracker.com, which independently tracks the number of Tesla robotaxi vehicles operating in each market. Fleet size is not reported continuously; instead, known data points are published at irregular intervals, and we linearly interpolate between them to estimate the fleet size on any given day.

Miles-per-Vehicle-per-Day Estimate

Tesla disclosed in its Q3 2025 earnings report that its Austin robotaxi fleet averaged approximately 115 miles per vehicle per day. This figure is used as a constant multiplier across all calculation periods. By comparison, private vehicles in the United States average roughly 30 miles per day; the higher robotaxi figure reflects commercial utilization patterns (vehicles operating 12-16 hours daily).

MPI Calculation

How miles per incident is derived for each inter-incident interval

The Formula

For each interval between consecutive NHTSA-reported incidents, miles per incident (MPI) is calculated as:

$$\text{MPI} = \text{fleet\_size} \times 115\;\text{mi/day} \times (\text{days\_between\_incidents} - \text{stoppage\_days})$$

Where $\text{fleet\_size}$ is the interpolated Austin fleet count on each day, $115\;\text{mi/day}$ is the per-vehicle daily mileage, and $\text{stoppage\_days}$ accounts for known periods when the fleet was offline (see Service Stoppages). More precisely, fleet miles are summed day by day across the interval, since fleet size changes over time.
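The day-by-day summation can be sketched as follows. This is a minimal illustration, not the tracker's actual code: the function name, the half-open interval convention (start day included, end day excluded), and the `fleet_size_on` callback are assumptions made for the example.

```python
from datetime import date, timedelta

MILES_PER_VEHICLE_PER_DAY = 115  # Tesla's Q3 2025 figure, used as a constant


def interval_fleet_miles(start, end, fleet_size_on, stoppage_days=frozenset()):
    """Sum estimated fleet miles over [start, end), skipping stoppage days.

    fleet_size_on: callable mapping a date to the (interpolated) fleet count.
    stoppage_days: set of dates on which no mileage is credited.
    """
    total = 0.0
    day = start
    while day < end:
        if day not in stoppage_days:
            total += fleet_size_on(day) * MILES_PER_VEHICLE_PER_DAY
        day += timedelta(days=1)
    return total


# Illustrative only: a constant 50-vehicle fleet over a 10-day interval
# that contains the January 25-26, 2026 stoppage.
stoppages = {date(2026, 1, 25), date(2026, 1, 26)}
miles = interval_fleet_miles(
    date(2026, 1, 20), date(2026, 1, 30), lambda d: 50, stoppages
)
```

With a constant fleet, this reduces to the closed-form product above: 50 vehicles × 115 mi/day × (10 − 2) days = 46,000 miles.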

Fleet Size Interpolation

Because fleet size data points are published at irregular intervals, daily fleet counts are estimated via linear interpolation between the nearest known data points. For example, if the fleet was 50 vehicles on January 1 and 70 vehicles on January 21, we estimate 55 vehicles on January 6, 60 on January 11, and so on. This produces a smooth daily series that is then multiplied by 115 mi/day to get daily fleet miles.

This method assumes fleet growth is roughly linear between data points. In reality, vehicles may be added in batches, so the interpolated curve may slightly over- or under-estimate fleet miles on any given day. Over multi-week intervals, these errors tend to average out.
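A minimal sketch of the interpolation, reproducing the January example above. The function name and the list-of-pairs input are assumptions for illustration, and the year 2026 is arbitrary; the sketch also assumes the known data points are sorted and have distinct dates.

```python
from datetime import date


def interpolate_fleet(day, known):
    """Linearly interpolate the fleet count for `day`.

    known: list of (date, fleet_count) pairs, sorted by date, dates distinct.
    Raises ValueError if `day` falls outside the known range.
    """
    for (d0, n0), (d1, n1) in zip(known, known[1:]):
        if d0 <= day <= d1:
            fraction = (day - d0).days / (d1 - d0).days
            return n0 + fraction * (n1 - n0)
    raise ValueError("day is outside the range of known data points")


known_points = [(date(2026, 1, 1), 50), (date(2026, 1, 21), 70)]
jan_6 = interpolate_fleet(date(2026, 1, 6), known_points)    # 55.0
jan_11 = interpolate_fleet(date(2026, 1, 11), known_points)  # 60.0
```

The sketch deliberately refuses to extrapolate beyond the last known data point; how the tracker treats days after the most recent fleet count is not specified here.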

What 115 Miles per Day Means

The 115 mi/day/vehicle figure comes from Tesla's Q3 2025 earnings disclosure. It represents an average across all vehicles in the Austin fleet, across all days. Individual vehicles may drive more or fewer miles depending on demand, time of day, routing, and maintenance schedules. If Tesla revises this figure in a future disclosure, the tracker's calculations will be updated accordingly.

Trend Model

How the exponential trend line and doubling time are derived

Exponential Fit

The trend line on the main chart fits the model:

$$\text{MPI} = a \cdot e^{b \cdot t}$$

where $a$ is the baseline MPI at time zero, $b$ is the growth rate, and $t$ is time in days. An exponential model is chosen because safety improvements in autonomous systems often compound: software updates improve all vehicles simultaneously, and more miles driven generate more training data.

Log-Linear Regression

The fit is computed by taking the natural logarithm of each MPI value and performing ordinary least-squares (OLS) linear regression on $\ln(\text{MPI})$ vs. $t$. This yields the slope $b$ and intercept $\ln(a)$. The regression is equivalent to maximizing likelihood under a log-normal error model.
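The log-linear fit can be sketched in a few lines of plain Python. This is an illustration of the technique, not the tracker's own implementation; the function name is an assumption, and the data in the example is synthetic (an exact doubling every 30 days), not real MPI values.

```python
import math


def fit_exponential(t, mpi):
    """Fit MPI = a * exp(b * t) by OLS on ln(MPI) versus t. Returns (a, b)."""
    y = [math.log(value) for value in mpi]
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Standard closed-form OLS slope and intercept for one predictor.
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    b = s_ty / s_tt
    a = math.exp(y_mean - b * t_mean)
    return a, b


# Synthetic check: data that doubles every 30 days from 10,000 miles.
days = [0, 30, 60, 90]
values = [10_000 * 2 ** (d / 30) for d in days]
a, b = fit_exponential(days, values)
```

On this synthetic series the fit recovers $a \approx 10{,}000$ and $b \approx \ln(2)/30$, as expected for noise-free exponential data.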

R-squared

The R-squared value reported on the tracker is computed on the log scale: it measures how much of the variance in $\ln(\text{MPI})$ is explained by the linear regression $\ln(\text{MPI}) = \ln(a) + b \cdot t$. This is appropriate for exponential regression because:

It guarantees R² is always between 0 and 1
It measures relative (percentage) fit rather than absolute fit
It is computed on the same scale on which the regression is performed

A low R² indicates that the monthly MPI values scatter widely around the fitted trend line, which is expected given the small sample size (6 monthly data points) and the stochastic nature of incident occurrence.
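For concreteness, a log-scale R² can be computed as below. The function name is hypothetical and the inputs in the example are synthetic. Note that the 0-to-1 guarantee applies when `a` and `b` are the OLS estimates from the log-linear fit; passing arbitrary parameters can produce values below 0.

```python
import math


def r_squared_log(t, mpi, a, b):
    """R² of the model ln(MPI) = ln(a) + b*t, measured on the log scale."""
    y = [math.log(value) for value in mpi]
    y_hat = [math.log(a) + b * ti for ti in t]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)              # total
    return 1 - ss_res / ss_tot


# Synthetic data lying exactly on MPI = exp(t).
t_values = [0, 1, 2]
mpi_values = [math.exp(ti) for ti in t_values]
perfect = r_squared_log(t_values, mpi_values, a=1.0, b=1.0)
# A flat model through the mean of ln(MPI) explains none of the variance.
flat = r_squared_log(t_values, mpi_values, a=math.e, b=0.0)
```

Here `perfect` is approximately 1 and `flat` is approximately 0, the two ends of the range.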

Doubling Time

Doubling time is derived directly from the growth rate $b$:

$$\text{Doubling Time} = \frac{\ln(2)}{b}$$

This tells you how many days it takes, on average, for MPI to double -- assuming the exponential trend continues. A 41-day doubling time, for example, means MPI is expected to double roughly every six weeks at the current trajectory.
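As a small worked example, the conversion from growth rate to doubling time looks like this. The function name is an assumption; the guard for non-positive `b` reflects the fact that a doubling time is undefined when MPI is flat or falling.

```python
import math


def doubling_time_days(b):
    """Days for MPI to double, given growth rate b per day (b must be > 0)."""
    if b <= 0:
        raise ValueError("doubling time is undefined unless b is positive")
    return math.log(2) / b


# A growth rate of ln(2)/41 per day corresponds to the 41-day example above.
example_days = round(doubling_time_days(math.log(2) / 41))  # 41
```

Equivalently, a 41-day doubling time implies $b \approx 0.0169$ per day, or roughly 1.7% growth in MPI per day.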

Why Log Scale

Why the default chart uses logarithmic scaling and what it reveals

The Data Spans Orders of Magnitude

Tesla's MPI values range from roughly 10,000 miles (early incidents closely spaced) to over 1,000,000 miles (the most recent long streaks). On a linear-scale chart, the early data points would be compressed into a nearly invisible band at the bottom, while only the latest points would be visually distinguishable. A logarithmic y-axis gives equal visual weight to a doubling from 10K to 20K and a doubling from 500K to 1M, making the entire trajectory readable.

A Straight Line Means Exponential Growth

On a log-scale chart, a straight line corresponds to constant exponential growth. If the data points roughly follow a straight line on the log chart, it means MPI is growing at a consistent percentage rate over time. Deviations from that line -- a point far above or below the trend -- indicate periods of unusually good or poor performance relative to the exponential model.

Linear Scale Is Available

The main tracker page provides a toggle to switch between log and linear scale. The linear view is useful for seeing the absolute magnitude of recent improvements and understanding just how far MPI has risen in raw terms. However, it makes it difficult to assess early-period performance or judge whether the growth rate is accelerating, decelerating, or holding steady. We default to log scale because it is the more informative view for trend analysis.

Why FSD Improvement Is Naturally Exponential

Autonomous driving improvement follows an inherently exponential pattern due to how edge-case resolution compounds. As Tesla's VP of Autopilot Software Ashok Elluswamy has explained, once the system handles 99% of driving scenarios correctly, the remaining 1% of edge cases are what cause interventions. Fixing half of those remaining edge cases doesn't just improve performance by 0.5 percentage points -- it doubles the miles between incidents, because incidents now occur half as often.

Put concretely: if the system intervenes once every 1,000 miles and engineers fix the edge cases responsible for 90% of remaining interventions, MPI jumps from 1,000 to 10,000 -- a 10× improvement. Then fixing 90% of the next set of edge cases pushes MPI from 10,000 to 100,000. Each round of fixes addresses rarer and rarer scenarios, but each round also multiplies MPI by the same factor. This is why a log-scale chart is the natural way to visualize progress: equal vertical distances represent equal multiplicative improvements, making it easy to see whether the rate of edge-case resolution is consistent, accelerating, or slowing down.
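The arithmetic behind this compounding can be sketched as follows. It is a toy model under the simplifying assumption that each round removes a fixed fraction of the remaining incident causes and that nothing else changes; the function name is hypothetical.

```python
def mpi_after_rounds(start_mpi, fraction_fixed, rounds):
    """MPI after repeatedly fixing a fixed fraction of remaining failure causes.

    If a fraction f of incident causes is removed, incidents occur (1 - f)
    times as often, so miles per incident is divided by (1 - f).
    """
    mpi = start_mpi
    for _ in range(rounds):
        mpi = mpi / (1 - fraction_fixed)
    return mpi


one_round = mpi_after_rounds(1_000, 0.9, 1)   # about 10,000
two_rounds = mpi_after_rounds(1_000, 0.9, 2)  # about 100,000
halved = mpi_after_rounds(1_000, 0.5, 1)      # fixing half doubles MPI
```

Each round multiplies MPI by the same factor, $1/(1 - f)$, which is exactly the constant-ratio behavior that appears as a straight line on a log-scale chart.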

Tesla's approach leverages its fleet data -- collectively generating hundreds of years of driving daily -- to identify and train on these rare edge cases using targeted data triggers. Combined with increasing model capacity (e.g., the 10× parameter increase in FSD v14) and the addition of reasoning capabilities, each software generation can resolve a larger fraction of remaining failure modes, sustaining the exponential improvement trajectory.

Human Driver Benchmarks

The two reference lines on the chart and why they differ

500,000 Miles: Police-Reported Crashes (NHTSA CRSS 2023)

The NHTSA Crash Report Sampling System (CRSS) 2023 estimates that human drivers in the United States are involved in a police-reported crash approximately every 500,000 miles. This includes all police-reported crashes regardless of severity, across all road types and conditions.

This benchmark overstates how far human drivers actually travel between crashes (equivalently, it understates the true human crash rate) for the purposes of AV comparison, because many minor crashes (parking-lot scrapes, low-speed fender-benders) are never reported to police, whereas comparable AV crashes trigger an NHTSA SGO report whenever airbags deploy, a tow truck is needed, or an injury occurs.

300,000 Miles: Insurance Claims (Swiss Re / Waymo 2023)

The Swiss Re / Waymo 2023 study used insurance claims data and found that human drivers file a bodily-injury or property-damage claim approximately every 300,000 miles. Because insurance claims capture incidents that may never reach a police report -- but are serious enough for a driver to file a claim -- this benchmark more closely mirrors the AV reporting threshold under NHTSA SGO 2021-01.

Why insurance claims are more comparable: The SGO requires AV companies to report any crash involving a tow-away, injury, or airbag deployment. These are the same types of events that typically generate insurance claims. A police report, by contrast, may not be filed for many of these events (e.g., a single-vehicle tow-away with no injuries in a parking lot). Therefore, the 300K insurance-claims benchmark is the fairer comparison for evaluating AV safety against human drivers.

Austin Only

Why this tracker uses only Austin data for its MPI and trend calculations

Austin Is the Only Unsupervised Level 4 Location

As of February 2026, Austin, Texas is the only market where Tesla operates robotaxis as an unsupervised SAE Level 4 autonomous driving service. No safety driver sits behind the wheel, although many vehicles still carry an in-vehicle safety monitor (see Safety Monitor Status). This makes Austin the only location where the safety data reflects the autonomous capability of Tesla's system rather than that of a supervising human driver.

Bay Area Operates with Safety Drivers

Tesla also operates robotaxis in the San Francisco Bay Area, but under California law these vehicles are required to have a safety driver present in the vehicle. The safety driver can intervene to prevent or mitigate crashes. Including Bay Area data in the MPI calculation would conflate supervised and unsupervised performance, making it impossible to assess the autonomous system's standalone safety record.

If and when Tesla begins unsupervised operations in additional cities, this tracker will incorporate data from those markets as well.

Bay Area incidents are tracked separately on the Cities page but do not enter the MPI trend calculation shown on the home page.

Safety Monitor Status

The current state of in-vehicle monitoring across Tesla's robotaxi fleet

Austin: Partial Monitor Removal

Some Tesla robotaxis in Austin already operate without an in-vehicle safety monitor. These vehicles are fully driverless -- there is no human in the car. However, the majority of the Austin fleet still carries a safety monitor who can observe but is not expected to intervene under normal conditions. Tesla has been gradually expanding the subset of rides that are fully unmonitored.

Bay Area: Safety Drivers Required

All Tesla robotaxis in the San Francisco Bay Area operate with a safety driver behind the wheel, as required by California's autonomous vehicle testing regulations. The safety driver can take over control at any time. This is a fundamentally different operating mode from Austin's unsupervised rides.

What This Tracker Estimates

One of the tracker's goals is to estimate when fleet-wide safety-monitor removal becomes viable based on quantitative safety signals. The key metric is whether MPI has reached and sustained a level that matches or exceeds human-driver benchmarks. Crossing the 500K police-reported threshold is a milestone; sustaining performance above the 300K insurance-claims benchmark with a large enough sample size would represent stronger evidence. These projections are based on the exponential trend model and should be treated as forward-looking estimates, not certainties.

Reporting Lag

How delays between incidents and public reporting affect data freshness

NHTSA Reporting Timelines

Under SGO 2021-01, manufacturers must submit an initial report within 1 day of learning about a reportable crash, followed by an updated report within 10 days that includes additional details such as the crash narrative, injury status, and vehicle damage assessment. Reports are then published on NHTSA's public portal, though the publication schedule introduces an additional, variable delay.

Tesla's Compliance History

Tesla has been cited by NHTSA for reporting delays in the past. Late reports mean that the public data may not reflect all incidents that have actually occurred. When an incident is eventually reported, it is back-dated to its actual occurrence date, which can retroactively change MPI values for earlier intervals.

Current Streak Is Provisional

The most recent MPI value -- the one based on the current incident-free streak -- should always be treated as provisional. It is possible that an incident has occurred but has not yet been reported or published. As new reports appear, the current streak may be shortened or split into multiple intervals. Historical MPI values (for intervals between two known incidents) are more stable, though even those can be revised if a previously unreported incident surfaces.

The tracker automatically recalculates all MPI values whenever new incident data is published. If a late report appears, the entire trend line is recomputed.

Known Limitations

What this analysis cannot tell you and where uncertainty is highest

Fleet Size Estimates May Be Inaccurate

Fleet counts from robotaxitracker.com are independently estimated and may not perfectly reflect the number of vehicles actively operating on any given day. Vehicles may be temporarily offline for maintenance, software updates, or repositioning. Linear interpolation between known data points smooths over batch additions or removals.

115 Miles per Day Is an Average

The per-vehicle mileage figure is a fleet-wide average from a single disclosure period (Q3 2025). Daily mileage likely varies by day of week, season, weather, and demand. If actual utilization was lower than 115 mi/day, our MPI estimates would be too high; if higher, too low. A +/-20% variation in this parameter shifts MPI values by the same percentage but does not meaningfully affect the trend direction or doubling time.
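A short derivation shows why a constant error in the mileage figure leaves the doubling time untouched. It uses the exponential model from the Trend Model section; the scale factor $c$ is a symbol introduced here for illustration (for example, $c = 0.8$ if true utilization were 92 mi/day instead of 115). Scaling every MPI value by $c$ gives:

$$\ln(c \cdot \text{MPI}) = \ln(c) + \ln(a) + b \cdot t$$

Only the intercept changes, from $\ln(a)$ to $\ln(c \cdot a)$; the slope $b$, and therefore $\ln(2)/b$, is the same. This holds only if $c$ is the same in every period. If utilization drifted over time, the error would vary from month to month and could bias the fitted growth rate.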

Fault Cannot Be Determined

Tesla's NHTSA SGO narratives are frequently redacted, making it impossible to determine whether the Tesla robotaxi was at fault in a given incident. An MPI metric counts all reported incidents equally, whether the robotaxi caused the crash, was rear-ended at a stop, or was hit by a red-light runner. This is consistent with how the human-driver benchmarks are calculated (they also include not-at-fault crashes), but it means MPI is not a measure of the system's driving skill in isolation.

Small Sample Size

As of early 2026, the tracker is based on approximately 10 reported incidents. This is a very small sample. A single unreported incident, a reporting error, or an unusual streak of good or bad luck can significantly shift the MPI values and the fitted trend. Confidence in the exponential model will increase substantially as the sample grows. Until then, all projections carry wide implicit uncertainty bands.

Exponential Fit May Not Continue Indefinitely

Exponential improvement in MPI cannot last forever. At some point, diminishing returns, edge cases, adversarial conditions, or fundamental sensor limitations will cause the improvement rate to plateau. The exponential model is a useful description of the current trajectory, not a guarantee of future performance. The tracker will switch to a more appropriate model (e.g., logistic, power-law) if and when the data warrants it.

Austin-Specific Conditions May Not Generalize

Austin, Texas has relatively mild weather (infrequent snow/ice), a modern road network, and a specific mix of driving conditions (highways, suburban streets, downtown). Safety performance in Austin may not be representative of performance in cities with more challenging conditions -- dense urban cores, heavy snow, aggressive traffic patterns, or poorly marked roads. As Tesla expands to additional cities, MPI may vary significantly by market.

Service Stoppages May Be Incomplete

The tracker accounts for known service stoppages (see below), but there may be additional periods when the fleet was partially or fully offline that have not been publicly reported. Unaccounted stoppages would cause MPI to be overestimated for the affected intervals, since the formula would assume vehicles were driving miles that they were not.

Service Stoppages

Known periods when the Tesla robotaxi service was offline and how they are handled

How Stoppages Affect Calculations

When the robotaxi service is taken offline -- whether due to weather, software issues, or regulatory action -- vehicles are not accumulating miles. If these days were included in the MPI calculation as normal driving days, MPI would be artificially inflated. To prevent this, known stoppage days are subtracted from the interval before computing fleet miles. No mileage is credited for days the fleet was offline.

Known Stoppages

Dates	Duration	Reason
January 25-26, 2026	2 days	Winter storm in Austin; service suspended for safety

This list is updated as new stoppages are identified. If you are aware of a stoppage not listed here, please contact us.