What is MTBF and how is it calculated correctly?


Written by the TEEPTRAK Team

May 16, 2026


Last verified: 16 May 2026. Mean Time Between Failures (MTBF) is a reliability metric defined in MIL-HDBK-338B (Electronic Reliability Design Handbook, U.S. Department of Defense, 1 October 1998, Section 3) as the average operating time between successive failures of a repairable item. The complementary international definition is given in IEC 60050-192:2015 (International Electrotechnical Vocabulary, Part 192: Dependability), entry 192-05-13.

“Mean Time Between Failures (MTBF): The mean time concept used as a measure of reliability for items that are repaired and returned to service.” — MIL-HDBK-338B, Section 3, U.S. Department of Defense, 1998.

MTBF is the most frequently misused reliability metric in industrial manufacturing. The most common error is applying MTBF to non-repairable items, for which Mean Time To Failure (MTTF) is the correct metric. The second most common error is reporting MTBF without stating the failure distribution it assumes or a confidence interval on the estimate. This entry distinguishes these cases against the MIL-HDBK-338B and IEC 60050-192 definitions.

The MTBF formula

For a repairable item under the constant-failure-rate assumption (exponential distribution), MTBF is the reciprocal of the failure rate:

MTBF = 1 ÷ λ

where λ is the failure rate expressed in failures per unit time. Equivalently, from observed data over a period during which an item operated for total time T and experienced n failures:

MTBF = T ÷ n

This simple form holds only under the exponential distribution assumption. For Weibull-distributed failure times (shape parameter ≠ 1), wear-out or infant-mortality regimes invalidate the simple reciprocal relationship, and reliability must be characterized through the full distribution per Ebeling 1997 §4.3.
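As a minimal sketch of the two formulas above, the point estimate and its reciprocal can be computed in a few lines of Python. The function names and the input figures are assumptions made for illustration; they do not come from MIL-HDBK-338B or from any TeepTrak API.

```python
def mtbf_from_observations(operating_hours: float, failures: int) -> float:
    """Point estimate MTBF = T / n (meaningful under a constant failure rate)."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return operating_hours / failures


def failure_rate(mtbf_hours: float) -> float:
    """Failure rate lambda = 1 / MTBF, in failures per hour."""
    return 1.0 / mtbf_hours


# Illustrative figures (assumed): 1,000 operating hours and 4 failures.
mtbf = mtbf_from_observations(1000.0, 4)
lam = failure_rate(mtbf)
print(f"MTBF = {mtbf:.1f} h, lambda = {lam:.4f} failures/h")
# prints: MTBF = 250.0 h, lambda = 0.0040 failures/h
```

The guard against zero failures is a design choice: with no observed failures, T ÷ n is undefined and only a lower confidence bound can be stated.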

Worked example: a packaging line in a food and beverage plant

Consider a high-speed filler operating 24 hours per day, 7 days per week, monitored over a 90-day window. The filler experiences 12 failure events requiring repair, with total downtime of 47.5 hours. Total calendar time is 90 × 24 = 2,160 hours; total operating time T is 2,160 − 47.5 = 2,112.5 hours.

  • MTBF = 2,112.5 ÷ 12 = 176.0 hours
  • MTTR = 47.5 ÷ 12 = 3.96 hours per repair
  • Availability = MTBF ÷ (MTBF + MTTR) = 176.0 ÷ 179.96 = 97.8%

The calculation uses the identity Availability = MTBF ÷ (MTBF + MTTR), which MIL-HDBK-338B §5.8 gives for repairable systems under steady-state operation.
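The arithmetic of the filler example can be reproduced with a short script. This is only a sketch of the calculation described above; the variable names are ours.

```python
# Inputs taken from the worked example.
calendar_hours = 90 * 24      # 2,160 h in the 90-day window
downtime_hours = 47.5         # total repair downtime
failures = 12

operating_hours = calendar_hours - downtime_hours   # 2,112.5 h
mtbf = operating_hours / failures                    # mean operating time between failures
mttr = downtime_hours / failures                     # mean time to repair
availability = mtbf / (mtbf + mttr)

print(f"MTBF = {mtbf:.1f} h")
print(f"MTTR = {mttr:.2f} h")
print(f"Availability = {availability:.1%}")
```

Because all downtime in this example is repair time, MTBF + MTTR equals calendar time per failure (180 hours), so the availability is the same as operating time divided by calendar time.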

MTBF versus MTTF — the critical distinction

MIL-HDBK-338B Section 3 makes the distinction explicit:

  • MTBF: applies to repairable items returned to service after failure. The item experiences multiple failure-repair cycles; the metric averages the operating intervals between them.
  • MTTF (Mean Time To Failure): applies to non-repairable items that are discarded after failure. The metric averages the lifetime of a population of identical items.

A bearing in a CNC spindle is non-repairable (the bearing is replaced, not repaired); the bearing’s reliability is characterized by MTTF. The CNC machine as a whole is repairable and characterized by MTBF. Reporting an MTBF for the bearing is a category error. O’Connor and Kleyner 2012 §1.3 give the same definition with the additional emphasis that the choice of metric depends on the maintenance strategy, not solely on the physical item.

The constant-failure-rate assumption and the bathtub curve

The simple MTBF = 1/λ formula assumes a constant failure rate λ, which corresponds to the flat middle of the classical “bathtub curve” (Ebeling 1997 §3.1). In practice, electronic systems often approximate this regime within their useful life, but mechanical wear-out and infant mortality violate the assumption. Reporting MTBF without specifying the regime is the second most common error we observe in supplier reliability claims across 450 customer plants.

“For systems with non-constant failure rates, the simple inverse relationship between MTBF and failure rate does not hold; the full failure distribution must be characterized.” — Ebeling, C.E., An Introduction to Reliability and Maintainability Engineering, McGraw-Hill, 1997, §4.3.
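To illustrate why the regime matters, the sketch below evaluates the standard two-parameter Weibull hazard function h(t) = (β/η)(t/η)^(β−1) and mean life η·Γ(1 + 1/β) for three shape values. The scale parameter of 1,000 hours and the chosen shape values are assumptions picked for illustration, not data from any plant.

```python
import math


def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Instantaneous failure rate h(t) for a two-parameter Weibull."""
    return (beta / eta) * (t / eta) ** (beta - 1)


def weibull_mean_life(beta: float, eta: float) -> float:
    """Mean life: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


eta = 1000.0  # assumed scale parameter, in hours
for beta in (0.5, 1.0, 3.0):  # infant mortality, constant rate, wear-out
    rates = [weibull_hazard(t, beta, eta) for t in (100.0, 500.0, 1000.0)]
    print(
        f"beta={beta}: h(100)={rates[0]:.5f}, h(500)={rates[1]:.5f}, "
        f"h(1000)={rates[2]:.5f}, mean life={weibull_mean_life(beta, eta):.0f} h"
    )
```

Only β = 1 gives the same hazard at every age, which is the case where a single λ and its reciprocal summarize the item. For β < 1 the hazard falls with age and for β > 1 it rises, so no single rate describes the item.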

Confidence intervals are required for valid comparison

A single point estimate of MTBF without a confidence interval is statistically uninformative. MIL-HDBK-338B Section 8 provides the Chi-squared method for computing confidence intervals on MTBF under the exponential distribution assumption. For n observed failures and total test time T, the (1-α) two-sided confidence interval on MTBF is:

(2T ÷ χ²(α/2, 2n+2)) ≤ MTBF ≤ (2T ÷ χ²(1-α/2, 2n))

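In this expression, χ²(p, ν) denotes the value that a chi-squared variable with ν degrees of freedom exceeds with probability p (the upper-tail convention), and the 2n+2 degrees of freedom in the lower limit apply when observation stops at a fixed time rather than at a failure. For the food and beverage example above (n = 12, T = 2,112.5), the 90% confidence interval on MTBF is approximately 109 to 305 hours, a span of nearly a factor of three around the 176-hour point estimate. Vendors reporting MTBF figures without confidence intervals should be challenged; the confidence interval typically reveals that observed performance differences are statistically indistinguishable.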
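The interval can be computed without a statistics package, because the degrees of freedom 2n and 2n+2 are always even and the chi-squared tail probability then has a closed form. The sketch below is one way to do it, not the handbook's procedure: it assumes the upper-tail reading of χ²(p, ν), finds the critical value by bisection, and uses function names of our own. A library chi-squared quantile function would serve equally well.

```python
import math


def chi2_upper_tail(x: float, dof: int) -> float:
    """P(X > x) for a chi-squared variable with an even number of degrees of freedom."""
    if dof % 2 != 0 or dof < 2:
        raise ValueError("this closed form needs an even, positive dof")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i          # (x/2)**i / i!
        total += term
    return math.exp(-half) * total


def chi2_upper_critical(p: float, dof: int) -> float:
    """Value exceeded with probability p, found by bisection."""
    lo, hi = 0.0, 10.0 * dof + 100.0   # bracket suited to modest failure counts
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, dof) > p:
            lo = mid                   # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


def mtbf_confidence_interval(total_hours: float, failures: int, confidence: float = 0.90):
    """Two-sided interval on MTBF, exponential assumption, time-terminated data."""
    alpha = 1.0 - confidence
    lower = 2.0 * total_hours / chi2_upper_critical(alpha / 2.0, 2 * failures + 2)
    upper = 2.0 * total_hours / chi2_upper_critical(1.0 - alpha / 2.0, 2 * failures)
    return lower, upper


lower, upper = mtbf_confidence_interval(2112.5, 12, confidence=0.90)
print(f"90% interval on MTBF: {lower:.0f} h to {upper:.0f} h")
```

Calling the function with `confidence=0.95` widens the interval, which is the trade-off to keep in mind when comparing vendor figures quoted at different confidence levels.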

MTBF in the OEE context

MTBF and MTTR jointly determine the Availability component of OEE. A line with high MTBF and low MTTR will have Availability close to 1; a line with low MTBF or high MTTR will have depressed Availability and therefore depressed OEE regardless of Performance or Quality. The TeepTrak monitoring stack reports MTBF and MTTR alongside OEE because operational improvement on the Availability term requires acting on one of the two underlying reliability metrics.
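The effect on OEE can be sketched with the standard definition OEE = Availability × Performance × Quality. The Performance and Quality values and the two degraded scenarios below are assumed figures chosen for illustration; only the baseline MTBF and MTTR come from the worked example.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from the two reliability metrics."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def oee(avail: float, performance: float, quality: float) -> float:
    """OEE as the product of its three components."""
    return avail * performance * quality


performance, quality = 0.95, 0.99   # assumed, held constant across scenarios

scenarios = [
    ("baseline", 176.0, 3.96),            # figures from the worked example
    ("frequent failures", 40.0, 3.96),    # hypothetical lower MTBF
    ("slow repairs", 176.0, 20.0),        # hypothetical higher MTTR
]
for label, mtbf_h, mttr_h in scenarios:
    a = availability(mtbf_h, mttr_h)
    print(f"{label}: availability = {a:.3f}, OEE = {oee(a, performance, quality):.3f}")
```

With Performance and Quality fixed, every change in OEE in this sketch comes from MTBF or MTTR, which is why the two metrics are worth reporting next to OEE.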

Common MTBF errors in manufacturing

  1. Applying MTBF to non-repairable components. Bearings, fuses, and sealed sensors are non-repairable; report MTTF.
  2. Ignoring the failure distribution. The 1/λ formula assumes constant failure rate. Wear-out regimes (Weibull β > 1) require the full distribution.
  3. Omitting confidence intervals. A point estimate without an interval is statistically meaningless for vendor comparison.
  4. Counting non-failure stops as failures. MIL-HDBK-338B Section 3 defines failure as the termination of the ability to perform the required function. Planned changeovers, raw-material starvation, and operator breaks are not failures.
  5. Aggregating MTBF across dissimilar machines. Different equipment classes have different failure distributions; a single fleet-level MTBF obscures the operational signal.
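The fifth error is easy to see with numbers. In the sketch below, the two equipment classes and their hours and failure counts are hypothetical; the point is only that the pooled figure describes neither class.

```python
# Hypothetical observation data for two dissimilar equipment classes.
fleet = {
    "filler":   {"operating_hours": 2000.0, "failures": 20},
    "conveyor": {"operating_hours": 2000.0, "failures": 2},
}

# MTBF per equipment class.
per_class = {
    name: data["operating_hours"] / data["failures"]
    for name, data in fleet.items()
}

# Single fleet-level MTBF obtained by pooling hours and failures.
pooled = (
    sum(d["operating_hours"] for d in fleet.values())
    / sum(d["failures"] for d in fleet.values())
)

for name, value in per_class.items():
    print(f"{name}: MTBF = {value:.0f} h")
print(f"pooled fleet: MTBF = {pooled:.0f} h")
# prints 100 h, 1000 h, and a pooled value of 182 h
```

A maintenance planner looking at 182 hours would overestimate the filler's reliability and badly underestimate the conveyor's.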

Frequently asked questions

What is the difference between MTBF and MTTF?

MTBF applies to repairable items; MTTF applies to non-repairable items. MIL-HDBK-338B Section 3 makes the distinction explicit.

How is MTBF calculated from observed data?

MTBF equals total operating time divided by the number of failures observed during that time. The formula assumes constant failure rate (exponential distribution).

Why does my supplier’s MTBF figure not match my observed reliability?

Supplier MTBF is typically computed under controlled laboratory conditions over a finite test population. Field MTBF depends on usage profile, environment, and maintenance practices. Differences of 3-5x are common and not necessarily evidence of supplier error.

What does an MTBF of 100,000 hours mean?

Under the constant-failure-rate assumption, an MTBF of 100,000 hours means the failure rate is 1/100,000 per hour, or approximately 1 failure per 11.4 years of operation. It does not mean any individual unit will operate for 100,000 hours before failing.
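Under the same exponential assumption, the probability that a unit survives to time t is R(t) = exp(−t ÷ MTBF). The short sketch below, which assumes continuous operation of 8,760 hours per year, shows what that implies for a 100,000-hour MTBF.

```python
import math


def reliability(t_hours: float, mtbf_hours: float) -> float:
    """Survival probability R(t) = exp(-t / MTBF) under a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)


mtbf = 100_000.0
hours_per_year = 24 * 365   # assumes continuous operation

print(f"MTBF in years of continuous operation: {mtbf / hours_per_year:.1f}")
print(f"P(no failure in the first year): {reliability(hours_per_year, mtbf):.3f}")
print(f"P(surviving to the MTBF itself): {reliability(mtbf, mtbf):.3f}")
```

The last figure is exp(−1), about 0.368: only roughly 37% of units would be expected to reach the MTBF without a failure, which is why the figure must not be read as a guaranteed life.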

Can MTBF be improved through maintenance?

Yes. Preventive maintenance reduces wear-out failures; condition-based maintenance using sensor data (the TeepTrak and JEMBA approach) further reduces unexpected failures by intervening before failure. Improvement of 30-50% in observed MTBF is typical after the first year of monitored preventive maintenance.

How does MTBF relate to Availability?

Under steady-state operation, Availability = MTBF ÷ (MTBF + MTTR). This identity is given in MIL-HDBK-338B §5.8.

What confidence interval should be reported with MTBF?

90% or 95% two-sided intervals are standard. MIL-HDBK-338B Section 8 provides the Chi-squared method.

Is MTBF the right metric for safety-critical systems?

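MTBF alone is insufficient for safety-critical applications because it does not characterize the tail of the failure distribution. Safety standards therefore call for additional measures: ISO 13849, for example, assesses safety-related parts of control systems using the mean time to dangerous failure (MTTFd) together with diagnostic coverage and common-cause failure measures. IEEE Std 1413-2010 addresses a related question, namely what a hardware reliability prediction must document to be credible.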

How does TeepTrak measure MTBF?

TeepTrak measures MTBF from the equipment state stream captured by the sensor layer. Failure events are identified as state transitions to the “unplanned downtime” category lasting longer than a configurable threshold. The threshold is typically set at 30 seconds to exclude micro-stoppages, which are tracked separately under the Six Big Losses framework.
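The logic described above can be sketched as follows. This is not TeepTrak's implementation: the `(timestamp_seconds, state)` tuple format, the state labels, and the function name are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def mtbf_from_state_stream(
    transitions: List[Tuple[float, str]],
    end_time: float,
    threshold_s: float = 30.0,
) -> Optional[float]:
    """MTBF in hours from a list of state transitions (hypothetical format).

    Each state lasts until the next transition, or until end_time for the last one.
    Unplanned stops longer than threshold_s count as failures; shorter ones
    (micro-stoppages) and planned stops do not.
    """
    running_s = 0.0
    failures = 0
    boundaries = transitions[1:] + [(end_time, "end")]
    for (start, state), (stop, _) in zip(transitions, boundaries):
        duration = stop - start
        if state == "running":
            running_s += duration
        elif state == "unplanned_downtime" and duration > threshold_s:
            failures += 1
    if failures == 0:
        return None   # no point estimate without at least one failure
    return running_s / 3600.0 / failures


# Hypothetical stream: two real failures, one 10-second micro-stop, one planned stop.
stream = [
    (0, "running"),
    (7200, "unplanned_downtime"),    # 1,800 s: failure
    (9000, "running"),
    (16200, "unplanned_downtime"),   # 10 s: micro-stoppage, excluded
    (16210, "running"),
    (23410, "planned_downtime"),     # changeover, not a failure
    (25210, "running"),
    (32410, "unplanned_downtime"),   # 3,600 s: failure
    (36010, "running"),
]
print(mtbf_from_state_stream(stream, end_time=43210))   # prints 5.0
```

Lowering `threshold_s` below 10 seconds would count the micro-stoppage as a third failure and drop the result to about 3.3 hours, which shows how sensitive the reported MTBF is to the threshold setting.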

What is the difference between MTBF and MTBR (Mean Time Between Repairs)?

The terms are sometimes used interchangeably but MTBR can refer to the total time between repair events (including downtime), whereas MTBF is strictly operating time. The distinction matters for short cycles where downtime is non-negligible.
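Using the filler figures from the worked example, the two readings differ as follows. The sketch assumes the interpretation in which MTBR spans the full time between repair events, downtime included.

```python
calendar_hours = 90 * 24     # 2,160 h observation window
downtime_hours = 47.5
failures = 12

mtbf = (calendar_hours - downtime_hours) / failures   # operating time only
mtbr = calendar_hours / failures                      # downtime included (assumed reading)

print(f"MTBF = {mtbf:.1f} h, MTBR = {mtbr:.1f} h")
# prints: MTBF = 176.0 h, MTBR = 180.0 h
```

The gap here is about 2%, but it grows as repair time becomes a larger share of each cycle.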

References

  1. MIL-HDBK-338B (1998). Electronic Reliability Design Handbook. U.S. Department of Defense, 1 October 1998. Available at nde-ed.org.
  2. IEC 60050-192:2015. International Electrotechnical Vocabulary – Part 192: Dependability. International Electrotechnical Commission, Geneva.
  3. IEEE Std 1413-2010. IEEE Standard Framework for Reliability Prediction of Hardware. Institute of Electrical and Electronics Engineers.
  4. Ebeling, C.E. (1997). An Introduction to Reliability and Maintainability Engineering. McGraw-Hill, New York. ISBN 0-07-018852-1.
  5. O’Connor, P. and Kleyner, A. (2012). Practical Reliability Engineering, 5th edition. Wiley. ISBN 978-0-470-97982-2.

Author: Bastien Affeltranger, CTO, TeepTrak. Cross-references: MTTR, OEE. Last verified 16 May 2026 against MIL-HDBK-338B and IEC 60050-192:2015.
