Missing data in ecology: Syntheses, clarifications, and considerations

In ecology and related sciences, missing data are common and occur in a variety of different contexts. When missing data are not handled properly, subsequent statistical estimates tend to be biased, inefficient, and lack proper confidence interval coverage. Missing data are often grouped into three categories: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). We review each category and compare their benefits and drawbacks. We review several approaches to handling missing data including complete case analysis, imputation, inverse probability weighting, and data augmentation. We clarify what types of variables should accompany imputation methods and how those variables are influenced by the analysis methods. Additionally, we discuss missing data that lack a formal basis for measurement and hence are fundamentally different from MCAR, MAR, and MNAR missing data. Throughout, we introduce concepts and numeric examples using both simulated data and data from the United States Environmental Protection Agency's 2016 National Wetland Condition Assessment. We conclude by providing five considerations for ecologists and other scientists handling missing data.
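The three missingness categories can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating model, the missingness probabilities, and all function names (`simulate`, `observe`, `mean_y`) are assumptions chosen for illustration. It compares the complete case mean of a response `y` under MCAR, MAR, and MNAR missingness, then applies inverse probability weighting in the MAR case, assuming the probability of observation is known.

```python
import math
import random
import statistics


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n=20000, seed=1):
    """Return (x, y) pairs; x is always observed and y has true mean 2.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 2.0 + x + rng.gauss(0.0, 1.0)
        rows.append((x, y))
    return rows


def observe(rows, prob_observed, seed=2):
    """Keep each row with probability prob_observed(x, y)."""
    rng = random.Random(seed)
    return [(x, y) for x, y in rows if rng.random() < prob_observed(x, y)]


def mean_y(rows):
    return statistics.fmean(y for _, y in rows)


rows = simulate()
full_mean = mean_y(rows)

# MCAR: missingness ignores the data entirely.
mcar = observe(rows, lambda x, y: 0.5)
# MAR: missingness depends only on the fully observed covariate x.
mar = observe(rows, lambda x, y: logistic(-x))
# MNAR: missingness depends on the value of y that may go missing.
mnar = observe(rows, lambda x, y: logistic(2.0 - y))

mcar_mean, mar_mean, mnar_mean = mean_y(mcar), mean_y(mar), mean_y(mnar)

# Inverse probability weighting under MAR: weight each complete case by
# 1 / P(observed | x). The true probability is used here; in practice it
# would have to be estimated.
weights = [1.0 / logistic(-x) for x, _ in mar]
ipw_mean = sum(w * y for w, (_, y) in zip(weights, mar)) / sum(weights)

print(f"full data        : {full_mean:.3f}")
print(f"MCAR complete case: {mcar_mean:.3f}")
print(f"MAR complete case : {mar_mean:.3f}")
print(f"MNAR complete case: {mnar_mean:.3f}")
print(f"MAR with IPW      : {ipw_mean:.3f}")
```

In this setup the complete case mean stays close to the full-data mean under MCAR but is pulled downward under MAR and MNAR, because rows with large `y` are less likely to be observed. Weighting corrects the MAR case only because the missingness depends on a variable that is fully observed.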

Impact/Purpose

Missing data are common in ecology and obfuscate ecological understanding when improperly handled, which happens often. This is likely due to the complexity of the subject and the myriad missing data techniques available to the practitioner. We provide a thorough synthesis of the missing data problem in ecology, which we elucidate via several examples, and we connect these examples to an extensive literature review. We also develop novel insights specific to missing data in ecology and provide seven explicit missing data recommendations for ecologists to remember. Generally, missing data can be grouped into one of three categories: missing completely at random (MCAR); missing at random (MAR); or missing not at random (MNAR). We review each category and pay special attention to the MAR category, which is quite flexible but often misunderstood. We compare the benefits and drawbacks of several modern missing data approaches, including complete case analysis, single imputation (deterministic and stochastic versions), multiple imputation, and data augmentation. Our manuscript demonstrates through examples and literature that multiple imputation and data augmentation perform best, but are more complex than complete case analysis and single imputation. We clarify the important distinction between imputation and prediction and argue that using predictive metrics to evaluate imputation methods is bad statistical practice and should be avoided. We clarify the nuances between two multiple imputation prediction approaches: predict-combine and combine-predict (illustrated in a figure in the manuscript). We introduce contingency filter variables, which clarify whether missing data have a basis for measurement, and show how contingency filter variables have myriad uses in ecology depending on context. Throughout, we motivate the missing data problem in ecology using wetland data from the United States Environmental Protection Agency's 2016 National Wetland Condition Assessment.
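The contrast between single and multiple imputation can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: the simulated data, the helper names (`fit_line`, `mean_and_variance`), and the use of a bootstrap of the complete cases to reflect uncertainty in the imputation model are all choices made here. Estimates from the imputed data sets are pooled with Rubin's rules, a standard pooling technique that the text above does not name.

```python
import math
import random
import statistics


def fit_line(pairs):
    """Least-squares intercept, slope, and residual SD for y ~ x."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    xbar, ybar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in pairs) / sxx
    intercept = ybar - slope * xbar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in pairs)
    return intercept, slope, math.sqrt(rss / (len(pairs) - 2))


def mean_and_variance(values):
    """Sample mean and the squared standard error of that mean."""
    return statistics.fmean(values), statistics.variance(values) / len(values)


rng = random.Random(42)
data = []
for _ in range(2000):
    x = rng.gauss(0.0, 1.0)
    y = 2.0 + x + rng.gauss(0.0, 1.0)          # true mean of y is 2.0
    observed = rng.random() < 1.0 / (1.0 + math.exp(x))  # MAR: depends on x only
    data.append((x, y if observed else None))

complete = [(x, y) for x, y in data if y is not None]

# Deterministic single imputation: fill each gap with its fitted value.
b0, b1, _ = fit_line(complete)
single = [y if y is not None else b0 + b1 * x for x, y in data]
single_est, single_var = mean_and_variance(single)

# Multiple imputation: refit on a bootstrap sample, then impute with noise.
m = 20
estimates, variances = [], []
for _ in range(m):
    boot = rng.choices(complete, k=len(complete))
    a0, a1, sd = fit_line(boot)
    filled = [y if y is not None else a0 + a1 * x + rng.gauss(0.0, sd)
              for x, y in data]
    est, var = mean_and_variance(filled)
    estimates.append(est)
    variances.append(var)

# Rubin's rules: pooled estimate, within- and between-imputation variance.
pooled_est = statistics.fmean(estimates)
within = statistics.fmean(variances)
between = statistics.variance(estimates)
total_var = within + (1.0 + 1.0 / m) * between

print(f"single imputation  : {single_est:.3f} (SE {math.sqrt(single_var):.3f})")
print(f"multiple imputation: {pooled_est:.3f} (SE {math.sqrt(total_var):.3f})")
```

Both approaches give similar point estimates here, but the deterministic single imputation treats filled-in values as if they were observed and so reports a standard error that is too small. The multiple imputation standard error is larger because it carries both the residual noise and the uncertainty in the fitted imputation model.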

Citation

Dumelle, M., R. Trangucci, A. Nahlik, Tony Olsen, K. Irvine, K. Blocksom, J. Ver Hoef, AND C. Fuentes. Missing data in ecology: Syntheses, clarifications, and considerations. Ecological Society of America, Ithaca, NY, 95(4):e70037, (2025). [DOI: 10.1002/ecm.70037]

Download(s)

DOI: Missing data in ecology: Syntheses, clarifications, and considerations
Last updated on February 17, 2026