Improving how missing data is handled in studies
4 March 2026
A new tool to help statisticians decide how to handle missing data in studies has been published by Bristol Biomedical Research Centre (BRC) researchers.
In health research, it’s common to have missing information in study data, especially in large studies about the risk factors for disease. This missing data can make it hard to get accurate results.
Researchers often deal with this problem using one of two methods. In complete records analysis (CRA), they analyse only the participants with full data. In multiple imputation (MI), they fill in the missing values several times with plausible values drawn from a statistical model, then combine the results across the filled-in datasets.
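For readers who want to see the difference between the two approaches, here is a minimal sketch in Python. It is not the Bristol team's tool or code. The simulated data, the function names and the bootstrap-based imputation step are all assumptions made for illustration, and real analyses would normally use dedicated statistical software.

```python
import numpy as np


def ols(x, y):
    """Fit y = a + b*x by least squares; return coefficients, residual variance, slope variance."""
    X = np.column_stack([np.ones(len(x)), x])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = float(resid @ resid) / (len(y) - 2)
    return beta, sigma2, sigma2 * xtx_inv[1, 1]


def complete_records_slope(x, y):
    """CRA: drop every participant whose outcome is missing, then fit the model."""
    observed = ~np.isnan(y)
    beta, _, var_slope = ols(x[observed], y[observed])
    return beta[1], var_slope


def multiple_imputation_slope(x, y, m=20, seed=0):
    """MI (simplified): impute missing outcomes m times and pool with Rubin's rules.

    Assumption: the imputation model is refitted on a bootstrap sample of the
    complete records each time, as a simple way to reflect uncertainty in it.
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(y)
    obs_idx = np.flatnonzero(observed)
    n_missing = int((~observed).sum())
    slopes, variances = [], []
    for _ in range(m):
        boot = rng.choice(obs_idx, size=obs_idx.size, replace=True)
        beta, sigma2, _ = ols(x[boot], y[boot])
        y_imp = y.copy()
        # Predicted value plus random noise, so imputed values vary like real ones.
        noise = rng.normal(0.0, np.sqrt(sigma2), n_missing)
        y_imp[~observed] = beta[0] + beta[1] * x[~observed] + noise
        b, _, v = ols(x, y_imp)
        slopes.append(b[1])
        variances.append(v)
    slopes = np.array(slopes)
    within = float(np.mean(variances))       # average variance within each imputed dataset
    between = float(slopes.var(ddof=1))      # variance of the estimates between datasets
    total = within + (1 + 1 / m) * between   # Rubin's rules total variance
    return float(slopes.mean()), total


def simulate(n=2000, seed=1):
    """Hypothetical data: true slope of 2, with 30% of outcomes missing at random."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)
    y[rng.random(n) < 0.3] = np.nan
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    print("CRA slope:", complete_records_slope(x, y)[0])
    print("MI slope: ", multiple_imputation_slope(x, y)[0])
```

In this simulation the missing values do not depend on anything in the data, so both methods should recover a slope close to the true value. The harder cases, where the reasons for missingness are related to the exposure or the outcome, are the ones the new tool is designed to help with.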
While MI can give reliable results under certain conditions, it’s not always easy to tell if these conditions are met, especially when many variables are incomplete. Understanding whether these methods produce biased (misleading) results is important when studying how different exposures, such as smoking, affect health outcomes.
Bristol BRC’s Translational Data Science team set out to address this challenge. They wanted to produce guidance and a tool to help researchers decide whether MI will result in misleading estimates of the relationship between an exposure and an outcome.
They developed an easy-to-follow tool that shows whether MI can be used without bias in a full dataset. They then applied the method to a real-world example looking at maternal smoking and children’s IQ, using data from Children of the 90s. This allowed them to check whether MI can also give unbiased results when the analysis is restricted to a part of the data (a subsample), in which some information is complete and some is imputed.
They used directed acyclic graphs, which are diagrams that map out the assumed causal relationships between variables, to help researchers visualise results and apply this method to their own studies. They also user-tested the tool to make sure it worked for anyone who might need to use it, including making changes so that it is accessible to people with colour blindness.
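To give a sense of what a directed acyclic graph looks like as data, here is a small sketch in Python. The variable names and arrows are hypothetical and are not taken from the published study, and the code does not reproduce the tool's rules for deciding whether MI is biased. It only checks that a diagram has no loops and lists which variables lead, directly or indirectly, to a value being missing.

```python
from collections import deque

# Hypothetical diagram: each variable maps to the variables it points to.
# "iq_missing" is an indicator for whether a child's IQ was recorded.
dag = {
    "maternal_education": ["maternal_smoking", "child_iq", "iq_missing"],
    "maternal_smoking": ["child_iq"],
    "child_iq": [],
    "iq_missing": [],
}


def is_acyclic(graph):
    """Return True if the graph has no directed cycles (Kahn's algorithm)."""
    indegree = {node: 0 for node in graph}
    for children in graph.values():
        for child in children:
            indegree[child] += 1
    queue = deque(node for node, degree in indegree.items() if degree == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in graph[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is visited only if no cycle blocks the ordering.
    return visited == len(graph)


def ancestors(graph, target):
    """Return every variable with a directed path into the target."""
    parents = {node: set() for node in graph}
    for node, children in graph.items():
        for child in children:
            parents[child].add(node)
    found, stack = set(), [target]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


if __name__ == "__main__":
    print("Valid DAG:", is_acyclic(dag))
    print("Causes of missing IQ:", sorted(ancestors(dag, "iq_missing")))
```

Writing the diagram down in this way makes the assumptions about why data are missing explicit, which is the starting point for the kind of assessment the tool supports.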
Dr Paul Madley-Dowd, lead researcher on the study and Research Fellow at the Bristol BRC, said:
“Our tool will help researchers make important decisions when looking at large data sets with missing information. This could be very important as access to more routinely collected data becomes available.
“We hope that our tool makes these decisions easier to make. I’d like to thank all the researchers who helped us user test it.”
The team have published clear guidance for researchers on how to use the tool.
Find out more
Using directed acyclic graphs to determine whether multiple imputation or subsample-multiple imputation estimates of an exposure-outcome association are unbiased