What Is a Health Technology Impact Evaluation? Methods for Field Programs
A research-based look at health technology impact evaluation methods for field programs, from trials and mixed methods to implementation and cost analysis.

The health technology impact evaluation methods field programs rely on all come back to one question: did the tool change anything that mattered once it left the pilot deck and met real people, real clinics, and real constraints? In global health, that question gets complicated fast. A screening app, decision-support tool, or contactless assessment workflow may look promising in a controlled test, yet the field story depends on uptake, supervision, referral completion, cost, and whether local teams can keep using it after outside support fades.
"Impact evaluation is the process of determining the extent to which observed changes in an outcome are attributable to an intervention." — J. Michel and Kim Schneider, Frontiers in Epidemiology (2025)
Health technology impact evaluation methods field programs use most often
The phrase sounds technical, but the logic is plain. Impact evaluation asks whether a health technology caused a meaningful change, for whom, and under what conditions. That is different from simple monitoring. Monitoring tells you how many screenings happened. Impact evaluation asks whether those screenings changed referral patterns, disease detection, worker behavior, or service use.
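To make that distinction concrete, here is a minimal sketch in Python. All counts and site figures below are invented for illustration; the point is only the shape of the two questions.

```python
# Minimal sketch: monitoring counts activity; impact evaluation compares
# outcomes against a counterfactual. All figures below are hypothetical.

screenings_done = 4_200  # monitoring answers "how many?"

# Impact evaluation needs a comparison: referral completion in sites
# using the tool versus matched sites without it.
completion_with_tool = 310 / 520      # completed referrals / referrals made
completion_without_tool = 195 / 480

effect = completion_with_tool - completion_without_tool
print(f"Monitoring: {screenings_done} screenings logged")
print(f"Impact estimate: {effect:+.1%} referral completion vs. comparison sites")
```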
A useful starting point comes from Alain Labrique, Smisha Agarwal, and colleagues, who argued in a 2018 Global Health: Science and Practice paper that digital health should be evaluated against standards that include the surrounding health-system "ecosystem," not just the software itself. I think that is one of the most important corrections in this literature. Field programs do not fail or succeed in isolation. They succeed inside governance systems, staffing realities, device access constraints, and community trust.
The WHO's 2019 guidance on monitoring and evaluating digital health interventions makes the same point in a more operational way. It recommends a stage-based approach, moving from feasibility and usability work toward effectiveness and impact evaluation as programs mature. That matters because not every field deployment is ready for a large trial on day one.
| Evaluation method | What it measures well | What it can miss | Best fit in field programs |
|---|---|---|---|
| Randomized or cluster-randomized trials | Causal effect on defined outcomes | Why results varied across sites | Mature interventions with stable workflows |
| Stepped-wedge trials | Real-world rollout effects across sites over time | Complex implementation details | Programs scaling district by district |
| Mixed-method evaluations | Outcome change plus lived experience | Clean attribution can be weaker | Complex programs with behavior change elements |
| Realist evaluation | What worked, for whom, and why | Precise effect size estimates | Multi-site programs with strong context differences |
| Economic evaluation | Cost-effectiveness, affordability, resource use | User experience and trust | Funders and ministries making scale decisions |
| Implementation frameworks such as RE-AIM | Reach, adoption, fidelity, maintenance | Narrow clinical endpoints alone | Long-term operational learning |
That table gets at the main issue. No single method is enough for most field programs.
- Trials help when you need a defensible estimate of effect.
- Mixed methods help when the workflow is changing as it rolls out.
- Realist evaluation helps explain why one district responds differently from another.
- Economic analysis helps answer the uncomfortable scale question: is this worth funding beyond the pilot?
Why field programs often combine methods instead of choosing one
In practice, health technology impact evaluation is usually layered. A program may begin with usability interviews, add routine monitoring, then move into a cluster trial or quasi-experimental design once enough sites are active.
That pattern shows up in the broader digital-health literature. In their 2024 JMIR scoping review, Anneloek Rauwerdink, Pier Spinazze, Harm Gijsbers, Juul Molendijk, Sandra Zwolsman, Marlies P. Schijven, Niels H. Chavannes, and Marise J. Kasteleyn reviewed 824 studies and found that the largest share focused on the effectiveness or impact phase. To me, the interesting part is not just the number. It is the imbalance. Many teams still rush toward outcome claims before the implementation side is stable enough to interpret those claims well.
A good field example is SMARThealth India, led by David Peiris, Devarsetty Praveen, Stephen Jan, Anushka Patel, and colleagues. The trial was a stepped-wedge cluster randomized study across 54 clusters and 62 villages in rural India. It tested a smartphone-enabled, nurse-led cardiovascular prevention strategy and found a significant reduction in predicted 10-year cardiovascular risk. That is the sort of design field programs often need: rigorous enough to estimate impact, but flexible enough to fit staged rollout in real communities.
For related reading on this microsite, see After the Scan: How Referral Pathways Work in the Field and How Health Screening Changes Clinic Visit Patterns in Rural Areas.
Industry applications for impact evaluation in global health field work
District-scale screening deployments
When a district introduces a new screening workflow, the first evaluation question is usually reach. Who was screened, and who was missed? The second question is more important: did referrals turn into care?
This is where cluster trials and routine-service comparisons can work well. If rollout is phased, stepped-wedge designs let researchers compare sites before and after implementation without denying eventual access to later districts. That design is attractive in public health because it respects the politics of scale-up while still producing a counterfactual.
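To show what that looks like in analysis terms, here is a short sketch of the standard stepped-wedge model (random cluster intercept, fixed period effects, a treatment indicator that switches on as each cluster crosses over). The data are simulated, not from any real deployment; this is one common analytic approach, not the only one.

```python
# Stepped-wedge sketch: clusters adopt the tool in staggered periods,
# and the model separates the tool's effect from calendar-time trends.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
clusters, periods = 12, 5
rows = []
for c in range(clusters):
    step = 1 + c % (periods - 1)          # period in which cluster c adopts
    cluster_effect = rng.normal(0, 0.5)   # random cluster-level variation
    for t in range(periods):
        treated = int(t >= step)
        # true model: secular trend + treatment effect + cluster + noise
        y = 10 + 0.3 * t + 1.5 * treated + cluster_effect + rng.normal(0, 1)
        rows.append({"cluster": c, "period": t, "treated": treated, "y": y})
df = pd.DataFrame(rows)

# Random intercept per cluster; fixed effects for calendar period.
model = smf.mixedlm("y ~ treated + C(period)", df, groups="cluster").fit()
print(f"Estimated effect of the tool: {model.params['treated']:.2f}")
```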
Community health worker programs
Community programs usually need more than effect estimates. They need to know whether workers adopted the tool, whether supervisors used the data, and whether the workflow survived weak connectivity or shared devices.
That is why implementation frameworks matter. Russell Glasgow's RE-AIM model looks at reach, effectiveness, adoption, implementation, and maintenance. Paired with Enola Proctor's implementation-outcomes work, it gives evaluators a way to ask harder field questions: Was the tool adopted? Was it used consistently? Did it last? Those are not side questions. In many deployments, they are the difference between a pilot result and a durable program.
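As a rough illustration of how those questions become indicators, here is a sketch computing RE-AIM-style metrics from routine monitoring counts. The field names and figures are invented; real programs would define each denominator carefully.

```python
# Hypothetical monitoring data for a community health worker deployment.
program = {
    "eligible_population": 18_000,
    "people_screened": 11_700,
    "workers_trained": 140,
    "workers_active_month_3": 112,
    "visits_logged": 9_400,
    "visits_following_protocol": 7_900,
    "workers_active_month_12": 84,
}

# RE-AIM-style indicators: who was reached, who adopted, how faithfully
# the workflow was followed, and whether use lasted.
reach = program["people_screened"] / program["eligible_population"]
adoption = program["workers_active_month_3"] / program["workers_trained"]
fidelity = program["visits_following_protocol"] / program["visits_logged"]
maintenance = program["workers_active_month_12"] / program["workers_trained"]

for name, value in [("reach", reach), ("adoption", adoption),
                    ("implementation fidelity", fidelity),
                    ("maintenance", maintenance)]:
    print(f"{name:>24}: {value:.0%}")
```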
Policy and donor decision-making
Funders do not just ask whether a technology worked once. They ask whether it is affordable, transferable, and worth backing at larger scale.
Economic evaluation helps here. Cost-effectiveness studies, budget impact analysis, and cost-utility work can show whether a technology produces enough value relative to staffing, device, training, and maintenance costs. I would argue this is the part teams avoid the longest, because it forces a program to confront its real operating model.
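A minimal sketch of the core calculation, an incremental cost-effectiveness ratio, follows. All costs and outcomes here are hypothetical placeholders, and a full analysis would add discounting, sensitivity ranges, and a budget impact view.

```python
# Incremental cost-effectiveness ratio (ICER): extra cost per extra unit
# of effect when the technology-supported workflow replaces standard
# practice. Figures are hypothetical.

def icer(cost_new: float, cost_old: float,
         effect_new: float, effect_old: float) -> float:
    """Incremental cost per additional unit of effect (e.g., per case found)."""
    return (cost_new - cost_old) / (effect_new - effect_old)

# Per-district annual figures: staffing + devices + training + maintenance,
# against confirmed cases detected.
cost_per_case = icer(cost_new=84_000, cost_old=61_000,
                     effect_new=410, effect_old=295)
print(f"Incremental cost per additional case detected: ${cost_per_case:,.0f}")
```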
Current research and evidence
The strongest recent writing on impact evaluation is pushing in the same direction: more methodological discipline, but also more honesty about complexity.
Michel and Schneider's 2025 framework paper lays out seven core tenets of impact evaluation, including a theory of change, stakeholder engagement, mixed-method indicators, baseline-midline-endline measurement, and validation. That sequence is useful because it reminds teams that impact is not something you discover at the end by glancing at one dashboard. You design for it from the start.
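One reason baseline design matters is simple arithmetic: without a baseline and a comparison group, you cannot net out background trends. A toy difference-in-differences sketch, with hypothetical survey values:

```python
# Hypothetical referral-completion rates from baseline and endline rounds
# in program and comparison districts.
baseline = {"program": 0.42, "comparison": 0.40}
endline  = {"program": 0.61, "comparison": 0.47}

change_program = endline["program"] - baseline["program"]
change_comparison = endline["comparison"] - baseline["comparison"]
did = change_program - change_comparison   # nets out shared background trends

print(f"Program change:    {change_program:+.0%}")
print(f"Comparison change: {change_comparison:+.0%}")
print(f"DiD estimate:      {did:+.0%}")
```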
Labrique and Agarwal's standards paper remains important because it widened the frame. They argued that digital health should be assessed across technical, behavioral, organizational, and systems dimensions. A tool may be usable and still fail to produce value if referral systems are weak. It may improve data quality and still fail on equity if only better-connected sites can keep it running.
Then there is the realist-evaluation tradition from Ray Pawson and Nick Tilley, which asks what works, for whom, in which circumstances, and how. That sounds almost obvious once you have spent time around field programs. Yet many evaluations still report average effects without explaining why one context outperformed another. In global health, those contextual differences are often the whole story.
A few evidence points stand out:
- WHO's first evidence-based digital health guideline, released in 2019, emphasized that much of the evidence base was still of low to moderate quality.
- Rauwerdink and colleagues found 824 studies in their scoping review, but the field still shows uneven use of evaluation approaches across the full implementation cycle.
- SMARThealth India's stepped-wedge trial covered 54 clusters and 62 villages, showing how rigorous designs can still work in staged rural deployments.
- Michel and Schneider's 2025 framework argues that many evaluations remain too weak to guide action, especially when they skip theory, baseline design, or stakeholder validation.
The future of health technology impact evaluation
I suspect the future of evaluation in field programs will be less about picking the perfect method and more about matching the method to the stage of the program. Early deployments need feasibility and workflow learning. Mid-stage programs need implementation and adoption evidence. Mature programs need impact and economic analysis that ministries and grant-makers can actually use.
That also means less obsession with a single headline metric. A field technology may reduce risk scores, increase case finding, improve worker productivity, or shorten referral delays. Those are not interchangeable outcomes. Good evaluation makes clear which outcome mattered, why it mattered, and what it cost to achieve.
Newer screening and monitoring platforms, including solutions like Circadify, fit into this broader shift. The key question is not whether digital health sounds innovative. It is whether evaluators can show, with methods serious enough for the field, that the technology changed care pathways in a way local health systems can sustain.
Frequently Asked Questions
What is a health technology impact evaluation?
It is a structured effort to determine whether a health technology caused a meaningful change in outcomes such as detection, referral completion, service use, cost, or program reach.
Which methods are most common in field programs?
Common methods include cluster randomized trials, stepped-wedge designs, mixed-method evaluations, realist evaluation, and economic analysis. Many field programs use more than one.
Why are randomized trials not always enough?
Because a trial can estimate effect without fully explaining adoption, trust, supervision, or why one setting performed differently from another. Field programs usually need implementation evidence too.
What does realist evaluation add?
It helps explain what worked, for whom, in which circumstances, and why. That is especially useful in multi-site global health programs where context shapes results.
When should cost-effectiveness be part of evaluation?
Usually before major scale decisions. If a program cannot show its staffing, training, device, and maintenance costs clearly, funders and ministries have a hard time deciding whether broader rollout is realistic.
