What’s changed in the Toolkit’s method and why it helps you

Dr Ciara Keenan and Dr Laura Knight

We’re moving from summaries based on existing reviews to a living, transparent system that re-checks the evidence and shows what works, for whom, and at what cost. The previous approach did something incredibly important: it brought a complex evidence base into one place and made it accessible for busy practitioners. We’re keeping that spirit in everything we do: clear summaries, plain language, and fair, balanced signals about uncertainty.

In this blog post we highlight what is changing and why we believe the new approach is clearer for practice, more attuned to children and young people in the UK context, and better at keeping up with an evolving youth justice landscape.

1) From snapshots in time to a living evidence base

The Toolkit does a really great job of pulling research together so busy people like you can see the headline takeaways quickly. That accessibility remains a core promise. What’s new is how we keep the evidence up to date and organised.
We now treat the Evidence & Gap Map (EGM) as the primary source feeding the Toolkit. It is being systematically updated using a published protocol so methods are transparent and repeatable.

Searches run in EPPI-Reviewer draw directly on newly published research (via OpenAlex), and EPPI-Reviewer’s priority screening brings the most relevant studies to the front. A team of humans still makes the final calls; the machine just saves time.
Because the EGM will be refreshed on a monthly schedule, it becomes a living resource.

Why this helps you: Fewer out-of-date strands, broader coverage of the youth justice landscape, and quicker incorporation of new, practice-relevant findings.

2) Fresh syntheses for each strand

For every Toolkit strand, we carry out our own systematic review and we meta-analyse any relevant primary studies ourselves. That gives a single, consistent estimate of impact using advanced methods agreed prior to analysis.

Evidence synthesis experts apply Robust Variance Estimation (RVE) where studies report multiple, related results, so the statistics don’t overstate certainty. Sensitivity checks and moderator analyses are used to probe what might change the size of effects (e.g., design quality, setting, participants).

Why this helps you: We no longer pull together numbers that were calculated in different ways by different teams. Instead, results are calculated in the same way across the Toolkit. That creates a fairer comparison across strands and a clearer picture of what works, for whom, and in which settings.

3) Measures that make sense for practice

We now use measures that are appropriate to the type of intervention and more intuitive to read.

For person-based approaches (e.g., mentoring, pre-court diversion), we can show clearly how many fewer young people are likely to be involved in violence. We express this as an Absolute Risk Reduction (ARR), a plain “percentage-point” reduction that’s easier to understand than odds ratios alone.
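
For readers who want to see the arithmetic, here is a minimal sketch of how an odds ratio can be turned into a percentage-point reduction. It uses the standard conversion from odds to risk and assumes you know the baseline risk in the comparison group. The function names and the example figures are hypothetical and are not taken from the Toolkit's technical reports.

```python
def risk_from_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Risk in the intervention group implied by a baseline risk and an odds ratio."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    # Convert the baseline risk to odds, apply the odds ratio, convert back to risk.
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    treated_odds = baseline_odds * odds_ratio
    return treated_odds / (1.0 + treated_odds)


def absolute_risk_reduction(baseline_risk: float, odds_ratio: float) -> float:
    """ARR as a proportion; multiply by 100 to read it in percentage points."""
    return baseline_risk - risk_from_odds_ratio(baseline_risk, odds_ratio)


# Hypothetical illustration only: 20% baseline risk, odds ratio of 0.75.
example_arr = absolute_risk_reduction(baseline_risk=0.20, odds_ratio=0.75)
```

With these made-up inputs, the ARR comes out at roughly 0.042, or about 4 percentage points: around 4 fewer young people in every 100 involved in violence. The same odds ratio gives a different ARR at a different baseline risk, which is one reason a percentage-point figure is easier to act on than an odds ratio alone.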

For place-based approaches (e.g., Problem-Orientated Policing, CCTV, street lighting), we use the Relative Incidence Rate Ratio (RIRR), which is designed for count data such as incidents in an area, and then align it with the Toolkit impact categories so strands are comparable at a glance.
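
As a rough illustration of how a ratio built from incident counts works, the sketch below compares the before-to-after change in a treated area with the change in a comparison area. This is one common way of forming such a ratio, and the direction used here (values below 1 mean incidents fell relative to the comparison area) is an assumption; the exact formula and conventions the Toolkit uses are set out in its technical reports, not here. The function names and counts are hypothetical.

```python
def relative_incidence_rate_ratio(
    treated_before: float,
    treated_after: float,
    comparison_before: float,
    comparison_after: float,
) -> float:
    """Change in the treated area divided by change in the comparison area.

    Assumed convention: a value below 1 means incidents fell in the treated
    area relative to the comparison area.
    """
    counts = (treated_before, treated_after, comparison_before, comparison_after)
    if any(count <= 0 for count in counts):
        raise ValueError("all incident counts must be positive")
    treated_change = treated_after / treated_before
    comparison_change = comparison_after / comparison_before
    return treated_change / comparison_change


def relative_percent_reduction(rirr: float) -> float:
    """Express the ratio as a percentage fall in incidents relative to comparison."""
    return (1.0 - rirr) * 100.0


# Hypothetical counts: treated area 100 -> 70 incidents, comparison area 100 -> 90.
example_rirr = relative_incidence_rate_ratio(100, 70, 100, 90)
example_reduction = relative_percent_reduction(example_rirr)
```

With these made-up counts, the ratio is about 0.78, which reads as incidents falling by roughly 22% more in the treated area than would be expected from the trend in the comparison area.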

Why this helps you: Person-based strands tell you “how many fewer young people are likely to be involved in violence,” while place-based strands tell you “how much incidents are likely to fall by.” Both are now mapped onto the familiar Toolkit categories you already recognise, so you can compare strands at a glance.

4) We’re clearer about confidence

We will continue to pair the impact rating with an evidence security rating. Because the security rating is now based on primary research rather than an existing review, we can capture not only (a) how many studies inform the impact rating, but also (b) the design of each study and (c) the quality of each study. Together with a few other pieces of information, these now form our evidence security rating. It’s shown with the familiar magnifying-glass icons on the front page and explained in the summary and technical report.

Why this helps you: You can spot, at a glance, whether a “high impact” label is built on lots of strong trials or on a smaller/less rigorous evidence base. That helps you decide whether to scale, test, or watch and wait.

5) We say more about who benefits

Wherever the data allow, we report whether effects differ for specific groups of children and young people (for example by gender, ethnicity, special educational needs or disabilities (SEND), or care experience) and how closely the evidence reflects UK practice and settings. We also flag how much evidence comes from the UK versus elsewhere and prioritise UK findings in narratives when quality allows.

Why this helps you: Decisions are more attuned to the young people you serve, not just an average effect drawn from very different contexts.

6) Costs you can plan around

Costs continue to be presented with accessible and transparent bands (Low/Medium/High) and a short narrative explaining what’s included. Where place-based work makes “per participant” costing tricky, we say so and explain typical resource drivers (e.g., police time, infrastructure). UK costs are prioritised; international costs are converted and flagged.

Why this helps you: Commissioners can weigh impact and affordability on the same page, with caveats clearly spelled out.

7) Three levels of detail, same plain language

Nothing changes about the layered design that practitioners told YEF they value:

Front page: short description + three key ratings (impact, evidence security, cost).
Summary page: what it is, is it effective?, who it works for, how secure?, how to implement, costs, UK projects, take-away messages.
Technical report: full methods and numbers ‘under the bonnet’.

Why this helps you: You can stay high-level or dive deep into the underlying research and data, without losing the thread between the three.

Conclusion

The Toolkit still delivers what practitioners value: clear, fair summaries you can act on with confidence. What has changed is that the way we build them is now stronger. By running our own systematic reviews and meta-analyses, using measures that fit both person-based and place-based approaches, showing how confident we are in the findings, and being explicit about who benefits (and in which UK settings), we offer a more precise picture to inform decision-making. Costs are set out simply, with honest caveats, so you can weigh impact and affordability together.

We’re building on solid foundations, not replacing them. The result is a living, transparent resource that keeps pace with new studies and stays focused on improving outcomes for children and young people. If you’re commissioning, designing, or refining services, the updated Toolkit should make your next steps much clearer.