Consultation on the future approach to quality regulation
Published 18 September 2025
Annex G: Data annex
Introduction
- This annex provides further detail about the proposed indicators and how they might be presented. It also sets out initial proposals for the tests we could use to determine whether a provider has sufficient indicators for use in TEF assessments (expanding on Proposal 9).
Presentation of indicators
- The data measures that we are currently considering using as evidence for assessing each aspect in the first cycle of future TEF assessments are set out in Table G1.
- We expect to consult on measures for taught postgraduate and modular provision before using them in the second cycle of assessments. That would include any related changes to the presentation of the indicators.
Table G1: Proposed measures by aspect of assessment
Student experience | Student outcomes
- We intend to retain the current approach to benchmarking the TEF indicators. Benchmarking takes account of the characteristics of a provider’s students and courses. It enables the assessment to consider the quality of the student experience and outcomes for a provider’s particular mix of students and courses. We envisage applying the existing methodology to the additional post-study measures we are proposing. As part of the second stage consultation in 2026, we intend to consult on the details of these new measures, including how they would be benchmarked, and on any adjustments to benchmarking for any other indicators.
- We intend to retain the current approach of presenting ‘overall’ TEF indicators that group all undergraduate levels of study together, and aggregate the data over a four-year period. These overall indicators would be presented separately for each mode of study (full-time, part-time, and apprenticeships if applicable).
- The overall indicators would also be ‘split’ to show different levels of study, subjects, student characteristics, years and other groupings. We intend to review existing split indicators to identify which are most relevant and whether there might be scope to remove any to reduce complexity in the dataset. We would consult on any proposed changes to split indicators as part of the second stage consultation in 2026.
- We are also proposing the data dashboards would present the indicators for two views of each provider’s student population:
- taught only (students that the provider teaches)
- subcontracted out and validated only (students taught by others through partnership arrangements).
- As described under Proposal 3: Provision in scope, we consider that presenting indicators separately for taught and partnership provision would increase the visibility of any differences in quality between these two groups. We are also considering introducing additional splits for provision delivered through partnerships, which would break down the data by each named partnership.
- For student outcomes, our intention is to explore how we can integrate or simplify the data dashboards to support an integrated assessment of B3 as part of the assessment of student outcomes in the TEF. Within the approach described above we aim to develop a means of identifying for users where student outcomes do not meet the minimum numerical thresholds, which are set for each level of study.[22]
Sufficiency of indicator data for use in TEF assessments
- Proposal 9 sets out how, where there is insufficient data, we might vary our approach to assessing student experience, or not rate student outcomes. We propose that we should develop rules or tests, to be applied consistently across all providers, to determine whether the data is sufficient for use in the assessment of each aspect.
- We are proposing to develop sufficiency tests based on:
- Coverage – whether an indicator covers a substantial percentage of the provider’s students.
- Statistical confidence – whether there is sufficient statistical confidence in an individual indicator.
We would then test, for the aspect as a whole, whether there are enough indicators that meet both the coverage and statistical confidence tests to inform an assessment of that aspect.
- We have carried out analysis to identify the potential effect of our proposals to vary the assessment approach, where there is insufficient data. For this analysis we have applied initial assumptions about what might be appropriate thresholds for each of these tests, as set out below. We invite comments on how we should define whether the data is sufficient, and the tests we are proposing to carry out under Proposal 9.
- In our analysis, we have so far only considered indicators for the ‘Taught only’ view, which includes all students taught by the provider. We could consider in future also including students for which the provider has responsibility under partnership arrangements.
- In the analysis we considered the indicators in whichever modes of study the provider has a substantial proportion of its students. Rather than just looking at indicators in the ‘majority mode’ for a provider, we looked at indicators in any modes that represented at least 35 per cent of the provider’s students.[23]
This means, for example:
- If 70 per cent of a provider’s undergraduate students were full-time, 20 per cent part-time and 10 per cent on apprenticeships, we looked at the full-time indicators for this provider.
- If 45 per cent of a provider’s undergraduate students were full-time, 40 per cent part-time and 15 per cent on apprenticeships, we looked at both the full-time and part-time indicators for this provider.
- We took this approach rather than looking at the ‘majority mode’ because not all providers have a single ‘majority’ mode, and where the balance between two modes is almost even, a majority-mode approach could exclude a near majority of students from consideration. We consider it more useful to test all indicators that represent a substantial proportion of students.
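For illustration only, the mode-selection rule described above could be expressed as the following sketch. The function name and data shapes are hypothetical, and the 35 per cent threshold is the assumption we applied in our analysis:

```python
def modes_to_consider(headcounts, threshold=0.35):
    """Return the modes of study whose share of the provider's students
    is at least the threshold (35 per cent in our analysis).

    headcounts: mapping of mode name to undergraduate headcount,
    e.g. {"full-time": 700, "part-time": 200, "apprenticeship": 100}.
    """
    total = sum(headcounts.values())
    if total == 0:
        return []
    return [mode for mode, n in headcounts.items() if n / total >= threshold]

# The two worked examples above:
modes_to_consider({"full-time": 70, "part-time": 20, "apprenticeship": 10})
# → ["full-time"]
modes_to_consider({"full-time": 45, "part-time": 40, "apprenticeship": 15})
# → ["full-time", "part-time"]
```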
Sufficient coverage
- The indicators we propose to use in the TEF do not cover all undergraduate students, and the coverage can vary for different providers. When considering whether the data is sufficient to inform an assessment, we need to consider how to account for circumstances where the coverage of a provider’s indicator is limited. For example, the NSS does not include students on courses that are one year or less in length, so NSS indicators will be less representative of the student population at providers that offer large numbers of these courses.
- In our analysis, we considered an NSS indicator to have sufficient coverage if at least half of the provider’s students in the relevant mode are on courses more than one year in length. We considered this the simplest test of whether the majority of students in that mode are likely to be invited to complete the NSS.
- We did not apply coverage tests to the student outcomes indicators. The way in which the continuation and completion measures are defined means that we could be confident they represent the majority of the provider’s students. However, we would envisage applying a coverage test to the progression measure and any additional post-study measures we develop where this would be appropriate. We would consult on this as part of our detailed proposals prior to implementation.
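As an illustrative sketch of the NSS coverage test applied in our analysis (the function name and input are hypothetical, and the 50 per cent threshold is the assumption described above):

```python
def nss_coverage_sufficient(share_in_scope):
    """Coverage test used in our analysis: the NSS indicators have
    sufficient coverage if at least half of the provider's students in
    the relevant mode are on courses within scope of the NSS
    (i.e. courses more than one year in length).

    share_in_scope: percentage of students in the mode on in-scope courses.
    """
    return share_in_scope >= 50

nss_coverage_sufficient(80)  # → True
nss_coverage_sufficient(35)  # → False
```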
Sufficient statistical confidence
- We currently use four indicative categories to describe the strength of statistical evidence we use in judgements about student outcomes and in TEF assessments:
- Around 99 per cent statistical confidence would provide compelling statistical evidence on which to make regulatory judgements.
- Around 95 per cent or higher statistical confidence would provide very strong statistical evidence on which to make regulatory judgements.
- Around 90 per cent or higher statistical confidence would provide strong statistical evidence on which to make regulatory judgements.
- Around 80 per cent or higher statistical confidence would provide probable statistical evidence on which to make regulatory judgements.[24]
- Our view is that, because the TEF assessments consider multiple indicators to inform judgements rather than relying on a single indicator, we should regard all indicators that provide strong, very strong or compelling statistical evidence as materially contributing to the assessment. (This is consistent with our current approach to B3 assessments, where we would begin to consider whether a provider is failing to meet minimum requirements when we have around 90 per cent or higher statistical confidence that an indicator falls below the relevant numerical threshold.)
- When interpreting the level of quality suggested by an indicator, the TEF assessors would consider whether an indicator is: materially below benchmark; broadly in line with benchmark; or materially above benchmark (the three ‘zones of performance’). We consider that an indicator can meaningfully contribute to the assessment if it sits within no more than two zones of performance, with at least strong statistical confidence.
- For example, if the ‘Proportion of statistical uncertainty distribution’ for an indicator is 70 per cent broadly in line with benchmark and 30 per cent materially above benchmark, there is compelling statistical evidence indicating at least high quality. This would be considered alongside contextual information or evidence in the submissions that might confirm an assessment of high quality, or might shift the judgement to outstanding quality. This would then be considered alongside a range of other indicators and evidence, to inform a broad overall judgement of the aspect as a whole.
- In our analysis, we therefore consider an indicator to provide sufficient statistical confidence to contribute to an assessment if we have around 90 per cent or higher statistical confidence that it is either within one zone of performance, or spans no more than two zones of performance.
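For illustration only, the statistical confidence test described above could be sketched as follows. The function name and data shapes are hypothetical, and ‘around 90 per cent’ is treated as a simple threshold of 90 for the purposes of the sketch:

```python
def sufficient_confidence(below, in_line, above, threshold=90):
    """Test whether around 90 per cent or higher of the statistical
    uncertainty distribution sits within one zone of performance, or
    spans no more than two (adjacent) zones.

    below, in_line, above: percentages of the distribution that are
    materially below, broadly in line with, and materially above
    benchmark; they should sum to approximately 100.
    """
    one_zone = max(below, in_line, above)
    two_adjacent_zones = max(below + in_line, in_line + above)
    return one_zone >= threshold or two_adjacent_zones >= threshold

# 'Teaching on my course' in Table G2: 95% in a single zone
sufficient_confidence(0, 5, 95)    # → True
# 'Assessment and feedback' in Table G2: no pair of adjacent zones reaches 90%
sufficient_confidence(20, 50, 30)  # → False
```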
Sufficient number of measures
- As shown in Table G1, we are proposing to use multiple indicators when assessing each aspect. We have initially considered how many measures with sufficient coverage and sufficient statistical confidence would be needed, as a minimum, for an assessment of student outcomes or to contribute to the assessment of student experience (without needing the alternative means of gathering student views). Our initial view, which we used for the purpose of our analysis, is:
- We would have sufficient data to rate the student outcomes aspect for a provider if we have indicators with sufficient coverage and statistical confidence for the continuation measure and at least one other measure. In practice, there are unlikely to be many cases where we have sufficient confidence in another measure and not in the continuation measure.
- We would have sufficient NSS indicators to inform the assessment of student experience (without the alternative means of gathering student views) if we have at least three NSS-based indicators with sufficient coverage and statistical confidence.
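As a hypothetical sketch combining the two aspect-level rules above (the function names and data shapes are illustrative, and the thresholds are the initial assumptions used in our analysis):

```python
def sufficient_for_student_experience(nss_indicators_passing):
    """Sufficient NSS data for the student experience aspect if at
    least three NSS-based indicators pass both the coverage and
    statistical confidence tests."""
    return nss_indicators_passing >= 3

def sufficient_for_student_outcomes(passing_measures):
    """Sufficient data to rate the student outcomes aspect if the
    continuation measure and at least one other measure pass both
    the coverage and statistical confidence tests.

    passing_measures: set of measure names passing both tests,
    e.g. {"continuation", "completion"}.
    """
    return "continuation" in passing_measures and len(passing_measures) >= 2

sufficient_for_student_experience(4)                      # → True
sufficient_for_student_outcomes({"continuation", "completion"})  # → True
sufficient_for_student_outcomes({"continuation"})         # → False
```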
Likely effect of these proposals
- We set out below the likely effect of our proposals, based on the tests that we applied in our analysis. Adjusting the thresholds for the tests, or the indicators to which they apply, would affect these estimates.
Student outcomes
- Based on initial assumptions and analysis, there are currently:
- 349 providers with sufficient data for student outcomes to be assessed
- 27 providers without sufficient outcomes data to be assessed.
- In addition to this, there are currently 23 relatively new or recently registered providers with undergraduate students that do not yet have any student outcomes data. While these providers will have accumulated outcomes data by the time we carry out the first cycle of assessments under the new scheme, it is not possible to know now how many of them would have sufficient data by 2027-28 to 2029-30. We also expect new providers to continue to register in the intervening period, and on an ongoing basis, so at any point we can anticipate there being some providers without student outcomes data.
- Our current estimate, based on the sufficiency criteria we have used, is that we could expect between 10 and 15 per cent of registered providers not to have sufficient data for a student outcomes assessment. It is worth noting that our current analysis is based on the continuation measure plus either or both of completion and progression. Once data indicators are available for the proposed additional measures, this could increase the number of providers where data meets the criteria for a student outcomes assessment.
Student experience
- Based on initial assumptions and analysis, there are currently:
- 260 providers with sufficient NSS indicator data
- 102 providers without sufficient NSS indicator data.
- In addition to this, there are currently 46 providers with undergraduate provision but without NSS data. These include newer providers that do not yet have students who have finished their courses, providers with very small cohorts where data has been suppressed, providers where response rate publication thresholds have not been met, and any providers where none of their higher education courses fall within scope of the NSS. As with student outcomes, it is possible that the number of providers without data will fluctuate in future.
- Based on the current numbers, we would need to use the alternative means of gathering student views to inform assessment of the student experience for around a third of providers. However, this initial analysis used the three years of available response data for the current NSS questionnaire. By the time we carry out assessments under the new scheme, we will have four years of data, and we expect confidence in the overall indicators to increase as a result. Our initial modelling suggests this would increase the number of providers with sufficient NSS indicator data to 315, meaning that we would need to use the alternative means of gathering student views to inform assessment of the student experience for around a quarter of providers.
Examples of applying proposed tests for data sufficiency
- Below we set out some examples of how the proposed sufficiency rules would apply for a number of illustrative scenarios. These use the same thresholds that we used in our analysis and are intended to support understanding of our proposals.
Examples: Student experience
Provider A
Coverage
78 per cent of Provider A’s undergraduate provision is full-time, so we would look at the coverage of its full-time NSS indicators. It has 5,000 full-time students.
80 per cent of Provider A’s full-time undergraduate students are on courses within scope of the NSS, so the NSS-based indicators would have sufficient coverage to inform an assessment and we would move on to apply the rules for statistical confidence.
Statistical confidence
Table G2 shows the proportion of the statistical uncertainty distribution that falls within each zone of performance for each full-time NSS indicator for Provider A.
Table G2: Statistical uncertainty distribution for Provider A’s NSS indicators
| Indicator | Materially below benchmark (%) | Broadly in line with benchmark (%) | Materially above benchmark (%) | Sufficient statistical confidence to contribute to an assessment? |
| --- | --- | --- | --- | --- |
| Teaching on my course | 0 | 5 | 95 | Yes (around 90% or higher statistical confidence in one zone of performance) |
| Learning opportunities | 0 | 35 | 65 | Yes (around 90% or higher statistical confidence spanning two zones of performance) |
| Assessment and feedback | 20 | 50 | 30 | No (we do not have 90% or higher statistical confidence spanning two zones of performance) |
| Academic support | 0 | 40 | 60 | Yes (around 90% or higher statistical confidence spanning two zones of performance) |
| Learning resources | 0 | 35 | 65 | Yes (around 90% or higher statistical confidence spanning two zones of performance) |
| Student voice | 17 | 37 | 46 | No (we do not have 90% or higher statistical confidence spanning two zones of performance) |
Provider A has four NSS indicators with sufficient statistical confidence to contribute to an assessment and two NSS indicators where we do not have this level of confidence.
Conclusion
The tests would mean that Provider A has sufficient NSS indicators with sufficient coverage and statistical confidence to inform the assessment of student experience (without the alternative means of gathering student views).
Provider B
Coverage
85 per cent of Provider B’s undergraduate provision is part-time, so we would look at the coverage of its part-time NSS indicators. It has 250 part-time students.
35 per cent of Provider B’s part-time undergraduate students are on courses within scope of the NSS, so the rules would mean that the NSS-based indicators have insufficient coverage to inform an assessment (without alternative student views).
Statistical confidence
Because Provider B’s indicators have insufficient coverage to inform an assessment, we would not consider statistical confidence.
Conclusion
The tests would mean that Provider B has insufficient NSS indicators with sufficient coverage and statistical confidence to inform the assessment of student experience on their own. We would therefore gather student views through alternative means.
Examples: Student outcomes
Provider A
Coverage
78 per cent of Provider A’s undergraduate provision is full-time, so we would look at the coverage of its full-time student outcomes indicators. It has 5,000 full-time students.
Over 90 per cent of the provider’s full-time undergraduate students are within the population used for the Continuation and Completion indicators.
75 per cent are within the target population for the Graduate Outcomes survey, used for the progression indicator and the new ‘Use of skills’ indicator.
80 per cent are within the sample for the new salary indicator.
All of these indicators have sufficient coverage to inform an assessment and we would move on to consider statistical confidence for each of them.
Statistical confidence
Table G3 shows the proportion of the statistical uncertainty distribution that falls within each zone of performance for each full-time student outcomes indicator for Provider A.
Table G3: Statistical uncertainty distribution for Provider A’s student outcomes indicators
| Indicator | Materially below benchmark (%) | Broadly in line with benchmark (%) | Materially above benchmark (%) | Sufficient statistical confidence to contribute to an assessment? |
| --- | --- | --- | --- | --- |
| Continuation | 0 | 3 | 97 | Yes (around 90% or higher statistical confidence in one zone of performance) |
| Completion | 2 | 6 | 92 | Yes (around 90% or higher statistical confidence in one zone of performance) |
| Progression | 20 | 25 | 55 | No (we do not have 90% or higher statistical confidence spanning two zones of performance) |
| Use of skills | 5 | 55 | 40 | Yes (around 90% or higher statistical confidence spanning two zones of performance) |
| Salary | 10 | 45 | 45 | Yes (around 90% or higher statistical confidence spanning two zones of performance) |
Provider A has four student outcomes indicators with sufficient statistical confidence to contribute to an assessment, one of which is for the continuation measure.
Conclusion
The rules would mean that we have sufficient data to assess and rate the student outcomes aspect for Provider A as we have indicators with sufficient coverage and statistical confidence for the continuation measure and at least one other measure.
Provider B
Coverage
85 per cent of Provider B’s undergraduate provision is part-time, so we would look at the coverage of its part-time student outcomes indicators. It has 250 part-time students.
Over 90 per cent of Provider B’s part-time undergraduate students are within the population used for the Continuation and Completion indicators.
65 per cent are within the sample for the Graduate Outcomes survey, used for the progression indicator and the new ‘Use of skills’ indicator.
70 per cent are within the sample for the new salary indicator.
All of these indicators have sufficient coverage to inform an assessment and we would move on to consider statistical confidence for each of them.
Statistical confidence
Table G4 shows the proportion of the statistical uncertainty distribution that falls within each zone of performance for each part-time student outcomes indicator for Provider B.
Table G4: Statistical uncertainty distribution for Provider B’s student outcomes indicators
| Indicator | Materially below benchmark (%) | Broadly in line with benchmark (%) | Materially above benchmark (%) | Sufficient statistical confidence to contribute to an assessment? |
| --- | --- | --- | --- | --- |
| Continuation | 8 | 55 | 37 | Yes (around 90% or higher statistical confidence spanning two zones of performance) |
| Completion | 45 | 35 | 20 | No (we do not have 90% or higher statistical confidence spanning two zones of performance) |
| Progression | 30 | 45 | 20 | No (we do not have 90% or higher statistical confidence spanning two zones of performance) |
| Use of skills | 35 | 40 | 25 | No (we do not have 90% or higher statistical confidence spanning two zones of performance) |
| Salary | 50 | 30 | 20 | No (we do not have 90% or higher statistical confidence spanning two zones of performance) |
Provider B has one student outcomes indicator with sufficient statistical confidence to contribute to an assessment, for the continuation measure.
Conclusion
The rules would mean we have insufficient data to assess or rate the student outcomes aspect for Provider B. While all indicators have sufficient coverage, because of the small size of the student population we only have sufficient statistical confidence in the indicator for one measure.
Notes
[22] The numerical thresholds that currently apply are available at OfS, Setting numerical thresholds for condition B3.
[23] Based on the most recent ‘Size and shape of provision’ data, available at OfS, Size and shape of provision data dashboard.
[24] More information about what creates statistical uncertainty in the data indicators and our view of this can be found in Description and definition of student outcome and experience measures - Office for Students.