Data Questions & Answers
Here you can find the answers to some of the questions researchers most frequently ask us
If you have a question that isn’t answered here, please get in touch via our Contact Us page. We also have more information for patients and the public here.
How do I apply to the Gut Reaction Hub to access the data?
Gut Reaction is the programme of work bringing together integrated data on patients with IBD. The lead organisation for Gut Reaction data is Cambridge University Hospitals (CUH), which is also the data controller for all datasets held by NIHR BioResource. You can find more information on our Accessing Data page. An application form to access the data can be found on the NIHR BioResource website.
The IBD Registry holds additional datasets not currently linkable to Gut Reaction that are available through a separate application process. You can find more information on the Registry’s data application page, together with an application form to access these datasets.
Are the processes for commercial and academic applications the same?
Under our current data access procedure, all applications for NIHR BioResource data from commercial organisations require review by the national Steering Committee. Academic study requests may be reviewed by the Data Access Committee or the Steering Committee, depending on the level of complexity.
Further details of this process can be found here.
Any datasets applied for through the IBD Registry will follow the requirements of the Registry.
Which organisation would a commercial contract be with, and which would therefore be responsible for providing data access and meeting contractual obligations such as quality checks and timelines?
All contracts would be between Cambridge University Hospitals and the institution which employs the applicant(s).
Who can request access to the data?
Both academic and commercial researchers can request access to the data. International collaborations will be considered.
Can I extract / download the data to analyse it within my own systems?
Extensive engagement with our Patient Advisory Committee (PAC) has told us that its members are happiest with their de-identified data being accessed from within our secure Trustworthy Research Environment (TRE).
A variety of data analysis tools are available within the TRE. If you have bespoke tools which you require, we can work with you to include these within the TRE. Any data generated by the research team as part of a participant recall study and the results of any study may be held securely by the researchers in question within their own systems.
Requests to analyse de-identified data in secure systems not owned by Gut Reaction will only be considered under very specific circumstances and will require approval by our PAC and data management team prior to a data access request.
Are proof-of-concept studies possible?
Yes, you may apply to run a proof-of-concept study.
What regions of the UK have you collected data from?
Gut Reaction has data from consented patients across England, Scotland and Wales. The IBD Registry also has datasets from Northern Ireland.
Will the datasets being provided by NHS Trusts and NHS Digital be available for all causes of hospital admission, or only those relating to IBD?
Hospital Episode Statistics (HES) from NHS Digital are available for all causes of hospital admission. However, studies requiring access to these data will need specific permission from NHS Digital.
Will mortality data be collected?
Patient activity data in HES can be used to identify whether a patient died in hospital. Deaths recorded in HES may be analysed by the main diagnosis for which the patient was being treated; however, these data alone cannot be used to determine the underlying cause of death. Data concerning mortality following discharge are not currently available. We are investigating with NHS Digital whether it will be possible to use the Personal Demographics Service (PDS) for this purpose, but only where BioResource participants are being contacted about a potential recall study.
Will the data be curated?
Please see our ‘available data sets’ for details of the datasets and their curation.
Can I submit a single data access request which would allow repeated access to the data for a range of purposes?
No. An application must be submitted for each proposed study.
Each application must have a clear aim and specify the datasets that you would like to access. The data must only be used for the purpose specified in the study application. Applications requesting access to all the data for a study without specific aims will be declined. We continue to work closely with our Patient Advisory Committee, who have provided extensive feedback and are clear that these requirements are imperative to continued patient support for the project.
Is it possible to obtain the size of a cohort, using top-level inclusion and exclusion criteria, to gauge feasibility before applying?
A cohort discovery tool will soon be available for this purpose. Cohort discovery can currently be undertaken with our team prior to a data access application. If significant costs are incurred in undertaking work before a project starts, these may need to be recovered in the contract for the project.
With regard to intellectual property ownership, who is classed as the owner of the data following a study?
The data controller for all existing datasets is Cambridge University Hospitals. Results of a study will belong to the organisation undertaking the research. Where downstream commercialisation of intellectual property is anticipated, or significant cost savings are realised by the applying organisation, this may be reflected in the commercial arrangements to ensure fair value for the data is returned to the NHS.
Is the sharing of commercial revenues agreed before the study on an ad hoc basis, or is there a fixed pricing structure and duration under which this cost to industry is applied?
Any revenue share will be agreed once the input of the data into a commercial product is known. Factors to be considered will be:
- Intellectual Property (IP): who is contributing the Background IP, and who is/are creating the Foreground IP
- How the data contributes to the final product
- Granularity of data requested; increased granularity or specificity of a data set (depth of phenotyping, number of data points) increases its value