A dataset of radiotherapy planning CT scans was generated after informed consent from patients undergoing radiotherapy treatment planning at Tata Medical Center. The planning CT scans include the structure sets drawn by radiation oncologists and used for radiotherapy treatment planning. All planning CT scans were acquired on a single CT scanner. In addition, the radiation plan and dose data are available for some patients. The dataset also contains clinical data on the patients, linked to the images.

Background and Summary

Radiation therapy in the form of external beam radiotherapy is part of the cancer treatment of nearly 60% of the cancer patients treated worldwide. The delivery of radiation therapy requires acquisition of a planning CT scan with the patient in the treatment position, followed by segmentation, in which trained oncologists delineate the target volume and the organs at risk. As a result, this form of imaging data is already manually segmented and is well suited for radiomic feature extraction. After segmentation, patients undergo treatment planning, in which an individualized treatment plan is created for the patient. The dose information is linked to the structure set, so that a complete three-dimensional evaluation of the dose received by the irradiated volume is feasible from the dataset. This opens up the possibility of investigating dosiomics, a method in which the spatial dose heterogeneity is examined for its impact on treatment-related toxicity and cancer control.
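
To make the three-dimensional dose evaluation mentioned above more concrete, the following minimal sketch computes the mean dose and one cumulative dose-volume point (the fraction of a structure receiving at least a given dose) for a single structure. It assumes that the dose grid and the binary structure mask have already been resampled onto the same voxel grid and flattened to equal-length sequences. The function names and the toy values are illustrative only and are not part of the dataset.

```python
def structure_doses(dose, mask):
    """Return the dose values (Gy) of the voxels inside the structure.

    `dose` and `mask` are equal-length flat sequences on the same voxel grid.
    This is an assumption: real RT dose grids usually need resampling onto
    the CT/structure grid before such a comparison is possible.
    """
    if len(dose) != len(mask):
        raise ValueError("dose and mask must cover the same voxels")
    values = [d for d, inside in zip(dose, mask) if inside]
    if not values:
        raise ValueError("structure mask is empty")
    return values


def mean_dose(dose, mask):
    """Mean dose (Gy) over the structure."""
    values = structure_doses(dose, mask)
    return sum(values) / len(values)


def volume_fraction_at_least(dose, mask, threshold_gy):
    """Fraction of the structure volume receiving >= threshold_gy.

    Assumes all voxels have equal volume, so counting voxels is enough.
    """
    values = structure_doses(dose, mask)
    return sum(1 for d in values if d >= threshold_gy) / len(values)


if __name__ == "__main__":
    # Toy example: six voxels, four of which lie inside the structure.
    dose = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
    mask = [0, 1, 1, 1, 1, 0]
    print("Mean dose (Gy):", mean_dose(dose, mask))
    print("V20Gy fraction:", volume_fraction_at_least(dose, mask, 20.0))
```

Evaluating the second function over a range of thresholds gives a cumulative dose-volume histogram, which summarises the dose distribution but discards the spatial information that dosiomic features aim to retain.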

This dataset comprises the following:

  1. Planning CT scans 
  2. Structure sets drawn on the planning CT scans for treatment planning. 

In addition, clinical, pathological, staging and treatment data are provided for the patients.


After informed consent was obtained, patients' radiotherapy planning images were exported from the Varian treatment planning system along with the structure set and the treatment plan. For the proportion of patients treated with Tomotherapy, treatment plans were not available because they had not been exported into the Varian treatment planning system. Clinical data were obtained from the hospital medical records and entered into a REDCap database. The data elements captured were aligned with the CHAVI database structure. Both clinical and radiological data were de-identified using the CHAVI DDiS software before being uploaded to CHAVI.

Data Records

Imaging Data: The imaging data comprise the following:

  1. Planning CT scans: Acquired on a GE LightSpeed 16-slice CT scanner used for radiotherapy treatment planning. Slice thickness varies between patients, but most images were acquired with a slice interval of 2.5 mm. The field of view was kept at the maximum of 50 cm for the radiation planning CT scans. IV contrast was used for patients who required it clinically; oral contrast was generally not used. The exported images are those that were used for treatment planning.
  2. Structure sets: Each RT planning image has a structure set comprising manually segmented structures corresponding to the target volume and the organs at risk used for treatment planning. For most patients, the structure set was drawn by one radiation oncologist and reviewed for accuracy by a senior radiation oncologist before approval. Only approved structure sets have been exported.
  3. Treatment plans and doses: RT structure sets, plans and doses are made available in DICOM format from the treatment planning system; plans and doses are available only for some patients.
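
As a rough illustration of how the imaging records listed above might be inventoried before analysis, the sketch below groups DICOM objects per patient by their Modality value and reports which patients also carry plan and dose data. CT, RTSTRUCT, RTPLAN and RTDOSE are the standard DICOM modality codes for these object types. The record layout (pairs of patient identifier and modality), the patient identifiers and the function name are assumptions made for illustration; in practice these values would be read from the DICOM headers with a DICOM library.

```python
from collections import defaultdict

# Standard DICOM Modality (0008,0060) codes for the object types in the dataset.
REQUIRED = {"CT", "RTSTRUCT"}    # planning CT and approved structure set
OPTIONAL = {"RTPLAN", "RTDOSE"}  # available only for some patients


def inventory(records):
    """Summarise which DICOM object types exist for each patient.

    `records` is an iterable of (patient_id, modality) pairs. This layout is
    a hypothetical stand-in for values read from DICOM headers.
    """
    per_patient = defaultdict(set)
    for patient_id, modality in records:
        per_patient[patient_id].add(modality.upper())

    summary = {}
    for patient_id, modalities in per_patient.items():
        summary[patient_id] = {
            # Object types that should be present but were not found.
            "missing_required": sorted(REQUIRED - modalities),
            # True only when both the plan and the dose are present.
            "has_plan_and_dose": OPTIONAL <= modalities,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical records for two patients.
    demo = [
        ("P001", "CT"), ("P001", "RTSTRUCT"), ("P001", "RTPLAN"), ("P001", "RTDOSE"),
        ("P002", "CT"), ("P002", "RTSTRUCT"),
    ]
    for pid, info in sorted(inventory(demo).items()):
        print(pid, info)
```

A check of this kind helps separate the patients suitable for dose-based analyses from those for whom only the CT and structure set can be used.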

Data from 80 patients have been uploaded, with the following distribution of anatomical sites:

Anatomical Site                   Number
Hematological Malignancies        5
Central Nervous System            10
Lower Gastrointestinal Tract      17
Male Genital System               19
Upper Gastrointestinal Tract      6
Head Neck                         16

Clinical Data: Clinical data are available for all patients. The clinical dataset comprises the following data elements:

  1. Age
  2. Gender
  3. Date of registration at the center (de-identified)
  4. Date of pathological diagnosis (de-identified)
  5. Pathology
  6. Diagnosis
  7. Stage
  8. Treatment Intent
  9. Radiotherapy
  10. Chemotherapy
  11. Targeted Therapy
  12. Hormone Therapy (Endocrine therapy)
  13. Overall Survival
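
Because the clinical data are linked to the images, a typical first step is to join the two by the de-identified patient identifier. The sketch below does this using only the Python standard library. The column names (`patient_id`, `stage`, `treatment_intent`), the identifiers and the CSV layout are hypothetical; the actual export format provided by CHAVI may differ.

```python
import csv
import io


def load_clinical(csv_text, key="patient_id"):
    """Index clinical rows by the de-identified patient identifier.

    The `key` column name is an assumption made for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def link(clinical, imaging_ids):
    """Split imaging patient IDs into those with and without a clinical row."""
    linked = {pid: clinical[pid] for pid in imaging_ids if pid in clinical}
    unlinked = sorted(pid for pid in imaging_ids if pid not in clinical)
    return linked, unlinked


if __name__ == "__main__":
    # Hypothetical clinical export with two patients.
    text = (
        "patient_id,stage,treatment_intent\n"
        "P001,II,Radical\n"
        "P002,IV,Palliative\n"
    )
    clinical = load_clinical(text)
    linked, unlinked = link(clinical, ["P001", "P003"])
    print("Linked:", sorted(linked))
    print("No clinical row found for:", unlinked)
```

Reporting the unlinked identifiers explicitly makes it easier to spot mismatches between the imaging and clinical exports before any modelling is attempted.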

Technical Validation

The DICOM dataset comprises RT structure sets that were approved for clinical treatment by an experienced radiation oncologist. All clinical data were verified manually against the source records in the HMS.

Usage Notes

The dataset is available for use under the usual CHAVI data usage policy without additional restrictions. 

Code Availability

Not Applicable.


  1. Kundu S, Chakraborty S, Mukhopadhyay J, Das S, Chatterjee S, Achari RB, et al. Design and Development of a Medical Image Databank for Assisting Studies in Radiomics. J Digit Imaging [Internet]. 2022 Feb 15; Available from: http://dx.doi.org/10.1007/s10278-021-00576-6