This page lists publications related to CHAVI.

Kundu S, Chakraborty S, Chatterjee S, Das S, Achari RB, Mukhopadhyay J, et al. De-Identification of Radiomics Data Retaining Longitudinal Temporal Information. J Med Syst. 2020 Apr 2;44(5):99.

We propose a de-identification system that runs in standalone mode. The system de-identifies the clinical and annotated imaging data of radiation oncology patients, including RTSTRUCT, RTPLAN, and RTDOSE. The clinical data consist of the patient's diagnosis, stage, outcome, and treatment information. The imaging data may be diagnostic, therapy planning, or verification images. Longitudinal radiation oncology verification images, such as cone beam CT scans, are preserved in the process along with the initial imaging and clinical data. During de-identification, the system keeps a reference to the original data identity in encrypted form, which can be used for re-identification if necessary.

Kundu S, Chakraborty S, Mukhopadhyay J, Das S, Chatterjee S, Basu Achari R, et al. Research Goal-Driven Data Model and Harmonization for De-Identifying Patient Data in Radiomics. J Digit Imaging [Internet]. 2021 Jul 9; Available from: https://doi.org/10.1007/s10278-021-00476-9

Various efforts have been made to de-identify patients’ radiation oncology data for use in medical research. Although the task of de-identification needs to be defined in the context of research goals and objectives, existing systems lack the flexibility to model data and normalize attribute names accordingly. In this work, we describe a de-identification process for radiation and clinical oncology data, guided by a data model and a schema that dynamically captures domain ontology and normalizes terminology, defined in line with the research goals in this area. The radiological images are obtained in DICOM format. They consist of diagnostic, radiation therapy (RT) treatment planning, RT verification, and RT response images. During the DICOM de-identification, a few crucial pieces of information about the dataset are captured. The proposed model is generic and organizes information in sync with the de-identification of a patient’s clinical information. The treatment and clinical data are provided in comma-separated values (CSV) format, which follows a predefined data structure. The de-identified data are harmonized throughout the entire process. We present four case studies on four different types of cancer, namely glioblastoma multiforme, head–neck, breast, and lung. We also present experimental validation on a few patients’ data in these four areas. Several aspects are taken care of during de-identification: preservation of longitudinal date changes (LDC), incremental de-identification, referential data integrity between the clinical and image data, de-identified data harmonization, and transformation of the data to an underlying database schema.
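As an illustration of two of the aspects listed in this abstract (preserving longitudinal date changes and keeping referential integrity between clinical and image data), the sketch below shifts all dates of one patient by the same keyed offset and derives one stable pseudonym per patient. This is a common general technique and is not taken from the paper, which does not describe its mechanism here. The names `SECRET_KEY`, `patient_offset_days`, `shift_date`, and `pseudonym` are hypothetical.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical site secret; in practice it would be stored securely, not in code.
SECRET_KEY = b"replace-with-a-site-secret"


def _keyed_digest(patient_id: str) -> bytes:
    """Return a keyed SHA-256 digest of the patient identifier."""
    return hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256).digest()


def patient_offset_days(patient_id: str, max_days: int = 365) -> int:
    """Derive a fixed per-patient offset between 1 and max_days days."""
    return int.from_bytes(_keyed_digest(patient_id)[:4], "big") % max_days + 1


def shift_date(patient_id: str, original: date) -> date:
    """Move a date back by the patient's offset; intervals between dates are kept."""
    return original - timedelta(days=patient_offset_days(patient_id))


def pseudonym(patient_id: str) -> str:
    """Stable pseudonym, so clinical (CSV) and image (DICOM) records still link."""
    return "ANON-" + _keyed_digest(patient_id).hex()[:12]


if __name__ == "__main__":
    planning_ct = date(2020, 1, 10)
    cone_beam_ct = date(2020, 2, 14)
    shifted_gap = shift_date("P001", cone_beam_ct) - shift_date("P001", planning_ct)
    print(pseudonym("P001"), shifted_gap.days)
```

Because every date of a patient moves by the same number of days, the 35-day gap between the two scans in the usage example is unchanged after shifting, and the same pseudonym can be written to both the clinical and the imaging records.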

Kundu S, Chakraborty S, Mukhopadhyay J, Das S, Chatterjee S, Achari RB, et al. Design and Development of a Medical Image Databank for Assisting Studies in Radiomics. J Digit Imaging [Internet]. 2022 Feb 15; Available from: http://dx.doi.org/10.1007/s10278-021-00576-6

CompreHensive Digital ArchiVe of Cancer Imaging - Radiation Oncology (CHAVI-RO) is a multi-tier web-based medical image databank. It supports archiving de-identified radiological and clinical datasets in a relational database. A semantic relational database model is designed to accommodate imaging and treatment data of cancer patients. It aims to provide key datasets to investigate and model the use of radiological imaging data in response to radiation. This research area addresses the modeling and analysis of the complete treatment data of oncology patients. A DICOM viewer is integrated for reviewing the uploaded de-identified DICOM dataset. In a prototype system, we carried out a pilot study with cancer data from four disease sites, namely breast, head and neck, brain, and lung. A representative dataset is used to estimate the data size per patient. A role-based access control module is integrated with the image databank to restrict user access. We also perform different types of load tests to analyze and quantify the performance of the CHAVI databank.
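The role-based access control mentioned in this abstract can be pictured with a minimal sketch. The roles, permissions, and the `is_allowed` function below are hypothetical and chosen only for illustration; the abstract does not list the roles or permissions that CHAVI actually defines.

```python
from enum import Enum


class Permission(Enum):
    VIEW_DATASET = "view_dataset"
    UPLOAD_DATASET = "upload_dataset"
    MANAGE_USERS = "manage_users"


# Hypothetical role-to-permission table (not CHAVI's actual roles).
ROLE_PERMISSIONS = {
    "researcher": frozenset({Permission.VIEW_DATASET}),
    "contributor": frozenset({Permission.VIEW_DATASET, Permission.UPLOAD_DATASET}),
    "admin": frozenset(Permission),
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Return True only if the role is known and grants the permission."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())


if __name__ == "__main__":
    print(is_allowed("researcher", Permission.VIEW_DATASET))
    print(is_allowed("researcher", Permission.UPLOAD_DATASET))
```

Unknown roles map to an empty permission set, so access is denied by default.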