Demonstration 2: How to use deep learning in segmentation and patient-specific dose estimation in CT¶


A short demonstration, using example DICOM CT data, of how to apply published deep learning models in CT dosimetry.
The demonstration idea is taken from Huettenbrink et al., Personalized Prediction of Patient Radiation Exposure for Therapy of Urolithiasis: An Application and Comparison of Six Machine Learning Algorithms. J Pers Med. 2023 Apr 7;13(4):643. doi: 10.3390/jpm13040643
This notebook is modified from GitHub code that accompanies the following article: Tzanis, E., Damilakis, J. A machine learning-based pipeline for multi-organ/tissue patient-specific radiation dosimetry in CT. Eur Radiol (2024). https://doi.org/10.1007/s00330-024-11002-0
Outline¶
Data preparation
- Get DICOM CT data from the thorax or abdomen region
- Read in the DICOM data
- Extract the image stack from the DICOM objects
- HU conversion for the image stack
Organ segmentation with deep learning
Dose prediction
- Pre-processing
- Deep learning model
- Dose estimation
Conclusion
17.11.2024 Satu Inkinen
#Import all the necessary libraries:
import pydicom
import torch
from model import RegressionModel  # model.py is provided with the cloned dosimetry repository
from sklearn.preprocessing import StandardScaler
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from numpy import load
import nibabel as nib
import os
import math
import warnings
warnings.filterwarnings("ignore")
#Auxiliary functions:
def read_dicom(path):
    '''Read DICOM files from a folder and sort them by the Image Position (Patient) attribute.'''
    slices = [pydicom.dcmread(path + '/' + s, force=True) for s in os.listdir(path)]
    slices.sort(key=lambda x: float(x.ImagePositionPatient[2]))
    return slices

def dicom_to_array(slices):
    '''Extract the pixel data from the pydicom objects into a numpy stack.'''
    image = np.stack([s.pixel_array for s in slices])
    image = image.astype(np.int16)
    return image
def convert_to_hu(dicom_image_serie, dicom_image_array):
    '''Convert stored pixel values to HU using the DICOM metadata RescaleSlope and RescaleIntercept.'''
    intercept = dicom_image_serie[0].RescaleIntercept
    slope = dicom_image_serie[0].RescaleSlope
    hu_image = dicom_image_array * slope + intercept
    return hu_image
def inference_array(organ, wed_array, mA, path, slice_location, path_output):
    '''Generate the inference data structure from an organ mask and the DICOM data, and save it to path_output.
    Uses the globals hu_image and dicom_image_serie created in the data preparation step.'''
    mask = np.rot90(nib.load(path + organ + '.nii.gz').get_fdata())
    mask = np.transpose(mask, (2, 0, 1))
    mask[mask != 1] = np.nan
    #retrieving the organ's HU values
    masked_organ = hu_image*mask
    organ_training_array = np.empty((len(dicom_image_serie), 5), dtype=np.float32)
    idx = []
    for j in range(len(dicom_image_serie)):
        if np.all(np.isnan(masked_organ[j])):
            idx.append(j)
        else:
            organ_training_array[j][0] = np.nanmean(masked_organ[j])
            organ_training_array[j][1] = np.nanstd(masked_organ[j])
            organ_training_array[j][2] = float(mA[j])
            organ_training_array[j][3] = wed_array[j]
            organ_training_array[j][4] = float(slice_location[j])
    clean_organ_training_array = np.delete(organ_training_array, idx, axis=0)
    #save the inference array
    os.makedirs(path_output + organ, exist_ok=True)
    file_name = f'{path_output}/{organ}/{organ}_inference_array.npy'
    np.save(file_name, clean_organ_training_array)
Data preparation¶
What we need to do 🤔?
- Clone/download the GitHub repository to obtain the dose DL models
- Get DICOM CT data from the thorax or abdomen region
- Read in the DICOM data
- Extract the image stack from the DICOM objects
- HU conversion for the image stack
- Extract exposure parameters from the DICOM metadata
- Extract slice locations from the DICOM metadata
Clone/download the DL models for dose prediction¶
- Download from the GitHub page, or
- use git clone
Get DICOM CT data from the thorax or abdomen region¶
Example data is from the 3D Slicer tutorial.
Example data DICOM metadata illustration:

Example data image illustration:

We already observe a challenge: this is an obese patient --> not all organs are fully visible in the scan area, i.e., parts of the body lie outside the field of view.
Read in the DICOM data¶
def read_dicom(path):
    '''Read DICOM files from a folder and sort them by the Image Position (Patient) attribute.'''
    slices = [pydicom.dcmread(path + '/' + s, force=True) for s in os.listdir(path)]
    slices.sort(key=lambda x: float(x.ImagePositionPatient[2]))
    return slices
Extract the image stack from the DICOM objects¶
def dicom_to_array(slices):
    '''Extract the pixel data from the pydicom objects into a numpy stack.'''
    image = np.stack([s.pixel_array for s in slices])
    image = image.astype(np.int16)
    return image
HU conversion for the image stack¶
def convert_to_hu(dicom_image_serie, dicom_image_array):
    '''Convert stored pixel values to HU using the DICOM metadata RescaleSlope and RescaleIntercept.'''
    intercept = dicom_image_serie[0].RescaleIntercept
    slope = dicom_image_serie[0].RescaleSlope
    hu_image = dicom_image_array * slope + intercept
    return hu_image
#Load data
#Path to data
path = r"C:\Users\HUS68766320\Documents\GitHub\multi-structure-CT-dosimetry-main\DATA\SlicerDICOMTutorialData\dataset1_TorsoCT"
#Read dicom data
dicom_image_serie = read_dicom(path)
#convert the dicom image to a numpy array
dicom_image_array = dicom_to_array(dicom_image_serie)
#converting pixel values to Hounsfield units
hu_image = convert_to_hu(dicom_image_serie, dicom_image_array)
#extracting mA values
mA = [s.XRayTubeCurrent for s in dicom_image_serie]
mean_mA = np.mean(mA)
#extracting slice location:
slice_location = [s.ImagePositionPatient[2] for s in dicom_image_serie]
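The DICOM header carries further acquisition parameters that can be inspected the same way. A minimal sketch (KVP, CTDIvol and SliceThickness are standard but optional DICOM attributes, so they may be absent in some series):
#Peek at further acquisition metadata (attributes may be missing in some series)
first_slice = dicom_image_serie[0]
print('kVp:', getattr(first_slice, 'KVP', 'n/a'))
print('CTDIvol (mGy):', getattr(first_slice, 'CTDIvol', 'n/a'))
print('Slice thickness (mm):', getattr(first_slice, 'SliceThickness', 'n/a'))
print('Mean tube current (mA):', mean_mA)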
Multi-organ segmentation with deep learning (TotalSegmentator)¶
What we need to do 🤔?
- Clone/download TotalSegmentator to a selected location
- Use the Python-based API
- Download the model weights
TotalSegmentator - a deep learning pipeline for robust multi-organ segmentation¶

Total segmentation, the default task, contains 117 anatomical structures (main classes)
🔥 Has a python API for easy use! 🔥
https://github.com/MIC-DKFZ/nnUNet
What is nnU-Net?
"nnU-Net is a semantic segmentation method that automatically adapts to a given dataset. It will analyze the provided training cases and automatically configure a matching U-Net-based segmentation pipeline. No expertise required on your end! You can simply train the models and use them for your application."

https://www.nature.com/articles/s41592-020-01008-z
- Automatic Configuration: nnU-Net automatically configures itself based on the dataset and task at hand. This includes adjusting the network architecture, training schedules, and data augmentation strategies without manual intervention
- Benchmarking and Validation: The paper highlights the need for rigorous validation and benchmarking of segmentation methods. It points out that many recent claims of superior performance by novel architectures do not hold up under thorough testing
- Use of CNN-based Models: Despite the rise of new architectures like Transformers, the study finds that CNN-based U-Net models, including variants like ResNet and ConvNeXt, still provide state-of-the-art performance when properly configured and scaled to modern hardware
- Scalability: nnU-Net is designed to scale with modern hardware resources, ensuring that it can handle large datasets and complex tasks efficiently.
Overall, nnU-Net demonstrates that a well-configured U-Net architecture can achieve excellent results in medical image segmentation, challenging the notion that newer architectures are always superior.
What is the U-Net model?
An encoder-decoder network with skip connections (the gray arrows in the figure), originally developed for semantic segmentation.

PyTorch implementation of U-Net:
""" Full assembly of the parts to form the complete network """
from .unet_parts import *
class UNet(nn.Module):
def __init__(self, n_channels, n_classes, bilinear=False):
super(UNet, self).__init__()
self.n_channels = n_channels
self.n_classes = n_classes
self.bilinear = bilinear
self.inc = (DoubleConv(n_channels, 64))
self.down1 = (Down(64, 128))
self.down2 = (Down(128, 256))
self.down3 = (Down(256, 512))
factor = 2 if bilinear else 1
self.down4 = (Down(512, 1024 // factor))
self.up1 = (Up(1024, 512 // factor, bilinear))
self.up2 = (Up(512, 256 // factor, bilinear))
self.up3 = (Up(256, 128 // factor, bilinear))
self.up4 = (Up(128, 64, bilinear))
self.outc = (OutConv(64, n_classes))
def forward(self, x):
x1 = self.inc(x)
x2 = self.down1(x1)
x3 = self.down2(x2)
x4 = self.down3(x3)
x5 = self.down4(x4)
x = self.up1(x5, x4)
x = self.up2(x, x3)
x = self.up3(x, x2)
x = self.up4(x, x1)
logits = self.outc(x)
return logits
""" Parts of the U-Net model """
import torch
import torch.nn as nn
import torch.nn.functional as F
class DoubleConv(nn.Module):
"""(convolution => [BN] => ReLU) * 2"""
def __init__(self, in_channels, out_channels, mid_channels=None):
super().__init__()
if not mid_channels:
mid_channels = out_channels
self.double_conv = nn.Sequential(
nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1, bias=False),
nn.BatchNorm2d(mid_channels),
nn.ReLU(inplace=True),
nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1, bias=False),
nn.BatchNorm2d(out_channels),
nn.ReLU(inplace=True)
)
def forward(self, x):
return self.double_conv(x)
class Down(nn.Module):
"""Downscaling with maxpool then double conv"""
def __init__(self, in_channels, out_channels):
super().__init__()
self.maxpool_conv = nn.Sequential(
nn.MaxPool2d(2),
DoubleConv(in_channels, out_channels)
)
def forward(self, x):
return self.maxpool_conv(x)
class Up(nn.Module):
"""Upscaling then double conv"""
def __init__(self, in_channels, out_channels, bilinear=True):
super().__init__()
# if bilinear, use the normal convolutions to reduce the number of channels
if bilinear:
self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
self.conv = DoubleConv(in_channels, out_channels, in_channels // 2)
else:
self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
self.conv = DoubleConv(in_channels, out_channels)
def forward(self, x1, x2):
x1 = self.up(x1)
# input is CHW
diffY = x2.size()[2] - x1.size()[2]
diffX = x2.size()[3] - x1.size()[3]
x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
diffY // 2, diffY - diffY // 2])
# if you have padding issues, see
# https://github.com/HaiyongJiang/U-Net-Pytorch-Unstructured-Buggy/commit/0e854509c2cea854e247a9c615f175f76fbb2e3a
# https://github.com/xiaopeng-liao/Pytorch-UNet/commit/8ebac70e633bac59fc22bb5195e513d5832fb3bd
x = torch.cat([x2, x1], dim=1)
return self.conv(x)
class OutConv(nn.Module):
def __init__(self, in_channels, out_channels):
super(OutConv, self).__init__()
self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)
def forward(self, x):
return self.conv(x)
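As a quick sanity check of the assembled model, a dummy forward pass confirms that the output resolution matches the input (a minimal sketch, assuming the part classes above are defined in the current scope rather than imported via .unet_parts):
#Dummy forward pass: one single-channel 128x128 image, two output classes
net = UNet(n_channels=1, n_classes=2, bilinear=False)
dummy = torch.randn(1, 1, 128, 128)
with torch.no_grad():
    logits = net(dummy)
print(logits.shape)  # expected: torch.Size([1, 2, 128, 128])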
Why skip connections? Skip connections are essential in U-Net models for maintaining high-resolution spatial information, improving gradient flow, and combining different levels of feature information. Without them, segmentation performance would likely degrade significantly.
Python API for TotalSegmentator:
from totalsegmentator.python_api import totalsegmentator
output_path = "inference/organ_masks_demo2/"
totalsegmentator(path, output_path, task="body")
totalsegmentator(path, output_path, task="lung_vessels")
%%time
#create organ masks using TotalSegmentator
from totalsegmentator.python_api import totalsegmentator
output_path = "inference/organ_masks_demo2/"
totalsegmentator(path, output_path, task="body")
totalsegmentator(path, output_path, task="lung_vessels")
If you use this tool please cite: https://pubs.rsna.org/doi/10.1148/ryai.230024
Converting dicom to nifti...
found image with shape (512, 512, 291)
Resampling...
Resampled in 6.37s
Predicting...
100%|██████████████████████████████████████████████████████████████████████████████████| 24/24 [00:16<00:00, 1.41it/s]
Predicted in 52.75s
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.37it/s]
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.11it/s]
Resampling...
Saving segmentations...
0%| | 0/2 [00:00<?, ?it/s]
Saved in 8.56s
Creating body.nii.gz
Creating skin.nii.gz
If you use this tool please cite: https://pubs.rsna.org/doi/10.1148/ryai.230024
Generating rough segmentation for cropping...
Converting dicom to nifti...
found image with shape (512, 512, 291)
Resampling...
Resampled in 2.73s
Predicting...
100%|████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 5.89it/s]
Predicted in 17.58s
Resampling...
Converting dicom to nifti...
found image with shape (512, 512, 291)
cropping from (512, 512, 291) to (512, 508, 151)
Predicting...
100%|██████████████████████████████████████████████████████████████████████████████████| 80/80 [00:06<00:00, 13.32it/s]
Predicted in 70.65s
Saving segmentations...
0%| | 0/2 [00:00<?, ?it/s]
Saved in 10.89s
CPU times: total: 1min 8s
Wall time: 4min 23s
<nibabel.nifti1.Nifti1Image at 0x23605d04550>
Visualization of the segmentations¶

Segmented masks:
- esophagus
- aorta
- lung_lower_lobe_left
- lung_lower_lobe_right
- lung_upper_lobe_left
- lung_upper_lobe_right
- lung_middle_lobe_right
- skin
- lung_vessels
- lung_trachea_bronchia
- heart
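To view one of the masks on top of the CT, a minimal overlay sketch (assuming the mask files listed above exist under the output_path used earlier; the organ, slice index, window and alpha are arbitrary illustrative choices):
#Overlay one organ mask on the corresponding HU slice
organ_to_show = 'heart'
overlay_mask = np.rot90(nib.load(f'inference/organ_masks_demo2/{organ_to_show}.nii.gz').get_fdata())
overlay_mask = np.transpose(overlay_mask, (2, 0, 1))
slice_idx = overlay_mask.shape[0] // 2  # middle slice, arbitrary choice
plt.imshow(hu_image[slice_idx], cmap='gray', vmin=-200, vmax=400)
plt.imshow(np.where(overlay_mask[slice_idx] == 1, 1.0, np.nan), cmap='autumn', alpha=0.4)
plt.title(f'{organ_to_show} mask overlay, slice {slice_idx}')
plt.axis('off')
plt.show()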
Water equivalent diameter (WED) feature:¶
The WED is computed for each slice as $$ WED = 2\sqrt{\left[\frac{\overline{HU}}{1000}+1\right]\frac{A}{\pi}} $$ where $A$ is the body cross-sectional area of the slice and $\overline{HU}$ is the mean Hounsfield unit within the body mask.
#WED calculation:
#voxel x and y dimensions
x_dim = float(dicom_image_serie[0].PixelSpacing[0])
y_dim = float(dicom_image_serie[0].PixelSpacing[1])
#Get the body mask from the TotalSegmentator output:
body_mask = np.rot90(nib.load('inference/organ_masks_demo2/body.nii.gz').get_fdata())
#transpose the array to match the dimensions of the hu_image
body_mask = np.transpose(body_mask, (2, 0, 1))
body_mask[body_mask != 1] = np.nan
wed_array = []
for j in range(0, len(dicom_image_serie)):
    pix_number = np.count_nonzero(body_mask[j] == 1)
    area = pix_number*x_dim*y_dim
    cropped = hu_image[j]*body_mask[j]
    mean_HU = np.nanmean(cropped)
    wed = 2*(math.sqrt(((mean_HU/1000)+1)*(area/math.pi)))
    wed_array.append(wed)
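A quick plot of the per-slice WED is a useful check (a minimal sketch; since PixelSpacing is in mm, the area is in mm² and the WED comes out in mm):
#Plot the water equivalent diameter along the scan
plt.plot(slice_location, wed_array)
plt.xlabel('Slice location (mm)')
plt.ylabel('WED (mm)')
plt.title('Water equivalent diameter per slice')
plt.show()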
Inference array¶
Now all features can be computed in array form and saved for the organ dose estimation.

def inference_array(organ, wed_array, mA, path, slice_location, path_output):
    '''Generate the inference data structure from an organ mask and the DICOM data, and save it to path_output.
    Uses the globals hu_image and dicom_image_serie created in the data preparation step.'''
    mask = np.rot90(nib.load(path + organ + '.nii.gz').get_fdata())
    mask = np.transpose(mask, (2, 0, 1))
    mask[mask != 1] = np.nan
    #retrieving the organ's HU values
    masked_organ = hu_image*mask
    organ_training_array = np.empty((len(dicom_image_serie), 5), dtype=np.float32)
    idx = []
    for j in range(len(dicom_image_serie)):
        if np.all(np.isnan(masked_organ[j])):
            idx.append(j)
        else:
            organ_training_array[j][0] = np.nanmean(masked_organ[j])
            organ_training_array[j][1] = np.nanstd(masked_organ[j])
            organ_training_array[j][2] = float(mA[j])
            organ_training_array[j][3] = wed_array[j]
            organ_training_array[j][4] = float(slice_location[j])
    clean_organ_training_array = np.delete(organ_training_array, idx, axis=0)
    #save the inference array
    os.makedirs(path_output + organ, exist_ok=True)
    file_name = f'{path_output}/{organ}/{organ}_inference_array.npy'
    np.save(file_name, clean_organ_training_array)
%%time
#inference array creation:
organs_list = ['esophagus', 'aorta', 'lung_lower_lobe_left', 'lung_lower_lobe_right',
               'lung_upper_lobe_left', 'lung_upper_lobe_right', 'lung_middle_lobe_right',
               'skin', 'lung_vessels', 'lung_trachea_bronchia']
path_output = 'inference/inference_arrays/'
path_input = 'inference/organ_masks_demo2/'
for organ in organs_list:
    print(organ)
    inference_array(organ, wed_array, mA, path_input, slice_location, path_output)
esophagus
aorta
lung_lower_lobe_left
lung_lower_lobe_right
lung_upper_lobe_left
lung_upper_lobe_right
lung_middle_lobe_right
skin
lung_vessels
lung_trachea_bronchia
CPU times: total: 7.77 s
Wall time: 13.1 s
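As a sanity check, one of the saved arrays can be reloaded to verify its shape (a minimal sketch; the five columns are mean HU, HU standard deviation, mA, WED and slice location, as built in inference_array):
#Verify one saved inference array: rows = slices containing the organ, columns = 5 features
check = np.load('inference/inference_arrays/esophagus/esophagus_inference_array.npy')
print(check.shape)  # (number of slices containing the organ, 5)
print(check[0])     # first row: [mean HU, std HU, mA, WED, slice location]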
Dose prediction¶

Preprocessing:¶
Standard scaler:
- Normalization of Data: Standard scaling transforms your data to have a mean of 0 and a standard deviation of 1. This normalization helps in ensuring that each feature contributes equally to the model’s performance.
- Improved Convergence: Algorithms like gradient descent converge faster when the data is scaled. This is because the optimization process can navigate the cost function landscape more efficiently when features are on a similar scale.
- Reduced Bias: Without scaling, features with larger ranges can dominate the learning process, leading to biased models. Standard scaling helps in mitigating this issue by ensuring all features are treated equally.

The yellow parameters are estimated from the training data
#scaler for the organ
data = load('training_arrays/chest_merged_training_arrays/'+ organ + '_training_array.npy')
X = data[:,:-1]
# fit scaler
scaler = StandardScaler()
X = scaler.fit_transform(X)
X = [ 62.365078   35.9114   269.   273.69235  -126.5 ]
X_scaled = [ 9.68281e-01  -4.54136e-01  -3.96682e-01  4.26966e-01  -4.41768e-01 ]
The data is scaled to zero mean and a standard deviation of 1.
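The transform itself is just z = (x - mean) / std per feature, with the mean and standard deviation taken from the training data. A minimal sketch with a toy array (the values are made up for illustration):
#StandardScaler demo: column means become 0 and column standard deviations become 1
from sklearn.preprocessing import StandardScaler
import numpy as np

toy = np.array([[1.0, 100.0],
                [2.0, 200.0],
                [3.0, 300.0]])
toy_scaler = StandardScaler().fit(toy)
print(toy_scaler.mean_)            # per-feature means: [2. 200.]
print(toy_scaler.transform(toy))   # columns scaled to mean 0, std 1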
Deep learning model for organ dose estimation:¶
import torch.nn as nn

class RegressionModel(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size, num_layers, dropout):
        super(RegressionModel, self).__init__()
        layers = []
        layers.append(nn.Linear(input_size, hidden_sizes[0]))
        layers.append(nn.ReLU())
        for i in range(1, num_layers - 1):
            layers.append(nn.Linear(hidden_sizes[i - 1], hidden_sizes[i]))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(dropout))
        layers.append(nn.Linear(hidden_sizes[num_layers - 2], output_size))
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)
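With the settings used below (input_size=5, hidden_sizes=[512, 256, 128, 64], num_layers=5), the constructor unrolls into four hidden layers plus a single-output head, i.e. 5 -> 512 -> 256 -> 128 -> 64 -> 1. A minimal sketch to inspect the resulting stack:
#Inspect the layer stack and parameter count for the configuration used in this demo
demo_model = RegressionModel(input_size=5, hidden_sizes=[512, 256, 128, 64],
                             output_size=1, num_layers=5, dropout=0.1)
print(demo_model)
print('Trainable parameters:',
      sum(p.numel() for p in demo_model.parameters() if p.requires_grad))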
Each organ has its own pre-trained dose prediction model available:

Evaluation/inference: organ dose estimation using pretrained models¶
#Model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
hidden_sizes = [512, 256, 128, 64]
model = RegressionModel(input_size=5, hidden_sizes=hidden_sizes, output_size=1, num_layers=5, dropout=0.1)
print('Predicted dose for the organs of interest')
print('-----------------------------------------')

def evaluate(organ):
    #model loading for the organ
    model.load_state_dict(torch.load('chest_models/'+ organ + '_model.pth'))
    model.to(device)
    #scaler for the organ, fitted on the organ's training data
    data = load('training_arrays/chest_merged_training_arrays/'+ organ + '_training_array.npy')
    X = data[:,:-1]
    # Find rows containing NaN values in X
    nan_rows_X = np.isnan(X).any(axis=1)
    # Drop rows with NaN values from X
    X = X[~nan_rows_X]
    # fit scaler
    scaler = StandardScaler()
    X = scaler.fit_transform(X)
    X_test = load(f'inference/inference_arrays/{organ}/{organ}_inference_array.npy')
    # Find rows containing NaN values in X_test
    nan_rows_X_test = np.isnan(X_test).any(axis=1)
    # Drop rows with NaN values from X_test
    X_test = X_test[~nan_rows_X_test]
    #scaling with the training-data statistics
    X_test = scaler.transform(X_test)
    #data to tensor
    X_test = torch.tensor(X_test, dtype=torch.float32).to(device)
    #evaluate
    model.eval()
    with torch.no_grad():
        predictions = model(X_test).cpu().numpy()
    print(f'{organ} : {np.mean(predictions):.1f} mGy')
    return np.mean(predictions)

organs_list = ['esophagus', 'aorta', 'lung_lower_lobe_left', 'lung_lower_lobe_right',
               'lung_upper_lobe_left', 'lung_upper_lobe_right', 'lung_middle_lobe_right',
               'skin', 'lung_vessels', 'lung_trachea_bronchia']
organ_dose_est_list = []
for organ in organs_list:
    organ_dose_estimate = evaluate(organ)
    organ_dose_est_list.append(organ_dose_estimate)
%%time
#Dose prediction:
#Model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
hidden_sizes = [512, 256, 128, 64]
model = RegressionModel(input_size=5, hidden_sizes=hidden_sizes, output_size=1, num_layers=5, dropout=0.1)
print('Predicted dose for the organs of interest')
print('-----------------------------------------')

def evaluate(organ):
    #model loading for the organ
    model.load_state_dict(torch.load('chest_models/'+ organ + '_model.pth'))
    model.to(device)
    #scaler for the organ, fitted on the organ's training data
    data = load('training_arrays/chest_merged_training_arrays/'+ organ + '_training_array.npy')
    X = data[:,:-1]
    # Find rows containing NaN values in X
    nan_rows_X = np.isnan(X).any(axis=1)
    # Drop rows with NaN values from X
    X = X[~nan_rows_X]
    # fit scaler
    scaler = StandardScaler()
    X = scaler.fit_transform(X)
    X_test = load(f'inference/inference_arrays/{organ}/{organ}_inference_array.npy')
    # Find rows containing NaN values in X_test
    nan_rows_X_test = np.isnan(X_test).any(axis=1)
    # Drop rows with NaN values from X_test
    X_test = X_test[~nan_rows_X_test]
    #scaling with the training-data statistics
    X_test = scaler.transform(X_test)
    #data to tensor
    X_test = torch.tensor(X_test, dtype=torch.float32).to(device)
    #evaluate
    model.eval()
    with torch.no_grad():
        predictions = model(X_test).cpu().numpy()
    print(f'{organ} : {np.mean(predictions):.1f} mGy')
    return np.mean(predictions)

organ_dose_est_list = []
for organ in organs_list:
    organ_dose_estimate = evaluate(organ)
    organ_dose_est_list.append(organ_dose_estimate)
Predicted dose for the organs of interest
-----------------------------------------
esophagus : 11.8 mGy
aorta : 15.0 mGy
lung_lower_lobe_left : 14.2 mGy
lung_lower_lobe_right : 12.9 mGy
lung_upper_lobe_left : 9.6 mGy
lung_upper_lobe_right : 4.3 mGy
lung_middle_lobe_right : 12.6 mGy
skin : 8.3 mGy
lung_vessels : 13.7 mGy
lung_trachea_bronchia : 0.8 mGy
CPU times: total: 297 ms
Wall time: 1.05 s
%matplotlib inline
#bar plot of the estimated organ doses
fig, ax = plt.subplots()
ax.bar(organs_list, organ_dose_est_list)
plt.setp(ax.get_xticklabels(), rotation=90, ha='center')
ax.set_ylabel('Dose (mGy)')
ax.set_title('Estimated organ doses')
plt.show()
Conclusion¶
Utilizing published deep learning models from platforms like GitHub simplifies learning compared to reverse engineering methods from scientific papers
Limitations:
- The DICOM stack from an obese patient had parts of the body outside the field of view of the reconstruction, which could not be segmented correctly
- i.e., model estimation is limited to organs within the field of view
Future Studies:
- Should incorporate modeling of organs outside the field of view in dose estimation
Advantages:
- Availability of code provides a valuable example of using open-source code in segmentation tasks
- Eliminates the need to reinvent the wheel