Notebook to derive reference segmentations from segmentations of multiple experts. This notebook is based on SimpleITK.

This notebook is optimized to be executed on Google Colab.

  • Press the play button to execute a cell. It appears between the [ ] on the left side of each code cell.
  • Run the cells consecutively. Skip cells that do not apply to your case.
  • Use Firefox or Google Chrome if you want to upload and download files.

For more information on ground truth estimation methods see Biancardi, Alberto M., Artit C. Jirapatnakul, and Anthony P. Reeves. "A comparison of ground truth estimation methods." International journal of computer assisted radiology and surgery 5.3 (2010): 295-305.

#@markdown Please run this cell to get started.
try:
    from google.colab import files, drive
except ImportError:
    pass
try:
    import deepflash2
except ImportError:
    !pip install -q deepflash2==0.0.14
try:
    import SimpleITK
    assert SimpleITK.Version_MajorVersion()==1
except:
    !pip install -q SimpleITK==1.2.4
import zipfile
import imageio
import SimpleITK as sitk
from fastai.vision.all import *
from deepflash2.data import _read_msk
from deepflash2.utils import unzip

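def staple(segmentations, foregroundValue = 1, threshold = 0.5):
    'STAPLE: Simultaneous Truth and Performance Level Estimation with SimpleITK'
    # Note: foregroundValue is currently not forwarded to sitk.STAPLE.
    segmentations = [sitk.GetImageFromArray(x) for x in segmentations]
    STAPLE_probabilities = sitk.STAPLE(segmentations)
    # Binarize the per-pixel foreground probabilities
    STAPLE = STAPLE_probabilities > threshold
    # Return a copy: an array view would depend on the local image staying alive
    return sitk.GetArrayFromImage(STAPLE)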

def mvoting(segmentations, labelForUndecidedPixels = 0):
    'Majority Voting with SimpleITK Label Voting'
    segmentations = [sitk.GetImageFromArray(x) for x in segmentations]
    mv_segmentation = sitk.LabelVoting(segmentations, labelForUndecidedPixels)
    # Return a copy: an array view would depend on the local image staying alive
    return sitk.GetArrayFromImage(mv_segmentation)

Provide Reference Segmentations from different experts

  • One folder per expert
  • Identical names for segmentations

Example structure:

  • [folder] expert1
    • [file] mask1.png
    • [file] mask2.png
  • [folder] expert2
    • [file] mask1.png
    • [file] mask2.png
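
The layout above can be checked before running the estimation. The following is a minimal sketch using only the standard library. The function name `check_expert_folders` is hypothetical and not part of deepflash2, and the sketch assumes that every sub-folder of the given root belongs to one expert.

```python
from pathlib import Path

def check_expert_folders(root):
    """Return {expert: [missing file names]} for incomplete expert folders.

    Hypothetical helper: assumes every sub-folder of `root` is one expert.
    """
    root = Path(root)
    # Collect the file names found in each expert folder
    names_per_expert = {d.name: {f.name for f in d.iterdir() if f.is_file()}
                        for d in root.iterdir() if d.is_dir()}
    # Union of all file names seen in any expert folder
    all_names = set().union(*names_per_expert.values()) if names_per_expert else set()
    # Report only the experts that lack at least one file
    return {expert: sorted(all_names - names)
            for expert, names in names_per_expert.items()
            if all_names - names}
```

Calling `check_expert_folders('expert_segmentations')` returns an empty dict when all expert folders contain identically named files. Run it before any output folders are created inside the same directory, since those would be counted as experts.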

Option A: Upload via Google Drive (Colab only)

  • The folder in your drive must contain all segmentations and correct folder structure.
  • See here how to organize your files in Google Drive.
  • See this stackoverflow post for browsing files with the file browser
try:
    drive.mount('/content/drive')
    path = "/content/drive/My Drive/expert_segmentations" #@param {type:"string"}
    path = Path(path)
    #@markdown Example: "/content/drive/My Drive/expert_segmentations"
    print('Path contains the following files and folders: \n', L(os.listdir(path)))
except:
    print("Warning: Connecting to Google Drive only works on Google Colab.")
    pass

Option B: Upload via zip file (Colab only)

  • The zip file must contain all segmentations and correct folder structure.
  • See here how to zip files on Windows or Mac.
path = Path('expert_segmentations')
try:
    u_dict = files.upload()
    for key in u_dict.keys():
        unzip(path, key)
    print('Path contains the following files and folders: \n', L(os.listdir(path)))
except:
    print("Warning: File upload only works on Google Colab.")
    pass

Option C: Provide path (Local installation)

If you're working on your local machine or server, provide a path to the correct folder.

path = "expert_segmentations" #@param {type:"string"}
path = Path(path)
print('Path contains the following files and folders: \n', L(os.listdir(path)))
#@markdown Example: "expert_segmentations"

Option D: Try with sample data (Testing only)

If you don't have any data available yet, try our sample data.

import urllib.request

path = Path('expert_segmentations')
url = "https://github.com/matjesg/bioimage_analysis/raw/master/train_data/lab-wue1/labels/"
experts = ['expert_'+str(e) for e in range(1,6)]
for e in experts:
    # Download the same sample mask from each expert into its own folder
    (path/e).mkdir(exist_ok=True, parents=True)
    urllib.request.urlretrieve(f'{url}{e}/0001_cFOS.png', path/e/'mask_1.png')

Load data

masks = get_image_files(path)
experts = set([m.parent.name for m in masks])
print(f'You have uploaded {len(masks)} files from the following experts: {experts}')

Ground Truth Estimation

Recommended: Simultaneous truth and performance level estimation (STAPLE)

The STAPLE algorithm considers a collection of segmentations and computes a probabilistic estimate of the true segmentation and a measure of the performance level represented by each segmentation.

Source: Warfield, Simon K., Kelly H. Zou, and William M. Wells. "Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation." IEEE transactions on medical imaging 23.7 (2004): 903-921
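
To make the idea concrete, here is a toy NumPy sketch of the expectation-maximization loop for the binary case, following the description in Warfield et al. It is not SimpleITK's implementation, and the notebook itself relies on `sitk.STAPLE`. The function name `staple_binary` and the parameters `n_iter`, `init`, and `eps` are illustrative assumptions, as is the use of the mean foreground fraction as the prior.

```python
import numpy as np

def staple_binary(masks, n_iter=50, init=0.9, eps=1e-6):
    """Toy EM for binary STAPLE (illustrative, not SimpleITK's code).

    Returns per-pixel foreground probabilities plus the estimated
    sensitivity and specificity of every rater.
    """
    shape = np.asarray(masks[0]).shape
    # D[j, i] is 1 if rater j marked pixel i as foreground
    D = np.stack([np.asarray(m).ravel() > 0 for m in masks]).astype(float)
    prior = np.clip(D.mean(), eps, 1 - eps)   # assumed global foreground prior
    p = np.full(len(D), init)                 # sensitivity per rater
    q = np.full(len(D), init)                 # specificity per rater
    for _ in range(n_iter):
        # E-step: probability that each pixel is truly foreground
        a = prior * np.prod(np.where(D == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - prior) * np.prod(np.where(D == 1, 1 - q[:, None], q[:, None]), axis=0)
        W = a / (a + b)
        # M-step: re-estimate each rater's performance from W
        p = np.clip((D * W).sum(axis=1) / W.sum(), eps, 1 - eps)
        q = np.clip(((1 - D) * (1 - W)).sum(axis=1) / (1 - W).sum(), eps, 1 - eps)
    return W.reshape(shape), p, q

# Two raters agree with the (made-up) truth, the third makes two errors
truth = np.array([[1, 1, 1, 1],
                  [0, 0, 0, 0]])
noisy = np.array([[1, 1, 1, 0],
                  [0, 0, 0, 1]])
prob, sens, spec = staple_binary([truth, truth, noisy])
estimate = (prob > 0.5).astype(np.uint8)
```

In this example the thresholded estimate follows the two consistent raters, and the third rater receives a lower estimated sensitivity and specificity. This is the "performance level" part of the algorithm.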

path_staple = path/'staple'
path_staple.mkdir(exist_ok=True)
unique_masks = set([m.name for m in masks])
for msk_name in progress_bar(unique_masks):
    print('Processing', msk_name)
    segmentations = [_read_msk(m) for m in masks if m.name==msk_name]
    staple_segmentation = staple(segmentations)
    out_mask = staple_segmentation*255 if staple_segmentation.max()==1 else staple_segmentation
    imageio.imsave(path_staple/msk_name, out_mask)

If connected to Google Drive, the ground truth estimations are automatically added to your drive. You can also download the files here:

with zipfile.ZipFile('staple_export.zip', 'w') as zipObj:
    for f in get_image_files(path_staple):
        zipObj.write(f)
try:
    files.download('staple_export.zip')
except:
    print("Warning: File download only works on Google Colab.")
    pass

Alternative: Majority Voting

Use majority voting to obtain the reference segmentation. Note that this filter does not resolve ties. In case of ties it will assign labelForUndecidedPixels to the result.
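
The tie behaviour described above can be illustrated with a small NumPy sketch. It mirrors the described voting rule but is not SimpleITK's implementation. The names `majority_vote` and `undecided_label` are hypothetical.

```python
import numpy as np

def majority_vote(masks, undecided_label=0):
    """Per-pixel label voting; tied pixels receive `undecided_label`."""
    stack = np.stack([np.asarray(m) for m in masks])   # (raters, H, W)
    labels = np.unique(stack)
    # votes[k] counts how many raters chose labels[k] at each pixel
    votes = np.stack([(stack == label).sum(axis=0) for label in labels])
    winner = labels[votes.argmax(axis=0)]
    # A tie occurs when more than one label reaches the top vote count
    tie = (votes == votes.max(axis=0)).sum(axis=0) > 1
    return np.where(tie, undecided_label, winner)

# Two experts disagree on the lower-left pixel only
expert_a = np.array([[1, 0],
                     [1, 1]])
expert_b = np.array([[1, 0],
                     [0, 1]])
result = majority_vote([expert_a, expert_b], undecided_label=2)
```

Here the disputed pixel is set to `2`, while all other pixels keep the label both experts agree on. With an even number of experts such ties are more likely, so the choice of `labelForUndecidedPixels` affects the result.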

labelForUndecidedPixels = 0 #@param {type:"integer"}
path_mv = path/'mv'
path_mv.mkdir(exist_ok=True)
unique_masks = set([m.name for m in masks])
for msk_name in progress_bar(unique_masks):
    print('Processing', msk_name)
    segmentations = [_read_msk(m) for m in masks if m.name==msk_name]
    mv_segmentation = mvoting(segmentations, labelForUndecidedPixels)
    imageio.imsave(path_mv/msk_name, mv_segmentation*255 if mv_segmentation.max()==1 else mv_segmentation)

If connected to Google Drive, the ground truth estimations are automatically added to your drive. You can also download the files here:

with zipfile.ZipFile('mv_export.zip', 'w') as zipObj:
    for f in get_image_files(path_mv):
        zipObj.write(f)
try:
    files.download('mv_export.zip')
except:
    print("Warning: File download only works on Google Colab.")
    pass