Technical Terms
Preparation of training data: The following terms are often used when describing the preparation work required to generate training data for the neural network.
Image segmentation
Image segmentation describes the process in which a person manually marks an image; this is also called annotation. Instance labels are used for this, and regions of interest can be identified.
Instance labels
These labels are set manually during the segmentation phase. Experts use them to highlight the objects of interest and separate them from background noise. Often two colours are used: one for the objects the later model should find, and one for the background or structures that should be ignored during training.
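A two-colour label image can be represented as a binary array. The following is a minimal sketch (the mask values and region are purely illustrative): 1 marks object pixels, 0 marks background.

```python
import numpy as np

# A hypothetical 5x5 instance label mask: 1 = object of interest,
# 0 = background or structures to ignore during training.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:3, 1:4] = 1  # an illustrative annotated object region

n_object_pixels = int(mask.sum())  # 6 pixels belong to the object
```

In practice such masks are drawn in annotation tools and exported as image files, but internally they are exactly this kind of per-pixel label array.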
ROI - Regions of Interest
These are the regions of the image that contain the features most important for training the model. Experts highlight (segment) them manually before training starts.
(Segmentation) mask
The segmentation mask is an image fully annotated by a person, often an expert in the corresponding field.
Classes
The classes in deepFlash2 denote how many different categories of labels were used by the experts during the segmentation of the training data.
Training of the neural network
Loss function
The loss function quantifies how far the neural network's predictions deviate from the expert labels when learning from sample data. During training, the weights of the individual neurons are adjusted to minimize this loss.
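As an illustration, binary cross-entropy is one common loss function for two-class segmentation; the sketch below (function name and values are illustrative, not deepFlash2's implementation) shows how it penalizes a prediction that disagrees with the expert label.

```python
import math

def binary_cross_entropy(prediction, label):
    """Loss for one pixel: prediction is a probability in (0, 1),
    label is the expert annotation (1 = object, 0 = background)."""
    eps = 1e-12  # avoid log(0)
    return -(label * math.log(prediction + eps)
             + (1 - label) * math.log(1 - prediction + eps))

good = binary_cross_entropy(0.9, 1)  # confident and correct -> small loss
bad = binary_cross_entropy(0.1, 1)   # confident and wrong -> large loss
```

The gradient of this loss with respect to each weight tells the optimizer in which direction to adjust that weight.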
Hyperparameter
Hyperparameters are values that we set before the actual learning process starts. With these parameters we can influence the duration and accuracy of the learning process.
Typical hyperparameters are:
Learning rate
Number of epochs
Batch size
Learning rate
With the learning rate we control the step size the neural network takes when searching for an optimum.
By adjusting the learning rate we choose how strongly the weights react to the information found in the training data in each iteration.
When the learning rate is too high, the updates overshoot the optimum, which leads to unstable training and poor performance.
When the learning rate is too small, the neural network takes very long to reach a good result.
LR Finder
The learning rate finder is an automated approach to finding an optimal learning rate for your use case. It approximates a value that is a good compromise between the resulting performance of the model on the one hand and the training duration on the other.
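The idea can be sketched as follows: try a range of exponentially spaced learning rates on a toy loss and keep the largest rate that still reduces it. This is a heavily simplified illustration; real finders (such as the fastai learning rate finder) sweep the rate during actual training and inspect the resulting loss curve.

```python
def loss_after_steps(lr, steps=20, w=0.0):
    """Loss on f(w) = (w - 3)^2 after a few gradient steps at rate lr."""
    for _ in range(steps):
        w -= lr * 2 * (w - 3)
    return (w - 3) ** 2

# Exponentially spaced candidates: 1e-4, 1e-3, 1e-2, 1e-1, 1.
candidates = [10 ** e for e in range(-4, 1)]
initial_loss = 9.0  # f(0) = (0 - 3)^2
usable = [lr for lr in candidates if loss_after_steps(lr) < initial_loss]
best = max(usable)  # largest rate that still made progress
```

Picking the largest rate that still improves the loss keeps training fast without becoming unstable.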
Number of epochs
With the number of epochs we can set how often the complete set of training data is shown to the neural network.
Batch size
With the batch size we can set how many samples from our training data are fed through the neural network in one iteration.
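How epochs and batch size together determine the length of a training run can be sketched with illustrative numbers: the data set is split into batches, and one epoch shows every batch to the network once.

```python
def count_iterations(n_samples, batch_size, n_epochs):
    """Total number of parameter-update iterations in a training run."""
    batches_per_epoch = -(-n_samples // batch_size)  # ceiling division
    return batches_per_epoch * n_epochs

# 100 training images at batch size 8 -> 13 batches per epoch;
# 5 epochs -> 65 iterations in total.
total = count_iterations(n_samples=100, batch_size=8, n_epochs=5)
```

Larger batches mean fewer, smoother updates per epoch; smaller batches mean more, noisier updates.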
Ensemble Training
Ensemble training combines the predictions of multiple models to improve the result. The final prediction is a weighted average of the individual outputs.
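A minimal sketch of the weighted-average step, with purely illustrative probabilities and weights: each model outputs a per-pixel probability, and the ensemble prediction is their weighted mean.

```python
# Per-pixel object probabilities from two hypothetical models.
predictions = [
    [0.9, 0.2, 0.7],  # model A
    [0.8, 0.4, 0.5],  # model B
]
weights = [0.75, 0.25]  # e.g. model A performed better on validation data

ensemble = [
    sum(w * p[i] for w, p in zip(weights, predictions))
    for i in range(len(predictions[0]))
]
```

Averaging tends to cancel out the individual models' mistakes, which is why ensembles usually outperform any single member.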
TTA - Test Time Augmentation
TTA randomly modifies the images at prediction time: each image is fed through the neural network several times with different modifications, and the resulting predictions are averaged. The goal is a more robust result that does not depend on one particular view of the image. Typical modifications are alterations in zoom, orientation of the images, and flips.
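The mechanics can be sketched with a flip as the single augmentation; `model` here is a stand-in for a trained network, not a real deepFlash2 call.

```python
import numpy as np

def model(image):
    # Stand-in for a trained network: returns a per-pixel probability.
    return image * 0.5

def predict_with_tta(image):
    plain = model(image)
    # Predict on the flipped image, then flip the prediction back
    # so it aligns with the original before averaging.
    flipped = np.fliplr(model(np.fliplr(image)))
    return (plain + flipped) / 2

image = np.array([[0.2, 0.8]])
tta_prediction = predict_with_tta(image)
```

The key detail is that each geometric transformation must be undone on the prediction before averaging, so all predictions refer to the same pixel positions.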
STAPLE Algorithm
The STAPLE algorithm is used to validate the segmentation of an image, whether it comes from an expert or from a model. STAPLE stands for simultaneous truth and performance level estimation. It works by comparing multiple segmentations of the same image produced by different experts or models. Furthermore, the algorithm tries to account for differences in segmentation skill or biases between experts, so that the better segmentations have more influence on the overall result.
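A heavily simplified illustration of the principle (not the real algorithm, which iterates an EM procedure over per-rater sensitivity and specificity): estimate a consensus, score each rater by agreement with it, then re-weight the votes by that score so better raters have more influence. All masks and labels here are illustrative.

```python
import numpy as np

masks = np.array([
    [1, 1, 0, 0, 1],  # hypothetical experienced rater
    [1, 1, 0, 0, 0],  # hypothetical experienced rater
    [0, 1, 1, 0, 1],  # hypothetical less consistent rater
], dtype=float)

consensus = (masks.mean(axis=0) > 0.5).astype(float)  # initial estimate
agreement = (masks == consensus).mean(axis=1)         # per-rater score
weighted = (agreement[:, None] * masks).sum(axis=0) / agreement.sum()
estimate = (weighted > 0.5).astype(int)               # re-weighted consensus
```

The real STAPLE repeats this kind of re-estimation until the truth estimate and the performance levels stop changing.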
Majority Voting Algorithm
The algorithm looks for a majority in the segmented data and uses it as the single truth to evaluate the segmentation performance of the neural network. The majority in this case describes the labels that agree across most expert segmentations. The algorithm does not weigh in any qualitative aspects when finding the majority; therefore the result can be biased when the experts' segmentation skills vary.
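Pixel-wise majority voting over several binary expert masks can be sketched as follows (the masks are illustrative; 1 = object, 0 = background):

```python
import numpy as np

masks = np.array([
    [1, 1, 0, 0],  # expert 1
    [1, 0, 0, 1],  # expert 2
    [1, 1, 0, 0],  # expert 3
])

# With binary labels, the majority label is 1 wherever more than
# half of the experts voted 1. Every expert counts equally.
votes = masks.sum(axis=0)
majority = (votes > masks.shape[0] / 2).astype(np.uint8)
```

Because every vote carries the same weight, a systematic error shared by a majority of raters ends up in the consensus, which is the bias described above.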
Model
The model represents the final settings of the neural network when the training process is finished. These settings include the weights of every single neuron in the neural network and can be exported as a file to be used for future object recognition tasks. To achieve good results with a pre-trained model, the images should contain similar objects and structures to those used to train the model.
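Exporting and re-importing learned parameters can be sketched as below. Real frameworks store weights in binary checkpoint files; JSON and the layer names are used here only to keep the illustration self-contained.

```python
import json
import os
import tempfile

# Hypothetical learned weights after training has finished.
weights = {"layer1": [0.12, -0.4], "layer2": [1.5]}

path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(path, "w") as f:
    json.dump(weights, f)          # export the trained model to a file

with open(path) as f:
    restored = json.load(f)        # import it later for new predictions
```

The exported file contains only the parameters, so the importing side must build the same network architecture before loading them.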