Reusable Deep Neural Networks:

Applications to Biomedical Data

Important Glossary

Baseline (BL): Stacked Denoising Autoencoders (SDA) are multi-layer networks in which each layer is trained as a denoising autoencoder (dA). SDA training comprises two stages: an unsupervised pre-training stage followed by a supervised fine-tuning stage. During pre-training, the network is built by stacking multiple dAs one on top of the other, each learning unsupervised features (weights and biases) from the output of the layer below. A logistic regression layer is then added on top and the whole network is fine-tuned in a supervised way, thus learning supervised features.
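
A minimal sketch of the baseline SDA, assuming PyTorch (the framework, layer sizes, corruption level, epochs and learning rates are illustrative assumptions, not the thesis configuration): each layer is pre-trained as a denoising autoencoder, the encoders are stacked, a logistic regression layer is added on top, and the whole stack is fine-tuned with back-propagation.

import torch
import torch.nn as nn

def pretrain_dA(encoder, data, corruption=0.3, epochs=15, lr=0.1):
    # Unsupervised pre-training of one layer as a denoising autoencoder:
    # reconstruct the clean input from a randomly corrupted version of it.
    decoder = nn.Linear(encoder.out_features, encoder.in_features)
    opt = torch.optim.SGD(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    for _ in range(epochs):
        for x in data:                                             # x: (batch, n_in), scaled to [0, 1]
            noisy = x * (torch.rand_like(x) > corruption).float()  # masking corruption
            recon = torch.sigmoid(decoder(torch.sigmoid(encoder(noisy))))
            loss = nn.functional.binary_cross_entropy(recon, x)
            opt.zero_grad(); loss.backward(); opt.step()

def build_sda(layer_sizes, n_classes, unlabeled_data):
    # Greedily pre-train and stack the dA encoders, then add a logistic
    # regression layer on top for the supervised stage.
    modules, inputs = [], unlabeled_data
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        enc = nn.Linear(n_in, n_out)
        pretrain_dA(enc, inputs)
        inputs = [torch.sigmoid(enc(x)).detach() for x in inputs]  # input to the next dA
        modules += [enc, nn.Sigmoid()]
    modules.append(nn.Linear(layer_sizes[-1], n_classes))          # logistic regression layer
    return nn.Sequential(*modules)

def finetune(model, labeled_data, epochs=15, lr=0.1):
    # Supervised fine-tuning of the whole stack with back-propagation.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in labeled_data:
            loss = nn.functional.cross_entropy(model(x), y)
            opt.zero_grad(); loss.backward(); opt.step()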


Transfer learning (TL): We first train the source network with the source data and labels and then copy its hidden layers to the target network. If the source labels differ from the target labels, we add a randomly initialized classifier layer. The network is then trained on the target task. If the performance of the newly trained target network exceeds that of the baseline approach, we have positive transfer; otherwise we have negative transfer.
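
A hedged sketch of the transfer step, reusing the PyTorch conventions of the sketch above; the function name and the layer indexing are illustrative assumptions.

import copy
import torch.nn as nn

def make_target_network(source_model, n_target_classes):
    # Copy the source hidden layers and attach a freshly (randomly) initialized
    # classifier, since the source and target label sets differ.
    layers = list(source_model.children())
    hidden = copy.deepcopy(layers[:-1])                  # everything except the source classifier
    last_linear = [m for m in hidden if isinstance(m, nn.Linear)][-1]
    classifier = nn.Linear(last_linear.out_features, n_target_classes)
    return nn.Sequential(*hidden, classifier)

# The target network is then trained on the target task; if it outperforms the
# baseline trained from scratch, the transfer is positive, otherwise negative.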


Transferred layers: We select a particular layer or set of layers of the baseline network to transfer. For example, we may transfer the first-layer features of the baseline approach to the target network, while the features of the remaining target network layers are randomly initialized.
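
For illustration, a small sketch of transferring only the first-layer features (the function name and the assumption that the corresponding layers have matching sizes are ours); all other target layers keep their random initialization.

import torch.nn as nn

def transfer_first_layer(source_model, target_model):
    # Copy only the first-layer weights and biases from source to target;
    # the remaining target layers are left as randomly initialized.
    src_first = [m for m in source_model.modules() if isinstance(m, nn.Linear)][0]
    tgt_first = [m for m in target_model.modules() if isinstance(m, nn.Linear)][0]
    tgt_first.load_state_dict(src_first.state_dict())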


Retraining layers: Once the features are transferred to the target network, we add a logistic regression layer for the target task. We can either fine-tune the entire network as a multi-layer perceptron using back-propagation, or lock a layer, meaning the features transferred from the source network do not change during error back-propagation for the target task. This gives a choice of whether or not to fine-tune particular layers of the target network and opens up several possible approaches to a problem, with layers optionally locked or unlocked. Locking layers can, however, split fragilely co-adapted neurons across adjacent layers, leading to optimization difficulties. The choice of whether or not to fine-tune the first layer of the target network depends on the size of the target dataset and the number of parameters.
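
A small sketch of locking versus unlocking a transferred layer, again assuming PyTorch; locking is expressed by switching off gradients for that layer's parameters, and the commented usage lines refer to the hypothetical helpers introduced in the earlier sketches.

import torch
import torch.nn as nn

def set_locked(layer: nn.Module, locked: bool = True):
    # A locked layer receives no gradient updates during back-propagation,
    # so its transferred features stay fixed while unlocked layers are fine-tuned.
    for p in layer.parameters():
        p.requires_grad = not locked

# Example: lock the transferred first layer and fine-tune the rest.
# target_model = make_target_network(source_model, n_target_classes)
# set_locked(list(target_model.children())[0])
# trainable = [p for p in target_model.parameters() if p.requires_grad]
# opt = torch.optim.SGD(trainable, lr=0.1)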


Transfer learning unsupervised (TLu): We transfer the unsupervised features of the SDA model from the source to the target network. Once the features are transferred, we add a logistic regression layer for the target task. We then fine-tune the entire classifier as a regular multi-layer perceptron with back-propagation, choosing to lock or unlock certain layers to solve the target task.


Transfer learning supervised (TLs): The trained weights of the BL approach, i.e. after supervised fine-tuning, are used. These features are transferred from the source to the target network, after which we back-propagate, choosing to lock or unlock certain layers to solve the target task.


Source reuse mode: for TLu the source reuse mode is "PT" (features taken after pre-training only); for TLs it is "PT+FT" (features taken after both pre-training and fine-tuning).
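
A sketch of how the two reuse modes could select which source checkpoint to load, assuming the source hidden-layer weights were saved once after pre-training ("PT", reused by TLu) and once after fine-tuning of the BL ("PT+FT", reused by TLs); the function and file names are placeholders, not part of the thesis code.

import torch
import torch.nn as nn

def load_source_features(target_hidden: nn.Sequential, mode: str = "PT"):
    # "PT": hidden-layer weights saved right after unsupervised pre-training (TLu).
    # "PT+FT": hidden-layer weights saved after supervised fine-tuning of the BL (TLs).
    path = "source_hidden_pt.pt" if mode == "PT" else "source_hidden_pt_ft.pt"
    target_hidden.load_state_dict(torch.load(path))      # copy the hidden-layer stack
    return target_hidden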