Multi-task Learning


Multi-task learning is the sub-branch of machine learning. In which, many learning tasks are resolved together using commonalities and changes through tasks. When related to training the models alone, this may outcome in enhanced learning effectiveness and estimate correctness for the task-specific models.

Multitask Learning is a method of inductive transfer. It enhances generalization by using the domain information kept in check in the training signals of related tasks. In this article, we will learn about multi-task learning in machine learning in detail.


Multi-task learning is a machine learning method in which we go to learn many tasks at the same time, enhancing various loss functions at once. We permit a single model to learn to complete all of the tasks at once somewhat than training independent models for each task. The model uses all of the obtainable data through the different tasks to learn widespread pictures of the data that are useful in many contexts in this process.

Multi-task learning has realized well-known use across multiple domains. For example, natural language processing, computer vision, and recommendation systems. It is similarly usually leveraged in the industry because of its ability to well leverage big amounts of data in order to solve linked tasks.

Multi-task Learning

The above figure shows a very common form of multi-task learning. In which,

  • Not the same supervised tasks (predicting y(i) given x) share the same input x.
  • In addition to some intermediate-level representation h(shared) captures a common pool of factors.
  • The model can usually be divided into two kinds of parts and related parameters:

Task-specific parameters

They only advantage from the instances of their task to achieve good generalization. These are the upper layers of the neural network as shown in the figure.

Generic parameters

These are shared across all the tasks. They benefit from the pooled data of all the tasks. These are the lower layers of the neural network as may be understood in the figure.

  • Multi-task learning can be cast in some methods in deep learning frameworks.
  • This figure shows the usual situation where the tasks share a common input.
  • However, include changed target random variables.
  • The lower layers of a deep network may be shared through such tasks.
  • While task-specific parameters can be well-read on top of those yielding a shared representation h (shared).
  • The fundamental theory is that there is a common pool of factors that explain the differences in the input x.
  • While every task is related to a subset of these factors.
  • It is moreover expected that top-level hidden units h (1) and h (2) are particular to each task separately predicting y (1) and y (2).
  • While some intermediate-level representation is shared through all tasks.
  • It makes sense for some of the top-level factors to be allied with none of the output tasks (h (3).
  • These are the factors that explain some of the input variations then are not relevant for predicting y (1) or y (2).

Multi-task Learning

  • Above learning curves viewing how the negative log-likelihood loss changes over time, those specified as a number of training iterations over the dataset, or epochs.
  • We train, in this instance, a maxout network on MNIST.
  • Detect that the training objective declines reliably over time.
  • Though, the validation set average loss finally instigates to re-increase which forms an asymmetric U-shaped curve.
  • Better overview and generalization error bounds may be attained due to the shared parameters, for which statistical strength can be greatly improved (in quantity with the increased number of examples for the shared parameters.
  • This will occur only if some expectations about the statistical relationship between the different tasks are valid. It means that there is rather shared across some of the tasks.

Use of multi-task learning

  • This is significant to go through situations in which multi-task learning is, and is not, suitable.
  • Usually, multi-task learning should be used when the tasks have some level of correlation.
  • In another way, multi-task learning enhances presentation when there are original values or information shared between tasks.
  • For instance, two tasks relating to categorizing images of animals are possible to be correlated because both tasks will include learning to distinguish hair patterns and colors.
  • This use case will be the best for multi-task learning since learning these features of images is valuable for both tasks.
  • Instead, occasionally training on several tasks results in negative transfer between the tasks.
  • In which the multi-task model does worse than the equal single-task models.
  • This usually occurs when the different tasks are unrelated to each other.
  • Also, when the information learned in one task denies in another task.


  • Muti-task learning is being more often used.
  • It is still prevalent for neural-network-based Multi-task learning.
  • However, current developments on learning what to share are auspicious.
  • We need to learn more about tasks to gain a better understanding of the generalization capabilities of Muti-task learning with regard to deep neural networks.