Abstract
Developing meta-learning algorithms that are not biased toward a subset of
training tasks often requires hand-designed criteria to weight tasks,
potentially resulting in suboptimal solutions. In this paper, we introduce a
new, principled, and fully automated task-weighting algorithm for meta-learning
methods. By considering the weights of tasks within the same mini-batch as an
action, and the meta-parameter of interest as the system state, we cast the
task-weighting problem in meta-learning as a trajectory optimisation problem
and employ the iterative linear quadratic regulator to determine the optimal
action, that is, the weights of tasks. We theoretically show that the proposed algorithm converges
to an $\epsilon_{0}$-stationary point, and empirically demonstrate that the
proposed approach outperforms common hand-engineered weighting methods on two
few-shot learning benchmarks.