Research
2018

Vector-based navigation using grid-like representations in artificial agents.
Nature.
Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go. Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex. Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space and is critical for integrating self-motion (path integration) and planning direct trajectories to goals (vector-based navigation). Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types. We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments, optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation, demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments.
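As a toy illustration of the two computations named in the abstract, path integration and vector-based navigation, here is a minimal sketch (the 2-D dead-reckoning setup and function names are illustrative, not the paper's recurrent-network agent):

```python
import math

def path_integrate(start, steps):
    """Dead-reckon a 2-D position by accumulating (speed, heading) self-motion
    cues; this is the computation the recurrent network is trained to perform."""
    x, y = start
    for speed, heading in steps:          # heading in radians
        x += speed * math.cos(heading)
        y += speed * math.sin(heading)
    return (x, y)

def goal_vector(position, goal):
    """Vector-based navigation: the direct displacement from the current
    position estimate to the goal."""
    return (goal[0] - position[0], goal[1] - position[1])

# One unit step east, then one unit step north, ending near (1, 1).
pos = path_integrate((0.0, 0.0), [(1.0, 0.0), (1.0, math.pi / 2)])
```

The paper's point is that grid-like codes emerging from training on the first computation supply the metric needed for the second; this sketch only spells out the underlying geometry.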
2017

Neural Episodic Control.
Proceedings of the 34th International Conference on Machine Learning (ICML 2017).
Deep reinforcement learning methods attain superhuman performance in a wide range of environments. Such methods are grossly inefficient, often taking orders of magnitude more data than humans to achieve reasonable performance. We propose Neural Episodic Control: a deep reinforcement learning agent that is able to rapidly assimilate new experiences and act upon them. Our agent uses a semi-tabular representation of the value function: a buffer of past experience containing slowly changing state representations and rapidly updated estimates of the value function. We show across a wide range of environments that our agent learns significantly faster than other state-of-the-art, general-purpose deep reinforcement learning agents.
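The "semi-tabular representation" described above can be sketched as a key-value buffer read by kernel-weighted nearest-neighbour lookup. This is a simplified stand-in for the paper's differentiable neural dictionary; the class name, parameter names, and the inverse-distance kernel are illustrative:

```python
import numpy as np

class EpisodicValueBuffer:
    """Buffer of state embeddings (keys) paired with rapidly updated value
    estimates, read by kernel-weighted k-nearest-neighbour lookup."""

    def __init__(self, k=3):
        self.keys, self.values, self.k = [], [], k

    def write(self, embedding, value):
        self.keys.append(np.asarray(embedding, dtype=float))
        self.values.append(float(value))

    def read(self, embedding):
        q = np.asarray(embedding, dtype=float)
        d = np.array([np.linalg.norm(q - key) for key in self.keys])
        idx = np.argsort(d)[: self.k]
        w = 1.0 / (d[idx] + 1e-3)          # inverse-distance kernel weights
        return float(np.dot(w, np.array(self.values)[idx]) / w.sum())
```

Writing is a fast append; reading interpolates the stored returns of the k closest embeddings, which is what lets new experience influence value estimates immediately rather than through slow gradient updates.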

Comparison of Maximum Likelihood and GAN-based training of Real NVPs.
arXiv preprint.
We train a generator by maximum likelihood and we also train the same generator architecture by Wasserstein GAN. We then compare the generated samples, exact log-probability densities and approximate Wasserstein distances. We show that an independent critic trained to approximate the Wasserstein distance between the validation set and the generator distribution helps detect overfitting. Finally, we use ideas from the one-shot learning literature to develop a novel fast-learning critic.
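The overfitting diagnostic described above can be illustrated without a learned critic by substituting the empirical 1-D Wasserstein distance (a plain stand-in for the paper's independent critic; the function names are mine):

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical Wasserstein-1 distance between two equal-sized 1-D samples:
    the mean absolute difference of their sorted values."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

def overfitting_gap(generated, train, valid):
    """A generator that memorises training data sits closer to the training
    set than to held-out data; a clearly positive gap is the warning sign."""
    return wasserstein_1d(generated, valid) - wasserstein_1d(generated, train)
```

The paper performs the analogous comparison in high dimensions, where no closed-form distance is available, by training an independent critic to approximate the Wasserstein distance.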
2016

Early Visual Concept Learning with Unsupervised Deep Learning.
arXiv preprint.

Model-Free Episodic Control.
arXiv preprint.
State-of-the-art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery. In the brain, such rapid learning is thought to depend on the hippocampus and its capacity for episodic memory. Here we investigate whether a simple model of hippocampal episodic control can learn to solve difficult sequential decision-making tasks. We demonstrate that it not only attains a highly rewarding strategy significantly faster than state-of-the-art deep reinforcement learning algorithms, but also achieves a higher overall reward on some of the more challenging domains.
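The "simple model of hippocampal episodic control" admits a compact sketch: a table that stores, per state-action pair, the best discounted return ever experienced, and acts greedily on those records (names are illustrative; the paper additionally generalizes across states with k-nearest-neighbour lookup over embeddings):

```python
class EpisodicControlTable:
    """Model-free episodic control in miniature: remember the highest return
    ever obtained for each (state, action) and act greedily on the records."""

    def __init__(self):
        self.q = {}                        # (state, action) -> best return seen

    def update(self, state, action, ret):
        key = (state, action)
        self.q[key] = max(self.q.get(key, float("-inf")), ret)

    def best_action(self, state, actions, default=None):
        known = [(self.q[(state, a)], a) for a in actions if (state, a) in self.q]
        return max(known)[1] if known else default
```

Because the table keeps the maximum rather than an average, a single highly rewarding episode immediately shapes behaviour, which is the source of the fast learning the abstract reports.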

Associative long short-term memory.
Proceedings of the 33rd International Conference on Machine Learning (ICML 2016).
We investigate a new method to augment recurrent neural networks with extra memory without increasing the number of network parameters. The system has an associative memory based on complex-valued vectors and is closely related to Holographic Reduced Representations and Long Short-Term Memory networks. Holographic Reduced Representations have limited capacity: as they store more information, each retrieval becomes noisier due to interference. Our system, in contrast, creates redundant copies of stored information, which enables retrieval with reduced noise. Experiments demonstrate faster learning on multiple memorization tasks.
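A minimal sketch of the mechanism the abstract describes: complex-valued key-value binding in the style of Holographic Reduced Representations, with redundant permuted copies averaged at retrieval to suppress interference. Array sizes, the number of copies, and function names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, copies = 64, 8
perms = [rng.permutation(n) for _ in range(copies)]   # one permutation per redundant copy

def random_key(size):
    """Unit-modulus complex key, as in Holographic Reduced Representations."""
    return np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size))

def store(pairs):
    """Superpose key-value bindings (element-wise complex product) in each copy."""
    mem = np.zeros((copies, n), dtype=complex)
    for key, value in pairs:
        for c in range(copies):
            mem[c] += key[perms[c]] * value
    return mem

def retrieve(mem, key):
    """Unbind with the conjugate key; averaging the copies shrinks the
    interference noise contributed by other stored items."""
    return np.mean([mem[c] * np.conj(key[perms[c]]) for c in range(copies)], axis=0)
```

Retrieving with the matching key returns the stored value plus interference from the other items; because each copy permutes the keys differently, the interference terms decorrelate across copies and average toward zero.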

Neural autoregressive distribution estimation.
Journal of Machine Learning Research.
We present Neural Autoregressive Distribution Estimation (NADE) models, which are neural network architectures applied to the problem of unsupervised distribution and density estimation. They leverage the probability product rule and a weight-sharing scheme inspired by restricted Boltzmann machines to yield an estimator that is both tractable and has good generalization performance. We discuss how they achieve competitive performance in modeling both binary and real-valued observations. We also present how deep NADE models can be trained to be agnostic to the ordering of input dimensions used by the autoregressive product rule decomposition. Finally, we also show how to exploit the topological structure of pixels in images using a deep convolutional architecture for NADE.
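The "probability product rule" plus "weight sharing" that make NADE tractable can be sketched for binary data as a single pass that updates one shared hidden pre-activation per dimension (parameter shapes are illustrative: W is H×D, V is D×H):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nade_log_density(x, W, V, b, c):
    """Log-density of a binary vector under a NADE-style model: each
    conditional p(x_d | x_<d) reuses the same weights W, so the joint
    density is computed exactly in one sweep over the dimensions."""
    a = c.copy()                  # shared hidden pre-activation, built up per dimension
    logp = 0.0
    for d in range(len(x)):
        h = sigmoid(a)
        p = sigmoid(b[d] + V[d] @ h)          # p(x_d = 1 | x_<d)
        logp += np.log(p if x[d] == 1 else 1.0 - p)
        a += W[:, d] * x[d]                   # fold x_d into the hidden state
    return logp
```

Because each factor is a proper conditional, summing exp(log p) over all 2^D binary vectors gives exactly 1; that is the tractability claim in miniature.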
2015

Connectionist multivariate density estimation and its application to speech synthesis.
The University of Edinburgh, School of Informatics, PhD dissertation (2015).
Autoregressive models factorize a multivariate joint probability distribution into a product of one-dimensional conditional distributions. The variables are assigned an ordering, and the conditional distribution of each variable is modelled using all variables preceding it in that ordering as predictors.
Calculating normalized probabilities and sampling has polynomial computational complexity under autoregressive models. Moreover, binary autoregressive models based on neural networks obtain statistical performance similar to that of some intractable models, such as restricted Boltzmann machines, on several datasets.
The use of autoregressive probability density estimators based on neural networks to model real-valued data, while proposed before, has never been properly investigated and reported. In this thesis we extend the formulation of neural autoregressive distribution estimators (NADE) to real-valued data; a model we call the real-valued neural autoregressive density estimator (RNADE). Its statistical performance on several datasets, including visual and auditory data, is reported and compared to that of other models. RNADE obtained higher test likelihoods than other tractable models, while retaining all the attractive computational properties of autoregressive models.
However, autoregressive models are limited by the ordering of the variables inherent to their formulation. Marginalization and imputation tasks can only be solved analytically if the missing variables are at the end of the ordering. We present a new training technique that obtains a set of parameters that can be used for any ordering of the variables. By choosing a model with a convenient ordering of the dimensions at test time, it is possible to solve any marginalization and imputation tasks analytically.
The same training procedure also makes it practical to train NADEs and RNADEs with several hidden layers. The resulting deep and tractable models display higher test likelihoods than the equivalent one-hidden-layer models for all the datasets tested.
Ensembles of NADEs or RNADEs can be created inexpensively by combining models that share their parameters but differ in the ordering of the variables. These ensembles of autoregressive models obtain state-of-the-art statistical performance on several datasets.
Finally, we demonstrate the application of RNADE to speech synthesis, and confirm that capturing the phone-conditional dependencies of acoustic features improves the quality of synthetic speech. Our model generates synthetic speech that was judged by naive listeners to be of higher quality than that generated by mixture density networks, which are considered a state-of-the-art synthesis technique.

Modelling Acoustic-Feature Dependencies with Artificial Neural Networks: Trajectory-RNADE.
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2015).
Given a transcription, sampling from a good model of acoustic feature trajectories should result in plausible realizations of an utterance. However, samples from current probabilistic speech synthesis systems result in low-quality synthetic speech. Henter et al. have demonstrated the need to capture the dependencies between acoustic features conditioned on the phonetic labels in order to obtain high-quality synthetic speech. These dependencies are often ignored in neural-network-based acoustic models. We tackle this deficiency by introducing a probabilistic neural network model of acoustic trajectories, trajectory RNADE, able to capture these dependencies.
2014

A Deep and Tractable Density Estimator.
Proceedings of the 31st International Conference on Machine Learning (ICML 2014).
The Neural Autoregressive Distribution Estimator (NADE) and its real-valued version RNADE are competitive density models of multidimensional data across a variety of domains. These models use a fixed, arbitrary ordering of the data dimensions. One can easily condition on variables at the beginning of the ordering and marginalize out variables at the end of the ordering; however, other inference tasks require approximate inference. In this work we introduce an efficient procedure to simultaneously train a NADE model for each possible ordering of the variables by sharing parameters across all these models. We can thus use the most convenient model for each inference task at hand, and ensembles of such models with different orderings are immediately available. Moreover, unlike the original NADE, our training procedure scales to deep models. Empirically, ensembles of deep NADE models obtain state-of-the-art density estimation performance.
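The order-agnostic training procedure can be sketched as mask sampling: each update draws a random ordering and a random split point, conditioning on the dimensions before the split and predicting the rest; at test time an ordering is chosen so that the missing variables come last, making marginalization and imputation exact. A minimal sketch (function names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_training_mask(D):
    """One training step's conditioning mask: a random ordering and a random
    split; True marks dimensions treated as observed inputs, False marks
    prediction targets. One shared parameter set serves every ordering."""
    order = rng.permutation(D)
    split = rng.integers(0, D + 1)
    observed = np.zeros(D, dtype=bool)
    observed[order[:split]] = True
    return observed

def inference_ordering(observed):
    """At test time, place observed dimensions first and missing ones last,
    so the chosen model answers the marginalization/imputation query exactly."""
    observed = np.asarray(observed, dtype=bool)
    return np.concatenate([np.flatnonzero(observed), np.flatnonzero(~observed)])
```

Averaging the densities of several orderings drawn this way yields the inexpensive ensembles the abstract mentions, since all orderings share one set of parameters.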
2013

RNADE: The real-valued neural autoregressive density-estimator.
Advances in Neural Information Processing Systems 26 (NIPS 2013).
We introduce RNADE, a new model for joint density estimation of real-valued vectors. Our model calculates the density of a datapoint as the product of one-dimensional conditionals modeled using mixture density networks with shared parameters. RNADE learns a distributed representation of the data, while having a tractable expression for the calculation of densities. A tractable likelihood allows direct comparison with other methods and training by standard gradient-based optimizers. We compare the performance of RNADE on several datasets of heterogeneous and perceptual data, finding it outperforms mixture models in all but one case.
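The factorization described above decomposes a real-valued joint density into one-dimensional mixture-of-Gaussians conditionals. A sketch of that product, with `conditional_params` standing in for the shared-parameter network that maps a prefix x_<d to mixture weights, means, and standard deviations (names are illustrative):

```python
import numpy as np

def gaussian_mixture_logpdf(x, weights, means, stds):
    """Log-density of a scalar under a 1-D Gaussian mixture, the form RNADE
    uses for each conditional p(x_d | x_<d)."""
    comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi) - np.log(stds)
        - 0.5 * ((x - means) / stds) ** 2
    )
    m = comp.max()
    return float(m + np.log(np.exp(comp - m).sum()))   # log-sum-exp over components

def rnade_log_density(x, conditional_params):
    """Product rule for real-valued data: conditional_params(d, x_prefix)
    returns the (weights, means, stds) of dimension d given the prefix x_<d."""
    return sum(
        gaussian_mixture_logpdf(x[d], *conditional_params(d, x[:d]))
        for d in range(len(x))
    )
```

For instance, a `conditional_params` that ignores its prefix and returns a single standard-normal component reduces the product to an independent Gaussian density, which makes an easy correctness check.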
2012

Deep Architectures for Articulatory Inversion.
Proc. Interspeech, Portland, Oregon, USA.
We implement two deep architectures for the acoustic-articulatory inversion mapping problem: a deep neural network and a deep trajectory mixture density network. We find that in both cases deep architectures produce more accurate predictions than shallow architectures, and that this is due to the higher expressive capability of a deep model and not a consequence of adding more adjustable parameters. We also find that a deep trajectory mixture density network is able to obtain better inversion accuracies than smoothing the results of a deep neural network. Our best model obtained an average root mean square error of 0.885 mm on the MNGU0 test dataset.
2011

A deep belief network for the acoustic-articulatory inversion mapping problem.
The University of Edinburgh, School of Informatics, MSc dissertation.
In this work, we implement a deep belief network for the acoustic-articulatory inversion mapping problem.
We find that adding up to 3 hidden layers improves inversion accuracy. We also show this is due to the higher expressive capability of a deep model and not a consequence of adding more adjustable parameters. Furthermore, we show that unsupervised pre-training of the system improves its performance in all cases, even for a 1-hidden-layer model. Our implementation obtained an average root mean square error of 0.95 mm on the MNGU0 test dataset, beating all previously published results.