Publications & Preprints

  • Continuous-Time Meta-Learning with Forward Mode Differentiation. T. Deleu, D. Kanaa, L. Feng, G. Kerg, Y. Bengio, G. Lajoie, P.-L. Bacon. Submitted to the Tenth International Conference on Learning Representations (ICLR), 2022. openreview

  • Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization. S. Jastrzebski, D. Arpit, O. Astrand, G. Kerg, H. Wang, C. Xiong, R. Socher, K. Cho and K. Geras. Accepted at the Thirty-eighth International Conference on Machine Learning (ICML), 2021. arxiv

  • Network-level computational advantages of single-neuron adaptation. V. Geadah, G. Lajoie, G. Kerg, S. Horoi and G. Wolf. Accepted at Computational and Systems Neuroscience (COSYNE), 2021.

  • Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization. S. Jastrzebski, D. Arpit, O. Astrand, G. Kerg, H. Wang, C. Xiong, R. Socher, K. Cho and K. Geras. Accepted at the NeurIPS 2020 Workshop on Optimization for Machine Learning (OPT), 2020. paper

  • Untangling trade-offs between recurrence and self-attention in artificial neural networks. G. Kerg*, B. Kanuparthi*, A. Goyal, K. Goyette, Y. Bengio and G. Lajoie. Accepted at Advances in Neural Information Processing Systems (NeurIPS), 2020. arxiv

  • Guarantees for stable signal and gradient propagation in self-attentive recurrent networks. G. Kerg, B. Kanuparthi, A. Goyal, K. Goyette, Y. Bengio and G. Lajoie. Accepted at DeepMath 2020 (Conference on the Mathematical Theory of Deep Neural Networks), 2020.

  • Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs. G. Kerg*, B. Kanuparthi*, A. Goyal, K. Goyette, Y. Bengio and G. Lajoie. Accepted at the ICML 2020 Workshop on Inductive Biases, Invariances and Generalization in RL, 2020. Also accepted at the Montreal AI Symposium (MAIS), 2020. paper

  • Advantages of biologically-inspired adaptive neural activation in RNNs during learning. V. Geadah, G. Kerg, S. Horoi, G. Wolf and G. Lajoie. Preprint. arxiv

  • Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics. G. Kerg*, K. Goyette*, M.P. Touzel, G. Gidel, E. Vorontsov, Y. Bengio and G. Lajoie. Accepted at Advances in Neural Information Processing Systems (NeurIPS), 2019. arxiv

  • h-detach: Modifying the LSTM gradient towards better optimization. B. Kanuparthi*, D. Arpit*, G. Kerg, R. Ke, I. Mitliagkas and Y. Bengio. Accepted at the Seventh International Conference on Learning Representations (ICLR), 2019. openreview arxiv

  • Safe Screening for Support Vector Machines. J. Zimmert, C. Schröder de Witt, G. Kerg and M. Kloft. Accepted at the NIPS 2015 Workshop on Optimization for Machine Learning (OPT), 2015. paper

  • On Neretin's group of tree spheromorphisms. Master's thesis for the MSc in pure mathematics, Université Libre de Bruxelles, 2013. thesis

  • Expansion in groups. Essay for Part III of the Mathematical Tripos (equivalent to a Master's thesis), University of Cambridge, 2012. thesis