Academic Awards 2025 booklet

Dropout Neural Network Training Viewed from a Percolation Perspective

Artificial intelligence (AI) technologies are becoming increasingly prevalent across many vital industries, such as healthcare, energy, and agriculture. Underpinning many of these technologies are Neural Networks (NNs). Developing solid mathematical theory behind NNs is imperative for meeting many of the challenges in the safety, explainability, and performance of AI.

A major challenge with NNs is ensuring that they make high-quality predictions on unseen data, a property termed generalisability. I studied a training method used to improve generalisability, known as dropout. Despite its widespread use in practice, the analytical picture of how and why dropout works well is not yet complete. I took a different perspective on understanding dropout, using the lens of Percolation Theory, a field of mathematics originally used to describe the flow of liquid through a porous medium, such as water through coffee grounds.

My main contributions were threefold:
• I established the existence of percolative behaviour in dropout, connecting these two seemingly different problems.
• I showed that this behaviour causes the performance of dropout to break down critically for a large class of NNs.
• Finally, I began to develop new mathematical tools in order to generalise my work to an even larger class of NNs. These ideas begin to form a framework for analysing a general set of training algorithms.

Figure 1: Schematic diagrams of a neural network, with the input layer, biases, hidden layers, and output layer labelled. (a) On the left, the network is ‘fully connected’: each neuron is connected to every neuron in the next layer. (b) On the right, the network has had dropout applied, meaning half of the connections have been randomly removed.

Figure 2: Two samples of bond percolation on a 30 × 30 lattice, (a) p = 0.53 and (b) p = 0.47. Each lattice has been formed by deleting edges at random with probability 1 - p. The largest ‘connected component’ of each grid is shown in red. The red part fills the left grid, meaning liquid could flow from one side to the other, but on the right it could not.
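
The bond percolation experiment in Figure 2 can be reproduced in a few lines. The Python sketch below is illustrative only and is not the code used to produce the figure: the function name `bond_percolation`, its parameters, and the use of a union-find structure are my own assumptions. It keeps each edge of an n × n square lattice with probability p (equivalently, deletes it with probability 1 - p), then reports the size of the largest connected component and whether some component links the left edge of the grid to the right edge.

```python
import random


def find(parent, x):
    """Return the root of x's component, shortening the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def bond_percolation(n=30, p=0.5, seed=0):
    """Sample bond percolation on an n x n square lattice (hypothetical helper).

    Each edge is kept with probability p, i.e. deleted with probability 1 - p.
    Returns (largest_component_size, spans_left_to_right).
    """
    rng = random.Random(seed)
    parent = list(range(n * n))
    size = [1] * (n * n)

    def union(a, b):
        ra, rb = find(parent, a), find(parent, b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra          # attach the smaller tree to the larger one
        size[ra] += size[rb]

    for r in range(n):
        for c in range(n):
            v = r * n + c
            if c + 1 < n and rng.random() < p:   # edge to the right neighbour
                union(v, v + 1)
            if r + 1 < n and rng.random() < p:   # edge to the neighbour below
                union(v, v + n)

    largest = max(size[find(parent, v)] for v in range(n * n))
    left = {find(parent, r * n) for r in range(n)}
    right = {find(parent, r * n + n - 1) for r in range(n)}
    return largest, bool(left & right)


if __name__ == "__main__":
    for p in (0.53, 0.47):
        print(p, bond_percolation(n=30, p=p, seed=1))
```

The two values of p in Figure 2 sit on either side of 1/2, the known critical probability for bond percolation on the infinite square lattice. On a finite 30 × 30 grid the transition is blurred, so individual samples near 1/2 can go either way; averaging over many seeds gives a clearer picture.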