Numerical methods for the injectivity of ReLU-layers

Hannah Eckert

Type of thesis

Master's thesis

University

Universität Wien

Faculty

Fakultät für Informatik

Degree programme or university continuing education course (ULG)

Master's programme in Data Science

Supervisor

Peter Balazs

DOI

10.25365/thesis.77512

URN

urn:nbn:at:at-ubw:1-23383.01867.851911-2

Abstracts

Abstract

(German, translated)

Injectivity is a desirable property for neural network layers because it ensures invertibility, which makes the neural network more interpretable by allowing the decision-making process to be traced. ReLU-layers are widely used building blocks of neural networks and use the ReLU function as their activation function. To date, the practical verification of the injectivity of ReLU-layers has not been fully solved. In a recent paper, my supervisors introduced a frame-theoretic condition that ensures the injectivity of ReLU-layers on open or convex sets. The condition is an upper bound on the bias vector such that the associated ReLU-layer is injective. However, computing the proposed upper bound exactly is computationally infeasible in high dimensions. The first goal of this thesis is to provide a probabilistic algorithm that approximates the upper bound. The idea is based on a Monte Carlo sampling approach and on ensuring pointwise injectivity for each drawn sample. As a next step towards improving the algorithm, we perform data-driven sampling instead of uniform sampling. We propose two approaches to implementing this: 1) augmenting the training data by adding Gaussian noise and using the augmented training data as random samples for the Monte Carlo sampling, and 2) using kernel density estimation. The ultimate goal is for the estimated upper bound to have a higher probability of ensuring injectivity for samples that lie close (in Euclidean distance) to samples in the training data.
The second goal of this thesis is to use the introduced algorithm for the approximate numerical verification of the injectivity of ReLU-layers in order to study the injectivity behaviour of a ReLU-layer during the training of a neural network in various scenarios. In the final step, we enforce injectivity throughout the entire training process and then assess the effects on performance and robustness.

Abstract

(English)

Injectivity is a desirable property for neural network layers because it ensures invertibility. Invertibility is advantageous because it opens possibilities for explainability by allowing the decision-making process to be traced back. This capability can be crucial for gaining an understanding of the entire neural network. A common type of neural network layer is the ReLU-layer, in which the activation function is the Rectified Linear Unit (ReLU). The injectivity of ReLU-layers has been investigated theoretically, but the practical verification of this property has not been fully solved so far. A recent paper introduced a frame-theoretic condition that ensures the injectivity of ReLU-layers on an open or convex set. The proposed condition is formulated in terms of an upper bound, such that any ReLU-layer with a bias smaller than the bound is injective. Since the exact calculation of the upper bound is computationally infeasible in high dimensions, the first goal of this work is to provide a probabilistic algorithm that approximates the upper bound. The algorithm is based on a Monte Carlo sampling approach and ensures "point-wise" invertibility for each sample drawn uniformly from the closed ball. As a further step to make the algorithm more efficient, we perform data-driven sampling instead of sampling uniformly. We propose two approaches to implement this: 1) augmenting the training data by adding Gaussian noise and using it as random samples for the Monte Carlo sampling, and 2) using kernel density estimation. The ultimate goal of the data-driven sampling is that the estimated upper bound has a higher probability of ensuring injectivity for samples that lie close (in Euclidean distance) to the samples in the training data.
The second goal of this work is to use the derived algorithm to study the injectivity behavior of a ReLU-layer during the training of a deep neural network in different settings. As a last step, we enforce injectivity throughout the training process and assess the effects on performance and stability, evaluating the costs associated with making ReLU-layers invertible.
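
The Monte Carlo idea described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes a simplified setting with a single scalar bias shared by all neurons, a layer of the form ReLU(Wx - alpha), and a pointwise test that asks whether the rows of W that are active at a sample still span the input space. All function names (`sample_unit_ball`, `augment_with_gaussian_noise`, `pointwise_threshold`, `estimate_bias_bound`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_unit_ball(num_samples, dim, rng):
    """Draw points uniformly from the closed unit ball in R^dim."""
    directions = rng.standard_normal((num_samples, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Scaling radii by u^(1/dim) makes the points uniform in volume.
    radii = rng.uniform(size=(num_samples, 1)) ** (1.0 / dim)
    return directions * radii


def augment_with_gaussian_noise(train_data, copies, sigma, rng):
    """Data-driven alternative (assumed form): noisy copies of training points."""
    repeated = np.repeat(train_data, copies, axis=0)
    return repeated + sigma * rng.standard_normal(repeated.shape)


def pointwise_threshold(weights, x):
    """Supremum of scalar biases alpha such that the rows w_i with
    <w_i, x> > alpha still span R^n (assumed pointwise condition)."""
    n = weights.shape[1]
    products = weights @ x
    order = np.argsort(-products)  # rows sorted by decreasing <w_i, x>
    for k in range(n, len(order) + 1):
        if np.linalg.matrix_rank(weights[order[:k]]) == n:
            return products[order[k - 1]]
    raise ValueError("rows of `weights` do not span the input space")


def estimate_bias_bound(weights, samples):
    """Monte Carlo estimate: the smallest pointwise threshold over all samples."""
    return min(pointwise_threshold(weights, x) for x in samples)


# Toy example: four neurons in R^2 with weight rows +e1, +e2, -e1, -e2.
rng = np.random.default_rng(0)
weights = np.vstack([np.eye(2), -np.eye(2)])
samples = sample_unit_ball(1000, 2, rng)
bound = estimate_bias_bound(weights, samples)
```

In this sketch, any scalar bias strictly smaller than `bound` keeps a spanning set of rows active at every sampled point. The estimate says nothing about points that were not sampled, which is why the choice of sampling distribution matters; passing the output of `augment_with_gaussian_noise` instead of `sample_unit_ball` concentrates the check near the training data.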

Author

Hannah Eckert

Main title (English)

Numerical methods for the injectivity of ReLU-layers

Year of publication

2024

Extent

viii, 50 pages : illustrations

Language

English

Assessor

Peter Balazs

Classification

54 Computer science > 54.72 Artificial intelligence

AC number

AC17408159

Utheses ID

74170

Degree programme code

UA | 066 | 645 | |