On the Expressivity of Random Features in CNNs - TF 2.3 (Community) #9174
Only BatchNorm
This repository is an unofficial implementation of the following [Paper].
Description/Abstract
Batch normalization (BatchNorm) has become an indispensable tool for training
deep neural networks, yet it is still poorly understood. Although previous work
has typically focused on studying its normalization component, BatchNorm also
adds two per-feature trainable parameters—a coefficient and a bias—whose role
and expressive power remain unclear. To study this question, we investigate the
performance achieved when training only these parameters and freezing all others
at their random initializations. We find that doing so leads to surprisingly high
performance. For example, sufficiently deep ResNets reach 82% (CIFAR-10) and
32% (ImageNet, top-5) accuracy in this configuration, far higher than when training
an equivalent number of randomly chosen parameters elsewhere in the network.
BatchNorm achieves this performance in part by naturally learning to disable
around a third of the random features. Not only do these results highlight the
under-appreciated role of the affine parameters in BatchNorm, but—in a broader
sense—they characterize the expressive power of neural networks constructed
simply by shifting and rescaling random features.
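In Keras terms, the scheme above amounts to freezing every layer except BatchNormalization, whose trainable variables are exactly the per-feature affine parameters: BatchNorm computes y = gamma * x_hat + beta, and only gamma and beta are updated. A minimal sketch, assuming a flat (non-nested) tf.keras model; the helper name is ours, not the repository's:

```python
import tensorflow as tf

def freeze_all_but_batchnorm(model):
    """Train only the BatchNorm affine parameters (gamma and beta);
    every other weight stays at its random initialization."""
    for layer in model.layers:
        # BatchNorm's trainable variables are exactly gamma and beta;
        # the moving mean/variance are non-trainable statistics.
        layer.trainable = isinstance(layer, tf.keras.layers.BatchNormalization)

# Usage sketch: after freezing, model.trainable_variables holds only
# the gammas and betas. Compile (or re-compile) after changing the flags.
model = tf.keras.applications.ResNet50(
    weights=None, input_shape=(32, 32, 3), classes=10)
freeze_all_but_batchnorm(model)
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```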
Key Features
- Training driven by `model.fit`
- Models built from `tf.keras.layers`
- Input pipelines using `tf.data` and `tfds`
- Command-line flags and logging via `absl-py` from abseil.io

Requirements
To install requirements:
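Assuming the repository follows the common convention of shipping a requirements.txt (not confirmed here):

```sh
pip install -r requirements.txt
```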
Results
Image Classification (Only BatchNorm weights)
Dataset

CIFAR10 dataset - 10 classes with 50,000 images in the train set and 10,000 images in the test set.
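As a sketch, CIFAR-10 can be loaded with tfds and batched with tf.data; the preprocessing and batch size below are illustrative choices, not necessarily the repository's exact pipeline:

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# CIFAR-10: 50,000 training and 10,000 test images across 10 classes.
(train_ds, test_ds), info = tfds.load(
    "cifar10", split=["train", "test"], as_supervised=True, with_info=True
)

def preprocess(image, label):
    # Scale pixel values to [0, 1].
    return tf.cast(image, tf.float32) / 255.0, label

AUTOTUNE = tf.data.experimental.AUTOTUNE  # tf.data.AUTOTUNE from TF 2.4 onward
train_ds = train_ds.map(preprocess).shuffle(10_000).batch(128).prefetch(AUTOTUNE)
test_ds = test_ds.map(preprocess).batch(128).prefetch(AUTOTUNE)
```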
Training

Please run the following command for training.
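A hypothetical invocation; the entry-point name is an assumption, and only the `num_blocks` flag is documented in this README:

```sh
# N = 2 gives ResNet-14, assuming the standard 6N+2 CIFAR ResNet family.
python train.py --num_blocks 2
```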
This trains the OnlyBN model for the ResNet-14 architecture. Replace `num_blocks` with the appropriate value of N from the results table above to train the corresponding ResNet architecture.

Evaluation
Please run the following command for evaluation.
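Again a hypothetical invocation; the script name and flags are assumptions:

```sh
python evaluate.py --num_blocks 2
```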
References

[Paper] Frankle, J., Schwab, D. J., & Morcos, A. S. (2020). Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs. arXiv preprint arXiv:2003.00152.
Citation
If you want to cite this repository in your research paper, please use the following information.
Authors or Maintainers

License

This project is licensed under the terms of the Apache License 2.0.