Understanding Batch-Normalization in Deep Neural Networks

dc.contributor.author: Srujan, Pendyala Sai
dc.date.accessioned: 2025-07-22T06:21:18Z
dc.date.available: 2025-07-22T06:21:18Z
dc.date.issued: 2025-06
dc.description: Dissertation under the supervision of Dr. Sasanka Roy and Dr. Shubhra Sankar Ray
dc.description.abstract: Batch Normalization (BN) is a commonly used technique in various deep learning architectures for tasks such as image classification and object detection. It stabilizes and accelerates training by normalizing the activations of intermediate layers using the mean and variance of each batch, allowing the use of higher learning rates and often improving generalization through implicit regularization. During inference, BN uses running estimates of batch statistics accumulated during training. However, if individual batches are not representative of the overall data distribution, these accumulated statistics may not accurately approximate the population statistics. This discrepancy can lead to a phenomenon known as **estimation shift**, which impairs the model’s generalization performance. In this project, we study the behavior of estimation shift in deep learning models using BN and explore techniques to mitigate its effects. Specifically, we introduce **dynamicity** in the momentum parameter of the BN layer (DMBN) while computing exponential moving averages and evaluate its impact under various architectural configurations. We use the MNIST, FashionMNIST, and CIFAR-10/100 datasets to train and test both simple Deep Neural Networks (DNNs) and deeper Convolutional Neural Networks (CNNs) such as ResNet-50. Our experiments are conducted in two phases: first, by varying the static momentum parameter across different values, and second, by introducing layer-wise dynamic momentum, where each layer is assigned the momentum (or, equivalently, β) that minimizes estimation shift (an illustrative code sketch of this selection step is given after the metadata fields below). The performance of the proposed method, DMBN, is evaluated using metrics such as sensitivity, specificity, accuracy, and F-score. DMBN is compared with the existing BN-BFN method and is observed to perform better in most cases. For example, on the FashionMNIST data, the accuracy values achieved by DMBN and BN-BFN are 0.889 and 0.853, respectively.
dc.identifier.citation: 41p.
dc.identifier.uri: http://hdl.handle.net/10263/7588
dc.language.iso: en
dc.publisher: Indian Statistical Institute, Kolkata
dc.relation.ispartofseries: MTech(CS) Dissertation;23-35
dc.subject: Batch Normalization (BN)
dc.subject: Convolutional Neural Networks (CNNs)
dc.subject: DMBN
dc.subject: Deep Neural Networks
dc.title: Understanding Batch-Normalization in Deep Neural Networks
dc.type: Other
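
The abstract describes accumulating BN running statistics with an exponential moving average and, in DMBN, assigning each layer the momentum that minimizes estimation shift. The following is a minimal sketch of that selection step, assuming PyTorch; the candidate momentum grid, the synthetic data, and the L1 shift proxy are illustrative assumptions, not the dissertation's exact formulation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# PyTorch updates BN running statistics as
#     running = (1 - momentum) * running + momentum * batch_stat,
# so its `momentum` plays the role of (1 - beta) in the usual EMA notation.

def estimation_shift(bn: nn.BatchNorm1d, holdout: torch.Tensor) -> float:
    """Illustrative proxy for estimation shift: L1 distance between the
    layer's running statistics and those of a representative held-out set."""
    with torch.no_grad():
        shift = (bn.running_mean - holdout.mean(0)).abs().mean() \
              + (bn.running_var - holdout.var(0, unbiased=False)).abs().mean()
    return shift.item()

features = 16
# Synthetic stand-ins for one layer's activations (assumed, for illustration).
train_batches = [torch.randn(32, features) * 2.0 + 1.0 for _ in range(50)]
holdout = torch.randn(1024, features) * 2.0 + 1.0

# For one layer, try a grid of momenta and keep the one with the least shift;
# DMBN applies this kind of selection layer by layer.
best = None
for momentum in (0.01, 0.05, 0.1, 0.3, 0.5):  # assumed candidate grid
    bn = nn.BatchNorm1d(features, momentum=momentum)
    bn.train()
    for batch in train_batches:  # forward passes accumulate running stats
        bn(batch)
    shift = estimation_shift(bn, holdout)
    print(f"momentum={momentum:.2f}  estimation shift={shift:.4f}")
    if best is None or shift < best[1]:
        best = (momentum, shift)

print(f"selected momentum for this layer (minimum shift): {best[0]}")
```

In a full network the same search would run per BN layer, for example by iterating over `model.modules()` and recording each layer's input activations with forward hooks.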

Files

Original bundle

Name: CS2335_Dissertation_Plag.pdf
Size: 853.81 KB
Format: Adobe Portable Document Format
Description: Plagiarism report
Name: CS2335_Dissertation_signed.pdf
Size: 1.99 MB
Format: Adobe Portable Document Format
Description: Dissertations - M Tech (CS)

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission