Understanding Batch-Normalization in Deep Neural Networks

dc.contributor.author: Srujan, Pendyala Sai
dc.date.accessioned: 2025-07-22T06:21:18Z
dc.date.available: 2025-07-22T06:21:18Z
dc.date.issued: 2025-06
dc.description: Dissertation under the supervision of Dr. Sasanka Roy and Dr. Shubhra Sankar Ray
dc.description.abstract: Batch Normalization (BN) is a commonly used technique in various deep learning architectures for tasks such as image classification and object detection. It stabilizes and accelerates training by normalizing the activations of intermediate layers using the mean and variance of each batch, allowing the use of higher learning rates and often improving generalization through implicit regularization. During inference, BN uses running estimates of batch statistics accumulated during training. However, if individual batches are not representative of the overall data distribution, these accumulated statistics may not accurately approximate the population statistics. This discrepancy can lead to a phenomenon known as **estimation shift**, which impairs the model’s generalization performance. In this project, we study the behavior of estimation shift in deep learning models using BN and explore techniques to mitigate its effects. Specifically, we introduce **dynamicity** in the momentum parameter of the BN layer (DMBN) while computing exponential moving averages and evaluate its impact under various architectural configurations. We use the MNIST, FashionMNIST, and CIFAR-10/100 datasets to train and test both simple Deep Neural Networks (DNNs) and deeper Convolutional Neural Networks (CNNs) such as ResNet-50. Our experiments are conducted in two phases: first, by varying the static momentum parameter across different values, and second, by introducing layer-wise dynamic momentum, where each layer is assigned the momentum (or, equivalently, β) that minimizes estimation shift (an illustrative code sketch of this selection step is given after the metadata fields below). The performance of the proposed method, DMBN, is evaluated using metrics such as sensitivity, specificity, accuracy, and F-score. DMBN is compared with the existing BN-BFN method and is observed to perform better in most cases. For example, on the FashionMNIST data, the accuracy values achieved by DMBN and BN-BFN are 0.889 and 0.853, respectively.
dc.identifier.citation: 41p.
dc.identifier.uri: http://hdl.handle.net/10263/7588
dc.language.iso: en
dc.publisher: Indian Statistical Institute, Kolkata
dc.relation.ispartofseries: MTech(CS) Dissertation;23-35
dc.subject: Batch Normalization (BN)
dc.subject: Convolutional Neural Networks (CNNs)
dc.subject: DMBN
dc.subject: Deep Neural Networks
dc.title: Understanding Batch-Normalization in Deep Neural Networks
dc.type: Other
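
The abstract describes accumulating BN running statistics with an exponential moving average and, in DMBN, assigning each layer the momentum that minimizes estimation shift. The following is a minimal sketch of that selection step, assuming PyTorch; the candidate momentum grid, the synthetic data, and the L1 shift proxy are illustrative assumptions, not the dissertation's exact formulation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# PyTorch updates BN running statistics as
#     running = (1 - momentum) * running + momentum * batch_stat,
# so its `momentum` plays the role of (1 - beta) in the usual EMA notation.

def estimation_shift(bn: nn.BatchNorm1d, holdout: torch.Tensor) -> float:
    """Illustrative proxy for estimation shift: L1 distance between the
    layer's running statistics and those of a representative held-out set."""
    with torch.no_grad():
        shift = (bn.running_mean - holdout.mean(0)).abs().mean() \
              + (bn.running_var - holdout.var(0, unbiased=False)).abs().mean()
    return shift.item()

features = 16
# Synthetic stand-ins for one layer's activations (assumed, for illustration).
train_batches = [torch.randn(32, features) * 2.0 + 1.0 for _ in range(50)]
holdout = torch.randn(1024, features) * 2.0 + 1.0

# For one layer, try a grid of momenta and keep the one with the least shift;
# DMBN applies this kind of selection layer by layer.
best = None
for momentum in (0.01, 0.05, 0.1, 0.3, 0.5):  # assumed candidate grid
    bn = nn.BatchNorm1d(features, momentum=momentum)
    bn.train()
    for batch in train_batches:  # forward passes accumulate running stats
        bn(batch)
    shift = estimation_shift(bn, holdout)
    print(f"momentum={momentum:.2f}  estimation shift={shift:.4f}")
    if best is None or shift < best[1]:
        best = (momentum, shift)

print(f"selected momentum for this layer (minimum shift): {best[0]}")
```

In a full network the same search would run per BN layer, for example by iterating over `model.modules()` and recording each layer's input activations with forward hooks.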

Files

Original bundle

Name: CS2335_Dissertation_Plag.pdf
Size: 853.81 KB
Format: Adobe Portable Document Format
Description: Plagiarism report
Name: CS2335_Dissertation_signed.pdf
Size: 1.99 MB
Format: Adobe Portable Document Format
Description: Dissertations - M Tech (CS)

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission