Kunal Banerjee
Staff Data Scientist
Walmart Global Tech

Address: Walmart Labs
Touchstone Building, Outer Ring Road
Kaverappa Layout, Kadabeesanahalli
Bengaluru, Karnataka, India - 560103
Email: kunal [dot] banerjee1 [at] walmart [dot] com



About

I have recently joined Walmart Global Tech (earlier known as Walmart Labs) as a Staff Data Scientist.

Earlier, I was a Research Scientist at the Parallel Computing Lab, Intel Labs, India, where my primary focus was kernel optimization of deep learning workloads on Intel architectures (IA). For example, my implementations of Winograd-based convolution, RNN, LSTM and GRU are available in the open-source libraries LIBXSMM and Intel MKL-DNN. These libraries have been adopted in several software products, including TensorFlow, Caffe, MS CNTK, Apache MXNet, Chainer and OpenVINO, among others, for enhanced performance on IA.
I am also interested in low-precision deep neural networks. Specifically, together with my colleagues at Intel Labs, I have developed and implemented Ternary Residual Networks, which use 8 bits for activations and 2 bits for weights (with residual edges, if required). I have also helped showcase the efficacy of the BFLOAT16 datatype on IA.
These works have been accepted at venues such as SuperComputing, IPDPS, ICLR and CLUSTER, and have been recognized with awards such as the ISC Best Research Poster Award (AI & ML track), Intel's Gordy Award (Intel Labs' highest award) and a Divisional Recognition Award.
I have also contributed to Intel's accelerator for deep learning training as part of the Intel Artificial Intelligence Products Group.

Prior to joining Intel, I received my PhD from the Department of Computer Science and Engineering, IIT Kharagpur. My research areas encompassed program analysis, formal methods and verification. I received a Senior Research Fellowship from the Department of Science and Technology, India, and a TCS Research Fellowship from Tata Consultancy Services in support of my doctoral studies. My dissertation work won the Best PhD Thesis Award at VLSI Design, the Best PhD Forum Paper Award at ISVLSI and the Techno Inventor Award (PhD) from the India Electronics & Semiconductor Association (IESA).

Research Interests


Education


Professional Experience


Awards/Achievements


Selected Publications        

Journals

  1. Optimizing Deep Learning RNN Topologies on Intel Architecture.
    Kunal Banerjee, Evangelos Georganas, Dhiraj D. Kalamkar, Barukh Ziv, Eden Segal, Cristina Anderson, Alexander Heinecke.
    Supercomputing Frontiers and Innovations, vol. 6, no. 3, 2019, pp: 64-85, invited paper.
  2. A Counter-Example Generation Procedure for Path based Equivalence Checkers.
    Ramanuj Chouksey, Chandan Karfa, Kunal Banerjee, Pankaj Kalita, Purandar Bhaduri.
    IET Software, vol. 13, no. 4, 2019, pp: 280-285.
  3. Deriving Bisimulation Relations from Path Extension Based Equivalence Checkers.
    Kunal Banerjee, Dipankar Sarkar, Chittaranjan Mandal.
    IEEE Transactions on Software Engineering (TSE), vol. 43, no. 10, 2017, pp: 946-953.
  4. Deriving bisimulation relations from path based equivalence checkers.
    Kunal Banerjee, Dipankar Sarkar, Chittaranjan Mandal.
    Formal Aspects of Computing (FAOC), vol. 29, no. 2, 2017, pp: 365-379.
  5. A Path Construction Algorithm for Translation Validation using PRES+ Models.
    Soumyadip Bandyopadhyay, Dipankar Sarkar, Chittaranjan Mandal, Kunal Banerjee, Krishnam Raju Duddu.
    Parallel Processing Letters (PPL), vol. 26, no. 2, 2016, pp: 1-25.
  6. Extending the FSMD Framework for Validating Code Motions of Array-Handling Programs.
    Kunal Banerjee, Dipankar Sarkar, Chittaranjan Mandal.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 33, no. 12, 2014, pp: 2015-2019.
  7. Verification of Code Motion Techniques using Value Propagation.
    Kunal Banerjee, Chandan Karfa, Dipankar Sarkar, Chittaranjan Mandal.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 33, no. 8, 2014, pp: 1180-1193.
  8. Verification of Loop and Arithmetic Transformations of Array-Intensive Behaviours.
    Chandan Karfa, Kunal Banerjee, Dipankar Sarkar, Chittaranjan Mandal.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 32, no. 11, 2013, pp: 1787-1800.

Conferences / Workshops

  1. From Pixels To Words: A Scalable Journey Of Text Information From Product Images To Retail Catalog.
    Pranay Dugar, Rajesh Shreedhar Bhat, Asit Sharad Tarsode, Uddipto Dutta, Kunal Banerjee, Anirban Chatterjee, Vijay Srinivas Agneeswaran.
    International Conference on Information and Knowledge Management (CIKM), November 2021 (accepted).
  2. Exploring Alternatives to Softmax Function.
    Kunal Banerjee, Vishak Prasad C, Rishi Raj Gupta, Karthik Vyas, Anushree H, Biswajit Mishra.
    Deep Learning Theory and Applications (DeLTA), July 2021, pp: 81-86. [arXiv]
    Nominated for "Best Poster Award"
  3. Designing a Bot for Efficient Distribution of Service Requests.
    Arkadip Basu, Kunal Banerjee.
    Bots in Software Engineering (BotSE), June 2021, pp: 16-20. [arXiv]
  4. Harnessing Deep Learning via a Single Building Block.
    Evangelos Georganas, Kunal Banerjee, Dhiraj Kalamkar, Sasikanth Avancha, Anand Venkat, Michael Anderson, Greg Henry, Hans Pabst, Alexander Heinecke.
    International Parallel & Distributed Processing Symposium (IPDPS), May 2020, pp: 222-233. [arXiv]
    (Preliminary version accepted as research poster in SuperComputing 2019.)
  5. Reliability Evaluation of Compressed Deep Learning Models.
    Brunno F. Goldstein, Sudarshan Srinivasan, Dipankar Das, Kunal Banerjee, Leandro Santiago, Victor C. Ferreira, Alexandre S. Nery, Sandip Kundu, Felipe M. G. Franca.
    Latin American Symposium on Circuits and Systems (LASCAS), February 2020, pp: 1-5.
  6. Training Google Neural Machine Translation on an Intel CPU Cluster.
    Dhiraj Kalamkar, Kunal Banerjee, Sudarshan Srinivasan, Srinivas Sridharan, Evangelos Georganas, Mikhail E. Smorkalov, Cong Xu, Alexander Heinecke.
    International Conference on Cluster Computing (CLUSTER), September 2019, pp: 1-10.
  7. Anatomy Of High-Performance Deep Learning Convolutions On SIMD Architectures.
    Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, Alexander Heinecke.
    International Conference for High Performance Computing, Networking, Storage, and Analysis (SuperComputing), November 2018, pp: 66:1-66:12. [arXiv]
  8. Poster: Automatic Detection of Inverse Operations while Avoiding Loop Unrolling.
    Kunal Banerjee, Ramanuj Chouksey, Chandan Karfa, Pankaj Kumar Kalita.
    International Conference on Software Engineering (ICSE), May 2018, pp: 175-176.
  9. Mixed Precision Training of Convolutional Neural Networks using Integer Operations.
    Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesus Corbal, Nikita Shustrov, Roma Dubtsov, Evarist Fomenko, Vadim Pirogov.
    International Conference on Learning Representations (ICLR), April 2018, pp: 1-11. [arXiv]
  10. An Equivalence Checking Framework for Array-Intensive Programs.
    Kunal Banerjee, Chittaranjan Mandal, Dipankar Sarkar.
    Automated Technology for Verification and Analysis (ATVA), October 2017, pp: 84-90.
  11. An End-to-end Formal Verifier for Parallel Programs.
    Soumyadip Bandyopadhyay, Santonu Sarkar, Kunal Banerjee.
    International Conference on Software Technologies (ICSOFT), July 2017, pp: 388-393.
  12. Translation Validation of Loop and Arithmetic Transformations in the Presence of Recurrences.
    Kunal Banerjee, Chittaranjan Mandal, Dipankar Sarkar.
    Languages, Compilers, Tools, and Theory for Embedded Systems (LCTES), June 2016, pp: 31-40.
  13. Data-Race Detection: The Missing Piece for an End-to-End Semantic Equivalence Checker for Parallelizing Transformations of Array-Intensive Programs.
    Kunal Banerjee, Soumyadip Banerjee, Santonu Sarkar.
    International Workshop on Libraries, Languages, and Compilers for Array Programming (ARRAY@PLDI), June 2016, pp: 1-8.
  14. A Translation Validation Framework for Symbolic Value Propagation Based Equivalence Checking of FSMDAs.
    Kunal Banerjee, Chittaranjan Mandal, Dipankar Sarkar.
    Source Code Analysis and Manipulation (SCAM), September 2015, pp: 247-252.
  15. A Path-Based Equivalence Checking Method for Petri net based Models of Programs.
    Soumyadip Bandyopadhyay, Dipankar Sarkar, Kunal Banerjee, Chittaranjan Mandal.
    International Conference on Software Engineering and Applications (ICSOFT-EA), July 2015, pp: 319-329.
  16. Translation Validation of Transformations of Embedded System Specifications using Equivalence Checking.
    Kunal Banerjee, Chittaranjan Mandal, Dipankar Sarkar.
    IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2015, pp: 183-186.
    Received "Best PhD Forum Paper Award"
  17. Circuits and Synthesis Mechanism for Hardware Design to Counter Power Analysis Attacks.
    Partha De, Kunal Banerjee, Chittaranjan Mandal, Debdeep Mukhopadhyay.
    Euromicro Conference on Digital System Design (DSD), August 2014, pp: 520-527.
  18. Extending the Scope of Translation Validation by Augmenting Path Based Equivalence Checkers with SMT Solvers.
    Kunal Banerjee, Chittaranjan Mandal, Dipankar Sarkar.
    International Symposium on VLSI Design and Test (VDAT), July 2014, pp: 1-6.
  19. Experimentation with SMT Solvers and Theorem Provers for Verification of Loop and Arithmetic Transformations.
    Chandan Karfa, Kunal Banerjee, Dipankar Sarkar, Chittaranjan Mandal.
    IBM Collaborative Academia Research Exchange (I-CARE), October 2013, pp: 3:1-3:4.
    Received "Best Paper Award"
  20. Designing DPA Resistant Circuits Using BDD Architecture and Bottom Pre-charge Logic.
    Partha De, Kunal Banerjee, Chittaranjan Mandal, Debdeep Mukhopadhyay.
    Euromicro Conference on Digital System Design (DSD), September 2013, pp: 641-644.
  21. A Value Propagation Based Equivalence Checking Method for Verification of Code Motion Techniques.
    Kunal Banerjee, Chandan Karfa, Dipankar Sarkar, Chittaranjan Mandal.
    International Symposium on Electronic System Design (ISED), December 2012, pp: 67-71.
  22. Equivalence Checking of Array-Intensive Programs.
    Chandan Karfa, Kunal Banerjee, Dipankar Sarkar, Chittaranjan Mandal.
    IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2011, pp: 156-161.

Others

  1. K-TanH: Hardware Efficient Activations For Deep Learning.
    Abhisek Kundu, Alexander Heinecke, Dhiraj Kalamkar, Sudarshan Srinivasan, Eric C. Qin, Naveen K. Mellempudi, Dipankar Das, Kunal Banerjee, Bharat Kaul, Pradeep Dubey.
    Preprint on arXiv, September 2019, arXiv:1909.07729.
  2. A Study of BFLOAT16 for Deep Learning Training.
    Dhiraj Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, Jiyan Yang, Jongsoo Park, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Abhisek Kundu, Misha Smelyanskiy, Bharat Kaul, Pradeep Dubey.
    Preprint on arXiv, May 2019, arXiv:1905.12322.
  3. A Quick Introduction to Functional Verification of Array-Intensive Programs.
    Kunal Banerjee, Chandan Karfa.
    Preprint on arXiv, May 2019, arXiv:1905.09137.
  4. Optimizing Deep Learning LSTM Topologies on Intel Xeon Architecture.
    Kunal Banerjee, Evangelos Georganas, Dhiraj Kalamkar, Alexander Heinecke.
    ISC High Performance, June 2019, Research Poster.
    Received "Best Research Poster Award" in "Artificial Intelligence and Machine Learning" track
  5. Understanding the Performance of Small Convolution Operations for CNN on Intel Architecture.
    Alexander Heinecke, Evangelos Georganas, Kunal Banerjee, Dhiraj Kalamkar, Narayanan Sundaram, Anand Venkat, Greg Henry, Hans Pabst.
    International Conference for High Performance Computing, Networking, Storage and Analysis (SuperComputing), November 2017, Research Poster.
  6. Ternary Residual Networks.
    Abhisek Kundu, Kunal Banerjee, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Bharat Kaul, Pradeep Dubey.
    Preprint on arXiv, July 2017, arXiv:1707.04679.
    (Accepted as extended abstract in SysML 2018. Presented at Intel AI DevCon 2018.)
  7. An Equivalence Checking Mechanism for Handling Recurrences in Array-Intensive Programs.
    Kunal Banerjee.
    Principles of Programming Languages (POPL): Student Research Competition, January 2015, pp: 1-2.

Activities


Last updated: Oct 23, 2021.