Indian Institute of Information Technology, Allahabad
Computer Vision and Biometrics Lab (CVBL)
Visual Recognition
Odd Semester 2021 - 2022
Course Information
Objective of the course: Visual recognition has become part of everyday life, with applications in self-driving cars, satellite monitoring, surveillance, and video analytics, particularly scene understanding, crowd behaviour analysis, and action recognition. It has made life easier by acquiring, processing, analyzing, and understanding digital images and by extracting high-dimensional data from the real world to produce numerical or symbolic information. Visual recognition encompasses image classification, localization, and detection. This course will help students understand the new tools, techniques, and methods that are shaping the field.
Outcome of the course: At the end of this course, students will be able to apply these concepts to solve real problems in recognition. They will be able to use computational visual recognition for problems ranging from extracting features and classifying images to detecting and outlining objects and activities in an image or video, using machine learning and deep learning concepts. They will also be able to develop new methods in visual recognition for various applications.
- Class meets
- Monday: 04.00 PM - 06.00 PM; Friday: 10.00 AM - 12.00 PM and 04.00 PM - 06.00 PM; Remote
Schedule - Lectures
Date | Topic | Optional Reading |
L01: July 30: 04.00 PM - 05.00 PM | Introduction Lecture (Slide, Recorded Lecture) | |
L02: July 30: 05.00 PM - 06.00 PM | Local Features: What, Why and How (Slide, Recorded Lecture) | |
L03: August 06: 10.00 AM - 11.00 AM | Corner Detection (Slide, Recorded Lecture) | |
L04: August 06: 11.00 AM - 12.00 PM | Harris Detector and Invariance Property (Slide, Recorded Lecture) | |
L05: August 09: 04.00 PM - 05.00 PM | Blob Detection: Harris-Laplacian (LoG), SIFT (DoG), Affine Invariant Detection (Slide, Recorded Lecture) | |
L06: August 09: 05.00 PM - 06.00 PM | Feature Description: SIFT and SURF (Slide, Recorded Lecture) | |
L07: August 13: 10.00 AM - 11.00 AM | Feature Description: LBP and HOG (Slide, Recorded Lecture) | |
L08: August 27: 10.00 AM - 11.00 AM | Image Categorization and Bag of Visual Words (Slide, Recorded Lecture) | |
L09-11: August 27: 11.00 AM - 12.00 PM & 04.00 PM - 06.00 PM | Classifiers for Image Categorization: KNN, Linear Classifier, SVM, Softmax (Slide, Recorded Lecture 1, Recorded Lecture 2) | |
L12-13: August 30: 04.00 PM - 06.00 PM | Neural Networks (Slide, Recorded Lecture) | |
L14-15: September 03: 10.00 AM - 12.00 PM | Convolutional Neural Networks (CNNs) (Slide, Recorded Lecture) | |
L16-17: September 06: 04.00 PM - 06.00 PM | Training Aspects of CNN: Activation Functions, Data Split, Data Preprocessing and Weight Initialization (Slide, Recorded Lecture) | |
L18-19: September 10: 04.00 PM - 06.00 PM | Training Aspects of CNN: Optimization, Learning Rate, Regularization, Dropout, Batch Normalization, Data Augmentation and Transfer Learning (Slide, Recorded Lecture) | |
L20-21: September 24: 04.00 PM - 06.00 PM | CNN Architectures - Plain Models: LeNet, AlexNet, VGG, NiN (Slide, Recorded Lecture 1, Recorded Lecture 2) | |
L22-23: October 01: 04.00 PM - 06.00 PM | CNN Architectures - DAG Models: GoogLeNet, ResNet, DenseNet, etc. (Slide, Recorded Lecture 1, Recorded Lecture 2) | |
L24-25: October 08: 10.00 AM - 12.00 PM | CNN Architectures for Object Detection - R-CNN, Fast R-CNN, Faster R-CNN, YOLO, etc. (Slide, Recorded Lecture) | |
L26: October 23: 10.00 AM - 11.00 AM | Special Lecture on Person Recognition: A Biometric Approach by Dr. Satish Kumar Singh (Lecture Slide) | |
L27: October 23: 11.00 AM - 12.00 PM | Special Lecture on Multimodal Biometrics: A Reliable Way by Dr. Satish Kumar Singh (Lecture Slide) | |
L28: October 23: 03.00 PM - 04.00 PM | Special Lecture on DL Architectures for Recognition by Dr. Satish Kumar Singh (Lecture Slide, Recorded Video) | |
L29: October 24: 10.00 AM - 11.00 AM | Special Lecture on Hand Shape Coding Multimodal Biometric by Dr. Satish Kumar Singh (Lecture Slide, Recorded Video) | |
L30: October 24: 10.00 AM - 11.00 AM | Special Lecture on Face Recognition under Surveillance by Dr. Satish Kumar Singh (Lecture Slide) | |
L31: October 26: 06.00 PM - 07.00 PM | Special Lecture on Biometric Security by Prof. Pritee Khanna (IIITDM Jabalpur) (Recorded Video) | |
L32: October 26: 07.00 PM - 08.00 PM | Special Lecture on DeepFakes by Dr. Kiran Raja (NTNU Norway) (Recorded Video) | |
L33: October 26: 08.00 PM - 09.00 PM | Special Lecture on Face Anti-spoofing by Dr. Shiv Ram Dubey (Lecture Slide, Recorded Video) | |
L34: October 27: 08.00 PM - 09.00 PM | Special Lecture on Facial Micro-expression Recognition by Dr. Shiv Ram Dubey (Lecture Slide, Recorded Video) | |
Schedule - Tutorials and Labs
Date | Topic | Optional Reading |
TL01-02: July 30: 10.00 AM - 12.00 PM | Introduction to Python (Recorded Video) | |
TL03-04: August 02: 04.00 PM - 06.00 PM | Introduction to Python (Recorded Video) | |
TL05-06: August 07: 10.00 AM - 12.00 PM | Introduction to Python (Recorded Video) | |
TL07: August 13: 11.00 AM - 12.00 PM | Project Discussions | |
TL08-09: August 13: 04.00 PM - 06.00 PM | Project Discussions | |
TL10-11: September 03: 04.00 PM - 06.00 PM | Project Work | |
TL12-13: September 10: 10.00 AM - 12.00 PM | CRP Assessment 1 | |
TL14-15: October 04: 04.00 PM - 06.00 PM | Project Discussions | |
TL16-17: October 08: 04.00 PM - 06.00 PM | Project Discussions | |
TL18-19: October 18: 04.00 PM - 06.00 PM | CRP Assessment 2 | |
Grading
- C1 (30%): 10% Written + 20% Practice
- C2 (30%): 10% Written + 20% Practice
- C3 (40%): 20% Written + 20% Practice
Prerequisites
- Computer Programming
- Data Structures and Algorithms
- Machine Learning
- Image and Video Processing
- Ability to deal with abstract mathematical concepts
Books
- Computer Vision: Algorithms and Applications, Richard Szeliski, Springer
- Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press
Related Classes / Online Resources
Disclaimer
The content (text, images, and graphics) used in these slides is adopted from many sources for academic purposes. The sources have broadly been given due credit. However, some original primary sources may have been missed. The authors of this material do not claim any copyright over such material.