Computer Vision

CSE P576 // Autumn 2021

SqueezeDet applied to a frame from the KITTI dataset

Meeting Information

Lectures: Wednesdays 6:30-9:20pm

Location and Attendance Options:

Attending via livestream or watching the posted lecture recordings [link] is possible, but these are the less-preferred options.

Office Hours: see Canvas

Instructors

Vitaly Ablavsky

TAs: Kalyani Marathe, Svetoslav Kolev

Course Description

A master's-level course in computer vision, emphasizing the fundamentals of geometry and image formation as well as deep learning and image understanding.

Grading, Late Policy, and Collaboration Policy: see Canvas

Projects

  1. Feature Extraction and Matching: Build an image feature matcher, starting with simple convolution operations.
  2. Panoramic Stitching: Implement a panorama stitcher using features, RANSAC and rotation estimation.
  3. Image Classification: Classify CIFAR10 images using a CNN and analyze the model's performance.
  4. Pose Estimation: Estimate the 3D pose of a rigid object within a single image. [assignment PDF and starter code]

Books

Resources

Course Overview

| Date | Lecture | Description | Notes and Resources |
|---|---|---|---|
| 9/29 | Introduction | | [CVA2] Ch.1 |
| | Image Formation | Geometric and Photometric Image Formation, Pinhole Camera, Lenses, Sensors, Colour, Gamma, DCT, Image Coding | [CVA2] Ch.2 |
| 10/6 | Filtering and Pyramids | Linear + Non-Linear Filtering, Correlation, Convolution, Gaussian + Laplacian Pyramids, Sampling and Aliasing | [CVA2] Ch. 3.2, 3.5 |
| | Features and Matching | Detection, Correspondence, Edges, Corners, Regions, Patch Matching, SIFT, Shape Context, Learning Features | [CVA2] Ch. 7; Project 1 start |
| 10/13 | Planar Geometry | 2D Transforms: Euclidean, Similarity, Affine, Projective, Camera Models: Perspective, Projective, Linear, Viewing planes, Lines and Camera Rotation | [CVA2] Ch. 3.6 |
| | RANSAC | Least Squares 2-view Alignment, Outliers, Robust Line Fitting, RANSAC, Minimal Subsets | [CVA2] Ch. 8.1, 8.2 |
| 10/20 | Epipolar Geometry | Epipolar Lines, Plane Constraint, Fundamental/Essential Matrix, 8 point algorithm, Triangulation, 2-view SFM | Project 2 start; [CVA2] Ch. 11.3 |
| | Multiview Alignment and SFM | Multiview Alignment, Residuals, Error Function, Structure from Motion, Bundle Adjustment, Pose Estimation, Triangulation | [CVA2] Ch. 8.3, 8.4, 11.4; [Panorama stitching by Brown & Lowe]; [ORB-SLAM by Mur-Artal et al.] |
| 10/25 | Project 1 due | | |
| 10/27 | Stereo | Stereo matching, local + global, multiview stereo, plane sweep, volumetric, depth map merging, photometric stereo | [CVA2] Ch. 12 |
| | Depth + Flow | Depth imaging + fusion, signed distance functions, non-rigid matching, optical flow, Lucas Kanade algorithm | [CVA2] Ch. 13.[1,2,3,5], Ch. 9.1; PlaneSweep ipynb, LucasKanade ipynb. Notebooks by Steven Lovegrove, Richard Newcombe |
| 11/3 | Linear Classification | Visual classification intro, object recognition, instance, category, classification vs detection, linear classification, 2-class, N-class, linear and softmax regression | [CVA2] Ch. 6.1, 6.2; [ESL] Ch. 2.3; Project 3 start |
| | Visual Classification 2 | Fundamentals and Pre-Deep Learning Classification, Bayesian classifiers, Gaussian distributions, PCA, LDA, Decision Forests, Visual words, SVMs | [DL] Ch. 5 |
| 11/8 | Project 2 due | | |
| 11/10 | Neural Networks | Feature extraction, end to end learning, multiple linear layers, activation functions, biological neurons, space warping, universal approximation, convex optimization | [CVA2] Ch. 5.3, 5.4.0, 5.4.1; [DL] Ch. 6; [Slides for Week 7 by Justin Johnson] |
| | Backpropagation | Chain rule, computational gradients, forward/reverse mode autodiff, upstream/local gradients, flat backprop, modular design, scalar/vector/tensor backprop, matrix multiplication example | |
| | Convolutional Networks | Convolutional layers, activation maps, dimension mappings, receptive fields, strides, pooling, LeNet5 example | |
| 11/17 | Advanced CNNs | CNN building blocks, dropout, batch norm, factorized convolutions, residual connections, popular architectures: AlexNet, VGG, GoogLeNet, Resnet, MobileNet, SE-Net | Project 4 start: assignment PDF and starter code |
| | Object Detection | Motivation + applications, sliding windows, anchor based detection, single-stage and two-stage architectures, evaluation metrics, IoU, precision-recall, mAP, practical tips | [CVA2] Ch. 6.3; [Slides for Week 8 by Jonathan Huang] |
| 11/22 | Project 3 due | | |
| 11/24 | NO CLASS | | |
| 12/1 | Tracking, Part 1 | Motivation, probabilistic formulation, linear dynamical systems, multiple-hypothesis tracking (MHT), Bayesian filtering with CONDENSATION and RJ MCMC | [Course on Tracking at Linköping University]; [Course on SLAM and Tracking at University of Freiburg] |
| | Tracking, Part 2 | Case studies: tracking as online learning, correlation filters, tracking with Siamese networks, graph-theoretic formulations | |
| 12/8 | Vision and Language | "Visual Tracking and Retrieval by Natural Language Descriptions". Guest lecture by Qi (Fred) Feng, Boston University | [CVA2] Ch. 6.6; [Vision & Language]; [Real-time Tracking with NL]; [Siamese Natural Language Tracker] |
| | Deep Learning in 3D | Single-view, 2-view, multi-view depth, deep learning with points, meshes, voxels, SDFs, neural scene representation and rendering | |
| | Project 4 due | | [Buried in Syllabus, Prize Remains Unfound] |