Large-Scale Optimization

Veranstaltung	Wochentag	Termin	Ort
042531 (Übung: 042532)	Montag (Übung: Mittwoch)	10.15 - 12.00	Otto-Hahn-Str. 12 - R 1.055 (Mon), R 1.056 (Wed) Campus Nord

Lecture Information

Module description: pdf
Language: English
Lecturer: Dr. Sangkyun Lee
Attendance to practices (Übung) is REQUIRED.
Office Hour (lecturer): Thursday 13:00-14:00, LS8, OH 12, Raum 4.023 (or by appointments)

Content:

This course focuses on optimization techniques to find solutions of large-scale problems that typically appear in statistical learning / data analysis tasks with big data. Topics would include widely adopted methods in modern research literature such as projected gradient methods, accelerated first order algorithms, conjugate gradient methods, quasi-Newton methods, block coordinate descent, proximal point methods, stochastic sub-gradient algorithms, alternating direction method of multipliers, and semi-definite programming. Efficiency of these methods in terms of convergence rate and overall time complexity will be discussed, so that one can see how to choose a suitable method for his/her own research problems. Separable (approximate) reformulations of optimization problems will also be discussed that are suitable for parallel computation compared to their original formulation. Discussions in the course will be problem- oriented, so that methods will be presented in the context of specific problem instances whenever possible. Homework assignments will be given to provide background knowledge for the course or to check understanding of techniques.

Aims:

The aim of this course is to provide students with understanding of modern optimization techniques suitable for large-scale/big-data problems, so that students see how they can choose, apply, and/or modify appropriate methods for their own research problems.

Organization:

Lectures: we focus on methods to understand the structure of their target problem, algorithms, and convergence properties in theory.
Practices: we will see real-world examples where optimization methods (discussed in lectures) are used.

Lectures:

This lecture will be based on recent publications that will be assigned as reading homeworks. No textbook is required. Another lecture of mine will be helpful if you need background knowledge:

Numerical Optimization SS14

Some of my lecture materials are from:

EE236C (UCLA) by L. Vendenberghe, link
10-725 (CMU) by G. Gordon R. Tibshirani, link
EE364b (Stanford) by S. Boyd, link

The following is a tentative schedule, and may subject to change.

Lecture 1 (06.10.14): Introduction to large-scale optimization
Lecture 2 (08.10): Background for convex optimization
Lecture 3 (13.10): Stochastic subgradient descent (1: gradient descent)
Lecture 4 (15.10): Stochastic subgradient descent (2: subgradient method)
Lecture 5 (20.10): Stochastic subgradient descent (3: stochastic subgradient method)
Lecture 6 (22.10): Stochastic subgradient descent (4: stochastic subgradient method)

> Reading: Nemirovski et al., Robust Stochastic Approximation Approach to Stochastic Programming, SIOPT 2009 Sections 1~2.2, 4~5 (inclusive)
Lecture 7-8 ({27,29}.10): Stochastic subgradient descent for SVMs

> Reading: Shalev-Shwartz, Singer, and Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM, ICML 2007
Lecture 9 (3.11): Generalized gradient descent
Lecture 10 (5.11): SpaRSA, Compressed Sensing
Lecture 11 (10.11): Random projection, projected gradient descent, matrix completion

> Reading: SparRSA: sparse Reconstruction by Separable Approximation

> Reading (optional): Astronomical image representation by the curvelet transform (Section 5.3 5.4)

> Reading (optional): ISTA part (Sect 3) of the FISTA paper below

> Reading (optional): High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality, D. Donoho, 2000
Lecture 12 (12.11): Accelerated first-order methods

> Reading: FISTA: A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
Lecture 13 (17.11): Matrix factorizations
Lecture 14 (19.11): Optimization systems: CVXPY, examples
Lecture 15 (24.11): Estimation of extreme singular values
Lecture 16 (26.11): Block coordinate descent (BCGD)
Lecture 17 (1.12): Block coordinate descent (Accelerated, Alternating optimization)

> Reading: On the Convergence of Block Coordinate Descent Type Methods, SIOPT, 2013

Lecture 18 (3.12): Alternating Direction Method of Multipliers (ADMM) 1 (Dual ascent, augmented Lagrangian, ADMM)
Lecture 19 (8.12): Alternating Direction Method of Multipliers (ADMM) 2 (Consensus and sharing forms)
Lecture 20 (10.12): Alternating Direction Method of Multipliers (ADMM) 3 (Learning with distributed examples or features)

> Reading: Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, FnT ML, 2010 Chapters: 2 and 3 (ADMM), 7 (distributed optimization), and 8 (ML examples: optional)

-- after Christmas --
Lecture 21 (7.1): Linear Program 1: Simplex Algorithm
Lecture 22 (12.1): Linear Program 2: Interior-Point Method
pdf (14.1) Presentation of mini-project ideas
Lecture 23 (19.1): Duality 1: LP and Lagrangian duality
Lecture 24 (21.1): Duality 2: Duality gap and strong duality
Lecture 25 (26.1): Duality 3: Conjugation and dual problems
Lecture 26 (28.1): Duality 4: Dual of Lasso and Fused Lasso
Lecture 27 (2.2): Conic Program 1
Lecture 28 (4.2): Conic Program 2
Intro to Julia

Extra material:

Optimality conditions for nonsmooth convex optimization [link]
Google[x], Project Wing [youtube]
Class projets from Stanford EE364b 2014 [link]
L1-L2 Optimization in Signal and Image Processing (This provides a nice overview of some applications/methods we discuss. PDF is accessible within TU Dortmund)
CVXPY: Python-based modeling language for convex optimization problems link

Final Exam: Mini-Project

> Proposal (due by 05.02): a successful proposal is required for the qualification of the final exam. Check the requirements discussed on 14.01 pdf
> Presentation (19-20.03): oral presentation of a mini-project (20min: 1-person group or 30min: 2-people group)
> Final report (due by 25.03): up to 7 pages (extra pages allowed only for large tables and plots)
Final grade will be based on the presentation and the final report.
Latex template for proposal and the final report: link

Topics:

There is no restriction on the choice of topics, but it should be motivating for the student and fit the time limit of the lecture. You can focus on either method or application, but not restricted to.

Method focus: e.g. modification of a method for better speed for a certain problem. Need to implement the method. Adaptation from an existing code is allowed. Benchmarking several algorithms for one problem setting is possible.
Application focus: e.g. apply an optimization method for your research problem. Don't need to implement the method by yourself, if you can find an off-the-shelf software.

Datasets:

These data sets can be useful for your project.

Collections of "standard" data sets:

> UCI Machine Learning Repository: a common source of data sets link

> LIBSVM datasets: link
Computer vision and image datasets:

> A collection at UT link

> Face recognition databases link
Movie rating data sets:

> NF dataset (right clink -> download link as a file. Password required to unzip. About 500 MB): link

> MovieLens dataset (smaller than NF) link
Bioinformatics:

> Gene expression omnibus (GEO): link

> The cancer genome atlas (TCGA): link

Grading Criteria:

Motivation: why is this an interesting problem for you and/or for the audience?
Understanding: the method(s) of choice should be explained in a compact but precise manner
Creativity: any good reason behind: the choice of a method / methods, the choice of a problem / problems, etc.
Evaluation: how well a hypothesis (e.g. an algorithm will get faster by a modification) is validated by theoretic analysis or experiments?
Presentation: how good things are organized and delivered in the presentation?
Repeatability: does it provide a script and executables to generate (a part of) experiments and results?

Lecture Schedule:

The schedule may change, so please check often.

Hauptnavigation

General

Research

Teaching

Staff