|
![]() |
|
![]() |
|
![]() |
|
.
.
.| Date | Content | Exercises | Lecture notes | ||
| 19.2.2008 | Basic geometry - definitions and notations | PDF, PS | Chapter 1 | ||
| 26.2.2008 | Semidefinite programming and approximations | PDF, PS, solution | Chapter 2 | ||
| 4.3.2008 | Measure concentratinon and introduction to the low distortion embeddings | PDF, PS | Chapter 3 Chapter 4 (1) |
||
| 11.3.2008 | Low distortion embeddings II | PDF, PS | Chapter 4 (2) | ||
| 18.3.2008 | Low distortion embeddings III | PDF, PS | Chapter 4 (3) | ||
| 1.4.2008 | Smallest enclosing balls | PDF, PS | Chapter 5 | ||
| 8.4.2008 | Cuboids | PDF, PS | Chapter 6 | ||
| 15.4.2008 | Directional width | PDF, PS | Chapter 7 | ||
| 22.4.2008 | Directional width | PDF, PS | |||
| 29.4.2008 | Support vector machines - introduction | PDF, PS | Chapter 8 | ||
| 6.5.2008 | Support vector machines - kernel trick | PDF, PS | |||
| 13.5.2008 | VC dimension | PDF, PS | Chapter 9 | ||
The course is concerned with approximate geometric methods for the analysis of large data sets represented by point clouds.
Data is being collected in order to draw conclusions from it, i.e. to discover relations, make extrapolations into the future, etc. More often than not, data comes as a set or sequence of points describing objects, with each coordinate representing some quantification of some feature. On a computer such data is just a sequence of 0's and 1's; the need to analyze and "understand" it calls for means to support the process. One way is to visualize the data. For example, a data set representing a number of people by their respective heights and weights can be drawn as a point set in the plane, and this drawing may reveal some correlation that could be approximated by a linear function.
For a wide range of applications (brain research, robotics, statistics, bioinformatics, character and speech recognition, computer graphics etc.) this approach is too simplistic, for various reasons. First of all, the size of the data may be huge (in the millions and billions, and sometimes so huge that we cannot even store it). And secondly, objects may have many features, giving rise to sets of dimension in the hundreds - and we know that simple visualization methods tend to fail starting in dimension 4. (Many features may in fact be redundant, but it is part of the endeavor to find out which ones.)
Many of the arising problems appear to be too hard to be solved exactly in an efficient fashion. The course discusses several approximate methods for the analysis of large high-dimensional data sets that have been developed over the last years in response to the issues indicated above. While we have applications in mind for the questions we address, we emphasize theoretical aspects in the solutions.
Methods we cover are random sampling, grid structures, core-sets, well separated pair decomposition, low distortion low-dimensional embeddings. Applications we address are shape and dimension analysis, nearest neighbor search, clustering etc.
Examples for specific questions arising in these applications are the following: for some point in d-space, what is its closest neighbor in the point cloud? What is the closest pair in the point cloud? What is the "best" grouping of the points in the cloud into k groups? Which subset of the point set of size k provides the "best" description of the point cloud? What is the dimensionality of the point cloud and what does dimensionality mean here? Can the point cloud be embedded into a lower-dimensional space while preserving many of its characteristics?
In every lecture we provide you with an exercise sheet. You should solve it in written form and return the solutions at the beginning of the subsequent lecture. Solving the exercises in teams is not allowed.Your solutions will be graded. If you reach at least 80% of all possible points you will get the grade 6.0, 40% of all possible points are necessary to pass (with the grade 4.0).
There will be an oral exam of 15 minutes at the end of the course. Your final grade consists to 50% of the grade for the exam and to 50% of the grade for the exercises. Passing grades from both parts are required.
For PhD students there are special rules concerning credit points (KE) for the course. If your supervisor agrees, you can simply sit in on the course and obtain 2 KE. For solving the exercises successfully additional 2 KE can be obtained. Passing the exam is worth the last 1 KE.
After having completed the course and the Seminar on Approximate Methods in Geometry in the fall semester (see below), it is possible to do a semester, master or diploma thesis in the area of discrete or computational geometry. Students are also welcome at our graduate seminar.
Outlook: During 2008, the following geometry related courses will be offered:
SpringFall
- Topological Methods in Combinatorics and Geometry (J. Matoušek, U. Wagner)
- Seminar zur Algorithmischen Geometrie (B. Gärtner, M. Hoffmann, E. Welzl)
- Algorithmische Geometrie (B. Gärtner, M. Hoffmann)
- Seminar on Approximate Methods in Geometry (B. Gärtner, U. Wagner, E. Welzl) this seminar builds on the Approximate Methods in Geometry course
B. Gärtner, J. Blömer, Approximation Algorithms , Lecture Notes.
For Semidefinite programming see Chapter 7.
References from the Lecture Notes
- M. Badoiu and K.L. Clarkson. Optimal core-sets for balls, submitted.
- G. Barequet and S. Har-Peled. Efficiently approximating the minimum-volume bounding box of a point set in three dimensions, J. of Algorithms (JoA), volume 38 (1), pages 91-109, January 2001.
- K. Fischer and B. Gärtner. The smallest enclosing ball of balls: combinatorial structure and algorithms, Int. J. Comput. Geometry Appl. 14(4-5): 341-378 (2004)
- A. Goel, P. Indyk and K.R. Varadarajan. Reductions among high dimensional proximity problems, SODA 2001: 769-778
- B. Schölkopf and A. Smola. Learning with Kernels, MIT Press, Cambridge Massachusetts, 2002.
- E.Welzl. Smallest Enclosing Disks (Balls and Ellipsoids), In H.Maurer, editor, New Results and New Trends in Computer Science, volume 555 of Lecture Notes in Computer Science, pages 359-370, 1991.