About the Event

Images and other non-text feature-rich data are predominant in today's
exponentially growing digital universe. How to organize such data at
large scale for efficient content-based search is an important problem
which remains open after decades of research. One major challenge is that
the feature data are usually of high dimensionality and are
intrinsically hard to search due to the curse of dimensionality. In this
talk, I will present exciting progress we recently made at Princeton,
namely an efficient method to construct a data structure called a
k-nearest neighbor graph, which can be used to substantially improve
online search. I will also briefly talk about our work on compact data
representation for similarity search and on large-scale near-duplicate
image detection.