Classes | Defines | Typedefs | Enumerations
ikmeans.h File Reference

Integer K-Means clustering. More...

#include "generic.h"
#include "random.h"
Include dependency graph for ikmeans.h:
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Classes

struct  _VlIKMFilt
 IKM quantizer. More...

Defines

#define VL_IKMACC_MAX   0x7fffffffUL

Typedefs

typedef vl_int32 vl_ikmacc_t
typedef struct _VlIKMFilt VlIKMFilt
 IKM quantizer.

Enumerations

enum  VlIKMAlgorithms { VL_IKM_LLOYD, VL_IKM_ELKAN }
 IKM algorithms. More...

Functions

Create and destroy
VL_EXPORT VlIKMFiltvl_ikm_new (int method)
 Create a new IKM quantizer.
VL_EXPORT void vl_ikm_delete (VlIKMFilt *f)
 Delete IKM quantizer.
Process data
VL_EXPORT void vl_ikm_init (VlIKMFilt *f, vl_ikmacc_t const *centers, vl_size M, vl_size K)
VL_EXPORT void vl_ikm_init_rand (VlIKMFilt *f, vl_size M, vl_size K)
VL_EXPORT void vl_ikm_init_rand_data (VlIKMFilt *f, vl_uint8 const *data, vl_size M, vl_size N, vl_size K)
VL_EXPORT int vl_ikm_train (VlIKMFilt *f, vl_uint8 const *data, vl_size N)
 Train clusters.
VL_EXPORT void vl_ikm_push (VlIKMFilt *f, vl_uint32 *asgn, vl_uint8 const *data, vl_size N)
 Project data to clusters.
VL_EXPORT vl_uint vl_ikm_push_one (vl_ikmacc_t const *centers, vl_uint8 const *data, vl_size M, vl_size K)
 Project one datum to clusters.
Retrieve data and parameters
VL_EXPORT vl_size vl_ikm_get_ndims (VlIKMFilt const *f)
 Get data dimensionality.
VL_EXPORT vl_size vl_ikm_get_K (VlIKMFilt const *f)
 Get the number of centers K.
VL_EXPORT int vl_ikm_get_verbosity (VlIKMFilt const *f)
 Get verbosity level.
VL_EXPORT vl_size vl_ikm_get_max_niters (VlIKMFilt const *f)
 Get maximum number of iterations.
VL_EXPORT vl_ikmacc_t const * vl_ikm_get_centers (VlIKMFilt const *f)
 Get maximum number of iterations.
Set parameters
VL_EXPORT void vl_ikm_set_verbosity (VlIKMFilt *f, int verb)
 Set verbosity level.
VL_EXPORT void vl_ikm_set_max_niters (VlIKMFilt *f, vl_size max_niters)
 Set maximum number of iterations.

Detailed Description

Integer K-Means clustering.

Integer K-means (IKM) is an implementation of K-means clustering (or Vector Quantization, VQ) for integer data. This is particularly useful for clustering large collections of visual descriptors.

Use the function vl_ikm_new() to create a IKM quantizer. Initialize the IKM quantizer with K clusters by vl_ikm_init() or similar function. Use vl_ikm_train() to train the quantizer. Use vl_ikm_push() or vl_ikm_push_one() to quantize new data.

Given data $x_1,\dots,x_N\in R^d$ and a number of clusters $K$, the goal is to find assignments $a_i\in\{1,\dots,K\},$ and centers $c_1,\dots,c_K\in R^d$ so that the expected distortion

\[ E(\{a_{i}, c_j\}) = \frac{1}{N} \sum_{i=1}^N d(x_i, c_{a_i}) \]

is minimized. Here $d(x_i, c_{a_i})$ is the distortion, i.e. the cost we pay for representing $ x_i $ by $ c_{a_i} $. IKM uses the squared distortion $d(x,y)=\|x-y\|^2_2$.

Algorithms

Initialization

Most K-means algorithms are iterative and needs an initialization in the form of an initial choice of the centers $c_1,\dots,c_K$. We include the following options:

Lloyd

The Lloyd (also known as Lloyd-Max and LBG) algorithm iteratively:

This algorithm is not particularly efficient because all data points need to be compared to all centers, for a complexity $O(dNKT)$, where T is the total number of iterations.

Elkan

The Elkan algorithm is an optimized variant of Lloyd. By making use of the triangle inequality, many comparisons of data points and centers are avoided, especially at later iterations. Usually 4-5 times less comparisons than Lloyd are preformed, providing a dramatic speedup in the execution time.

Author:
Brian Fulkerson
Andrea Vedaldi

Definition in file ikmeans.h.


Define Documentation

#define VL_IKMACC_MAX   0x7fffffffUL

Definition at line 27 of file ikmeans.h.


Typedef Documentation

IKM accumulator data type

Definition at line 26 of file ikmeans.h.

typedef struct _VlIKMFilt VlIKMFilt

IKM quantizer.

------------------------------------------------------------------


Enumeration Type Documentation

IKM algorithms.

------------------------------------------------------------------

Enumerator:
VL_IKM_LLOYD 

Lloyd algorithm

VL_IKM_ELKAN 

Elkan algorithm

Definition at line 35 of file ikmeans.h.


Function Documentation

VL_EXPORT void vl_ikm_delete ( VlIKMFilt f)

Delete IKM quantizer.

Parameters:
fIKM quantizer.

Definition at line 117 of file ikmeans.c.

VL_EXPORT vl_ikmacc_t const* vl_ikm_get_centers ( VlIKMFilt const *  f)

Get maximum number of iterations.

Parameters:
fIKM filter.
Returns:
maximum number of iterations.

Definition at line 268 of file ikmeans.c.

VL_EXPORT vl_size vl_ikm_get_K ( VlIKMFilt const *  f)

Get the number of centers K.

Parameters:
fIKM filter.
Returns:
number of centers K.

Definition at line 235 of file ikmeans.c.

VL_EXPORT vl_size vl_ikm_get_max_niters ( VlIKMFilt const *  f)

Get maximum number of iterations.

Parameters:
fIKM filter.
Returns:
maximum number of iterations.

Definition at line 257 of file ikmeans.c.

VL_EXPORT vl_size vl_ikm_get_ndims ( VlIKMFilt const *  f)

Get data dimensionality.

Parameters:
fIKM filter.
Returns:
data dimensionality.

Definition at line 223 of file ikmeans.c.

VL_EXPORT int vl_ikm_get_verbosity ( VlIKMFilt const *  f)

Get verbosity level.

Parameters:
fIKM filter.
Returns:
verbosity level.

Definition at line 246 of file ikmeans.c.

VL_EXPORT void vl_ikm_init ( VlIKMFilt f,
vl_ikmacc_t const *  centers,
vl_size  M,
vl_size  K 
)
VL_EXPORT void vl_ikm_init_rand ( VlIKMFilt f,
vl_size  M,
vl_size  K 
)
VL_EXPORT void vl_ikm_init_rand_data ( VlIKMFilt f,
vl_uint8 const *  data,
vl_size  M,
vl_size  N,
vl_size  K 
)
VL_EXPORT VlIKMFilt* vl_ikm_new ( int  method)

Create a new IKM quantizer.

Parameters:
methodClustering algorithm.
Returns:
new IKM quantizer.

The function allocates initializes a new IKM quantizer to operate based algorithm method.

method has values in the enumerations VlIKMAlgorithms.

Definition at line 104 of file ikmeans.c.

VL_EXPORT void vl_ikm_push ( VlIKMFilt f,
vl_uint32 asgn,
vl_uint8 const *  data,
vl_size  N 
)

Project data to clusters.

Parameters:
fIKM quantizer.
asgnAssignments (out).
datadata.
Nnumber of data (N >= 1).

The function projects the data data on the integer K-means clusters specified by the IKM quantizer f. Notice that the quantizer must be initialized.

Definition at line 164 of file ikmeans.c.

VL_EXPORT vl_uint vl_ikm_push_one ( vl_ikmacc_t const *  centers,
vl_uint8 const *  data,
vl_size  M,
vl_size  K 
)

Project one datum to clusters.

Parameters:
centerscenters.
datadatum to project.
Knumber of centers.
Mdimensionality of the datum.
Returns:
the cluster index.

The function projects the specified datum data on the clusters specified by the centers centers.

Definition at line 185 of file ikmeans.c.

VL_EXPORT void vl_ikm_set_max_niters ( VlIKMFilt f,
vl_size  max_niters 
)

Set maximum number of iterations.

Parameters:
fIKM filter.
max_nitersmaximum number of iterations.

Definition at line 290 of file ikmeans.c.

VL_EXPORT void vl_ikm_set_verbosity ( VlIKMFilt f,
int  verb 
)

Set verbosity level.

Parameters:
fIKM filter.
verbverbosity level.

Definition at line 279 of file ikmeans.c.

VL_EXPORT int vl_ikm_train ( VlIKMFilt f,
vl_uint8 const *  data,
vl_size  N 
)

Train clusters.

Parameters:
fIKM quantizer.
datadata.
Nnumber of data (N >= 1).
Returns:
-1 if an overflow may have occurred.

Definition at line 134 of file ikmeans.c.



libvlfeat
Author(s): Andrea Vedaldi
autogenerated on Thu Jun 6 2019 20:25:52