MetricKnn API
Fast Similarity Search using the Metric Space Approach
mknn_predefined_distance.h File Reference

MetricKnn provides a set of pre-defined distances.

#include "../metricknn_c.h"

Functions

MknnDistanceParams * mknn_predefDistance_L1 ()
 Creates an object for Manhattan or Taxi-cab distance.

MknnDistanceParams * mknn_predefDistance_L2 ()
 Creates an object for Euclidean distance.

MknnDistanceParams * mknn_predefDistance_L2squared ()
 Creates an object for squared Euclidean distance.

MknnDistanceParams * mknn_predefDistance_Lmax ()
 Creates an object for L-max distance.

MknnDistanceParams * mknn_predefDistance_Lp (double order)
 Creates an object for Minkowski distance.

MknnDistanceParams * mknn_predefDistance_Hamming ()
 Creates an object for Hamming distance.

MknnDistanceParams * mknn_predefDistance_Chi2 ()
 Creates an object for Chi2 distance.

MknnDistanceParams * mknn_predefDistance_Hellinger ()
 Creates an object for Hellinger distance.

MknnDistanceParams * mknn_predefDistance_CosineSimilarity (bool normalize_vectors)
 Creates an object for Cosine Similarity.

MknnDistanceParams * mknn_predefDistance_CosineDistance (bool normalize_vectors)
 Creates an object for Cosine Distance.

MknnDistanceParams * mknn_predefDistance_EMD (int64_t matrix_rows, int64_t matrix_cols, double *cost_matrix, bool normalize_vectors)
 Creates an object for Earth Mover's Distance.

MknnDistanceParams * mknn_predefDistance_DPF (double order, int64_t num_dims_discard, double pct_discard, double threshold_discard)
 Creates an object for Dynamic Partial Function distance.

MknnDistanceParams * mknn_predefDistance_MultiDistance (int64_t num_subdistances, MknnDistance **subdistances, bool free_subdistances_on_release, double *normalization_values, double *ponderation_values, bool with_auto_config, MknnDataset *auto_config_dataset, double auto_normalize_alpha, bool auto_ponderation_maxrho, bool auto_ponderation_maxtau)
 Defines a multi-distance, which is a weighted combination of distances.
 
Help functions
void mknn_predefDistance_helpListDistances ()
 Lists to standard output all pre-defined distances.
 
void mknn_predefDistance_helpPrintDistance (const char *id_dist)
 Prints to standard output the help for a distance.

bool mknn_predefDistance_testDistanceId (const char *id_dist)
 Tests whether the given string references a valid pre-defined distance.
 

Detailed Description

MetricKnn provides a set of pre-defined distances.

The generic way for instantiating a predefined distance is to use the method mknn_distance_newPredefined, which requires the ID and parameters of the distance.

The complete list of predefined distances can be listed by calling mknn_predefDistance_helpListDistances. The parameters supported by each distance can be listed by calling mknn_predefDistance_helpPrintDistance.

This file contains some functions to ease the instantiation of some predefined distances.

Function Documentation

MknnDistanceParams* mknn_predefDistance_Chi2 ( )

Creates an object for Chi2 distance.

The distance between two n-dimensional vectors is defined as:

\[ \chi^2(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sum_{i=1}^n \frac{ (x_i - \bar{m}_i )^2 }{ \bar{m}_i } \]

where \( \bar{m}_i=\frac{x_i+y_i}{2} \) .

Note
This distance does not satisfy the metric properties.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
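
To make the formula concrete, the following is a minimal standalone sketch of the Chi2 computation. It is not MetricKnn's implementation: the name chi2_distance is hypothetical, and skipping terms whose mean is zero is an assumption of this sketch, made only to avoid a division by zero.

```c
#include <stddef.h>

/* Chi2 distance as written above: sum of (x_i - m_i)^2 / m_i,
 * where m_i = (x_i + y_i) / 2.
 * Assumption of this sketch: terms with m_i == 0 are skipped. */
double chi2_distance(const double *x, const double *y, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double m = (x[i] + y[i]) / 2.0;
        if (m == 0.0)
            continue; /* avoid dividing by zero */
        double d = x[i] - m;
        sum += d * d / m;
    }
    return sum;
}
```

For x = {1, 3} and y = {3, 1}, both means equal 2 and each term contributes 0.5, so the result is 1.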
MknnDistanceParams* mknn_predefDistance_CosineDistance ( bool  normalize_vectors)

Creates an object for Cosine Distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{CosineDistance}(\vec{x},\vec{y}) = \sqrt{ 2 ( 1 - \cos(\vec{x},\vec{y})) } \]

where \( \cos(\vec{x},\vec{y}) \) is the cosine similarity between vectors \( \vec{x} \) and \( \vec{y} \) as defined in mknn_predefDistance_CosineSimilarity.

The nearest neighbors obtained by this distance are identical to the farthest neighbors obtained by cosine similarity (if vectors are normalized). Therefore, this distance can be used to accelerate searches based on cosine similarity.

Parameters
normalize_vectors: whether the cosine similarity must normalize vectors to Euclidean norm 1 prior to each computation.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
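
The equivalence with cosine similarity can be checked with a short derivation. Assuming both vectors already have Euclidean norm 1, expanding the squared L2 distance gives:

\[ \| \vec{x} - \vec{y} \|^2 = \| \vec{x} \|^2 + \| \vec{y} \|^2 - 2 \, \vec{x} \cdot \vec{y} = 2 - 2 \cos(\vec{x},\vec{y}) = \textrm{CosineDistance}(\vec{x},\vec{y})^2 \]

On unit vectors the Cosine Distance therefore coincides with the Euclidean distance, and it decreases as the cosine similarity increases, so the two rankings agree.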
MknnDistanceParams* mknn_predefDistance_CosineSimilarity ( bool  normalize_vectors)

Creates an object for Cosine Similarity.

The similarity between two n-dimensional vectors is defined as:

\[ \textrm{cos}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \frac { \sum_{i=1}^n x_i \cdot y_i } {\sqrt{ \sum_{i=1}^{n} {x_i}^2 } \cdot \sqrt{ \sum_{i=1}^{n} {y_i}^2 } } \]

Note
This is a similarity function, therefore a search for the Farthest Neighbors is needed. See mknn_predefDistance_CosineDistance for a distance version.
Parameters
normalize_vectors: if true, computes the Euclidean norm of each vector. If false, the vectors are assumed to be already normalized, so the value \( \sqrt{ \sum_{i=1}^{n} {x_i}^2 } \cdot \sqrt{ \sum_{i=1}^{n} {y_i}^2 } \) is equal to 1.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
MknnDistanceParams* mknn_predefDistance_DPF ( double  order,
int64_t  num_dims_discard,
double  pct_discard,
double  threshold_discard 
)

Creates an object for Dynamic Partial Function distance.

See definition http://dx.doi.org/10.1109/ICIP.2002.1040021 .

The distance between two n-dimensional vectors is defined as:

\[ \textrm{DPF}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \left( {\sum_{i \in \Delta_m} |x_i-y_i|^p } \right)^{\frac{1}{p}} \]

where \( \Delta_m \) is the subset of the \( m \) smallest values of \( |x_i-y_i| \).

Parameters
order: the order \( p \) of the distance, with \( p > 0 \).
num_dims_discard: fixed number of dimensions to discard, with 0 < num_dims_discard < num_dimensions.
pct_discard: number of dimensions to discard, given as a fraction of num_dimensions with 0 < pct_discard < 1; num_dims_discard = round(pct_discard * num_dimensions).
threshold_discard: discards all dimensions whose difference is higher than threshold_discard. This produces a variable number of dimensions to discard.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
MknnDistanceParams* mknn_predefDistance_EMD ( int64_t  matrix_rows,
int64_t  matrix_cols,
double *  cost_matrix,
bool  normalize_vectors 
)

Creates an object for Earth Mover's Distance.

This function uses OpenCV's implementation, see http://docs.opencv.org/modules/imgproc/doc/histograms.html#emd .

Note
Depending on the cost_matrix this distance may or may not satisfy the metric properties. If the values in cost_matrix were computed by a metric distance, then the EMD will also be a metric distance.
Parameters
matrix_rows
matrix_cols
cost_matrix: an array of length matrix_rows * matrix_cols with the cost for each pair of dimensions.
normalize_vectors: normalizes both vectors (to sum 1) before computing the distance.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
MknnDistanceParams* mknn_predefDistance_Hamming ( )

Creates an object for Hamming distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Hamming}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sum_{i=1}^n \bar{p}_i \]

where \( \bar{p}_i= \left\{ \begin{array}{ll} 0 & x_i = y_i\\ 1 & x_i \neq y_i\\ \end{array} \right. \) .

Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
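
As a minimal sketch of the definition above (the name hamming_distance is hypothetical and the code is independent of MetricKnn), the distance is a count of the positions where the two vectors differ:

```c
#include <stddef.h>

/* Counts the dimensions i where x_i != y_i. */
size_t hamming_distance(const double *x, const double *y, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] != y[i])
            count++;
    }
    return count;
}
```

For x = {1, 2, 3, 4} and y = {1, 0, 3, 5} the vectors differ in the second and fourth positions, so the result is 2.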
MknnDistanceParams* mknn_predefDistance_Hellinger ( )

Creates an object for Hellinger distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Hellinger}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sqrt { \frac { \sum_{i=1}^n ( \sqrt{x_i} - \sqrt{y_i} )^2} { 2 } } \]

Note
This distance does not satisfy the metric properties.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
void mknn_predefDistance_helpPrintDistance ( const char *  id_dist)

Prints to standard output the help for a distance.

Parameters
id_dist: the unique identifier of a pre-defined distance.
MknnDistanceParams* mknn_predefDistance_L1 ( )

Creates an object for Manhattan or Taxi-cab distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{L1}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sum_{i=1}^{n} |x_i - y_i| \]

This distance satisfies the metric properties, therefore it can be used by Metric Indexes to obtain exact nearest neighbors.

Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
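
A standalone sketch of the L1 formula, independent of the library (the name l1_distance is hypothetical):

```c
#include <math.h>
#include <stddef.h>

/* Sum of absolute per-dimension differences. */
double l1_distance(const double *x, const double *y, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += fabs(x[i] - y[i]);
    return sum;
}
```

For x = {1, 2, 3} and y = {4, 0, 3} the per-dimension differences are 3, 2 and 0, which add up to 5.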
MknnDistanceParams* mknn_predefDistance_L2 ( )

Creates an object for Euclidean distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{L2}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sqrt{ \sum_{i=1}^{n} (x_i - y_i)^2 } \]

This distance satisfies the metric properties, therefore it can be used by Metric Indexes to obtain exact nearest neighbors.

Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
MknnDistanceParams* mknn_predefDistance_L2squared ( )

Creates an object for squared Euclidean distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{L2squared}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sum_{i=1}^{n} (x_i - y_i)^2 \]

Note
This distance does not satisfy the metric properties.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
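
A one-dimensional counterexample shows which metric property fails. The sketch below is standalone and uses a hypothetical name, l2_squared_distance; it is not the library's code.

```c
#include <stddef.h>

/* Sum of squared per-dimension differences (no square root). */
double l2_squared_distance(const double *x, const double *y, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = x[i] - y[i];
        sum += d * d;
    }
    return sum;
}
```

Take the points a = 0, b = 1 and c = 2 on a line. The direct distance from a to c is 4, while going through b costs 1 + 1 = 2, which violates the triangle inequality. Since squaring is monotone on non-negative values, the ranking of neighbors is still the same as with L2.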
MknnDistanceParams* mknn_predefDistance_Lmax ( )

Creates an object for L-max distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Lmax}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \max_{i \in \{1,...,n\}} |x_i - y_i| \]

This distance satisfies the metric properties, therefore it can be used by Metric Indexes to obtain exact nearest neighbors.

Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
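
A standalone sketch of the L-max formula (the name lmax_distance is hypothetical), which keeps only the largest per-dimension difference:

```c
#include <math.h>
#include <stddef.h>

/* Maximum absolute per-dimension difference; 0 for empty vectors. */
double lmax_distance(const double *x, const double *y, size_t n) {
    double max = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = fabs(x[i] - y[i]);
        if (d > max)
            max = d;
    }
    return max;
}
```

For x = {1, 5, 2} and y = {4, 1, 2} the differences are 3, 4 and 0, so the result is 4.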
MknnDistanceParams* mknn_predefDistance_Lp ( double  order)

Creates an object for Minkowski distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Lp}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \left( {\sum_{i=1}^n |x_i-y_i|^p } \right)^{\frac{1}{p}} \]

Note
This distance satisfies the metric properties only when \( p \geq 1 \). When \( 0 < p < 1 \) the Metric Indexes may not obtain exact nearest neighbors.
Parameters
order: the order \( p \) of the distance, with \( p > 0 \).
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
MknnDistanceParams* mknn_predefDistance_MultiDistance ( int64_t  num_subdistances,
MknnDistance **  subdistances,
bool  free_subdistances_on_release,
double *  normalization_values,
double *  ponderation_values,
bool  with_auto_config,
MknnDataset *  auto_config_dataset,
double  auto_normalize_alpha,
bool  auto_ponderation_maxrho,
bool  auto_ponderation_maxtau 
)

Defines a multi-distance, which is a weighted combination of distances.

Warning
Under construction.
Parameters
num_subdistances: number of subdistances to combine.
subdistances: the distances to combine.
free_subdistances_on_release: whether to release the subdistances together with this distance.
normalization_values: the value by which each distance is divided.
ponderation_values: the weight applied to each distance.
with_auto_config: run algorithms to automatically find normalization or ponderation values.
auto_config_dataset: the data to be used by the algorithms.
auto_normalize_alpha: the value to be used by the alpha-normalization.
auto_ponderation_maxrho: run the automatic ponderation according to the max rho criterion.
auto_ponderation_maxtau: run the automatic ponderation according to the max tau criterion.
Returns
parameters to create a distance (it must be released with mknn_distanceParams_release or bound to the new distance)
bool mknn_predefDistance_testDistanceId ( const char *  id_dist)

Tests whether the given string references a valid pre-defined distance.

Parameters
id_dist: the unique identifier of a pre-defined distance.
Returns
true if id_dist corresponds to a pre-defined distance, and false otherwise.