MetricKnn API
Fast Similarity Search using the Metric Space Approach
mknn::PredefDistance Class Reference

MetricKnn provides a set of pre-defined distances.

#include <mknn_predefined_distance.hpp>

Static Public Member Functions

static DistanceParams L1 ()
 Creates an object for Manhattan or Taxi-cab distance.

static DistanceParams L2 ()
 Creates an object for Euclidean distance.

static DistanceParams Lmax ()
 Creates an object for L-max distance.

static DistanceParams Lp (double order)
 Creates an object for Minkowski distance.

static DistanceParams Hamming ()
 Creates an object for Hamming distance.

static DistanceParams Chi2 ()
 Creates an object for Chi2 distance.

static DistanceParams Hellinger ()
 Creates an object for Hellinger distance.

static DistanceParams CosineSimilarity (bool normalize_vectors)
 Creates an object for Cosine Similarity.

static DistanceParams CosineDistance (bool normalize_vectors)
 Creates an object for Cosine Distance.

static DistanceParams EMD (long long matrix_rows, long long matrix_cols, double *cost_matrix, bool normalize_vectors)
 Creates an object for Earth Mover's Distance.

static DistanceParams DPF (double order, long long num_dims_discard, double pct_discard, double threshold_discard)
 Creates an object for Dynamic Partial Function distance.

static DistanceParams MultiDistance (const std::vector< Distance > &subdistances, bool free_subdistances_on_release, const std::vector< double > &normalization_values, const std::vector< double > &ponderation_values, bool with_auto_config, Dataset &auto_config_dataset, double auto_normalize_alpha, bool auto_ponderation_maxrho, bool auto_ponderation_maxtau)
 Defines a multi-distance, which is a weighted combination of distances.

Help functions
static void helpListDistances ()
 Lists to standard output all pre-defined distances.

static void helpPrintDistance (std::string id_dist)
 Prints to standard output the help for a distance.

static bool testDistanceId (std::string id_dist)
 Tests whether the given string references a valid pre-defined distance.
 

Detailed Description

MetricKnn provides a set of pre-defined distances.

The generic way to instantiate a predefined distance is to call Distance::newPredefined, which requires the ID and the parameters of the distance.

The complete list of predefined distances can be printed by calling PredefDistance::helpListDistances. The parameters supported by each distance can be printed by calling PredefDistance::helpPrintDistance.

This class contains some functions to ease the instantiation of some predefined distances.

Member Function Documentation

static DistanceParams mknn::PredefDistance::Chi2 ( )
static

Creates an object for Chi2 distance.

The distance between two n-dimensional vectors is defined as:

\[ \chi^2(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sum_{i=1}^n \frac{ (x_i - \bar{m}_i )^2 }{ \bar{m}_i } \]

where \( \bar{m}_i=\frac{x_i+y_i}{2} \) .

Note
This distance does not satisfy the metric properties.
Returns
parameters to create a distance (it must be deleted)
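The formula above can be checked by hand with a small standalone function. This is a minimal sketch, not MetricKnn code: the name chi2_sketch is hypothetical, and skipping terms where \( \bar{m}_i = 0 \) is an assumption, since this page does not say how such terms are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Computes sum_i (x_i - m_i)^2 / m_i with m_i = (x_i + y_i) / 2.
double chi2_sketch(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double m = (x[i] + y[i]) / 2.0;
        if (m == 0.0) continue;  // Assumption: a 0/0 term contributes nothing.
        const double d = x[i] - m;
        sum += d * d / m;
    }
    return sum;
}
```

For x = {1, 3} and y = {3, 1} both means are 2, so each term is 1/2 and the result is 1.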
static DistanceParams mknn::PredefDistance::CosineDistance ( bool  normalize_vectors)
static

Creates an object for Cosine Distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{CosineDistance}(\vec{x},\vec{y}) = \sqrt{ 2 ( 1 - \cos(\vec{x},\vec{y})) } \]

where \( \cos(\vec{x},\vec{y}) \) is the cosine similarity between vectors \( \vec{x} \) and \( \vec{y} \) as defined in CosineSimilarity.

The nearest neighbors obtained by this distance are identical to the farthest neighbors obtained by cosine similarity (if vectors are normalized). Therefore, this distance can be used to accelerate a search that uses cosine similarity.

Parameters
normalize_vectors: whether to normalize the vectors to Euclidean norm 1 prior to each computation.
Returns
parameters to create a distance (it must be deleted)
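One way to see why this distance preserves the ranking given by cosine similarity: for unit-norm vectors, \( \sqrt{2(1-\cos(\vec{x},\vec{y}))} \) equals their Euclidean distance. The sketch below checks that identity numerically. All function names are hypothetical and none of this is MetricKnn code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; not part of the MetricKnn API.
double cosine_of(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double dot = 0.0, nx = 0.0, ny = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        dot += x[i] * y[i];
        nx += x[i] * x[i];
        ny += y[i] * y[i];
    }
    return dot / (std::sqrt(nx) * std::sqrt(ny));
}

// sqrt(2 * (1 - cos)); the clamp guards against tiny negative rounding errors.
double cosine_distance_sketch(const std::vector<double>& x,
                              const std::vector<double>& y) {
    return std::sqrt(std::max(0.0, 2.0 * (1.0 - cosine_of(x, y))));
}

// Euclidean distance between x/|x| and y/|y|, computed directly for comparison.
double l2_between_unit_vectors(const std::vector<double>& x,
                               const std::vector<double>& y) {
    assert(x.size() == y.size());
    double nx = 0.0, ny = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        nx += x[i] * x[i];
        ny += y[i] * y[i];
    }
    nx = std::sqrt(nx);
    ny = std::sqrt(ny);
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] / nx - y[i] / ny;
        sum += d * d;
    }
    return std::sqrt(sum);
}
```

Both functions agree up to rounding, for example on x = {3, 4} and y = {4, 3}.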
static DistanceParams mknn::PredefDistance::CosineSimilarity ( bool  normalize_vectors)
static

Creates an object for Cosine Similarity.

The similarity between two n-dimensional vectors is defined as:

\[ \textrm{cos}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \frac { \sum_{i=1}^n x_i \cdot y_i } {\sqrt{ \sum_{i=1}^{n} {x_i}^2 } \cdot \sqrt{ \sum_{i=1}^{n} {y_i}^2 } } \]

Note
This is a similarity function, therefore a search for the Farthest Neighbors is needed. See CosineDistance for a distance version.
Parameters
normalize_vectors: if true, computes the Euclidean norm of each vector. If false, the vectors are assumed to be already normalized, so the denominator \( \sqrt{ \sum_{i=1}^{n} {x_i}^2 } \cdot \sqrt{ \sum_{i=1}^{n} {y_i}^2 } \) is equal to 1.
Returns
parameters to create a distance (it must be deleted)
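The effect of normalize_vectors can be illustrated with a standalone function. This is a sketch under the reading of the parameter given above, where false means the denominator is taken to be 1 and only the dot product is computed. The name cosine_similarity_sketch is hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
double cosine_similarity_sketch(const std::vector<double>& x,
                                const std::vector<double>& y,
                                bool normalize_vectors) {
    assert(x.size() == y.size());
    double dot = 0.0, nx = 0.0, ny = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        dot += x[i] * y[i];
        nx += x[i] * x[i];
        ny += y[i] * y[i];
    }
    // With normalize_vectors == false the caller promises unit-norm input,
    // so the denominator is assumed to be 1 and the dot product is returned.
    if (!normalize_vectors) return dot;
    return dot / (std::sqrt(nx) * std::sqrt(ny));
}
```

With x = {3, 4} and y = {4, 3} the normalized result is 24/25. Passing false on those same unnormalized vectors returns the raw dot product 24, which is why the flag should only be disabled for data that already has unit norm.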
static DistanceParams mknn::PredefDistance::DPF ( double  order,
long long  num_dims_discard,
double  pct_discard,
double  threshold_discard 
)
static

Creates an object for Dynamic Partial Function distance.

See definition http://dx.doi.org/10.1109/ICIP.2002.1040021 .

The distance between two n-dimensional vectors is defined as:

\[ \textrm{DPF}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \left( {\sum_{i \in \Delta_m} |x_i-y_i|^p } \right)^{\frac{1}{p}} \]

where \( \Delta_m \) is the set of indices of the \( m \) smallest values of \( |x_i-y_i| \).

Parameters
order: the order \( p \) of the distance, with \( p > 0 \).
num_dims_discard: fixed number of dimensions to discard, with 0 < num_dims_discard < num_dimensions.
pct_discard: fixed number of dimensions to discard, given as a fraction of num_dimensions with 0 < pct_discard < 1, so that num_dims_discard = round(pct_discard * num_dimensions).
threshold_discard: discards all dimensions whose difference is higher than threshold_discard. This produces a variable number of dimensions to discard.
Returns
parameters to create a distance (it must be deleted)
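The fixed-count variant of the definition can be sketched as follows. The code assumes that \( m \) equals the number of dimensions minus num_dims_discard, which is inferred from the parameter description rather than stated on this page. The name dpf_sketch is hypothetical, and the pct_discard and threshold_discard variants are not covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Keeps the m = n - num_dims_discard smallest coordinate differences
// and applies the Minkowski formula of the given order to them.
double dpf_sketch(const std::vector<double>& x, const std::vector<double>& y,
                  double order, std::size_t num_dims_discard) {
    assert(x.size() == y.size());
    assert(order > 0.0);
    assert(num_dims_discard < x.size());
    std::vector<double> diffs(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        diffs[i] = std::fabs(x[i] - y[i]);
    }
    std::sort(diffs.begin(), diffs.end());  // Smallest differences first.
    const std::size_t m = diffs.size() - num_dims_discard;
    double sum = 0.0;
    for (std::size_t i = 0; i < m; ++i) {
        sum += std::pow(diffs[i], order);
    }
    return std::pow(sum, 1.0 / order);
}
```

For x = {0, 0, 0} and y = {1, 2, 10} with one discarded dimension, the outlier difference 10 is dropped and order 1 yields 1 + 2 = 3.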
static DistanceParams mknn::PredefDistance::EMD ( long long  matrix_rows,
long long  matrix_cols,
double *  cost_matrix,
bool  normalize_vectors 
)
static

Creates an object for Earth Mover's Distance.

This function uses OpenCV's implementation, see http://docs.opencv.org/modules/imgproc/doc/histograms.html#emd .

Note
Depending on the cost_matrix, this distance may or may not satisfy the metric properties. If the values in cost_matrix were computed by a metric distance, then the EMD will also be a metric distance.
Parameters
matrix_rows: number of rows of the cost matrix.
matrix_cols: number of columns of the cost matrix.
cost_matrix: an array of length matrix_rows * matrix_cols with the cost for each pair of dimensions.
normalize_vectors: normalizes both vectors (to sum 1) before computing the distance.
Returns
parameters to create a distance (it must be deleted)
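As an illustration of a cost matrix whose entries come from a metric, the sketch below fills a flat array with the ground cost \( |i - j| \) between bin indices. The row-major layout (entry (i, j) at index i * cols + j) is an assumption, since this page only states the array length. The helper name is hypothetical, and ownership or lifetime requirements for the pointer passed as cost_matrix are not stated here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Builds a rows x cols ground-cost matrix with cost(i, j) = |i - j|.
// Assumption: row-major layout, i.e. entry (i, j) is stored at i * cols + j.
std::vector<double> make_abs_cost_matrix(std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);
    std::vector<double> cost(rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            cost[i * cols + j] =
                std::fabs(static_cast<double>(i) - static_cast<double>(j));
        }
    }
    return cost;
}
```

The resulting vector has exactly rows * cols entries, and its data() pointer has the double * type that the cost_matrix parameter takes.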
static DistanceParams mknn::PredefDistance::Hamming ( )
static

Creates an object for Hamming distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Hamming}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sum_{i=1}^n \bar{p}_i \]

where \( \bar{p}_i= \left\{ \begin{array}{ll} 0 & x_i = y_i\\ 1 & x_i \neq y_i\\ \end{array} \right. \) .

Returns
parameters to create a distance (it must be deleted)
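In other words, the Hamming distance counts the positions at which the two vectors differ. A minimal standalone sketch follows, with the hypothetical name hamming_sketch and exact comparison of double values as an assumption about how equality is tested.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Counts the coordinates where x and y differ (exact comparison).
double hamming_sketch(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double count = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (x[i] != y[i]) count += 1.0;
    }
    return count;
}
```

For x = {1, 0, 1, 1} and y = {1, 1, 0, 1} the vectors differ at the second and third positions, so the distance is 2.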
static DistanceParams mknn::PredefDistance::Hellinger ( )
static

Creates an object for Hellinger distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Hellinger}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sqrt { \frac { \sum_{i=1}^n ( \sqrt{x_i} - \sqrt{y_i} )^2} { 2 } } \]

Note
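For vectors with non-negative components, this distance equals the Euclidean distance between the element-wise square roots scaled by \( 1/\sqrt{2} \), so it satisfies the metric properties on that domain. It is undefined for negative components.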
Returns
parameters to create a distance (it must be deleted)
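A direct transcription of the formula, as a sketch rather than the library's implementation. The name hellinger_sketch is hypothetical, and the non-negativity check is an assumption added because the square roots are not defined for negative components.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Computes sqrt( sum_i (sqrt(x_i) - sqrt(y_i))^2 / 2 ).
double hellinger_sketch(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(x[i] >= 0.0 && y[i] >= 0.0);  // Assumption: non-negative input.
        const double d = std::sqrt(x[i]) - std::sqrt(y[i]);
        sum += d * d;
    }
    return std::sqrt(sum / 2.0);
}
```

For x = {4, 9} and y = {1, 4} the square roots are {2, 3} and {1, 2}, so the sum of squared differences is 2 and the distance is 1.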
static void mknn::PredefDistance::helpPrintDistance ( std::string  id_dist)
static

Prints to standard output the help for a distance.

Parameters
id_dist: the unique identifier of a pre-defined distance.
static DistanceParams mknn::PredefDistance::L1 ( )
static

Creates an object for Manhattan or Taxi-cab distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{L1}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sum_{i=1}^{n} |x_i - y_i| \]

This distance satisfies the metric properties, therefore it can be used by Metric Indexes to obtain exact nearest neighbors.

Returns
parameters to create a distance (it must be deleted)
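For reference, the L1 formula written out as a standalone function. The name l1_sketch is hypothetical and this is not the library's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Sum of absolute coordinate differences.
double l1_sketch(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += std::fabs(x[i] - y[i]);
    }
    return sum;
}
```

For x = {1, 2, 3} and y = {4, 0, 3} the differences are 3, 2 and 0, giving a distance of 5.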
static DistanceParams mknn::PredefDistance::L2 ( )
static

Creates an object for Euclidean distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{L2}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \sqrt{ \sum_{i=1}^{n} (x_i - y_i)^2 } \]

This distance satisfies the metric properties, therefore it can be used by Metric Indexes to obtain exact nearest neighbors.

Returns
parameters to create a distance (it must be deleted)
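The triangle inequality is the metric property that matters most for indexing, because it lets an index bound one distance by two others. The sketch below implements the L2 formula under the hypothetical name l2_sketch so that the inequality can be spot-checked on three points. It is an illustration only, not MetricKnn code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Square root of the sum of squared coordinate differences.
double l2_sketch(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - y[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}
```

With a = {0, 0}, b = {3, 4} and c = {6, 0}, the distances are d(a, b) = 5, d(b, c) = 5 and d(a, c) = 6, which satisfies d(a, c) <= d(a, b) + d(b, c).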
static DistanceParams mknn::PredefDistance::Lmax ( )
static

Creates an object for L-max distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Lmax}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \max_{i \in \{1,...,n\}} |x_i - y_i| \]

This distance satisfies the metric properties, therefore it can be used by Metric Indexes to obtain exact nearest neighbors.

Returns
parameters to create a distance (it must be deleted)
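The L-max formula keeps only the largest coordinate difference. A standalone sketch, with the hypothetical name lmax_sketch:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Largest absolute coordinate difference.
double lmax_sketch(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double best = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        best = std::max(best, std::fabs(x[i] - y[i]));
    }
    return best;
}
```

For x = {1, 5, 2} and y = {4, 1, 2} the differences are 3, 4 and 0, so the distance is 4.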
static DistanceParams mknn::PredefDistance::Lp ( double  order)
static

Creates an object for Minkowski distance.

The distance between two n-dimensional vectors is defined as:

\[ \textrm{Lp}(\{x_1,...,x_n\},\{y_1,...,y_n\}) = \left( {\sum_{i=1}^n |x_i-y_i|^p } \right)^{\frac{1}{p}} \]

Note
This distance satisfies the metric properties only when \( p \geq 1 \). When \( 0 < p < 1 \) the Metric Indexes may not obtain exact nearest neighbors.
Parameters
order: the order \( p \) of the distance, with \( p > 0 \).
Returns
parameters to create a distance (it must be deleted)
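The note about \( 0 < p < 1 \) can be made concrete with three points for which the triangle inequality fails. The sketch below uses a hypothetical standalone function lp_sketch that follows the formula above. It is not the library's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Minkowski distance of the given order p > 0.
double lp_sketch(const std::vector<double>& x, const std::vector<double>& y,
                 double order) {
    assert(x.size() == y.size());
    assert(order > 0.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += std::pow(std::fabs(x[i] - y[i]), order);
    }
    return std::pow(sum, 1.0 / order);
}
```

Take x = {0, 0}, y = {1, 0} and z = {1, 1}. With p = 0.5, d(x, y) = 1 and d(y, z) = 1, but d(x, z) = (1 + 1)^2 = 4, which exceeds d(x, y) + d(y, z) = 2. With p = 1 the same points give d(x, z) = 2, and the inequality holds.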
static DistanceParams mknn::PredefDistance::MultiDistance ( const std::vector< Distance > &  subdistances,
bool  free_subdistances_on_release,
const std::vector< double > &  normalization_values,
const std::vector< double > &  ponderation_values,
bool  with_auto_config,
Dataset &  auto_config_dataset,
double  auto_normalize_alpha,
bool  auto_ponderation_maxrho,
bool  auto_ponderation_maxtau 
)
static

Defines a multi-distance, which is a weighted combination of distances.

Warning
Under construction.
Parameters
subdistances: the distances to combine.
free_subdistances_on_release: whether to release the subdistances together with this distance.
normalization_values: the value by which each distance is divided.
ponderation_values: the weight applied to each distance.
with_auto_config: runs algorithms to automatically determine the normalization or ponderation values.
auto_config_dataset: the data to be used by the algorithms.
auto_normalize_alpha: the value to be used by the alpha-normalization.
auto_ponderation_maxrho: runs the automatic ponderation according to the max rho criterion.
auto_ponderation_maxtau: runs the automatic ponderation according to the max tau criterion.
Returns
parameters to create a distance (it must be deleted)
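This page does not give the combination formula. One plausible reading of the parameter descriptions is \( \sum_i w_i \cdot d_i / n_i \), where \( d_i \) is the value of the i-th subdistance, \( n_i \) its normalization value and \( w_i \) its ponderation value. The sketch below implements that reading only. Both the formula and the name combine_sketch are assumptions, not the documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the MetricKnn API.
// Assumed formula: sum_i ponderation_i * (distance_i / normalization_i).
double combine_sketch(const std::vector<double>& distance_values,
                      const std::vector<double>& normalization_values,
                      const std::vector<double>& ponderation_values) {
    assert(distance_values.size() == normalization_values.size());
    assert(distance_values.size() == ponderation_values.size());
    double total = 0.0;
    for (std::size_t i = 0; i < distance_values.size(); ++i) {
        assert(normalization_values[i] != 0.0);
        total += ponderation_values[i] *
                 (distance_values[i] / normalization_values[i]);
    }
    return total;
}
```

With subdistance values {10, 2}, normalization values {5, 4} and equal weights {0.5, 0.5}, the normalized values are 2 and 0.5, and the combined value is 1.25.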
static bool mknn::PredefDistance::testDistanceId ( std::string  id_dist)
static

Tests whether the given string references a valid pre-defined distance.

Parameters
id_dist: the unique identifier of a pre-defined distance.
Returns
true if id_dist corresponds to a pre-defined distance, and false otherwise.
