MetricKnn API
Fast Similarity Search using the Metric Space Approach
mknn_dataset.h File Reference

MknnDataset represents a set of objects of any type.

#include "../metricknn_c.h"

Functions

int64_t mknn_dataset_getNumObjects (MknnDataset *dataset)
 Size of the dataset.

void * mknn_dataset_getObject (MknnDataset *dataset, int64_t pos)
 Retrieves the object at position pos in dataset.

void mknn_dataset_pushObject (MknnDataset *dataset, void *object)
 Adds an object to a dataset.

MknnDomain * mknn_dataset_getDomain (MknnDataset *dataset)
 Returns the domain assigned to the dataset.

MknnDataset * mknn_dataset_clone (MknnDataset *dataset)
 Returns a new dataset with a copy of each element in dataset.

void * mknn_dataset_getCompactVectors (MknnDataset *dataset)
 The objects in the dataset are stored in a single long array.

void mknn_dataset_set_free_domain_on_dataset_release (MknnDataset *dataset, bool free_domain_on_dataset_release)

bool mknn_dataset_get_free_domain_on_dataset_release (MknnDataset *dataset)

void mknn_dataset_release (MknnDataset *dataset)
 Releases the dataset.

void mknn_dataset_save (MknnDataset *dataset, const char *filename_write)
 The dataset is saved to a file.

MknnDataset * mknn_dataset_restore (const char *filename_read)
 Loads a dataset from a file.

void mknn_dataset_printObjectsRawFile (MknnDataset *dataset, const char *filename_write)
 It prints the objects in the dataset in binary format, i.e., using fwrite to write memory addresses.

void mknn_dataset_printObjectsTextFile (MknnDataset *dataset, const char *filename_write)
 It prints the objects in the dataset in text format, i.e., converting them to string and using fprintf.
 
Concatenate dataset
int64_t mknn_dataset_concatenate_getNumSubDatasets (MknnDataset *concatenate_dataset)
 Returns the number of subdatasets that produced this dataset.

MknnDataset * mknn_dataset_concatenate_getSubDataset (MknnDataset *concatenate_dataset, int64_t num_subdataset)
 Returns one of the subdatasets that produced this dataset.

void mknn_dataset_concatenate_getDatasetObject (MknnDataset *concatenate_dataset, int64_t posObject, int64_t *out_numSubdataset, int64_t *out_posObjectInSubdataset)
 Given the position of an object in the concatenated dataset, returns two numbers: the number of the subdataset that contains the object and the position of the object within that subdataset.
 
MultiObject dataset
int64_t mknn_dataset_multiobject_getNumSubDatasets (MknnDataset *multiobject_dataset)
 Returns the number of subdatasets that produced this dataset.

MknnDataset * mknn_dataset_multiobject_getSubDataset (MknnDataset *multiobject_dataset, int64_t num_subdataset)
 Returns one of the subdatasets that produced this dataset.
 
Custom dataset
void * mknn_dataset_custom_getDataPointer (MknnDataset *custom_dataset)
 Returns the pointer to the object used during the creation of the dataset.
 

Detailed Description

MknnDataset represents a set of objects of any type.

Objects in a dataset have type void* by default. To use a pre-defined distance, the dataset must have a MknnDomain assigned to it.

Function Documentation

MknnDataset* mknn_dataset_clone (MknnDataset *dataset)

Returns a new dataset with a copy of each element in dataset.

Parameters
dataset: the dataset to copy
Returns
a new dataset (it must be released with mknn_dataset_release).

void mknn_dataset_concatenate_getDatasetObject (MknnDataset *concatenate_dataset, int64_t posObject, int64_t *out_numSubdataset, int64_t *out_posObjectInSubdataset)

Given the position of an object in the concatenated dataset, returns two numbers: the number of the subdataset that contains the object and the position of the object within that subdataset.

Parameters
concatenate_dataset: a dataset created by mknn_datasetLoader_Concatenate.
posObject: the position of the object, between 0 and mknn_dataset_getNumObjects - 1.
out_numSubdataset: returns the number of the subdataset. It can be used in mknn_dataset_concatenate_getSubDataset.
out_posObjectInSubdataset: returns the position of the object in the subdataset. It can be used in mknn_dataset_getObject.
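The mapping from a position in the concatenated dataset to a pair (subdataset, position in subdataset) can be illustrated with a small standalone sketch. It assumes the subdatasets are laid out one after another in concatenation order, which this page does not state explicitly. The function locate_in_concatenation and its sizes array are hypothetical names used only for illustration; they are not part of MetricKnn and do not show how the library implements the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of MetricKnn.
 * sizes[i] is the number of objects in subdataset i, in concatenation order.
 * Writes the subdataset number and the position inside that subdataset
 * for the object at pos_object, or -1 in both outputs if pos_object is
 * past the last object. */
void locate_in_concatenation(const int64_t *sizes, int64_t num_subdatasets,
                             int64_t pos_object,
                             int64_t *out_num_subdataset,
                             int64_t *out_pos_in_subdataset) {
    assert(sizes != NULL && pos_object >= 0);
    int64_t first = 0; /* global position of the first object of subdataset i */
    for (int64_t i = 0; i < num_subdatasets; ++i) {
        if (pos_object < first + sizes[i]) {
            *out_num_subdataset = i;
            *out_pos_in_subdataset = pos_object - first;
            return;
        }
        first += sizes[i];
    }
    *out_num_subdataset = -1;
    *out_pos_in_subdataset = -1;
}
```

With subdataset sizes 3, 2 and 4, position 3 falls on the first object of subdataset 1, and position 8 on the last object of subdataset 2.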

int64_t mknn_dataset_concatenate_getNumSubDatasets (MknnDataset *concatenate_dataset)

Returns the number of subdatasets that produced this dataset.

Parameters
concatenate_dataset: a dataset created by mknn_datasetLoader_Concatenate.
Returns
the number of datasets

MknnDataset* mknn_dataset_concatenate_getSubDataset (MknnDataset *concatenate_dataset, int64_t num_subdataset)

Returns one of the subdatasets that produced this dataset.

Parameters
concatenate_dataset: a dataset created by mknn_datasetLoader_Concatenate.
num_subdataset
Returns
a subdataset.

void* mknn_dataset_custom_getDataPointer (MknnDataset *custom_dataset)

Returns the pointer to the object used during the creation of the dataset.

Parameters
custom_dataset: a dataset created by mknn_datasetLoader_Custom.
Returns
the pointer given in mknn_datasetLoader_Custom

bool mknn_dataset_get_free_domain_on_dataset_release (MknnDataset *dataset)

Parameters
dataset
Returns
free_domain_on_dataset_release

void* mknn_dataset_getCompactVectors (MknnDataset *dataset)

The objects in the dataset are stored in a single long array.

The format of the created data is similar to the input of mknn_datasetLoader_PointerCompactVectors.

The dataset must contain vectors (i.e., the domain of the dataset must belong to MKNN_GENERAL_DOMAIN_VECTOR).

The type of the returned array (float*, double*, ...) depends on the domain datatype.

The returned pointer is cached and released together with the dataset; therefore the array is not updated in dynamic datasets.

Parameters
dataset
Returns
a pointer to an array with all the vectors one after the other. The array is released by mknn_dataset_release (the caller must not free it).
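As a minimal sketch of what "all the vectors one after the other" implies for indexing, the helper below locates one vector inside such an array. It assumes the domain datatype is float and that the number of dimensions is already known to the caller (this page does not show how to query it). The name compact_vector_at is hypothetical and is not a MetricKnn function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of MetricKnn.
 * compact holds vectors of dims floats each, stored back to back, so
 * coordinate j of vector pos lives at index pos * dims + j.
 * Returns a pointer to the first coordinate of vector pos. */
const float *compact_vector_at(const float *compact, int64_t dims, int64_t pos) {
    assert(compact != NULL && dims > 0 && pos >= 0);
    return compact + pos * dims;
}
```

For an array returned by mknn_dataset_getCompactVectors, the caller would cast the void* result to the type matching the domain datatype before indexing, and would keep pos below mknn_dataset_getNumObjects.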

MknnDomain* mknn_dataset_getDomain (MknnDataset *dataset)

Returns the domain assigned to the dataset.

Do not modify or free the returned domain.

Parameters
dataset
Returns
the domain assigned to the dataset, or NULL if no domain has been assigned.

int64_t mknn_dataset_getNumObjects (MknnDataset *dataset)

Size of the dataset.

Parameters
dataset
Returns
the number of objects in dataset

void* mknn_dataset_getObject (MknnDataset *dataset, int64_t pos)

Retrieves the object at position pos in dataset.

Parameters
dataset
pos: the position of the object to retrieve. It must be a number between 0 and mknn_dataset_getNumObjects - 1.

int64_t mknn_dataset_multiobject_getNumSubDatasets (MknnDataset *multiobject_dataset)

Returns the number of subdatasets that produced this dataset.

Parameters
multiobject_dataset: a dataset created by mknn_datasetLoader_MultiObject.
Returns
the number of datasets

MknnDataset* mknn_dataset_multiobject_getSubDataset (MknnDataset *multiobject_dataset, int64_t num_subdataset)

Returns one of the subdatasets that produced this dataset.

Parameters
multiobject_dataset: a dataset created by mknn_datasetLoader_MultiObject.
num_subdataset
Returns
a subdataset.

void mknn_dataset_printObjectsRawFile (MknnDataset *dataset, const char *filename_write)

It prints the objects in the dataset in binary format, i.e., using fwrite to write memory addresses.

Parameters
dataset
filename_write: File to create. If the file already exists, it is overwritten.
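To give a sense of what a raw fwrite dump looks like in practice, the sketch below writes an array of floats as raw bytes and reads it back. It is a generic illustration of binary output with fwrite, not the file layout produced by mknn_dataset_printObjectsRawFile, which this page does not specify. The names dump_raw_floats and load_raw_floats are hypothetical. Such a file has no header, and its byte order and float representation are those of the machine that wrote it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT part of MetricKnn.
 * Writes n floats from data to filename as raw bytes.
 * Opening with "wb" overwrites an existing file.
 * Returns 0 on success and -1 on failure. */
int dump_raw_floats(const char *filename, const float *data, size_t n) {
    assert(filename != NULL && data != NULL);
    FILE *out = fopen(filename, "wb");
    if (out == NULL)
        return -1;
    size_t written = fwrite(data, sizeof(float), n, out);
    int close_failed = fclose(out);
    return (written == n && !close_failed) ? 0 : -1;
}

/* Hypothetical helper, NOT part of MetricKnn.
 * Reads up to n floats from filename into data.
 * Returns the number of floats actually read. */
size_t load_raw_floats(const char *filename, float *data, size_t n) {
    assert(filename != NULL && data != NULL);
    FILE *in = fopen(filename, "rb");
    if (in == NULL)
        return 0;
    size_t got = fread(data, sizeof(float), n, in);
    fclose(in);
    return got;
}
```

Because the file carries no metadata, the reader has to know the element type and count in advance, which is the main trade-off of a raw dump compared with a text format.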

void mknn_dataset_printObjectsTextFile (MknnDataset *dataset, const char *filename_write)

It prints the objects in the dataset in text format, i.e., converting them to string and using fprintf.

If the objects are vectors, the generated file can be parsed by mknn_datasetLoader_ParseVectorFile.

Parameters
dataset
filename_write: File to create. If the file already exists, it is overwritten.

void mknn_dataset_pushObject (MknnDataset *dataset, void *object)

Adds an object to a dataset.

The dataset must be dynamic in order to support this method (see mknn_datasetLoader_Empty).

Parameters
dataset: a dataset that supports adding objects.
object: the new object to add at the end of dataset.

void mknn_dataset_release (MknnDataset *dataset)

Releases the dataset.

Additionally, it may release other objects that were requested to be released with it (as in mknn_datasetLoader_PointerArray).

Parameters
dataset: the dataset to be released.

MknnDataset* mknn_dataset_restore (const char *filename_read)

Loads a dataset from a file.

The file must have been created by mknn_dataset_save.

Parameters
filename_read: File to read. If the file does not exist, an error is raised.
Returns
a new dataset (it must be released with mknn_dataset_release).

void mknn_dataset_save (MknnDataset *dataset, const char *filename_write)

The dataset is saved to a file.

It may create other files using filename_write as prefix. The created files may be in binary format for fast loading.

Parameters
dataset: the dataset to save
filename_write: File to create. If the file already exists, it is overwritten.

void mknn_dataset_set_free_domain_on_dataset_release (MknnDataset *dataset, bool free_domain_on_dataset_release)

Parameters
dataset
free_domain_on_dataset_release