This research aims to simplify the decision rules and reduce the computational load of structural image classification methods that describe an image as a set of keypoint descriptors. The main focus is on implementing the classifier and studying the properties of quantization tools for the basic set of descriptors of the etalon descriptions. As a result of quantization, the etalons and the analyzed image are each represented by a vector of numbers obtained by projection onto a compact set of defined centers. This approach enables the use of the metric apparatus and provides faster classification than the traditional voting method. Improvements to the method that introduce a threshold for assigning descriptors to the system’s centers are discussed. The gain in classification speed for the considered data processing schemes increases in proportion to the number of etalons. The paper presents the results of software modeling of the proposed approaches on an experimental base that includes images of historical monuments. The test sample consists of images from the etalon database and images from outside the database, with geometric transformations of shift, scale, and rotation in the field of view applied to them. The paper investigates the practical choice of threshold parameters, both for establishing the equivalence of descriptors and for the minimum number of class votes needed to ensure the required level of classification accuracy. Testing has confirmed a significant acceleration of the classification process and a sufficiently high level of classification accuracy when quantization and the Tanimoto metric are used. In particular, a tenfold increase in classification speed was achieved in the modeling performed. The peculiarities of forming an etalon base and of choosing the range of geometric transformation parameters for the proposed method were studied experimentally.
The considered methods are suited to applied computer vision tasks with strict requirements on the speed of visual object classification.
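The quantization scheme summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes real-valued descriptors compared by Euclidean distance, a fixed list of centers, and the Tanimoto similarity for non-negative vectors, T(a, b) = a·b / (|a|² + |b|² − a·b). The function names (`quantize`, `tanimoto`, `classify`), the toy two-dimensional descriptors, and the threshold value are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def quantize(descriptors, centers, threshold=None):
    """Project descriptors onto centers and return a vector of counts.

    counts[i] is the number of descriptors whose nearest center is i.
    If a threshold is given, a descriptor farther than the threshold
    from its nearest center is discarded (assumed rule, for illustration).
    """
    counts = [0] * len(centers)
    for d in descriptors:
        distances = [dist(d, c) for c in centers]
        nearest = min(range(len(centers)), key=distances.__getitem__)
        if threshold is None or distances[nearest] <= threshold:
            counts[nearest] += 1
    return counts


def tanimoto(a, b):
    """Tanimoto similarity of two non-negative vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0


def classify(image_descriptors, etalon_vectors, centers, threshold=None):
    """Return (index of the most similar etalon, its similarity)."""
    v = quantize(image_descriptors, centers, threshold)
    scores = [tanimoto(v, e) for e in etalon_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy data: three centers and two etalons, quantized once in advance.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
etalon_a = quantize([(0.5, 0.2), (0.1, -0.3), (9.8, 0.4)], centers)
etalon_b = quantize([(0.2, 9.7), (-0.4, 10.1), (10.3, 0.1)], centers)

# Query image: three descriptors near the centers and one outlier.
query = [(0.3, 0.1), (-0.2, 0.2), (10.1, -0.2), (50.0, 50.0)]
best, score = classify(query, [etalon_a, etalon_b], centers, threshold=2.0)
print(best, round(score, 3))
```

In this sketch each etalon is quantized once, so classifying a new image costs one quantization pass plus one vector comparison per etalon, whereas voting matches the image descriptors against the descriptors of every etalon. The threshold keeps the outlier descriptor from being counted toward any center.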
Gorokhovatskyi et al. studied this question.