CLUSTERING ALGORITHM BASED ON SIMILARITY OF OBJECTS
Abstract
This article addresses the problem of drug clustering. Initially, k classes are formed at random and the resulting training sample is preprocessed. The similarity between the objects of each class is then assessed using a proximity function and a criterion that measures each object's contribution to the formation of its own class. This similarity is usually expressed as a percentage and represents the degree of mutual similarity of the objects in each class. In the subsequent steps of the algorithm, an object is taken from the first class and added in turn to each of the k classes, and its contribution to each class is measured. The object is then assigned to the class to which it contributes the most. This procedure is repeated over several passes for all objects of each class. The process stops when the assignment of objects no longer changes and the degree of similarity exceeds the required percentage. The resulting classes are the required clusters.
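
The reassignment loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the proximity function or the contribution criterion, so the sketch assumes a proximity of 1 / (1 + Euclidean distance) and takes an object's contribution to a class to be its mean proximity to the other members of that class. The names `similarity`, `contribution`, `class_similarity_pct`, `cluster`, `required_pct`, `max_passes` and `init` are hypothetical, and the preprocessing step is omitted.

```python
import math
import random


def similarity(a, b):
    # Assumed proximity function (not from the source): maps distance into (0, 1].
    return 1.0 / (1.0 + math.dist(a, b))


def contribution(i, members, data):
    # Assumed criterion: mean proximity of object i to the other class members.
    others = [j for j in members if j != i]
    if not others:
        return 0.0
    return sum(similarity(data[i], data[j]) for j in others) / len(others)


def class_similarity_pct(labels, data, k):
    # Degree of mutual similarity of each class, expressed as a percentage.
    result = []
    for c in range(k):
        members = [j for j, lab in enumerate(labels) if lab == c]
        if len(members) < 2:
            result.append(100.0)  # convention chosen here for tiny classes
            continue
        mean = sum(contribution(i, members, data) for i in members) / len(members)
        result.append(100.0 * mean)
    return result


def cluster(data, k, required_pct, max_passes=100, seed=0, init=None):
    rng = random.Random(seed)
    # Step 1: form k classes at random (or start from a given assignment).
    labels = list(init) if init is not None else [rng.randrange(k) for _ in data]
    for _ in range(max_passes):
        moved = False
        for i in range(len(data)):
            # Step 2: measure the contribution of object i to every class.
            scores = []
            for c in range(k):
                members = [j for j, lab in enumerate(labels) if lab == c]
                scores.append(contribution(i, members, data))
            best = max(range(k), key=scores.__getitem__)
            # Step 3: move the object only if another class scores strictly higher.
            if scores[best] > scores[labels[i]]:
                labels[i] = best
                moved = True
        if not moved:
            break  # the assignment is stable
    # Step 4: report whether every class reaches the required similarity.
    ok = min(class_similarity_pct(labels, data, k)) >= required_pct
    return labels, ok


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, ok = cluster(points, k=2, required_pct=40.0)
    print(labels, ok)
```

The abstract does not say what happens when the assignment is stable but the similarity falls short of the required percentage, so this sketch simply reports that case through the returned flag. The optional `init` argument exists only to make runs reproducible.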