Algorithm selection for the data clustering task: a meta-learning approach

Ferrari, Daniel Gomes
Silva, Leandro Nunes de Castro
Examination committee members
Omar, Nizam
Silva, Leandro Augusto da
Carvalho, André Carlos Ponce de Leon Ferreira de
Medeiros, Claudia Maria Bauzer
Electrical Engineering
Data clustering is an important data mining task that aims to segment a database into groups of objects based on their similarity or dissimilarity. Because clustering is unsupervised, the search for a good-quality solution can become a complex process. A wide range of clustering algorithms is currently available, and selecting the most suitable one for a given problem can be slow and costly. In 1976, Rice formulated the algorithm selection problem (ASP), postulating that a well-performing algorithm can be chosen according to the problem's structural characteristics. Meta-learning brings the concept of learning about learning: the meta-knowledge obtained from the algorithms' learning process makes it possible to improve their performance. Meta-learning has a major intersection with data mining in classification problems, where it is used to select algorithms. This thesis proposes an approach to the algorithm selection problem that uses meta-learning techniques for clustering. The 84 problems are characterized by a classical approach, based on the problems, and by a new proposal based on the similarity among the objects. Ten internal indices provide different performance assessments of seven algorithms, and the combination of the indices determines the ranking of the algorithms. Several analyses are performed to assess how well the obtained meta-knowledge supports the mapping between the problem's features and the performance of the algorithms. The results show that the new characterization approach and the method for combining the indices provide a good-quality algorithm selection mechanism for data clustering problems.
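The abstract does not state how the ten internal indices are combined, so the following is only a minimal sketch of one common option, average-rank aggregation, and should not be read as the thesis's method. The index names (silhouette, Davies-Bouldin, Dunn), the algorithm names, and all score values are illustrative assumptions; the helper functions `rank_scores` and `combine_indices` are hypothetical.

```python
from statistics import mean


def rank_scores(scores, higher_is_better=True):
    """Return {algorithm: rank} with rank 1 = best; ties share the average rank."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j over the run of algorithms tied with position i.
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank
        i = j + 1
    return ranks


def combine_indices(index_scores, higher_is_better):
    """Average the per-index ranks into a single ranking of algorithms.

    Assumption: plain average-rank aggregation, chosen for illustration only.
    """
    per_index = [
        rank_scores(scores, higher_is_better[name])
        for name, scores in index_scores.items()
    ]
    algorithms = list(per_index[0])
    mean_rank = {a: mean(r[a] for r in per_index) for a in algorithms}
    ranking = sorted(mean_rank, key=mean_rank.get)
    return ranking, mean_rank


# Invented toy scores for one clustering problem (not data from the thesis).
index_scores = {
    "silhouette": {"k-means": 0.62, "single-link": 0.40, "dbscan": 0.55},
    "davies_bouldin": {"k-means": 0.80, "single-link": 1.50, "dbscan": 0.70},
    "dunn": {"k-means": 0.30, "single-link": 0.10, "dbscan": 0.20},
}
# Direction of each index: Davies-Bouldin is better when lower.
higher_is_better = {"silhouette": True, "davies_bouldin": False, "dunn": True}

ranking, mean_rank = combine_indices(index_scores, higher_is_better)
print(ranking)
```

In a meta-learning setting, a ranking of this kind would serve as the target associated with each problem's characterization, so that a meta-learner can map problem features to the expected ordering of algorithms.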
data clustering, meta-learning, meta-knowledge, algorithm selection
FERRARI, Daniel Gomes. Seleção de algoritmos para a tarefa de agrupamento de dados: uma abordagem via meta-aprendizagem. 2014. 204 f. Tese (Doutorado em Engenharia Elétrica) - Universidade Presbiteriana Mackenzie, São Paulo, 2014.