Electrical and Computer Engineering - Theses - EE Higienópolis
Browsing Electrical and Computer Engineering - Theses - EE Higienópolis by Subject "algorithm selection"
- Thesis: Seleção de algoritmos para a tarefa de agrupamento de dados: uma abordagem via meta-aprendizagem [Algorithm selection for the data clustering task: a meta-learning approach]. Ferrari, Daniel Gomes (2014-03-27)
Electrical Engineering
Data clustering is an important data mining task that aims to segment a database into groups of objects based on their similarity or dissimilarity. Because clustering is unsupervised, the search for a good-quality solution can become a complex process. A wide range of clustering algorithms is currently available, and selecting the most suitable one for a given problem can be slow and costly. In 1976, Rice formulated the algorithm selection problem (PSA), postulating that a well-performing algorithm can be chosen according to the problem's structural characteristics. Meta-learning introduces the concept of learning about learning: the meta-knowledge obtained from an algorithm's learning process can be used to improve its performance. Meta-learning has a major intersection with data mining in classification problems, where it is used to select algorithms. This thesis proposes an approach to the algorithm selection problem that applies meta-learning techniques to clustering. The 84 problems are characterized in two ways: by a classical approach based on the problems themselves, and by a new proposal based on the similarity among the objects. Ten internal indices are used to provide different performance assessments of seven algorithms, and the combination of the indices determines the ranking of the algorithms. Several analyses are performed to assess how well the obtained meta-knowledge supports the mapping between the problem's features and the performance of the algorithms. The results show that the new characterization approach and the method for combining the indices provide a good-quality algorithm selection mechanism for data clustering problems.
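The abstract does not say how the ten internal indices are combined into a single ranking. As a minimal sketch of one common option, the example below ranks the algorithms under each index and then averages those ranks. Average-rank aggregation is an assumption here, not the thesis's stated method. The index names, algorithm names, and scores are hypothetical placeholders rather than the ten indices, seven algorithms, or results reported in the thesis.

```python
from statistics import mean


def rank_scores(scores, higher_is_better=True):
    """Return 1-based ranks (1 = best); tied algorithms share the average rank."""
    order = sorted(scores, key=scores.get, reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(order):
        # Extend j over the run of algorithms tied with the one at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for name in order[i:j + 1]:
            ranks[name] = shared_rank
        i = j + 1
    return ranks


def combine_indices(index_scores, higher_is_better):
    """Average the per-index ranks of each algorithm; return (name, rank) best first."""
    per_index = [
        rank_scores(scores, higher_is_better[index])
        for index, scores in index_scores.items()
    ]
    algorithms = list(next(iter(index_scores.values())))
    average_rank = {a: mean(r[a] for r in per_index) for a in algorithms}
    return sorted(average_rank.items(), key=lambda item: item[1])


# Hypothetical scores for one clustering problem (illustrative values only).
index_scores = {
    "silhouette": {"k-means": 0.62, "single-link": 0.40, "dbscan": 0.55},
    "davies_bouldin": {"k-means": 0.80, "single-link": 1.30, "dbscan": 0.70},
    "dunn": {"k-means": 0.30, "single-link": 0.10, "dbscan": 0.20},
}
# Some internal indices are maximized and others minimized.
higher_is_better = {"silhouette": True, "davies_bouldin": False, "dunn": True}

ranking = combine_indices(index_scores, higher_is_better)
for name, avg_rank in ranking:
    print(f"{name}: average rank {avg_rank:.2f}")
```

In a meta-learning setting, a ranking like this one, computed per problem, would serve as the target that a meta-learner tries to predict from the problem's meta-features.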