A comparison of dimensionality reduction methods using topology preservation indexes

De Medeiros C.J.F.; Costa J.A.F.; Silva L.A.

Type

Conference paper

Publication date

2011

Journal

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Citations (Scopus)

3

Authors

De Medeiros C.J.F.
Costa J.A.F.
Silva L.A.

Abstract

Due to the remarkable technological developments of recent decades, the vast amount of data has created new opportunities and challenges in the field of knowledge discovery and data mining. Factors such as the size and high dimensionality of databases add difficulty to the complex task of discovering patterns hidden in masses of data. The feasibility of high-dimensional data exploration depends on techniques known as dimensionality reduction methods. When class labels are available, an optimization function can be used to maximize intra-class cohesion and inter-class separation. However, in many practical situations class information is not available. This paper focuses on unsupervised dimensionality reduction techniques, an important phase in exploratory data analysis. Six important methods are described: principal components analysis, Sammon projection, auto-associative neural networks, Kohonen maps, Isomap, and Locally Linear Embedding. Three quality indexes are proposed to quantify, to some degree, the topology preservation between input and output spaces. Comparisons are performed using benchmark data sets. Results and tests focus on two-dimensional projections for data visualization purposes. © 2011 Springer-Verlag.

Scopus subjects

Autoassociative neural networks, Benchmark data, Class labels, Class separation, Complex task, Dimensionality reduction, Dimensionality reduction method, Dimensionality reduction techniques, Exploratory data analysis, High dimensional data, High dimensionality, Input and outputs, Knowledge discovery and data mining, Locally linear embedding, Optimization function, Principal components analysis, Projections, Quality indices, Technological development, Topology preservation, Two-dimensional projection, Unsupervised method