Data quality measurement framework
Type
Conference paper
Publication date
2018
Journal
Proceedings - 2018 44th Latin American Computing Conference, CLEI 2018
Citations (Scopus)
1
Authors
Fereira M.
Silva L.A.
Abstract
© 2018 IEEE. Data quality evaluation is fundamental to Knowledge Data Discovery projects. Some project frameworks, such as CRISP-DM and DAMA DMBOK, recommend preparing a Data Quality Report as a tool to describe the problems found during the data exploration phase and an approach to fixing them. However, the guidelines in those frameworks are very generic: they specify neither what exactly should be measured nor how to associate any measure with data quality. Data profiling tools, as well as some ETL (Extraction, Transformation and Loading) tools, implement basic statistical description tooling, but they do not propose any general methodology for quantitatively evaluating the quality of a data set, with the possible exception of the IBM Watson Analytics tool. This article proposes a quantitative measure for data quality evaluation based on statistical description tools.
Scopus subjects
Analytics tools, Data exploration, Data governances, Data profiling, Data quality, Preprocessing, Quantitative measures, Statistical descriptions