A keyword extraction method from twitter messages represented as graphs

dc.contributor.authorAbilhoa W.D.
dc.contributor.authorDe Castro L.N.
dc.date.accessioned2024-03-13T01:00:22Z
dc.date.available2024-03-13T01:00:22Z
dc.date.issued2014
dc.description.abstractTwitter is a microblog service that generates a huge amount of textual content daily. All this content needs to be explored by means of text mining, natural language processing, information retrieval, and other techniques. In this context, automatic keyword extraction is a highly useful task. A fundamental step in text mining consists of building a model for text representation. The vector space model (VSM) is the best known and most widely used of these models. However, some difficulties and limitations of VSM, such as scalability and sparsity, motivate the proposal of alternative approaches. This paper proposes a keyword extraction method for tweet collections that represents texts as graphs and applies centrality measures to find the relevant vertices (keywords). To assess the performance of the proposed approach, three sets of experiments are performed. The first experiment applies the proposed method, TKG, to a text from Time magazine and compares its performance with results reported in the literature. The second set of experiments takes tweets from three different TV shows, applies TKG, and compares it with TF-IDF and KEA, using human classifications as benchmarks. Finally, these three algorithms are applied to tweet sets of increasing size, and their running times are measured and compared. Altogether, these experiments provide a general overview of how TKG can be used in practice, how it performs compared with other standard approaches, and how it scales to larger data instances. The results show that TKG is a novel and robust approach for extracting keywords from texts, particularly from short messages such as tweets. © 2014 Elsevier Inc. All rights reserved.
dc.description.firstpage308
dc.description.lastpage325
dc.description.volume240
dc.identifier.doi10.1016/j.amc.2014.04.090
dc.identifier.issn0096-3003
dc.identifier.urihttps://dspace.mackenzie.br/handle/10899/36375
dc.relation.ispartofApplied Mathematics and Computation
dc.rightsRestricted Access
dc.subject.otherlanguageCentrality measures
dc.subject.otherlanguageGraph theory
dc.subject.otherlanguageKeyword extraction
dc.subject.otherlanguageKnowledge discovery
dc.subject.otherlanguageText mining
dc.subject.otherlanguageTwitter data
dc.titleA keyword extraction method from twitter messages represented as graphs
dc.typeArticle
local.scopus.citations97
local.scopus.eid2-s2.0-84901273966
local.scopus.subjectCentrality measures
local.scopus.subjectKeyword extraction
local.scopus.subjectNatural language processing
local.scopus.subjectText mining
local.scopus.subjectText mining techniques
local.scopus.subjectText representation
local.scopus.subjectTwitter data
local.scopus.subjectVector space models
local.scopus.updated2024-05-01
local.scopus.urlhttps://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84901273966&origin=inward
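
The abstract describes the method only in outline: tweets are represented as a graph, and centrality measures rank the vertices as keyword candidates. The Python sketch below illustrates that general idea and is not the authors' TKG implementation. Several choices are assumptions not stated in this record: the tokenizer, the small `STOPWORDS` list, linking only adjacent tokens with unweighted edges, and ranking by degree centrality. The function names `tokenize`, `build_graph`, and `top_keywords` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stopword list, for illustration only (an assumption).
STOPWORDS = {"a", "an", "the", "is", "of", "to", "and", "in", "on", "for"}


def tokenize(text):
    """Lowercase the text, keep word tokens (with optional # or @ prefix),
    and drop stopwords."""
    tokens = re.findall(r"[#@]?\w+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_graph(tweets):
    """Build an undirected, unweighted graph: one vertex per token and
    one edge per pair of adjacent tokens within a tweet (an assumption)."""
    graph = defaultdict(set)
    for tweet in tweets:
        tokens = tokenize(tweet)
        for left, right in zip(tokens, tokens[1:]):
            if left != right:  # skip self-loops
                graph[left].add(right)
                graph[right].add(left)
    return graph


def top_keywords(tweets, k=3):
    """Rank vertices by degree centrality and return the k highest.
    Ties are broken alphabetically so the result is deterministic."""
    graph = build_graph(tweets)
    n = len(graph)
    if n < 2:
        return []
    centrality = {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}
    ranked = sorted(centrality, key=lambda v: (-centrality[v], v))
    return ranked[:k]


tweets = [
    "the finale of the show was great",
    "great show tonight",
    "show finale tonight",
]
print(top_keywords(tweets, k=1))
```

Degree centrality is used here only because it is the simplest centrality measure to compute without external libraries. Other measures, such as closeness, could be substituted by changing how `centrality` is computed.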