License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2014.275
URN: urn:nbn:de:0030-drops-45763

Brogueira, Gaspar ; Batista, Fernando ; Carvalho, João Paulo ; Moniz, Helena

Expanding a Database of Portuguese Tweets


This paper describes an existing database of geolocated tweets produced in Portuguese regions and proposes an approach to expand it further. The existing database covers eight consecutive days of collected tweets, totaling about 300 thousand tweets produced by about 11 thousand different users. A detailed analysis of the message content suggests a predominance of young authors who use Twitter to share their feelings, ideas and comments with their colleagues. To further characterize this community of young people, we propose a method for retrieving additional tweets produced by the same set of authors already in the database. Our goal is to extend the knowledge about each user of this community, making it possible to automatically characterize each user by the content he/she produces, to cluster users, and to open other possibilities in the scope of social analysis.
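
The expansion step outlined in the abstract (retrieving more tweets from authors already in the database) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method from the paper: the record shape (`id`, `user`, `text`) and the function names `expand_database` and `fetch_user_tweets` are hypothetical, and the retrieval function is passed in as a parameter so that no particular Twitter API call is assumed.

```python
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical record shape: {"id": int, "user": str, "text": str}
Tweet = Dict[str, Any]


def expand_database(
    database: List[Tweet],
    fetch_user_tweets: Callable[[str], Iterable[Tweet]],
) -> List[Tweet]:
    """Return the database plus any new tweets by authors already in it.

    `fetch_user_tweets` is an assumed callable that returns the tweets
    available for one author; how it talks to Twitter is left open.
    """
    # Tweet ids already stored, used to skip duplicates.
    known_ids = {tweet["id"] for tweet in database}
    # Distinct authors, sorted so the result order is reproducible.
    authors = sorted({tweet["user"] for tweet in database})

    expanded = list(database)  # copy: the input list is left unchanged
    for author in authors:
        for tweet in fetch_user_tweets(author):
            if tweet["id"] not in known_ids:
                known_ids.add(tweet["id"])
                expanded.append(tweet)
    return expanded
```

Keeping the retrieval function as a parameter means the same loop can be exercised with a stub during testing and with a real client later; deduplicating by tweet id avoids storing a tweet twice when a retrieved timeline overlaps the original collection period.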

BibTeX - Entry

  author =	{Gaspar Brogueira and Fernando Batista and Jo{\~a}o Paulo Carvalho and Helena Moniz},
  title =	{{Expanding a Database of Portuguese Tweets}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{275--282},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Maria Jo{\~a}o Varanda Pereira and Jos{\'e} Paulo Leal and Alberto Sim{\~o}es},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-45763},
  doi =		{10.4230/OASIcs.SLATE.2014.275},
  annote =	{Keywords: Twitter, corpus of Portuguese tweets, Twitter API, natural language processing, text analysis}

Keywords: Twitter, corpus of Portuguese tweets, Twitter API, natural language processing, text analysis
Collection: 3rd Symposium on Languages, Applications and Technologies
Issue Date: 2014
Date of publication: 18.06.2014
