License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SCOR.2014.16
URN: urn:nbn:de:0030-drops-46663
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2014/4666/


Stutzki, Jan

Multilingual Trend Detection in the Web



Abstract

This paper presents results from our ongoing research project in the foresight area. The goal of the project is to develop web-based tools that automatically detect activity and trends for given keywords. This knowledge enables decision makers to react proactively to emerging challenges.

At present we can detect trends worldwide in more than 60 languages and assign them to more than 100 nation states. To achieve this we rely on the major search engines, whose core competence is determining the relevance of a document to a search query. The search engines allow the results to be sliced by language and country.
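The slicing by language and country can be pictured as a grid of queries, one per keyword, language, and country combination. The following is a minimal sketch of that idea only; `QuerySlice` and `build_slices` are hypothetical names, the use of ISO codes is an assumption, and no particular search engine API is implied.

```python
from itertools import product
from typing import Iterable, Iterator, NamedTuple


class QuerySlice(NamedTuple):
    """One search request: a keyword restricted to a language and a country."""
    keyword: str
    language: str  # assumed to be an ISO 639-1 code, e.g. "de"
    country: str   # assumed to be an ISO 3166-1 alpha-2 code, e.g. "DE"


def build_slices(keywords: Iterable[str],
                 languages: Iterable[str],
                 countries: Iterable[str]) -> Iterator[QuerySlice]:
    """Yield every (keyword, language, country) combination to be queried."""
    for keyword, language, country in product(keywords, languages, countries):
        yield QuerySlice(keyword, language, country)


if __name__ == "__main__":
    for query_slice in build_slices(["drought"], ["en", "de"], ["US", "DE", "AT"]):
        print(query_slice)
```

With 60 languages and 100 countries, a single keyword already produces 6000 slices, which illustrates why the number of requests grows quickly.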

In the next step we download some of the proposed documents for analysis. The amount of information required takes us into the field of Big Data, so extra effort is made to ensure the scalability of the application.

We introduce a new approach to activity and trend detection that combines the data collection and detection methods. To detect trends in the gathered data we use data mining methods that are independent of the language a document is written in. The input to these methods is the text of the downloaded documents and a specially prepared index structure containing metadata and other information that accumulates during the collection of the documents.
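One way to stay independent of the document language is to work only on counts derived from the collection metadata. The sketch below is an illustration under that assumption, not the method used in the paper: it flags a period as a trend when the number of collected documents exceeds the trailing mean by more than a chosen number of standard deviations. The function name `detect_trends` and the parameters `window`, `threshold`, and `min_sigma` are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def detect_trends(counts: Sequence[int],
                  window: int = 4,
                  threshold: float = 2.0,
                  min_sigma: float = 1.0) -> List[int]:
    """Return the indices of periods whose document count is unusually high.

    counts    -- documents collected per period for one keyword/language/country
    window    -- number of preceding periods used as the baseline (at least 2)
    threshold -- how many standard deviations above the baseline counts as a trend
    min_sigma -- lower bound on the deviation, so a flat baseline does not
                 turn every small increase into a trend
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        sigma = max(stdev(history), min_sigma)
        if (counts[i] - baseline) / sigma > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    weekly_counts = [10, 12, 11, 13, 12, 40, 14, 13]
    print(detect_trends(weekly_counts))
```

Because the input is a plain count series, the same function can be applied to every language and country slice without any language-specific text processing.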

We show that we can reliably detect trends and activities in highly active topics and discuss future research.

BibTeX - Entry

@InProceedings{stutzki:OASIcs:2014:4666,
  author =	{Jan Stutzki},
  title =	{{Multilingual Trend Detection in the Web}},
  booktitle =	{4th Student Conference on Operational Research},
  pages =	{16--24},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-67-5},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{37},
  editor =	{Pedro Crespo Del Granado and Martim Joyce-Moniz and Stefan Ravizza},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2014/4666},
  URN =		{urn:nbn:de:0030-drops-46663},
  doi =		{10.4230/OASIcs.SCOR.2014.16},
  annote =	{Keywords: Information Retrieval, Web Mining, Trend Detection}
}

Keywords: Information Retrieval, Web Mining, Trend Detection
Collection: 4th Student Conference on Operational Research
Issue Date: 2014
Date of publication: 06.08.2014

