License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICDT.2017.22
URN: urn:nbn:de:0030-drops-70620
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7062/
McGregor, Andrew ;
Vu, Hoa T.
Better Streaming Algorithms for the Maximum Coverage Problem
Abstract
We study the classic NP-Hard problem of finding the maximum k-set coverage in the data stream model: given a set system of m sets that are subsets of a universe {1,...,n}, find the k sets that cover the most number of distinct elements. The problem can be approximated up to a factor 1-1/e in polynomial time. In the streaming-set model, the sets and their elements are revealed online. The main goal of our work is to design algorithms, with approximation guarantees as close as possible to 1-1/e, that use sublinear space o(mn). Our main results are: 1) Two (1-1/e-epsilon) approximation algorithms: One uses O(1/epsilon) passes and O(k/epsilon^2 polylog(m,n)) space whereas the other uses only a single pass but O(m/epsilon^2 polylog(m,n)) space. 2) We show that any approximation factor better than (1-(1-1/k)^k) in constant passes require space that is linear in m for constant k even if the algorithm is allowed unbounded processing time. We also demonstrate a single-pass, (1-epsilon) approximation algorithm using O(m/epsilon^2 min(k,1/epsilon) polylog(m,n)) space.
We also study the maximum k-vertex coverage problem in the dynamic graph stream model. In this model, the stream consists of edge insertions and deletions of a graph on N vertices. The goal is to find k vertices that cover the most number of distinct edges. We show that any constant approximation in constant passes requires space that is linear in N for constant k whereas O(N/epsilon^2 polylog(m,n)) space is sufficient for a (1-epsilon) approximation and arbitrary k in a single pass. For regular graphs, we show that O(k/epsilon^3 polylog(m,n)) space is sufficient for a (1-epsilon) approximation in a single pass. We generalize this to a K-epsilon approximation when the ratio between the minimum and maximum degree is bounded below by K.
BibTeX - Entry
@InProceedings{mcgregor_et_al:LIPIcs:2017:7062,
author = {Andrew McGregor and Hoa T. Vu},
title = {{Better Streaming Algorithms for the Maximum Coverage Problem}},
booktitle = {20th International Conference on Database Theory (ICDT 2017)},
pages = {22:1--22:18},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-024-8},
ISSN = {1868-8969},
year = {2017},
volume = {68},
editor = {Michael Benedikt and Giorgio Orsi},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/7062},
URN = {urn:nbn:de:0030-drops-70620},
doi = {10.4230/LIPIcs.ICDT.2017.22},
annote = {Keywords: algorithms, data streams, approximation, maximum coverage}
}
Keywords: |
|
algorithms, data streams, approximation, maximum coverage |
Collection: |
|
20th International Conference on Database Theory (ICDT 2017) |
Issue Date: |
|
2017 |
Date of publication: |
|
17.03.2017 |