License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICDT.2022.7
URN: urn:nbn:de:0030-drops-158812
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2022/15881/
Go to the corresponding LIPIcs Volume Portal


Addanki, Raghavendra ; McGregor, Andrew ; Meliou, Alexandra ; Moumoulidou, Zafeiria

Improved Approximation and Scalability for Fair Max-Min Diversification

pdf-format:
LIPIcs-ICDT-2022-7.pdf (0.9 MB)


Abstract

Given an n-point metric space ({?},d) where each point belongs to one of m = O(1) different categories or groups and a set of integers k₁, …, k_m, the fair Max-Min diversification problem is to select k_i points belonging to category i ∈ [m], such that the minimum pairwise distance between selected points is maximized. The problem was introduced by Moumoulidou et al. [ICDT 2021] and is motivated by the need to down-sample large data sets in various applications so that the derived sample achieves a balance over diversity, i.e., the minimum distance between a pair of selected points, and fairness, i.e., ensuring enough points of each category are included. We prove the following results:
1) We first consider general metric spaces. We present a randomized polynomial time algorithm that returns a factor 2-approximation to the diversity but only satisfies the fairness constraints in expectation. Building upon this result, we present a 6-approximation that is guaranteed to satisfy the fairness constraints up to a factor 1-ε for any constant ε. We also present a linear time algorithm returning an m+1 approximation with exact fairness. The best previous result was a 3m-1 approximation.
2) We then focus on Euclidean metrics. We first show that the problem can be solved exactly in one dimension. {For constant dimensions, categories and any constant ε > 0, we present a 1+ε approximation algorithm that runs in O(nk) + 2^{O(k)} time where k = k₁+…+k_m.} We can improve the running time to O(nk)+poly(k) at the expense of only picking (1-ε) k_i points from category i ∈ [m].
Finally, we present algorithms suitable to processing massive data sets including single-pass data stream algorithms and composable coresets for the distributed processing.

BibTeX - Entry

@InProceedings{addanki_et_al:LIPIcs.ICDT.2022.7,
  author =	{Addanki, Raghavendra and McGregor, Andrew and Meliou, Alexandra and Moumoulidou, Zafeiria},
  title =	{{Improved Approximation and Scalability for Fair Max-Min Diversification}},
  booktitle =	{25th International Conference on Database Theory (ICDT 2022)},
  pages =	{7:1--7:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-223-5},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{220},
  editor =	{Olteanu, Dan and Vortmeier, Nils},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2022/15881},
  URN =		{urn:nbn:de:0030-drops-158812},
  doi =		{10.4230/LIPIcs.ICDT.2022.7},
  annote =	{Keywords: algorithmic fairness, diversity maximization, data selection, approximation algorithms}
}

Keywords: algorithmic fairness, diversity maximization, data selection, approximation algorithms
Collection: 25th International Conference on Database Theory (ICDT 2022)
Issue Date: 2022
Date of publication: 19.03.2022
Supplementary Material: Audiovisual (Video of the Presentation): https://doi.org/10.5446/57487


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI