Abstract
Resolution parameters in graph clustering control the size and structure of clusters formed by solving a parametric objective function. Typically there is more than one meaningful way to cluster a graph, and solving the same objective function for different resolution parameters produces clusterings at different levels of granularity, each of which can be meaningful depending on the application. In this paper, we address the task of efficiently solving a parameterized graph clustering objective for all values of a resolution parameter. Specifically, we consider a new analysisfriendly objective we call LambdaPrime, involving a parameter λ ∈ (0,1). LambdaPrime is an adaptation of LambdaCC, a significant family of instances of the Correlation Clustering (minimization) problem. Indeed, LambdaPrime and LambdaCC are closely related to other parameterized clustering problems, such as parametric generalizations of modularity. They capture a number of specific clustering problems as special cases, including sparsest cut and cluster deletion. While previous work provides approximation results for a single value of the resolution parameter, we seek a set of approximately optimal clusterings for all values of λ in polynomial time.
More specifically, we show that when a graph has m edges and n nodes, there exists a set of at most m clusterings such that, for every λ ∈ (0,1), the family contains an optimal solution to the LambdaPrime objective. This bound is tight on star graphs. We obtain a family of O(log n) clusterings by solving the parametric linear programming (LP) relaxation of LambdaPrime at O(log n) λ values, and rounding each LP solution using existing approximation algorithms. We prove that this is asymptotically tight: for a certain class of ring graphs, for all values of λ, Ω(log n) feasible solutions are required to provide a constantfactor approximation for the LambdaPrime LP relaxation. To minimize the size of the clustering family, we further propose an algorithm that yields a family of solutions of a size no more than twice of the minimum LPapproximating family.
BibTeX  Entry
@InProceedings{gan_et_al:LIPIcs:2020:12706,
author = {Junhao Gan and David F. Gleich and Nate Veldt and Anthony Wirth and Xin Zhang},
title = {{Graph Clustering in All Parameter Regimes}},
booktitle = {45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020)},
pages = {39:139:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959771597},
ISSN = {18688969},
year = {2020},
volume = {170},
editor = {Javier Esparza and Daniel Kr{\'a}ľ},
publisher = {Schloss DagstuhlLeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12706},
URN = {urn:nbn:de:0030drops127065},
doi = {10.4230/LIPIcs.MFCS.2020.39},
annote = {Keywords: Graph Clustering, Algorithms, Parametric Linear Programming}
}
Keywords: 

Graph Clustering, Algorithms, Parametric Linear Programming 
Collection: 

45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020) 
Issue Date: 

2020 
Date of publication: 

18.08.2020 