License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CP.2022.27
URN: urn:nbn:de:0030-drops-166564
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2022/16656/
Hidouri, Amel ;
Jabbour, Said ;
Raddaoui, Badran
On the Enumeration of Frequent High Utility Itemsets: A Symbolic AI Approach
Abstract
Mining interesting patterns from data is a core part of the data mining world. High utility mining, an active research topic in data mining, aims to discover valuable itemsets with high profit (e.g., cost, risk). However, the measure of interest of an itemset must primarily reflect not only the importance of items in terms of profit, but also their occurrence in data in order to make more crucial decisions. Some proposals are then introduced to deal with the problem of computing high utility itemsets that meet a minimum support threshold. However, in these existing proposals, all transactions in which the itemset appears are taken into account, including those in which the itemset has a low profit. So, no additional information about the overall utility of the itemset is taken into account. This paper addresses this issue by introducing a SAT-based model to efficiently find the set of all frequent high utility itemsets with the use of a minimum utility threshold applied to each transaction in which the itemset appears. More specifically, we reduce the problem of mining frequent high utility itemsets to the one of enumerating the models of a formula in propositional logic, and then we use state-of-the-art SAT solvers to solve it. Afterwards, to make our approach more efficient, we provide a decomposition technique that is particularly suitable for deriving smaller and independent sub-problems easy to resolve. Finally, an extensive experimental evaluation on various popular datasets shows that our method is fast and scale well compared to the state-of-the art algorithms.
BibTeX - Entry
@InProceedings{hidouri_et_al:LIPIcs.CP.2022.27,
author = {Hidouri, Amel and Jabbour, Said and Raddaoui, Badran},
title = {{On the Enumeration of Frequent High Utility Itemsets: A Symbolic AI Approach}},
booktitle = {28th International Conference on Principles and Practice of Constraint Programming (CP 2022)},
pages = {27:1--27:17},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-240-2},
ISSN = {1868-8969},
year = {2022},
volume = {235},
editor = {Solnon, Christine},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2022/16656},
URN = {urn:nbn:de:0030-drops-166564},
doi = {10.4230/LIPIcs.CP.2022.27},
annote = {Keywords: Data Mining, High Utility Itemsets, Propositional Satisfiability}
}
Keywords: |
|
Data Mining, High Utility Itemsets, Propositional Satisfiability |
Collection: |
|
28th International Conference on Principles and Practice of Constraint Programming (CP 2022) |
Issue Date: |
|
2022 |
Date of publication: |
|
23.07.2022 |