Effective Keyword Search Over Weighted Graphs


Kargar M., Golab L., Srivastava D., Szlichta J., Zihayat M.

IEEE Transactions on Knowledge and Data Engineering, vol.34, no.2, pp.601-616, 2022 (SCI-Expanded) identifier identifier

  • Publication Type: Article / Article
  • Volume: 34 Issue: 2
  • Publication Date: 2022
  • Doi Number: 10.1109/tkde.2020.2985376
  • Journal Name: IEEE Transactions on Knowledge and Data Engineering
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, Civil Engineering Abstracts
  • Page Numbers: pp.601-616
  • Keywords: Proteins, Companies, Databases, Keyword search, Security, Uncertainty, Approximation algorithms, Graph databases, keyword search, approximation algorithms
  • TED University Affiliated: No

Abstract

© 1989-2012 IEEE.Real graphs often contain edge and node weights, representing, for instance, penalty, distance or uncertainty. We study the problem of keyword search over weighted node-labeled graphs, in which a query consists of a set of keywords and an answer is a subgraph whose nodes contain the keywords. We evaluate answers using three ranking strategies: optimizing edge weights, optimizing node weights, and a bi-objective combination of both node and edge weights. We prove that optimizing node weights and the bi-objective function are NP-hard. We propose an algorithm that optimizes edge weights and has an approximation ratio of two for the unique node enumeration paradigm. To optimize node weights and the bi-objective function, we propose transformations that distribute node weights onto the edges. We then prove that our transformations allow our algorithm to also optimize node weights and the bi-objective function with the same approximation ratio of two. Notably, the proposed transformations are compatible with existing algorithms that only optimize edge weights. We empirically show that in many natural examples, incorporating node weights (both keyword holders and middle nodes) produces more relevant answers than ranking methods based only on edge weights. Extensive experiments over real-life datasets verify the effectiveness and efficiency of our solution.