Sub-Query Fragmentation Technique for Distributed Data Caching

Kuppili Venkata, S., Musial, K., and Keppens, J.

IEEE Transactions on Parallel and Distributed Systems ?:.

March 2018

Abstract

Scientific and technological advances have made it possible to create and manage very large, distributed databases. Groups of researchers from all over the world collaborate on projects and use the data from those databases in their collective work, querying multiple databases across the globe. Their queries are often observed to overlap, at least partially. Current distributed cache systems fail to recognise and analyse partial query fragments and the associations between frequently queried data segments. As a result, query request and response systems for large, distributed databases suffer from long response times because of the high volume of data transferred. With the current trend towards semantic analysis and recommendation systems, there is a need to redefine caching in the distributed environment. In this paper, we present a sub-query fragmentation technique that defines fragments of query plans as portable query objects to be placed on coordinated, cooperative distributed caches. Each query object behaves as an independent data unit that captures its association with other data units in query workloads and helps cache management to uncover user query patterns. Sub-query fragmentation is an extension of semantic caching to distributed cache environments. We observe a considerable reduction in data transfer time with sub-query fragmentation.
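
As an informal illustration of the idea summarised above, the following Python sketch shows how partially overlapping queries might reuse cached fragments while the cache records which fragments are requested together. It is a minimal sketch under stated assumptions, not the implementation described in the paper: the names `QueryObject`, `FragmentCache`, `lookup`, `put` and `co_access` are hypothetical, a fragment is simplified to a table name plus one selection predicate, and the example tables and predicates are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class QueryObject:
    """Hypothetical portable fragment: one table plus one selection predicate."""
    table: str
    predicate: str


class FragmentCache:
    """Hypothetical cache keyed by query fragment rather than by whole query."""

    def __init__(self):
        self.store = {}             # QueryObject -> cached rows
        self.co_access = Counter()  # frozenset of two fragments -> times seen together

    def lookup(self, fragments):
        """Record which fragments co-occur, then split them into hits and misses."""
        for a, b in combinations(set(fragments), 2):
            self.co_access[frozenset((a, b))] += 1
        hits = [f for f in fragments if f in self.store]
        misses = [f for f in fragments if f not in self.store]
        return hits, misses

    def put(self, fragment, rows):
        """Store the rows fetched for one fragment."""
        self.store[fragment] = rows


cache = FragmentCache()
genes = QueryObject("genes", "species = 'human'")
expr = QueryObject("expression", "tissue = 'liver'")
prot = QueryObject("proteins", "mass > 50000")

# First query: nothing is cached, so both fragments count as misses and are stored.
hits1, misses1 = cache.lookup([genes, expr])
for fragment in misses1:
    cache.put(fragment, [f"rows for {fragment.table}"])

# Second query overlaps partially: only the new fragment would need a remote fetch.
hits2, misses2 = cache.lookup([genes, prot])
```

In this simplified setting, the second query transfers data only for the fragment it does not share with the first, and the `co_access` counts give cache management a basic signal about which fragments tend to be queried together.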