Using Caching for Local Link Discovery on Large Data Sets
by Mofeed Hassan
Engineering the Data Web in the Big Data era demands time- and space-efficient solutions that cover the lifecycle of Linked Data. As previous work has shown, purely in-memory solutions are bound to fail as datasets keep growing over time. This work, presented by Mofeed Hassan, studies caching solutions for one of the central tasks on the Data Web: the discovery of links between resources. To this end, six caching approaches were evaluated on real data under different settings. The results show that while existing caching approaches already allow Link Discovery to be performed on large datasets from local resources, the cache hit rates achieved are still poor. The study therefore points to the need for dedicated solutions to this problem in order to tackle the upcoming challenges of building a Semantic Web.