
Universal Workload-based Graph Partitioning and Storage Adaption for Distributed RDF Stores

by Ahmed Imad Aziz Al-Ghezi
Doctoral thesis
Date of Examination: 2020-08-03
Date of issue: 2021-01-06
Advisor: Prof. Dr. Lena Wiese
Referee: Prof. Dr. Lena Wiese
Referee: Prof. Dr. Ramin Yahyapour
Persistent Address: http://dx.doi.org/10.53846/goediss-8386

 

 

Files in this item

Name: Alghezi_thesis_elec_sub.pdf
Size: 2.11 MB
Format: PDF
Description: PhD Thesis

Abstract

English

The publication of machine-readable information has been increasing significantly, both in volume and in the complexity of the embedded relations. The Resource Description Framework (RDF) plays a major role in modeling and linking web data and their relations. In line with that role, dedicated systems were designed to store and query RDF data using a special query language called SPARQL, which is similar to classic SQL. However, due to the large size of the data, several federated working nodes are used to host a distributed RDF store. The data needs to be partitioned, assigned, and stored across the working nodes. After partitioning, some of the data needs to be replicated in order to avoid communication cost and to balance the load for better system throughput. Since replication requires more storage space, two important questions arise: what data should be replicated, and how much? The answer to the second question depends on the other storage-space requirements at each working node, such as indexes and cache. In order to answer SPARQL queries efficiently, each working node needs to put its share of the data into multiple indexes. Those indexes have a data-wide size and consume a considerable amount of storage space. In this context, the same two questions raised about replication also apply to indexes. The third storage-consuming structure is the join cache. It is a special index in which frequent join results are cached; it saves a considerable amount of running time at the cost of high storage-space consumption. Again, the same two questions apply to the join cache. In this thesis, we present a universal adaption approach to the storage of a distributed RDF store. The system aims to find optimal data assignments to the different indexes, replications, and join cache within the limited storage space. To achieve this, we present a cost model based on the workload, which often contains frequent patterns.
The workload is dynamically analyzed to evaluate predefined rules. Those rules tell the system the benefits and costs of assigning which data to which structure. The objective is a better query execution time. Besides the storage adaption, the system adapts its processing resources to the queries' arrival rate. The aim of this adaption is better parallelization per query while still providing high system throughput.
Keywords: RDF; Distributed Triple Store
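
As a rough illustration of the trade-off described in the abstract (which data to assign to which structure within limited storage), the following Python sketch ranks hypothetical placement candidates by estimated benefit per unit of storage and selects them greedily until a budget is exhausted. It is not the cost model of the thesis: the greedy heuristic, the `Candidate` fields, the benefit and size numbers, and the example patterns are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical placement option: store `data` in `structure`."""
    data: str        # e.g. a frequent pattern observed in the workload
    structure: str   # "index", "replica", or "join_cache"
    benefit: float   # assumed saving in query time, weighted by frequency
    size: int        # assumed storage units the placement would consume


def select_placements(candidates, budget):
    """Greedy heuristic: pick candidates by benefit per storage unit.

    Returns the chosen candidates and the storage they use. This is a
    simple knapsack-style approximation, not an optimal assignment.
    """
    useful = [c for c in candidates if c.benefit > 0 and c.size > 0]
    ranked = sorted(useful, key=lambda c: c.benefit / c.size, reverse=True)
    chosen, used = [], 0
    for c in ranked:
        if used + c.size <= budget:  # skip anything that no longer fits
            chosen.append(c)
            used += c.size
    return chosen, used


# Illustrative numbers only; they are not taken from the thesis.
candidates = [
    Candidate("?x foaf:knows ?y", "join_cache", 90.0, 60),
    Candidate("?x rdf:type ?t", "index", 40.0, 20),
    Candidate("border triples of partition 3", "replica", 30.0, 30),
    Candidate("?x foaf:name ?n", "index", 10.0, 40),
]

chosen, used = select_placements(candidates, budget=100)
for c in chosen:
    print(f"{c.structure:<10} {c.data} (size {c.size})")
print("storage used:", used)
```

A workload-driven system would recompute the benefit estimates as query frequencies change and rerun the selection; the greedy rule here is only one simple way to respect a storage limit.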
 
