• Deutsch
    • English
  • Deutsch 
    • Deutsch
    • English
  • Einloggen
Dokumentanzeige 
  •   Startseite
  • Naturwissenschaften, Mathematik und Informatik
  • Fakultät für Mathematik und Informatik (inkl. GAUSS)
  • Dokumentanzeige
  •   Startseite
  • Naturwissenschaften, Mathematik und Informatik
  • Fakultät für Mathematik und Informatik (inkl. GAUSS)
  • Dokumentanzeige
JavaScript is disabled for your browser. Some features of this site may not work without it.

Universal Workload-based Graph Partitioning and Storage Adaption for Distributed RDF Stores

von Ahmed Imad Aziz Al-Ghezi
Dissertation
Datum der mündl. Prüfung:2020-08-03
Erschienen:2021-01-06
Betreuer:Prof. Dr. Lena Wiese
Gutachter:Prof. Dr. Lena Wiese
Gutachter:Prof. Dr. Ramin Yahyapour
crossref-logoZum Verlinken/Zitieren: http://dx.doi.org/10.53846/goediss-8386

 

 

Dateien

Name:Alghezi_thesis_elec_sub.pdf
Size:2.11Mb
Format:PDF
Description:PhD Thesis
ViewOpen

Lizenzbestimmungen:


Zusammenfassung

Englisch

The publication of machine-readable information has been significantly increasing both in the magnitude and complexity of the embedded relations. The Resource Description Framework(RDF) plays a big role in modeling and linking web data and their relations. In line with that important role, dedicated systems were designed to store and query the RDF data using a special queering language called SPARQL similar to the classic SQL. However, due to the high size of the data, several federated working nodes were used to host a distributed RDF store. The data needs to be partitioned, assigned, and stored in each working node. After partitioning, some of the data needs to be replicated in order to avoid the communication cost, and balance the loads for better system throughput. Since replications require more storage space, the important two questions are: what data to replicate? And how much? The answer to the second question is related to other storage-space requirements at each working node like indexes and cache. In order to efficiently answer SPARQL queries, each working node needs to put its share of data into multiple indexes. Those indexes have a data-wide size and consume a considerable amount of storage space. In this context, the same two questions about replications are also raised about indexes. The third storage-consuming structure is the join cache. It is a special index where the frequent join results are cached and save a considerable amount of running time on the cost of high storage space consumption. Again, the same two questions of replication and indexes are applicable to the join-cache. In this thesis, we present a universal adaption approach to the storage of a distributed RDF store. The system aims to find optimal data assignments to the different indexes, replications, and join cache within the limited storage space. To achieve this, we present a cost model based on the workload that often contains frequent patterns. The workload is dynamically analyzed to evaluate predefined rules. Those rules tell the system about the benefits and costs of assigning which data to what structure. The objective is to have better query execution time. Besides the storage adaption, the system adapts its processing resources with the queries' arrival rate. The aim of this adaption is to have better parallelization per query while still provides high system throughput.
Keywords: RDF; Distributed Triple Store
 

Statistik

Hier veröffentlichen

Blättern

Im gesamten BestandFakultäten & ProgrammeErscheinungsdatumAutorBetreuer & GutachterBetreuerGutachterTitelTypIn dieser FakultätErscheinungsdatumAutorBetreuer & GutachterBetreuerGutachterTitelTyp

Hilfe & Info

Publizieren auf eDissPDF erstellenVertragsbedingungenHäufige Fragen

Kontakt | Impressum | Cookie-Einwilligung | Datenschutzerklärung
eDiss - SUB Göttingen (Zentralbibliothek)
Platz der Göttinger Sieben 1
Mo - Fr 10:00 – 12:00 h


Tel.: +49 (0)551 39-27809 (allg. Fragen)
Tel.: +49 (0)551 39-28655 (Fragen zu open access/Parallelpublikationen)
ediss_AT_sub.uni-goettingen.de
[Bitte ersetzen Sie das "_AT_" durch ein "@", wenn Sie unsere E-Mail-Adressen verwenden.]
Niedersächsische Staats- und Universitätsbibliothek | Georg-August Universität
Bereichsbibliothek Medizin (Nur für Promovierende der Medizinischen Fakultät)
Robert-Koch-Str. 40
Mon – Fri 8:00 – 24:00 h
Sat - Sun 8:00 – 22:00 h
Holidays 10:00 – 20:00 h
Tel.: +49 551 39-8395 (allg. Fragen)
Tel.: +49 (0)551 39-28655 (Fragen zu open access/Parallelpublikationen)
bbmed_AT_sub.uni-goettingen.de
[Bitte ersetzen Sie das "_AT_" durch ein "@", wenn Sie unsere E-Mail-Adressen verwenden.]