CS561 Web Data Management

Projects

Spring 2013
Professor: Vassilis Christophides
Teaching Assistant: Michalis Chortis
E-mails: {christop, mhortis}@ics.forth.gr

Course Hours: Tuesday 3-5 PM and Wednesday 11 AM-1 PM
Room: H.204
Office Hours: After the lectures or by appointment
Course Credits: 4



Project Description - Project Papers - Project Assignments - Schedule of Presentations

Project Description

This is a one- or two-person comprehensive survey project in which you perform an in-depth analysis of the research literature in the area covered by the course. The key to the success of this project is your creativity and dedication. Specifically, you need to do the following:

  1. Determine a research topic for your project according to the list of papers in the project papers section of the course's web page.
  2. Read a sufficient number of papers (usually more than the papers you are going to present) in order to perform an in-depth analysis of the research described in the papers. Specifically, you need to focus on the following:
  3. Give a 30- to 45-minute presentation in class, whose organization should also be followed in your report. [Presentation Guidelines]
  4. Write a technical report including all items mentioned above. [Guidelines for the presentation of written work]

Requirements

  1. A survey must analyze a good number of papers (minimum 2 per student) related to the selected topic. The survey report and the presentation will be evaluated on both breadth (i.e., how complete the coverage is) and depth (i.e., how much insight they bring out). Several aspects are taken into consideration when grading the presentation. In detail, understanding of the paper (12%) is graded as follows:
  2. Delivery of your talk (8%) is graded as follows:


  3. The report must have the following sections: Abstract (up to 250 words), Introduction, the main technical sections, Conclusion / Contributions (according to the related work) and Bibliographic References. Basically, you should follow the structure of research papers such as those you have read.

  4. Both report and presentation should address several or possibly all of the following issues:

  5. The length of the paper should be somewhere between 15 and 30 pages.

What to hand in

  1. The electronic version of your presentation and printed handouts (before the presentation in the classroom).
  2. The electronic version and a hard copy of your report.

Hint

When you study the papers, it is suggested that you make a list of the points that you find particularly confusing, ambiguous, interesting, controversial, etc., and try to formulate your own comments, possible answers, and examples to address those points. These points and related materials can be part of your report. You may also be asked to address those points in class during your presentation, so have your critiques and other relevant information in mind when you arrive in class.

Advice on research and writing


Project Assignments

Student ID | Student Name | Assigned Papers
784 | Ozokan Doruk | Heuristics-Based Query Processing for Large RDF Graphs Using Cloud Computing; Data Intensive Query Processing for Large RDF Graphs Using Cloud Computing Tools
783 | Jamous Hassan | From SPARQL to MapReduce: The Journey Using a Nested TripleGroup Algebra; Efficient Processing of RDF Graph Pattern Matching on MapReduce Platforms
634 | Lantzaki Christina | On Blank Nodes; Efficient Query Answering against Dynamic RDF Databases
778 | Efthymiou Vassilis | A Blocking Framework for Entity Resolution in Highly Heterogeneous Information Spaces; To Compare or Not to Compare: Making Entity Resolution more Efficient
777 | Alogdiannaki Eleni | RDF-3X: a RISC-style Engine for RDF; Scalable Join Processing on Very Large RDF Graphs
770 | Choudhury Vineet | Scalable SPARQL Querying of Large RDF Graphs; Towards Effective Partition Management for Large Graphs
776 | Manjing Tham | Apples and Oranges: A Comparison of RDF Benchmarks and Real RDF Datasets; An Empirical Study of Real-World SPARQL Queries
775 | Theivapulendra Enotharani | Static Analysis and Optimization of Semantic Web Queries; Efficient Distributed Query Processing for Autonomous RDF Databases; Distributed SPARQL querying with Avalanche
771 | Dixit Prabhakar Madhukar | Efficient Execution of Top-K SPARQL Queries; Top-k Exploration of Query Candidates for Efficient Keyword Search on Graph-Shaped (RDF) Data
779 | Nikolov Nikolay | Rewriting Queries on SPARQL Views; SPARQL-RW: Transparent Query Access over Mapped RDF Data Sources
781 | Sher Imran Falak | On Directly Mapping Relational Databases to RDF and OWL; R2RML: RDB to RDF Mapping Language; A Direct Mapping of Relational Data to RDF
773 | Xia Siliang | gStore: Answering SPARQL Queries via Subgraph Matching; Storing and Indexing Massive RDF Data Sets
793 | Saniat Mahmudur Rahman | FedBench: A Benchmark Suite for Federated Semantic Data Query Processing; Benchmarking Federated SPARQL Query Engines: Are Existing Testbeds Enough?
772 | Berkley Roger Alekos | Effective Page Refresh Policies For Web Crawlers; Swoogle: A Search and Metadata Engine for the Semantic Web
703 | Seliniotaki Aleka | CLARO: Modeling and Processing Uncertain Data Streams; PODS: A New Model and Processing Algorithms for Uncertain Data Streams


Schedule of Presentations

Presentation Date | Time Slot | Student Name | ID | Papers | Presentation Files
Tuesday 14/05 | 15:00-16:00 | Nikolov Nikolay | 779 | Rewriting Queries on SPARQL Views; SPARQL-RW: Transparent Query Access over Mapped RDF Data Sources | PDF
Tuesday 14/05 | 16:00-17:00 | Xia Siliang | 773 | gStore: Answering SPARQL Queries via Subgraph Matching; Storing and Indexing Massive RDF Data Sets | PDF
Tuesday 14/05 | 17:00-18:00 | Jamous Hassan | 783 | From SPARQL to MapReduce: The Journey Using a Nested TripleGroup Algebra; Efficient Processing of RDF Graph Pattern Matching on MapReduce Platforms | PDF
Wednesday 15/05 | 11:00-12:00 | Theivapulendra Enotharani | 775 | Static Analysis and Optimization of Semantic Web Queries; Efficient Distributed Query Processing for Autonomous RDF Databases; Distributed SPARQL querying with Avalanche | PDF 1, PDF 2
Wednesday 15/05 | 12:00-13:00 | Saniat Mahmudur Rahman | 793 | FedBench: A Benchmark Suite for Federated Semantic Data Query Processing; Benchmarking Federated SPARQL Query Engines: Are Existing Testbeds Enough? | PDF
Wednesday 15/05 | 13:00-14:00 | Dixit Prabhakar Madhukar | 771 | Efficient Execution of Top-K SPARQL Queries; Top-k Exploration of Query Candidates for Efficient Keyword Search on Graph-Shaped (RDF) Data | PDF
Wednesday 15/05 | 14:00-15:00 | Ozokan Doruk | 784 | Heuristics-Based Query Processing for Large RDF Graphs Using Cloud Computing; Data Intensive Query Processing for Large RDF Graphs Using Cloud Computing Tools | PDF
Tuesday 21/05 | 15:00-16:00 | Sher Imran Falak | 781 | On Directly Mapping Relational Databases to RDF and OWL; R2RML: RDB to RDF Mapping Language; A Direct Mapping of Relational Data to RDF | PDF
Tuesday 21/05 | 16:00-17:00 | Choudhury Vineet | 770 | Scalable SPARQL Querying of Large RDF Graphs; Towards Effective Partition Management for Large Graphs | PPTX
Tuesday 21/05 | 17:00-18:00 | Alogdiannaki Eleni | 777 | RDF-3X: a RISC-style Engine for RDF; Scalable Join Processing on Very Large RDF Graphs | PDF
Wednesday 22/05 | 11:00-12:00 | Manjing Tham | 776 | Apples and Oranges: A Comparison of RDF Benchmarks and Real RDF Datasets; An Empirical Study of Real-World SPARQL Queries | PDF
Wednesday 22/05 | 12:00-13:00 | Berkley Roger Alekos | 772 | Effective Page Refresh Policies For Web Crawlers; Swoogle: A Search and Metadata Engine for the Semantic Web | PDF
Wednesday 22/05 | 13:00-14:00 | Seliniotaki Aleka | 703 | CLARO: Modeling and Processing Uncertain Data Streams; PODS: A New Model and Processing Algorithms for Uncertain Data Streams | PDF
Wednesday 22/05 | 14:00-15:00 | Lantzaki Christina | 634 | On Blank Nodes; Efficient Query Answering against Dynamic RDF Databases | PDF


Project Papers


[Data Integration in the Web of Data] [Web Data Storage and Access] [Benchmarking]