Summer School Tutorials
Meghyn Bienvenu:
Title: Inconsistency-Tolerant Query Answering of Description Logic Knowledge Bases
An important issue that arises when querying description logic (DL) knowledge bases is how to handle the case in which the knowledge base is inconsistent. Indeed, while it may be reasonable to assume that the TBox (ontology) has been properly debugged, the ABox (data) will typically be very large and subject to frequent modifications, both of which make errors likely. It is therefore essential to be able to provide meaningful answers to queries in the presence of such data inconsistencies. Unfortunately, standard DL semantics is next to useless in such circumstances, since a contradiction entails everything.
The first part of the tutorial will present and compare the different inconsistency-tolerant semantics that have been proposed to address this problem and which can be applied to any DL (or ontology language). In the second half of the tutorial, we will summarize what is known about the computational properties of these semantics and present some algorithms for inconsistency-tolerant query answering, focusing primarily on DLs of the DL-Lite family.
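As a rough illustration of what an inconsistency-tolerant semantics can look like, the sketch below uses repairs (maximal consistent subsets of the ABox) and three repair-based semantics commonly called AR, IAR and brave. The abstract itself does not name these semantics, so their choice here is an assumption, and the toy TBox, ABox and function names are hypothetical. The brute-force enumeration is only meant for tiny examples.

```python
from itertools import combinations

# Hypothetical toy knowledge base: facts are (concept, individual) pairs.
ABOX = {("Professor", "ann"), ("Student", "ann"), ("Professor", "bob")}
SUBCLASS = {"Professor": "Person", "Student": "Person"}   # concept inclusions
DISJOINT = {frozenset({"Professor", "Student"})}          # disjointness axioms

def closure(facts):
    """Saturate a set of facts under the inclusion axioms."""
    result = set(facts)
    changed = True
    while changed:
        changed = False
        for concept, individual in list(result):
            parent = SUBCLASS.get(concept)
            if parent and (parent, individual) not in result:
                result.add((parent, individual))
                changed = True
    return result

def consistent(facts):
    """A fact set is consistent if no individual belongs to two disjoint concepts."""
    saturated = closure(facts)
    return not any(
        i1 == i2 and frozenset({c1, c2}) in DISJOINT
        for (c1, i1), (c2, i2) in combinations(saturated, 2)
    )

def repairs(abox):
    """All inclusion-maximal consistent subsets of the ABox (brute force)."""
    facts = sorted(abox)
    candidates = [
        frozenset(subset)
        for size in range(len(facts) + 1)
        for subset in combinations(facts, size)
        if consistent(subset)
    ]
    return [s for s in candidates if not any(s < t for t in candidates)]

def ar(abox, query):
    """AR: the query must hold in every repair."""
    return all(query in closure(r) for r in repairs(abox))

def iar(abox, query):
    """IAR: the query must follow from the intersection of all repairs."""
    return query in closure(frozenset.intersection(*repairs(abox)))

def brave(abox, query):
    """Brave: the query must hold in at least one repair."""
    return any(query in closure(r) for r in repairs(abox))
```

In this toy example the fact that ann is a Person holds under AR and brave but not under IAR, because neither of the two conflicting assertions about ann survives in the intersection of the repairs.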
...................................................................................................................................
Freddy Lecue:
Title: Applying Machine Reasoning and Learning in Real World Applications
Knowledge discovery, an area concerned with methodologies for extracting knowledge through deduction (a priori) or from data (a posteriori), has been widely studied in Databases and Artificial Intelligence. Deductive reasoning, such as logical reasoning, derives knowledge from pre-established (certain) knowledge statements, while inductive inference, such as data mining or learning, discovers knowledge by generalising from initial information. Although deductive reasoning and inductive learning address knowledge discovery problems from conceptually different perspectives, they are inference techniques that nicely complement each other in real-world applications. In this paper we will present how techniques from machine learning and reasoning can be reconciled and integrated to address large-scale problems in the context of (i) transportation in the cities of Bologna, Dublin, Miami and Rio and (ii) spend optimisation in finance.
.................................................................................................................................
Carsten Lutz:
Title: Ontology-Based Data Access
The course will give an introduction to ontology-mediated queries (OMQs), which enrich traditional database queries with an ontology. It will survey recent results on OMQs based on description logic (DL) ontologies, with an emphasis on computational complexity and expressive power. The first lecture starts by describing the general setup and aims of ontology-mediated querying and proceeds to discuss implementation approaches and fundamental complexity results. This will include the two most prominent approaches to OMQ implementation, query rewriting and the combined approach; it will also cover several landmark results on data and combined complexity, concentrating on the ALC, EL, and DL-Lite families of DLs. Motivated by the aim of understanding why OMQs that occur in practice typically do not exhibit worst-case complexity behaviour, the second lecture will discuss a fine-grained approach to the complexity analysis of OMQs in which the aim is to classify the (data) complexity of every single OMQ. This establishes an intimate and fruitful connection to constraint satisfaction problems and provides new perspectives on the expressive power and descriptive complexity of OMQs.
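To give a flavour of the query rewriting approach mentioned above, here is a minimal sketch restricted to atomic concept inclusions and atomic queries. It is far simpler than the rewriting algorithms used for real DL-Lite ontologies, and the ontology, data and function names are hypothetical. The idea is that the ontology is compiled into the query, so the rewritten query can be evaluated directly over the raw data.

```python
# Hypothetical ontology: (sub, sup) means every instance of sub is an instance of sup.
INCLUSIONS = [
    ("Professor", "Teacher"),
    ("Lecturer", "Teacher"),
    ("FullProfessor", "Professor"),
]

# Hypothetical data: (concept, individual) facts, stored without any reasoning applied.
DATA = {("FullProfessor", "ann"), ("Lecturer", "bob"), ("Student", "carl")}

def rewrite(concept, inclusions):
    """Rewrite the atomic query concept(x) into a union of atomic queries.

    The result contains every concept whose instances are entailed to be
    instances of `concept` by the inclusions.
    """
    union = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in inclusions:
            if sup == current and sub not in union:
                union.add(sub)
                frontier.append(sub)
    return union

def answer(concept, inclusions, data):
    """Evaluate the rewritten query over the data as an ordinary database query."""
    union = rewrite(concept, inclusions)
    return {individual for (c, individual) in data if c in union}
```

Asking for all instances of Teacher returns ann and bob, even though no Teacher fact is stored explicitly.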
...................................................................................................................................
Magdalena Ortiz:
Title: Expressive Ontology and Query Languages for Data Access and Management
Description Logics (DLs) are well-established as knowledge representation formalisms, and they lay the foundations of the standard Web Ontology Languages used for sharing domain knowledge in a range of fields. In the last decade, they have been advocated for data management, particularly in the ontology-based data access paradigm.
Since DLs describe relational structures comprising unary and binary relations, they can be naturally deployed for a range of data access and management tasks in the setting of graph-structured data (GSD). For example, DLs can serve as constraint languages for GSD, or be leveraged for answering queries over incomplete graph databases in the presence of domain knowledge. However, in the setting of GSD, features that allow for the recursive navigation of graphs are paramount. They are present in most existing formalisms that are currently used for describing, querying and managing GSD, usually in the form of regular expressions that describe paths. In contrast, such recursive navigational constructs are excluded from most popular DLs and query languages.
In this tutorial, we will revisit the role of path-based navigation in Description Logics, focusing on two settings:
- Answering queries with navigational features over GSD enriched by DL ontologies, from lightweight to very expressive ones.
- The use of expressive DLs with path expressions as constraint languages for GSD. We will first see how, using DLs, we can obtain improved results for traditional database problems like constraint implication. Then we will look at more challenging problems that have gained attention recently, like reasoning about the preservation of constraints when the data evolves as the result of operations carried out by users or applications.
....................................................................................................................................
Jeff Z. Pan, Nico Matentzoglu, Caroline Jay and Markel Vigo:
Title: Understanding Author Intentions: Test Driven Ontology Authoring
Ontologies are complex knowledge representation artefacts used widely across biomedical, media and industrial domains. Ontology authoring is a non-trivial task for authors who are not proficient in logic. It is difficult both to specify an author's intentions in building an ontology and to test whether they are satisfied. In this tutorial, we will introduce the notions of explicit author intention and implicit author intention, discuss some approaches for understanding each type of author intention, and show how such understanding can be used in reasoning-based test-driven ontology authoring and can help design guidelines for bulk editing, efficient reasoning and increased situational awareness. We will discuss extensively the implications of test-driven ontology authoring for DL reasoning and DL reasoning benchmarks.
.....................................................................................................................................
Juan Reutter:
Title: Datalog-based query languages for graphs
One of the key differences between graph and relational databases is that on graphs we are much more interested in navigational queries. As a consequence, graph database systems are specifically engineered to answer these queries efficiently, and there is a wide body of work on query languages that can express complex navigational patterns.
The most commonly used way to add navigation to graph queries is to start with a basic pattern matching language and augment it with navigational primitives based on regular expressions. For example, the friend-of-a-friend relationship in a social network is expressed via the primitive (friend)+, which looks for paths of nodes connected via the friend relation. This expression can then be added to graph patterns, allowing us to retrieve, for example, all nodes A, B and C that have a common friend-of-a-friend.
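The (friend)+ primitive can be pictured as reachability over one or more friend edges. The following sketch, on a hypothetical toy graph with made-up function names, computes that relation with a breadth-first search and then combines it with a simple pattern to find the nodes that several people reach in common.

```python
from collections import deque

# Hypothetical social graph: node -> set of nodes it has a friend edge to.
FRIEND = {
    "a": {"d"}, "b": {"e"}, "c": {"f"},
    "d": {"g"}, "e": {"g"}, "f": {"g"},
    "g": set(),
}

def friend_plus(graph, start):
    """Nodes reachable from `start` via a path of one or more friend edges."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return seen

def common_reachable(graph, nodes):
    """Nodes that every node in `nodes` reaches via (friend)+."""
    return set.intersection(*(friend_plus(graph, n) for n in nodes))
```

On this graph, a, b and c have exactly one node in common under (friend)+, namely g.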
But isolating navigation in a set of primitives has drawbacks for both systems and users. First, it requires the implementation and coordination of two separate query engines (one for pattern matching and the other for navigation), which makes problems such as query optimisation substantially more difficult. Second, by focusing on primitives designed to deal with paths we leave out the possibility of expressing other complex navigational relationships that cannot be reduced to a set of path operations. For these reasons we have recently witnessed an effort to study languages which integrate navigation and pattern matching in an intrinsic way. A natural candidate is Datalog, a well-known declarative query language that extends first-order logic with recursion, and where pattern matching and recursion can be arbitrarily nested to provide much more expressive navigational queries.
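A small sketch may help to see how recursion and pattern matching can be nested in Datalog. The program in the docstring below is a made-up example in which each navigation step is itself a pattern (two edges between the same pair of nodes), and the Python function evaluates it by naive bottom-up iteration. The relation names and data are hypothetical, and practical engines use more refined evaluation strategies.

```python
# Hypothetical edge relations over a small graph.
FRIEND = {("a", "b"), ("b", "c"), ("c", "d")}
COLLEAGUE = {("a", "b"), ("b", "c")}

def evaluate(friend, colleague):
    """Naive bottom-up evaluation of the Datalog program

        close(X, Y)  :- friend(X, Y), colleague(X, Y).
        linked(X, Y) :- close(X, Y).
        linked(X, Z) :- linked(X, Y), close(Y, Z).

    Rules are applied repeatedly until no new facts are derived.
    """
    close = friend & colleague          # non-recursive pattern-matching rule
    linked = set(close)                 # base case of the recursion
    while True:
        derived = {
            (x, z)
            for (x, y) in linked
            for (y2, z) in close
            if y == y2
        }
        if derived <= linked:           # fixpoint reached
            return linked
        linked |= derived
```

Here c and d are friends but not colleagues, so d is never linked to anyone, while a is linked to c through b.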
In this tutorial we review the most common navigational primitives for graphs, and explain how these primitives can be embedded into Datalog. We then show current efforts to restrict Datalog in order to obtain a query language that is expressive enough to capture all these primitives and at the same time feasible to use in practice. We illustrate how this works both over the base graph model and over the more general RDF format underlying the semantic web.
.......................................................................................................................................
Stefan Schlobach:
Title: Linked Data Lab
With tens if not hundreds of billions of logical statements, the Linked Open Data (LOD) cloud is one of the biggest knowledge bases ever built. As such it is a gigantic source of information for applications in various domains, but also, given its size, heterogeneous nature, and complexity, an ideal test-bed for knowledge representation and reasoning. However, making use of this unique resource has proven next to impossible in the past due to a number of problems, including data collection, quality, accessibility, scalability, availability and findability. The LOD Laundromat and LOD Lab are recent infrastructures that address these problems in a systematic way, by automatically crawling, cleaning, indexing, analysing and republishing data in a unified way. Given a family of simple tools, LOD Lab allows researchers to query, access, analyse and manipulate hundreds of thousands of data documents seamlessly, e.g. facilitating experiments (e.g. for reasoning) over hundreds of thousands of (possibly integrated) datasets based on content and meta-data. These lecture notes provide the theoretical basis and practical skills required for making ideal use of this large-scale experimental platform. First we study the problems that make it so hard to work with Semantic Web data in its current form. We then propose generic solutions and introduce the tools the reader needs to get started with their own experiments on the LOD Cloud.
......................................................................................................................................
Umberto Straccia:
Title: From Fuzzy to Annotated Semantic Web Languages
The aim of this talk is to present a detailed, self-contained and comprehensive account of the state of the art in representing and reasoning with fuzzy knowledge in Semantic Web languages such as the triple languages RDF/RDFS, the conceptual languages of the OWL 2 family, and rule languages.
We further show to what extent we may generalise them to so-called annotation domains, which also cover, e.g., temporal, provenance and trust extensions.
.....................................................................................................................................
Frank Wolter:
Title: When are Description Logic Ontologies Inseparable?
The question of whether a given ontology can be equivalently replaced by another ontology is fundamental for many ontology engineering and maintenance tasks. It underpins, for example, ontology versioning (what is the difference between two ontologies?), ontology modularization (when is a subset of a given ontology self-contained?), forgetting (what does it mean to forget a set of terms used in an ontology?), and knowledge exchange (how to reformulate a given ontology in a new language?). Whether a given ontology can be equivalently replaced by another ontology depends on the application. For example, if the ontology is used to query data, then answers to all relevant queries should be the same; but if the ontology is used for conceptual reasoning, then the entailed subsumptions between concept expressions should coincide. In the first case, the two ontologies are called query inseparable; in the second case they are called subsumption inseparable. In this tutorial, we will introduce notions of inseparability between description logic ontologies, discuss their applications, give algorithms that determine whether two ontologies are inseparable (and, sometimes, compute the difference between them if they are not), and discuss the computational complexity of the relevant decision problems.
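The following sketch illustrates the idea of subsumption inseparability in a drastically simplified setting. It assumes that ontologies consist only of inclusions between concept names, that only subsumptions between concept names are compared, and that the comparison is relative to a chosen set of relevant terms (called a signature here). These restrictions, the toy ontologies and the function names are assumptions made for illustration; the notions covered in the tutorial concern full description logics and complex concept expressions.

```python
def entailed(ontology, signature):
    """Subsumptions between distinct signature concepts entailed by the ontology.

    The ontology is a set of (sub, sup) pairs; entailment here is just the
    transitive closure of these atomic inclusions.
    """
    closure = set(ontology)
    while True:
        new = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        } - closure
        if not new:
            break
        closure |= new
    return {
        (a, b) for (a, b) in closure
        if a in signature and b in signature and a != b
    }

def difference(first, second, signature):
    """Subsumptions over the signature entailed by one ontology but not the other."""
    e1, e2 = entailed(first, signature), entailed(second, signature)
    return e1 - e2, e2 - e1

def inseparable(first, second, signature):
    """True if both ontologies entail the same subsumptions over the signature."""
    return difference(first, second, signature) == (set(), set())

# Hypothetical ontologies that agree on Cat and Animal but use different middle terms.
O1 = {("Cat", "Mammal"), ("Mammal", "Animal")}
O2 = {("Cat", "Feline"), ("Feline", "Animal")}
```

With the signature {Cat, Animal} the two toy ontologies cannot be told apart, whereas adding Mammal to the signature exposes two subsumptions entailed only by the first one.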