@@ -39,79 +41,81 @@ This first section introduces every notion we will need later on. We will start

\subsection{Setting}\label{gen-def}

We consider a first-order setting limited to \emph{predicate}, \emph{constants} and \emph{variable} symbols. A \emph{term} is a variable or a constant.

We consider a first-order setting limited to \emph{predicates}, \emph{constants}, and \emph{variables}. A \emph{term} is a variable or a constant. Each predicate is associated with an integer, called the \emph{arity} of the predicate.

\subsubsection{Atoms and factbases}

\begin{definition}\label{def-atom}

\textbf{(atom)}An \emph{atom} is of the form $p(e_1, ..., e_n)$ with $p$ a predicate of arity $n$ and $e_1, ..., e_n$ terms.

An \emph{atom} is of the form $p(e_1, ..., e_n)$ with $p$ a predicate of arity $n$ and $e_1, ..., e_n$ terms.

The set of variables, constants and terms appearing in the atom $h$ are respectively denoted with $var(h)$, $cnst(h)$ and $term(h)$.

\end{definition}

\begin{definition}\label{def-fb}

\textbf{(factbase)}A \emph{factbase} is an existentially closed conjunction of atoms. The set of variables, constants and terms appearing in the factbase $F$ are respectively denoted with $var(F)$, $cnst(F)$ and $term(F)$.

A \emph{factbase} is an existentially closed conjunction of atoms.

\end{definition}

It is often convenient to see a factbase as a set of atoms. For instance, \ens{A(x, y), B(x, z, y)} would refer to the factbase $\exists x, y, z,\ A(x, y)\wedge B(x, z, y)$.

It is often convenient to see a factbase as a set of atoms. Henceforth, we identify a fact base with the corresponding set of atoms. For instance, \ens{A(x, y), B(x, z, y)} would refer to the factbase $\exists x, y, z . A(x, y)\wedge B(x, z, y)$.\todo{Write a dot between the first occurrence of $z$ and the atom $A(x, y)$.}

Let $A$ be an atom or a factbase. The set of variables, constants and terms appearing in $A$ are respectively denoted with $\textit{var}(A)$, $\textit{cnst}(A)$ and $\textit{term}(A)$.

\subsubsection{Existential rules and knowledge bases}

\begin{definition}\label{def-exist-rule}

\textbf{(existential rule)} An existential rule$R$ is a first-order formula of the following form:

An \emph{existential rule}$R$ is a first-order formula of the following form:

with $\bar{x}$, $\bar{y}$ and $\bar{z}$ sets of variables and $B$ and $H$ conjunctions of atoms, respectively called the \emph{body} and the \emph{head} of the rule.

$\bar{x}$, which is shared between the body and the head, is called the \emph{frontier} of the rule and is denoted with $fr(R)$.

\todo{Explain the notation $H(\bar{x}, \bar{z})$.} with $\bar{x}$, $\bar{y}$ and $\bar{z}$ sets of variables and $B$ and $H$ conjunctions of atoms, respectively called the \emph{body} and the \emph{head} of the rule.

\todo{Start the sentence with a word.}$\bar{x}$, which is shared between the body and the head, is called the \emph{frontier} of the rule and is denoted with $fr(R)$.

$\bar{z}$ is called the \emph{existential variables} of $R$.

\end{definition}

For the sake of simplicity, existential rules will be denoted with $B \rightarrow H$, with all variables appearing in $B$ implicitly universally quantified, and variables in $H$ but not in $B$ existentially quantified.

A set of existential rules is called a \emph{ruleset}.

In the context of this paper, we refer to existential rules as rules.

\begin{definition}\label{def-datalog}

\textbf{(Datalog rule)}An existential rule with no existential variable is called a \emph{Datalog rule}.

An existential rule without existential variables is called a \emph{Datalog rule}.

\end{definition}

\begin{definition}\label{def-kb}

\textbf{(Knowledge base)}A \emph{knowledge base} is the given of a factbase and a ruleset.

A \emph{knowledge base} is a tuple $\langle R, B \rangle$ where $R$ is a rule set and $B$ is a factbase.

\end{definition}

\subsubsection{Substitutions, homomorphisms and retractions}

\begin{definition}\label{def-subst}

\textbf{(substitution)}A \emph{substitution} is a mapping from a set of variables to a set of terms.

A \emph{substitution} is a mapping from the set of variables to the set of terms.

\end{definition}

A substitution is often denoted as a set of individual variable mappings.

A substitution is often denoted as a set of individual variable mappings.\todo{Clarify what you mean here.}

\begin{definition}\label{def-homo}

\textbf{(homomorphism)}A \emph{homomorphism} from $F$ to $F'$ is a substitution $h:var(F)\to term(F')$ such that $h(F)\subset F'$.

A \emph{homomorphism} from $F$ to $F'$ is a substitution $h:var(F)\to term(F')$ such that $h(F)\subseteq F'$.

\end{definition}

An important property of homomorphisms is that a factbase $F$ logically entails $F'$ if and only if there is a homomorphism from $F'$ to $F$.

An important property of homomorphisms is that a factbase $F$ logically entails another fact base $F'$ if and only if there is a homomorphism from $F'$ to $F$.

\begin{definition}\label{def-retr}

\textbf{(retraction, retract)} Let$F$ be a set of atoms and$F'\subset F$. A homomorphism $\sigma$ is a \emph{retraction} from $F$ to $F'$ if $\sigma_{|F'}= id_{F'}$.

Let $F$ and$F'$ be two sets of atoms such that$F'\subset F$. A homomorphism $\sigma$ is a \emph{retraction} from $F$ to $F'$ if $\sigma_{|F'}= id_{F'}$.

If a retraction exists from $F$ to $F'$, $F'$ is called a \emph{retract} of $F$.

\end{definition}

\subsection{The chase algorithm}\label{chase}

The chase is an algorithm family with the goal of computing an universal model for a knowledge base. To define it, we will first need a few notions.

The chase is a family of algorithms with the goal of computing a universal model for a knowledge base.\todo{Define universal model and the semantics of our fragment.} To define it, we will first need a few notions.

\subsubsection{Triggers and derivations}

\begin{definition}\label{def-trig}

\textbf{(trigger)}Le $F$ be a factbase and $R = B\to H$ an existential rule. If there is a homomorphism $\pi$ from $B$ to $F$, then we say that $R$ is applicable on $F$ via $\pi$. In this case, the pair $(R, \pi)$ is called a \emph{trigger}.

Let$F$ be a factbase and $R = B\to H$ an existential rule. If there is a homomorphism $\pi$ from $B$ to $F$, then we say that $R$ is applicable on $F$ via $\pi$. In this case, the pair $t =(R, \pi)$\todo{Shouldn't this include $F$.} is called a \emph{trigger}.

We denote $\pi(B)$ by $support(t)$.

\end{definition}

When applying a trigger $t$, we extend $\pi$ to map every existential variable with a fresh variable indexed by $t$. Such extension is denoted by $\hat\pi$.

When applying a trigger $t$, we extend $\pi$ to map every existential variable with a fresh variable indexed by $t$\todo{Be a bit more precise about how these new variables are instantiated}. Such extension is denoted by $\hat\pi$.

\begin{definition}\label{def-immder}

\textbf{(immediate derivation)}Let $F$ be a factbase and $t =(R, \pi)$ a trigger, with $R = B\to H$. The factbase $F\cup\hat\pi(H)$, which is the result of applyting $t$ to $F$, is called an \emph{immediate derivation} from $F$ through $t$.

Let $F$ be a factbase and $t =(R, \pi)$ a trigger, with $R = B\to H$. The factbase $F\cup\hat\pi(H)$, which is the result of applyting $t$ to $F$, is called an \emph{immediate derivation} from $F$ through $t$.

We denote $\hat\pi(H)$ by $output(t)$.

\end{definition}

\begin{definition}\label{def-der}

\textbf{(derivation)}Let $\mathcal{K}=(F, \setR)$ be a knowledge base. A \emph{derivation} from $\mathcal{K}$ is a possibly infinite sequence $\der=(\emptyset, F_0), (t_1, F_1), (t_2, F_2), ...$ where $F_0= F$ and for each $i>0$, $F_{i}$ is an immediate derivation from $F_{i-1}$ through $t_i$, and for each $i \neq j$, $t_i \neq t_j$.

Let $\mathcal{K}=(F, \setR)$ be a knowledge base. A \emph{derivation} from $\mathcal{K}$ is a possibly infinite sequence $\der=(\emptyset, F_0), (t_1, F_1), (t_2, F_2), ...$ where $F_0= F$, and for each $i>0$, $t_i$ is a trigger, $F_{i}$ is a factbase, and $F_{i}$ is an immediate derivation from $F_{i-1}$ through $t_i$, and for each $i \neq j$, $t_i \neq t_j$.

The set of all triggers appearing in \der\ is denoted by $triggers(\der)$. The result of the application of the derivation \der\ on $F$ is $F^{\der}=\bigcup_i F_i$

\end{definition}

...

...

@@ -122,17 +126,18 @@ The $k$-prefix of a derivation \der\ is the result of the application of the fir

The chase variants differ from each other in which triggers are applicable, as highlighted in the next definition.

\begin{definition}\label{def-xapp}

\textbf{(X-applicability)}Let $\mathcal{K}=(F, \setR)$ be a knowledge base, \der\ be a derivation from $\mathcal{K}$, and $t =(R, \pi)$ a trigger. Let us assume $R$ is applicable on $F^\der$. Then $t$ is:

Let $\mathcal{K}=(F, \setR)$ be a knowledge base, \der\ be a derivation from $\mathcal{K}$, and $t =(R, \pi)$ a trigger. Let us assume $R$ is applicable on $F^\der$. Then $t$ is:

\begin{itemize}

\item\emph{\textbf{O}-applicable} on \der\ if $t\notin triggers(\der)$

\item\emph{\textbf{SO}-applicable} on \der\ if there is no trigger $(R, \pi')\in triggers(\der)$ such that $\pi_{|fr(R)}=\pi'_{|fr(R)}$

\item\emph{\textbf{R}-applicable} on \der\ if there is no retraction from $F^\der\cup output(t)$ to $F^\der$

\item\emph{\textbf{E}-applicable} on \der\ if there is no homomorphism from $F^\der\cup output(t)$ to $F^\der$

\end{itemize}

\todo{I am a bit confused by this definition; let's discuss.}

\end{definition}

\begin{definition}\label{def-chase}

\textbf{(X-chase)}Let $\textbf{X}\in\ens{\textbf{O}, \textbf{SO}, \textbf{R}, \textbf{E}}$. A derivation \der\ for which every trigger $t_i\in triggers(\der)$ is \textbf{X}-applicable on $\der_{|i-1}$ is called an \emph{\textbf{X}-derivation}. The class of all \textbf{X}-derivation is called the \emph{\textbf{X}-chase}.

Let $\textbf{X}\in\ens{\textbf{O}, \textbf{SO}, \textbf{R}, \textbf{E}}$. A derivation \der\ for which every trigger $t_i\in triggers(\der)$ is \textbf{X}-applicable on $\der_{|i-1}$ is called an \emph{\textbf{X}-derivation}. The class of all \textbf{X}-derivation is called the \emph{\textbf{X}-chase}.

\end{definition}

\subsubsection{Termination}

...

...

@@ -140,13 +145,15 @@ The chase variants differ from each other in which triggers are applicable, as h

The notion of termination is reliant on the notion of fairness.

\begin{definition}\label{def-fair}

\textbf{(fairness)}An \textbf{X}-derivation \der\ is \emph{fair} if whenever a trigger $t$ is \textbf{X}-applicable on $\der_{|i}$, there exists a $k > i$ such that either $t_k = t$ or $t$ is not \textbf{X}-applicable on $\der_{|k}$.

An \textbf{X}-derivation \der\ is \emph{fair} if whenever a trigger $t$ is \textbf{X}-applicable on $\der_{|i}$, there exists a $k > i$ such that either $t_k = t$ or $t$ is not \textbf{X}-applicable on $\der_{|k}$.\todo{The first condition is this disjunction is not necessary.}

\end{definition}

\begin{definition}\label{def-term}

\textbf{(termination)}An \textbf{X}-derivation is \emph{terminating} if it is both fair and finite.

An \textbf{X}-derivation is \emph{terminating} if it is both fair and finite.

We say that the \textbf{X}-chase terminates on a knowledge base $\mathcal{K}$ if every fair \textbf{X}-derivation from $\mathcal{K}$ is finite.

\todo{Discuss sometimes/always termination and how this affects the restricted chase.}

\textbf{(piece graph)}Let $R = B \rightarrow H$. The \emph{piece graph} of $R$ is the graph which vertices are the atoms appearing in $H$, and with the following edge relation:

\[h_1\text{ is in relation with } h_2\Leftrightarrow\exists z,\z\in var(h_1)\wedge z\in var(h_2)\]

Let $R = B \rightarrow H$. The \emph{piece graph} of $R$ is the graph which vertices are the atoms appearing in $H$, and with the following edge relation:

\[h_1\text{ is in relation with } h_2\Leftrightarrow\exists z\text{ existential variable } .\ z\in var(h_1)\wedge z\in var(h_2)\]

\end{definition}

\todo{Pieces are ``glued together'' with existential variables!}

\begin{definition}\label{def-rpiece}

\textbf{(rule piece)}Let $R = B \rightarrow H$. A \emph{rule piece} of $R$ is the conjunction of atoms corresponding to a connected component of its piece graph.

Let $R = B \rightarrow H$. A \emph{rule piece} of $R$ is the conjunction of atoms corresponding to a connected component of its piece graph.

\end{definition}

\begin{definition}\label{def-sp-trans}

\textbf{(single-piece translation)}Let \setR\ be a set of existential rules. The \emph{single-piece translation} of \setR, denoted with \spR, is defind as follow. For each rule $R = B \rightarrow H$, for each rule piece $H_i$, add in \spR\ the rule $B \rightarrow H_i$.

Let \setR\ be a set of existential rules. The \emph{single-piece translation} of \setR, denoted with \spR, is defind as follow. For each rule $R = B \rightarrow H$, for each rule piece $H_i$, add in \spR\ the rule $B \rightarrow H_i$.

\end{definition}

\todo{Merge the three previous definitions; also define piece transformation on a rule and then on a rule set.}

\subsection{Equivalence of the single-piece translation}\label{sp-eq}

\begin{proposition}\label{prop-sp-eq}

Given a set of rules \setR,\setR\ and \spR\ are logically equivalent.

A ruleset \setR{} is equivalent to the set \spR.

\end{proposition}

\begin{proof}

...

...

@@ -179,6 +188,7 @@ In this section we will first talk about an equivalent translation, the single-p

$\forall A,\ \forall B,\ \forall C,\ ((A \rightarrow B)\wedge(A \rightarrow C))\leftrightarrow(A \rightarrow B\wedge C)$, thus $R'_1\wedge ...\wedge R'_m \leftrightarrow R$. Every rule in \spR\ being extracted from a rule in \setR, we have the equivalence.

\todo{Improve structure: deal with both directions of the proposition separately. That is, first show that a model of \setR{} is also a model \spR. Then the other direction.}

\end{proof}

\subsection{Preservation of chase termination}\label{sp-ter}

...

...

@@ -206,14 +216,18 @@ In this section we will first talk about an equivalent translation, the single-p

\noindent\textbf{(Proof idea - WIP)} If a trigger $t_n$ is not \textbf{X}-applicable on $F_n$, it is because the same trigger has already been applied before. This can happen only $m$ times per rule in \setR, with $m$ the max number of rules formed in \spR from a single rule of \setR. Thus, $\der^{sp}$ cannot be longer than $N\times m$, which is finite.

\end{proof}

Comment regarding the restricted chase: consider the rule set $\setR= R(x, y)\to\exists z, w . R(y, z)\wedge R(y, w)\wedge R(w, y)$, which does not terminate w.r.t. the restricted chase. Note that the rule set \spR does sometimes terminate with respect to the restricted chase.

Note that the situation described above for the restricted chase does not happen (I reckon) for the other chase variants.

In this seciton we will discuss the preservation of chase termination through a normalizing procedure that ensures that every rule will only have one atom in its head.

In this section we will discuss the preservation of chase termination through a normalizing procedure that ensures that every rule will only have one atom in its head.\todo{Before moving on, describe the notion of a conservative extension.}

\textbf{(Single atom-headed normalization)}Let $R = B_1\wedge...\wedge B_n \to H_1\wedge ... \wedge H_o$ be a rule, with $B_i$ and $H_j$ atoms for every $i$ and $j$. Let us denote the variables appearing in the head of $R$ by $y_1$,...,$y_p$.

Let $R = B_1\wedge...\wedge B_n \to H_1\wedge ... \wedge H_o$ be a rule, with $B_i$ and $H_j$ atoms for every $i$ and $j$. Let us denote the variables appearing in the head of $R$ by $y_1$,...,$y_p$.

\noindent The normalization procedure applied to $R$ outputs the following rules: