Intersection of a CFL with the Prefixes of a CFL

2008/07/16

Just a quick note on a decision problem: a classical result from the theory of context-free languages is the undecidability of the emptiness of the intersection of two CFLs. Put more concretely, given L1 and L2 two CFLs, we cannot decide whether L1∩L2=∅.

In the context of PEGs (Ford, 2004), the disjointness of two expressions can be ensured by the emptiness of the intersection of the set of prefixes of the context-free approximation of the first expression with the context-free approximation of the second: given e1 and e2 two parsing expressions, we consider two context-free approximations L1 and L2 with ℒ(e1)⊆L1 and ℒ(e2)⊆L2, and we want to check that Pre(L1)∩L2=∅, where Pre(L)={x|∃y, xy∈L}.

Goofing Around

At first sight, there is little indication whether the problem is decidable or not. The set of prefixes of a context-free language is in general context-free (for instance Pre({anbn|n≥0})={anbm| 0≤m≤n}), and its intersection with a context-free language is in general context-sensitive (for instance Pre({anbncm| m, n≥0})∩{ambncn|m, n≥0}=a*∪{anbncn| n≥0}).

On the other hand, this is hardly conclusive, and the hypothesis that one of the languages is closed under the Pre operation might be just enough to make the problem decidable. For instance, couldn't we encode this intersection in a "higher-order" grammatical formalism with decidable emptiness, like indexed grammars (Aho, 1968)?

The Hard Way

It turns out that this slight variation of the emptiness of intersection problem is not decidable either. Considering the reduction of the Post correspondence problem given by Harrison (1978, pages 254–258), an instance of PCP

x=(x1, x2, ..., xn)  y=(y1, y2, ..., yn)  xi, yi∈{a,b}+

is associated with the languages

L(x,y) = {baik···bai1cxi1···xikcyRjl···yRj1caj1b···ajlbc | k,l≥1, 1≤ip,jq≤n, 1≤p≤k, 1≤q≤l}
Ls = {w1cw2cwR2cwR1c | w1, w2 ∈ {a,b}+}

with intersection

L(x,y) ∩ Ls = {t1ct2ctR2tR1c | t1=baik···bai1, t2=xi1···xik=yi1···yik}

which is nonempty if and only if there is a solution to the PCP instance (x,y).

Considering now the intersection Pre(L(x,y))∩Ls, observe that the four c symbols of Ls reject any proper prefix of L(x,y). Thus the emptiness of the intersection of the set of prefixes of a CFL with a CFL is not decidable.

The Easy Way

Sometimes it takes me a little while to figure something very simple. Consider two context-free languages L1 and L2 over an alphabet Σ, and consider the intersection Pre(L1·{a})∩(L2·{a}) with a not in Σ. This intersection is empty if and only if L1∩L2 is.

0 comments: