Reading Notes in Formal Languages: Intersection of a CFL with the Prefixes of a CFL

Just a quick note on a decision problem: a classical result from the theory of context-free languages is the undecidability of the emptiness of the intersection of two CFLs. Put more concretely, given two CFLs L₁ and L₂, we cannot decide whether L₁∩L₂=∅.

In the context of PEGs (Ford, 2004), the disjointness of two expressions can be ensured by the emptiness of the intersection of the set of prefixes of the context-free approximation of the first expression with the context-free approximation of the second. Given two parsing expressions e₁ and e₂, we consider two context-free approximations L₁ and L₂ with ℒ(e₁)⊆L₁ and ℒ(e₂)⊆L₂, and we want to check that Pre(L₁)∩L₂=∅, where Pre(L)={x | ∃y, xy∈L}.
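To make the definition of Pre concrete, here is a minimal Python sketch restricted to finite languages represented as sets of strings. The names `prefixes` and `prefix_disjoint` are mine, and nothing here handles actual grammars or parsing expressions; it only illustrates the condition Pre(L₁)∩L₂=∅ on toy data.

```python
def prefixes(language):
    """Return Pre(L) = {x | exists y, xy in L} for a finite set of strings."""
    return {word[:i] for word in language for i in range(len(word) + 1)}


def prefix_disjoint(l1, l2):
    """Return True when Pre(l1) and l2 have no word in common."""
    return prefixes(l1).isdisjoint(l2)


if __name__ == "__main__":
    print(sorted(prefixes({"ab"})))
    print(prefix_disjoint({"ab"}, {"b"}))
    print(prefix_disjoint({"ab"}, {"a"}))
```

Note that Pre(L) always contains the empty word as soon as L is nonempty, so the check fails whenever the second language contains the empty word.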

Goofing Around

At first sight, there is little indication whether the problem is decidable or not. The set of prefixes of a context-free language is still context-free (for instance Pre({aⁿbⁿ | n≥0})={aⁿb^m | 0≤m≤n}), but its intersection with a context-free language need not be context-free and is in general only context-sensitive (for instance Pre({aⁿbⁿc^m | m, n≥0})∩{a^mbⁿcⁿ | m, n≥0}=a^*∪{aⁿbⁿcⁿ | n≥0}).
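The two example identities can be sanity-checked on bounded fragments of the languages. The sketch below is only a finite check with all exponents capped at a hypothetical `bound`, not a proof; the helper names are my own.

```python
def prefixes(language):
    """Return Pre(L) for a finite set of strings."""
    return {w[:i] for w in language for i in range(len(w) + 1)}


def check_examples(bound):
    """Check both example identities with every exponent capped at `bound`."""
    r = range(bound + 1)

    # Pre({a^n b^n}) versus {a^n b^m | 0 <= m <= n}
    anbn = {"a" * n + "b" * n for n in r}
    expected_pre = {"a" * n + "b" * m for n in r for m in range(n + 1)}
    first_ok = prefixes(anbn) == expected_pre

    # Pre({a^n b^n c^m}) & {a^m b^n c^n} versus a^* | {a^n b^n c^n}
    left = {"a" * n + "b" * n + "c" * m for n in r for m in r}
    right = {"a" * m + "b" * n + "c" * n for m in r for n in r}
    expected = {"a" * n for n in r} | {"a" * n + "b" * n + "c" * n for n in r}
    second_ok = (prefixes(left) & right) == expected

    return first_ok, second_ok


if __name__ == "__main__":
    print(check_examples(4))
```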

On the other hand, this is hardly conclusive, and the hypothesis that one of the languages is closed under the Pre operation might be just enough to make the problem decidable. For instance, couldn't we encode this intersection in a "higher-order" grammatical formalism with decidable emptiness, like indexed grammars (Aho, 1968)?

The Hard Way

It turns out that this slight variation of the emptiness of intersection problem is not decidable either. Considering the reduction of the Post correspondence problem given by Harrison (1978, pages 254–258), an instance of PCP

x=(x₁, x₂, ..., x_n), y=(y₁, y₂, ..., y_n), with x_i, y_i∈{a,b}⁺

is associated with the languages

L(x,y) = {ba^{i_k}···ba^{i_1}cx_{i_1}···x_{i_k}cy^R_{j_l}···y^R_{j_1}ca^{j_1}b···a^{j_l}bc | k,l≥1, 1≤i_p,j_q≤n, 1≤p≤k, 1≤q≤l}

L_s = {w₁cw₂cw^R₂cw^R₁c | w₁, w₂ ∈ {a,b}⁺}

with intersection

L(x,y) ∩ L_s = {t₁ct₂ct^R₂ct^R₁c | t₁=ba^{i_k}···ba^{i_1}, t₂=x_{i_1}···x_{i_k}=y_{i_1}···y_{i_k}}

which is nonempty if and only if there is a solution to the PCP instance (x,y).

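Considering now the intersection Pre(L(x,y))∩L_s, observe that every word of L(x,y) contains exactly four c symbols and ends with the fourth one, so a proper prefix of such a word contains at most three of them, whereas every word of L_s contains exactly four. Hence Pre(L(x,y))∩L_s = L(x,y)∩L_s. Thus the emptiness of the intersection of the set of prefixes of a CFL with a CFL is not decidable.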
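The construction can be explored on small PCP instances with a brute-force search. The sketch below builds the words of L(x,y) for index sequences up to a hypothetical `max_len` and keeps those that also belong to L_s. It is a bounded search on toy instances, so it can only find solutions, never rule them out; the function names are mine and not Harrison's.

```python
from itertools import product


def word(xs, ys, i_seq, j_seq):
    """Build the word of L(x,y) for 1-based index sequences i_seq and j_seq."""
    head = "".join("b" + "a" * i for i in reversed(i_seq))      # ba^{i_k}...ba^{i_1}
    x_part = "".join(xs[i - 1] for i in i_seq)                  # x_{i_1}...x_{i_k}
    y_part = "".join(ys[j - 1][::-1] for j in reversed(j_seq))  # y^R_{j_l}...y^R_{j_1}
    tail = "".join("a" * j + "b" for j in j_seq)                # a^{j_1}b...a^{j_l}b
    return f"{head}c{x_part}c{y_part}c{tail}c"


def in_ls(w):
    """Membership in L_s = {w1 c w2 c w2^R c w1^R c | w1, w2 in {a,b}+}."""
    parts = w.split("c")
    if len(parts) != 5 or parts[4] != "":
        return False  # not exactly four c symbols, or the word does not end with c
    w1, w2, w2r, w1r = parts[:4]
    if not w1 or not w2 or not set(w1 + w2) <= {"a", "b"}:
        return False
    return w2r == w2[::-1] and w1r == w1[::-1]


def bounded_witnesses(xs, ys, max_len):
    """Words of L(x,y) & L_s built from index sequences of length <= max_len."""
    indices = range(1, len(xs) + 1)
    hits = []
    for k in range(1, max_len + 1):
        for l in range(1, max_len + 1):
            for i_seq in product(indices, repeat=k):
                for j_seq in product(indices, repeat=l):
                    w = word(xs, ys, i_seq, j_seq)
                    if in_ls(w):
                        hits.append((i_seq, w))
    return hits


if __name__ == "__main__":
    # Toy instance with solution (1, 2): "a" + "ab" == "aa" + "b"
    for i_seq, w in bounded_witnesses(("a", "ab"), ("aa", "b"), 2):
        print(i_seq, w)
```

Every witness found this way has matching index sequences, and none of its proper prefixes passes `in_ls`, which is the observation about the four c symbols.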

The Easy Way

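Sometimes it takes me a little while to figure out something very simple. Consider two context-free languages L₁ and L₂ over an alphabet Σ, and consider the intersection Pre(L₁·{a})∩(L₂·{a}) for a fresh symbol a not in Σ. Every word of L₂·{a} ends with a, and the only prefixes of words of L₁·{a} that contain a are the full words themselves, so the intersection equals (L₁∩L₂)·{a}. This intersection is therefore empty if and only if L₁∩L₂ is.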
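The end-marker trick is easy to try out on finite languages. In this sketch the fresh symbol is written `#` rather than a, and `marked` and `easy_way_intersection` are names I made up for the illustration; the languages are plain Python sets, not grammars.

```python
def prefixes(language):
    """Return Pre(L) for a finite set of strings."""
    return {w[:i] for w in language for i in range(len(w) + 1)}


def marked(language, marker="#"):
    """Return L·{marker}: every word followed by the end marker."""
    return {w + marker for w in language}


def easy_way_intersection(l1, l2, marker="#"):
    """Compute Pre(l1·{marker}) & (l2·{marker}); marker must be a fresh symbol."""
    if any(marker in w for w in l1 | l2):
        raise ValueError("marker must not occur in either language")
    return prefixes(marked(l1, marker)) & marked(l2, marker)


if __name__ == "__main__":
    l1 = {"ab", "abc"}
    l2 = {"a", "abc"}
    # "a" is a prefix of a word of l1 but not a word of l1, so it must not survive.
    print(easy_way_intersection(l1, l2))
    print(marked(l1 & l2))
```

Without the marker, Pre(l1)∩l2 would also contain "a" in this example, which is exactly the difference the marker removes.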

2008/07/16
