Week 6 - 2-Satisfiability

This week we will look at

Statement

Sat Problem

Given a boolean function in CNF with variables and clauses. Is there a true/false assignment to the variables that satisfies . If yes then output it, otherwise say no.

Link to original

Notation

We use the following logical operators

  • to represent not ,
  • to represent or, and
  • to represent and.

Example

Let Is there an assignment to , and that satisfies ?

This problem is generally quite hard so we look at the limited problem.

Statement

k-SAT Problem

Given a boolean function in CNF with variables and clauses of size at most . Is there a true/false assignment to the variables that satisfies . If yes then output it, otherwise say no.

Link to original

Theory

Link to original

Solving 2-SAT problem

2-SAT example

Consider

Simplifying the problem

First we simplify the input by getting rid of forced decisions. We do this by repeatedly looking for unit clauses with 1 literal and reducing the problem by removing this variable.

  1. Take a unit clause, say literal ,
  2. If the unit clause exists output then no solution exists,
  3. Satisfy it, set ,
    1. If a_i = x, then x = T, or
    2. If a_i = \overline{x} then x = F.
  4. Remove clauses containing and drop from clauses.
  5. Let be the resulting formula.

We can repeat this until either:

  • we find an issue,
  • the formula becomes empty and we have an assignment, or
  • the output formula only has clauses with 2 variable.

Observation

is satisfiable if and only if is satisfiable. Therefore this is a proper reduction.

Forming a graph

We can now assume our formula has all clauses of size 2, it will have variables and clauses where . We want to form a directed graph to represent this problem.

  • , and
  • .

Example

Suppose our formula was then we would construct the following graph.

The intuition behind this graph is that for a clause if then we will need to be true for a correct allocation for , similarly with we would need .

This logic extends out to paths.

Lemma

If there is a path between and in then if is true in a assignment to the variables that satisfies then must be true also.

Proof

Prove this by induction on the length of path.

Suppose the path is of length 1, i.e. . Therefore there is a clause or , so if is true then needs to be true.

Suppose it is true for a path of length less than .

Suppose we have a path of length where is a path from to of length .

By the induction hypothesis if is true then must be true.

As we have a clause or . As is true for this assignment to satisfy we need to be true.

Therefore by induction we have the statement holds for all length paths.

This gives us an easy check for if is not satisfiable.

Lemma

If for some , and are in the same strongly connected component then is not satisfiable.

Proof

This means there is a path from to and to .

From the lemma above if is true we would require to be true also for an assignment to satisfy . Similarly if then we need to be true.

This is contradictory and so no such assignment exists.

It would be useful to have the complement statement. Which turns out to be true but requires a lot more effort to prove.

Lemma

If for all , and are in different strongly connected components then is satisfiable.

To prove this lets examine the structure of the graph.

Lemma

There is a path to if and only if there is a path to .

Proof

Let be a path from to .

Note the edge comes from the clause or however both these clauses give rise to the edge .

Therefore we have the path which is exactly the path from to .

Then if this path were from to this would give us a path to to .

Similarly if this were a path from to we would get a path from to .

This proves the required statement.

This means in it is meaningful to talk about complement paths. i.e. if is a path in from to then is the complement path from to that we will refer to as .

This actually gives us quite a neat corollary.

Corollary

Proof

For any two there exists a path and connecting to and to .

By the Lemma above and connects to and to . Showing that and are strongly connected to one another.

Equally if is strongly connected to via paths and then and strongly connect to giving .

Therefore is a strongly connected component of .

This this actually extends further.

Lemma

is a sink strongly connected component if and only if is a source strongly connected component

Proof

Let be a strongly connected component in the strongly connected component graph.

From the Corollary above we know is a strongly connected component.

Let be a sink.

Let be a path from to .

Note is a path from to from the lemma above.

However as is a sink as the only paths starting at have to end there.

This gives that and the only paths into start at making it a source.

Let be a source.

Let be a path from to .

Note is a path from to from the lemma above.

However as is a source as the only paths into have to start there.

This gives and the only paths out of have to start in making it a sink.

Using this we can prove.

Lemma

If for all , and are in different strongly connected components then is satisfiable.

Proof

Find the strongly connected component graph of , .

We can pair the strongly connected components with . From the assumption of the lemma for any strongly connected component.

To satisfy we need to find a True/False allocation to the strongly connected components such that:

  • If is True then is False, and
  • If is True and it has a path to then is True.

To do this find a sink in then we know is a source. Assign to be True and to be false.

Then remove and from and repeat.

This assignment satisfies the first property as when we set to True we set to false.

This assignment satisfies the second property if is True and has a path to . The was removed before .

As has a path from it couldn’t have been a source, so must have been a sink. It therefore was allocated True.

So this allocation satisfies and it is satisfiable.

This is exactly how our algorithm will go as well.

2-Sat algorithm

Without the simplification step we have the following algorithm

2SAT-simplified(f):
	Input: 2-SAT problem on n variables with m clauses where each clause has 2 variables.
	Output: Assignment from n variables satisfying f or saying no assignment is possible.
1. Construct graph G for f'.
2. Run SCC algorithm on S.
3. If any variable and its complement are in the same SCC return no assignment possible.
4. While still SCC left.
	1. Take sink SCC S,
	2. Set the clauses in S to true (and thier complements to false)
	3. Remove S & \overline{S}
7. Return assignment.

Constructing the graph takes steps. Then running the SCC algorithm takes . Lastly filtering the strongly connected components takes at most steps as we have to run through the list of variables and their complements. (Note the ordering SCC algorithm algorithm returns in is a topologically sort so it isn’t hard to identify sink components.)

All together this runs in time. However the algorithm with the above simplification

2SAT(f):
	Input: 2-SAT problem on n variables with m clauses.
	Output: Assignment from n variables satisfying f or saying no assignment is possible.
1. Set f' = f
2. While f' has a unit clauses:
	1. Find a unit clause c.
	2. Assign c to True.
	3. If \overline{c} is a unity clause return no assignment possible.
	4. Remove clauses with c in and remove occurances of \overline{c}.
	5. Set this new statement to f'.
3. Construct graph G for f'.
4. Run SCC algorithm on S.
5. If any variable and its complement are in the same SCC return no assignment possible.
6. While still SCC left.
	1. Take sink SCC S,
	2. Set the clauses in S to true (and thier complements to false)
	3. Remove S & \overline{S}
7. Return assignment.

This looks like it takes time to simplify the algorithm. However, clever use of data structures can help us here.

simplify(f)
	Input: 2-sat potenially with unit litterals.
	Output: Either there is not solution to the input, or a 2-sat that has 2 variables in every clause and a partial solution to f.
1. Define U and C_v to be list where v in variables
2. for clause c in f (let c be some positional variable not the expression of the clause)
	2.1. if c unit so f(c) = x, not x
		2.1.1. add c to C_x (position)
		2.1.2. add x or not x to U (variable)
	2.2. else f(c) = (not) x or (not) y
		2.2.1. add c to C_x and C_y (position)
3. Let f' = f (pick a data structure where looking up clauses is O(1) from a positional arguement)
4. While U not empty
	4.1. x or not x (set this to X) be the next value in U - removing it
		in the process
	4.2. if C_x exists
		4.2.1. for c in C_x
			4.2.1.1. if f'(c) contains X in f' remove it from f'.
			4.2.1.2. if f'(c) is (not X) in f' output that f has no solution
			4.2.1.3. if f'(c) contains not X (so f'(c) = not X or Y)
				4.2.1.3.1. remove not X from f'(c)
				4.2.1.3.2. append Y to U
		4.2.2. set X to true in our partial solution
		4.2.3. Remove C_x
5. output f' and partial solution to f.

This takes as we have lists we make and we visit each clause at most 3 times during the algorithm.

We can instead expand the problem for a simpler algorithm. The pseudo code is below.

2-SAT algorithm using SCC

2SAT(f):
    Input: 2-SAT problem on n variables with m clauses.
    Output: Assignment from n variables satisfying f or stating no assignment is possible.
1. Find all the unit clauses u_i in f.
2. Create a dummy variable Y. Set f' = f
    1. Replace the unit clauses in f' with u_i \lor Y.
    2. Add new clauses to f' of u_i \lor \overline{Y}.
3. Construct graph G for f'.
4. Run the SCC algorithm on G.
5. If any variable and its complement are in the same SCC, return no assignment possible.
6. While there are still SCCs left:
    1. Take sink SCC S.
    2. Set the clauses in S to true (and their complements to false).
    3. Remove S & \overline{S}.
7. Return the assignment, excluding the value for Y.
Link to original

The solution to is valid for - as for any unit clause in we have meaning regardless of the value of has to be true. Then correctness follows from the proof above.

This also has run time . The simplification takes time which produces a problem with variables and at most clauses. So from above this solves in . Giving an overall run time of .

Run time

This has run time .

The simplification step takes time and produces a problem with variables and at most clauses.The latter half of the algorithm solves in . This results in an overall runtime of .

Link to original