FA17 Lecture 2

From CS2800 wiki

Lecture 2: Using functions to model problems

  • reading: MCS [[../handouts/mcs.pdf#page=19|1.5-1.9]]
  • [[../../2017sp/lectures/lec02-sudoku.html|Last semester]]
  • using functions to define sudoku solution
  • using quantifiers (for all and there exists)
  • new notation: cartesian product ([math]A \times B [/math]), for all ([math]∀ [/math]), function [math]f : A → B [/math].
  • review exercise: describe solutions to some other kind logic puzzle.

Notation

  • If [math]A [/math] and [math]B [/math] are sets, then the cartesian product of [math]A [/math] and [math]B [/math], denoted [math]A \times B [/math] is the set of pairs where the first part is an element of [math]A [/math] and the second part is an element of [math]B [/math]. For example, [math]\{1,2\} \times \{a,b\} = \{(1,a), (1,b), (2,a), (2,b)\} [/math].
  • the upside-down A (∀) stands for for all. I tend to write out the words, but occasionally the symbols are useful. It is important to be able to go back and forth.
  • The symbol [math]f : A → B [/math] indicates that [math]f [/math] is a function with domain [math]A [/math] and codomain [math]B [/math].
  • The symbol [math]
    =
    [/math]
    is used to emphasise that an equation is the definition of a variable or term.

Sudoku

As an example, we use functions to carefully describe the solution to a sudoku puzzle. We are not describing a process to find the solution, only the properties that a solution must have.

We start by making up some notation: let [math]N := \{1, 2, \dots, 9\} [/math] be the set of numbers. Let [math]C := N \times N = \{(1,1), (1,2), \dots, (1,9), (2,1), \dots, (9,9)\} [/math] be the set of cells.

We define a helper function [math]box : C → N [/math] that gives us the "big box" containing the given cell. That is, [math]box(i,j) [/math] gives the number in the [math]i [/math]th row and the [math]j [/math]th column of the following table:

File:Lec02-box.svg
caption <a href="lec02-box.tex">box definition (click for LaTeX source)</a>

If we really wanted, we could write down a formula for the [math]box [/math] function (something like [math]box(i,j) := \lfloor (i-1)/3 \rfloor + 3 \lfloor (j-1)/3 \rfloor + 1 [/math]). But it isn't necessary: it doesn't make the function any clearer than the table. When writing definitions, optimize for clarity. Use mathematical notation when it helps clarify, but avoid it when it obscures.

Now we can define a solved sudoku board. A solved sudoku board must have a single number in each cell. Therefore, we model it as a function [math]s : C → N [/math]; given the cell [math](i,j) [/math], [math]s(i,j) [/math] gives the number in the corresponding cell. I am using the convention that [math]i [/math] is a row number and [math]j [/math] is a column number, but nothing about the formal definition requires this (except in the definition of [math]box [/math]).

Sudoku solutions [math]s [/math] must satisfy three properties. We list them informally and formally:

  1. For any row, and any two different columns, the numbers in that row and those columns are different. With lots of notation: [math]∀ i ∈ N, ∀ j ∈ N, j' ∈ N [/math], if [math]j \neq j' [/math] then [math]s(i,j) \neq s(i,j') [/math]
  2. For any column and any two different rows, the numbers in that column and those rows are different: [math]∀ j, ∀ i, i', [/math] if [math]s(i,j) = s(i',j) [/math] then [math]i = i' [/math].
  3. Two different cells in the same box must have different values: [math]∀ c_1, c_2 [/math], if [math]box(c_1) = box(c_2) [/math] but [math]c_1 \neq c_2 [/math], then [math]s(c_1) \neq s(c_2) [/math].

A few notes about these definitions:

  • Note that the formal (symbolic) defintions are much harder to read than the English description. When writing, your goal is to convey understanding to the reader. The informal description helps the reader build a mental picture, while the formal description helps the reader connect that picture to the objects you are using to model the picture. Optimize for communication.
  • Although rules 1 and 2 are completely parallel, I chose to formalize them in slightly different ways (the only reason to do this is to demonstrate both styles). In general, a statement of the form "if P then Q" is equivalent to "the only way that Q is false is if P is also false" or more succinctly, "if not Q then not P". These statements ("if P then Q" and "if not Q then not P") are called contrapositives. In this case, the contrapositive is sometimes easier to work with, because it's sometimes easier to reason about things being equal than things being unequal.
  • In rules 2 and 3, I left off the fact that [math]i [/math] and [math]j [/math] are in [math]N [/math], and that [math]c_1 [/math] and [math]c_2 [/math] are in [math]C [/math]. This can be inferred from the fact that [math](i,j) [/math] and [math]c [/math] are plugged into [math]s [/math], and the domain of [math]s [/math] contains cells (which are pairs of numbers). It is also suggested by the fact that I'm consistently using [math]i [/math],[math]j [/math], and [math]k [/math] for numbers, and [math]c [/math] for cells. As with the other stylistic decisions, you should optimize for clarity.

At the end of lecture, we posited a different informal way to state the sudoku rules: instead of saying no two entries are the same, we instead might require that every number appears somewhere. We'll discuss this alternative in the next lecture.

Some questions that were asked

Q: Can we model a solved sudoku problem as a set of 9 sets of numbers?

A: No, there is not enough data. For example, [math]\{\{1,2,3,\dots\},\dots\} = \{\{3,2,1,\dots\},\dots\} [/math], so we can't distinguish between two boards with the first columns in a different order. Moreover, we couldn't represent a board like [math]\{\{1,2,3,\dots,9\}, \{9,8,7,\dots\}, \dots\} [/math], because the first two columns are the same set, and so this set would only have 8 columns.

Q: Can we model a solved problem as a 9-tuple of 9-tuples of numbers?

A: Yes. However, this would make the rules harder to state (try it!); you'd need notation for extracting the [math]k [/math]th element of a tuple. You might write the [math]i [/math]th entry of the [math]j [/math]th entry of the board as [math]s_{ij} [/math], but then your definition would be exactly the same as the function definition: instead of writing [math]s(i,j) [/math] you'd write [math]s_{ij} [/math]. Also, it wouldn't be a good example for function notation.