SP20:Lecture 14 Base b representation

From CS2800 wiki

We proved the existence and uniqueness of Euclidean division (division with quotient and remainder).

We also introduced base b notation.

Euclidean division

While studying number theory, we will be primarily interested in the properties of the integers and natural numbers. This means that if we want to write something like "[math]a/b [/math]", we'll have to be very careful to check that [math]b [/math] divides [math]a [/math]. This makes everything very complicated, so instead, we'll avoid dividing natural numbers.

We can, however, do division with remainder. The Euclidean division algorithm is just a fancy way of saying this:

For all [math]a \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math] and all [math]b \gt 0 [/math], there exists numbers [math]q [/math] and [math]r \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math] such that
  1. [math]a = qb + r [/math]
  2. [math]0 \leq r \lt b [/math]

Here [math]q [/math] and [math]r [/math] are the quotient and remainder of [math]a [/math] over [math]b [/math]:

Definition: Quotient
We say [math]q \href{/cs2800/wiki/index.php/%5Cin}{\in} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math] is a quotient of [math]a [/math] over [math]b [/math] if [math]a = qb + r [/math] for some [math]r \href{/cs2800/wiki/index.php/%5Cin}{\in} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math] with [math]0 \leq r \lt b [/math]. We write [math]q = quot(a,b) [/math] (note that quot is a well defined function).
Definition: Remainder
We say [math]r \href{/cs2800/wiki/index.php/%5Cin}{\in} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math] is a remainder of [math]a [/math] over [math]b [/math] if [math]a = qb + r [/math] for some [math]q \href{/cs2800/wiki/index.php/%5Cin}{\in} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math] and [math]0 \leq r \lt b [/math]. We write [math]r = rem(a,b) [/math] (note that rem is a well defined function).

Note that all this is a theorem, it is called the "Euclidean division algorithm" because its proof contains an algorithm.

Proof:
We prove this by weak induction on [math]a [/math]. Let [math]\href{/cs2800/wiki/index.php?title=P(n)&action=edit&redlink=1}{P(a)} [/math] be the statement "for all [math]b \gt 0 [/math], there exists [math]q, r \href{/cs2800/wiki/index.php/%E2%88%88}{∈} ℕ [/math] satisfying (1) and (2) above." We will show [math]P(0) [/math] and [math]P(a+1) [/math] assuming [math]P(a) [/math].


Base case: We want to show P(0). Let [math]q = r = 0 [/math]. Then [math]qb + r = 0b + 0 = 0 = a [/math] and [math]0 \leq r \lt b [/math] since [math]r = 0 [/math].


Inductive step: Assume [math]P(a) [/math]; we want to show [math]P(a+1) [/math]. By [math]P(a) [/math], we know that there exists numbers [math]q' [/math] and [math]r' [/math] such that [math]a = q'b + r' [/math]. We want to show that there exists numbers [math]q [/math] and [math]r [/math] such that [math]a + 1 = qb + r [/math].

Since [math]r' \lt b [/math], we know that either [math]r' + 1 \lt b [/math] or [math]r' + 1 = b [/math]. In the former case, we can let [math]q = q' [/math] and [math]r = r' + 1 [/math]. Then [math]qb + r = q'b + r' + 1 = a + 1 [/math] (and clearly [math]0 \leq r \lt b [/math]).

In the latter case, we can let [math]q = q' + 1 [/math] and [math]r = 0 [/math]. Then [math]qb + r = (q' + 1)b + 0 = q'b + (r' + 1) = a + 1 [/math]. Again it is clear that [math]0 \leq r \lt b [/math].

In either case, we have shown that there exist [math]q [/math] and [math]r [/math] satisfying (1) and (2), as required.

Proofs and algorithms

Most proofs that something exists give you instructions for finding it. For example, inside the proof that the quotient and remainder exist, is an algorithm for finding them: If [math]a = 0 [/math], follow the instructions in the base case; if [math]a \gt 0 [/math], then first find the quotient q' and remainder r' for [math]a' = a-1 [/math] and [math]b [/math], then follow the instructions in the inductive step. Of course, while we are finding [math]q' [/math] and [math]r' [/math], we may need to also find the quotient [math]q' ' [/math] and remainder [math]r' ' [/math] of [math]a' ' = a'-1 [/math] and [math]b [/math], and so on.

We will cover several examples of such algorithms/proofs:


In fact, there are whole programming languages where instead of writing a program, you write a proof; after the computer checks your proof, it extracts a program for you (which is guaranteed to be correct!).

Uniqueness of euclidean division

In order to justify calling quotients and remainders "the quotient" or "the remainder", we should check that they are unique. This is justified by the following claim:

For any [math]a \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math] and [math]b \neq 0 \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math], if [math]q [/math] and [math]r [/math] are a quotient and remainder of [math]a [/math] and [math]b [/math] (that is, if [math]a = qb + r [/math] and [math]0 \leq r \lt b [/math]), and if [math]q' [/math] and [math]r' [/math] are also a quotient and remainder of [math]a [/math] over [math]b [/math], then [math]q = q' [/math] and [math]r = r' [/math].
Proof:
Assume that [math]a = qb + r [/math] and [math]a = q'b + r' [/math] and that both [math]r [/math] and [math]r' [/math] are between 0 and [math]b-1 [/math]. We want to show that [math]q = q' [/math] and [math]r = r' [/math].

We have [math]qb + r = a = q'b + r' [/math], so [math](r - r') = (q' - q)b [/math]. This means that [math]r - r' [/math] is a multiple of [math]b [/math]; it could be any of [math]\dots, -2b, -b, 0, b, 2b, \dots [/math]. However, since we have [math]0 \leq r \lt b [/math] and [math]0 \leq r' \lt b [/math], we have [math]-b \lt r - r' \lt b [/math]. Thus, the only possible value of [math]r - r' [/math] is 0, so [math]r = r' [/math].

Plugging this back in to the equation [math]0 = r - r' = (q - q')b [/math] shows that either [math]q - q' = 0 [/math] or [math]b = 0 [/math]. Since we have defined the quotient and remainder by [math]b [/math], we must have [math]b \neq 0 [/math], so [math]q - q' = 0 [/math] and thus [math]q = q' [/math].

Base b representation

When we write down sequences of digits in decimal (or base 10) notation, we interpret them according to a formula. For example, we interpret the number 1234 as a 1 in the "thousands place", a 2 in the "hundreds place", a 3 in the "tens place", and a 4 in the "ones place". In other words, we interpret the string of digits "1234" as [math]1 \cdot 10^3 + 2 \cdot 10^2 + 3 \cdot 10^1 + 4 \cdot 10^0 [/math].

It is occasionally useful to write a similar formula, but with a different set of possible digits. We have the following definitions:

If [math]d_k, d_{k-1}, \dots, d_1, d_0 [/math] are all natural numbers satisfying [math]0 \leq d_i \lt b [/math] for all [math]i [/math], then the base b interpretation of [math]\href{/cs2800/wiki/index.php/Sequence_notation}{(d_i)} [/math], written [math](d_kd_{k-1}\cdots{}d_1d_0)_b [/math] is given by [math]\href{/cs2800/wiki/index.php/Base}{(d_i)_b} := \sum_{i} d_ib^i [/math]

For example, [math]\href{/cs2800/wiki/index.php/Base}{(732)_{9}} = 7 \cdot 9^2 + 3 \cdot 9^1 + 2 \cdot 9^0 = 596 [/math].

Definition: Digit
numbers [math]d [/math] satisfying [math]0 \leq d \lt b [/math] are called base b digits.
If [math]\href{/cs2800/wiki/index.php/Sequence_notation}{(d_i)} [/math] is a sequence of base [math]b [/math] digits, and if [math]a = (d_i)_b [/math], then we say that [math]\href{/cs2800/wiki/index.php/Sequence_notation}{(d_i)} [/math] is the Base b representation of [math]a [/math]

For example, the base 9 representation of 596 is the sequence of digits [math]d_2 = 7, d_1 = 3, d_0 = 2 [/math] because [math]596 = \href{/cs2800/wiki/index.php/Base}{(732)_9} [/math].

Note that the base is not an intrinsic property of a natural number, but rather is a property of the way we write the number down. It does not make sense to say something like "let [math]n [/math] be a base b number", any more than it makes sense to say "let [math]n [/math] be a number in roman numerals". The object [math]596 [/math] 'is' the same as the object [math]\href{/cs2800/wiki/index.php/Base}{(732)_9} [/math] (and also the object [math]\href{/cs2800/wiki/index.php/Roman_numeral}{DVCI} [/math]); it is just written down in three different ways.

Common number bases

Decimal

The most common number base is base 10, or decimal. This is how we usually interpret numbers. It is convenient because most people have ten fingers and that's how we first learn to count.

No special notation is needed to indicate decimal, since that's how we normally interpret numbers. For example, [math]1234 [/math] means [math]\href{/cs2800/wiki/index.php/Base}{(1234)_{10}} [/math].

Binary

In digital logic, it is convenient to represent numbers using high and low voltages; We can think of a high voltage as "1" and a low voltage as "0"; allowing us two digits to work with. Thus it is convenient to represent numbers in base 2, which is also called binary.

Binary digits are also called bits.

Binary numbers are sometimes written with a leading "b", as in "b1101", which is another way of writing [math](1101)_2 [/math] or [math]13 [/math].

Hexadecimal

Hexadecimal is another word for base 16. It is useful for a few reasons:

  • In decimal, dividing by 10 and taking the remainder are very easy operations: the quotient is represented by all the digits but the last, while the remainder is represented by the last. Dividing by 100 and other powers of 10 is similarly easy. In some applications, it is useful to be able to easily divide by 16 (or other high powers of 2), and this is easier to do by humans if the numbers are written down in base 16.
  • A single hexadecimal digit is a number that can be represented by 4 bits, so it is easier to translate between hexadecimal and binary than it is between other number bases.


In order to represent numbers in hexadecimal, we need digits to represent the numbers between ten and fifteen. People usually use the letters a–f for this purpose, with a standing for 10, b for 11, etc.

Hexadecimal numbers are often prefixed with a "0x". For example, [math]0x2c [/math] is another way of writing [math](2c)_{16} [/math], which is another way of writing [math]44 [/math].

Octal

Octal is another word for base 8. It is useful for the same reasons as hexadecimal notation, although it is less commonly used. Some programming languages allow you to write a number in octal notation by prefixing it with a 0 as in "0172" which (in some contexts) would be interpreted as [math]\href{/cs2800/wiki/index.php/Base}{(0172)}_8 [/math] or [math]122 [/math].

Ascii and Unicode

Computers are good at processing numbers, but a lot of what we process are strings of characters. By numbering each possible character, we can use Base b interpretation and Base b representation to convert from strings of characters to natural numbers and back.

ASCII and Unicode are two ways to assign a digit to each character. ASCII stands for the "American standard code for information interchange"; it is just a table mapping 128 characters (e.g. 'A', 'B', 'a', 'b', '!', '0', ')', etc) to the base 128 digits.

Unicode is the same idea, except that there are 1,114,112 unicode characters in the table (including things like '∧', '∃', '喂', '😀', '🐱', and '💩'). You can think of unicode as a way to write down numbers in base 1,114,112.

Because base b representation is a bijection (since every number has a base b representation and the base b representation is unique), we will simply treat strings as numbers and vice-versa without paying much more attention to it.