SP20:Lecture 16 Bézout coefficients

From CS2800 wiki

In this lecture, we did 3 similar proofs: that the GCD is a common divisor, that it is the greatest common divisor, and that the Bézout coefficients exist.

Greatest common divisor

An important technical tool for our study of number theory is the greatest common divisor of two numbers. Here we present an efficient algorithm for computing it:

Definition: gcd
Euclid's GCD algorithm is the function [math]\href{/cs2800/wiki/index.php/Gcd}{gcd} : \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} \href{/cs2800/wiki/index.php/%C3%97}{×} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} \href{/cs2800/wiki/index.php/%E2%86%92}{→} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math] which is given as follows: [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,0) := a [/math], and for [math]b \gt 0 [/math], [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) = \href{/cs2800/wiki/index.php?title=G&action=edit&redlink=1}{g}(b,r) [/math] where [math]r = \href{/cs2800/wiki/index.php/Rem}{rem}(a,b) [/math].

This function is a good inductively defined function because by definition, [math]\href{/cs2800/wiki/index.php/Rem}{rem}(a,b) \lt b [/math], so the second argument is decreasing in the inductive definition.

The algorithm can be extended to operate on negative integers by simply taking the gcd of the absolute values of its arguments.

In order to justify calling [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) [/math] the "greatest common divisor", we should check that gcd(a,b) is a common divisor of a and b and gcd(a,b) is greater than all other common divisors of a and b.

For all [math]a, b ∈ ℕ [/math], [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math]


For all [math]a, b, c \href{/cs2800/wiki/index.php/%E2%88%88}{∈} ℕ [/math], if [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math], then [math]c ≤ \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) [/math].

Choosing an inductive priniple

We will prove both of these claims by induction, but we have a choice to make: should we use induction on [math]a [/math] or on [math]b [/math] (or both?).

Often, when proving things about an inductively defined function, it is useful to use the same inductive principle for the proof as was used for the definition. For example, we argued that the [math]\href{/cs2800/wiki/index.php/Gcd}{gcd} [/math] function is well-defined because it's second argument is decreasing (that is, while defining [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) [/math], we could assume that [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(b,r) [/math] was already defined, since [math]r \lt b [/math]. This is a form of strong induction argument on [math]b [/math]. Therefore, a good heuristic for proving things about [math]\href{/cs2800/wiki/index.php/Gcd}{gcd} [/math] is to do induction on the second argument. That is, to prove "for all [math]a, b [/math], some fact involving [math]gcd(a,b) [/math]", it makes sense to define [math]P(b) [/math] as "for all [math]a [/math], the fact involving [math]gcd(a,b) [/math]", and then use strong induction to prove for all [math]b, P(b) [/math].

The reason for this is that in the inductive step, you are likely to plug in the definition, and you would then like to apply the inductive hypothesis to reason about the expression you derive. If the inductive principle used in your proof matches the inductive principle used in your definition, you will be able to do this.

gcd is a common divisor

For all [math]a, b ∈ ℕ [/math], [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math]
Proof:
We will do this proof by induction. Since the inductive principle used to define [math]\href{/cs2800/wiki/index.php/Gcd}{gcd} [/math] uses strong induction on the second argument, we will use the same inductive principle for the proof. Let [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(b) [/math] be the statement "for all [math]a \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math], [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) \href{/cs2800/wiki/index.php/%5Cmid}{\mid} a [/math] and [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math]". We will prove [math]P(0) [/math] and [math]P(b) [/math] assuming [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(k) [/math] for all [math]k \lt b [/math].

To see [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(0) [/math], we want to show that [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,0) \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,0) \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} 0 [/math]. Note that by definition, [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,0) = a [/math]. [math]a \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] because [math]1 \cdot a = a [/math], and [math]a \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} 0 [/math] because [math]0 \cdot a = 0 [/math].

Now, to see [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(b) [/math], assume [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(k) [/math] for all [math]k \lt b [/math]. To save some writing, let's define [math]g \href{/cs2800/wiki/index.php/Definition}{:=} \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) [/math], by definition, we have [math]g = \href{/cs2800/wiki/index.php/Gcd}{gcd}(b,r) [/math] (for simplicity, let's call this number [math]g [/math]); thus we want to show that [math]g \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]g \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math].

Now, since [math]r = \href{/cs2800/wiki/index.php?title=Rem(a,b)&action=edit&redlink=1}{rem(a,b)} [/math], we know [math]r \lt b [/math]. Thus we can apply [math]P(r) [/math] (with [math]b [/math] plugged in for [math]a [/math]), to conclude that [math]g \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math] and [math]g \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} r [/math]. This gives us one of the two things we need to prove, so all that's left is to show that [math]g \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math].

Let's take stock of where we are:

  • Since [math]g \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math] we have [math]sg = b [/math] for some [math]s [/math].
  • We also have [math]tg = r [/math] for some [math]t [/math] since [math]g \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} r [/math].
  • We also know [math]a = qb + r [/math] for some [math]q [/math] since [math]r = \href{/cs2800/wiki/index.php/Rem}{rem}(a,b) [/math].
  • We want to show that there exists [math]c \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math] with [math]a = cg [/math].

Combining these equations, we have

[math]a = qb + r = qsg + tg = (qs + t)g [/math]

Thus, choosing [math]c = (qs + t) [/math] shows [math]a = cg [/math] as required.

gcd is the greatest common divisor

For all [math]a, b, c \href{/cs2800/wiki/index.php/%E2%88%88}{∈} ℕ [/math], if [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math], then [math]c ≤ \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) [/math].

In fact, we will prove the following claim, which is stronger:

For all [math]a, b, c \href{/cs2800/wiki/index.php/%E2%88%88}{∈} ℕ [/math], if [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math], then [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) [/math].

This claim is stronger because if c ❘ g then c ≤ g.

Proof: of stronger claim
We will do this proof by induction. Since the inductive principle used to define [math]\href{/cs2800/wiki/index.php/Gcd}{gcd} [/math] uses strong induction on the second argument, we will use the same inductive principle for the proof. Let [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(b) [/math] be the statement "for all [math]a, c \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math], if [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math] then [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) [/math]." We will prove [math]P(0) [/math] and [math]P(b) [/math] assuming [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(k) [/math] for all [math]k \lt b [/math].


To see [math]P(0) [/math], assume [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} 0 [/math]. We want to show that [math]c [/math] divides [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,0) = a [/math]. Well, we have assumed this, so there's nothing to show!


Now, choose an arbitrary [math]b [/math]; we will show [math]P(b) [/math] assuming [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(k) [/math] for all [math]k \lt b [/math]. Assume that [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] and [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math]. Let [math]g := \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) = \href{/cs2800/wiki/index.php/Gcd}{gcd}(b,r) [/math]; we want to show that [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} g [/math].


Note that since [math]r = \href{/cs2800/wiki/index.php?title=Rem(a,b)&action=edit&redlink=1}{rem(a,b)} [/math] we know [math]r \lt b [/math], so we have assumed [math]P(r) [/math]. We could apply [math]P(r) [/math] to show that [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} g [/math] if we could show that [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math] and [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} r [/math]. And we're already halfway there: we've already assumed [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math]. Thus we just need to show that [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} r [/math].


Let's take stock of where we are:

  • [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} a [/math] so there exists [math]s \href{/cs2800/wiki/index.php/%E2%88%88}{∈} ℕ [/math] with [math]sc = a [/math]
  • [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} b [/math] so there exists [math]t \href{/cs2800/wiki/index.php/%E2%88%88}{∈} ℕ [/math] with [math]tc = b [/math]
  • Since [math]r = \href{/cs2800/wiki/index.php/Rem}{rem}(a,b) [/math] we know that [math]a = qb + r [/math] for some [math]q [/math].
  • We want to show [math]c \href{/cs2800/wiki/index.php/%E2%9D%98}{❘} r [/math], i.e. that there exists [math]k \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math] with [math]r = kc [/math].


Putting the things we know together, we have

[math]r = a - qb = sc - qtc = (s-qt)c [/math]

So choosing [math]k := s-qt [/math] gives [math]r = kc [/math] as required.

Bézout coefficients

We used the last part of the lecture to prove a useful fact:

For all [math]a, b \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math], there exists [math]s, t \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math] such that [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) = sa + tb [/math].

[math]s [/math] and [math]t [/math] satisfying this equation are called Bézout coefficients of [math]a [/math] and [math]b [/math].

Proof:
We will do this proof by induction. Since the inductive principle used to define [math]\href{/cs2800/wiki/index.php/Gcd}{gcd} [/math] uses strong induction on the second argument, we will use the same inductive principle for the proof. Let [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(b) [/math] be the statement "for all [math]a \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%95}{ℕ} [/math], there exists [math]s, t \href{/cs2800/wiki/index.php/%E2%88%88}{∈} \href{/cs2800/wiki/index.php/%E2%84%A4}{ℤ} [/math] such that [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) = sa + tb [/math]." We will prove [math]P(0) [/math] and [math]P(b) [/math] assuming [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(k) [/math] for all [math]k \lt b [/math].

To see [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(0) [/math], we have [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a,0) = a = 1 \cdot a + 0 \cdot 0 [/math]. Thus choosing [math]s = 1 [/math] and [math]t = 0 [/math] gives the result we want.

Note that we could choose different values for [math]t [/math]; this shows that the Bézout coefficients are not necessarily unique.

Now, choose an arbitrary [math]b [/math]; we will show [math]P(b) [/math] assuming [math]\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(k) [/math] for all [math]k \lt b [/math]. Let [math]g := \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) = \href{/cs2800/wiki/index.php/Gcd}{gcd}(b,r) [/math]; we want to give a specific value of [math]s [/math] and [math]t [/math] such that [math]g = sa + tb [/math].

Note that since [math]r = \href{/cs2800/wiki/index.php?title=Rem(a,b)&action=edit&redlink=1}{rem(a,b)} [/math] we know [math]r \lt b [/math], so we have assumed [math]P(r) [/math]. This says that for any [math]a' [/math], there exists values [math]s' [/math] and [math]t' [/math] such that [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(a',r) = s'a' + t'r [/math]. In particular, for [math]a' := b [/math] we have [math]\href{/cs2800/wiki/index.php/Gcd}{gcd}(b,r) = s'b + t'r [/math] for some [math]s' [/math] and [math]t' [/math].

We compute:

[math]\begin{align*} \href{/cs2800/wiki/index.php/Gcd}{gcd}(a,b) &= \href{/cs2800/wiki/index.php/Gcd}{gcd}(b,r) && \href{/cs2800/wiki/index.php/Gcd}{\text{by definition}} \\ &= s'b + t'r && \text{by }\href{/cs2800/wiki/index.php?title=P&action=edit&redlink=1}{P}(r) \\ &= s'b + t'(a - qb) && \text{since }\href{/cs2800/wiki/index.php/Remainder}{a = qb+r} \\ &= t'a + (s' - t'q)b && \text{algebra} \\ &= sa + tb \end{align*} [/math]

if we let [math]s := t' [/math] and [math]t := s' - t'q [/math].